I’ve spent the last couple of days testing the integration between UI Vision and LLMs. Here are my observations and thoughts:
- AiComputerUse consumes a significant number of tokens. Running open-source LLMs, such as Meta’s Llama or Qwen, locally might be a more efficient approach (see the first sketch after this list).
- I’m unsure how to leverage AiComputerUse to manage internal enterprise applications, like internal forms and CRM systems. However, if AiComputerUse can automate tasks from prompts alone, it could shift RPA development work away from scripting and toward data and prompt preparation (second sketch below). This could be the future of RPA development.
- Existing prompt-driven commands, such as AIScreenXY, help handle dynamic web responses and minimize change requests caused by business design changes. I’m excited to explore integrating these features with popular LLMs like OpenAI, Azure OpenAI, Gemini, and DeepSeek. It’s thrilling to test LLMs with RPA to drive mouse movements, clicks, and content filling (third sketch below).
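On the first point: most local runtimes (Ollama, llama.cpp server, vLLM) expose an OpenAI-compatible API, so a standard client only needs a different base URL. This is a minimal sketch assuming a typical Ollama setup; the endpoint URL and model name are my assumptions, not UI Vision’s actual integration.

```python
# Minimal sketch: point the standard OpenAI client at a local
# OpenAI-compatible server (here: an assumed Ollama install).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # placeholder; local servers ignore it
)

response = client.chat.completions.create(
    model="qwen2.5:7b",  # any locally pulled model, e.g. a Llama or Qwen variant
    messages=[
        {
            "role": "user",
            "content": "Describe the next UI action to take on a login page "
                       "with username and password fields.",
        }
    ],
)
print(response.choices[0].message.content)
```

Token spend then becomes local compute rather than a per-call fee, which matters for computer-use style loops that send many large requests.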
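On the second point, “data and prompt preparation” could look like generating one natural-language task prompt per data record and handing each to the agent. A minimal sketch, where contacts.csv, its columns, and the prompt template are all hypothetical:

```python
# Minimal sketch of "data + prompt preparation": one task prompt per record.
import csv

PROMPT_TEMPLATE = (
    "Open the internal CRM, create a new contact, and fill the form with "
    "name={name}, email={email}, company={company}. Save and confirm success."
)

with open("contacts.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        prompt = PROMPT_TEMPLATE.format(**row)
        print(prompt)  # in practice: send each prompt to the computer-use agent
```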
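And on the third point, the AIScreenXY idea can be sketched outside UI Vision: ask a vision-capable model for the coordinates of an element described in plain language, then click there with pyautogui. The model name, prompt wording, and JSON response contract here are my assumptions, not UI Vision’s actual implementation.

```python
# Minimal sketch: LLM vision locates an element, pyautogui clicks it.
import base64
import json

import pyautogui
from openai import OpenAI

client = OpenAI()  # or a local OpenAI-compatible endpoint, as in the first sketch

pyautogui.screenshot("screen.png")  # capture the current screen to a file
with open("screen.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": 'Return JSON {"x": <int>, "y": <int>} giving the center '
                     'of the "Submit" button in this screenshot.'},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)

coords = json.loads(response.choices[0].message.content)
# Note: on HiDPI displays the screenshot resolution can differ from the
# logical screen coordinates, so the point may need scaling.
pyautogui.click(coords["x"], coords["y"])
```

Because the element is described in plain language rather than pinned to a selector, this pattern tolerates the dynamic page changes mentioned above.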
Please continue the excellent work and provide more integration options with different LLMs.
Thanks!