Mountain View-based Google has officially introduced Gemini 2.5 Computer Use, a new AI model that lets AI agents interact with websites and user interfaces the way a human would. The model is currently available in public preview via the Gemini API in Google AI Studio and Vertex AI.
Google has built Gemini 2.5 Computer Use on Gemini 2.5 Pro’s visual understanding and reasoning capabilities. The model can perform a wide range of browser-based actions, including clicking, typing, scrolling, hovering, opening dropdowns, and navigating to URLs.
The company says the model outperforms competing tools on several benchmarks, including Online-Mind2Web, WebVoyager, and AndroidWorld, while maintaining lower latency. Gemini 2.5 Computer Use processes screenshots of web interfaces and generates specific UI actions in response.
The model receives a task prompt, a screenshot of the digital environment, and a history of recent actions. It then analyzes the interface and returns a UI action, such as clicking a button or typing into a field. The action is executed on the client side, and a new screenshot is sent back to the model, so the task continues in a loop.
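The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's API: `UIAction`, `run_agent_loop`, `propose_action`, `take_screenshot`, `execute_action`, and the `"done"` stop signal are all hypothetical names, and the "model" here is a scripted stand-in rather than a real Gemini call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class UIAction:
    """Hypothetical container for one UI action returned by the model."""
    name: str                      # e.g. "click", "type", or "done"
    args: dict = field(default_factory=dict)


def run_agent_loop(
    task: str,
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[str, bytes, List[UIAction]], UIAction],
    execute_action: Callable[[UIAction], None],
    max_steps: int = 10,
    history_size: int = 5,
) -> List[UIAction]:
    """Repeat: screenshot -> model proposes action -> client executes it."""
    history: List[UIAction] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        # The model sees the task, the current screen, and recent actions.
        action = propose_action(task, screenshot, history[-history_size:])
        if action.name == "done":  # assumed stop signal for this sketch
            break
        execute_action(action)     # runs on the client side
        history.append(action)
    return history


def demo():
    """Drive the loop against a fake page with a scripted stand-in model."""
    page = {"field": "", "submitted": False}

    def take_screenshot() -> bytes:
        # A real client would capture pixels; here the page state stands in.
        return repr(page).encode()

    def propose_action(task, screenshot, history):
        # Scripted behaviour: type first, then click, then stop.
        if len(history) == 0:
            return UIAction("type", {"text": "hello"})
        if len(history) == 1:
            return UIAction("click", {"target": "submit"})
        return UIAction("done")

    def execute_action(action):
        if action.name == "type":
            page["field"] = action.args["text"]
        elif action.name == "click":
            page["submitted"] = True

    history = run_agent_loop(
        "Submit the form", take_screenshot, propose_action, execute_action
    )
    return page, history


if __name__ == "__main__":
    final_page, actions = demo()
    print(final_page, [a.name for a in actions])
```

In a real integration, `propose_action` would wrap a call to the model and `execute_action` would drive a browser. The `max_steps` cap is a design choice of this sketch so that a task that never finishes cannot loop forever.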
Currently, the model supports 13 actions and works best with web browsers. The company says it has also built in safety measures to prevent misuse.