New Google AI can surf independently and more

Google has announced the new AI model Gemini 2.0, which the company says can generate images and audio in addition to text. Gemini 2.0 is still in “experimental preview,” as the company calls it. Only the smaller, cheaper version of the model called “Flash” will be released.

Developers can work with it via the Gemini API. Demis Hassabis, CEO of Google Deepmind, emphasizes the speed of the new model to The Verge. The performance is “a whole lot better” at the same cost, he adds.

Multimodality should also be the basis for creating AI agents. “We see 2025 as the true beginning of the agent-based era,” says Hassabis, “and Gemini 2.0 is the foundation for that.”

Mariner: An AI bot for the browser

With Jules, the company is showing an agent who is supposed to help software developers improve code. “Project Mariner” is the agent for the Chrome browser. “Mariner would behave exactly as a human user would in a browser,” said Google manager Tulsee Doshi. “He can click, tap and scroll, just like you as a user.”

Jaclyn Konzelmann, head of Google Labs, showed “Techcrunch” how the agent can fill shopping carts based on shopping lists. The system automatically connected to an online retailer, but needed about five seconds to select an individual item in the shopping cart. But there are restrictions for Mariners. He is not allowed to make any purchases without asking first and is not allowed to accept cookies.