Google unveiled the first version of its Gemini 2.0 artificial intelligence family on Wednesday, introducing the Gemini 2.0 Flash model. The new AI system is available in a chat format for users globally, while an experimental multimodal version, offering text-to-speech and image generation capabilities, is accessible to developers.
Alongside the Flash model, Google also introduced a new research prototype, Project Mariner, from its DeepMind division. This Gemini-powered agent is designed to navigate websites like a human, controlling the Chrome browser by moving the cursor, clicking buttons, and filling out forms.
Google CEO Sundar Pichai highlighted the advancements of Gemini 2.0, saying, “If Gemini 1.0 was about organizing and understanding information, Gemini 2.0 is about making it much more useful.”
The new large language model promises significant improvements over its predecessors, particularly in code generation and in delivering factually accurate responses to user queries. However, it is noted to be less effective than Gemini 1.5 Pro at evaluating longer contexts.
To use the chat-optimized version of Gemini 2.0 Flash, users can select it from the model drop-down menu on both the desktop and mobile web versions of Gemini. The company also announced plans to make it available soon in the Gemini mobile app.
The multimodal version, which includes text-to-speech and image generation features, will be accessible via Google's AI Studio and Vertex AI developer platforms. Full availability of this version is expected in January, alongside additional Gemini 2.0 model sizes. Google further revealed plans to expand Gemini 2.0 into more Google products by early 2025.
Google is positioning Gemini 2.0 to compete with other major players in AI, including Microsoft, Meta, and OpenAI, as well as startups like Perplexity and Anthropic. The new models are part of Google's efforts to develop more "agentic" AI systems: models that can understand the world around them, think ahead, and take actions on behalf of users under their supervision.
In a recent conversation at The New York Times' DealBook Summit, Pichai challenged Microsoft’s progress in AI, stating, “I’d love to do a side-by-side comparison” of the companies’ models “any day, any time.”