For developers, the next phase of the Gemini era
With state-of-the-art models, clever tools to write code more quickly, and smooth cross-platform and cross-device interoperability, Google is empowering developers to create the AI of the future. Since the release of Gemini 1.0 in December last year, millions of developers have built with Gemini in 109 languages using Google AI Studio and Vertex AI.
Today, Google is introducing new coding agents that improve workflows by acting on the developer's behalf, along with Gemini 2.0 Flash Experimental, which enables even more immersive and engaging applications.
Use Gemini 2.0 Flash to build
Building on the success of Gemini 1.5 Flash, 2.0 Flash features new multimodal outputs and native tool use, and runs at twice the speed of 1.5 Pro while delivering stronger performance. Google is also launching a Multimodal Live API for building dynamic apps with real-time audio and video streaming.
Gemini 2.0 Flash is available now as an experimental model and will be generally available early next year. Developers can test and explore it via the Gemini API in Google AI Studio and Vertex AI.
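As a starting point, a minimal call through the Gemini API might look like the sketch below. It assumes the google-genai Python SDK and the experimental model name gemini-2.0-flash-exp; adjust both to match your setup.

```python
# Minimal sketch using the google-genai Python SDK (pip install google-genai).
# The model name "gemini-2.0-flash-exp" and the API-key setup are assumptions
# based on the experimental release described above.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # Vertex AI auth also works

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="In two sentences, what makes a model multimodal?",
)
print(response.text)
```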
Gemini 2.0 Flash gives developers the ability to:
Improved performance
Gemini 2.0 Flash outperforms 1.5 Pro while still meeting developers’ expectations for speed and cost. It also shows improved performance across key benchmarks in multimodal understanding, text, code, video, spatial understanding, and reasoning. Enhanced spatial understanding enables better object identification and labeling, and more precise bounding box generation for small objects in cluttered images.
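One way to exercise the improved spatial understanding is to prompt for bounding boxes directly. The sketch below is an assumption-laden example: the JSON output shape and the 0-1000 normalized box_2d coordinates follow common Gemini spatial-understanding recipes, and "desk.jpg" is a hypothetical local image.

```python
# Hedged sketch: prompting Gemini 2.0 Flash for bounding boxes. The JSON
# shape and the 0-1000 normalized box_2d coordinates follow common Gemini
# spatial-understanding examples; "desk.jpg" is a hypothetical local image.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("desk.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Detect every pen in this image. Return a JSON list where each item "
        "has a 'label' and a 'box_2d' of [y_min, x_min, y_max, x_max] "
        "normalized to 0-1000.",
    ],
)
print(response.text)  # parse the JSON list of labeled boxes downstream
```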
Novel modes of output
With a single API call, developers will be able to use Gemini 2.0 Flash to generate integrated responses that can include text, audio, and images (a brief sketch follows the list below). These new output modalities are available to early testers, with a broader release expected next year. All audio and image outputs carry invisible SynthID watermarks, which helps reduce concerns around misinformation and misattribution.
- Multilingual native audio output: Gemini 2.0 Flash’s native text-to-speech audio output offers eight high-quality voices across a range of languages and accents, giving developers fine-grained control over not just what the model says but how it says it.
- Native image output: Gemini 2.0 Flash supports native image generation with conversational, multi-turn editing, so you can build on and refine earlier outputs. It can produce interleaved text and images, which is useful for multimodal content such as recipes.
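For early testers with access, requesting these modalities might look like the following sketch. The response_modalities and speech_config fields, and the voice name "Kore", are assumptions based on the experimental SDK and may differ in your build.

```python
# Hedged sketch of requesting non-text modalities (early-tester feature, so
# availability and exact fields may vary). The voice name "Kore" is an
# assumption drawn from the prebuilt voice list.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Interleaved text + image output from a single call.
recipe = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Write a short pancake recipe with an image for each step.",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

# Native text-to-speech output with a chosen voice.
greeting = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Say warmly: welcome to the Gemini 2.0 era!",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        ),
    ),
)
```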
Native tool use
Gemini 2.0 has been trained to use tools, a foundational capability for building agentic experiences. It can natively use Google Search and code execution, and it can invoke custom third-party functions via function calling. Using Google Search natively as a tool produces more factual, comprehensive responses while sending more traffic to publishers. Multiple searches can also run in parallel, improving information retrieval by finding more relevant facts from multiple sources and combining them for accuracy.
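A hedged sketch of both patterns is below: Google Search as a native tool in one request, and a custom Python function handled via function calling in another. The get_release_notes helper is hypothetical, and the two tools are shown in separate requests for simplicity.

```python
# Hedged sketch: Google Search as a native tool, then a custom function via
# function calling. The get_release_notes helper is hypothetical.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# 1) Grounding with Google Search for fresher, more factual answers.
grounded = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="What were the biggest AI announcements this week?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)
print(grounded.text)

# 2) Automatic function calling: the SDK can invoke a plain Python function
#    when the model requests it (hypothetical stub shown here).
def get_release_notes(version: str) -> str:
    """Return the release notes for a given product version (stub)."""
    return f"Release notes for {version}: stability fixes, faster startup."

answer = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Summarize what changed in version 2.0.",
    config=types.GenerateContentConfig(tools=[get_release_notes]),
)
print(answer.text)
```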
Multimodal Live API
Developers can now build real-time, multimodal applications with audio and video streaming input from cameras or screens. The API supports natural conversation patterns such as interruptions and voice activity detection, and it can integrate multiple tools to accomplish complex use cases within a single API call.
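A minimal text-only session might look like the sketch below, assuming the async client in the experimental google-genai SDK; the exact method names may change as the API evolves.

```python
# Hedged sketch of a text-only Multimodal Live API session using the async
# client in the experimental google-genai SDK. A real app would also stream
# mic audio or camera/screen frames upward.
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")
config = {"response_modalities": ["TEXT"]}  # or ["AUDIO"] for spoken replies

async def main() -> None:
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        await session.send(input="Hello! Can you hear me?", end_of_turn=True)
        async for message in session.receive():  # stream chunks as they arrive
            if message.text:
                print(message.text, end="")

asyncio.run(main())
```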
Toonsutra’s contextual international translation, Viggle’s virtual character creation and audio narration, Rooms’ addition of real-time audio, and tldraw’s visual playground are just a few examples of the innovative experiences that startups are creating using Gemini 2.0 Flash.
To help you start building with Gemini 2.0 Flash, Google has made three starter app experiences available in Google AI Studio, along with open-source code for video analysis, spatial understanding, and Google Maps exploration.
Advancing AI code assistance
AI code assistance is evolving rapidly, from simple code search to AI-powered assistants embedded in developer workflows. The latest development built on Gemini 2.0 is coding agents that can execute tasks on your behalf.
In recent research, Google used Gemini 2.0 Flash equipped with code execution tools to achieve 51.8% on SWE-bench Verified, a benchmark that evaluates agent performance on real-world software engineering tasks. 2.0 Flash’s state-of-the-art inference speed let the agent sample hundreds of potential solutions and select the best one using existing unit tests and Gemini’s own judgment. Google is now turning this research into new developer products.
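The sampling-and-selection idea can be illustrated with a toy sketch. This is not Google's actual agent: generate_candidate_patch and run_unit_tests are hypothetical stubs standing in for model calls and a sandboxed test runner.

```python
# Illustrative toy only, not Google's actual agent: best-of-n patch sampling
# scored against existing unit tests. Both helpers below are hypothetical.
import random
from typing import Optional

def generate_candidate_patch(issue: str) -> str:
    """Hypothetical stub: one sampled patch from the model."""
    return f"patch for {issue!r}, variant {random.randint(0, 9999)}"

def run_unit_tests(patch: str) -> float:
    """Hypothetical stub: fraction of existing unit tests that pass."""
    return random.random()

def solve_issue(issue: str, n_samples: int = 100) -> Optional[str]:
    # Fast inference makes wide sampling affordable: draw many candidates,
    # score each against the test suite, and keep the best-scoring patch.
    best_patch, best_score = None, -1.0
    for _ in range(n_samples):
        patch = generate_candidate_patch(issue)
        score = run_unit_tests(patch)
        if score > best_score:
            best_patch, best_score = patch, score
    return best_patch

print(solve_issue("off-by-one error in pagination"))
```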
Introducing Jules, your AI-powered code agent
Imagine your team has just finished a bug bash, and you’re staring at a long list of issues. Starting today, you can offload Python and JavaScript coding tasks to Jules, an experimental AI-powered code agent that uses Gemini 2.0. Working asynchronously and integrated with your GitHub workflow, Jules handles bug fixes and other time-consuming tasks while you focus on what you actually want to build. Jules creates comprehensive, multi-step plans to address issues, efficiently modifies multiple files, and drafts pull requests to land fixes directly back into GitHub.
It’s still early, but based on Google’s internal use, Jules is already helping developers in a few ways:
- Increased productivity: Hand issues and coding tasks to Jules for asynchronous coding efficiency.
- Progress tracking: Real-time updates keep you informed and let you prioritize the tasks that need your attention.
- Full developer control: Review the plans Jules creates along the way, and provide feedback or request adjustments as needed. Easily review Jules’s code and, if you choose, merge it into your project.
Jules is available today to a select group of trusted testers and will roll out to other interested developers in early 2025.
Colab’s Data Science Agent will create notebooks for you
At I/O this year, Google introduced an experimental Data Science Agent on labs.google/code that lets anyone upload a dataset and get insights in minutes, all grounded in a working Colab notebook. The impact and encouraging feedback from the developer community have been gratifying. For example, a scientist at Lawrence Berkeley National Laboratory working on a global tropical wetland methane emissions project estimated that the Data Science Agent cut their data analysis and processing time from one week to five minutes.
Colab has now begun integrating these same agentic capabilities using Gemini 2.0. Simply describe your analysis goals in plain language and watch your notebook take shape on its own, accelerating your research and data analysis. Developers can get early access by joining the trusted tester program before the feature rolls out to Colab users more broadly in the first half of 2025.
Developers are building the future
Google’s Gemini 2.0 models can help you build more powerful AI apps faster and more easily, so you can focus on creating great user experiences. In the coming months, Google will bring Gemini 2.0 to its platforms, including Android Studio, Chrome DevTools, and Firebase. Developers can also sign up to use Gemini 2.0 Flash in Gemini Code Assist for enhanced coding assistance in popular IDEs such as Visual Studio Code, IntelliJ, and PyCharm.