Gemini Diffusion
Google DeepMind has introduced a new experimental research model called Gemini Diffusion, described as a state-of-the-art text diffusion model and one of Google DeepMind’s AI projects and prototypes.
The model approaches language modelling with a technique called diffusion, which differs markedly from conventional autoregressive language models. Autoregressive models produce text sequentially, one word or token at a time; this sequential structure can limit the quality and coherence of the generated content and slow down output.
Diffusion models, by contrast, learn to generate outputs by gradually refining noise. Rather than predicting text linearly, they iterate on a noisy input until it becomes a coherent output. This iterative refinement lets them converge on a candidate solution quickly and, importantly, correct errors made during the generation process. That capacity for iterative improvement and error repair makes them especially strong at editing tasks, including those involving code and mathematics. Google DeepMind’s current state-of-the-art image and video generation models work the same way, learning to turn random noise into coherent output; Gemini Diffusion applies the idea to text and code.
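To make the contrast concrete, here is a purely illustrative toy sketch of the idea: the sequence starts as random noise, and every position is refined in parallel over several steps, so a position filled in wrongly early on can still be fixed later. The `toy_denoise` function and its nudge-toward-target "model" are invented for illustration; they are not Gemini Diffusion’s actual algorithm, in which the denoising step is learned.

```python
import random

def toy_denoise(target, steps=5, seed=0):
    """Toy illustration of diffusion-style generation (hypothetical,
    not DeepMind's method): start from pure noise and iteratively
    refine the whole sequence into a coherent output. The 'model'
    here is a stand-in that nudges positions toward a known target;
    a real diffusion model learns this denoising step from data."""
    rng = random.Random(seed)
    vocab = sorted(set(target)) + ["<noise>"]
    # Start from pure noise: every position is a random token.
    seq = [rng.choice(vocab) for _ in target]
    for step in range(steps):
        # Refine ALL positions each step (unlike left-to-right
        # autoregressive decoding), so earlier mistakes at any
        # position can still be corrected in later steps.
        for i, tok in enumerate(target):
            if rng.random() < (step + 1) / steps:
                seq[i] = tok  # denoise this position
    return seq

print("".join(toy_denoise(list("hello world"))))
```

By the final step the refinement probability reaches 1, so every position has been denoised; the parallel, whole-sequence updates are what distinguish this loop from token-by-token decoding.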
This diffusion strategy gives Gemini Diffusion the following primary capabilities:
- Rapid response: Gemini Diffusion can produce content significantly faster than even Google’s fastest model to date. The average reported sampling speed across all evaluated tasks is 1,479 tokens per second, excluding a fixed overhead of 0.84 seconds.
- More coherent text: Because it generates whole blocks of tokens at once, rather than building text token by token as autoregressive models do, Gemini Diffusion responds to a user’s prompt more coherently.
- Iterative refinement: As noted above, the model can correct errors made during generation, producing more consistent outputs.
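The reported speed figures support a rough back-of-the-envelope latency estimate: total wall-clock time is approximately the fixed overhead plus tokens divided by sampling rate. The `estimated_latency` helper below is an assumption-laden sketch built only from the two numbers quoted above, not an official latency model.

```python
def estimated_latency(n_tokens, tokens_per_sec=1479, overhead_sec=0.84):
    """Rough wall-clock estimate (hypothetical model) from the
    reported figures: 1,479 tokens/s average sampling speed
    (overhead excluded) plus a fixed 0.84 s overhead."""
    return overhead_sec + n_tokens / tokens_per_sec

# e.g. a ~1000-token response:
print(round(estimated_latency(1000), 2))  # roughly 1.52 seconds
```

Under these assumptions, even a fairly long response would complete in well under two seconds, which is consistent with the speed claims above.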
Benchmarks
Despite its speed, Gemini Diffusion’s reported external benchmark scores are on par with those of much larger models, and it is said to code as well as Google’s fastest model to date. Benchmarks comparing Gemini Diffusion with Gemini 2.0 Flash-Lite are provided across several domains:
- Code: Gemini Diffusion scores higher on LiveCodeBench (v6) (30.9% vs. 28.5%), LBPP (v2) (56.8% vs. 56.0%), and MBPP (76.0% vs. 75.8%). Gemini 2.0 Flash-Lite scores marginally higher on BigCodeBench (45.8% vs. 45.4%) and HumanEval (90.2% vs. 89.6%), and more clearly higher on SWE-Bench Verified (28.5% vs. 22.9%). Note that the SWE-Bench Verified result is based on a non-agentic evaluation (single-turn edit only) with a maximum prompt length of 32K.
- Science: On GPQA Diamond, Gemini 2.0 Flash-Lite performs noticeably better than Gemini Diffusion (56.5% vs. 40.4%).
- Mathematics: Gemini Diffusion scores higher on AIME 2025 (23.3% vs. 20.0%).
- Reasoning: Gemini 2.0 Flash-Lite performs better on BIG-Bench Extra Hard (21.0% vs. 15.0%).
- Multilingual: Gemini 2.0 Flash-Lite scores higher on Global MMLU (Lite) (79.0% vs. 69.1%). Per the benchmark methodology, all results are pass@1 with no majority voting; the Gemini 2.0 Flash-Lite comparison runs used the AI Studio API with model ID gemini-2.0-flash-lite and default sampling settings.
Category | Benchmark | Gemini Diffusion | Gemini 2.0 Flash-Lite |
---|---|---|---|
Code | LiveCodeBench (v6) | 30.9% | 28.5% |
Code | BigCodeBench | 45.4% | 45.8% |
Code | LBPP (v2) | 56.8% | 56.0% |
Code | SWE-Bench Verified | 22.9% | 28.5% |
Code | HumanEval | 89.6% | 90.2% |
Code | MBPP | 76.0% | 75.8% |
Science | GPQA Diamond | 40.4% | 56.5% |
Mathematics | AIME 2025 | 23.3% | 20.0% |
Reasoning | BIG-Bench Extra Hard | 15.0% | 21.0% |
Multilingual | Global MMLU (Lite) | 69.1% | 79.0% |
Gemini Diffusion is currently available as an experimental demo, which is being used to help develop and improve future models. Anyone interested in accessing the demo can join the waitlist.
Overall, Gemini Diffusion’s exploration of diffusion for text generation is presented as a way to give users greater control, creativity, and speed when generating text. Diffusion is one of several approaches Google DeepMind continues to pursue to make its models more capable and efficient.