Google today unveiled Gemma, a line of cutting-edge, lightweight open models developed using the same science and technology as the Gemini models. They are happy to announce that Google Cloud users can begin utilizing Gemma open models in Vertex AI for customization and building, as well as for deployment on Google Kubernetes Engine (GKE), right now. Google’s next step in making AI more transparent and available to developers on Google Cloud is the launch of Gemma and their enhanced platform capabilities.
Gemma displays models
The Gemma family of open models is composed of lightweight, cutting-edge models that are constructed using the same technology and research as the Gemini models. Gemma, which was created by Google DeepMind and various other Google teams, was named after the Latin gemma, which means “precious stone.” Gemma was inspired by Gemini AI. Along with their model weights, Google is also making available resources to encourage developer creativity, promote teamwork, and direct the ethical use of Gemma models.
Gemma is currently accessible via Google Cloud
Google-capable Gemini models and Gemma models share infrastructure and technical components. When compared to other open models, this allows Gemma models to achieve best-in-class performance for their sizes. Gemma 2B and Gemma 7B are the two sizes of weights that they are releasing. To facilitate research and development, pre-trained and instruction-tuned variants of each size are released.
Gemma supports frameworks like JAX, PyTorch, Keras 3.0, Hugging Face Transformers, and Colab and Kaggle notebooks tools that Google Cloud developers love and use today. Gemma open models can be used on Google Cloud, a workstation, or a laptop. Developers can now work with and customize Vertex AI and run it on GKE thanks to these new open models. They have worked with NVIDIA to optimize Gemma for NVIDIA GPUs to maximize industry-leading performance.
Gemma is now accessible everywhere in the world. What you should know in particular is this:
- The Gemma 2B and Gemma 7B model weights are the two sizes that they are releasing. There are trained and fine-tuned versions available for every size.
- Using Gemma, a new Responsible Generative AI Toolkit offers instructions and necessary resources for developing safer AI applications.
- Google is offering native Keras 3.0 toolchains for inference and supervised fine-tuning (SFT) across all major frameworks, including PyTorch, TensorFlow, and JAX.
- Gemma is simple to get started with thanks to pre-configured Colab and Kaggle notebooks and integration with well-known programs like Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM.
- Pre-trained and fine-tuned Gemma open models can be easily deployed on Vertex AI and Google Kubernetes Engine (GKE) and run on your laptop, workstation, or Google Cloud.
- Industry-leading performance is ensured through optimization across multiple AI hardware platforms, such as NVIDIA GPUs and Google Cloud TPUs.
- All organizations, regardless of size, are permitted to use and distribute the terms of use in a responsible commercial manner.
Unlocking Gemma’s potential in Vertex AI
Gemma has joined more than 130 models in the Vertex AI Model Garden, which now includes the Gemini 1.0 Pro, 1.0 Ultra, and 1.5 Pro models they recently announced expanded access to Gemini.
Developers can benefit from an end-to-end ML platform that makes managing, tuning, and monitoring models easy and intuitive by utilizing Gemma open models on Vertex AI. By utilizing Vertex AI, builders can lower operational costs and concentrate on developing customized Gemma versions that are tailored to their specific use cases.
For instance, developers can do the following with Vertex AI’s Gemma open models:
- Create generative AI applications for simple tasks like Q&A, text generation, and summarization.
- Utilize lightweight, customized models to facilitate research and development through experimentation and exploration.
- Encourage low-latency real-time generative AI use cases, like streaming text
- Developers can easily transform their customized models into scalable endpoints that support AI applications of any size with Vertex AI’s assistance.
Utilize Gemma open models on GKE to scale from prototype to production
GKE offers resources for developing unique applications, from basic project prototyping to large-scale enterprise deployment. Developers can now use Gemma to build their generation AI applications for testing model capabilities or constructing prototypes by deploying it directly on GKE:
- Use recognizable toolchains to deploy personalized, optimized models alongside apps in portable containers.
- Adapt infrastructure configurations and model serving without having to provision or maintain nodes.
- Quickly integrate AI infrastructure that can grow to accommodate even the most complex training and inference scenarios.
GKE offers autoscaling, reliable operations environments, and effective resource management. Furthermore, it facilitates the easy orchestration of Google Cloud AI accelerators, such as GPUs and TPUs, to improve these environments and speed up training and inference for generative AI model construction.
Cutting-edge performance at scale
The infrastructure and technical components of Gemma open models are shared with Gemini, their most powerful AI model that is currently accessible to the public. In comparison to other open models, this allows Gemma 2B and 7B to achieve best-in-class performance for their sizes. Additionally, Gemma open models can be used directly on a desktop or laptop computer used by developers. Interestingly, Gemma meets their strict requirements for responsible and safe outputs while outperforming noticeably larger models on important benchmarks. For specifics on modeling techniques, dataset composition, and performance, consult the technical report.
At Google, we think AI should benefit everyone. Google has a long history of developing innovations and releasing them to the public, including JAX, AlphaFold, AlphaCode, Transformers, TensorFlow, BERT, and T5. They are thrilled to present a fresh batch of Google open models today to help researchers and developers create ethical AI.
Begin using Gemma on Google Cloud right now
Working with Gemma models in Vertex AI and GKE on Google Cloud is now possible. Visit ai.google.dev/gemma to access quickstart guides and additional information about Gemma.