Saturday, July 20, 2024

Google Gemini AI: Google’s largest and most capable AI model

By Demis Hassabis, CEO and Co-Founder of Google DeepMind, on behalf of the Gemini team

As with many of my research colleagues, AI has been the main focus of my life’s work. Ever since I began programming AI for computer games as a teenager, and throughout my years as a neuroscience researcher trying to understand how the brain works, I have believed that if we could build smarter machines, we could use them to benefit humanity enormously.

At Google DeepMind, our work is driven by the promise of a world responsibly empowered by AI. We have long wanted to build a new generation of AI models inspired by the way people understand and interact with the world: AI that feels less like a clever piece of software and more like a knowledgeable helper or assistant.

Today we’re a step closer to this goal with the release of Gemini, our most capable and versatile model to date.
Gemini is the result of large-scale collaboration by teams across Google, including our colleagues at Google Research. Because it was built from the ground up to be multimodal, it can understand, operate across, and combine different types of information, including text, code, audio, images, and video.

Watch the video: Gemini, Google’s newest and most capable AI model

Google Gemini AI Types

Gemini is also our most flexible model to date, able to run efficiently on everything from data centers to mobile phones. Its state-of-the-art capabilities will significantly improve how developers and enterprise customers build and scale with AI.
Our first version, Gemini 1.0, has been optimized for three different sizes:

Gemini Ultra: our largest and most capable model, for highly complex tasks;

Gemini Pro: our best model for scaling across a wide range of tasks;

Gemini Nano: our most efficient model, for on-device tasks.

Cutting-edge performance

We have tested our Gemini models rigorously and evaluated their performance on a wide variety of tasks. From natural image, audio, and video understanding to mathematical reasoning, Gemini Ultra exceeds current state-of-the-art results on 30 of the 32 academic benchmarks widely used in large language model (LLM) research and development.

With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which tests both world knowledge and problem-solving ability across 57 subjects, including math, physics, history, law, medicine, and ethics.

Our new benchmark approach to MMLU lets Gemini use its reasoning capabilities to think more carefully before answering difficult questions, which yields significant improvements over relying on its first impression alone.
Gemini Ultra also achieves a state-of-the-art score of 59.4% on the recently introduced MMMU benchmark, which consists of multimodal tasks spanning different domains that require deliberate reasoning.

Gemini AI vs GPT-4

[Image: Gemini AI vs GPT-4 benchmark comparison. Image credit: Google]

On the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models without help from optical character recognition (OCR) systems, which extract text from images for further processing. These benchmarks highlight Gemini’s native multimodality and show early signs of its capacity for more complex reasoning.

Gemini AI vs GPT-4V

[Image: Gemini AI vs GPT-4V benchmark comparison. Image credit: Google]

Enabling global access to Gemini AI

Gemini 1.0 is currently being released on several platforms and products:

Gemini Pro in Google products

Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding, and other tasks. This is the biggest upgrade to Bard since its launch. It will be available in English in more than 170 countries and territories, and we plan to expand to new modalities and to support more languages and locations soon.

Gemini is also coming to Pixel. The Pixel 8 Pro is the first smartphone designed to run Gemini Nano, which powers new features such as Summarize in the Recorder app and is rolling out in Smart Reply in Gboard, starting with WhatsApp, with more messaging apps coming next year.

Gemini will be accessible across more of our products and services, including Search, Ads, Chrome, and Duet AI, in the upcoming months.

We have already begun testing Gemini in Search, where it is improving quality and reducing latency by 40% in English in the United States, thereby speeding up our Search Generative Experience (SGE) for users.

Building with Gemini AI

Starting December 13, developers and enterprise customers can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI.

Google AI Studio is a free, web-based tool that lets developers quickly prototype and launch apps with an API key. When the time comes for a fully managed AI platform, Vertex AI allows customization of Gemini with full data control and adds Google Cloud capabilities for enterprise security, safety, privacy, and data governance and compliance.
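
As a rough illustration of what calling the Gemini API with an API key might look like, the sketch below builds a text-generation request using only the Python standard library. It is not taken from the announcement: the endpoint path, the model name gemini-pro, the x-goog-api-key header, and the request and response shapes are assumptions based on the public REST interface and should be checked against the current Gemini API documentation. The helper names build_generate_request and extract_text are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint and model name; verify both against the current API docs.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_generate_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{MODEL}:generateContent"
    # Assumed request shape: a list of contents, each holding text parts.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # assumed auth header for API keys
        },
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a parsed response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```

To send the request, pass the object returned by build_generate_request to urllib.request.urlopen with a real API key, parse the body with json.load, and hand the result to extract_text. Building the request separately from sending it keeps the sketch easy to inspect without network access.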

Android developers will also be able to build with Gemini Nano, our most efficient model for on-device tasks, through AICore, a new system capability in Android 14, starting on Pixel 8 Pro devices. Sign up for an early preview of AICore.

Coming soon: Gemini Ultra

Before releasing Gemini Ultra to the public, we’re completing extensive trust and safety checks, including red-teaming by trusted external parties. We’re also further refining the model through fine-tuning and reinforcement learning from human feedback (RLHF).

As part of this process, we will first make Gemini Ultra available to a select group of customers, developers, partners, and safety and responsibility experts for early experimentation and feedback. We will then roll it out to developers and enterprise customers early next year.

Early next year, we’ll also introduce Bard Advanced, a new state-of-the-art AI experience that gives you access to our best models and capabilities, starting with Gemini Ultra.

The Gemini AI era: opening the door to an innovative future

The release of Gemini is a significant milestone in the development of AI and the start of a new era for Google as we continue to advance the capabilities of our models quickly and responsibly.

Gemini has come a long way, but we still have a long way to go. For future versions, we’re working hard to improve its planning and memory and to enlarge the context window so that it can process even more information and give better responses.

We’re excited by the prospect of a world responsibly empowered by AI. It’s a future of innovation that will enhance creativity, extend knowledge, advance research, and transform the way billions of people live and work around the world.

Agarapu Ramesh is the founder of Govindhtech and a computer hardware enthusiast with an interest in writing tech news articles. He has worked as an editor at Govindhtech for one year and previously worked as a computer assembling technician at G Traders in India, starting in 2018. He holds an MSc.

