Unlock large-scale multimodal search: combining the power of text and images with Vertex AI. Users' search habits are changing. When looking for a product, people may use images or natural-sounding text, and they expect personalised, query-specific results in return. Developers need robust multimodal search systems to meet these expectations.
In this post, we'll walk through a powerful method for building a multimodal search engine on Google Cloud's Vertex AI platform. Using an ensemble approach with weighted Reciprocal Rank Fusion (RRF), we'll combine the strengths of Vertex AI Vector Search with Vertex AI Search. This method enables:
- A better experience for users: Finding the “perfect” keywords is less important as searching becomes more natural.
- Improved product discovery: Users can find products they might not have found with keywords alone.
- Increased conversion rates: More relevant and engaging search results lead to higher revenue and happier customers.
Why a combined approach matters
Consider how you search for things online. Say you're looking for "white marble countertops" or "homes with a large backyard." Some of this information may be available only in images, while other details are stored as text. When you search for a product, you want the system to use both modalities.
One strategy might be to ask a large language model (LLM) to produce a written description of each image. However, this can add latency for your customers and be difficult to maintain over time. Instead, you can use image embeddings directly and merge those results with text-based results from Vertex AI Search. Combined, this multimodal strategy yields:
- Richer visual comprehension: Multimodal embeddings go beyond basic text annotations to capture the complex visual relationships and elements in images.
- Image-based queries: Letting users search directly with an image enables more natural discovery based on visual cues.
- Precise filtering: Filtering by specific parameters such as size, layout, materials, and features makes the search highly precise and produces tailored results.
The Vertex AI platform from Google Cloud offers a full suite of resources for creating and implementing machine learning solutions, including robust search features:
- Vertex AI Search: A highly scalable, feature-rich search engine for a wide variety of queries. It supports advanced features including synonyms, faceting, filtering, and custom relevance ranking, and it provides advanced document parsing, notably for unstructured documents (PDFs), including those with embedded visuals such as tables and infographics.
- Vertex AI multimodal embeddings API: Generates image embeddings, numerical representations of images.
- Vertex AI Vector Search: A vector database that stores embeddings alongside searchable metadata. It can hold both dense and sparse embeddings for content such as text descriptions and images.
An ensemble strategy: the power of text and images
To build the multimodal search engine, we'll use an ensemble technique that combines the strengths of Vertex AI Search for text with Vertex AI Vector Search for images:
Vertex AI Search for text search
- Using Vertex AI Agent Builder, index your product catalogue data (names, descriptions, and attributes) into a data store.
- When a user enters a text query, Vertex AI Search uses semantic understanding, keyword matching, and any custom ranking rules you specify to return relevant products.
- It can also return facets that can be used for filtering.
- You can even inspect how complex or unstructured documents are chunked and parsed.
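As a rough sketch, a text query against Vertex AI Search can be assembled as a JSON request body. The field names below mirror the Discovery Engine REST search API (`query`, `pageSize`, `facetSpecs`), but treat the exact shape as an assumption and verify it against the current API reference:

```python
def build_text_search_request(query, page_size=10, facet_fields=()):
    """Assemble an illustrative Vertex AI Search request body.

    The field names follow the Discovery Engine REST search body;
    check the current API reference before relying on this shape.
    """
    body = {"query": query, "pageSize": page_size}
    if facet_fields:
        # Request facet counts for each listed field so the UI
        # can offer filters alongside the results.
        body["facetSpecs"] = [{"facetKey": {"key": f}} for f in facet_fields]
    return body


# Example: a query over a product catalogue, faceted by material.
request = build_text_search_request("white marble countertops",
                                    facet_fields=["material"])
```

The returned dictionary would then be posted to your data store's serving config endpoint using your preferred HTTP or client library.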
Using vector embeddings for image search
- Use the multimodal embeddings API to generate image embeddings for your products.
- Store these embeddings in Vertex AI Vector Search.
- To find visually similar product images, convert user-uploaded text or images into embeddings and query the vector database.
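The similarity lookup in the steps above can be illustrated with plain cosine similarity. In production, the embeddings would come from the multimodal embeddings API and the nearest-neighbour search would be handled by Vertex AI Vector Search at scale; this is a minimal sketch with toy vectors:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_products(query_embedding, product_embeddings, top_k=2):
    # product_embeddings maps product IDs to (toy) image embeddings;
    # rank products by similarity to the query embedding, best first.
    ranked = sorted(product_embeddings.items(),
                    key=lambda item: cosine_similarity(query_embedding, item[1]),
                    reverse=True)
    return [product_id for product_id, _ in ranked[:top_k]]
```

For example, with a query embedding of `[1, 0, 0]` and products whose embeddings point in similar directions, the most visually similar product IDs are returned first.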
Using weighted RRF to combine results
- Reciprocal Rank Fusion (RRF): A technique for merging multiple ranked lists. Each item is scored by the reciprocal of its rank in each list (typically 1/(k + rank)), so items that rank highly across lists rise to the top of the fused result.
- Weighted RRF: Assign weights to the image similarity results (from Vertex AI Vector Search) and the text relevance results (from Vertex AI Search). This lets you tune how much each modality contributes to the final ranking.
- Ensemble: Combine the text and image search results, rerank them by their weighted RRF scores, and present the blended list to the user.
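The fusion step above can be sketched in a few lines. This is a minimal weighted-RRF implementation, assuming each retriever returns an ordered list of product IDs and using the conventional smoothing constant k = 60:

```python
def weighted_rrf(ranked_lists, weights, k=60):
    """Fuse ranked result lists with weighted Reciprocal Rank Fusion.

    ranked_lists: one ranked list of IDs per retriever (best first),
                  e.g. [text_results, image_results].
    weights:      one weight per list, controlling each modality's
                  contribution to the final ranking.
    k:            smoothing constant from the standard RRF formula.
    """
    scores = {}
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each appearance adds weight / (k + rank) to the item's score.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Example: favour text relevance (0.7) over image similarity (0.3).
text_results = ["p1", "p2", "p3"]
image_results = ["p3", "p1", "p4"]
fused = weighted_rrf([text_results, image_results], [0.7, 0.3])
```

Here `p1` wins because it ranks well in both lists, while `p3` (first in image search but third in text search) lands second.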

Use the faceting features of Vertex AI Agent Builder Search to improve the search experience:
- Define the facets: Based on your product data, create facets for categories, attributes (colour, size, material), price ranges, and so on.
- Dynamic filtering: Let users refine their searches with these facets to narrow the results to the most relevant products. "Dynamic" means the filters adjust automatically based on the results that are returned.
- Natural language query understanding: If your textual data is structured, you can enable natural-language query understanding in Vertex AI Agent Builder search to improve results. You can then parse the filters out of the response and apply the same filters to Vertex AI Vector Search using namespaces.
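Once filters have been parsed from the search response, they can be mapped onto Vector Search's namespace-based restricts. The helper below is a sketch: the restrict shape (a namespace plus an allow list) follows Vector Search's documented filtering model, but the field names in the example are hypothetical:

```python
def to_restricts(parsed_filters):
    """Map {field: [allowed values]} onto namespace-style restricts
    for Vector Search filtering (shape is an approximation; check
    the current Vector Search filtering documentation).
    """
    return [{"namespace": field, "allow_list": sorted(values)}
            for field, values in parsed_filters.items()]

# Example: filters recovered from a natural-language query such as
# "white marble countertops" (field names are illustrative).
restricts = to_restricts({"material": ["marble"], "color": ["white"]})
```

Attaching these restricts to the vector query keeps the image results consistent with the filters already applied on the text side.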
Why this strategy is effective
This method gives developers the best of both worlds: the rich capabilities of Vertex AI Search (such as its parsing pipeline) combined with the ability to use images directly as queries. It is also flexible and customisable, since you can adjust the weights in your RRF ensemble and tune individual components to meet your unique requirements.
Most importantly, this approach lets your users search seamlessly with text, images, or both, together with dynamic filtering options for more focused results.
Start using multimodal search now
By combining text and image search with a robust ensemble approach on Vertex AI, you can create a highly effective and engaging search experience for your customers. Begin by:
Explore Vertex AI
Review the documentation to learn more about the capabilities of Vertex AI Search and embedding generation.
Try out different embeddings
Experiment with several image embedding models and fine-tune them on your data as necessary.
Implement weighted RRF
Design your scoring algorithm and test different weights to optimise your search results.
Natural language query understanding
Use the built-in features of Vertex AI Agent Builder search to create filters on structured data, then apply the same filters to Vertex AI Vector Search.
Vector search filters
Apply filters to your image embeddings to give users even more flexibility.