Friday, February 7, 2025

Tailoring an AI-Powered Hybrid Search Engine with Google Cloud Spanner

From online shopping to locating important information, search is central to how people interact with the digital world. With the advent of generative AI, user expectations have never been higher.

Regardless of how queries are phrased, applications must deliver fast, precise, and contextually relevant responses to satisfy a wide range of user needs. For instance:

  • Online shoppers expect to find “waterproof hiking boots with ankle support” as easily as a specific “SummitEdge Pro” model.
  • Attorneys must use a variety of search phrases to find specific case citations or to investigate complex legal concepts.
  • Doctors need precision when looking up vital patient data. A doctor searching for “allergy to penicillin” must find the record whether it is misspelled as “peniciln” or labeled as “drug sensitivities.”

Spanner, Google’s always-on, multi-model database with virtually unlimited scale, tackles these challenges with AI-powered hybrid search capabilities.

Using a familiar SQL interface, Spanner lets developers integrate full-text search, vector search, and machine learning (ML) model reranking into a single platform that lives directly alongside the operational data store.

Building a tailored search engine on Spanner

A single search strategy frequently falls short in e-commerce, as it does in many other industries, leading to frustrated customers, missed information, or lost sales. Keyword search excels at precision but struggles with natural language or alternative wording; vector search captures semantics but can miss exact terms.

Companies can deliver a more effective search experience by combining the strengths of both. Consider SpanMart, a fictitious e-commerce marketplace where users can search for products using keywords or natural language. With two specialized columns and their associated indexes, the products table supports both search techniques:

  • The description column is tokenized into a description_tokens column, which breaks the content into individual terms. A search index on this column (products_by_description) acts as an inverted index, speeding up full-text retrieval.
  • An embedding column captures semantic meaning by storing vector representations of the product descriptions rather than individual words. In this “embedding space,” similar descriptions map close together. These embeddings are produced by models such as Vertex AI Embeddings. A vector index (products_by_embedding) organizes them in a ScaNN tree structure for efficient semantic search.

Here’s how the products table and its indexes are defined in Spanner:

CREATE TABLE products (
  id INT64,
  description STRING(MAX),
  description_tokens TOKENLIST AS (TOKENIZE_FULLTEXT(description)) HIDDEN,
  embedding ARRAY<FLOAT32>(vector_length=>768),
) PRIMARY KEY(id);

CREATE SEARCH INDEX products_by_description ON products(description_tokens);
CREATE VECTOR INDEX products_by_embedding ON products(embedding)
  WHERE embedding IS NOT NULL
  OPTIONS(distance_type="COSINE", num_leaves=1000);

With these elements in place, SpanMart can build a sophisticated search pipeline that combines:

  • Vector search for semantic relevance.
  • Full-text search for exact keyword matching.
  • Result fusion, which merges results from multiple retrieval techniques.
  • ML model reranking for more sophisticated result refinement.

This pipeline runs entirely inside Spanner, where the operational data already lives. By avoiding integrations with separate search engines or vector databases, Spanner removes the need for multiple technology stacks, extensive application logic, and complex ETL pipelines for inter-system communication.

This reduces architectural and operational overhead and avoids potential performance inefficiencies.

A high-level overview of how these elements interact in Spanner is shown in the diagram below.

Image credit to Google Cloud

Combining the power of vector and full-text search

When a user searches for products on SpanMart, the system first uses the embedding model to transform the query into a vector that captures its semantic meaning. SpanMart can then run two queries:

Approximate nearest neighbors (ANN) vector search:

SELECT id, description
FROM products @{FORCE_INDEX=products_by_embedding}
WHERE embedding IS NOT NULL
ORDER BY APPROX_COSINE_DISTANCE(embedding, @vector,
  OPTIONS=>JSON'{"num_leaves_to_search": 10}')
LIMIT 200;
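The cosine distance that APPROX_COSINE_DISTANCE approximates can be sketched in a few lines of Python. This is illustrative only: Spanner evaluates it efficiently over the ScaNN-based vector index rather than row by row.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

# Vectors pointing the same way have distance 0; orthogonal vectors, distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))  # 1.0
```

Because cosine distance depends only on direction, descriptions with similar meaning end up close together in the embedding space regardless of vector magnitude.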

Full-text search (FTS):

SELECT id, description
FROM products
WHERE SEARCH(description_tokens, @query)
ORDER BY SCORE(description_tokens, @query) DESC
LIMIT 200;
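Conceptually, the search index behind this query is an inverted index mapping tokens to the rows that contain them. The toy Python sketch below uses a naive whitespace tokenizer as a stand-in; Spanner's TOKENIZE_FULLTEXT does far more (language handling, scoring inputs, and so on).

```python
from collections import defaultdict

def tokenize(text):
    # Naive lowercase/whitespace tokenizer; a stand-in for TOKENIZE_FULLTEXT.
    return text.lower().split()

def build_inverted_index(rows):
    """Map each token to the set of row ids whose description contains it."""
    index = defaultdict(set)
    for row_id, description in rows.items():
        for token in tokenize(description):
            index[token].add(row_id)
    return index

rows = {1: "waterproof hiking boots", 2: "SummitEdge Pro hiking boots",
        3: "rain jacket"}
index = build_inverted_index(rows)
# An all-terms match for "hiking boots": intersect the posting lists.
matches = set.intersection(*(index[t] for t in tokenize("hiking boots")))
print(sorted(matches))  # [1, 2]
```

The inverted index is what makes keyword lookups fast: each query term jumps straight to its posting list instead of scanning every description.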

These two queries complement each other and perform well in different contexts. For example, when a user searches for a specific product model number, such as “Supercar T-6468,” the full-text search query can pinpoint that exact model, while the vector search query surfaces related products.

Conversely, full-text search may struggle to produce helpful results for more complex natural language queries, such as “gift for an 8-year-old who enjoys logical reasoning but not a toy,” where vector search can surface relevant ideas. Combining the two queries yields reliable results for both kinds of searches.

Reciprocal rank fusion (RRF)

RRF is a simple yet powerful method for merging search results from several queries. It computes a relevance score for each item based on its position in every result set, rewarding records that rank highly in individual searches.

This approach is especially helpful when the relevance scores from the different searches are computed on incomparable scales, making direct comparison difficult. RRF addresses this by relying on relative ranks within each result set rather than on the scores themselves.

In this case, RRF works as follows:

  • Compute rank reciprocals: For each result set, take the inverse of the product’s rank after adding a constant (such as 60). This constant keeps top-ranked products from dominating the final score while still letting lower-ranked items contribute meaningfully. A product ranked fifth in one result set, for example, has a rank reciprocal of 1/(5 + 60) = 1/65 in that set.
  • Sum rank reciprocals: A product’s final RRF score is the sum of its rank reciprocals across all result sets.

The formula for RRF is:

RRF(d) = Σ_{r ∈ R} 1 / (k + rank_r(d))

where:
  • d is a product description.
  • R is the set of retrievers; in this case, the two search queries.
  • rank_r(d) is the rank of product description d in retriever r’s results.
  • k is a constant.
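The RRF calculation described above can be sketched in plain Python. The rankings here are hypothetical, and k = 60 matches the constant used in the example.

```python
def rrf_scores(result_sets, k=60):
    """Sum 1 / (k + rank) for each item across all ranked result sets.

    result_sets: list of ranked lists of item ids, best result first.
    """
    scores = {}
    for results in result_sets:
        for rank, item in enumerate(results, start=1):  # ranks start at 1
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return scores

ann = ["p7", "p3", "p9"]   # hypothetical vector-search ranking
fts = ["p7", "p5", "p3"]   # hypothetical full-text ranking
scores = rrf_scores([ann, fts])
best = max(scores, key=scores.get)
print(best)  # p7  (ranked first in both sets: 1/61 + 1/61)
```

Note that only positions matter: the raw distance and SCORE() values from the two queries never enter the calculation, which is exactly why RRF can fuse results from incomparable scoring schemes.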

Implementing RRF within Spanner’s SQL interface is relatively straightforward. Here’s how:

@{optimizer_version=7}
WITH ann AS (
SELECT offset + 1 AS rank, id, description
FROM UNNEST(ARRAY(
SELECT AS STRUCT id, description
FROM products @{FORCE_INDEX=products_by_embedding}
WHERE embedding IS NOT NULL
ORDER BY APPROX_COSINE_DISTANCE(embedding, @vector,
OPTIONS=>JSON'{"num_leaves_to_search": 10}')
LIMIT 200)) WITH OFFSET AS offset
),
fts AS (
SELECT offset + 1 AS rank, id, description
FROM UNNEST(ARRAY(
SELECT AS STRUCT id, description
FROM products
WHERE SEARCH(description_tokens, @query)
ORDER BY SCORE(description_tokens, @query) DESC
LIMIT 200)) WITH OFFSET AS offset
)
SELECT SUM(1 / (60 + rank)) AS rrf_score, id, ANY_VALUE(description) AS description
FROM ((
SELECT rank, id, description
FROM ann
)
UNION ALL (
SELECT rank, id, description
FROM fts
))
GROUP BY id
ORDER BY rrf_score DESC
LIMIT 50;

Explanations

  • Common table expressions (CTEs): The WITH clauses make the query easier to read. Due to a current limitation, however, they can cause the query optimizer to fall back to an earlier version that does not support full-text search. For now, the @{optimizer_version=7} hint tells the query to use a more recent optimizer version.
  • ANN CTE: This query is identical to the earlier ANN query, with one twist: every product in the results is assigned a rank. Spanner does not offer a direct way to assign ranks, but there is a workaround. Converting the results into an array of structs lets the query use each element’s offset within the array as its rank; because array offsets start at zero, offset + 1 represents the actual rank. Note that this has no performance impact. It is purely a SQL-language workaround: the query planner assigns an offset to each row in the result set directly, effectively eliminating the array conversion.
  • FTS CTE: Similarly, this section assigns ranks via the array offset, mirroring the earlier full-text search query.
  • Fusing and ranking: The results of the two CTEs are unioned and grouped by product ID. The query computes the rrf_score for every product and then selects the top 50.

Although RRF works well, Spanner’s flexible SQL interface lets application developers experiment with a variety of other result-fusion techniques. For example, developers might normalize scores from the different searches to a common range and then combine them with a weighted sum, assigning a different priority to each search technique.

Because of this flexibility, developers may customise the search experience to meet the needs of particular applications and maintain fine-grained control over it.
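As a sketch of that alternative, hypothetical per-retriever scores can be min-max normalized to [0, 1] and combined with illustrative weights. The item names, score values, and weights below are all assumptions for demonstration.

```python
def minmax_normalize(scores):
    """Scale a {item: score} dict to [0, 1]; constant score sets map to 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}

def weighted_fusion(score_sets, weights):
    """Combine several {item: score} dicts using per-retriever weights."""
    fused = {}
    for scores, weight in zip(score_sets, weights):
        for item, s in minmax_normalize(scores).items():
            fused[item] = fused.get(item, 0.0) + weight * s
    return fused

vector_scores = {"p1": 0.91, "p2": 0.74, "p3": 0.55}  # e.g. cosine similarity
fts_scores = {"p2": 12.0, "p3": 9.5, "p4": 3.0}       # e.g. SCORE() output
fused = weighted_fusion([vector_scores, fts_scores], weights=[0.6, 0.4])
print(max(fused, key=fused.get))  # p2
```

Unlike RRF, this scheme does use the raw scores, so the weights let one retrieval method deliberately dominate the other when an application calls for it.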

Using an ML model to rerank search results 

ML model-based reranking is a powerful technique for refining search results and delivering better outcomes to users. As noted earlier, it applies a sophisticated but computationally expensive model to a limited number of initial candidates retrieved via techniques like vector search, full-text search, or their combination.

Because of this high computational cost, ML model-based reranking is applied only after the initial retrieval has narrowed the result set down to a small number of candidates.

With Spanner’s integration with Vertex AI, ML model-based reranking can be done directly inside the database. You can use a model from the Vertex AI Model Garden or one deployed to your own Vertex AI endpoint. Once the model is deployed, you can create a matching reranker MODEL in Spanner.

In this example, SpanMart uses a Cross-Encoder model for reranking. The model takes two text inputs, text and text_pair, and returns a relevance score indicating how well the two texts match.

Unlike vector search, which uses an embedding model to map each text individually into a fixed-dimensional space before computing similarity, a Cross-Encoder analyzes the two texts together.

This lets the Cross-Encoder capture more contextual and semantic nuance in complex queries such as “gift for an 8-year-old who enjoys logical reasoning but not a toy.” For an even richer search experience, a more sophisticated setup could use a custom-trained model that incorporates additional signals such as product reviews, promotions, and user-specific information like browsing and purchase history.

After defining this model in Spanner, reranking can be added on top of the earlier search results by extending the RRF query with an ML.PREDICT stage.

Explanations

  • ANN, FTS, and RRF CTEs: These are the same approximate nearest neighbor, full-text search, and reciprocal rank fusion CTEs defined earlier.
  • ML.PREDICT reranking: This stage applies the reranker model to the RRF results, passing each product description as text and the search query as text_pair. The model assigns every product a relevance score; the products are then ordered by these scores, and the top 10 are selected.
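The reranking stage can be sketched in Python. The cross_encoder_score function below is a hypothetical stand-in for the deployed Cross-Encoder (in Spanner, the real model is invoked via ML.PREDICT); the candidate list and query are likewise illustrative.

```python
def cross_encoder_score(text, text_pair):
    """Hypothetical relevance scorer standing in for the Cross-Encoder model.

    Here: the fraction of query terms appearing in the description.
    """
    terms = text_pair.lower().split()
    return sum(t in text.lower() for t in terms) / len(terms)

def rerank(candidates, query, top_n=10):
    """Apply the (expensive) model only to the fused candidates; keep top_n."""
    scored = [(cross_encoder_score(desc, query), pid, desc)
              for pid, desc in candidates]
    scored.sort(key=lambda x: x[0], reverse=True)
    return [(pid, desc) for _, pid, desc in scored[:top_n]]

candidates = [(3, "logic puzzle board game"), (8, "plush toy bear"),
              (5, "beginner logic workbook")]
print(rerank(candidates, "logic game for kids", top_n=2))
```

The key design point mirrors the SQL pipeline: the scorer sees the query and each description together, but only for the small candidate set produced by the fusion step, keeping the expensive model off the full catalog.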
Drakshi
Since June 2023, Drakshi has been writing articles on Artificial Intelligence for Govindhtech. She holds a postgraduate degree in business administration and is an Artificial Intelligence enthusiast.