Friday, March 28, 2025

ScaNN For AlloyDB: Scalable PostgreSQL Vector Search


Many customers use AlloyDB for PostgreSQL to perform vector search over 100 million to more than a billion vectors, supporting complex semantic search and generative AI use cases. When they need a large vector search index that works with the rest of their operational database, the pgvector HNSW graph index from the Postgres open-source community is often the first thing they reach for. pgvector extends PostgreSQL's SQL language, allowing SQL queries to combine joins, filters, and vector search, a crucial combination for modern applications.
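The kind of query pgvector enables, filtering rows with ordinary SQL predicates while ranking them by vector distance, can be sketched outside the database in a few lines of NumPy. This is a minimal brute-force stand-in, not pgvector itself; the table layout, category filter, and `k` are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: 1,000 embeddings with a category label per row, standing in
# for a table with an embedding column plus ordinary relational columns.
embeddings = rng.standard_normal((1000, 8)).astype(np.float32)
categories = rng.integers(0, 5, size=1000)

def filtered_knn(query, category, k=5):
    """Brute-force nearest neighbors restricted to rows matching a filter,
    mimicking `WHERE category = ... ORDER BY embedding <-> q LIMIT k`."""
    candidates = np.flatnonzero(categories == category)
    dists = np.linalg.norm(embeddings[candidates] - query, axis=1)
    order = np.argsort(dists)[:k]
    return candidates[order], dists[order]
```

An index like pgvector HNSW or ScaNN for AlloyDB replaces the exhaustive distance scan above with an approximate search, which is what makes the combination practical at scale.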

Since 2023, AlloyDB has supported the popular pgvector extension with HNSW, and Google intends to keep doing so, anticipating that HNSW will be one of several vector indexing approaches that emerge over time. The pgvector HNSW graph index performs well on queries over small datasets, but it is currently less efficient for larger ones. For workloads with many vectors, the time and expense of building the index, the size of the resulting index, and the index's degraded performance once it becomes too big to fit in main memory can all be problems. So while pgvector is a reasonable option for vector indexing in some circumstances, an alternative was needed for certain AlloyDB workloads.

To that end, in October 2024 Google published the ScaNN for AlloyDB extension, which offers a leading vector search solution for all use cases. It incorporates the ScaNN vector search technology that Google Research has spent the last 12 years developing.

Given that ScaNN is used in Google Search, YouTube, Ads, and other applications involving hundreds of billions of vectors or more, it should come as no surprise that it performs well on massive datasets. It is also a flexible and affordable solution: it offers a pgvector-compatible index that works at all scales, has a memory footprint up to 4x smaller, and improves latency by up to 4x, even for small datasets.

This blog's Benchmarks section shows that ScaNN for AlloyDB can build indexes for one billion vectors at a cost up to 60 times lower than that of other PostgreSQL offerings. It also delivers up to 10 times better latency when the indexes (ScaNN and HNSW) do not fit in main memory, because HNSW is a graph structure that can incur costly random-access I/O once paged out. ScaNN for AlloyDB is also a competitive choice at small scales, providing faster index builds and up to 4x better latency than pgvector HNSW. Lastly, the Algorithms section presents the main reasons for ScaNN for AlloyDB's performance.

Benchmarks

Two well-known benchmark datasets were used for the performance assessment: Glove-100 (~1 million vectors, 100 dimensions) and BigANN-1B (1 billion vectors, 128 dimensions). Glove-100 demonstrates the behavior of pgvector HNSW and ScaNN for AlloyDB when the indexes fit in main memory, and BigANN-1B when they do not. Let's start by examining search performance.

Search performance

ScaNN for AlloyDB and pgvector HNSW 0.8.0 were tested on OSS PostgreSQL 15 on a 16 vCPU, 128 GB memory instance. pgvector HNSW 0.7.4 was additionally tested on another cloud provider (referred to here as Cloud Vendor X) on their 16 vCPU, 128 GB memory instance, in accordance with the setup and results Cloud Vendor X posted on their blog in early 2024. For the Glove-100 benchmark the indexes fit in main memory; for the BigANN-1B benchmark they do not.

Since the indexes cannot fit in main memory, performance is, unsurprisingly, significantly worse for BigANN-1B. pgvector HNSW's latency exceeds 4 seconds (yes, seconds!), which is unacceptable for online applications, while ScaNN for AlloyDB offers roughly 10x better latency (431 ms). This is crucial for use cases that must be economical while still requiring latencies in the hundreds of milliseconds.

Note that in many cases the ScaNN for AlloyDB index fits in main memory when the pgvector HNSW index does not, due to ScaNN for AlloyDB's generally much smaller footprint. For instance, running AlloyDB on a 64 vCPU, 512 GB RAM instance, BigANN-1B with ScaNN for AlloyDB achieves a latency of 30 ms, almost two orders of magnitude faster than pgvector HNSW's latency of more than 4 seconds!

 Smaller footprint of ScaNN for AlloyDB
Image credit to Google Cloud

Index build performance

Let's now examine index build time. As many users rightly point out, pgvector HNSW is excessively slow to build for large datasets, which becomes clear when creating the pgvector HNSW index for BigANN-1B. Note that the index could not be built on the 16 vCPU, 128 GB memory machine for either PostgreSQL or Cloud Vendor X. After several laborious attempts with larger machines and configurations, the pgvector HNSW indexes were eventually built in a reasonable amount of time using extra-large instances.

ScaNN for AlloyDB, however, used the 16 vCPU, 128 GB RAM instance, which lists at roughly 1/10th the price of those extra-large instances. Customers value being able to build the index quickly and affordably.

Index build performance

Algorithms

The Benchmarks section demonstrated that the performance gap between ScaNN for AlloyDB and pgvector HNSW widens significantly when the two vector indexes do not fit in main memory. The pgvector community is well aware of the HNSW algorithm's flaws here, as demonstrated in this ticket, for example. It also showed how ScaNN for AlloyDB can fit in memory when pgvector HNSW cannot, thanks to its up to 4x smaller memory footprint. These discrepancies can be explained by fundamental differences in the algorithms and data organization of the two indexes. To understand why, let's begin with the memory footprint difference.

ScaNN is a tree-quantization-based index, while HNSW is a graph-based index. In a graph-based index, each vector is a node in a graph, and every node is linked to a set of selected neighboring nodes. A common recommendation is to connect each node to around m=20 other nodes, where m is the maximum number of neighbors per graph node. HNSW additionally arranges several layers hierarchically, with the top layers serving as entry points into the layers below.
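A back-of-the-envelope calculation shows why all those edges add up. The m=20 figure comes from the text above; the 4-byte node ids and the one-million-leaf tree are illustrative assumptions, not measured figures:

```python
# Hypothetical sizing: graph edge lists for an HNSW-style index vs. the
# extra structure a two-level tree index needs, for 1 billion vectors.
num_vectors = 1_000_000_000
m = 20          # neighbors per node, per the common recommendation
id_bytes = 4    # assume 4-byte node ids

# Edge lists alone: 1B nodes x 20 neighbors x 4 bytes = 80 GB.
hnsw_edges_gb = num_vectors * m * id_bytes / 1e9

# A tree stores roughly one leaf assignment per vector plus the centroids
# themselves; e.g. 1M leaves of 128-dim float32 centroids is far smaller.
num_leaves, dim = 1_000_000, 128
tree_gb = (num_vectors * id_bytes + num_leaves * dim * 4) / 1e9
```

Under these assumptions the graph's edge lists alone cost tens of gigabytes before any vector data is stored, while the tree's overhead stays in the single-digit-gigabyte range.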

HNSW
Image credit to Google Cloud

ScaNN, on the other hand, uses a data structure similar to a shallow B-tree. Each leaf node corresponds to a centroid, and the leaf itself contains all the vectors close to that centroid. As seen in the picture below, which shows a two-level index, the centroids effectively partition the space. A tree has far fewer edges than a graph with 20 edges per node, which accounts for a significant portion of the memory footprint difference between ScaNN for AlloyDB and pgvector HNSW.
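A minimal single-level sketch of this centroid partitioning follows, with toy data and a hypothetical `nprobe` parameter for the number of leaves scanned per query. Real ScaNN adds quantization, learned partitioning, and further tuning on top of this basic idea:

```python
import numpy as np

rng = np.random.default_rng(2)
vectors = rng.standard_normal((2000, 16)).astype(np.float32)

# Pick centroids and assign each vector to its nearest centroid's "leaf".
num_leaves, nprobe = 32, 4
centroids = vectors[rng.choice(len(vectors), num_leaves, replace=False)]
assign = np.argmin(
    np.linalg.norm(vectors[:, None] - centroids[None], axis=2), axis=1)
leaves = {c: np.flatnonzero(assign == c) for c in range(num_leaves)}

def search(query, k=5):
    """Scan only the nprobe leaves whose centroids are closest to the query,
    instead of the whole dataset."""
    near = np.argsort(np.linalg.norm(centroids - query, axis=1))[:nprobe]
    cand = np.concatenate([leaves[c] for c in near])
    d = np.linalg.norm(vectors[cand] - query, axis=1)
    return cand[np.argsort(d)[:k]]
```

Because each leaf stores its vectors contiguously, scanning a probed leaf is a sequential pass over a block of data, which is exactly the access pattern that stays cheap when the index spills to disk.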

Leaf centroid
Image credit to Google Cloud

Next, let's look at the performance difference. HNSW finds the neighbors closest to the query vector by conducting a greedy search in the top layer, beginning at the entry point. The greedy search iteratively moves to the neighbor closest to the query vector until no closer neighbor is found. It then drops to the next lower layer and repeats the greedy search, until it reaches the bottom layer, where the nearest neighbors are retrieved.
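The greedy traversal within one layer can be sketched on a toy graph. The flat k-NN graph below merely stands in for a single HNSW layer; real HNSW builds its graph incrementally and stacks several such layers:

```python
import numpy as np

rng = np.random.default_rng(3)
vectors = rng.standard_normal((300, 16)).astype(np.float32)
m = 8  # max neighbors per node (real deployments often use around 20)

# Naive flat neighbor graph: link each node to its m nearest others.
d = np.linalg.norm(vectors[:, None] - vectors[None], axis=2)
np.fill_diagonal(d, np.inf)
graph = np.argsort(d, axis=1)[:, :m]

def greedy_search(query, entry=0):
    """Hop to whichever neighbor is closest to the query; stop at a node
    none of whose neighbors is closer (a local minimum)."""
    cur = entry
    cur_d = np.linalg.norm(vectors[cur] - query)
    while True:
        nbrs = graph[cur]
        nd = np.linalg.norm(vectors[nbrs] - query, axis=1)
        best = nd.argmin()
        if nd[best] >= cur_d:
            return cur
        cur, cur_d = nbrs[best], nd[best]
```

Note that each hop jumps to an arbitrary node id: when the graph lives on disk, every hop is a potential random page read, which is the access pattern the next paragraph discusses.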

Observe that HNSW's graph traversal produces random access. For a dataset of more than 100 million vectors, where graph nodes must page in and out between the buffer pool and disk, these random accesses quickly degrade performance (see pgvector ticket #700). In contrast, the ScaNN for AlloyDB index is cache-friendly: it optimizes for block-based access when the index is in secondary storage and for efficient SIMD operations when the index is cached. As is often the case with out-of-memory database techniques, sequential and block-based access performs far better than random access.

Next steps

ScaNN vector search is essential to Google's ability to deliver the performance needed for applications with billions of users, and you can now use ScaNN for AlloyDB to power your own vector search applications. Read the introduction to the ScaNN for AlloyDB index, or the ScaNN for AlloyDB whitepaper for a general overview of vector search followed by a detailed explanation of the ScaNN algorithm and its implementation in PostgreSQL and AlloyDB.

Thota Nithya
Thota Nithya has been writing cloud computing articles for govindhtech since April 2023. She is a science graduate and a cloud computing enthusiast.