Amazon OpenSearch Vector Engine’s Latest Release
AWS has announced that the vector engine for Amazon OpenSearch Serverless is now generally available (GA), with additional capabilities. The vector engine was first released in preview in July 2023. It is a simple, scalable, and high-performing similarity search capability. It makes it easy to build modern machine learning (ML) augmented search experiences and generative artificial intelligence (generative AI) applications without having to manage the underlying vector database infrastructure.
You can now store, update, and search billions of vector embeddings with thousands of dimensions in milliseconds. The vector engine’s highly performant similarity search lets generative AI-powered applications deliver accurate, reliable results with consistent millisecond-scale response times.
By combining full-text and vector search in a single query, the vector engine also lets you optimize and tune hybrid search results, with no need to manage and maintain separate data stores or a complex application stack. The vector engine provides a secure, reliable, scalable, enterprise-ready platform for building prototype applications cost-effectively and then moving them smoothly into production.
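As a rough illustration of combining full-text and vector search in one request, the sketch below builds an OpenSearch query body that puts a match clause and a k-NN clause in the same bool query. The field names "description" and "embedding", the sample text, and the three-element vector are hypothetical placeholders, not values from the announcement, and this is only one possible way to express such a query.

```python
from typing import Any


def build_hybrid_query(text: str, vector: list[float], k: int = 10) -> dict[str, Any]:
    """Build a query body with a full-text clause and a k-NN clause in one bool query."""
    return {
        "size": k,
        "query": {
            "bool": {
                "should": [
                    # Lexical relevance on a hypothetical "description" text field.
                    {"match": {"description": {"query": text}}},
                    # Vector similarity on a hypothetical "embedding" knn_vector field.
                    {"knn": {"embedding": {"vector": vector, "k": k}}},
                ]
            }
        },
    }


# A real embedding would have hundreds of dimensions; three are used here for brevity.
body = build_hybrid_query("waterproof hiking boots", [0.12, -0.03, 0.88], k=5)
print(body)
```

The resulting dictionary would be sent as the body of a search request against an index whose mapping defines both fields. How the two clause scores are weighted against each other is a tuning decision that this sketch does not address.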
You can now get started with the vector engine in minutes by creating a specialized vector engine-based collection, which is a logical grouping of embeddings that work together to support a workload.
The vector engine uses OpenSearch Compute Units (OCUs), a unit of compute capacity, to process and run similarity search queries. At a 99 percent recall rate, one OCU can handle up to 2 million vectors with 128 dimensions, or 500,000 vectors with 768 dimensions.
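The per-OCU figures above can be turned into a back-of-the-envelope capacity estimate. The sketch below is a rough aid only: it covers just the two dimension counts quoted in the text, it ignores redundancy and the split between indexing and search, and the function name and the 10 million vector workload are made up for illustration.

```python
import math

# Vectors one OCU can handle at a 99 percent recall rate, as quoted in the text.
VECTORS_PER_OCU = {128: 2_000_000, 768: 500_000}


def estimate_ocus(num_vectors: int, dimensions: int) -> int:
    """Estimate how many OCUs are needed to hold num_vectors of a given dimension."""
    if dimensions not in VECTORS_PER_OCU:
        # No figure is published here for other sizes, so refuse to guess.
        raise ValueError(f"no per-OCU figure available for {dimensions} dimensions")
    return max(1, math.ceil(num_vectors / VECTORS_PER_OCU[dimensions]))


print(estimate_ocus(10_000_000, 768))  # 10,000,000 / 500,000 = 20
print(estimate_ocus(10_000_000, 128))  # 10,000,000 / 2,000,000 = 5
```

Actual consumption depends on the workload, so treat the result as a starting point rather than a sizing guarantee.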
By default, the OpenSearch Serverless vector engine is a highly available service. The first collection in an account requires a minimum of four OCUs: two OCUs for ingest (a primary and a standby) and two OCUs for search (two active replicas across Availability Zones). Any subsequent collections that use the same AWS Key Management Service (AWS KMS) key can share those OCUs.
What’s new at GA?
Since the preview release, the vector engine for Amazon OpenSearch Serverless has been one of the vector database options in an Amazon Bedrock knowledge base for building generative AI applications that use the Retrieval Augmented Generation (RAG) approach.
For this GA release, the following features have been added or enhanced:
Option to disable redundant replicas (focused on development and testing)
This capability removes the requirement to deploy redundant OCUs in another Availability Zone solely for availability. A collection can be deployed with two OCUs: one for indexing and one for search. This cuts the cost by 50 percent compared to the default deployment with redundant replicas, which makes the configuration a good fit for development and testing workloads.
Because the vector engine persists all data in Amazon S3, durability is still ensured with this option. However, a failure in a single Availability Zone will affect your availability.
To disable redundant replicas, uncheck Enable redundancy when you create a new vector search collection.
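The same choice can also be made programmatically. The sketch below assumes the boto3 `opensearchserverless` client and its `create_collection` call, with the `standbyReplicas` parameter playing the role of the Enable redundancy checkbox. The collection name is hypothetical, and the call assumes that AWS credentials and the security policies the collection needs are already in place.

```python
from typing import Any


def build_collection_request(name: str, redundancy: bool = True) -> dict[str, Any]:
    """Build the keyword arguments for a vector search collection request."""
    return {
        "name": name,
        "type": "VECTORSEARCH",
        # DISABLED corresponds to unchecking "Enable redundancy" in the console.
        "standbyReplicas": "ENABLED" if redundancy else "DISABLED",
    }


def create_collection(request: dict[str, Any]) -> dict[str, Any]:
    """Send the request to AWS; requires boto3 and valid credentials."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("opensearchserverless")
    return client.create_collection(**request)


# "dev-vectors" is a hypothetical collection name for a test environment.
request = build_collection_request("dev-vectors", redundancy=False)
print(request)
# create_collection(request)  # uncomment to create the collection for real
```

Keeping the request builder separate from the network call makes it easy to review or log the settings before anything is created.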
Fractional OCU for the development and test focused option
Support for fractional OCU billing lowers the floor price of a vector search collection for development and test focused workloads (that is, collections with the redundant replica option disabled). The vector engine initially deploys smaller 0.5 OCUs that offer the same capabilities at a smaller scale, and it scales up to a full OCU and beyond to meet your workload demand. This option further reduces monthly costs when you are experimenting with the vector engine.
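The OCU floors described in this article can be compared with some simple arithmetic. The sketch below counts the minimum OCUs for each deployment option. It assumes that the fractional option starts with one 0.5 OCU for indexing and one 0.5 OCU for search, which is an interpretation of the text rather than a stated figure, and the function name is made up for illustration.

```python
def minimum_ocus(redundancy: bool, fractional: bool = False) -> float:
    """Return the minimum OCU count for a collection under the given options."""
    if redundancy and fractional:
        # Fractional billing is described only for the no-redundancy option.
        raise ValueError("fractional OCUs apply only when redundancy is disabled")
    roles = 2  # one role for indexing, one for search
    copies = 2 if redundancy else 1  # redundant deployments double each role
    size = 0.5 if fractional else 1.0  # assumed: 0.5 OCU per role when fractional
    return roles * copies * size


default = minimum_ocus(redundancy=True)
dev_test = minimum_ocus(redundancy=False)
dev_test_fractional = minimum_ocus(redundancy=False, fractional=True)

print(default, dev_test, dev_test_fractional)
print(f"saving without redundancy: {1 - dev_test / default:.0%}")
```

Under these assumptions the floor drops from four OCUs to two without redundancy, which matches the 50 percent saving quoted earlier, and to one OCU with fractional billing.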
Automatic scaling up to a billion vectors
With the vector engine’s seamless auto-scaling, you no longer need to reindex in order to scale. During the preview, the vector engine supported about 20 million vector embeddings. Now that it is generally available, it supports a scale of a billion vectors.
The vector engine for Amazon OpenSearch Serverless is now available in all AWS Regions where Amazon OpenSearch Serverless is offered.