Cloud Bigtable, Google Cloud’s fully-managed, low-latency NoSQL database service, has had a busy few weeks. We made a number of significant announcements prior to Google Cloud Next ’23 and even more during the event, including those related to multi-cloud capabilities, hybrid analytical and transactional processing (HTAP), new cost savings, and additional ways to integrate with Google Cloud and the open-source ecosystem. Here are a few of the highlights:
Create event-driven software with change streams
With the new change streams feature, now generally available, you can easily track changes to your Bigtable data and integrate them with other systems. To support multi-cloud or hybrid-cloud architectures, you can replicate changes from Bigtable to BigQuery for analytics, to Elasticsearch for autocomplete and full-text search, or to other databases. You can also use Pub/Sub and Cloud Functions to trigger downstream processes, and you can integrate with Vertex AI to deliver ML-driven experiences.
Retailers, for example, can use change streams to monitor changes to their product catalogs, such as updates to pricing or availability, and use those changes to trigger in-app, text, or email notifications for their customers. Banking apps, meanwhile, can send newly uploaded documents to Cloud Vision AI for content parsing.
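The retail scenario above can be sketched as a small change-record handler. This is a conceptual illustration only: the dictionary shape, column names, and `route_change` helper are assumptions for the example, not the actual `ChangeStreamMutation` objects delivered by the Bigtable change streams Dataflow connector.

```python
# Illustrative sketch: routing simplified Bigtable change records to
# downstream actions. The record structure here is hypothetical.

def should_notify(change: dict) -> bool:
    """Decide whether a catalog change warrants a customer notification."""
    watched_columns = {"price", "availability"}
    return (change.get("column_family") == "catalog"
            and change.get("column") in watched_columns)

def route_change(change: dict) -> str:
    """Route a change record to a downstream action."""
    if should_notify(change):
        # In practice this might publish to Pub/Sub to trigger an
        # in-app, text, or email notification.
        return f"notify:{change['row_key']}"
    return "ignore"
```

A price update on a catalog row would be routed to the notification path, while changes in unrelated column families are ignored.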
Save 20–40% on Bigtable nodes with committed use discounts
In exchange for committing to use a consistent amount of Bigtable compute capacity (as measured in Bigtable nodes) for a one- or three-year term, Bigtable now offers significantly reduced prices: a one-year commitment offers 20% off, and a three-year commitment offers a 40% discount. Bigtable committed use discounts are spend-based and apply across all Google Cloud regions and projects.
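The discount math is simple enough to sketch directly. In this minimal example, the hourly node rate is a placeholder parameter, not a published price; only the 20%/40% figures come from the announcement above.

```python
# Sketch of the committed use discount arithmetic described above.
# hourly_rate is a placeholder, not an actual Bigtable price.

def committed_cost(node_hours: float, hourly_rate: float, term_years: int) -> float:
    """Cost of Bigtable node usage under a committed use discount."""
    discounts = {1: 0.20, 3: 0.40}  # 20% off (1-year), 40% off (3-year)
    if term_years not in discounts:
        raise ValueError("Commitments are available for 1- or 3-year terms")
    return node_hours * hourly_rate * (1 - discounts[term_years])
```

For example, 100 node-hours at a nominal $1.00/hour costs about $80 under a one-year commitment and about $60 under a three-year commitment.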
Multi-cloud with bi-directional Bigtable–HBase replication
Multi-cloud deployments are now a crucial component of many organizations’ IT strategies, but proprietary APIs and data models make some cloud database services a poor fit for them. Fortunately, Bigtable shares Apache HBase’s wide-column data model and offers a compatible API, which has long made migrations between the two platforms straightforward. With a new bi-directional replication capability, you can now also use this compatibility to deliver multi-cloud or hybrid-cloud deployments.
Hybrid transactional and analytical processing with request priorities
One of the biggest obstacles to organizations’ continued digital transformation is their inability to safely run batch and ad-hoc analytical workloads against live operational databases without running the risk of disruptions. These operations can use up a lot of resources, and if left unchecked, they can interfere with workloads that depend on serving data in a timely manner. Therefore, database owners impose strict controls and limits on any such scenarios.
Some teams choose to circumvent these restrictions by adding more replicas, restricting batch writes to times of low traffic, or over-provisioning their databases, but doing so incurs a high cost and management burden. Others attempt to construct intricate pipelines that deliver data to analytical systems; these can be costly, prone to error, and cause problems with data freshness and accuracy. These restrictions make it difficult for many organizations to effectively utilize their data, which inhibits data-driven innovation.
At Google Cloud Next this week, we announced Bigtable request priorities. You can now run large, non-time-sensitive workloads, such as analytical queries and batch writes, as low-priority jobs on a Bigtable cluster that is also serving latency-sensitive queries at high priority, greatly reducing the impact of batch processing on serving workloads.
Common Bigtable access paths for analytics, such as BigQuery federation, Dataflow, and the Spark connector, also support request priorities, letting analysts, data engineers, and data scientists work with operational data easily while giving admins the assurance that activities like batch data loading or online model training will have little impact on operational performance.
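The scheduling idea behind request priorities can be illustrated with a toy dispatcher: high-priority (serving) requests are always dispatched ahead of queued low-priority (batch or analytical) work. This is purely a conceptual sketch of the concept, not Bigtable's actual internal scheduler.

```python
import heapq

HIGH, LOW = 0, 1  # lower number = dispatched first

class PriorityDispatcher:
    """Toy model of priority-aware request dispatch (illustrative only)."""

    def __init__(self):
        self._queue = []
        self._counter = 0  # preserves FIFO order within a priority level

    def submit(self, priority: int, request: str) -> None:
        heapq.heappush(self._queue, (priority, self._counter, request))
        self._counter += 1

    def drain(self) -> list:
        """Dispatch all queued requests, serving traffic first."""
        order = []
        while self._queue:
            _, _, request = heapq.heappop(self._queue)
            order.append(request)
        return order
```

Even if a batch write was submitted first, a latency-sensitive point read queued behind it is dispatched ahead of it.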
Export from BigQuery to Bigtable, no ETL tools required
Whether it’s app analytics surfaced in a mobile app or a machine learning model delivering millions of personalized ads every second, applications frequently need to serve analytics to their end users. This pattern is sometimes referred to as “Reverse ETL.” Getting this type of data into operational databases has traditionally required ETL pipelines, and sometimes your teams must file tickets with your data engineers for help. We believe there is a better way: why not enable a developer or data scientist to move data from their data warehouse into their operational databases in a self-service fashion?
Many Google Cloud customers use dashboards powered by Bigtable to publish engagement metrics for their social media content or time-series data for IoT. In order to support the low-latency, high-throughput online feature store access patterns required by machine learning models, data scientists who are building ML features in BigQuery frequently materialize their features into Bigtable.
To fully leverage the scalability of Google’s two “Big” databases, BigQuery and Bigtable, we worked closely with the BigQuery team to build these export capabilities directly into BigQuery. Developers can easily export the analytics their applications require, and data scientists can materialize their features directly from the BigQuery console, all without touching any ETL tools.
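The "reverse ETL" shape described above, flattening a warehouse-style feature row into the row key and column cells Bigtable expects, can be sketched as follows. The key scheme, column family name, and helper function here are illustrative assumptions, not the actual BigQuery export implementation.

```python
# Conceptual sketch: shaping an analytical feature row for Bigtable.
# Names and the composite-key scheme are illustrative assumptions.

def to_bigtable_mutation(feature_row: dict, key_columns: list,
                         family: str = "features"):
    """Build a (row_key, cells) pair from an analytical feature row."""
    # Composite row keys are a common Bigtable pattern for fast point lookups.
    row_key = "#".join(str(feature_row[c]) for c in key_columns)
    cells = {
        f"{family}:{col}": value
        for col, value in feature_row.items()
        if col not in key_columns
    }
    return row_key, cells
```

A user/date-keyed feature row becomes a single Bigtable row keyed `user_id#date`, with the remaining columns stored as cells in one column family, which is the access pattern a low-latency online feature store needs.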
Keep backups longer, and in more regions, for greater resilience
Finally, you can now copy a Cloud Bigtable backup to any project or region where you have a Bigtable instance, and you can retain your backups for up to 90 days.
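The 90-day retention window can be expressed as a simple validity check. Only the 90-day figure comes from the announcement above; the helper itself is an illustrative sketch, not part of any Bigtable client library.

```python
from datetime import date, timedelta

# Sketch of the 90-day backup retention limit mentioned above.
MAX_RETENTION = timedelta(days=90)

def valid_expiry(created: date, expires: date) -> bool:
    """Check that a backup's expiry falls within the allowed window."""
    return created < expires <= created + MAX_RETENTION
```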