Friday, March 28, 2025

Tiered Storage In Spanner: Flexible And Cost-Effective

Google Cloud is thrilled to present Spanner’s latest feature, fully managed tiered storage, which lets you handle larger datasets by striking the right balance between cost and performance while reducing operational overhead through a simple, user-friendly interface.

Spanner powers mission-critical operational applications at businesses across a variety of industries, including gaming, retail, and financial services. Spanner’s global consistency and elastic scalability are essential for these workloads to provide always-on experiences at any scale. For instance, a global trade ledger at a bank or a multi-channel order and inventory management system at a retailer relies on Spanner to offer a consistent view of real-time data for trades, risk assessment, order fulfilment, and dynamic price optimization.

Over time, however, settled trade records and fulfilled orders matter less to day-to-day operations and more to historical reporting or regulatory compliance. Because these datasets don’t need the same real-time performance as “hot,” active, transactional data, customers are looking for ways to move this “cold” data to less expensive storage.

Figure: Tradeoff between cost and performance (image credit: Google Cloud)

However, moving data to a different storage solution usually requires intricate data pipelines and may affect the operational system’s performance. Manually separating data across storage solutions can lead to inconsistent reads, which may require application-level reconciliation. The separation also increases the number of governance touchpoints that require auditing and severely restricts how applications can query across historical and current data, for example when responding to regulators.

Spanner’s tiered storage tackles these issues with a new storage tier based on hard disk drives (HDD) that is 80% less expensive than the existing tier based on solid-state drives (SSD), which is tailored for low-latency, high-throughput queries.
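To make the cost claim concrete, the following is a minimal sketch of the arithmetic for a blended storage bill. The only figure taken from the article is the 80% discount for HDD relative to SSD; the dataset size, the share of cold data, and the SSD price per GB are hypothetical placeholders, not Spanner list prices.

```python
def blended_monthly_cost(total_gb, cold_fraction, ssd_price_per_gb, hdd_discount=0.80):
    """Estimate monthly storage cost when a fraction of the data lives on HDD.

    hdd_discount=0.80 reflects the "80% less expensive" figure in the article.
    All other inputs are caller-supplied assumptions.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    hdd_price_per_gb = ssd_price_per_gb * (1.0 - hdd_discount)
    hot_gb = total_gb * (1.0 - cold_fraction)
    cold_gb = total_gb * cold_fraction
    return hot_gb * ssd_price_per_gb + cold_gb * hdd_price_per_gb


# Hypothetical workload: 10,000 GB in total, 70% of it cold,
# and a placeholder SSD price of 0.30 per GB per month.
all_ssd = blended_monthly_cost(10_000, 0.0, 0.30)
tiered = blended_monthly_cost(10_000, 0.7, 0.30)
savings = 1 - tiered / all_ssd
print(f"all SSD: {all_ssd:.2f}, tiered: {tiered:.2f}, savings: {savings:.0%}")
```

Under this simple model the overall saving is the cold fraction multiplied by the discount, so moving 70% of the data to HDD cuts the storage bill by about 56%. It ignores compute, network, and backup charges.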

In addition to the financial savings, advantages include:

Ease of management

With Spanner, storage tiering is entirely policy-driven, which reduces the effort and complexity of building and maintaining extra pipelines or splitting and duplicating data across solutions. Asynchronous background operations move data from SSD to HDD automatically as part of routine maintenance.

Unified and consistent experience

Where the data is stored within Spanner is transparent to you. Queries on Spanner can access data in both the SSD and HDD tiers without any changes. Likewise, backup policies are applied uniformly to all data, allowing consistent restores across both storage tiers.

Flexibility and control

You can decide which data to move to HDD by applying tiering policies at the database, table, column, or secondary index level. For instance, data in a rarely accessed column, such as JSON blobs for a long tail of product attributes, can be moved to HDD without splitting database tables. You can also keep some indexes on SSD while the underlying data is stored on HDD.
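As an illustration of these granularities, the sketch below renders DDL statements that attach a table, a single column, and a secondary index to different locality groups (the mechanism described later in this article). The statement shapes reflect an assumed reading of the GoogleSQL DDL and should be checked against the Spanner DDL reference before use. The table, column, index, and locality group names are all hypothetical, and the groups are assumed to exist already.

```python
def set_table_group(table, group):
    """Render DDL assigning a whole table to a locality group (assumed syntax)."""
    return f"ALTER TABLE {table} SET OPTIONS (locality_group = '{group}')"


def set_column_group(table, column, group):
    """Render DDL assigning one column to a locality group (assumed syntax)."""
    return (
        f"ALTER TABLE {table} ALTER COLUMN {column} "
        f"SET OPTIONS (locality_group = '{group}')"
    )


def create_index_in_group(index, table, columns, group):
    """Render DDL creating a secondary index in a locality group (assumed syntax)."""
    cols = ", ".join(columns)
    return f"CREATE INDEX {index} ON {table}({cols}) OPTIONS (locality_group = '{group}')"


# Hypothetical schema: rarely read JSON attributes go straight to HDD,
# order rows age out to HDD, and the lookup index stays on SSD.
statements = [
    set_column_group("Products", "AttributesJson", "hdd_only"),
    set_table_group("Orders", "spill_to_hdd"),
    create_index_in_group("OrdersByCustomer", "Orders", ["CustomerId"], "ssd_only"),
]
for statement in statements:
    print(statement)
```

Strings like these could then be submitted as a schema update, for example through the `update_ddl` method of the Python client's `Database` object. The helpers do no identifier escaping, so they are suitable only for trusted, hard-coded names.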

Merpay, Mercari’s mobile payments platform, has over 18.7 million users and uses Spanner as its database. With transaction volume growing continuously, Merpay was looking for ways to retain its accumulated historical transaction data without incurring the expense of continuously moving data to a new solution. With the introduction of Spanner tiered storage, Merpay will be able to store historical data more affordably and without an additional solution, while still being able to query it as needed.

Let’s take a closer look

To begin, use the GoogleSQL or PostgreSQL data definition language (DDL) to create a locality group that specifies the storage option: ‘SSD’ (the default) or ‘HDD’. Locality groups are a mechanism for providing data locality and isolation along a dimension (such as a table or column) in order to maximise performance. When defining a locality group, you can also set ‘ssd_to_hdd_spill_timespan’ to determine how long data is kept on SSD before it is moved to HDD as part of a subsequent compaction cycle.
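A locality group definition along these lines might be rendered as follows. This is a sketch under assumptions: the `CREATE LOCALITY GROUP ... OPTIONS (...)` shape, the lowercase `'ssd'`/`'hdd'` values, and the `'10d'` timespan format reflect an assumed GoogleSQL syntax and should be confirmed in the DDL reference. Only the option name `ssd_to_hdd_spill_timespan` and the two storage choices come from the article. The group names are hypothetical.

```python
VALID_STORAGE = ("ssd", "hdd")


def create_locality_group(name, storage="ssd", spill_timespan=None):
    """Render a CREATE LOCALITY GROUP statement (assumed GoogleSQL syntax).

    storage        -- 'ssd' (the default) or 'hdd'
    spill_timespan -- optional string such as '10d' (format is an assumption)
    """
    if storage not in VALID_STORAGE:
        raise ValueError(f"storage must be one of {VALID_STORAGE}")
    options = [f"storage = '{storage}'"]
    if spill_timespan is not None:
        options.append(f"ssd_to_hdd_spill_timespan = '{spill_timespan}'")
    return f"CREATE LOCALITY GROUP {name} OPTIONS ({', '.join(options)})"


# Hypothetical groups: one that keeps new data on SSD for ten days
# before it becomes eligible for HDD, and one that is HDD from the start.
print(create_locality_group("spill_to_hdd", "ssd", "10d"))
print(create_locality_group("hdd_only", "hdd"))
```

Keeping the age-based policy in a named group means the same rule can be reused by several tables, columns, or indexes rather than being repeated on each one.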

Once the DDL has been applied, data is moved from SSD to HDD asynchronously during weekly compaction cycles at the underlying storage layer, without user intervention.
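The age-based rule can be pictured with a deliberately simplified model: data becomes eligible for HDD once it is older than the spill timespan, and the actual move happens at some later compaction. The sketch below is not Spanner's implementation, which works at the storage layer rather than on individual records. The record keys, timestamps, and the ten-day timespan are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def eligible_for_hdd(written_at, now, spill_timespan):
    """Return True once data is at least spill_timespan old (simplified model)."""
    return (now - written_at) >= spill_timespan


now = datetime(2025, 3, 28, tzinfo=timezone.utc)
spill_timespan = timedelta(days=10)  # stands in for ssd_to_hdd_spill_timespan

# Hypothetical records keyed by ID, with the time each was written.
records = {
    "order-1001": datetime(2025, 3, 1, tzinfo=timezone.utc),
    "order-1002": datetime(2025, 3, 25, tzinfo=timezone.utc),
}

for key, written_at in records.items():
    if eligible_for_hdd(written_at, now, spill_timespan):
        print(f"{key}: eligible to move to HDD at a later compaction")
    else:
        print(f"{key}: stays on SSD for now")
```

Because the move is asynchronous, eligibility is not the same as placement: in this model a record can remain on SSD for some time after it crosses the threshold.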

You can track HDD utilization in System Insights, which shows the amount of HDD storage used per locality group and the disk load at the instance level.

Spanner tiered storage is available in all regions where Spanner is available and supports both GoogleSQL-dialect and PostgreSQL-dialect databases. The feature is included in the Enterprise and Enterprise Plus editions of Spanner at no additional cost beyond the price of the HDD storage.

Thota nithya
Thota Nithya has been writing cloud computing articles for govindhtech since April 2023. She is a science graduate and a cloud computing enthusiast.