Sunday, July 7, 2024

Azure Storage Actions: Serverless data management

Azure Storage Actions, a fully managed platform that automates data management tasks for Azure Blob Storage and Azure Data Lake Storage, is now available in public preview.

Data management is becoming increasingly difficult for organizations as their data estates grow exponentially. Effective data management is crucial for businesses to fully utilize their data assets, meet compliance requirements, cut costs, and protect sensitive data. Scaling resource investment at the same pace as data volumes is unsustainable, and the tools and methods available today for managing massive data assets are laborious. Storage customers need an effective way to manage billions of objects across thousands of datasets, consistently and holistically, across all regions.

Azure Storage Actions changes how you manage massive data assets in your object storage and data lakes, with a quicker time to value. Its serverless architecture offers a dependable platform that scales to meet your data management requirements without any resource provisioning or management. A no-code experience lets you define the conditional logic for processing objects without programming knowledge.

With a few clicks, the tasks you create can safely operate on multiple datasets with similar requirements. Monitoring overhead is reduced by views that summarize results at a glance, with filters and drilldowns for more detail. For Azure Blob Storage and Azure Data Lake Storage, this release supports cost optimization, data protection, rehydration from archive, tagging, and several other use cases.

How Azure Storage Actions works

With Azure Storage Actions, you can quickly create, validate, and deploy data management tasks. These tasks can be set up to run on demand or on a schedule.

Using the Azure portal interface, you can create a condition that identifies the blobs you want to process and the operations to perform on them. The integrated validation experience lets you safely verify the condition against your production data without taking any action: it displays the blobs that meet the condition and the operations that would be performed on them if the task were executed.
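The condition-plus-validation idea can be illustrated with a small local model. This is only a sketch in plain Python, not the service's API: the `Blob` and `Task` classes, the `preview` function, and the operation label `SetBlobTier(Cool)` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Blob:
    # Hypothetical, simplified stand-in for a blob's properties.
    name: str
    tier: str
    created: datetime
    tags: dict = field(default_factory=dict)


@dataclass
class Task:
    # A task pairs a condition with the operations to run on matching blobs.
    condition: Callable[[Blob], bool]
    operations: list


def preview(task: Task, blobs: list) -> list:
    """Dry run: report matching blobs and the operations that WOULD run.

    Nothing is modified, which mirrors the idea of validating a
    condition against production data without taking any action.
    """
    return [(b.name, list(task.operations)) for b in blobs if task.condition(b)]


blobs = [
    Blob("logs/2023/app.log", "Hot", datetime(2023, 1, 5, tzinfo=timezone.utc)),
    Blob("logs/2024/app.log", "Hot", datetime(2024, 6, 1, tzinfo=timezone.utc)),
    Blob("images/logo.png", "Hot", datetime(2022, 3, 9, tzinfo=timezone.utc)),
]

cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
task = Task(
    condition=lambda b: b.name.endswith(".log") and b.created < cutoff,
    operations=["SetBlobTier(Cool)"],  # hypothetical operation label
)

for name, ops in preview(task, blobs):
    print(name, "->", ops)
```

Keeping the preview separate from any code that applies changes is the design point: the same condition can be checked against real data as often as needed before a task is allowed to act.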

Tasks can be assigned to run against any storage account within the same Microsoft Entra ID tenant. The service automatically provisions, scales, and optimizes the resources needed for either recurring or one-time task execution. Aggregate metrics and dashboards provide a visual summary of operations and let you drill down into more detailed reports when and where needed, with minimal intervention.

Azure Storage Actions can also be managed programmatically through REST APIs and the Azure SDK. PowerShell, Azure Resource Manager (ARM) templates, and the Azure Command-Line Interface (CLI) are all supported.

Supported operations: This release supports built-in operations on Azure Blob Storage and Azure Data Lake Storage, such as changing tiers, managing blob expiry, deleting or undeleting blobs, and setting time-based retention. Future releases will support additional operations.

[Image: Azure Storage Actions preview. Image credit: Azure]

Why use Azure Storage Actions

Automating your data management processes with Azure Storage Actions has the following benefits:

  • Increases productivity by reducing the work needed to automate routine data management tasks.
  • Reduces the overhead associated with managing or provisioning infrastructure.
  • Gives you confidence that tasks apply to your production data without errors, through the no-code interface's integrated validation experience.
  • Makes reuse easier by allowing you to create a task once and quickly deploy it to any storage account.
  • Promotes the consistent use of metadata and blob tags in conditions and operations.

Example use cases

Large data lakes can contain thousands of datasets with a variety of object types that need different kinds of processing. Individual objects within a blob container may need different tiering transitions, distinct labels for tagging, distinct retention or expiry periods, and other handling based on their attributes. With Azure Storage Actions, you can define tasks that scan billions of blobs, evaluate each one against dozens of properties (file extension, naming pattern, index tags, blob metadata, or system properties like creation time, content type, blob tier, and more), and decide how to handle it.

This method can simplify a wide range of recurrent or one-time use cases, such as:

Retention and expiry based on object tags: One of Azure's international clients in the financial services industry uses Azure Blob Storage to ingest call recordings from customer support agents. These recordings carry blob tags that indicate, for example, whether a trading order was placed or an account was updated during the call. The recordings have different retention requirements depending on the type of call. The client can now use Azure Storage Actions to create a task that combines blob tags and creation time to automatically manage the retention and expiry durations of ingested recordings.
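As a rough illustration of combining a blob tag with creation time, the sketch below computes an expiry date per recording. It is plain Python, not the service's condition syntax; the tag key `call-type`, its values, and the retention periods are all assumptions made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention periods (in days) per call type.
RETENTION_DAYS = {
    "trade-order": 7 * 365,
    "account-update": 3 * 365,
}
DEFAULT_RETENTION_DAYS = 365  # assumed fallback for untagged recordings


def expiry_for(tags: dict, created: datetime) -> datetime:
    """Return the expiry date derived from the call-type tag and creation time."""
    days = RETENTION_DAYS.get(tags.get("call-type"), DEFAULT_RETENTION_DAYS)
    return created + timedelta(days=days)


def is_expired(tags: dict, created: datetime, now: datetime) -> bool:
    """True when the recording has reached or passed its expiry date."""
    return now >= expiry_for(tags, created)


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(is_expired({"call-type": "trade-order"}, created, now))
print(is_expired({}, created, now))
```

Driving the retention period from a lookup table keyed by tag value means a new call type only needs a new table entry, not a new rule.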

Flexible data protection across datasets: A prominent travel services customer uses blob versioning and snapshots, but the thousands of datasets in its storage account have varying data protection needs. Certain datasets must maintain a strict version history, while others do not require this level of protection. Preserving the full blob version and snapshot history for every dataset in the storage account is prohibitively expensive. By using tags and metadata with Azure Storage Actions, the customer can now flexibly manage the appropriate retention and lifecycle of versions and snapshots for each dataset.
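One way to picture tag-driven version retention is a rule that keeps only the newest N versions of each blob, where N depends on the dataset's protection level. The sketch below is plain Python under stated assumptions: the protection levels, the `KEEP_VERSIONS` table, and the `versions_to_delete` function are hypothetical, and version IDs are assumed to be timestamp strings that sort chronologically.

```python
from collections import defaultdict

# Hypothetical policy: how many versions to keep per protection level.
# None means "keep the full history".
KEEP_VERSIONS = {"strict": None, "standard": 5, "minimal": 1}


def versions_to_delete(versions: list, protection: str) -> list:
    """Return (blob_name, version_id) pairs that exceed the retention limit.

    Unknown protection levels are treated like "strict" so that a
    mistyped tag never causes deletion.
    """
    keep = KEEP_VERSIONS.get(protection)
    if keep is None:
        return []

    by_blob = defaultdict(list)
    for name, version_id in versions:
        by_blob[name].append(version_id)

    doomed = []
    for name, ids in by_blob.items():
        # Newest first; everything after the first `keep` entries is surplus.
        for version_id in sorted(ids, reverse=True)[keep:]:
            doomed.append((name, version_id))
    return sorted(doomed)


versions = [
    ("a.csv", "2024-01-01T00:00:00Z"),
    ("a.csv", "2024-02-01T00:00:00Z"),
    ("a.csv", "2024-03-01T00:00:00Z"),
    ("b.csv", "2024-01-15T00:00:00Z"),
]

print(versions_to_delete(versions, "minimal"))
print(versions_to_delete(versions, "strict"))
```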

Cost optimization based on file types and naming conventions: Many Azure Storage users also need to control blob tiering, expiration, and retention according to file types, naming conventions, or path prefixes. These attributes can be combined with blob properties like size, creation time, last modified or last accessed times, access tier, version counts, and more to process the objects as desired.
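The combination of path prefix, file type, size, and last accessed time can be sketched as a small decision function. This is an illustrative Python model only: the `backups/` prefix, the extension set, the size floor, and the 30-day and 180-day thresholds are assumptions chosen for the example, not recommendations or service defaults.

```python
from datetime import datetime, timedelta, timezone
from pathlib import PurePosixPath

MIN_SIZE_BYTES = 1024 * 1024        # assumed: leave blobs under 1 MiB alone
COOL_EXTENSIONS = {".log", ".csv"}  # assumed file types worth cooling


def choose_tier(name: str, size_bytes: int, last_accessed: datetime, now: datetime) -> str:
    """Pick a target tier from prefix, extension, size, and idle time."""
    if size_bytes < MIN_SIZE_BYTES:
        return "Hot"

    idle = now - last_accessed
    if name.startswith("backups/") and idle > timedelta(days=180):
        return "Archive"

    extension = PurePosixPath(name).suffix.lower()
    if extension in COOL_EXTENSIONS and idle > timedelta(days=30):
        return "Cool"

    return "Hot"


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
mib = 1024 * 1024

print(choose_tier("backups/db.bak", 10 * mib, now - timedelta(days=200), now))
print(choose_tier("data/report.CSV", 5 * mib, now - timedelta(days=45), now))
print(choose_tier("images/logo.png", 5 * mib, now - timedelta(days=400), now))
```

The rules are checked from most specific to least specific, so a blob that matches several rules gets the first applicable tier.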

Processing blobs on demand at scale: Azure Storage Actions can be used for processing billions of objects on demand in addition to continuous data management tasks. For example, you can set up tasks to clean up redundant and outdated datasets, reset tags on a portion of a dataset when an analytic pipeline needs to be restarted, or initialize blob tags for a new or modified process. You can also define tasks to rehydrate large datasets from the archive tier.

Getting started with Azure Storage Actions

We invite you to try Azure Storage Actions for object storage data management. During the preview, you can test the feature for free and pay only for the transactions that are initiated on your storage account. Pricing details will be published before the feature becomes generally available. Please visit the feature support page to view the list of supported regions. Start with the quickstart guide to create and run your first data management task. Please refer to the documentation for further information.

Thota nithya
Thota Nithya has been writing cloud computing articles for govindhtech since April 2023. She is a science graduate and a cloud computing enthusiast.