What is Data Automation?
Data automation is the practice of removing manual intervention from tasks such as extract, transform, load (ETL), data integration, data validation, and data analytics.
New Amazon Bedrock features improve data processing and retrieval.
Amazon Bedrock Data Automation (Preview)
With Amazon Bedrock Data Automation, you can build automated workflows for intelligent document processing (IDP), media analysis, and retrieval-augmented generation (RAG) easily and cost-effectively. Examples of the insights it can produce include video summaries of key moments, detection of inappropriate image content, and automated analysis of documents. You can customize outputs so that the insights match your specific business requirements. Amazon Bedrock Data Automation can be used as a standalone feature or as the parser when you set up a knowledge base for RAG workflows.
Amazon Bedrock Knowledge Bases now processes multimodal data: To help build applications that handle both the text and the visual elements in documents and images, you can configure a knowledge base to parse documents using either Amazon Bedrock Data Automation or a foundation model (FM) as the parser. Multimodal data processing can improve the accuracy and relevance of the responses you get from a knowledge base whose information is spread across both text and images.
Amazon Bedrock Knowledge Bases now supports GraphRAG (preview): AWS now offers one of the first fully managed GraphRAG capabilities. GraphRAG improves generative AI applications by combining RAG techniques with graphs, which gives end users more comprehensive and accurate responses.
Amazon Bedrock Knowledge Bases now supports structured data retrieval: This feature extends a knowledge base to support natural language querying of data warehouses and data lakes, so applications can access business intelligence (BI) through conversational interfaces and give more accurate answers that draw on critical enterprise data. Amazon Bedrock Knowledge Bases offers one of the first fully managed, out-of-the-box RAG solutions that can natively query structured data where it resides. The feature helps break down data silos across data sources and shortens the time needed to build generative AI applications from more than a month to a few days.
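As a rough illustration of the parser choice described above for multimodal knowledge bases, the sketch below builds the parsing section of a data source configuration as a plain Python dictionary. The key names and enum values (`parsingStrategy`, `BEDROCK_DATA_AUTOMATION`, `BEDROCK_FOUNDATION_MODEL`, `parsingModality`) are assumptions based on the shape of the `bedrock-agent` CreateDataSource request and should be checked against the current API reference. The helper name `build_parsing_configuration` and the model ARN are placeholders and are not part of any AWS SDK.

```python
from typing import Optional


def build_parsing_configuration(parser: str, model_arn: Optional[str] = None) -> dict:
    """Return a parsing configuration dict for a knowledge base data source.

    `parser` is "data_automation" or "foundation_model" (labels used only by
    this helper). Key names are assumptions; verify them before real use.
    """
    if parser == "data_automation":
        return {
            "parsingStrategy": "BEDROCK_DATA_AUTOMATION",
            "bedrockDataAutomationConfiguration": {"parsingModality": "MULTIMODAL"},
        }
    if parser == "foundation_model":
        if not model_arn:
            raise ValueError("a foundation model parser needs a model ARN")
        return {
            "parsingStrategy": "BEDROCK_FOUNDATION_MODEL",
            "bedrockFoundationModelConfiguration": {
                "modelArn": model_arn,
                "parsingModality": "MULTIMODAL",
            },
        }
    raise ValueError(f"unknown parser: {parser!r}")


# Placeholder ARN for illustration only.
config = build_parsing_configuration("foundation_model", "arn:aws:bedrock:example-model")
print(config["parsingStrategy"])
```

Keeping the choice behind one small function means the rest of an ingestion script does not need to know which parser was selected.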
These new capabilities make it simpler to build complete AI applications that can process, understand, and retrieve information from both structured and unstructured data sources. For example, a car insurance company can use Amazon Bedrock Data Automation to automate its claims adjudication workflow, which shortens the time needed to process auto claims and makes the claims department more productive.
Similarly, a media company can analyze television programs and extract the insights needed for strategic ad placement, such as scene summaries, industry-standard advertising taxonomies (IAB), and brand logos. A media production company can generate scene-by-scene summaries and capture key moments in its video assets. A financial services company can process complex financial documents that contain tables and charts, and use GraphRAG to understand the relationships between various financial institutions. All of these companies can use structured data retrieval to query their data warehouse while retrieving information from their knowledge base.
Let’s look at these features in more detail.
Presenting Amazon Bedrock Data Automation
Amazon Bedrock Data Automation is a feature of Amazon Bedrock that simplifies extracting valuable insights from unstructured, multimodal content such as documents, images, videos, and audio files.
Amazon Bedrock Data Automation provides a unified, API-driven experience that lets developers process multimodal content through a single interface, without having to orchestrate and manage multiple AI models and services. Built-in safeguards such as visual grounding and confidence scores improve the accuracy and trustworthiness of the extracted insights, which makes them easier to integrate into enterprise workflows.
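One way to picture how confidence scores ease integration: a downstream process can accept high-confidence fields automatically and route the rest to a person. The sketch below assumes a hypothetical result shape (a list of fields with `name`, `value`, and `confidence` keys) and an arbitrary threshold of 0.8. The actual output format of Amazon Bedrock Data Automation is not shown in this article and may differ.

```python
from typing import Any, Dict, List, Tuple

Field = Dict[str, Any]


def split_by_confidence(
    fields: List[Field], threshold: float = 0.8
) -> Tuple[List[Field], List[Field]]:
    """Separate extracted fields into auto-accepted and needs-review lists."""
    accepted: List[Field] = []
    review: List[Field] = []
    for item in fields:
        # Treat a missing score as zero confidence so the field is always reviewed.
        score = float(item.get("confidence", 0.0))
        (accepted if score >= threshold else review).append(item)
    return accepted, review


# Hypothetical extraction result for an auto insurance claim form.
extracted = [
    {"name": "policy_number", "value": "PN-1042", "confidence": 0.97},
    {"name": "claim_amount", "value": "1250.00", "confidence": 0.62},
    {"name": "incident_date", "value": "2024-11-03"},
]

accepted, review = split_by_confidence(extracted)
print([f["name"] for f in accepted])  # ['policy_number']
print([f["name"] for f in review])    # ['claim_amount', 'incident_date']
```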
Amazon Bedrock Data Automation supports four modalities: documents, images, video, and audio. When used in an application, all modalities share the same asynchronous inference API, and results are written to an Amazon Simple Storage Service (Amazon S3) bucket.
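Because inference is asynchronous, a caller typically submits a job, polls until it completes, and then reads the result from the S3 output location. The sketch below shows only that control flow. The `get_status` callable stands in for whatever status call the service exposes, and the status strings (`IN_PROGRESS`, `SUCCESS`, `FAILED`) and the `output_s3_uri` key are placeholders rather than the real API's values.

```python
import time
from typing import Callable, Dict


def wait_for_result(
    get_status: Callable[[], Dict[str, str]],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll an asynchronous job until it finishes; return the output S3 URI."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = get_status()
        status = response["status"]
        if status == "SUCCESS":
            return response["output_s3_uri"]
        if status == "FAILED":
            raise RuntimeError(response.get("error", "job failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_seconds)


# Simulated job: two in-progress responses, then success.
responses = iter([
    {"status": "IN_PROGRESS"},
    {"status": "IN_PROGRESS"},
    {"status": "SUCCESS", "output_s3_uri": "s3://example-bucket/output/job-1/"},
])
uri = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
print(uri)  # s3://example-bucket/output/job-1/
```

Passing the status call and the sleep function in as arguments keeps the loop testable without network access; in a real application they would wrap the service client and `time.sleep`.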
For each modality, you can configure the output to suit your processing needs and generate two types of output:
Standard output: Standard output gives you predefined insights that are relevant to the type of input data. Examples include semantic representations of documents, scene-by-scene video summaries, and audio transcripts. You can choose which insights to extract in just a few steps.
Custom output: With custom output, you define and describe your extraction needs using artifacts called “blueprints”, which produce insights tailored to your business needs. You can also transform the output into a specific format or schema that is compatible with downstream systems such as databases or other applications.
Standard output is available for all modalities (audio, documents, images, and videos). During the preview, custom output is limited to documents and images.
Both standard and custom output configurations can be saved in a project and then referenced through the Amazon Bedrock Data Automation inference API. A project can be configured to generate both standard and custom output for every processed file.
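To make the relationship between projects, standard output, and blueprints concrete, here is a small data-model sketch. It is not the service's API: the class names (`Blueprint`, `Project`), the field layout, and the insight names are all hypothetical. It encodes only what the text above states, namely that a project can hold both kinds of configuration and that custom output is limited to two of the four modalities during the preview.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MODALITIES = {"document", "image", "video", "audio"}
# Per the preview limits described above, blueprints apply to these only.
CUSTOM_OUTPUT_MODALITIES = {"document", "image"}


@dataclass
class Blueprint:
    """Hypothetical custom-output definition: field name -> extraction instruction."""
    name: str
    modality: str
    fields: Dict[str, str]

    def __post_init__(self) -> None:
        if self.modality not in CUSTOM_OUTPUT_MODALITIES:
            raise ValueError(f"custom output is not available for {self.modality!r}")


@dataclass
class Project:
    """Hypothetical container for saved standard and custom output settings."""
    name: str
    standard_output: Dict[str, List[str]] = field(default_factory=dict)
    blueprints: List[Blueprint] = field(default_factory=list)

    def outputs_for(self, modality: str) -> Dict[str, List[str]]:
        """List what this project would produce for a file of the given modality."""
        if modality not in MODALITIES:
            raise ValueError(f"unsupported modality: {modality!r}")
        return {
            "standard": self.standard_output.get(modality, []),
            "custom": [b.name for b in self.blueprints if b.modality == modality],
        }


claims = Project(
    name="auto-claims",
    standard_output={"document": ["semantic_representation"], "video": ["scene_summaries"]},
    blueprints=[Blueprint("claim_form", "document", {"policy_number": "The policy ID on the form"})],
)
print(claims.outputs_for("document"))
# {'standard': ['semantic_representation'], 'custom': ['claim_form']}
print(claims.outputs_for("video"))
# {'standard': ['scene_summaries'], 'custom': []}
```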