The integration of Amazon SageMaker Lakehouse with Amazon S3 Tables is now generally available.
Amazon introduced Amazon SageMaker Lakehouse to simplify analytics and artificial intelligence with a single, open, and secure data lakehouse, and Amazon S3 Tables, the first cloud object store with built-in Apache Iceberg support, to simplify storing tabular data at scale.
AWS also demonstrated how to integrate S3 Tables with Amazon Web Services (AWS) analytics services so that you can use Amazon Athena, Amazon Data Firehose, Amazon EMR, AWS Glue, Amazon Redshift, and Amazon QuickSight to stream, query, and visualize S3 Tables data.
S3 Tables were created in response to customers' desire to simplify the management and optimization of their Apache Iceberg storage. At the same time, customers were using SageMaker Lakehouse to dismantle the data silos that hinder analytics collaboration and insight development.
Combined, S3 Tables and SageMaker Lakehouse, along with built-in connectivity to AWS analytics services, give customers a complete platform that unifies access to various data sources for both analytics and machine learning (ML) workflows.
The integration of Amazon S3 Tables with Amazon SageMaker Lakehouse is now generally available, enabling unified access to S3 Tables data across many analytics engines and tools. SageMaker Lakehouse is accessible through Amazon SageMaker Unified Studio, a unified data and AI development environment that combines features and tools from AWS analytics and AI/ML services. SageMaker Unified Studio and engines such as Amazon Athena, Amazon EMR, and Amazon Redshift, as well as Apache Iceberg-compatible engines such as Apache Spark or PyIceberg, can query any S3 table data that is integrated with SageMaker Lakehouse.
This integration makes it easier to build secure analytical workflows that read and write to S3 Tables and combine that data with data in Amazon Redshift data warehouses and in third-party and federated data sources, such as Amazon DynamoDB or PostgreSQL.

Additionally, you can define and manage fine-grained access permissions centrally for the data in S3 Tables, alongside other data in the SageMaker Lakehouse, and apply them consistently across all analytics and query engines.
S3 Tables integration with Amazon SageMaker Lakehouse in action
To begin using table buckets with AWS analytics services, navigate to the Amazon S3 console, choose Table buckets in the navigation pane, and then choose Enable integration.
You can now create your table bucket, which will be integrated with Amazon SageMaker Lakehouse.
Create a table on Amazon S3 using Amazon Athena
Amazon Athena lets you build a table, load it with data, and query it from the Amazon S3 console in a few easy steps. Either choose Create table with Athena after selecting a table bucket, or choose Query table with Athena after selecting an existing table.
Before you can create a table with Athena, you must first define a namespace for it. The namespace in an Amazon S3 table bucket is equivalent to a database in AWS Glue, and you use it as the database name in your Athena queries.
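As a hedged sketch, the DDL that the Create table with Athena flow generates looks roughly like the following; the namespace `proddb`, the table name, and the columns are hypothetical placeholders, and the exact statement Athena pre-populates may differ:

```sql
-- Create an Apache Iceberg table named "customer" in the "proddb" namespace
-- of the integrated table bucket; all names here are placeholders.
CREATE TABLE IF NOT EXISTS `proddb`.`customer` (
  c_custkey BIGINT,
  c_name    STRING,
  c_region  STRING
)
TBLPROPERTIES ('table_type' = 'iceberg');
```

The `table_type` property marks the table as Iceberg so that Athena manages it in the open Iceberg format rather than as a classic Hive-style table.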
Use the SageMaker Unified Studio to query SageMaker Lakehouse
SageMaker Unified Studio now allows you to access unified data from third-party and federated data sources, Redshift data warehouses, and S3 data lakes in Amazon SageMaker Lakehouse.
To get started, create a SageMaker Unified Studio domain and a project using the Data analytics and AI-ML model development project profile.
After the project is created, open the project overview, scroll down to project details, and note the project role Amazon Resource Name (ARN).

Go to the AWS Lake Formation console to grant access to AWS Identity and Access Management (IAM) users and roles. In the Principals section, choose the project role ARN noted in the preceding step. In the LF-Tags or catalog resources section, choose Named Data Catalog resources, then select the table bucket name you created under Catalogs.
When you choose Query with Athena, data query language (DQL) and data manipulation language (DML) queries are automatically run against S3 tables using Athena.
Here is an example Athena query:
select * from "s3tablecatalog/s3tables-integblog-bucket"."proddb"."customer" limit 10;
Before you can query with Amazon Redshift, you need to set up Amazon Redshift Serverless compute resources for data query analysis. Then, choose Query with Redshift and run SQL in the query editor. If you want to use a JupyterLab notebook, create a new JupyterLab space in Amazon EMR Serverless.
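As a minimal, hedged sketch, a Redshift query against the integrated table might look like the following, assuming the integrated catalog is exposed in Redshift under the same three-part naming shown in the Athena example; the catalog path Redshift actually presents may differ:

```sql
-- Count rows in the "customer" table from the integrated S3 table bucket;
-- the catalog path below mirrors the Athena example and is a placeholder.
SELECT count(*)
FROM "s3tablecatalog/s3tables-integblog-bucket"."proddb"."customer";
```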
Join data from other sources with S3 Tables data
S3 Tables data is now available in SageMaker Lakehouse, and you can combine it with data from data warehouses, online transaction processing (OLTP) sources such as relational and non-relational databases, Iceberg tables, and other sources to gain deeper insights.
By connecting to sources such as Google BigQuery, Amazon DocumentDB, Amazon DynamoDB, Amazon Redshift, PostgreSQL, MySQL, or Snowflake, you can combine data using SQL without writing ETL scripts.
To join S3 Tables data with DynamoDB data, run a SQL query in the query editor.
Here is an example join query between DynamoDB and S3 Tables data in Athena:
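A hedged sketch of such a join, assuming a DynamoDB source has already been registered in SageMaker Lakehouse; the DynamoDB catalog name, schema, and all column names below are hypothetical placeholders:

```sql
-- Join customer rows stored in an S3 table with order data from a
-- DynamoDB source registered in SageMaker Lakehouse; names are placeholders.
SELECT c.c_name,
       o.order_id,
       o.order_total
FROM "s3tablecatalog/s3tables-integblog-bucket"."proddb"."customer" c
JOIN "dynamodb_catalog"."default"."orders" o
  ON c.c_custkey = o.customer_id
LIMIT 10;
```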

Now available
The integration of S3 Tables with SageMaker Lakehouse is now generally available in all AWS Regions where S3 Tables are available.