AI Generate Table: Extracts Structured Data From Images

The explosion of digital information from social media, smartphones, and other sources has produced a vast amount of unstructured data, including documents, videos, and images. To help you analyze this data, BigQuery is integrated with Vertex AI, Google Cloud’s AI platform. This lets you use advanced AI models, such as Gemini 2.5 Pro and Gemini 2.5 Flash, to uncover the meaning hidden in your unstructured data.

Google’s AI models can analyze many types of data, including text, images, audio, and video. By extracting important information such as names, dates, and keywords, they can turn raw data into structured insights that work with your existing tools. These models can also produce structured data in JSON format using techniques such as constrained decoding, which helps ensure compatibility with your workflows.

To streamline this process further, Google Cloud recently introduced AI.GENERATE_TABLE(), a new BigQuery function that builds on the capabilities of ML.GENERATE_TEXT(). Given a prompt and a table schema, this function automatically converts the insights from your unstructured data into a structured table in BigQuery. You can then analyze the extracted information with your existing data analysis tools.

Extracting structured data from images

Let’s examine this new function in more detail using three sample images. The first is a photo of the Seattle skyline with the famous Space Needle. The second is a city view of New York City. The last is a picture of cookies and flowers, which has nothing to do with cityscapes.

Seattle skyline featuring the iconic Space Needle
Image credit to Google Cloud
city view of New York City
Image credit to Google Cloud
Cookies and flowers
Image credit to Google Cloud

To use these images with BigQuery’s generative AI features, you must first make them available to BigQuery. To do this, create an external table called “image_dataset” that points to the Google Cloud Storage bucket containing the images.

CREATE OR REPLACE EXTERNAL TABLE
 bqml_tutorial.image_dataset
WITH CONNECTION DEFAULT 
OPTIONS(object_metadata="DIRECTORY",
   uris=["gs://bqml-tutorial-bucket/images/*"])
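
To check that the table picked up the files, a quick query like the following may help. This is a minimal sketch; it assumes the external table above behaves as a BigQuery object table, which exposes metadata columns such as uri, content_type, and size.

```sql
-- List a few of the objects that the table discovered in the bucket.
-- Assumes the standard object table metadata columns are available.
SELECT
  uri,
  content_type,
  size
FROM
  bqml_tutorial.image_dataset
LIMIT 10;
```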

Now that your image data is ready, let’s connect to the Gemini 2.5 Flash model. You do this by creating a “remote model” in BigQuery that serves as a bridge to the model.

CREATE OR REPLACE MODEL
 bqml_tutorial.gemini25flash
REMOTE WITH CONNECTION DEFAULT 
OPTIONS (endpoint = "gemini-2.5-flash-preview-05-20")

Let’s now analyze the images using the AI.GENERATE_TABLE() function. You need to give the function two things: the remote model you created (connected to Gemini 2.5 Flash) and the table that contains your images.

The model will be instructed to “Recognize the city from the picture and output its name, belonging state, brief history, and tourist attractions. Please output nothing if the image is not a city.” To keep the results well organized and easy to use, you specify a structured output format with the following fields:

  • city_name (string)
  • state (string)
  • brief_history (string)
  • attractions (array of strings)

This format, called a schema, keeps the output consistent and compatible with other BigQuery tools. You’ll notice that the syntax used to define this schema is the same as the column definitions in BigQuery’s CREATE TABLE statement.
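
For comparison, here is a sketch of how the same four columns would look in a CREATE TABLE statement. The table name city_info_example is hypothetical and used only for illustration; the column list is the part that carries over to the output_schema string.

```sql
-- Hypothetical table, shown only to compare column-definition syntax.
CREATE TABLE bqml_tutorial.city_info_example (
  city_name STRING,
  state STRING,
  brief_history STRING,
  attractions ARRAY<STRING>
);

-- The equivalent output_schema value is the same column list as a string:
-- "city_name STRING, state STRING, brief_history STRING, attractions ARRAY<STRING>"
```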

SELECT
 city_name,
 state,
 brief_history,
 attractions,
 uri
FROM
 AI.GENERATE_TABLE( MODEL bqml_tutorial.gemini25flash,
   (
   SELECT
     ("Recognize the city from the picture and output its name, belonging state, brief history, and tourist attractions. Please output nothing if the image is not a city.", ref) AS prompt,
     uri
   FROM
     bqml_tutorial.image_dataset),
   STRUCT( "city_name STRING, state STRING, brief_history STRING, attractions ARRAY<STRING>" AS output_schema,
     8192 AS max_output_tokens))

When this query is executed, it returns a table with five columns. Four of them (city_name, state, brief_history, and attractions) match the schema you specified, and the fifth contains the image URI from the input table.

As you can see, the model correctly recognized the cities in the first two images and provided their names and the states in which they are located. Drawing on its own knowledge, it even produced a brief history and a list of attractions for each city. This illustrates how large language models can be used to extract information and insights directly from images.

Large language models to extract information and insights directly from images
Image credit to Google Cloud
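
Because the output is an ordinary table, it can be saved and queried like any other BigQuery data. The following is a minimal sketch, not part of the original walkthrough: it stores the results in a hypothetical table named city_results and then flattens the attractions array into one row per attraction.

```sql
-- Persist the generated rows. `bqml_tutorial.city_results` is a hypothetical name.
CREATE OR REPLACE TABLE bqml_tutorial.city_results AS
SELECT
  city_name,
  state,
  brief_history,
  attractions,
  uri
FROM
  AI.GENERATE_TABLE( MODEL bqml_tutorial.gemini25flash,
    (
    SELECT
      ("Recognize the city from the picture and output its name, belonging state, brief history, and tourist attractions. Please output nothing if the image is not a city.", ref) AS prompt,
      uri
    FROM
      bqml_tutorial.image_dataset),
    STRUCT( "city_name STRING, state STRING, brief_history STRING, attractions ARRAY<STRING>" AS output_schema,
      8192 AS max_output_tokens));

-- Flatten the array: one output row per (city, attraction) pair.
-- Rows whose attractions array is empty or NULL produce no rows in this join.
SELECT
  city_name,
  state,
  attraction
FROM
  bqml_tutorial.city_results,
  UNNEST(attractions) AS attraction;
```

Saving the results first avoids calling the model again for every follow-up query.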

Extracting structured data from medical transcriptions

Let’s now look at another example, in which AI.GENERATE_TABLE extracts information from unstructured data stored in a BigQuery managed table. It uses the Kaggle Medical Transcriptions dataset, which contains sample medical transcriptions from a variety of specialties.

Transcriptions are long and verbose and contain a variety of information, such as a patient’s age, weight, blood pressure, conditions, and more. Sorting and organizing them manually is difficult and time-consuming. With AI.GENERATE_TABLE and an LLM, this work can now be automated.

Let’s say you need the following information:

  • age (int64)
  • blood_pressure (struct<high int64, low int64>)
  • weight (float64)
  • conditions (array of strings)
  • diagnosis (array of strings)
  • medications (array of strings)

You can use the following SQL query:

SELECT
 age,
 blood_pressure,
 weight,
 conditions,
 diagnosis,
 medications,
 prompt
FROM
 AI.GENERATE_TABLE(MODEL bqml_tutorial.gemini25flash,
   (
   SELECT
     input_text AS prompt
   FROM
     bqml_tutorial.kaggle_medical_transcriptions
   LIMIT
     3),
   STRUCT(
     "age INT64, blood_pressure STRUCT<high INT64, low INT64>, weight FLOAT64, conditions ARRAY<STRING>, diagnosis ARRAY<STRING>, medications ARRAY<STRING>" AS output_schema,
     1024 AS max_output_tokens))
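
Once the fields are typed, they can be filtered and aggregated with regular SQL. The sketch below wraps the same call in a common table expression and reads the nested blood_pressure fields. It assumes AI.GENERATE_TABLE can be referenced inside a WITH clause like other table-valued functions, and the threshold of 140 is an arbitrary value chosen only for illustration.

```sql
-- Sketch: query the extracted fields directly. The CTE name `extracted` is illustrative.
WITH extracted AS (
  SELECT
    age,
    blood_pressure,
    conditions
  FROM
    AI.GENERATE_TABLE(MODEL bqml_tutorial.gemini25flash,
      (
      SELECT
        input_text AS prompt
      FROM
        bqml_tutorial.kaggle_medical_transcriptions
      LIMIT
        3),
      STRUCT(
        "age INT64, blood_pressure STRUCT<high INT64, low INT64>, weight FLOAT64, conditions ARRAY<STRING>, diagnosis ARRAY<STRING>, medications ARRAY<STRING>" AS output_schema,
        1024 AS max_output_tokens))
)
SELECT
  age,
  blood_pressure.high AS bp_high,
  blood_pressure.low AS bp_low,
  ARRAY_LENGTH(conditions) AS condition_count
FROM
  extracted
WHERE
  -- Illustrative threshold only, not clinical guidance.
  blood_pressure.high >= 140;
```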

With the AI.GENERATE_TABLE() function, you can transform your unstructured data into a structured BigQuery table for easy analysis and integration with your existing workflows.