What does OSS mean? Expand your AI toolkit: Google Cloud Vertex AI now fully supports open-source LLMs, offering developers more model choices than ever.
Using familiar SQL syntax, BigQuery Machine Learning lets you apply large language models (LLMs), such as Gemini, to your data to carry out operations like entity extraction, sentiment analysis, translation, text generation, and more.
Google Cloud is expanding this functionality to support any open-source LLM from the Vertex AI Model Garden, including OSS models you have fine-tuned and models you deploy from Hugging Face. This significantly increases the number of models developers can choose from.
Google Cloud demonstrates how this integration works using the Meta Llama 3.3 70B model, but by following the same steps you can use any of the 170K+ text-generation models that Hugging Face offers.
What does OSS mean?
OSS stands for Open Source Software: software licensed so that anyone can access, modify, and share its source code. Because organisations and developer communities collaborate to build it, open-source software promotes openness, flexibility, and innovation.
Popular OSS examples include:
- Linux (operating system)
- Apache (web server)
- Mozilla Firefox (web browser)
- LibreOffice (office suite)
- Git (version control system)
Using Open-Source Software (OSS) models with BigQuery ML
Host the model on a Vertex AI endpoint
First, select a Hugging Face text-generation model. Next, choose Vertex AI Model Garden > Hugging Face Deploy. Enter the model URL and, if desired, change the deployment endpoint's machine specification, deployment region, and endpoint name.
Alternatively, you can use the Vertex AI Model Garden UI to search for "Llama 3.3", accept the terms, and deploy the model endpoint. This step can also be completed programmatically; refer to the tutorial for details.
Note: To use Llama models, you must accept the terms in the Vertex AI Model Garden UI or agree to the Llama 3.3 Community License Agreement on the Llama 3.3 model card on Hugging Face. Complete this step before deploying the model.
Create a remote model in BigQuery
Deploying the model takes a few minutes. Once the deployment finishes, create a remote model in BigQuery with a SQL query similar to this one:
CREATE OR REPLACE MODEL bqml_tutorial.llama_3_3_70b
  REMOTE WITH CONNECTION `LOCATION.CONNECTION_ID`
  OPTIONS (
    endpoint = 'https://<region>-aiplatform.googleapis.com/v1/projects/<project_name>/locations/<region>/endpoints/<endpoint_id>'
  );
You must supply a "connection" so that BigQuery can communicate with the remote endpoint. If you don't already have one, you can create a connection by following the steps in the BigQuery documentation. In the code sample above, replace the placeholder endpoint with your endpoint URL. You can retrieve the endpoint_id from the console under Vertex AI > Online Prediction > Endpoints > Sample Request.
Perform inference
You can now use BigQuery ML to run inference against this model. As an example, consider a dataset of medical transcripts: unstructured, varied raw transcripts documenting the histories, diagnoses, and treatments of a medical facility's patients. An example transcript is shown in the image below:
Create a table
To analyse this data in BigQuery, first create a table.
LOAD DATA OVERWRITE bqml_tutorial.medical_transcript
  FROM FILES (
    format = 'NEWLINE_DELIMITED_JSON',
    uris = ['gs://cloud-samples-data/vertex-ai/model-evaluation/peft_eval_sample.jsonl']
  );
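Once the table is loaded, it is worth spot-checking a few rows before running inference. The query below is a minimal sketch; the column name input_text is an assumption about the JSONL schema, so check the table schema in the BigQuery console and substitute the column names LOAD DATA actually inferred for this file.

```sql
-- Preview a few raw transcripts to verify the load.
-- NOTE: input_text is an assumed column name; adjust it to match
-- the schema that LOAD DATA inferred from the JSONL file.
SELECT input_text
FROM bqml_tutorial.medical_transcript
LIMIT 5;
```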
Perform inference
You can now use your Llama model to extract structured data from the unstructured transcripts in your table. Suppose you want to extract each patient's age, gender, and list of diseases from every entry. You can save the resulting insights to a table with a SQL query; in the model prompt, include the data you wish to extract along with its schema.
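The article does not reproduce the extraction query itself. The sketch below shows one plausible shape for it using BigQuery ML's ML.GENERATE_TEXT function with the remote model created earlier; the prompt wording, the input_text column, and the destination table name are illustrative assumptions, not the article's exact code.

```sql
-- Sketch: extract structured fields from each transcript using the
-- remote Llama model. Prompt text, input_text column, and destination
-- table name are assumptions for illustration.
CREATE OR REPLACE TABLE bqml_tutorial.medical_transcript_analysis AS
SELECT *
FROM ML.GENERATE_TEXT(
  MODEL bqml_tutorial.llama_3_3_70b,
  (
    SELECT CONCAT(
      'Extract the Age (as INT64), Gender, and Disease (as an array) ',
      'from the following transcript and return them as a JSON object: ',
      input_text
    ) AS prompt
    FROM bqml_tutorial.medical_transcript
  ),
  STRUCT(0.0 AS temperature, 1024 AS max_output_tokens)
);
```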
Because this Llama endpoint includes the input prompt in its output, Google Cloud also created and used an ExtractOutput function to help parse the output. The output table below shows the results, with the extracted findings in the "generated_text" column:
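The article does not show the ExtractOutput helper itself. One plausible sketch of the same idea in SQL, assuming the endpoint simply echoes the prompt verbatim before the model's JSON answer, is a function that strips the echoed prompt and keeps only the generated text:

```sql
-- Sketch of an ExtractOutput-style helper (hypothetical; the article's
-- actual implementation is not shown). Assumes the raw output is the
-- prompt echoed verbatim, followed by the model's JSON answer.
CREATE TEMP FUNCTION ExtractOutput(raw_output STRING, prompt STRING)
RETURNS STRING AS (
  TRIM(SUBSTR(raw_output, LENGTH(prompt) + 1))
);

-- Example: keep only the JSON that follows the echoed prompt.
SELECT ExtractOutput(
  'Extract Age, Gender, Disease: {"Age": 34, "Gender": "Female", "Disease": ["arthritis"]}',
  'Extract Age, Gender, Disease: '
) AS generated_text;
```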
Perform analytics on results
You can now run a variety of analytics on this data. For example, a simple SQL query answers the question "What are the most common diseases among females aged 30 and over in this sample?" The most common are "hypertension," "arthritis," and "hyperlipidaemia."
WITH parsed_data AS (
  SELECT
    JSON_EXTRACT_SCALAR(generated_text, '$.Gender') AS gender,
    CAST(JSON_EXTRACT_SCALAR(generated_text, '$.Age') AS INT64) AS age,
    JSON_EXTRACT_ARRAY(generated_text, '$.Disease') AS diseases
  FROM
    bqml_tutorial.medical_transcript_analysis_test
)
SELECT
  disease,
  COUNT(*) AS occurrence
FROM
  parsed_data, UNNEST(diseases) AS disease
WHERE
  LOWER(gender) = 'female'
  AND age >= 30
GROUP BY
  disease
ORDER BY
  occurrence DESC
LIMIT 3;
Start now
With the BigQuery and Vertex AI Model Garden integration, you can try BigQuery with a tuned or distilled model, or with your own favourite open model.