What does OSS mean? Expand your AI toolkit: Google Cloud Vertex AI now fully supports open-source LLMs, offering developers more model choices than ever.
Using familiar SQL syntax, BigQuery Machine Learning lets you apply large language models (LLMs), such as Gemini, to your data to carry out operations like entity extraction, sentiment analysis, translation, text generation, and more.
Google Cloud is expanding this functionality to support any open-source LLM from the Vertex AI Model Garden, including OSS models you have fine-tuned and models you deploy from Hugging Face. This significantly increases the number of models developers can choose from.
Google Cloud demonstrates how this integration works using the Meta Llama 3.3 70B model, but by following the same steps you can use any of the 170K+ text-generation models that Hugging Face offers.
What does OSS mean?
OSS stands for Open Source Software: software licensed so that anyone can access, modify, and share its source code. Because organisations and developer communities collaborate to build it, open-source software promotes openness, flexibility, and innovation.
Popular OSS examples include:
- Linux (operating system)
- Apache (web server)
- Mozilla Firefox (web browser)
- LibreOffice (office suite)
- Git (version control system)
Using Open-Source Software (OSS) models with BigQuery ML
Host the model on a Vertex AI endpoint
First, select a Hugging Face text-generation model. Next, choose Vertex AI Model Garden > Hugging Face Deploy. Enter the model URL and, if desired, change the deployment endpoint's machine specification, deployment region, and endpoint name.
Alternatively, you can use the Vertex AI Model Garden UI to search for "Llama 3.3", accept the terms, and deploy the model endpoint. This step can also be completed programmatically; refer to the tutorial for details.
Note: To use Llama models, you must accept the terms in the Vertex AI Model Garden UI or agree to the Llama 3.3 Community License Agreement on the Llama 3.3 model card on Hugging Face. Complete this step before deploying the model.
Create a remote model in BigQuery
Deploying the model takes a few minutes. Once the deployment finishes, create a remote model in BigQuery with a SQL query similar to this one:
CREATE OR REPLACE MODEL bqml_tutorial.llama_3_3_70b
  REMOTE WITH CONNECTION `LOCATION.CONNECTION_ID`
  OPTIONS (
    endpoint = 'https://<region>-aiplatform.googleapis.com/v1/projects/<project_name>/locations/<region>/endpoints/<endpoint_id>'
  );
You must supply a "connection" so that BigQuery can communicate with the remote endpoint. If you don't already have one, you can create a connection by following the steps in the BigQuery documentation. In the code sample above, replace the placeholder endpoint with your endpoint URL. You can retrieve the endpoint_id from the console under Vertex AI > Online Prediction > Endpoints > Sample Request.
Perform inference
You can now use BigQuery ML to run inference against this model. As an example, consider a dataset of medical transcripts: unstructured, varied raw transcripts documenting the histories, diagnoses, and treatments of a medical facility's patients. An example transcript is shown in the image below:
Create a table
To analyse this data in BigQuery, first create a table.
LOAD DATA OVERWRITE bqml_tutorial.medical_transcript
  FROM FILES (
    format = 'NEWLINE_DELIMITED_JSON',
    uris = ['gs://cloud-samples-data/vertex-ai/model-evaluation/peft_eval_sample.jsonl']
  );
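Once the table is loaded, it is worth spot-checking a few rows before running inference. The query below is a minimal sketch; the column name input_text is an assumption about the JSONL schema, so check the table schema in the BigQuery console and substitute the column names LOAD DATA actually inferred for this file.

```sql
-- Preview a few raw transcripts to verify the load.
-- NOTE: input_text is an assumed column name; adjust it to match
-- the schema that LOAD DATA inferred from the JSONL file.
SELECT input_text
FROM bqml_tutorial.medical_transcript
LIMIT 5;
```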
Perform inference
You can now use your Llama model to extract structured data from the unstructured transcripts in your table. Suppose you want to extract each patient's age, gender, and list of diseases from every entry. You can save the resulting insights to a table with a SQL query; in the model prompt, include the data you wish to extract along with its schema.
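The article does not reproduce the extraction query itself. The sketch below shows one plausible shape for it using BigQuery ML's ML.GENERATE_TEXT function with the remote model created earlier; the prompt wording, the input_text column, and the destination table name are illustrative assumptions, not the article's exact code.

```sql
-- Sketch: extract structured fields from each transcript using the
-- remote Llama model. Prompt text, input_text column, and destination
-- table name are assumptions for illustration.
CREATE OR REPLACE TABLE bqml_tutorial.medical_transcript_analysis AS
SELECT *
FROM ML.GENERATE_TEXT(
  MODEL bqml_tutorial.llama_3_3_70b,
  (
    SELECT CONCAT(
      'Extract the Age (as INT64), Gender, and Disease (as an array) ',
      'from the following transcript and return them as a JSON object: ',
      input_text
    ) AS prompt
    FROM bqml_tutorial.medical_transcript
  ),
  STRUCT(0.0 AS temperature, 1024 AS max_output_tokens)
);
```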
Because this Llama endpoint includes the input prompt in its output, Google Cloud also created and used an ExtractOutput function to help parse the output. The output table below shows the results, with the extracted findings in the "generated_text" column:
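The article does not show the ExtractOutput helper itself. One plausible sketch of the same idea in SQL, assuming the endpoint simply echoes the prompt verbatim before the model's JSON answer, is a function that strips the echoed prompt and keeps only the generated text:

```sql
-- Sketch of an ExtractOutput-style helper (hypothetical; the article's
-- actual implementation is not shown). Assumes the raw output is the
-- prompt echoed verbatim, followed by the model's JSON answer.
CREATE TEMP FUNCTION ExtractOutput(raw_output STRING, prompt STRING)
RETURNS STRING AS (
  TRIM(SUBSTR(raw_output, LENGTH(prompt) + 1))
);

-- Example: keep only the JSON that follows the echoed prompt.
SELECT ExtractOutput(
  'Extract Age, Gender, Disease: {"Age": 34, "Gender": "Female", "Disease": ["arthritis"]}',
  'Extract Age, Gender, Disease: '
) AS generated_text;
```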
Perform analytics on results
You can now run a variety of analytics on this data. For example, a simple SQL query answers the question "What are the most common diseases among females aged 30 and over in this sample?" The most common are "hypertension," "arthritis," and "hyperlipidaemia."
WITH parsed_data AS (
  SELECT
    JSON_EXTRACT_SCALAR(generated_text, '$.Gender') AS gender,
    CAST(JSON_EXTRACT_SCALAR(generated_text, '$.Age') AS INT64) AS age,
    JSON_EXTRACT_ARRAY(generated_text, '$.Disease') AS diseases
  FROM
    bqml_tutorial.medical_transcript_analysis_test
)
SELECT
  disease,
  COUNT(*) AS occurrence
FROM
  parsed_data, UNNEST(diseases) AS disease
WHERE
  LOWER(gender) = 'female'
  AND age >= 30
GROUP BY
  disease
ORDER BY
  occurrence DESC
LIMIT 3;
Start now
With the BigQuery and Vertex AI Model Garden integration, you can try BigQuery with a tuned or distilled model, or with your own favourite open model.