Use contribution analysis in BigQuery ML to discover important insights.
As data volumes grow, organizations struggle to interpret changes in their data. It becomes hard to determine what is driving major trends and fluctuations, which slows decision-making. A company may ask, “What factors drove revenue growth between Q1 and Q2?” or “Why did an ad’s click-through rate drop 5% over the last week?”
Answering such questions requires tools that can analyze many data segments at a time to uncover statistically significant key drivers. Google Cloud is announcing the public preview of contribution analysis in BigQuery ML to help enterprises identify insights and patterns in their data interactively and at scale.
Contribution analysis
Contribution analysis, sometimes referred to as key driver analysis, helps you understand how key metrics in your multi-dimensional data have changed. For instance, you can use it to see how revenue figures changed between two quarters, or to compare two sets of training data to understand changes in an ML model’s performance. To create a contribution analysis model in BigQuery, use the CREATE MODEL statement.
Contribution analysis is part of augmented analytics, the application of artificial intelligence (AI) to improve and automate data analysis and comprehension. It serves one of the main objectives of augmented analytics: finding patterns in users’ data.
By comparing a test set of data with a control set, a contribution analysis model finds segments of data that exhibit statistically significant changes in a metric. This lets you track how data changes across time, location, customer segment, or other dimensions. For example, you can use a table snapshot from 2023 as the test data and a snapshot from 2022 as the control data to see how the data changed between the two years.
A contribution analysis model uses a metric, which is a numerical value, to measure and compare the difference between the test and control data. You can specify either a summable metric or a summable ratio metric.
A segment is a portion of the data that is distinguished by a particular set of dimension values. For instance, in a contribution analysis model built on the store_number, customer_id, and day dimensions, every possible combination of values for those dimensions constitutes a segment.
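To make the idea of a segment concrete, here is a minimal Python sketch that enumerates segments from a handful of rows. The rows are made up for illustration, and the sketch assumes that a segment may fix only some of the dimensions and leave the rest unconstrained; it is not a description of how BigQuery ML enumerates segments internally.

```python
from itertools import combinations

# Hypothetical rows; the column names follow the example in the text.
rows = [
    {"store_number": 1, "customer_id": "a", "day": "Mon"},
    {"store_number": 1, "customer_id": "b", "day": "Mon"},
    {"store_number": 2, "customer_id": "a", "day": "Tue"},
]
dimensions = ["store_number", "customer_id", "day"]


def enumerate_segments(rows, dimensions):
    """Return every distinct segment observed in the rows.

    A segment is a tuple of (dimension, value) pairs. Dimensions that do
    not appear in the tuple are left unconstrained (an assumption made
    for this sketch).
    """
    segments = set()
    for size in range(1, len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in rows:
                segments.add(tuple((d, row[d]) for d in dims))
    return segments


segments = enumerate_segments(rows, dimensions)
for segment in sorted(segments, key=lambda s: (len(s), str(s))):
    print(segment)
```

Even three rows with three dimensions produce well over a dozen segments, which is why the number of candidate segments grows quickly as dimensions are added.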
Contribution analysis lets you examine metrics of interest in your dataset across specified test and control subsets. It works by identifying which combinations of “contributors” cause unexpected changes, and it scales well because pruning optimizations reduce the search space. Many industries and use cases can benefit from this kind of analysis. Examples include:
- Telemetry monitoring: Analyze changes in events logged by software applications.
- Sales and advertising: Analyze user engagement and click-through rates to tune campaigns and ads.
- Retail: Assess the effects of price adjustments and inventory management techniques to optimize stock levels.
- Healthcare: Examine the key factors that affect patients’ health to help improve prognoses and treatment approaches.
Contribution analysis model in BigQuery ML
How does it operate?
To create a contribution analysis model in BigQuery ML, all you need is a single table containing rows of baseline control data and rows of test data to compare against the control, a metric to analyze (such as revenue), and a list of contributors (such as product_sku and category). The model then identifies significant slices of the data, called segments, each defined by a particular combination of contributor values.
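The shape of that input can be illustrated with a small Python sketch. It assumes a hypothetical boolean column named is_test that marks test rows, uses made-up revenue figures, and simply ranks segments by the absolute change in the metric. It illustrates the comparison only and does not reproduce the statistics BigQuery ML applies.

```python
from collections import defaultdict

# Hypothetical input: one table holding both control and test rows.
# The is_test column name and all values are assumptions for illustration.
table = [
    {"product_sku": "A1", "category": "toys", "revenue": 100.0, "is_test": False},
    {"product_sku": "A1", "category": "toys", "revenue": 160.0, "is_test": True},
    {"product_sku": "B2", "category": "games", "revenue": 80.0, "is_test": False},
    {"product_sku": "B2", "category": "games", "revenue": 70.0, "is_test": True},
]
contributors = ["product_sku", "category"]


def metric_change_by_segment(table, contributors, metric):
    """Sum the metric per segment for control and test, then rank by change."""
    totals = defaultdict(lambda: {"control": 0.0, "test": 0.0})
    for row in table:
        segment = tuple(row[c] for c in contributors)
        side = "test" if row["is_test"] else "control"
        totals[segment][side] += row[metric]
    changes = {seg: t["test"] - t["control"] for seg, t in totals.items()}
    # Largest absolute change first.
    return sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True)


ranked = metric_change_by_segment(table, contributors, "revenue")
for segment, change in ranked:
    print(segment, change)
```

In this toy data, the ("A1", "toys") segment shows the largest change, so it would be the first place to look when explaining the overall revenue shift.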
With contribution analysis, you can examine two kinds of metrics: summable metrics and summable ratio metrics. A summable metric summarizes each data segment by aggregating a single measure of interest, such as revenue. A summable ratio metric examines the ratio between two measures of interest, such as earnings per share.
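The difference between the two metric types can be sketched in a few lines of Python. The rows and column names below are hypothetical. The sketch assumes a summable ratio is computed as one sum divided by another, which is not the same as averaging the per-row ratios.

```python
# Hypothetical rows belonging to one segment; values are illustrative.
rows = [
    {"earnings": 200.0, "shares": 100.0},
    {"earnings": 100.0, "shares": 200.0},
]


def summable_metric(rows, column):
    """Aggregate a single measure by summing it over the segment."""
    return sum(r[column] for r in rows)


def summable_ratio_metric(rows, numerator, denominator):
    """Divide the sum of one measure by the sum of another.

    Returns None when the denominator sums to zero.
    """
    total_denominator = sum(r[denominator] for r in rows)
    if total_denominator == 0:
        return None
    return sum(r[numerator] for r in rows) / total_denominator


total_earnings = summable_metric(rows, "earnings")
earnings_per_share = summable_ratio_metric(rows, "earnings", "shares")

# For contrast: averaging the row-level ratios gives a different number.
mean_of_row_ratios = sum(r["earnings"] / r["shares"] for r in rows) / len(rows)

print(total_earnings, earnings_per_share, mean_of_row_ratios)
```

Keeping the numerator and denominator as separate sums means the ratio for a larger segment can be derived from the sums of its parts, which is what makes the ratio "summable".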
Contribution analysis models also apply pruning optimizations by default, using the Apriori pruning technique to return insights faster. The model narrows the search space based on a minimum support value, which lets it identify important segments quickly. The support value indicates the size of a segment relative to the whole population. Excluding segments with low support lets you focus on the largest segments and also shortens query execution time.
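The following Python sketch shows the general idea behind Apriori-style pruning with a minimum support threshold. It is a textbook-style level-wise search written for illustration, with made-up rows and a made-up threshold, and it measures support as the fraction of rows in a segment. The details of the pruning BigQuery ML performs may differ.

```python
from itertools import combinations

# Hypothetical rows and threshold, chosen only to illustrate pruning.
rows = [
    {"region": "US", "device": "mobile"},
    {"region": "US", "device": "mobile"},
    {"region": "US", "device": "desktop"},
    {"region": "EU", "device": "mobile"},
]
dimensions = ["region", "device"]


def frequent_segments(rows, dimensions, min_support):
    """Return {segment: support} for segments meeting min_support.

    Segments are tuples of (dimension, value) pairs in dimension order.
    A larger segment is only evaluated if all of its smaller
    sub-segments were kept, because adding a constraint can never
    increase support.
    """
    n = len(rows)

    def support(segment):
        matches = sum(all(row[d] == v for d, v in segment) for row in rows)
        return matches / n

    # Level 1: single (dimension, value) pairs.
    singles = sorted(
        {(d, row[d]) for row in rows for d in dimensions},
        key=lambda item: (dimensions.index(item[0]), str(item[1])),
    )
    current = {}
    for item in singles:
        s = support((item,))
        if s >= min_support:
            current[(item,)] = s
    kept = dict(current)
    frequent_items = [segment[0] for segment in current]

    # Higher levels: extend kept segments with one more dimension.
    while current:
        candidates = {}
        for segment in current:
            last_dim = dimensions.index(segment[-1][0])
            for item in frequent_items:
                if dimensions.index(item[0]) <= last_dim:
                    continue
                candidate = segment + (item,)
                subsegments = combinations(candidate, len(segment))
                if all(sub in kept for sub in subsegments):
                    s = support(candidate)
                    if s >= min_support:
                        candidates[candidate] = s
        kept.update(candidates)
        current = candidates
    return kept


result = frequent_segments(rows, dimensions, min_support=0.5)
for segment, s in result.items():
    print(segment, s)
```

With a threshold of 0.5, the EU and desktop segments are dropped at the first level, so no combination involving them is ever evaluated. That is the source of the time savings: low-support segments cut off whole branches of the search.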
BigQuery now offers Contribution Analysis in preview.