Google BigQuery ML Contribution Analysis for Metric Insights

BigQuery ML contribution analysis, which automates insight generation, is now generally available.

Google Cloud has announced that BigQuery ML contribution analysis is now generally available (GA). First introduced in preview in September 2024, the feature helps customers understand why their metrics changed by identifying the key drivers of change in large-scale multidimensional data. BigQuery ML software engineers Jenny Ortiz and Katelin Amann made the announcement.

Extracting insights from large volumes of multidimensional data, such as sales data across products, stores, locations, and customers alongside other events, has historically required manual inspection and trial-and-error querying and visualization. As the number of dimensions grows, so does the number of combinations to investigate, and the approach quickly becomes unmanageable. BigQuery ML contribution analysis automates this process, helping users quickly identify the areas that require action.

To help surface the most important insights faster, the GA release adds several new features:

Automated support tuning with top-k insights by apriori support: In addition to the existing option of setting the min_apriori_support threshold manually, users can now have the model set it automatically by specifying how many insights they want returned. Apriori support is the mechanism by which the model selects the most relevant insights based on the size of the data segments. Both approaches reduce query latency compared with returning every possible insight, which can number in the millions.
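The idea behind both options can be sketched in Python. This is an illustration only, not BigQuery ML's implementation: it assumes a segment's support is its share of the overall metric total, taking the larger of its test and control shares, and the exact computation in BigQuery ML may differ. The function names and sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def segment_supports(rows, dimensions):
    """Return {segment: support} for every combination of dimension values.

    Assumption for this sketch: support is the segment's share of the total
    metric, using the larger of its test and control shares.
    """
    totals = {True: 0.0, False: 0.0}
    sums = defaultdict(lambda: {True: 0.0, False: 0.0})
    for row in rows:
        totals[row["is_test"]] += row["metric"]
        # Each row contributes to every segment it belongs to.
        for size in range(1, len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                segment = tuple((d, row[d]) for d in dims)
                sums[segment][row["is_test"]] += row["metric"]
    supports = {}
    for segment, s in sums.items():
        shares = [s[flag] / totals[flag] for flag in (True, False) if totals[flag]]
        supports[segment] = max(shares)
    return supports


def filter_by_min_support(supports, min_support):
    """Manual option: keep segments at or above a fixed threshold."""
    return {seg: s for seg, s in supports.items() if s >= min_support}


def top_k_by_support(supports, k):
    """Automatic option: keep the k segments with the highest support."""
    ranked = sorted(supports.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical sample data: test and control metric totals are both 100.
rows = [
    {"city": "Iowa City", "category": "A", "metric": 60.0, "is_test": True},
    {"city": "Iowa City", "category": "B", "metric": 30.0, "is_test": True},
    {"city": "Ames", "category": "A", "metric": 10.0, "is_test": True},
    {"city": "Iowa City", "category": "A", "metric": 50.0, "is_test": False},
    {"city": "Ames", "category": "B", "metric": 50.0, "is_test": False},
]

supports = segment_supports(rows, ["city", "category"])
above_threshold = filter_by_min_support(supports, 0.5)
top_two = top_k_by_support(supports, 2)
```

Either way, small segments are dropped before insights are returned, which is where the latency savings come from.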

Improved insight readability with redundant insight pruning: The new pruning_method option lets users eliminate redundant insights. Redundancy can arise when several related insights, especially over correlated dimensions, have the same output metrics. For instance, if all of a store's sales took place in a single city, the segments [city='Iowa City', store_name='General Store / Iowa City'] and [store_name='General Store / Iowa City'] may have the same metrics. With pruning, only the unique insight with the most descriptive segment is returned.

contributors                                                  metric_test   metric_control
[city='Iowa City', store_name='General Store / Iowa City']    640047.08     214317.46
[store_name='General Store / Iowa City']                      640047.08     214317.46
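The pruning behavior can be approximated in a few lines of Python. This is an illustrative sketch, not BigQuery ML's algorithm: it assumes two insights are redundant when their metrics are identical and one segment's conditions are a strict subset of the other's, and it keeps the more descriptive one. The function name `prune_redundant` and the third sample row are hypothetical.

```python
def prune_redundant(insights):
    """Drop insights whose segment is a strict subset of another segment
    that has identical test and control metrics."""
    kept = []
    for insight in insights:
        conditions = set(insight["contributors"].items())
        metrics = (insight["metric_test"], insight["metric_control"])
        redundant = any(
            conditions < set(other["contributors"].items())
            and metrics == (other["metric_test"], other["metric_control"])
            for other in insights
        )
        if not redundant:
            kept.append(insight)
    return kept


insights = [
    {
        "contributors": {"city": "Iowa City", "store_name": "General Store / Iowa City"},
        "metric_test": 640047.08,
        "metric_control": 214317.46,
    },
    {
        "contributors": {"store_name": "General Store / Iowa City"},
        "metric_test": 640047.08,
        "metric_control": 214317.46,
    },
    {
        # Hypothetical extra row with different metrics, so it is not pruned.
        "contributors": {"city": "Iowa City"},
        "metric_test": 900000.00,
        "metric_control": 500000.00,
    },
]

pruned = prune_redundant(insights)
```

With the two rows from the table, only the segment that names both the city and the store survives.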

Expanded metric support with the summable by category metric: The preview initially offered two metric types: summable (aggregating a single measure) and summable ratio (aggregating the ratio of two measures). The GA release adds the summable by category metric, which analyzes the sum of a metric normalized by the number of distinct values of a categorical variable, such as sales per customer or site visits per day. The new metric helps adjust for outliers when comparing groups with different numbers of rows, such as revenue per month across years with varying data availability.
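To make the three metric types concrete, here is a small Python sketch that computes each one over a handful of rows. The column names (`sales`, `visits`, `user_id`) and the values are hypothetical, and the functions only mirror the arithmetic described above, not BigQuery ML internals.

```python
def summable(rows, measure):
    """Summable: the sum of a single measure."""
    return sum(row[measure] for row in rows)


def summable_ratio(rows, numerator, denominator):
    """Summable ratio: the sum of one measure divided by the sum of another."""
    return summable(rows, numerator) / summable(rows, denominator)


def summable_by_category(rows, measure, category):
    """Summable by category: the sum of a measure divided by the number of
    distinct values of a categorical column."""
    distinct_values = {row[category] for row in rows}
    return summable(rows, measure) / len(distinct_values)


# Hypothetical sample rows: two users, three purchases.
rows = [
    {"user_id": "u1", "sales": 40.0, "visits": 2},
    {"user_id": "u1", "sales": 20.0, "visits": 1},
    {"user_id": "u2", "sales": 60.0, "visits": 3},
]

total_sales = summable(rows, "sales")
sales_per_visit = summable_ratio(rows, "sales", "visits")
sales_per_user = summable_by_category(rows, "sales", "user_id")
```

Dividing by the count of distinct users rather than the count of rows is what keeps the result comparable between groups with different numbers of rows.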

Contribution analysis in action

A worked example shows how the feature can explain a decline in apparel product sales per user on a public e-commerce dataset between 2020 and 2021. Users first create an input table, then define a model with MODEL_TYPE='CONTRIBUTION_ANALYSIS', specifying the dimension columns (DIMENSION_ID_COLS), the test column (IS_TEST_COL), and the new summable by category metric CONTRIBUTION_METRIC='SUM(sales)/COUNT(DISTINCT user_id)', along with TOP_K_INSIGHTS_BY_APRIORI_SUPPORT=15 and PRUNING_METHOD='PRUNE_REDUNDANT_INSIGHTS'. They then query the model for insights. The output, which is automatically sorted by contribution, surfaces specific findings such as a decline in US sales per user associated with referral traffic. According to Google, such findings are valuable for guiding business strategy.
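The model definition described above can be sketched as follows. The option names and values come from the example; the model name, table name, test column name, and dimension columns are hypothetical placeholders. The statements are only assembled as strings here, not run against BigQuery, and `ML.GET_INSIGHTS` is assumed to be the function used to retrieve insights, so check the documentation before relying on it.

```python
def build_create_model_sql(model, source_table, dimension_cols, is_test_col,
                           metric, top_k, pruning_method):
    """Assemble a CREATE MODEL statement for contribution analysis."""
    dims = ", ".join(f"'{col}'" for col in dimension_cols)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS (\n"
        "  MODEL_TYPE = 'CONTRIBUTION_ANALYSIS',\n"
        f"  CONTRIBUTION_METRIC = '{metric}',\n"
        f"  DIMENSION_ID_COLS = [{dims}],\n"
        f"  IS_TEST_COL = '{is_test_col}',\n"
        f"  TOP_K_INSIGHTS_BY_APRIORI_SUPPORT = {top_k},\n"
        f"  PRUNING_METHOD = '{pruning_method}'\n"
        ") AS\n"
        f"SELECT * FROM `{source_table}`;"
    )


def build_get_insights_sql(model):
    """Assemble the query that reads insights back from the model
    (ML.GET_INSIGHTS is an assumption in this sketch)."""
    return f"SELECT * FROM ML.GET_INSIGHTS(MODEL `{model}`);"


# Hypothetical names: replace with your own dataset, table, and columns.
create_sql = build_create_model_sql(
    model="my_dataset.sales_per_user_model",
    source_table="my_dataset.apparel_sales_input",
    dimension_cols=["country", "traffic_source"],
    is_test_col="is_test",
    metric="SUM(sales)/COUNT(DISTINCT user_id)",
    top_k=15,
    pruning_method="PRUNE_REDUNDANT_INSIGHTS",
)
insights_sql = build_get_insights_sql("my_dataset.sales_per_user_model")

print(create_sql)
print(insights_sql)
```

In this sketch the input table is expected to hold one row per sale, with the test column marking 2021 rows as the test set and 2020 rows as the control set.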

Users are invited to review the documentation and tutorial to get started with contribution analysis.

Google Cloud gives new users $300 in free credit to try its data analytics services, which include free monthly BigQuery usage.

Drakshi
Since June 2023, Drakshi has been writing articles on artificial intelligence for govindhtech. She holds a postgraduate degree in business administration and is an artificial intelligence enthusiast.