Phil Ballai
Enterprise Architect
February 27, 2025

Uncovering Key Insights With Snowflake Cortex AI And Sigma

Organizations today collect vast amounts of data—customer transactions, marketing campaigns, operational metrics, and beyond. Turning that data into meaningful insights has traditionally required advanced analytics and complex machine learning (“ML”) workflows, forcing data teams to extract data from one platform, write code in another, and iterate repeatedly to get trustworthy results. 

This approach is slow and inefficient. By integrating Sigma with Snowflake ML, organizations can eliminate redundant workflows and deliver real-time, AI-powered insights more easily and quickly than ever.

Three common ML use cases

Machine learning is transforming business intelligence by automating pattern recognition, uncovering hidden trends, and enabling predictive analytics. Here are three powerful ways organizations use ML today:

  1. Key Driver Analysis – Pinpointing which attributes most influence a desired outcome.
    1. Financial Services: Analyze which variables (credit history, income, transaction patterns) best predict loan default risk or churn.
    2. Healthcare: Determine which patient attributes (age, comorbidities, care pathways) most contribute to readmission risk or treatment outcomes.
    3. Retail: Identify which factors (promotions, product mix, marketing channels) most drive sales.
  2. Outlier Detection – Dynamically spotting unexpected or anomalous data points in real time, enabling proactive monitoring and alerts.
    1. Healthcare: Flag abnormal patient vitals or irregular billing claims for further investigation.
    2. Retail: Detect unusual transactions or suspicious spikes in returns that may signal fraud.
    3. Manufacturing: Identify production anomalies or equipment failures before they escalate into major downtime.
  3. Clustering – Grouping similar entities (e.g., customers, products) together for segmentation and deeper insights.
    1. Financial Services: Group customers by asset classes or transaction behavior to inform wealth management strategies.
    2. Healthcare: Cluster patient populations by risk profiles or disease characteristics, enabling more targeted interventions.
    3. Retail: Segment customers by spending habits or product affinity to tailor marketing campaigns.

With Sigma and Snowflake ML, these workflows can be executed seamlessly, with no coding required. This unified approach combines scalable data storage, built-in (or user-defined) ML functions, and intuitive dashboards in one ecosystem.

Additionally, Sigma can leverage Snowflake Cortex, which provides access to large language models (LLMs) such as Claude or Llama 2, to summarize results, explain trends, and provide human-friendly interpretations of ML outputs directly in Sigma.

A unified ML workflow in Sigma

The three data science tasks we just mentioned (Key Driver Analysis, Outlier Detection, and Clustering) each follow a similar end-to-end pipeline:

  1. Data Centralization: Keep your source data in the cloud data warehouse.
  2. Data at Scale: Run against large to massive datasets.
  3. ML Computation: Leverage built-in or custom ML functions (regression, anomaly detection, clustering, etc.).
  4. LLM Interpretation in Sigma: Leverage AI using Snowflake Cortex to deliver human-friendly summaries or explanations.
  5. Sigma Visualization: Surface the results in Sigma dashboards, where business stakeholders can explore insights without writing code.
  6. Sigma Data Apps: Create workflows and interactivity that provide far more functionality than a typical dashboard.

In this whitepaper, we walk through these concepts and demonstrate how easily they can be deployed in Sigma, so that both technical and non-technical users can benefit from advanced ML/AI insights—all without leaving Sigma.

To keep things practical and concise, we focus our detailed example on Key Driver Analysis, using a sample dataset of about 900k records (e.g., retail transactions) and Snowflake ML/AI summarization, all inside Sigma.

Once you see this in action, you’ll understand how straightforward it is to adapt the same workflow for other scenarios—be it time series forecasting, contribution analysis, outlier detection, or clustering—all by switching out the specific ML function and adjusting to the new dataset as needed.

Key driver analysis & AI assist in Sigma

Scenario: Understanding holiday sales performance

Imagine you’re a category manager at a large retailer reviewing holiday sales performance. You see a significant year-over-year revenue jump around key holiday weeks, and you want to understand why sales increased—specifically, which product lines, store tiers, or time periods contributed most. 

While Sigma does support exploring data down to the lowest level of granularity at scale, this can be time-consuming and interpreting the data can be a challenge. Instead, you can leverage Sigma’s integration with Snowflake and Cortex to rapidly pinpoint and explain key drivers, in simple language.

Let's take this in steps:

Holiday Sales Data Table

  • Using Sigma, we have access to all sales transactions that occurred from November through January in each of the last two years. There are about 900k rows, so spotting trends manually is challenging.

Visualize the High-Level Trend

  • A chart is a natural first step. We created a bar chart in Sigma that compares weekly holiday sales in the current year (2024) with the prior year (2023). It shows a clear surge in 2024 sales, but not which segments made the biggest impact.
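To make the comparison behind that chart concrete, here is a minimal Python sketch of the weekly year-over-year aggregation. The sample rows and function names are hypothetical; in practice Sigma performs this grouping against the warehouse. Note that this sketch keys on ISO weeks, so January dates fall into the following ISO year; a real holiday-season comparison would likely group on a season label instead.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (transaction_date, revenue).
transactions = [
    (date(2023, 11, 20), 100.0),
    (date(2023, 11, 22), 50.0),
    (date(2024, 11, 18), 180.0),
    (date(2024, 11, 21), 60.0),
    (date(2024, 12, 23), 90.0),
]


def weekly_totals(rows):
    """Sum revenue per (ISO year, ISO week) pair."""
    totals = defaultdict(float)
    for day, revenue in rows:
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += revenue
    return dict(totals)


def year_over_year(totals, current_year, prior_year):
    """Return (week, prior_total, current_total) tuples, sorted by week."""
    weeks = sorted({week for (_, week) in totals})
    return [
        (week,
         totals.get((prior_year, week), 0.0),
         totals.get((current_year, week), 0.0))
        for week in weeks
    ]


comparison = year_over_year(weekly_totals(transactions), 2024, 2023)
for week, prior, current in comparison:
    print(f"week {week}: 2023={prior:,.0f}  2024={current:,.0f}")
```

A table like this shows that sales rose, but not which segments were responsible, which is the gap key driver analysis fills.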

We could keep drilling and exploring, but let's do this in a better way.

Run a Key Driver Analysis in Sigma

The ability to make rich “data app” interactivity in Sigma allows us to give the user the power of ML models with just the click of a button. 

In this example, we will use Snowflake’s Top Insights ML function (“the model”). From Snowflake’s documentation:

“Top Insights is an ML Function for key driver analysis, helping you to identify drivers of a metric’s change over time or explain differences in a metric among various verticals. Top Insights is powered by a decision tree model that separates a dataset into segments that have different behavior in relation to the metric you want to analyze.”

Top Insights is perfect for our use case.

In our example, we use the GET_DRIVERS method to extract key drivers from the dataset we want to analyze.

We go one step further and let the user choose, with a single click, which metric from the dataset to run through the model. In our use case, a segmented control lets the user select Revenue, COGS, or Profit as the metric to analyze.

Users can also select the dimensions for the model to evaluate:

When the user clicks the “Run Key Driver Analysis” button, a Sigma action is triggered to call a Snowflake stored procedure. The procedure receives a few parameters from Sigma based on the user's selections, and uses them to prepare a temporary table that aggregates the user-specified metric by the chosen dimensions, in a format suitable for Snowflake’s Top Insights ML function to analyze.
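The aggregation step described above can be sketched as follows. This is an illustration only, not the stored procedure used in the demo: the table names (HOLIDAY_SALES, KEY_DRIVER_INPUT), the column names (SALES_YEAR and the dimension names), and the helper function are all assumptions. To our understanding, Top Insights expects a metric column plus a boolean label column that separates the test period from the control period; check the Snowflake documentation for the exact input requirements.

```python
import re

# Hypothetical allow-lists; the source does not name the real columns.
ALLOWED_METRICS = {"REVENUE", "COGS", "PROFIT"}
ALLOWED_DIMENSIONS = {"PRODUCT_LINE", "STORE_TIER", "REGION", "SALES_WEEK"}
_IDENTIFIER = re.compile(r"^[A-Z_][A-Z0-9_]*$")


def build_driver_input_sql(metric, dimensions,
                           source_table="HOLIDAY_SALES",
                           target_table="KEY_DRIVER_INPUT",
                           test_year=2024):
    """Return SQL that aggregates one metric by the chosen dimensions.

    Each output row carries a boolean LABEL column: TRUE for the test
    period (test_year) and FALSE for the control period.
    """
    metric = metric.upper()
    dims = [d.upper() for d in dimensions]
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    if not dims:
        raise ValueError("select at least one dimension")
    unknown = [d for d in dims if d not in ALLOWED_DIMENSIONS]
    if unknown:
        raise ValueError(f"unsupported dimensions: {unknown}")
    for table in (source_table, target_table):
        if not _IDENTIFIER.match(table):
            raise ValueError(f"invalid table name: {table}")

    dim_list = ", ".join(dims)
    label_expr = f"(SALES_YEAR = {int(test_year)})"
    return (
        f"CREATE OR REPLACE TEMPORARY TABLE {target_table} AS\n"
        f"SELECT {dim_list},\n"
        f"       SUM({metric}) AS METRIC,\n"
        f"       {label_expr} AS LABEL\n"
        f"FROM {source_table}\n"
        f"GROUP BY {dim_list}, {label_expr}"
    )


sql = build_driver_input_sql("revenue", ["product_line", "store_tier"])
print(sql)
```

Validating the metric and dimensions against allow-lists matters here because the values arrive from user-facing controls and end up inside a SQL statement.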

The next action step fires after the stored procedure completes and refreshes another table called “Key Driver Output”, which holds the result of the call to the SNOWFLAKE.ML.TOP_INSIGHTS model.

As you can see, understanding what this output means is yet another challenge:

Essentially, the table represents the high-growth segments (e.g., a specific region + product combination) that contributed most to the revenue jump.

Sigma and Snowflake surface these results quickly and efficiently, but interpreting them presents yet another data challenge. Enter AI.

AI-assisted explanation with Cortex in Sigma

As we saw, the resulting table can be overwhelming for non-technical users. To address this, Sigma sends the output to Snowflake Cortex with a single button click.

When the user clicks the “Explain Key Drivers” button, Sigma makes a call to Cortex’s LLM (e.g., Claude or Jamba), which produces a concise narrative explaining why certain segments grew, highlighting patterns like regional performance, store format differences, and specific weeks that saw spikes.

We pre-designed the LLM prompt in Sigma to be configurable on the fly, so we can adjust it after seeing the response. In the first run, our prompt is written to represent a request from a data analyst:

Based on that, the user is presented with a clean summary (from the perspective of a data analyst) that they can review before digging further into the data. It is like having a map that shows the way while leaving you free to choose your own detours as you explore. How cool is that!

Customizable prompts for business context

Because Sigma integrates tightly with Snowflake, you can easily tweak the LLM prompt (e.g., change personas, add a more casual tone or include emojis) to tailor the explanation for your audience.
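As a rough illustration of what a configurable prompt might look like, the sketch below assembles a prompt string from key driver rows, a persona, and a tone. The function name, the row fields, and the wording of the prompt are our own assumptions, not the prompt used in the demo; the resulting string would then be passed to the Cortex LLM call.

```python
def build_explanation_prompt(driver_rows, persona="data analyst",
                             tone="concise and factual", max_rows=10):
    """Assemble a prompt asking an LLM to explain key driver output."""
    # Put the segments with the largest absolute surprise first.
    ranked = sorted(driver_rows, key=lambda r: abs(r["surprise"]),
                    reverse=True)[:max_rows]
    bullets = [
        f"- {r['contributor']}: control={r['metric_control']:,.0f}, "
        f"test={r['metric_test']:,.0f}, surprise={r['surprise']:+,.0f}"
        for r in ranked
    ]
    header = (
        f"You are writing for a {persona}. Use a {tone} tone.\n"
        "Explain which segments drove the change in the metric between the "
        "control period and the test period, using only the rows below."
    )
    return header + "\n" + "\n".join(bullets)


# Hypothetical rows shaped like a simplified key driver output table.
sample_rows = [
    {"contributor": "Region = West, Product Line = Toys",
     "metric_control": 100_000.0, "metric_test": 150_000.0,
     "surprise": 13_750.0},
    {"contributor": "Product Line = Apparel",
     "metric_control": 100_000.0, "metric_test": 105_000.0,
     "surprise": -31_250.0},
]

prompt = build_explanation_prompt(sample_rows, persona="CFO",
                                  tone="high-level")
print(prompt)
```

Keeping the persona and tone as parameters means the same output table can be explained differently for each audience without touching the data.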

When the user clicks the “Customize Prompt” button, they are presented with a modal that allows them to adjust the LLM prompt as desired. Let's be the CFO this time:

Now we get a higher-level summary that is more appropriate for a CFO:

Notifications and alerts

Users often prefer to be informed only when something meaningful falls out of range. This applies to many scenarios; outliers are a good example.

We have already seen how Sigma can leverage ML/AI in a data app, so getting that data to drive a notification is just one more step.

For example, if we run the key driver analysis against just Product Line and sort the table, we can see a row with a large negative surprise value.

A negative surprise means “this drop was more significant than the model’s baseline expectation.” It can be a strong indicator that this segment warrants a closer look, but it doesn’t necessarily reflect an absolute loss—just a notably weaker performance compared to the control period.
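A small worked example may help here. The sketch below uses a simplified reading of surprise: the segment's test-period value minus what it would have been had the segment grown at the overall rate. This is an assumption made for illustration and is not necessarily the exact computation Top Insights performs; the segment names and numbers are made up.

```python
# Hypothetical totals: name -> (control_period_metric, test_period_metric).
segments = {
    "Toys": (100_000.0, 150_000.0),
    "Electronics": (200_000.0, 290_000.0),
    "Apparel": (100_000.0, 105_000.0),
}


def surprise_by_segment(data):
    """Compare each segment's test value with a growth-adjusted expectation."""
    total_control = sum(control for control, _ in data.values())
    total_test = sum(test for _, test in data.values())
    if total_control == 0:
        raise ValueError("control period total must be non-zero")
    overall_growth = total_test / total_control
    result = {}
    for name, (control, test) in data.items():
        expected = control * overall_growth  # value if it grew at the overall rate
        result[name] = test - expected
    return result


surprises = surprise_by_segment(segments)
for name, value in sorted(surprises.items(), key=lambda item: item[1]):
    print(f"{name}: surprise = {value:+,.0f}")
```

With these numbers, Apparel grows from 100,000 to 105,000 in absolute terms, yet its surprise is about -31,250, because the business as a whole grew by 36.25% and Apparel lagged well behind that.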

Regardless, we want to be notified when something like this happens. Creating an alert or notification in Sigma is simple yet powerful. For example, we can be notified (by email in this case) whenever a value in the Surprise column of the Key Driver Output table falls below -$10,000.
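The condition behind such an alert is simple to express. In Sigma it is configured through the interface rather than written as code, so the Python below is only a sketch of the logic, with hypothetical rows and function names.

```python
def rows_needing_alert(rows, column="surprise", threshold=-10_000.0):
    """Return rows whose value in `column` falls below the threshold."""
    return [row for row in rows if row[column] < threshold]


def format_alert(row, column="surprise"):
    """Build a one-line message suitable for an email notification."""
    return (f"Key driver alert: {row['contributor']} "
            f"has {column} {row[column]:+,.0f}")


# Hypothetical rows shaped like a simplified key driver output table.
key_driver_output = [
    {"contributor": "Product Line = Apparel", "surprise": -31_250.0},
    {"contributor": "Product Line = Toys", "surprise": 13_750.0},
    {"contributor": "Product Line = Home", "surprise": -4_000.0},
]

alerts = [format_alert(row) for row in rows_needing_alert(key_driver_output)]
for message in alerts:
    print(message)
```

Only the Apparel row triggers a message here; the Home row is negative but stays above the threshold.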

Being able to configure conditional alerts against any (or multiple) columns is powerful and helps drive productivity to the right places as soon as possible.

Key takeaways: AI-powered insights, no-code execution 

By combining Snowflake’s ML functions, Cortex’s language capabilities, and Sigma’s user-friendly interface, organizations can move beyond static dashboards and genuinely understand the forces behind their business results. Here are just a few things that set Sigma apart, based on this scenario:

No lengthy drill-downs: Instead of manually sifting through dozens of dimensions, Sigma’s Key Driver Analysis provides a data-driven approach to pinpointing what changed—and why—with just a few clicks.

Plain-English insights: With Snowflake Cortex, raw statistical output is transformed into easy-to-understand narratives tailored for executives or frontline managers alike.

Full transparency and control: Data remains within Snowflake’s secure, scalable infrastructure. Sigma’s low-code environment ensures teams at all skill levels can readily access and interpret insights.

Next-level planning: Armed with clear explanations of which segments drove a jump in holiday sales, you can make well-informed decisions—like reallocating resources or adjusting marketing strategies—supported by solid data.

Sigma makes machine learning and AI work for everyone. Whether you’re identifying sales drivers, spotting anomalies, or refining segmentation strategies, Sigma turns insights into action—without the complexity of traditional data science workflows.
