Understanding Variance and Standard Deviation On Snowflake – Blog 1

Mathematician Carl Friedrich Gauss Solves Astronomers’ Problem
Variance and standard deviation weren’t invented for business.
They were invented because astronomers kept getting different answers.
In the early 1800s, scientists were trying to calculate the exact orbits of planets and comets. They would measure the same star night after night… and the numbers never matched perfectly.
Was the telescope broken?
Was the math wrong?
Or was variation just part of reality?
A young mathematician named Carl Friedrich Gauss realized something profound:
The errors themselves had a pattern.
Instead of trying to eliminate variation, Gauss measured it. He showed that most observations cluster near the average, and fewer appear as you move farther away. From that idea came variance, standard deviation, and what we now call the normal distribution.
The insight was revolutionary:
Variation is not noise to ignore — it is information to understand.
Two hundred years later, that same idea helps businesses decide:
- Which products are safe to forecast
- Which trends are reliable
- Which numbers look calm but hide danger
How Consistent Is My Data? Understanding Variance and Standard Deviation.
Let’s start with a simple question:
“If I look at this data tomorrow, will it behave the same way it did today?”
That question matters everywhere.
In business.
In sports.
In school.
In life.
If something is consistent, you can plan around it.
If it’s wild, you’d better be careful.
That’s what variance and standard deviation help us understand.
They don’t predict the future.
They don’t explain why things happen.
They simply answer a very human question:
“How risky is this?”
Meet Our Data: Three Personalities
In this blog, we’ll use one simple dataset:
Daily sales over two full years for three types of products.
Each has 730 days, so we’re comparing apples to apples.
- STABLE
Calm. Predictable. Boring in a good way. - MODERATE
Moves around a bit. Some good days, some bad days. - VOLATILE
Roller coaster. Big wins. Big losses. Drama.
All examples use the same table: SALES_BEHAVIOR_LARGE
Step 1: Look at the Data Before Measuring It
Before you use any statistics, please always look.
Counts, minimums, maximums, and averages give us context. They don’t tell the whole story — but they tell us where to start.
As Dwight D. Eisenhower once said:
“In preparing for battle I have always found that plans are useless, but planning is indispensable.”
Numbers don’t give answers by themselves.
They give clues.
Query:
SELECT
SALES_CLASS
,COUNT(*) AS DAY_COUNT
,MIN(DAILY_SALES) AS MIN_SALES
,MAX(DAILY_SALES) AS MAX_SALES
,AVG(DAILY_SALES) AS AVG_SALES
FROM SALES_BEHAVIOR_LARGE
GROUP BY SALES_CLASS
ORDER BY SALES_CLASS;SALES_CLASS DAY_COUNT MIN_SALES MAX_SALES AVG_SALES
MODERATE 730 70.20 125.35 97.50445205
STABLE 730 99.60 100.40 100.00000000
VOLATILE 730 -145.76 406.24 80.05095890What This Tells Us
At first glance, the averages look reasonable. But the ranges tell a different story.
- STABLE barely moves at all
- MODERATE moves within a sensible band
- VOLATILE swings wildly — with MIN_SALES even going negative
Here’s the first big lesson:
Averages are polite. They hide trouble.
Two products can have similar averages and completely different risk.
Step 2: Variance — How Spread Out Is This?
Variance answers one question:
“How far apart are the numbers?”
Think of variance like this:
- A tidy desk → low variance
- Clothes on the chair → moderate variance
- A tornado went through the house → high variance
QUERY:
SELECT
SALES_CLASS
,VAR_POP(DAILY_SALES) AS VARIANCE_POP
FROM SALES_BEHAVIOR_LARGE
GROUP BY SALES_CLASS
ORDER BY VARIANCE_POP;
RESULTS:
SALES_CLASS VARIANCE_POP
STABLE 0.0800000000
MODERATE 148.3488329189
VOLATILE 4511.1996388065What This Tells Us
Variance grows fast as behavior becomes less predictable.
- STABLE stays tightly grouped
- MODERATE spreads out
- VOLATILE explodes outward
Variance doesn’t care about direction.
It cares about distance from the average.
Step 3: Standard Deviation — The Number Humans Understand
Variance is useful, but it’s squared.
Humans don’t think in squares.
Standard deviation fixes that by putting volatility back into real-world units.
QUERY:
SELECT
SALES_CLASS
,STDDEV_POP(DAILY_SALES) AS "STDDEV_POP"
FROM SALES_BEHAVIOR_LARGE
GROUP BY SALES_CLASS
ORDER BY "STDDEV_POP";RESULTS:
SALES_CLASS STDDEV_POP
STABLE 0.28
MODERATE 12.18
VOLATILE 67.17What This Tells Us
Now the story becomes clear:
- STABLE products usually move less than 30 cents per day
- MODERATE products move about $12 per day
- VOLATILE products move about $67 per day
That’s the difference between a savings account… and a casino.
Step 4: Population vs Sample — Why Two Versions Exist
Sometimes you have all the data.
Sometimes you only have a sample.
Sample statistics assume more uncertainty.
QUERY:
SELECT
SALES_CLASS AS CLASS
,VAR_POP(DAILY_SALES) AS "VAR_POP"
,VAR_SAMP(DAILY_SALES) AS "VAR_SAMP"
,STDDEV_POP(DAILY_SALES) AS "STDDEV_POP"
,STDDEV_SAMP(DAILY_SALES) AS "STDDEV_SAMP"
FROM SALES_BEHAVIOR_LARGE
GROUP BY SALES_CLASS
ORDER BY SALES_CLASS;RESULTS:
CLASS VAR_POP VAR_SAMP STDDEV_POP STDDEV_SAMP
MODERATE 148.3488329189 148.5523292604 12.18 12.19
STABLE 0.0800000000 0.0801097394 0.28 0.28
VOLATILE 4511.1996388065 4517.3878413289 67.17 67.21The ordering doesn’t change.
The story doesn’t change.
That’s good.
It means the math agrees with common sense.
Step 5: How Analysts Actually Look at Data
In the real world, analysts don’t look at one metric at a time. They look at several together.
Query:
SELECT
SALES_CLASS AS CLASS
,COUNT(*) AS COUNT
,AVG(DAILY_SALES) AS AVG
,MIN(DAILY_SALES) AS MIN
,MAX(DAILY_SALES) AS MAX
,VAR_POP(DAILY_SALES) AS "VAR_POP"
,STDDEV_POP(DAILY_SALES) AS "STDDEV_POP"
FROM SALES_BEHAVIOR_LARGE
GROUP BY SALES_CLASS
ORDER BY "STDDEV_POP";RESULTS:
CLASS COUNT AVG MIN MAX VAR_POP STDDEV_POP
STABLE 730 100.00000000 99.60 100.40 0.0800000000 0.28
MODERATE 730 97.50445205 70.20 125.35 148.3488329189 12.18
VOLATILE 730 80.05095890 -145.76 406.24 4511.1996388065 67.17
The Takeaway
Now let’s slow down and really look at that last table, because this is where most people get fooled.
At first glance, the averages don’t look that different. STABLE averages 100. MODERATE averages 97.5. If you stopped there, you might reasonably conclude that these two products behave almost the same. And that is exactly how bad decisions get made.
Look again.
STABLE has a standard deviation of 0.28. That means most days, sales barely move at all. Tomorrow looks a lot like today. You can plan around this. You can forecast this. You can sleep at night.
MODERATE, on the other hand, has a standard deviation of 12.18. That means a typical day can swing twelve dollars up or down from the average. Some days are good. Some days are not. The average hides that daily uncertainty.
VOLATILE makes the point impossible to ignore. Its average is only 80.05 — lower than both STABLE and MODERATE —, but its standard deviation is 67.17. That number tells you everything the average refuses to admit. This data is unpredictable. Extreme days dominate the story. Forecasts are fragile.
This is the real lesson:
The average tells you where the center is.
Standard deviation tells you how much danger surrounds it.
Carl Friedrich Gauss figured this out over 200 years ago while trying to understand why astronomers kept getting different answers when measuring the same stars. He realized the variation wasn’t the problem — misunderstanding it was. By measuring spread rather than ignoring it, he gave us a way to see clearly rather than guess.
Gauss would probably smile knowing that today, instead of “seeing stars,” you’re using his insight to spot risk, avoid surprises, and make smarter decisions with your data.
And that is exactly why variance and standard deviation still matter.
Nexus applies these same statistical ideas directly to result sets, allowing analysts to examine consistency without re-querying the database.
Download a free trial at www.CoffingDW.com.


