When working with data, we often want to compare more than two groups.
If you only had two groups, you could use a t-test. But what if you have three or more groups? Running a separate t-test for every pair of groups isn’t a good idea, because each extra test adds another chance of a false positive. This is where One-Way ANOVA comes in. “ANOVA” stands for Analysis of Variance, because the test works by comparing variation between groups to variation within groups.
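To see why repeated t-tests are risky, here is a rough sketch of how the chance of at least one false positive grows with the number of pairwise tests. It assumes the tests are independent, which is a simplification, and the helper name `familywise_error` is made up for this illustration.

```python
from math import comb


def familywise_error(n_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across all pairwise t-tests.

    Simplifying assumption: the individual tests are independent.
    """
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests


for n_groups in (3, 4, 5):
    rate = familywise_error(n_groups)
    print(f"{n_groups} groups -> {comb(n_groups, 2)} tests -> error rate {rate:.3f}")
```

With four groups there are six pairwise tests, so under this assumption the overall false-positive rate is roughly 26% rather than the nominal 5%.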
What is One-Way ANOVA?
One-Way ANOVA is a statistical test used to compare the means of three or more independent groups to see if at least one group’s mean is significantly different from the others.
“One-Way” means there’s only one factor (independent variable) that defines the groups.
Example: "Quarter" when comparing quarterly sales.
How Does it Work?
Between-Group Variation → Measures how much the group means differ from the overall mean.
Within-Group Variation → Measures how much individual data points vary within each group.
If the differences between groups are large compared to the differences within groups, the ratio of the two (the F-statistic) is large, and at least one group mean is likely different.
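The comparison above can be sketched by hand. The numbers below are made-up values for three small groups, chosen only for illustration. The sketch computes the between-group and within-group variation, forms their ratio (the F-statistic), and checks the result against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Hypothetical observations for three groups (illustrative values only).
groups = [
    np.array([20.0, 22.0, 19.0, 21.0]),
    np.array([28.0, 30.0, 27.0, 29.0]),
    np.array([35.0, 33.0, 36.0, 34.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)             # number of groups
n_total = all_values.size   # total number of observations

# Between-group variation: distance of each group mean from the overall mean,
# weighted by the group's size.
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group variation: distance of each observation from its own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Divide each sum of squares by its degrees of freedom to get a mean square.
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)

f_manual = ms_between / ms_within
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)  # upper-tail probability

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"manual: F = {f_manual:.3f}, p = {p_manual:.3g}")
print(f"scipy:  F = {f_scipy:.3f}, p = {p_scipy:.3g}")
```

The two lines should agree, which confirms that the F-statistic is the between-group mean square divided by the within-group mean square.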
Hypotheses in One-Way ANOVA
Null Hypothesis (H₀): All group means are equal.
Example: Average sales are the same in all four quarters.
Alternative Hypothesis (H₁): At least one group mean is different.
Example: At least one quarter’s sales are different from the others.
Let’s walk through a hands-on example using Python to illustrate how One-Way ANOVA works in practice.
We’ll simulate two scenarios:
Scenario 1: Significant change across quarters
We’ll use numpy.random.normal() to generate synthetic data. This function draws normally distributed numbers based on three parameters: loc (the mean), scale (the standard deviation), and size (how many values to draw).
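A minimal sketch of this scenario might look like the following. The quarterly means, standard deviation, sample size, and seed are assumptions chosen for illustration, not values taken from real sales data.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # fixed seed so the run is repeatable

# Assumed parameters: means far apart, small spread, 30 observations per quarter.
quarter_means = {"Q1": 100, "Q2": 150, "Q3": 200, "Q4": 250}
std_dev = 10
n_per_quarter = 30

sales_sig = {
    quarter: np.random.normal(loc=mean, scale=std_dev, size=n_per_quarter)
    for quarter, mean in quarter_means.items()
}

# One-Way ANOVA across the four quarters.
f_stat, p_value = stats.f_oneway(*sales_sig.values())
print(f"F-statistic: {f_stat:.2f}, p-value: {p_value:.3g}")
```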
What’s Happening Here?
Expected Outcome
Because the group means are far apart and the data is tightly clustered, the between-group variance dominates. This leads to a large F-statistic and a very small p-value.
This means we reject the null hypothesis and conclude that at least one quarter’s sales are significantly different.
Scenario 2: No significant change across quarters
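A sketch of this scenario, again with assumed values: every quarter is drawn from the same distribution, so any differences between the sample means come from random variation alone.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # fixed seed so the run is repeatable

# Assumed parameters: the same mean and spread for every quarter.
common_mean = 200
std_dev = 20
n_per_quarter = 30

sales_flat = {
    quarter: np.random.normal(loc=common_mean, scale=std_dev, size=n_per_quarter)
    for quarter in ("Q1", "Q2", "Q3", "Q4")
}

# One-Way ANOVA across the four quarters.
f_stat, p_value = stats.f_oneway(*sales_flat.values())
print(f"F-statistic: {f_stat:.2f}, p-value: {p_value:.3g}")
```

Because the null hypothesis is true by construction here, roughly 1 seed in 20 will still produce a p-value below 0.05 purely by chance. That is what a 5% significance level means.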
What’s Different This Time?
Expected Outcome
In this case, the ANOVA test will likely return a small F-statistic and a p-value above 0.05.
This means we fail to reject the null hypothesis, concluding that there’s no statistically significant difference in sales across quarters.
This scenario shows how One-Way ANOVA helps avoid false conclusions. Even if the numbers look slightly different, ANOVA tells us whether those differences are statistically meaningful or just random variation.
One-Way ANOVA is powerful—but it only tells you that a difference exists among the group means. It doesn’t tell you where that difference lies.
Let’s say your ANOVA test returns a significant result (p-value < 0.05). That means at least one group is different—but which one? Is Q2 higher than Q1? Is Q4 different from Q3? To answer that, you need a post-hoc test.
What is a Post-Hoc Test?
Post-hoc means “after the fact.” These tests are run after ANOVA to identify which specific groups differ from each other. The most commonly used post-hoc test is Tukey’s HSD (Honestly Significant Difference). It compares all possible pairs of group means and tells you which pairs differ significantly.
In our example, Tukey's HSD would compare every pair of quarters: Q1 vs Q2, Q1 vs Q3, Q1 vs Q4, Q2 vs Q3, Q2 vs Q4, and Q3 vs Q4.
For each pair, it gives you a p-value and confidence interval, helping you pinpoint exactly where the change occurred.
Running Tukey’s HSD in Python
Let’s run Tukey's HSD for both scenarios.
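The sketch below shows one way to do this with `pairwise_tukeyhsd()` from statsmodels. The simulated means, spreads, and sample sizes are assumptions for illustration, and `simulate` is a hypothetical helper that returns the data in the long format the function expects: one array of values and one array of matching group labels.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

np.random.seed(42)  # fixed seed so the run is repeatable

quarters = ["Q1", "Q2", "Q3", "Q4"]
n_per_quarter = 30


def simulate(means, std_dev):
    """Return (sales, labels): one sales value and one quarter label per row."""
    sales = np.concatenate(
        [np.random.normal(loc=m, scale=std_dev, size=n_per_quarter) for m in means]
    )
    labels = np.repeat(quarters, n_per_quarter)
    return sales, labels


# Assumed values: well-separated means versus identical means.
sales_sig, labels_sig = simulate(means=[100, 150, 200, 250], std_dev=10)
sales_flat, labels_flat = simulate(means=[200, 200, 200, 200], std_dev=20)

# endog = the measured values, groups = the label for each value,
# alpha = the family-wise significance level.
tukey_sig = pairwise_tukeyhsd(endog=sales_sig, groups=labels_sig, alpha=0.05)
tukey_flat = pairwise_tukeyhsd(endog=sales_flat, groups=labels_flat, alpha=0.05)

print("Significant change scenario")
print(tukey_sig)
print("No significant change scenario")
print(tukey_flat)
```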
pairwise_tukeyhsd() Parameters Explained
The output will show:
Tukey HSD for Significant Change Scenario
Each row in the output compares two quarters and tells you whether their average sales are statistically different.
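If you want to work with those rows in code rather than read the printed table, something like the following should work. It is a sketch that relies on the `groupsunique`, `meandiffs`, `pvalues`, `confint`, and `reject` attributes of the statsmodels result object, so check it against the version you have installed. The simulated data uses assumed values.

```python
from itertools import combinations

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

np.random.seed(42)  # fixed seed so the run is repeatable

quarters = ["Q1", "Q2", "Q3", "Q4"]
assumed_means = [100, 150, 200, 250]  # illustrative values only
sales = np.concatenate(
    [np.random.normal(loc=m, scale=10, size=30) for m in assumed_means]
)
labels = np.repeat(quarters, 30)

result = pairwise_tukeyhsd(endog=sales, groups=labels, alpha=0.05)

# Pairs are reported in the order (Q1, Q2), (Q1, Q3), ..., (Q3, Q4).
pairs = list(combinations(result.groupsunique, 2))

rows = []
for (g1, g2), diff, p_adj, (lower, upper), reject in zip(
    pairs, result.meandiffs, result.pvalues, result.confint, result.reject
):
    rows.append(
        {
            "group1": str(g1),
            "group2": str(g2),
            "meandiff": float(diff),   # difference between the two group means
            "p_adj": float(p_adj),     # p-value adjusted for multiple comparisons
            "lower": float(lower),     # lower bound of the confidence interval
            "upper": float(upper),     # upper bound of the confidence interval
            "reject": bool(reject),    # True if the pair differs significantly
        }
    )

for row in rows:
    print(row)
```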
Column-by-Column Breakdown
What This Means
Tukey HSD for No Significant Change Scenario
The results tell us