Goodness of Fit
Fair Coin?
If we were to flip a coin 10 times, we would expect to see roughly 5 heads and 5 tails. Let's assign $\mathrm{H}$ to heads and $\mathrm{T}$ to tails. Therefore, we might see a sequence like this.

$$\mathrm{H,\ T,\ T,\ H,\ H,\ T,\ H,\ T,\ T,\ H}$$
But what if we saw a sequence of 10 heads and 0 tails?
What are the chances! It's fairly easy to calculate the chances of seeing all heads.

$$P(\text{10 heads}) = \left(\frac{1}{2}\right)^{10} = \frac{1}{1024} \approx 0.1\%$$
However, we would also be equally surprised at seeing all tails, so we should account for both situations.

$$P(\text{10 heads or 10 tails}) = \frac{1}{1024} + \frac{1}{1024} = \frac{2}{1024} \approx 0.2\%$$
This probability is so low that it seems like we might not be dealing with a fair coin. We might conclude that the result of the coin flips isn't due to random chance alone, since these observations would occur less than 1% of the time with a fair coin.
But what about seeing something like 1 heads and 9 tails?
How likely is this to occur with a fair coin? What we actually want to know is how likely we are to see 1 or fewer heads or 1 or fewer tails.

$$P(\text{1 or fewer heads, or 1 or fewer tails}) = \frac{\binom{10}{0} + \binom{10}{1} + \binom{10}{9} + \binom{10}{10}}{2^{10}} = \frac{22}{1024} \approx 2.1\%$$
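If you want to double-check these figures, the exact binomial probabilities can be computed in a few lines of Python. The snippet below is just a quick sketch using `scipy.stats.binom`; nothing in the rest of the article depends on it.

```python
from scipy.stats import binom

n, p = 10, 0.5  # 10 flips of a fair coin

# P(10 heads) + P(10 tails)
p_extreme = binom.pmf(10, n, p) + binom.pmf(0, n, p)
print(f"{p_extreme:.4f}")  # 0.0020, about 0.2%

# P(1 or fewer heads) + P(1 or fewer tails)
p_skewed = binom.cdf(1, n, p) + binom.sf(8, n, p)
print(f"{p_skewed:.4f}")   # 0.0215, about 2.1%
```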
What about if we have 2 heads and 8 tails? The calculations for this become very tedious very quickly. And keep in mind this becomes even more annoying if we scale up to 100 coin flips, or even 1000. Clearly, we need a better approach.
The Chi-Squared Test
To determine whether an observed distribution of values matches an expected distribution of values, we can use a chi-squared test, sometimes written out with the Greek letter as $\chi^2$. The formula for the $\chi^2$ statistic is as follows.

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$
In this formula,

- $O_i$ is the observed count of class $i$.
- $E_i$ is the expected count of class $i$.
What is a class? In our example, we have two classes: heads and tails. So for a sequence containing 1 heads and 9 tails, we have the following.
- $O_H = 1$ (we observed 1 heads out of 10)
- $E_H = 5$ (we expected 5 heads out of 10)
- $O_T = 9$ (we observed 9 tails out of 10)
- $E_T = 5$ (we expected 5 tails out of 10)
Plugging these numbers into the formula, we get the following value for $\chi^2$.

$$\chi^2 = \frac{(1 - 5)^2}{5} + \frac{(9 - 5)^2}{5} = \frac{16}{5} + \frac{16}{5} = 6.4$$
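If you would rather not crunch these numbers by hand, the statistic is simple to compute directly from the formula. Here is a minimal Python sketch; the helper name `chi_squared_stat` is just an illustrative choice, not a standard library function.

```python
def chi_squared_stat(observed, expected):
    """Pearson chi-squared statistic: sum over classes of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Coin example from above: 1 heads and 9 tails observed, 5 of each expected.
print(chi_squared_stat([1, 9], [5, 5]))  # 6.4
```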
There is one more thing we need, which is the degrees of freedom for our data. This is the number of classes minus 1. Since we have 2 classes (heads and tails), our degrees of freedom is equal to 1. The chi-squared variable is often written with the associated degrees of freedom as a subscript, like $\chi^2_1$.
Now what do we do with our $\chi^2$ value? There is a magic formula that lets us calculate the probability that an observed distribution differs from the expected distribution by at least as much as the one we saw. (You can read more about this "magic" formula in this article.)
With $\chi^2 = 6.4$ and the degrees of freedom equal to 1, we can calculate the probability:

$$P(\chi^2_1 \geq 6.4) \approx 1.1\%$$
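The probability above is the survival function (one minus the CDF) of the chi-squared distribution, and one way to evaluate it is with `scipy.stats`. A quick sketch using the values from above:

```python
from scipy.stats import chi2

# Probability of a chi-squared value of 6.4 or larger with 1 degree of freedom.
p_value = chi2.sf(6.4, df=1)
print(f"{p_value:.3f}")  # 0.011, about 1.1%
```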
You might notice that the 1.1% we just calculated differs from the 2.1% we calculated previously. This is because the chi-squared test is an approximation that doesn't perfectly match our given scenario. However, for practical purposes, it is a "good enough" approximation.
Chi-Squared Test Assumptions
It is very important to mention that the chi-squared test is only valid when both of the following assumptions hold.
- The sample size must be large enough for the central limit theorem to be applicable. For the chi-squared test, the expected count per category should be greater than 5.
- The observations should be independent. If dependencies do exist between observations, then the calculated probabilities will not be correct.
Fair Die?
If we were to roll a die 36 times, we would expect to see roughly 6 occurrences of each of the 6 possible numbers. Suppose we roll the die 36 times, count the occurrences of each number, and compare the counts to what we would expect.
- $O_1 = 4$ (we observed 4 ones)
- $O_2 = 7$ (we observed 7 twos)
- $O_3 = 7$ (we observed 7 threes)
- $O_4 = 6$ (we observed 6 fours)
- $O_5 = 6$ (we observed 6 fives)
- $O_6 = 6$ (we observed 6 sixes)
Since we expect each number to appear 6 times, $E_i = 6$ for every class. Now we can calculate the $\chi^2$ statistic for this data.

$$\chi^2 = \frac{(4-6)^2}{6} + \frac{(7-6)^2}{6} + \frac{(7-6)^2}{6} + \frac{(6-6)^2}{6} + \frac{(6-6)^2}{6} + \frac{(6-6)^2}{6} = \frac{4 + 1 + 1 + 0 + 0 + 0}{6} = 1$$
For this test, we have 6 classes (the numbers 1 through 6), so our degrees of freedom is $6 - 1 = 5$. We can now use the chi-squared distribution with 5 degrees of freedom to calculate the following probability.

$$P(\chi^2_5 \geq 1) \approx 96.3\%$$
This high probability tells us that the observed distribution is very consistent with what we would expect from a fair die.
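If you'd like to verify this yourself, the whole test can be reproduced with `scipy.stats.chisquare`; here is a short sketch using the observed counts above.

```python
from scipy.stats import chisquare

observed = [4, 7, 7, 6, 6, 6]  # counts of each face over 36 rolls
expected = [6, 6, 6, 6, 6, 6]  # a fair die: 36 rolls spread over 6 faces

stat, p_value = chisquare(observed, f_exp=expected)
print(stat, p_value)  # 1.0, about 0.963
```

Note that `chisquare` assumes a uniform expected distribution by default, so `f_exp` could be omitted here; it is spelled out for clarity.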