Out-of-the-Box Part 1 - Proportion Testing with CHISQ.TEST

So Excel Lambda is great! You can create custom functions that are not included in Excel. And we have seen in this blog how you could implement various statistical hypothesis testing

But then what are those Excel TEST functions for? What hypothesis testing can I do with these Out-of-the-box TEST functions?

In the next few posts, I will describe some common hypothesis testing you could do:

Excel's Chi Squared Test for One Proportion Testing

In Excel's documentation, CHISQ.TEST is described as a test for independence. It does this by doing a goodness-to-fit test on how well the data matches the expected. This means we can also use it for one proportion testing.

To use CHISQ.TEST we will need to compare the sample proportion against the expected proportion. But unlike DC-Den's One Proportion Testing where we only need to specify the expected proportion, here we will need to specify the expected proportion having the same sample size.

So let's say you want to know if a class has the same ratio of boys to girls. Let's say in class A there are 25 boys and 25 girls. Therefore total sample size is `25+25 = 50`. Let's say you are expecting 50% are boys. The expected proportion must be set at 25 boys and 25 girls, so that the total expected size is also 50. 

 

The diagram above illustrates we we just covered.

Let's make a few minor variations to this example as shown in the diagram below. 

Here you see Class A has exactly the expected proportion, hence the Chi Squared p-value is 1 indicating an exact fit.

In Class B and C, the proportion of boys is lesser, but the p-values are still greater than 0.05. We can conclude that the proportion difference between boys and girls is not significant enough.

In Class D, the p-value is 0.0109 which is less than 0.05. So we can conclude the the proportion difference between boys and girls is significant compared to the expected value.

Questions and Answers

Can we use CHISQ.TEST for Two Proportion Testing?

Yes, if both sample sizes are the same.

So if you want to compare two classes A and B if the proportion of boys are the same, then you need to sample the same number of students from class A and class B.

Here we see that the proportion difference between Class A and B is not significantly different, so we can conclude that Class A and B have similar proportions.

So we can use CHISQ.TEST for two proportion testing as long as the sample sizes are the same.

Why I get different p-values using different methods?

Yes. This is because different methods start with different assumptions. You can use different methods to validate your conclusion but not for comparing the p-values numerically.

But you may say that the p-values led to different conclusions. If this is the case, you may need to re-assess the assumptions involved.

I remember my ex-colleague who mentioned that calculating p-value is not so much as to come to a conclusion but rather to ascertain your observation. 

Conclusion

In this posting, we learnt to use Excel's CHISQ.TEST for one- and two- proportion testing, with some constraints on the latter. If you are not be interested in setting up the Lambda formula or you just want a quick test, then remember this method to save yourself time.

Now go and proportion test your DC-DEN!

Comments