One-sample chi-square test

By Data Tricks, 28 July 2020

Statistics

What is a chi-square test?

A chi-square test (pronounced "kai-square" and sometimes written as a χ² test) is designed to analyse nominal (also known as categorical) data. It compares the observed frequencies in each response category with the frequencies that would be expected if the null hypothesis were true.

Example in R

First let’s simulate some categorical data. In this example, we’ll create some responses to a client satisfaction survey.

set.seed(150)
data <- data.frame(value = sample(c("Very satisfied",
                                    "Very satisfied",
                                    "Very satisfied",
                                    "Somewhat satisfied",
                                    "Somewhat satisfied",
                                    "Somewhat dissatisfied",
                                    "Very dissatisfied"),
                                   300, replace = TRUE))
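
The repeated labels in the vector above act as weights: each draw picks "Very satisfied" with probability 3/7, "Somewhat satisfied" with 2/7, and each of the two dissatisfied categories with 1/7. As a rough sketch, the same sampling scheme can be written with the prob argument of sample. The names categories and data_alt are ours, and because the random numbers are consumed differently, the counts will not necessarily match those shown in this article.

```r
set.seed(150)

# The four distinct response categories
categories <- c("Very satisfied", "Somewhat satisfied",
                "Somewhat dissatisfied", "Very dissatisfied")

# Weights 3:2:1:1 mirror how often each label is repeated in the original vector
data_alt <- data.frame(value = sample(categories, 300, replace = TRUE,
                                      prob = c(3, 2, 1, 1) / 7))

table(data_alt$value)
```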

We’ll also need to calculate the frequencies of each category:

frequencies <- table(data$value)
frequencies

Somewhat dissatisfied    Somewhat satisfied    Very dissatisfied
                   40                    85                   47
       Very satisfied
                  128

Now let’s say that we want to test whether changes that were brought in to improve client satisfaction have made a difference. The client satisfaction before these changes was 20%, 25%, 15% and 40% for the categories Somewhat dissatisfied, Somewhat satisfied, Very dissatisfied and Very satisfied, respectively.

The null hypothesis is that there is no change, i.e. the client satisfaction levels are the same as previously recorded.

A chi-square test can be performed using the chisq.test function, with the previous proportions passed in the p argument in the same order as the categories in the frequency table. The test evaluates whether the distribution of categories in the data fits the expected (i.e. the previous) distribution.

test <- chisq.test(frequencies, p = c(0.20, 0.25, 0.15, 0.40))
test

        Chi-squared test for given probabilities
data:  frequencies
X-squared = 8.6222, df = 3, p-value = 0.03476
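
To see where the X-squared value comes from, the statistic can be reproduced by hand as the sum of (observed - expected)² / expected over the categories. The sketch below hard-codes the counts from the frequency table above; the names observed, expected_p, expected, chi_sq and deg_free are illustrative. Naming the proportions also makes it easy to check that they are in the same (alphabetical) order as the table, since chisq.test matches p to the counts by position.

```r
# Observed counts, copied from the frequency table
observed <- c("Somewhat dissatisfied" = 40, "Somewhat satisfied" = 85,
              "Very dissatisfied" = 47, "Very satisfied" = 128)

# Proportions recorded before the changes (the null hypothesis)
expected_p <- c("Somewhat dissatisfied" = 0.20, "Somewhat satisfied" = 0.25,
                "Very dissatisfied" = 0.15, "Very satisfied" = 0.40)

# Guard against the two vectors being in a different order
stopifnot(identical(names(observed), names(expected_p)))

# Expected counts under the null hypothesis: 60, 75, 45, 120
expected <- sum(observed) * expected_p

# Chi-square statistic and degrees of freedom (number of categories minus 1)
chi_sq <- sum((observed - expected)^2 / expected)
deg_free <- length(observed) - 1

round(chi_sq, 4)   # 8.6222
deg_free           # 3
```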

p-value

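The p-value is 0.035, which is below the 5% significance level, so the null hypothesis can be rejected: the distribution of satisfaction levels differs from the one previously recorded. Note that the test only indicates that the distribution has changed, not in which direction.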
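
The p-value is the upper-tail probability of the χ² distribution beyond the observed statistic. As a minimal sketch, it can be recomputed with pchisq from the statistic and degrees of freedom reported in the test output above (the variable names here are ours). When the test object returned by chisq.test is available, the same value is stored in test$p.value.

```r
chi_sq <- 8.6222   # X-squared from the test output
deg_free <- 3      # df from the test output
alpha <- 0.05      # significance level

# Probability of a statistic at least this large if the null hypothesis is true
p_value <- pchisq(chi_sq, df = deg_free, lower.tail = FALSE)

round(p_value, 3)  # 0.035
p_value < alpha    # TRUE, so the null hypothesis is rejected
```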

χ² statistic

Another way of evaluating the test is to look at the χ² statistic itself. The null hypothesis can be rejected when the χ² statistic is at least as large as a critical value, which is determined by the degrees of freedom and the significance level.

In our example, we have 3 degrees of freedom (the number of categories minus one) and our significance level is 0.05. Using a table of probabilities for the χ² distribution, we can see that the critical χ² value is 7.815. The null hypothesis can therefore be rejected when χ² ≥ 7.815. Here χ² = 8.622, so it is rejected, in agreement with the p-value.
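
Instead of a printed table, the critical value can also be looked up with R's qchisq quantile function. A small sketch using the significance level and degrees of freedom from this example (the variable names are ours):

```r
alpha <- 0.05      # significance level
deg_free <- 3      # degrees of freedom
chi_sq <- 8.6222   # X-squared from the test output

# Value that a chi-square variable with 3 df exceeds with probability 0.05
critical_value <- qchisq(1 - alpha, df = deg_free)

round(critical_value, 3)   # 7.815
chi_sq >= critical_value   # TRUE, so the null hypothesis is rejected
```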

Is a one-sample chi-square the right test?

Use our interactive tool to help you choose the right statistical test or read our article on how to choose the right statistical test.

