Paired samples t-test

By Data Tricks, 28 July 2020

Statistics

What is a paired samples t-test?

A paired samples t-test is used to compare the means of two related groups of samples. The data should contain two values (a pair) for each case in the sample.

A paired samples t-test has a null hypothesis that the mean difference between the pairs of values is zero. The alternative hypothesis is that the mean difference is not equal to zero. Because this alternative allows a difference in either direction, the test is called a two-tailed test.
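Because the hypotheses are stated in terms of the differences within each pair, a paired test should give the same result as a one-sample t-test on those differences. The sketch below illustrates this with five made-up pairs; the vectors `before` and `after` are illustrative values only, not data from this article.

```r
# Hypothetical paired measurements for five cases (illustrative values only)
before <- c(12.1, 14.3, 11.8, 15.0, 13.2)
after  <- c(12.9, 14.1, 12.6, 15.8, 13.9)

# Paired test on the two sets of values
paired_test <- t.test(x = after, y = before, paired = TRUE)

# One-sample test on the per-case differences, with a null mean of zero
diff_test <- t.test(after - before, mu = 0)

# Both calls should report the same t statistic and p-value
all.equal(unname(paired_test$statistic), unname(diff_test$statistic))
all.equal(paired_test$p.value, diff_test$p.value)
```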

We can also perform a one-tailed t-test if we have a prior expectation about the direction of the difference, i.e. that the mean difference is either greater than or less than zero. In a one-tailed test, the null and alternative hypotheses can take two forms:

1: The null hypothesis is that the mean difference is greater than or equal to zero; the alternative is that the mean difference is less than zero.

2: The null hypothesis is that the mean difference is less than or equal to zero; the alternative is that the mean difference is greater than zero.
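In R, the two one-tailed forms correspond to the `alternative` argument of `t.test()`. The following is a minimal sketch on simulated data; the vectors `before` and `after` and the seed are assumptions made for illustration, not part of the worked example in this article.

```r
# Hypothetical example data: 30 paired measurements (illustrative only)
set.seed(1)
before <- rnorm(30, mean = 50, sd = 10)
after  <- before + rnorm(30, mean = 1, sd = 2)

# Form 1: the alternative is that the mean difference (after - before) is less than zero
test_less <- t.test(x = after, y = before, paired = TRUE, alternative = "less")

# Form 2: the alternative is that the mean difference is greater than zero
test_greater <- t.test(x = after, y = before, paired = TRUE, alternative = "greater")

# The two one-sided p-values are complementary and add up to 1
test_less$p.value
test_greater$p.value
```

Only the form chosen before looking at the data should be reported; running both and picking the smaller p-value would defeat the purpose of a one-tailed test.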

Example in R

Let’s say we have some data on student performance before and after training:

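# Fix the random seed so the example is reproducible
set.seed(150)
# Per-student change in score: whole numbers drawn at random from -4 to 5
randomsequence <- sample(-4:5, 100, replace = TRUE)
# Scores before training: normally distributed with mean 50 and sd 10
data <- data.frame(studentId = 1:100,
                   before = rnorm(100, mean = 50, sd = 10))
# Scores after training: the before score plus the simulated change
data$after <- data$before + randomsequence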

We want to test whether the training has had a statistically significant impact on performance. The null hypothesis is that the mean difference is zero, i.e. the training has made no difference on average. The alternative hypothesis is that the mean difference is not equal to zero.

To perform the paired samples t-test in R, use the following code:

test <- t.test(x = data$after, y = data$before, paired = TRUE, alternative = "two.sided")

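Note: take care to get x and y the right way around. The differences are calculated as x minus y, so data$after should be assigned to x if a positive mean difference is to indicate an improvement.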
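To see why the order matters, the sketch below runs the test both ways round on the example data. The object names `after_minus_before` and `before_minus_after` are introduced here purely for illustration.

```r
# Recreate the example data
set.seed(150)
randomsequence <- sample(-4:5, 100, replace = TRUE)
data <- data.frame(studentId = 1:100,
                   before = rnorm(100, mean = 50, sd = 10))
data$after <- data$before + randomsequence

# Differences calculated as after - before
after_minus_before <- t.test(x = data$after, y = data$before, paired = TRUE)

# Differences calculated as before - after
before_minus_after <- t.test(x = data$before, y = data$after, paired = TRUE)

# The estimate and t statistic change sign; the two-sided p-value does not
c(after_minus_before$estimate, before_minus_after$estimate)
c(after_minus_before$p.value, before_minus_after$p.value)
```

For a two-tailed test the conclusion is unaffected, but for a one-tailed test swapping x and y would test the opposite direction.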

Now analyse the output of the test:

> test

       Paired t-test

data:  data$after and data$before
t = 3.5068, df = 99, p-value = 0.0006833
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.4298462 1.5501538
sample estimates:
mean of the differences
                   0.99

p-value

The p-value is 0.00068, which is below the 5% significance level, so the null hypothesis can be rejected. This is evidence that the training has had a statistically significant effect on performance, although it does not prove that the alternative hypothesis is true. The mean of the differences is 0.99, which shows that the effect has been positive.
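The p-value and the mean difference do not have to be read off the printed output: the object returned by `t.test()` stores them as `p.value` and `estimate`. A minimal sketch, in which the threshold name `alpha` is an assumption introduced here:

```r
# Recreate the example data and the test
set.seed(150)
randomsequence <- sample(-4:5, 100, replace = TRUE)
data <- data.frame(studentId = 1:100,
                   before = rnorm(100, mean = 50, sd = 10))
data$after <- data$before + randomsequence
test <- t.test(x = data$after, y = data$before, paired = TRUE, alternative = "two.sided")

# Significance level chosen before running the test
alpha <- 0.05

# p-value of the test, and whether it falls below the chosen level
test$p.value
test$p.value < alpha

# Mean of the differences (after - before); its sign gives the direction of the effect
unname(test$estimate)
```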

t-value

The t-value measures the size of the mean difference relative to the variation in the sample data, expressed as the standard error of the mean difference. The larger the absolute value of the t-value, the stronger the evidence against the null hypothesis.
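As a rough check, the t-value can be reproduced by hand as the mean of the differences divided by its standard error. The sketch below assumes the example data from this article; the names `d`, `n`, `t_manual` and `p_manual` are introduced here for illustration.

```r
# Recreate the example data and the test
set.seed(150)
randomsequence <- sample(-4:5, 100, replace = TRUE)
data <- data.frame(studentId = 1:100,
                   before = rnorm(100, mean = 50, sd = 10))
data$after <- data$before + randomsequence
test <- t.test(x = data$after, y = data$before, paired = TRUE)

# Per-student differences and the number of pairs
d <- data$after - data$before
n <- length(d)

# t = mean difference / standard error of the mean difference
t_manual <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# Both should agree with the values stored in the test object
all.equal(t_manual, unname(test$statistic))
all.equal(p_manual, test$p.value)
```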

Degrees of freedom

In a paired t-test, one degree of freedom is “spent” estimating the mean difference, so the degrees of freedom are the number of pairs minus 1, which in this example is 99.
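The degrees of freedom are stored in the `parameter` element of the test object. The sketch below also shows a hypothetical scenario, not part of the original example, in which two students have no score after training: `t.test()` drops incomplete pairs, so the degrees of freedom fall accordingly.

```r
# Recreate the example data
set.seed(150)
randomsequence <- sample(-4:5, 100, replace = TRUE)
data <- data.frame(studentId = 1:100,
                   before = rnorm(100, mean = 50, sd = 10))
data$after <- data$before + randomsequence

# 100 complete pairs, so 99 degrees of freedom
test <- t.test(x = data$after, y = data$before, paired = TRUE)
unname(test$parameter)

# Hypothetical scenario: students 3 and 10 have no "after" score
data_missing <- data
data_missing$after[c(3, 10)] <- NA

# 98 complete pairs remain, so 97 degrees of freedom
test_missing <- t.test(x = data_missing$after, y = data_missing$before, paired = TRUE)
unname(test_missing$parameter)
```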

95% confidence interval

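The 95% confidence interval for the mean difference is 0.43 to 1.55. This means we can be 95% confident that the true mean difference lies somewhere between 0.43 and 1.55. Because the interval does not contain zero, it agrees with rejecting the null hypothesis at the 5% significance level.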
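The interval can be reconstructed as the mean difference plus or minus a margin of error, where the margin is the critical t-value multiplied by the standard error. A minimal sketch on the example data; the names `d`, `n`, `se`, `margin` and `ci_manual` are introduced here for illustration.

```r
# Recreate the example data and the test
set.seed(150)
randomsequence <- sample(-4:5, 100, replace = TRUE)
data <- data.frame(studentId = 1:100,
                   before = rnorm(100, mean = 50, sd = 10))
data$after <- data$before + randomsequence
test <- t.test(x = data$after, y = data$before, paired = TRUE)

# Per-student differences, number of pairs and standard error
d  <- data$after - data$before
n  <- length(d)
se <- sd(d) / sqrt(n)

# Margin of error: critical t-value for a 95% interval times the standard error
margin <- qt(0.975, df = n - 1) * se

# Manual interval, which should match the one stored in the test object
ci_manual <- c(mean(d) - margin, mean(d) + margin)
ci_manual
as.numeric(test$conf.int)
```

A different confidence level can be requested with the `conf.level` argument of `t.test()`, for example `conf.level = 0.99`.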

Is a paired samples t-test the right test?

Use our interactive tool to help you choose the right statistical test or read our article on how to choose the right statistical test.

Tags: paired samples, statistics, t-test
