Spearman’s correlation in R

By Data Tricks, 28 July 2020

What is Spearman’s correlation coefficient?

Spearman’s correlation coefficient is a non-parametric measure of the correlation between two variables. It is useful in analysing the correlation between variables where the relationship is monotonic but not necessarily linear.

A monotonic relationship exists when, as one variable increases, the other always increases, or when, as one variable increases, the other always decreases. Visualised as a chart of x against y, the slope of the relationship must be either always positive or always negative, but must never switch between the two.
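To make the idea of monotonicity concrete, here is a small sketch using made-up vectors (x, y and z below are purely illustrative and are not the article's data). y is a strictly increasing but curved function of x, while z falls and then rises, so its slope changes sign.

```r
# Illustrative (hypothetical) data: y is monotonic in x, z is not
x <- 1:20
y <- exp(x / 2)      # always increasing, but far from a straight line
z <- (x - 10)^2      # decreases until x = 10, then increases

# Monotonic, non-linear: the ranks of x and y agree perfectly
spearman_xy <- cor(x, y, method = "spearman")
pearson_xy  <- cor(x, y, method = "pearson")

# Non-monotonic: Spearman's coefficient no longer captures the relationship
spearman_xz <- cor(x, z, method = "spearman")

print(c(spearman_xy = spearman_xy, pearson_xy = pearson_xy, spearman_xz = spearman_xz))
```

For the monotonic pair, Spearman's coefficient is 1 while Pearson's is lower because the curve is not a straight line. For the U-shaped pair, Spearman's coefficient is weak even though z is completely determined by x.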

Example in R

Let’s create some example data:

# Simulate 100 observations; y rises steeply (x^5) with x, plus uniform integer noise
set.seed(150)
data <- data.frame(x = rnorm(100, mean = 50, sd = 10),
                   random = sample(100:500, 100, replace = TRUE))
data$y <- (data$x^5 / 1000000) + data$random
plot(data$x, data$y)

If we want to calculate the Spearman’s correlation of x and y in data, we can use the following code:

correlation <- cor(data$x, data$y, method = 'spearman')

Checking the results:

> correlation
[1] 0.8950255

Spearman’s correlation coefficient is approximately 0.90, which indicates a strong positive monotonic relationship between x and y.
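It can help to see where this number comes from. Spearman's coefficient is the Pearson coefficient calculated on the ranks of the values rather than on the values themselves. The sketch below rebuilds the example data so that it runs on its own. The names spearman_builtin and spearman_by_rank are introduced here for illustration only.

```r
# Recreate the example data so this snippet is self-contained
set.seed(150)
data <- data.frame(x = rnorm(100, mean = 50, sd = 10),
                   random = sample(100:500, 100, replace = TRUE))
data$y <- (data$x^5 / 1000000) + data$random

# Built-in Spearman calculation
spearman_builtin <- cor(data$x, data$y, method = "spearman")

# Same quantity by hand: rank each variable, then take the Pearson correlation
spearman_by_rank <- cor(rank(data$x), rank(data$y), method = "pearson")

# The two approaches should agree up to floating-point tolerance
print(all.equal(spearman_builtin, spearman_by_rank))
```

Because only the ranks matter, any strictly increasing transformation of x or y (taking logs of positive values, for example) leaves Spearman's coefficient unchanged.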

Note that if we calculate the Pearson correlation coefficient of the same variables, we get a value of 0.85:

> cor(data$x, data$y, method = 'pearson')
[1] 0.8536495

This is slightly lower than the Spearman’s coefficient because the Pearson correlation coefficient measures the strength of the linear relationship between variables, and the relationship here is curved (y depends on x^5). Spearman’s coefficient is therefore the more appropriate statistic when the relationship is monotonic but non-linear.
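If you also want a p-value rather than just the coefficient, base R's cor.test() accepts the same method argument. The following is a minimal sketch on the same example data, not a recommendation about which test suits your own data. The object name result is an assumption made for this illustration.

```r
# Recreate the example data so this snippet is self-contained
set.seed(150)
data <- data.frame(x = rnorm(100, mean = 50, sd = 10),
                   random = sample(100:500, 100, replace = TRUE))
data$y <- (data$x^5 / 1000000) + data$random

# Test the null hypothesis that Spearman's rho is zero
result <- cor.test(data$x, data$y, method = "spearman")

rho     <- unname(result$estimate)  # the same coefficient that cor() returns
p_value <- result$p.value

print(rho)
print(p_value)
```

If your data contain tied values, cor.test() warns that it cannot compute an exact p-value and uses an approximation instead.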

Is Spearman’s correlation the right test?

Use our interactive tool to help you choose the right statistical test or read our article on how to choose the right statistical test.
