**By Data Tricks, 28 July 2020**

Spearman’s correlation coefficient is a non-parametric measure of the correlation between two variables. It is useful in analysing the correlation between variables where the relationship is monotonic but not necessarily linear.

A monotonic relationship exists when, as one variable increases, the other either always increases or always decreases. Visualised as a chart of x against y, the slope of the relationship may be positive throughout or negative throughout, but it must never switch between the two.
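As a rough illustration of this definition, the sketch below sorts the points by x and checks that the successive changes in y never switch sign. The helper name `is_monotonic` is hypothetical (it is not a base R function), and the check is strict, so it only suits noise-free data.

```r
# Hypothetical helper: TRUE if y never switches direction as x increases.
is_monotonic <- function(x, y) {
  d <- diff(y[order(x)])        # changes in y when moving left to right along x
  all(d >= 0) || all(d <= 0)    # all non-decreasing, or all non-increasing
}

x <- seq(-3, 3, by = 0.5)

up    <- is_monotonic(x, exp(x))  # always increasing, though not linear
down  <- is_monotonic(x, -x^3)    # always decreasing
ushape <- is_monotonic(x, x^2)    # falls, then rises

print(c(up = up, down = down, ushape = ushape))
```

Real data with noise will rarely pass such a strict check, which is why a correlation coefficient is used to measure how close to monotonic the relationship is.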

Let’s create some example data:

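    # Fix the random seed so the example is reproducible
    set.seed(150)
    # x is normally distributed; random is integer noise between 100 and 500
    data <- data.frame(x = rnorm(100, mean = 50, sd = 10),
                       random = sample(c(100:500), 100, replace = TRUE))
    # y rises with the fifth power of x (non-linear), plus the noise
    data$y <- (data$x^5/1000000) + (data$random)
    plot(data$x, data$y)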

If we want to calculate the Spearman’s correlation of *x* and *y* in *data*, we can use the following code:

    correlation <- cor(data$x, data$y, method = 'spearman')

Checking the results:

    > correlation
    [1] 0.8950255

Spearman’s correlation coefficient is approximately 0.90, which indicates a strong positive monotonic relationship between *x* and *y*.
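It may help to see where a value like this comes from. Spearman’s coefficient is the Pearson correlation computed on the ranks of the values rather than on the values themselves. The sketch below checks this on made-up data (the seed, sample size and the variables `x` and `y` here are assumptions for illustration, not the article's dataset).

```r
set.seed(1)  # arbitrary seed, chosen only for this illustration
x <- rnorm(50)
y <- exp(x) + rnorm(50, sd = 0.5)

# Built-in Spearman correlation
rho_builtin <- cor(x, y, method = "spearman")

# The same quantity by hand: rank both variables, then take Pearson
rho_by_rank <- cor(rank(x), rank(y), method = "pearson")

print(all.equal(rho_builtin, rho_by_rank))
```

Because only the ranks matter, the size of the gaps between values has no effect on the result, only their ordering.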

Note that if we calculate the Pearson correlation coefficient of the same variables, we get a value of 0.85:

    > cor(data$x, data$y, method = 'pearson')
    [1] 0.8536495

This is slightly lower than Spearman’s coefficient because the Pearson correlation coefficient measures the strength of the *linear* relationship between variables, and here *y* depends on the fifth power of *x*. Spearman’s coefficient is therefore the more appropriate statistic when the relationship is monotonic but not linear.
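The difference between the two coefficients is easier to see on noise-free data. The sketch below uses made-up vectors (not the article's dataset) to compare a monotonic but curved relationship with a U-shaped one. The second case is a reminder that Spearman’s coefficient only captures monotonic relationships, not every kind of non-linear one.

```r
x <- (-30:30) / 10  # evenly spaced grid from -3 to 3, symmetric about 0

# Monotonic but strongly curved: the ordering is preserved exactly
spearman_exp <- cor(x, exp(x), method = "spearman")
pearson_exp  <- cor(x, exp(x), method = "pearson")

# U-shaped (non-monotonic): y falls and then rises
spearman_sq <- cor(x, x^2, method = "spearman")
pearson_sq  <- cor(x, x^2, method = "pearson")

print(round(c(spearman_exp = spearman_exp, pearson_exp = pearson_exp,
              spearman_sq = spearman_sq, pearson_sq = pearson_sq), 3))
```

For the exponential curve, Spearman’s coefficient equals 1 while Pearson’s falls below 1. For the U-shape, both coefficients are essentially zero even though *y* is completely determined by *x*.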

Use our interactive tool to help you choose the right statistical test or read our article on how to choose the right statistical test.

