Linear regression

By Data Tricks, 28 July 2020

Statistics

What is linear regression?

Linear regression is a statistical method for analysing the linear relationship between a dependent variable and one or more independent variables. Where there is more than one independent variable, it is commonly called multiple linear regression.
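To make the distinction concrete, here is a minimal sketch of how the two cases look in R's formula syntax. It uses the built-in mtcars dataset purely for illustration; that dataset and the chosen columns (mpg, wt, hp) are assumptions for this sketch and are not part of the example that follows.

```r
# mtcars ships with base R; it is used here only for illustration.

# Simple linear regression: one independent variable (wt)
simple_model <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: two independent variables (wt and hp)
multiple_model <- lm(mpg ~ wt + hp, data = mtcars)

# Each model has an intercept plus one coefficient per independent variable
coef(simple_model)
coef(multiple_model)
```

The only change between the two calls is the right-hand side of the formula, where additional independent variables are joined with +.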

Example in R

First let’s create some simulated data:

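set.seed(150)  # make the simulation reproducible
rand.num <- sample(-5:5, 500, replace = TRUE)  # random integer noise between -5 and 5
data <- data.frame(dep = rnorm(500, mean = 50, sd = 10))  # dependent variable
data$ind <- data$dep + rand.num  # independent variable = dependent variable + noise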

In the above example, we created a data frame with two columns to simulate a dependent and an independent variable. The independent variable was created simply by adding a random integer between -5 and +5 to the dependent variable, so once we fit a linear regression model we expect to find a strong linear relationship between the two variables.
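Before fitting a model, it can help to confirm that the simulated data behaves as described. The following is a rough sketch that repeats the simulation so it can be run on its own and then inspects the result; the specific checks are suggestions rather than a required step.

```r
# Recreate the simulated data so this snippet is self-contained
set.seed(150)
rand.num <- sample(-5:5, 500, replace = TRUE)
data <- data.frame(dep = rnorm(500, mean = 50, sd = 10))
data$ind <- data$dep + rand.num

# Look at the first few rows
head(data)

# The difference between the two columns is the added noise,
# so it should stay between -5 and 5
noise_range <- range(data$ind - data$dep)
noise_range

# A correlation close to 1 indicates a strong linear relationship
correlation <- cor(data$dep, data$ind)
correlation
```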

Apply a linear regression using the lm function:

model <- lm(dep ~ ind, data = data)

Analyse the model:

> summary(model)

Call:
lm(formula = dep ~ ind, data = data)

Residuals:
    Min      1Q  Median      3Q    Max
-5.8305 -2.5890  0.1251  2.3778  6.5174

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.53004    0.67628    5.22 2.63e-07 ***
ind          0.93003    0.01341   69.37 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.98 on 498 degrees of freedom
Multiple R-squared:  0.9062,   Adjusted R-squared: 0.906
F-statistic:  4812 on 1 and 498 DF,  p-value: < 2.2e-16

If you’re not familiar with the output of a linear regression, the above might look daunting. For an explanation of how to interpret it, read this detailed tutorial.
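As a starting point, the key numbers in the summary can also be pulled out one at a time. The sketch below refits the same model so that it runs on its own; the object new_values and the ind values 40, 50 and 60 are hypothetical and chosen only to illustrate prediction.

```r
# Recreate the data and model so this snippet is self-contained
set.seed(150)
rand.num <- sample(-5:5, 500, replace = TRUE)
data <- data.frame(dep = rnorm(500, mean = 50, sd = 10))
data$ind <- data$dep + rand.num
model <- lm(dep ~ ind, data = data)

# Estimated intercept and slope (the "Estimate" column of the summary)
estimates <- coef(model)
estimates

# 95% confidence intervals for the intercept and slope
confint(model, level = 0.95)

# Proportion of the variance in dep explained by the model
r_squared <- summary(model)$r.squared
r_squared

# Predicted values of dep for some hypothetical values of ind
new_values <- data.frame(ind = c(40, 50, 60))
predictions <- predict(model, newdata = new_values)
predictions
```

In the output shown above, the slope estimate of 0.93003 for ind means that dep increases by roughly 0.93 for each one-unit increase in ind, and the Multiple R-squared of 0.9062 means the model explains about 91% of the variance in dep.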

Is Linear Regression the right test?

Use our interactive tool to help you choose the right statistical test or read our article on how to choose the right statistical test.

Tags: linear regression, statistics


