What is a good classification accuracy in machine learning?

By Data Tricks, 1 June 2020

One of the most common questions I’m asked when it comes to classification problems in machine learning is what is a good classification accuracy.

And the answer is, unfortunately, in the form of another question: what are you trying to measure? A “good” classification accuracy will largely depend on what you’re trying to predict and what those predictions are going to be used for. Indeed, accuracy might not even be the best statistic to use at all.

How to measure classifier performance

The usual first step is to create a confusion matrix, which for a binary classifier looks like the following:

                   Predicted positive   Predicted negative
Actual positive    True positive        False negative
Actual negative    False positive       True negative
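
As a rough sketch of how the four cells are counted, the snippet below tallies them from two lists of labels. The function name `confusion_counts`, the `positive` argument, and the example labels are all made up for illustration; they are not from a particular library.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count the four confusion-matrix cells for a binary classifier."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actual positive, predicted positive
        elif a == positive:
            fn += 1  # actual positive, predicted negative
        elif p == positive:
            fp += 1  # actual negative, predicted positive
        else:
            tn += 1  # actual negative, predicted negative
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn}


# Made-up labels for illustration: 1 = positive, 0 = negative
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
print(counts)
```

With these example labels the result is 3 true positives, 1 false negative, 1 false positive and 3 true negatives.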

Using this confusion matrix you can calculate a range of measures:

Accuracy

The overall proportion of correct classifications.

Precision

Proportion of predicted positives that were correct.

Sensitivity

Proportion of actual positives that were predicted correctly (sometimes called recall).

Specificity

Proportion of actual negatives that were predicted correctly.

F-score

Sometimes called the F1 score, this is the harmonic mean of precision and sensitivity, giving a single balanced measure of the two.
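
The five measures above can be sketched in a few lines of Python. This is a minimal illustration using the standard definitions; the function name `classifier_metrics` and the example counts are assumptions made up for this sketch. A measure is returned as `None` when its denominator is zero.

```python
def classifier_metrics(tp, fn, fp, tn):
    """Compute common measures from the four confusion-matrix counts."""

    def ratio(numerator, denominator):
        # A measure is undefined when there is nothing to divide by
        return numerator / denominator if denominator else None

    accuracy = ratio(tp + tn, tp + fn + fp + tn)
    precision = ratio(tp, tp + fp)      # correct share of predicted positives
    sensitivity = ratio(tp, tp + fn)    # share of actual positives found (recall)
    specificity = ratio(tn, tn + fp)    # share of actual negatives found
    if precision is None or sensitivity is None or precision + sensitivity == 0:
        f_score = None
    else:
        # Harmonic mean of precision and sensitivity
        f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Made-up counts for illustration
metrics = classifier_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For these example counts, accuracy is 0.7, sensitivity is 0.8 and specificity is 0.6, while precision is roughly 0.667 and the F-score roughly 0.727.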

Examples

Now let’s consider two scenarios:

Scenario A: you’re training a machine learning algorithm to be used for facial recognition on a social media platform.

Scenario B: you’re training a machine learning algorithm to determine the immediate risk posed to vulnerable people.

Let’s say you achieved a classification accuracy of 80% in both scenarios. In Scenario A, your algorithm tagged most photos correctly but misclassified 1 in 5, leading to a minor inconvenience for some users. In Scenario B, however, the same error rate is far more serious: every vulnerable person misclassified as not at risk is someone who may be in imminent danger but is ignored. The stakes are much higher.
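
An accuracy figure on its own also does not say which kind of error a model is making. As a hedged illustration, the two hypothetical models below (the counts are invented, assuming 100 people of whom 20 are actually at risk) both reach 80% accuracy, yet they miss very different numbers of at-risk people.

```python
def accuracy_and_sensitivity(tp, fn, fp, tn):
    """Return (accuracy, sensitivity) for the given confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity


# Invented counts: 100 people, 20 of whom are actually at risk
model_x = {"tp": 18, "fn": 2, "fp": 18, "tn": 62}
model_y = {"tp": 2, "fn": 18, "fp": 2, "tn": 78}

for name, counts in [("Model X", model_x), ("Model Y", model_y)]:
    accuracy, sensitivity = accuracy_and_sensitivity(**counts)
    print(f"{name}: accuracy={accuracy:.0%}, "
          f"sensitivity={sensitivity:.0%}, missed={counts['fn']}")
```

Both models are 80% accurate, but Model X finds 90% of the people at risk while Model Y finds only 10% of them.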

In Scenario B it might be better to maximise sensitivity rather than accuracy. Put another way, you might want to get the number of false negatives (people you predicted were not at risk, but who actually were) as close to zero as possible. This will usually come at the expense of your overall accuracy, but you can probably live with your model producing more false positives, or ‘false alarms’, in exchange for fewer false negatives.
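
One common way to make this trade is to lower the decision threshold, assuming your model outputs a risk score between 0 and 1 (the article does not say how the model produces its predictions, so this is an assumption). The scores, labels and the function name `evaluate_at_threshold` below are made up for illustration.

```python
def evaluate_at_threshold(scores, actual, threshold):
    """Classify as positive when score >= threshold, then summarise the errors."""
    tp = fn = fp = tn = 0
    for score, label in zip(scores, actual):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "threshold": threshold,
        "accuracy": (tp + tn) / len(actual),
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "false_positives": fp,
        "false_negatives": fn,
    }


# Made-up risk scores and true labels (1 = at risk, 0 = not at risk)
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.40, 0.35, 0.30, 0.25, 0.22, 0.15, 0.05]
actual = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

default = evaluate_at_threshold(scores, actual, threshold=0.5)
cautious = evaluate_at_threshold(scores, actual, threshold=0.2)
print(default)
print(cautious)
```

With this toy data, lowering the threshold from 0.5 to 0.2 removes both false negatives and raises sensitivity from 0.6 to 1.0, but false positives rise from 1 to 5 and accuracy falls from 0.75 to about 0.58.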

