Over the past five years or so, it’s been hard to ignore the rise of data science. Everyone seems to be talking about it. And, as put by a senior director I was recently talking to, ‘everybody wants to be a data scientist’.
If Google’s search trends are anything to go by, the rapid rise of data science began around 2013.
But the concepts that underpin the field we now know as data science didn’t suddenly materialise in 2013. In fact, machine learning – a cornerstone of data science – has been around for at least 70 years thanks to pioneers such as Alan Turing.
Interestingly, many machine learning algorithms don’t attract the same level of interest in Google searches. More searches for support vector machines were carried out in 2004 than today.
Perhaps this is not surprising given that support vector machines have been around for 25 years.
So, if machine learning predates the music of Elvis Presley, why has it taken so long for data science to take off?
Digital transformation is now widespread across many industries, which has increased the volume, velocity and variety of data. Over 2.5 quintillion bytes of data (that’s 2,500,000,000,000,000,000 bytes) are created every single day (see Domo’s report), and that figure is increasing all the time. At the same time, advances in cloud computing and distributed file systems have made it easier than ever before to store, retrieve and analyse this data.
For a long time, organisations have used data to measure how well they are performing. But now, an increasing number of businesses are realising that data can help them predict future trends, and can even help inform their direction and strategy to drive greater success.
Data science vs. statistics
So we’ve got more data, a greater ability to analyse it, and the will of businesses to extract useful information from it. This is where data science comes in. But you might also be thinking, isn’t this where statistics comes in? And you’d be right. As an article (which, ironically, sets out to explain the differences between data science and statistics) published on Educba rather unhelpfully puts it, “statistics is the science of data”.
Other opinions, including this article (which is in need of updating, but interesting nevertheless), state that broadly there is very little difference between the two. Indeed, pick up any syllabus for a degree in statistics – particularly an Applied Statistics degree – and you’re likely to find modules on machine learning, forecasting, or even data science.
It might not be a coincidence that Google Trends shows a gradual decline in searches for the term statistics whilst searches for data science have been rising.
So is data science simply statistics rebranded? The now ubiquitous data science Venn diagram is one of the most helpful illustrations of what data science is.
From this illustration, data science is clearly a very wide field. Generally, however, it is the process of using statistical and mathematical models and scientific methods to extract knowledge and insight from data, and using that insight to inform business strategy and direction.
Today, great data scientists combine statistical and mathematical understanding, the coding skills to apply that knowledge to real scenarios, and the business acumen to convert findings into future strategy.