The 5 most important skills of a data scientist
By Data Tricks, 17 January 2020
Last modified 12 April 2020
Thinking about getting into data science? Here is my take on the top skills needed to be an effective and successful data scientist.
1 Statistics
There’s no getting round this one: statistical knowledge and understanding, including machine learning, is the most important skill you will need to be a successful data scientist.
Many argue that coding skills are more important, but without a strong foundation in statistics you risk drawing incorrect conclusions from your analyses, which could lead to bad business decisions and ultimately prove more costly in the long run.
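As a rough illustration of how an analysis can point the wrong way, here is a minimal Python sketch of Simpson's paradox. The conversion counts are made up for this example and the variant and segment names are hypothetical. Variant A converts better within each customer segment, yet looks worse once the segments are pooled, simply because most of A's traffic came from the harder-to-convert segment.

```python
# Hypothetical counts, invented for illustration: (conversions, visitors)
results = {
    "A": {"new": (60, 300), "returning": (50, 100)},
    "B": {"new": (15, 100), "returning": (135, 300)},
}


def rate(conversions, visitors):
    """Conversion rate for a single group."""
    return conversions / visitors


def overall_rate(variant):
    """Conversion rate for a variant with all segments pooled together."""
    conversions = sum(c for c, _ in results[variant].values())
    visitors = sum(v for _, v in results[variant].values())
    return conversions / visitors


# Compare the variants within each segment, then with segments pooled.
for segment in ("new", "returning"):
    a = rate(*results["A"][segment])
    b = rate(*results["B"][segment])
    print(f"{segment:>9}: A={a:.1%}  B={b:.1%}")

print(f"  overall: A={overall_rate('A'):.1%}  B={overall_rate('B'):.1%}")
```

Someone looking only at the pooled figures would probably pick variant B, even though A does better for both new and returning customers. Knowing to check for this kind of confounding is a matter of statistical understanding rather than coding ability.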
2 Coding
Data scientists often work with very large datasets, which are only going to get bigger. Old-school analytics software isn’t going to cut it, so you’ll need to use more sophisticated tools.
Being able to code in either R or Python is usually considered the bare minimum. SQL is also a must-have for many roles, although some larger companies might have dedicated teams with expertise in databases and SQL working alongside data scientists. Knowledge of other languages such as Java or C/C++ is also advantageous, as is experience of using technologies such as Apache Spark and Hadoop.
3 Communication
This is perhaps one of the most underrated skills of a data scientist, but it’s going to be crucial for your success. You could complete a great piece of analysis or train a highly accurate machine learning model, but if you can’t get stakeholders on board, you’ll have wasted your time.
There’s a lot of talk about analytical skills being in short supply and high demand at the moment, so it’s likely that you’ll find yourself having to explain your work to an audience with little knowledge of data science or analytics. It’s therefore important to be able to communicate in layman’s terms. A good grounding in statistics will help you do this.
4 Data visualisation
Visualisation goes hand-in-hand with communication. It is often easier to show what a correlation of 0.8 looks like than to explain it in words. Knowledge of popular visualisation tools such as ggplot2, plotly or D3.js is a good start in making engaging visualisations.
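To make the correlation example concrete, here is a small Python sketch that simulates two variables with a correlation of roughly 0.8 and draws a crude text scatter plot using only the standard library. The function names (`pearson`, `simulate`, `text_scatter`) and the sample size are my own choices for illustration. In practice you would pass the same points to a proper charting library such as the ones mentioned above.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(target_r, n, seed=0):
    """Draw n pairs of standard normal values whose true correlation is target_r."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    noise_scale = math.sqrt(1 - target_r ** 2)
    ys = [target_r * x + noise_scale * rng.gauss(0, 1) for x in xs]
    return xs, ys


def text_scatter(xs, ys, width=41, height=15, limit=3.0):
    """Render points within +/- limit on both axes as a grid of '*' characters."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        if abs(x) >= limit or abs(y) >= limit:
            continue  # skip the rare points that fall outside the plotted range
        col = min(width - 1, int((x + limit) / (2 * limit) * width))
        row = min(height - 1, int((limit - y) / (2 * limit) * height))
        grid[row][col] = "*"
    return "\n".join("".join(line) for line in grid)


xs, ys = simulate(0.8, 300)
print(f"sample correlation: {pearson(xs, ys):.2f}")
print(text_scatter(xs, ys))
```

Even in this rough form, the cloud of points sloping upwards from left to right says more to a non-technical audience than the number 0.8 does on its own.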
Take a look at our tutorials on R charts, geospatial visualisation and data art projects.
5 Business acumen
If you work in the commercial world, it’s important to be able to see how all the parts of a company work together to drive growth, to understand key stakeholders and customers, and to appreciate how actions and decisions affect key measures and objectives.
Your data science project will have objectives beyond the final ‘statistical’ answer, such as increased profit or greater efficiency. It’s also likely that you’ll come across obstacles and will sometimes need to compromise to keep the project from derailing. The ability to understand and deal with these issues will help you not only produce the analysis that drives business decisions, but also be involved in those decisions yourself. This in turn should improve the reputation of the data science function in your company, leading to earlier and greater involvement in future decisions.
Tags: data science, machine learning