“[A data scientist is] a person who is better at statistics than any software engineer and better at software engineering than any statistician”. - Josh Wills, director of Data Engineering at Slack
In recent years, demand for data scientists has grown across industries and departments. At the same time, companies are collecting more data than they know what to do with. In fact, according to IBM, 90% of the data in the world today has been created in the last two years alone. To put this influx to good use, organizations are turning to data scientists.
Do you have what it takes to be a data scientist?
With today’s focus on data-driven decision-making, there have never been more opportunities in the data science field. But while demand is on the rise, there’s still the question of whether the current pool of talent can actually meet it.
At face value, the data scientist role relies heavily on numerical proficiency. The phrase “data science” alone connotes a foundation of STEM-related skills. But the emergence of analytics tools has allowed data scientists to go beyond the math and focus on producing concrete results and real-world solutions. In fact, there’s an abundance of self-serve, advanced analytics platforms today that enable even novice data scientists to dabble in machine learning and AI.
While a quantitative background is still an advantage, data science isn’t so much about the underlying equations as it is about understanding what these concepts bring to the table in the business landscape.
So if math isn’t your strong suit, don’t worry. After all, data scientists work across various industries and functions, so the skills required vary as well. The important thing is that you should be willing to adapt to various situations and cultivate a diverse set of skills.
A confluence of three areas of expertise
So what exactly are the skills needed in this line of work? According to a resident data scientist from Ducen, the required skillset is often a combination of soft skills and technical skills. To illustrate, let’s look at the confluence of three different but related fields: Mathematics, Computer Science, and Domain Expertise. Identifying where these fields converge is key to succeeding in the data science world.
1. Statistical research
Domain Expertise + Mathematics
Data scientist may be a technical role, but a basic understanding of the industry goes a long way. Strong business acumen is critical in solving business challenges. By combining it with applied mathematical skills and logical thinking, a data scientist can define advanced analytics solutions by translating the business problem into mathematical terms. This includes identifying the relevant mathematical function or model.
2. Data processing
Domain Expertise + Computer Science
The amount of data available is increasing year after year, but this doesn’t mean that all of it is usable. Often, data is messy and inaccurate, and it’s up to a data scientist to deal with these imperfections. But in order to wrangle data effectively, one should have the intuition to identify what’s relevant and valuable for the business, and this is where domain expertise comes in. Combining business acumen with computer science skills enables a data scientist to handle data warehousing, data integration, data manipulation, cleaning, and visualization tasks properly.
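As a rough illustration of how domain knowledge shapes data cleaning, here is a minimal Python sketch. The records, field names, and the rule that a sale cannot have a negative amount are all hypothetical assumptions made for this example, not something taken from a real dataset or from Ducen’s practice.

```python
# Hypothetical raw sales records; field names and values are invented for illustration.
raw_records = [
    {"customer": " Alice ", "region": "north", "amount": "120.50"},
    {"customer": "Bob", "region": "NORTH", "amount": ""},          # missing amount
    {"customer": "alice", "region": "North", "amount": "120.50"},  # same sale, different formatting
    {"customer": "Cara", "region": "south", "amount": "-15"},      # negative amount
]


def clean(records):
    """Normalize text fields, drop unusable rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        # Normalize inconsistent whitespace and capitalization.
        customer = row["customer"].strip().title()
        region = row["region"].strip().lower()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is missing or not numeric
        # Assumed business rule: a sale cannot have a negative amount.
        if amount < 0:
            continue
        key = (customer, region, amount)
        if key in seen:
            continue  # skip rows that duplicate an earlier one after normalization
        seen.add(key)
        cleaned.append({"customer": customer, "region": region, "amount": amount})
    return cleaned


cleaned_records = clean(raw_records)
print(cleaned_records)
```

The programming here is simple. The judgment calls are the domain part: whether a negative amount is an error or a refund, or whether two rows describe the same sale.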
3. Machine learning
Mathematics + Computer Science
While there is some overlap, the difference between a data scientist and other programmers lies mainly in applied mathematics and statistics, which play a larger role in the data science world. On top of knowing programming languages like R, Python, and SQL, data scientists also need to know statistics. Harnessing both capabilities allows them to build effective machine learning models and artificial intelligence solutions.
To put it simply, data scientists are required to juggle multiple skills and responsibilities across different domains—not just those of a quantitative nature. They’re expected to dip their toes in various areas, and this requires a basic understanding of these three fields and how they intersect.
For example, although data scientists deal mostly with numbers, they’re expected to have basic industry knowledge on top of their grasp of advanced data analytics. They might be asked to run a market basket analysis for optimal product recommendations in the retail sphere, but they should also know enough about the business domain to know which variables to leave out, which results are significant, and what actions should be taken based on the study.
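To make the market basket example more concrete, here is a minimal Python sketch that scores item pairs with the three standard association-rule measures: support, confidence, and lift. The baskets, the `pair_rules` helper, and the `min_support` threshold are hypothetical choices for illustration. A real project would more likely rely on an established library and far more data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; items and contents are invented for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (lhs, rhs, support, confidence, lift) for item pairs above min_support."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]     # estimate of P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # confidence relative to P(rhs)
            rules.append((lhs, rhs, support, confidence, lift))
    return rules


for lhs, rhs, support, confidence, lift in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than chance alone would predict, while a lift near or below 1 suggests the pairing is not informative. Deciding which of these rules is worth acting on, such as a bundle or a shelf placement, is where the business knowledge described above comes in.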
Do you have to learn everything?
Since the data scientist role spans disciplines, it can be hard to master every required area. After all, each of these fields requires many years of study and experience. So it isn’t essential to be an expert in all of them, but it is important to have the foundational knowledge. This will help aspiring professionals narrow their focus and hone their skills in a specific area of specialization.
Organizations should play their part by tempering their expectations too. Most tend to look for what people call a “unicorn data scientist”: a business executive, data engineer, and machine learning expert all rolled into one. This is an inefficient approach because, while there is probably an abundance of quantitative and business talent, you’d be hard-pressed to find someone skilled in all areas.
Data science is a vast field that spans many areas of expertise and study: predictive modeling, data management, data warehousing, visualization, artificial intelligence, and more. That might sound intimidating, but as long as you have the basics pinned down, you’ll go a long way. The data science world always needs more passionate people.