Data Science Skills: Domain Expertise, Programming, Statistics

Source: Business Over Broadway
Based on a study of 620+ data professionals, we found that data science skills fall into three broad areas: domain expertise (in our case, business), technology/programming and math/statistics.
Recruiting Data Scientists to Mine the Data Explosion
Wes Hunt, Chief Data Officer (CDO) at Nationwide Mutual Insurance Co. on recruiting data scientists:
Finding talent is my largest challenge. Someone who understands our business, who has quantitative skills, who has the technical skills to create the models, and who is able to persuade others that the insights they’ve come up with are ones you can trust and take action on. The hardest part is persuasion. You get the quantitative skills, but there’s a struggle in that ability to communicate effectively. We’ll often pair people together, but we’d really like to grow the talent.
When I was in marketing, we put a focus on liberal-arts-educated individuals, because abstract thinking where there are ambiguous data sets is an area where they are comfortable. Ph.D.s in psychology were a great recruiting pool. A psych Ph.D. has a fair amount of statistical training. We created a program to recruit Ph.D.s.
There’s not yet an educational discipline and curriculum that produces data scientists at the scale that would clear the market. So the way we’ve focused on it is to find people with innate curiosity and critical thinking. You can teach the other skills. On my team, I have a pathologist, a bioengineering student who trained in doing heart research, an M.B.A., and someone who is trained in traditional data architecture. I also have a landscape construction engineer and a psychology Ph.D.
The World’s #1 Data Scientist Talks about Data Science Skills and Tools
[youtube https://www.youtube.com/watch?v=dpzxW6buh9Y]
Owen Zhang is ranked #1 on Kaggle, the online stadium for data science competitions. An engineer by training, Zhang says that data science is finding “practical solutions to not very well-defined problems,” similar to engineering. He believes that good data scientists, “otherwise known as unicorn data scientists,” have three types of expertise. Since data science deals with practical problems, the first one is being familiar with a specific domain and knowing how to solve a problem in that domain. The second is the ability to distinguish signal from noise, or understanding statistics. The third skill is software engineering.
[youtube https://www.youtube.com/watch?v=7YnVZrabTA8]
In this talk, Zhang, Chief Product Officer at DataRobot, shares his experience with open source tools in data science competitions.
Salaries of Data Scientists
Advancing Your AI Career

“AI Career Pathways” is designed to guide aspiring AI engineers in finding jobs and building a career. The table above shows Workera’s key findings about AI roles and the tasks they perform. You’ll find more insights like this in the free PDF.
From the report:
People in charge of data engineering need strong coding and software engineering skills, ideally combined with machine learning skills to help them make good design decisions related to data. Most of the time, data engineering is done using database query languages such as SQL and object-oriented programming languages such as Python, C++, and Java. Big data tools such as Hadoop and Hive are also commonly used.
Modeling is usually programmed in Python, R, Matlab, C++, Java, or another language. It requires strong foundations in mathematics, data science, and machine learning. Deep learning skills are required by some organizations, especially those focusing on computer vision, natural language processing, or speech recognition.
People working in deployment need to write production code, possess strong back-end engineering skills (in Python, Java, C++, and the like), and understand cloud technologies (for example AWS, GCP, and Azure).
Team members working on business analysis need an understanding of mathematics and data science for analytics, as well as strong communication skills and business acumen. They sometimes use programming languages such as R, Python, and Tableau, although many tasks can be carried out in a spreadsheet, PowerPoint or Keynote, or A/B testing software.
Working on AI infrastructure requires broad software engineering skills to write production code and understand cloud technologies.
9 Categories of Data Scientists

- Those strong in statistics: they sometimes develop new statistical theories for big data that even traditional statisticians are not aware of. They are experts in statistical modeling, experimental design, sampling, clustering, data reduction, confidence intervals, testing, predictive modeling and other related techniques.
- Those strong in mathematics: NSA (National Security Agency) or defense/military people working on big data, astronomers, and operations research people doing analytic business optimization (inventory management and forecasting, pricing optimization, supply chain, quality control, yield optimization) as they collect, analyse and extract value out of data.
- Those strong in data engineering, Hadoop, database/memory/file systems optimization and architecture, APIs, Analytics as a Service, optimization of data flows, data plumbing.
- Those strong in machine learning / computer science (algorithms, computational complexity)
- Those strong in business, ROI optimization, decision sciences, involved in some of the tasks traditionally performed by business analysts in bigger companies (dashboard design, metric mix selection and metric definitions, high-level database design)
- Those strong in production code development, software engineering (they know a few programming languages)
- Those strong in visualization
- Those strong in GIS, spatial data, data modeled by graphs, graph databases
- Those strong in a few of the above. After 20 years of experience across many industries, big and small companies (and lots of training), I’m strong in stats, machine learning, business and mathematics, and more than just familiar with visualization and data engineering. This could happen to you as well over time, as you build experience. I mention this because so many people still think that it is not possible to develop a strong knowledge base across multiple domains that are traditionally perceived as separate (the silo mentality). Indeed, that’s the very reason why data science was created.
Data Scientists Still Hot, Salaries Cool Off


The third annual Burtch Works Study: Salaries of Data Scientists April 2016 is out, documenting the continuation of a very favorable market for those with the sexiest job of the 21st century. However, the salaries of data scientists appear to be leveling off: Every job category except one (entry-level individual contributors) experienced a marginal single-digit shift in median base salary over the past year. This compares with the overall increase in compensation of 14% in last year’s report.
The Burtch Works Study is based on compensation and demographic data for 374 data scientists collected in interviews conducted by Burtch’s recruiting staff during the 12 months ending March 2016. It focuses on data scientists as distinguished from other analytics professionals, defining them as follows:
Data scientists apply sophisticated quantitative and computer science skills to both structure and analyze massive unstructured datasets or continuously streaming data, with the intent to derive insights and prescribe action. The depth and breadth of their coding skills distinguishes them from other predictive analytics professionals and allows them to exploit data regardless of its source, size, or format. Through the use of one or more general-purpose coding languages and data infrastructures, data scientists can tackle problems made very difficult by the size and disorganization of the data.
Here are the highlights of the new report.
Individual contributors: Median base salaries range from $97,000 at level 1 to $152,000 at level 3 plus bonuses ranging from $10,000 to $21,000 (over 73% of all individual contributors are eligible for bonuses).
Managers: Median base salaries range from $140,000 at level 1 to $240,000 at level 3 plus bonuses ranging from $15,000 to $80,000 (over 80% of managers are eligible for bonuses).
Salary changes from last year’s study: Base salaries for individual contributors have increased 7% at level 1 and 1% at level 3, while salaries remained steady at level 2. For managers, salaries remained steady at level 1 while those at level 2 increased 3%. At level 3, the median base salary decreased by 4% ($10,000).
Data scientists continue to get top compensation for analytics professionals: Data scientists earn base salaries up to 39% higher than other predictive analytics professionals depending on job category.

A shift in the educational background of data scientists: 59% of level 1 individual contributors’ highest degree is a Master’s, a significant increase from last year’s 48%.
An increase in the number of U.S. citizens in the data science talent pool: Among level 1 individual contributors, only 43% of this year’s professionals are foreign-born vs. 53% last year.
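The level 3 manager figures quoted above can be checked with a little arithmetic. The sketch below assumes that the $240,000 median is this year's value and that last year's median was therefore $10,000 higher; the report excerpt does not state the prior-year figure directly. The function name `implied_change` is made up for this illustration. The last two lines restate the Master's degree and foreign-born shares as percentage-point changes.

```python
def implied_change(current_median, reported_drop):
    """Return (implied prior-year median, percent decrease).

    Assumption: prior-year median = current median + reported drop.
    """
    prior_median = current_median + reported_drop
    pct_decrease = reported_drop / prior_median * 100
    return prior_median, pct_decrease


# Level 3 managers: $240,000 median after a reported $10,000 decrease.
prior, pct = implied_change(240_000, 10_000)
print(f"Implied prior-year median: ${prior:,}")  # Implied prior-year median: $250,000
print(f"Decrease: {pct:.1f}%")                   # Decrease: 4.0%

# Level 1 individual contributors, shifts in percentage points.
masters_shift = 59 - 48        # share whose highest degree is a Master's
foreign_born_shift = 43 - 53   # share who are foreign-born
print(masters_shift, foreign_born_shift)         # 11 -10
```

Under that assumption, the $10,000 drop works out to the 4% the report cites.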
It appears that the increase in the number of graduate-level programs in data science has started to make its mark and is contributing to an increase in the supply of entry-level data scientists with a Master’s degree. Other trends Burtch Works has observed in its recent conversations with data scientists are an increased desire to work for “more mission-driven organizations attempting to make an impact on society” rather than large companies such as Facebook or Google, and “the increasing pressure on many startups to show their value,” otherwise known as the coming burst of the Unicorn Bubble.
If we do see a contraction in startup activity and attractiveness over the next year, it may well be that larger and more stable companies, even in traditional industries, will become more desirable for budding—and even experienced—data scientists, regardless of their desire to “change the world.” The job opportunities—and the high compensation—will certainly be there as the practice of data science spreads into all corners of the economy. As Burtch Works predicts: “The use of data science will become more ubiquitous, the talent supply will improve, and there will be even more use cases for these techniques.”
Originally published on Forbes.com




