“The United States alone faces a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.” –McKinsey
Domain Expertise vs. Machine Learning: The Debate Continues
By starting to rank all the data scientists participating in its competitions, Kaggle today further advanced its argument that data science is a generic set of skills that can be applied to any problem without prior domain expertise. Talking to The New York Times’ Quentin Hardy, Jeremy Howard, Kaggle’s president and chief scientist, said that “it makes little difference for a top performer if the problem is public health or essays in Arabic. The argument that great data science is just about letting the data talk holds true.”
For a (short, recent) history of the debate, see Mike Driscoll’s summary of the deliberations of the panel arguing for and against machine learning and domain expertise at the recent Strata conference (video here), the results of a KDnuggets poll, and Mike Loukides’ passionate defense of expertise, concluding that “the real value of a subject matter expert: not just asking the right questions, but understanding the results and finding the story that the data wants to tell. Results are good, but we can’t forget that data is ultimately about insight, and insight is inextricably tied to the stories we build from the data.”
Big Data Bytes: More on What’s a Data Scientist?
Chuck Hollis calls Data Scientists “rock stars” and argues that they are “a fundamentally different profession with a different profile than the BI analysts that came before [them]. They’re more likely to have advanced degrees, frequently have a background in the sciences (vs. business) and they interact with data in more ways — and using different tools.”
Over at Vator they call them “rocket scientists” and “data junkies.” And an article in the November/December issue of the IEEE Intelligent Systems explores a “what if” scenario in which data scientists are criminals. No quotation marks.
Crowdsourcing and Big Data
The Wikipedia article on Big Data says it “requires exceptional technologies to efficiently process large quantities of data within tolerable elapsed times.” The examples given (Hadoop, MapReduce, Cloud Computing, etc.) do not include one very exceptional technology, the human brain, and a new way to harness its power, “crowdsourcing.” In the 2006 Wired article in which he coined the term, Jeff Howe wrote: “Just as distributed computing projects like UC Berkeley’s SETI@home have tapped the unused processing power of millions of individual computers, so distributed labor networks are using the Internet to exploit the spare processing power of millions of human brains.” Isn’t crowdsourcing one of the “exceptional technologies” required by Big Data?
To find out more about crowdsourcing and its role in the service of Big Data, I attended a Crowdsortium Meetup yesterday. Karim Lakhani from the Harvard Business School opened with a brief keynote, reminding us of (Bill) Joy’s Law: “No matter where you are, most smart people work for someone else.” Following him was a panel with the aforementioned Howe, Dwayne Spradlin (CEO of Innocentive), Doron Reuveni (CEO of uTest), and Dan Sullivan (CEO of Appswell), moderated expertly by Jim Savage, partner and co-founder of Longworth Venture Partners.
Big Data Bytes: Data Scientists Wanted
“Businesses now looking for talent with deep analytical and statistical backgrounds include big publishers, portals, ad networks, and e-commerce sites – just about any company that possesses massive amounts of data. Salaries range from $75,000 to $100,000 for someone starting out with strong analytical skills and background to as much as $150,000 to $300,000 for experienced professionals.”–“Wanted: Data Scientist With a Human Touch”
“[Former Vertica CEO] Lynch told his staff during the February meeting that he has no intention of retiring. Indeed, he pledged to his staff that he would assist in starting-up or otherwise supporting no less than 20 Big Data start-ups in the Boston area over the next five years.”–“HP Lead Big Data Exec Chris Lynch Resigns”
“This is the time to be super aggressive.”–Chris Lynch
“As the amount of data in the world grows, the only certainty is that there will need to be more qualified people to make sense of it. That should be good news as we stop and salute our machine overlords.”–“The Age of Big Data”
Data Scientist: 6 Definitions
From Simon Rogers, “What is a Data Scientist?”:
“Someone who can bridge the raw data and the analysis – and make it accessible. It’s a democratising role; by bringing the data to the people, you make the world just a little bit better.”–Simon Rogers
“A data scientist is that unique blend of skills that can both unlock the insights of data and tell a fantastic story via the data.”–DJ Patil
“A data scientist is someone who blends math, algorithms, and an understanding of human behavior with the ability to hack systems together to get answers to interesting human questions from data.”–Hilary Mason
“A data scientist is a rare hybrid, a computer scientist with the programming abilities to build software to scrape, combine, and manage data from a variety of sources and a statistician who knows how to derive insights from the information within. S/he combines the skills to create new prototypes with the creativity and thoroughness to ask and answer the deepest questions about the data and what secrets it holds.”–Jake Porway
“The four qualities of a great data scientist are creativity, tenacity, curiosity, and deep technical skills. They use skills in data gathering and data munging, visualization, machine learning, and computer programming to make data driven decisions and data driven products. They prefer to let the data do the talking.”–Jeremy Howard
“By definition all scientists are data scientists. In my opinion, they are half hacker, half analyst, they use data to build products and find insights. It’s Columbus meet Columbo – starry eyed explorers and skeptical detectives.”–Monica Rogati
Asking Good Questions is What Will Make Big Data Work for You
Asking good questions as the key to unleashing the potential of big data got significant blog time this past week.
What’s a Data Scientist? One More Definition
Shawn Hessinger at AllAnalytics.com summarizes yesterday’s e-chat with Gartner’s Doug Laney on what data scientists do and who they are. Gartner’s definition of a data scientist:
Responsible for mining, modeling, interpreting, blending, and extracting information from large datasets and then presenting something of use to non-data experts. These experts combine expertise in mathematics-based semantics in computer science with knowledge of the physics of digital systems.
And Laney thinks that “A good data scientist could probably be a good data scientist in any industry and with almost any problem.”
Survey: The Hunt for Unicorn Data Scientists Boosts the Salaries of Predictive Analytics Professionals
Unicorn Data Scientists (upgraded from “sexy data scientists”) are hard to find and are paid more than $200,000 per year. A new survey finds that the rising data science tide lifts the compensation of all other data analytics professionals, even if they don’t know how to code.
The Burtch Works Study: Salaries for Predictive Analytics Professionals is based on interviews with 1,757 data analytics professionals conducted over the 12 months ending April 2015 by executive recruiting firm Burtch Works. It is a unique source of information in that it does not rely on self-reporting or data provided by human resources departments. It also provides insights into how the demand for data scientists impacts the salaries of other data analytics professionals because it excludes data scientists, who are covered in a separate Burtch Works study published earlier this year (I wrote about that study here).
Burtch Works defines predictive analytics professionals as those who can “apply sophisticated quantitative skills to data describing transactions, interactions, or other behaviors of people to derive insights and prescribe actions.” Data scientists are a subset of this group—they have the “computer science skills necessary to acquire and clean or transform unstructured or continuously streaming data, regardless of its format, size, or source.”
The additional computer science skills put data scientists on top in terms of compensation, regardless of their levels of experience and managerial responsibilities, but predictive analytics professionals are keeping up, seeing their salaries and bonuses rise. For example, the median base salary for the most experienced individual contributors rose from $115,250 last year to $125,000 this year, and for managers of teams of ten or more, the median base salary rose from $225,000 to $235,000.
Predictive analytics professionals continue to benefit from the increasing demand for, and short supply of, their quantitative analysis skills. The median base salary of individual contributors varies from $76,000 for those at level 1 (0 to 3 years of experience) to $125,000 for those at level 3 (9+ years of experience). The median bonus received varies from $8,100 to $18,100, depending on job level.
The median base salary of managers varies from $125,500 for those at level 1 (1 to 3 reports) to $235,000 for those at level 3 (10+ reports). The median bonus received by managers varies from $23,000 to $75,000, depending on job level.
More and more people are attracted by the demand for data analytics professionals and the potential to become a unicorn. Data recently released by the National Center for Education Statistics, according to Phys.org, shows that bachelor’s degrees in statistics grew 17% from 2013 to 2014. This marks 15 consecutive years in which the number of undergraduates in statistics has risen, increasing by more than 300% since the 1990s. In addition, from 2000 to 2014, master’s and doctorate degrees in statistics also grew significantly, by 260% and 132%, respectively.
“The Bureau of Labor Statistics projects job growth for statisticians will increase 27% between 2012 and 2022, outpacing the projected 11% rate for all other occupations. The number of graduates in statistics each year—approximately 2,000 bachelor’s degrees, 3,000 master’s degrees and 575 doctorate degrees—seems unlikely to match this demand,” says Phys.org.
Originally published on Forbes.com


