Two new research reports on big data flesh out its early impact on enterprise IT. Continue reading
3 Scenarios for the Future of Data Science
The last couple of weeks were great for the future of data science. First Wikibon, and then IDC, promised a big data market in 2015 of between $16.9 billion (IDC) and $32.1 billion (Wikibon) (more on these reports in Chuck Hollis’ Big Data: From Meme to Marketplace). And the Strata conference showcased the promising startups and data scientists that are going to make the big big data market a reality (see Daniel Tunkelang’s excellent summary here). Tim O’Reilly aptly summarized all of this excitement by declaring that “data science is the new black.”
So where do we go from here? How will data scientists’ careers shape up over the next decade? Continue reading
Data Science: Ranking Online Influencers
Data science is the defining specialty of the business of big data and an emerging career path for those who love to find new insights in the gazillion bytes of data created each day. It’s where you find fierce competition for talent, the jobs of the future, new training programs and courses, new ventures, and new products. But where to find the data science-relevant online conversations with the most impact? Continue reading
Big Data and the Demise of Analog Retail
News today that the CEO of Best Buy has abruptly stepped down. According to the Wall Street Journal, he did so apparently because of his “personal conduct.” But the recently announced $1.7 billion quarterly loss is still the news that matters most to the future of Best Buy and other “Big Box” retailers. And while the cost of operating brick-and-mortar stores, as opposed to selling online, seems to most to be the culprit, I would argue that missing the potential of big data is–or will be–the great undoing of traditional retailers.
The Journal article quotes Craig Johnson of retail consultancy Customer Growth Partners: “Best Buy is a very dated store experience, rooted in the 1990s, and they need someone visionary.” Question is, what exactly is dated about the “dated store experience”? Johnson provides the numbers that most commentators focus on: Best Buy’s operating income per square foot was $18.52 last year (down from $50.61 in 2006). By contrast, “Apple’s retail stores reaped an astronomical $4,700 per square foot last year.”
Indeed, Best Buy finds itself “stuck in the middle,” to use Michael Porter’s terms, between Apple’s product differentiation (both the design of the actual products sold and the design of its stores) and Amazon’s cost leadership. But maybe Porter’s terms are also somewhat dated. Maybe we are witnessing the rise of a completely new big data “generic strategy” which leaves Best Buy and other traditional retailers “stuck outside.” They are left outside of the big data analytics mainstream, stuck on the bank of the river of data that is generated by online sales, watching their online competitors generating not only less-costly sales transactions but also data–on transactions, locations, logistics, customers, potential customers–and knowledge that is used in a virtuous circle to generate more sales and increase customer loyalty. Continue reading
Domain Expertise vs. Machine Learning: The Debate Continues
“The startup’s three co-founders have backgrounds in engineering and data science, but not weather, and there are no meteorological models involved. By keeping weather predictions within a two-hour window, they believe statistics are sufficient.”–Mashable in “Can Statistics Predict Weather Without Meteorologists? This App Thinks So” on Ourcast, a new app that uses real-time radar data and crowdsourcing to predict how weather at a given location will change within the next two hours. Continue reading
Top Ten Kaggle Data Scientists
1. Alexander D’yakonov
An academic in the Faculty of Computational Mathematics and Cybernetics department at Moscow State University, Alexander modestly describes his favorite problem-solving technique as “luck.” Despite this, the 33-year-old Russian has earned a reputation for using methods known for their theoretical rigor and elegant simplicity. This helped him to win the dunnhumby Shopper Challenge, which asked competitors to predict the amount and timing of supermarket shoppers’ next spends. Continue reading
Kirk Borne on Data Science: Start Small, Think Big
Domain Expertise vs. Machine Learning: The Debate Continues
By starting to rank all the data scientists participating in its competitions, Kaggle today advanced further its argument that data science is a generic set of skills that can be applied to any problem without prior domain expertise. Talking to The New York Times’ Quentin Hardy, Jeremy Howard, Kaggle’s president and chief scientist, said that “it makes little difference for a top performer if the problem is public health or essays in Arabic. The argument that great data science is just about letting the data talk holds true.”
For a (short, recent) history of the debate, see Mike Driscoll’s summary of the deliberations of the panel arguing for and against machine learning and domain expertise at the recent Strata conference (video here), the results of a KDnuggets poll, and Mike Loukides’ passionate defense of expertise, concluding that “the real value of a subject matter expert: not just asking the right questions, but understanding the results and finding the story that the data wants to tell. Results are good, but we can’t forget that data is ultimately about insight, and insight is inextricably tied to the stories we build from the data.”
Big Data Bytes: Data Scientists Wanted
“Businesses now looking for talent with deep analytical and statistical backgrounds include big publishers, portals, ad networks, and e-commerce sites – just about any company that possesses massive amounts of data. Salaries range from $75,000 to $100,000 for someone starting out with strong analytical skills and background to as much as $150,000 to $300,000 for experienced professionals.”–“Wanted: Data Scientist With a Human Touch”
“[Former Vertica CEO] Lynch told his staff during the February meeting that he has no intention of retiring. Indeed, he pledged to his staff that he would assist in starting-up or otherwise supporting no less than 20 Big Data start-ups in the Boston area over the next five years.”–“HP Lead Big Data Exec Chris Lynch Resigns”
“This is the time to be super aggressive.”–Chris Lynch
“As the amount of data in the world grows, the only certainty is that there will need to be more qualified people to make sense of it. That should be good news as we stop and salute our machine overlords.”–“The Age of Big Data”
Big Data Startups News
Wikibon’s Jeff Kelly bravely put a stake in the ground recently, first among IT market observers, by estimating the big data market at $5 billion, growing to $50 billion in five years. Kelly’s 5/50/5 plan is a great guide to the initial jostling for market position in this very promising and still-emerging market. It shows that most–if not all–of the innovation in big data came from startups, and some have already been acquired by established IT firms.
The big data market, as defined by Wikibon, includes the hardware, software, and services designed to address the shortcomings of traditional database technologies in handling large data sets. This means that the $5 billion estimate is a conservative one, as it represents a narrow market: what we could call the hardware, software, and services platforms for big data. Continue reading