Top 5 Data Science Influencers: June 17, 2012

1. Harish Kotadia, Infosys, @HKotadia

2. Gregory Piatetsky, KDnuggets, @kdnuggets

3. Avkash Chauhan, Microsoft, @avkashchauhan

4. Alex Popescu, MyNoSQL, @al3xandru

5. Gil Press, What’s the Big Data? @GilPress

Source: Traackr

Posted in Data Science

Facebook’s IPO and the Laws of Big Data

Without using any predictive analytics tools, I confidently predict that Facebook’s IPO will give rise to more vocal demands for people to “get a cut” of its—and other social media companies’—profits. People deserve, so the argument goes, a share of any profits derived from mining the social data pool which they have so willingly helped create. Occupy Facebook, anyone?

But before you set up a tent in Menlo Park, consider this proposition: The value of personal data is zero. Personal data is not worth much if it’s kept personal, and a sample of one is good for answering only a very limited set of questions. Personal data gains value when it is shared, when it is combined with and compared to other data.

Posted in Big Data Analytics, Data Science

Top 5 Data Science Influencers: June 3, 2012

1. Harish Kotadia, Infosys, @HKotadia

2. Gregory Piatetsky, KDnuggets, @kdnuggets

3. Avkash Chauhan, Microsoft, @avkashchauhan

4. Alex Popescu, MyNoSQL, @al3xandru

5. David Smith, Revolution Analytics, @revodavid

Source: Traackr

Posted in Misc

Top 5 Data Science Influencers: June 10, 2012

1. Harish Kotadia, Infosys, @HKotadia

2. Gregory Piatetsky, KDnuggets, @kdnuggets

3. Avkash Chauhan, Microsoft, @avkashchauhan

4. Alex Popescu, MyNoSQL, @al3xandru

5. David Smith, Revolution Analytics, @revodavid

Source: Traackr

Posted in Misc

Machines vs. Models, Noise vs. Signal

An excerpt from Nassim Taleb’s forthcoming book, Antifragile, was posted yesterday on the Farnam Street blog. In “Noise and Signal,” Taleb says: “In business and economic decision-making, data causes severe side effects—data is now plentiful thanks to connectivity; and the share of spuriousness in the data increases as one gets more immersed into it. A not well discussed property of data: it is toxic in large quantities—even in moderate quantities…. the best way… to mitigate interventionism is to ration the supply of information, as naturalistically as possible. This is hard to accept in the age of the internet. It has been very hard for me to explain that the more data you get, the less you know what’s going on, and the more iatrogenics you will cause.”

Posted in AI, Big Data Analytics, Machine Learning

The Reality of Big Data: Findings from Recent Surveys

Big data tools and technologies emerged first from the companies the Web gave birth to: Google, Facebook, Yahoo, and Amazon. No wonder that the term has become associated primarily with the ability to process and analyze large sets of unstructured, web-generated data for consumer- and market-related activities such as targeted advertising or improving customer loyalty.

Posted in Misc

New Research Reports on Big Data

Two new research reports on big data flesh out its early impact on enterprise IT.

Posted in Big Data Analytics, Data Science

3 Scenarios for the Future of Data Science

The last couple of weeks were great for the future of data science. First Wikibon, and then IDC, promised a big data market in 2015 of between $16.9 billion (IDC) and $32.1 billion (Wikibon) (more on these reports in Chuck Hollis’ Big Data: From Meme to Marketplace). And the Strata conference showcased the promising startups and data scientists that are going to make the big “big data” market a reality (see Daniel Tunkelang’s excellent summary here). Tim O’Reilly aptly summarized all of this excitement by declaring that “data science is the new black.”

So where do we go from here? How will data scientists’ careers shape up over the next decade?

Posted in Data Science, Predictions

Data Science: Ranking Online Influencers

Data science is the defining specialty of the business of big data and an emerging career path for those who love to find new insights in the gazillion bytes of data created each day. It’s where you find fierce competition for talent, the jobs of the future, new training programs and courses, new ventures, and new products. But where do you find the online conversations about data science that have the most impact?

Posted in Data Science, Data Scientists

Big Data and the Demise of Analog Retail

News came today that the CEO of Best Buy has abruptly stepped down. According to the Wall Street Journal, he did so apparently because of his “personal conduct.” But the recently announced $1.7 billion quarterly loss is still the news that matters most to the future of Best Buy and other “Big Box” retailers. And while the cost of operating brick-and-mortar stores, as opposed to selling online, seems to most to be the culprit, I would argue that missing the potential of big data is, or will be, the great undoing of traditional retailers.

The Journal article quotes Craig Johnson of retail consultancy Customer Growth Partners: “Best Buy is a very dated store experience, rooted in the 1990s, and they need someone visionary.” Question is, what exactly is dated about the “dated store experience”? Johnson provides the numbers that most commentators focus on: Best Buy’s operating income per square foot was $18.52 last year (down from $50.61 in 2006). By contrast, “Apple’s retail stores reaped an astronomical $4,700 per square foot last year.”

Indeed, Best Buy finds itself “stuck in the middle,” to use Michael Porter’s terms, between Apple’s product differentiation (both the design of the actual products sold and the design of its stores) and Amazon’s cost leadership. But maybe Porter’s terms are also somewhat dated. Maybe we are witnessing the rise of a completely new big data “generic strategy” which leaves Best Buy and other traditional retailers “stuck outside.” They are left outside of the big data analytics mainstream, stuck on the bank of the river of data that is generated by online sales, watching their online competitors generate not only less-costly sales transactions but also data (on transactions, locations, logistics, customers, potential customers) and knowledge that is used in a virtuous circle to generate more sales and increase customer loyalty.

Posted in Misc