What Has Steve Jobs Wrought?

Steve Jobs had an insanely great ride on the waves of digitization that have transformed the way we work and play over the last few decades. But taking a cursory look at the hundreds of tributes published to commemorate the anniversary of his passing, I was surprised to find lots of trees but not a single forest. The big-picture view of Jobs’ life is sorely missing.

We hear about a lot of specific things that he did or stimulated: He was “a genius toymaker,” a “genuine human being,” a “patent warrior.” He invented this, pushed for that, and denounced the other thing. All true. But wasn’t there something bigger that connected all the dots besides his creativity and drive?

Posted in Digitization

What Will Make You a Big Data Leader?

The IBM Institute for Business Value’s 2013 analytics study surveyed 900 business and IT executives from 70 countries. “Leaders” (19% of the sample) were respondents who self-identified as “substantially outperforming their market or industry peers,” in answer to a question the Institute has used for years across a wide variety of surveys.

The full report is here

Posted in Big Data Analytics, Big Data Practice, Stats

Gartner’s Hype Cycle for Big Data

Louis Columbus at Forbes.com surveys key big data forecasts and market size estimates, including Gartner’s recent Hype Cycle for Big Data. The winning technologies in the immediate future? “Column-Store DBMS, Cloud Computing, In-Memory Database Management Systems will be the three most transformational technologies in the next five years.  Gartner goes on to predict that Complex Event Processing, Content Analytics, Context-Enriched Services, Hybrid Cloud Computing, Information Capabilities Framework and Telematics round out the technologies the research firm considers transformational.”

More on the report from Beth Schultz at AllAnalytics:

Gartner’s Hype Cycle is extremely crowded, with nearly 50 technologies represented on it. Many of them are clustered at what the firm calls the peak of inflated expectations, which it says indicates the high level of interest and experimentation in this area. As experimentation increases, many technologies will slide into the “trough of disillusionment,” as MapReduce, text analytics, and in-memory data grids have already done, the report says. This reflects the fact that, even though these technologies have been around for a while, their use as big-data technologies is a newer development.

Interestingly, Gartner says it doesn’t believe big data will be a hyped term for too long. “Unlike other Hype Cycles, which are published year after year, we believe it is possible that within two to three years, the ability to address new sources and types, and increasing volumes of information will be ‘table stakes’ — part of the cost of entry of playing in the global economy,” the report says. “When the hype goes, so will the Hype Cycle.”

Posted in Big Data Analytics

Data Science is so 1996!

 

Source: A History of the International Federation of Classification Societies

Posted in Data Science, Data Science History, Data Scientists

Big Data Quotes

“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”—Dan Ariely

“I’m a data janitor. That’s the sexiest job of the 21st century. It’s very flattering, but it’s also a little baffling”–Josh Wills, a senior director of data science at Cloudera

“Given enough data, everything is statistically significant”–Douglas Merrill
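
The last quote is easy to demonstrate with a toy calculation (my own illustration, not Merrill’s): with a large enough sample, even a practically meaningless difference clears conventional significance thresholds.

```python
# A toy demonstration of Merrill's point (illustrative numbers, not his):
# a 0.1-point difference in means on a scale with a standard deviation of 15
# is practically meaningless, yet it becomes "statistically significant"
# once the sample is large enough.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

for n in (100, 10_000, 1_000_000):
    a = rng.normal(loc=100.0, scale=15.0, size=n)  # control group
    b = rng.normal(loc=100.1, scale=15.0, size=n)  # group with a tiny "effect"
    _, p = stats.ttest_ind(a, b)
    print(f"n = {n:>9,}   p-value = {p:.3g}")

# As n grows, the p-value collapses toward zero even though the effect
# size stays negligible -- which is exactly what the quote is warning about.
```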

Posted in Big Data Analytics, Quotes

Gartner on Big Data

In its just-published Hype Cycle for Cloud Computing 2012, Gartner predicts that “Big Data will deliver transformational benefits to enterprises within 2 to 5 years, and by 2015 will enable enterprises adopting this technology to outperform competitors by 20% in every available financial metric.” The “transformational benefits,” however, will be delivered to very few enterprises according to another Gartner prediction, from December 2011: “Through 2015, more than 85 percent of Fortune 500 organizations will fail to effectively exploit big data for competitive advantage.”

Gartner currently positions Big Data just below “the peak of inflated expectations.”

Posted in Big Data Analytics, Big Data Bubble, Big Data Futures, Predictions

IBM Watson and Healthcare Big Data Analytics

Presiding over the ceremonial opening of the new IBM Watson Health global headquarters in Cambridge, Mass., IBM’s senior vice president Mike Rhodin highlighted the sometimes-neglected focus of the effort to mine the ever-increasing quantities of health data. “We know that technology alone isn’t the answer,” said Rhodin. “At its core, Watson Health provides the means to orient the entire system around us.”

In a telephone conversation before the event, Dr. Lynda Chin, associate vice chancellor for health transformation at the University of Texas system, voiced a similar perspective: “Technology and innovation are the instigators for change, but they alone won’t do it. We have to think about implementation, about translating the technology into desired outcomes. Implementation is never just a technology play.”

Before assuming her current position in April, Dr. Chin was the founding chair of Genomic Medicine and scientific director of the Institute for Applied Cancer Science at The University of Texas MD Anderson Cancer Center. Two years ago, IBM and MD Anderson announced the Oncology Expert Advisor (OEA), based on IBM’s Watson data analytics engine, an expert system enabling clinicians to “uncover valuable insights from the cancer center’s rich patient and research databases.”

Dr. Chin reports that MD Anderson has by now developed two “apps,” each dealing with a different type of cancer, and is in the process of developing a third one, with each successive cancer-specific solution taking less time to develop. The ultimate goal is to make these solutions available to MD Anderson’s national and international network, so general oncologists in remote hospitals and clinics could tap into its accumulated and evolving expertise.  “To show that the OEA is a knowledge democratization tool, we have to build a network cloud infrastructure to support it. The OEA will not be useful if it doesn’t fit into the everyday life of the general oncologist.”

To achieve that goal, MD Anderson has also partnered with PwC for the development of the cloud information interchange and with AT&T for a secure, dedicated network. It is now piloting its first network link, to one of its network partners in New Jersey.

The integration with the general oncologist’s workflow is moving the expert system from a research reference resource and clinical decision support tool to helping manage the care of specific patients. “The OEA is trained to simulate the exchange between a physician and an expert,” says Dr. Chin. “So for the OEA to work, it has to be connected to the EHR system so we can learn about the patient. The OEA is trained not only to understand the profile of the patient in terms of what is the appropriate evidence-based treatment options but also sharing the experience in managing patients on that type of therapy and helping the general oncologist manage it. It’s as if the oncologist has the ability to call up the expert 24/7 to ask for advice.”

Still, one of the lessons learned so far is that “there will always be a question the OEA was not trained on,” so a teleconferencing component has been built into the system.  Other lessons include the need to provide mobile-device-based solutions, the challenge of teaching the OEA the relative value of each piece of information, and that the expert system “is very valuable from a learning perspective,” as a teaching tool for doctors in training. It also turned out that the OEA is useful in helping research nurses screen patients for clinical trials. Before, the nurses were often considering only the trials they knew about. Now they have at their disposal a clinical trial recommendation engine that screens through all the available trials and an expert system that helps with monitoring the patients participating in the clinical trial.

The development of the OEA is a never-ending journey. Healthcare is a complex and constantly changing endeavor involving research and practice, experiments and established procedures, professionals, institutions, and providers of all sorts, and most important of all, the people they serve—both patients and everybody else trying to keep themselves healthy.  Over the last decade healthcare has gone, at long last, through rapid digitization, transforming mounds of paper into electronic records and introducing computers to many aspects of the physician’s work.

As in other fields, the introduction of computer technology provides opportunities for reducing costs and increasing quality and effectiveness, while at the same time increasing the potential for errors caused by over-reliance on technology and automation. Similarly, while digitization facilitates the collection and sharing of practical knowledge and research expertise, it also produces mountains of data that threaten to impede rather than accelerate progress.

IBM Watson helps in processing and analyzing this data and presenting it as confidence-level-ranked suggestions and recommendations.  At the IBM event on September 10, Dr. Jeff Burns described how OPENPediatrics doesn’t tell the physician “do this,” but rather “tells the doctor what to think about.” OPENPediatrics is a Boston Children’s Hospital-led initiative bringing medical knowledge to pediatric caregivers worldwide (currently reaching 900 hospitals in 127 countries). IBM and Boston Children’s Hospital plan to develop “solutions for commercialization, initially pursuing applications in personalized medicine, heart health and critical care,” leveraging Watson’s genomic, image, and streaming analytics capabilities.

At the new Watson Health headquarters in Cambridge, Mass., Dr. Watson—and 700 other IBM employees—will be joining more than 600 Massachusetts-based life sciences companies and research organizations employing about 60,000 people. IBM plans to open there an interactive Watson Health Experience Center (a demonstration center for IBM customers) and establish a dedicated Health Research lab.

“We have to do it as a community,” Mike Rhodin declared at the event. Other participants, executives from Yale University, Sage Bionetworks, Medtronic, CVS Health, Modernizing Medicine and Teva Pharmaceuticals, echoed the sentiment. And like Rhodin, they stressed putting people at the center of their efforts to improve healthcare, highlighting the specific goal of helping patients manage their disease.

Digitization—and smart analysis of the data it generates—helps in building a community around shared knowledge. The competition for profits and prestige among healthcare providers, however, while driving the innovation that may lead to better healthcare, could also stand in the way of cooperation and knowledge sharing. It may also lead to hasty development of technology-based solutions without a careful evaluation of the actual benefits and potential risks. Let’s hope that IBM and its partners will do everything they can to uphold the medical community’s tradition of conducting controlled experiments and contributing to the ever-growing public repository of knowledge about what works and what doesn’t in healthcare.

Originally published on Forbes.com

Posted in Misc

Predictions for CMOs and Digital Marketing in 2015

In 2015, digital marketing budgets will increase by 8%, according to Gartner’s recent CMO Spend Report, a survey of 315 marketing decision makers representing organizations with more than $500 million in annual revenue.

Customer experience is the top innovation project for 2015, continuing its role as the top priority for marketing investment in 2014. The survey also found that

  • In 79% of companies, marketing has a budget for capital expenditures — primarily, for infrastructure and software
  • Marketers are managing a P&L and generating revenue from digital advertising, digital commerce and sale of data
  • 68% of organizations have a separate digital marketing budget — it averages a quarter of the total marketing budget
  • Two-thirds of companies are funding digital marketing via reinvestment of existing marketing budgets

Earlier this year, IBM found in its worldwide survey of CMOs that CEOs increasingly call on them for strategic input. Furthermore, the CMO now comes second only to the CFO in terms of the influence he or she exerts on the CEO. The survey also found, however, that very few CMOs have made much progress in building a robust digital marketing capability: only 20%, for example, have set up social networks for the purpose of engaging with customers, and even fewer have integrated their company’s interactions with customers across different channels, installed analytical programs to mine customer data, or created digitally enabled supply chains to respond rapidly to changes in customer demand. A large majority of CMOs (82% of survey respondents) felt underprepared to deal with the explosion of data.

With this as a background, here’s a summary of what digital marketing and the CMO will look like in 2015, based on observations by Scott Brinker, a leading commentator on marketing technology, Forrester, TopRank online marketing blog, Wheelhouse Advisors, and Brian Solis.

CMOs will take charge of focusing their companies on the customer

CMOs and their marketing teams will become the primary drivers behind customer-centric company growth. Leveraging their knowledge of the customer and the competitive landscape, CMOs will advise and counsel CEOs on how to win, serve, and retain customers to grow the business. They will also lead organizational changes and new collaboration initiatives aimed at unifying all customer engagement activities across the enterprise.

CMOs will poach IT staff to help them manage a rapidly expanding digital marketing landscape

The number of digital marketing tools will grow in 2015, with new startups and large, established tech companies confusing CMOs even more with their numerous offerings. To help manage this embarrassment of riches and move their companies further along their digital marketing journey, CMOs will be poaching IT staff looking for new challenges and better salaries.

CMOs should expect heavy rains from proliferating digital marketing clouds

Digital marketing tools will increasingly be offered as cloud-based solutions (“marketing-as-a-service”) rather than as licensed software. Cloud-based solutions will continue to expand their ecosystems, with many small software developers adding apps to existing cloud-based digital marketing platforms.

CMOs will invest in new digital marketing hot areas

Content marketing and predictive analytics will continue to be hot areas of interest and investment for CMOs, but they will be joined in 2015 by sales enablement, post-sale customer marketing, marketing finance, marketing talent management, and new tools based on the Internet of Things, allowing for the integration of offline and online experiences.

CMOs will become brand publishers

CMOs in 2015 will act as heads of a publishing house, overseeing the entire spectrum of brand engagement, increasing the quality of their output, and improving the perceived value of digital interactions with customers and prospects.

[First published on Forbes.com]

Posted in Predictions

2015 Predictions for the Big Data Analytics Market

The big data and analytics market will reach $125 billion worldwide in 2015, according to IDC. Both IDC and The International Institute of Analytics (IIA) discussed their big data and analytics predictions for 2015 in separate webcasts last month. Here are the highlights:

Security will become the killer app for big data analytics

Big data analytics tools will be the first line of defense, combining machine learning, text mining and ontology modeling to provide holistic and integrated security threat prediction, detection, and deterrence and prevention programs. (IIA)

IoT analytics will be hot, with a five-year CAGR of 30%

The Internet of Things (IoT) will be the next critical focus for data/analytics services. (IDC) While the IoT trend has focused on the data generation and production (sensors) side of the equation, the “Analytics” of Things is a particular form of big data analytics that often involves anomaly detection and “bringing the data to the analytics.” (IIA)

Adoption of technology to continuously analyze streams of events will accelerate in 2015—it’s all about speed and small units of data. IoT back end as a service (BaaS) will emerge, as players—including Amazon, IBM, and Microsoft—continue to stitch together a wider variety of platform as a service (PaaS) services, including stream processing, data triggers, indexing and synchronization, and notifications, into more tightly integrated offerings directly marketed to the growing community of IoT developers. (IDC)
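
To make the “Analytics of Things” idea more concrete, here is a minimal sketch of the kind of streaming anomaly detection described above: score each incoming sensor reading against a rolling window and flag outliers. The window size, threshold, and simulated feed are my own illustrative assumptions, not part of either firm’s forecast.

```python
# A minimal streaming anomaly detector (rolling z-score). All parameters
# and the simulated sensor feed below are illustrative assumptions.
from collections import deque
import math
import random

def stream_anomalies(readings, window=60, threshold=3.0):
    """Yield (index, value) for readings that deviate from the rolling mean
    of the previous `window` readings by more than `threshold` std devs."""
    recent = deque(maxlen=window)
    for i, x in enumerate(readings):
        if len(recent) == window:
            mean = sum(recent) / window
            std = math.sqrt(sum((v - mean) ** 2 for v in recent) / window) or 1e-9
            if abs(x - mean) / std > threshold:
                yield i, x
        recent.append(x)

# Simulated sensor feed: steady readings around 20 with an occasional spike.
random.seed(1)
feed = [80.0 if i % 500 == 0 else random.gauss(20.0, 0.5) for i in range(1, 2001)]

for i, value in stream_anomalies(feed):
    print(f"anomaly at reading {i}: {value:.1f}")
```

In a production IoT setting, logic like this would typically run inside the kind of stream-processing service IDC mentions rather than in a single Python loop, but the scoring idea is the same.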

Buying and selling data will become the new business bread and butter

70% of large organizations already purchase external data and 100% will do so by 2019. In parallel, more organizations will begin to monetize their data by selling it or providing value-added content. (IDC) Companies will double their investment in generating new and unique data. “You can’t go into a data-based business without some unique data that gives you competitive differentiation.” 2015 will mark an inflection point of intentional investment by mainstream firms in generating and monetizing new and unique data sources. (IIA)

Companies will invest in self-service, automation, and augmentation to answer the skills shortage  

Shortage of skilled staff will persist. In the U.S. alone there will be 181,000 deep analytics roles in 2018 and 5x that many positions requiring related skills in data management and interpretation. (IDC—note that data was not provided for the supply side of the equation). Visual data discovery, an important enabler of end user self-service, will grow 2.5x faster than the rest of the market, becoming by 2018 a requirement for all enterprises. (IDC)

Automated decision-making will come of age in 2015 and the organizational implications will be profound. The very way that firms operate and organize themselves will be questioned this year as common workflows become rationalized through analytics. Key to success is the transparency of the automated systems and preparing managers “to occasionally look under the cover” of established models and algorithms.  (IIA)

Google announced Tuesday an automated statistician research project, which aims to build an “artificial intelligence for data science.” But augmentation, rather than automation, may be the better option for knowledge workers. In 2015, companies will begin considering how to augment knowledge work jobs rather than automating them—moving from artificial intelligence to intelligent augmentation. Analytics, machine learning, and cognitive computing will increasingly take over the jobs of knowledge workers, and we will become more conscious of this in 2015. (IIA)

By 2018, half of all consumers will interact with services based on cognitive computing on a regular basis. Current personal services such as Apple Siri, Microsoft Cortana, and Google Now will raise expectations for employees to seek access to similar services in the enterprise. In 2015, PaaS competitors will step up their efforts to compete in the cognitive space. (IDC)

Image, video, and audio analytics will become pervasive

Rich media analytics will at least triple in 2015 and emerge as the key driver for big data technology investment. Already half of large organizations in North America are reporting use of rich media (video, audio, image) data as part of their big data analytics projects, and all large organizations will analyze rich media in five years. (IDC)

Storytelling will be the hot new job in analytics

The most important attribute sought in candidates for big data analytics jobs is communications skills. As organizations run into obstacles in understanding and adopting analytics, they rightly place more emphasis on communication, which is not a strength of most analysts. Companies will increasingly recognize the value of putting an experienced storyteller into the mix (IIA)… possibly looking to fill these positions from the large pool of unemployed journalists?

Posted in Big Data Analytics, Predictions

How to Become a Unicorn Data Scientist and Make More than $240,000

What makes a good data scientist? And if you are a good data scientist, how much should you expect to get paid?

Owen Zhang, ranked #1 on Kaggle, the online stadium for data science competitions, lists his skills on his Kaggle profile as “excessive effort,” “luck,” and “other people’s code.” An engineer by training, Zhang says in this ODSC interview that data science is finding “practical solutions to not very well-defined problems,” similar to engineering. He believes that good data scientists, “otherwise known as unicorn data scientists,” have three types of expertise. Since data science deals with practical problems, the first one is being familiar with a specific domain and knowing how to solve a problem in that domain. The second is the ability to distinguish signal from noise, or understanding statistics. The third skill is software engineering.

Not having formal education in statistics or software engineering, Zhang explains that he acquired his data science skills by competing in Kaggle and learning from its community. No doubt being very good at learning on your own is a required skill, to say nothing about hanging out with the right people, preferably unicorn data scientists. Galit Shmueli, Professor of Business Analytics at NTHU, told rjmetrics that her one piece of advice for data scientists just getting started is to “attend a conference or two, see what people are working on, what are the challenges, and what’s the atmosphere.”

Recent data shows that unicorn data scientists can make more than $240,000 annually. This is according to the 2015 Data Science Salary Survey, in which O’Reilly Media’s John King and Roger Magoulas report the results of a survey of 600 “data practitioners” (reflecting the recency of the term, only one-quarter of the respondents have job titles that explicitly identify them as “data scientists”).

The median annual base salary of the survey sample is $91,000, and among U.S. respondents is $104,000, similar to last year’s results. 23% said that it would be “very easy” for them to find another position.

Keep in mind that “23% of the sample hold a doctorate degree,” and an additional 44% hold a master’s. The word “sample” here means, as it does in almost all other surveys today, “the people that wanted to answer our survey.” But unlike other survey report authors, King and Magoulas make sure to issue this warning: “We should be careful when making conclusions about survey data from a self-selecting sample—it is a major assumption to claim it is an unbiased representation of all data scientists and engineers… the O’Reilly audience tends to use more newer, open source tools, and underrepresents non-tech industries such as insurance and energy.”
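
Their caveat is worth taking seriously. A toy simulation (my own, not from the O’Reilly report, with made-up numbers) shows how a self-selecting sample can skew a salary estimate when one group of practitioners is more inclined to respond:

```python
# Hypothetical population: 80% on a "traditional" stack, 20% on a "new" stack
# that pays more. If new-stack users are five times more likely to answer the
# survey, the self-selected sample overstates the population average.
import random

random.seed(42)

population = ([random.gauss(85_000, 10_000) for _ in range(80_000)] +
              [random.gauss(115_000, 10_000) for _ in range(20_000)])
response_prob = [0.02] * 80_000 + [0.10] * 20_000  # new-stack users respond more

sample = [salary for salary, p in zip(population, response_prob)
          if random.random() < p]

print(f"population mean:           ${sum(population) / len(population):,.0f}")
print(f"self-selected sample mean: ${sum(sample) / len(sample):,.0f}")
# The sample mean comes out noticeably higher than the population mean,
# purely because of who chose to respond.
```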

Still, we can learn quite a lot about the background and skills required for admission into this well-paid group of data masters. Two-thirds of respondents had academic backgrounds in computer science, mathematics, statistics, or physics.

Beyond the initial training, it is important to keep abreast of the ever-changing landscape of data science tools: “It seems likely that in the long run knowing the highest paying tools will increase your chances of joining the ranks of the highest paid,” say King and Magoulas. And the most recent additions to the data science tool pantheon provide the greatest boost to salaries: “…learning Spark could apparently have more of an impact on salary than getting a PhD. Scala is another bonus: those who use both are expected to earn over $15,000 more than an otherwise equivalent data professional.”

The bad news is that the more time spent in meetings (even for non-managers), the more money a data scientist makes. Another widely discussed unpleasant part of the job—data cleaning—is the #2 task on which data scientists spend the most time, with 39% of survey participants spending at least one hour per day on this task. The good news is that exploratory data analysis is what occupies them most, with 46% spending one to three hours per day on this task and 12% spending four hours or more.
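
For readers unfamiliar with what those hours actually look like, here is a small, hypothetical pandas sketch of the two tasks; the file name and column names are assumptions made up for illustration, not drawn from the survey.

```python
# Hypothetical cleaning + exploratory pass over a raw transactions file.
import pandas as pd

raw = pd.read_csv("transactions.csv")  # made-up input file

# --- data "janitor" work: deduplicate, normalize names, coerce types ---
clean = (
    raw.drop_duplicates()
       .rename(columns=str.lower)
       .assign(amount=lambda d: pd.to_numeric(d["amount"], errors="coerce"))
       .dropna(subset=["amount", "customer_id"])
)

# --- exploratory data analysis: where most reported hours go ---
print(clean.describe(include="all"))  # quick distribution summary per column
top_customers = (clean.groupby("customer_id")["amount"]
                      .agg(["count", "mean", "sum"])
                      .sort_values("sum", ascending=False)
                      .head(10))
print(top_customers)  # first look at which customers drive revenue
```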

More data on the skills employed by practicing data scientists comes from an AnalyticsWeek survey of 410 data professionals. In Optimizing Your Data Science Team, Bob E. Hayes reports that respondents were asked to indicate their level of proficiency in 25 different skills. “Solving problems with data,” says Hayes, “requires expertise across different skill areas: 1) Business, 2) Technology, 3) Programming, 4) Math & Modeling and 5) Statistics. Proficiency in each skill area is related to job role.”

All of these skills may not be present in a single data scientist, but it’s possible to assemble them by putting together a top-notch data science team. In “Tips for building a data science capability” from consulting firm Booz Allen Hamilton, we learn that “rather than illuminate a single data science rock star, it is important to highlight a diversity of talent at all levels to help others self-identify with the capability. It is also a more realistic version of the truth. Very rarely will you find ‘magical unicorns’ that embody the full breadth of math and computer science skills along with the requisite domain knowledge. More often, you will build diverse teams that when combined provide you with the ‘triple-threat’ (computer science, math/statistics, and domain expertise) model needed for the toughest data science problems.”

The concept of a data science team, combining various skills and educational backgrounds, is high on the agenda of the 175-year-old American Statistical Association (ASA), which is probably looking in dismay at the oodles of funds going toward establishing new data science programs and research centers at American universities, to say nothing of the salaries of data scientists compared with the salaries of statisticians.

The ASA issued a “policy statement” on October 1, reminding the world that statistics is one of the three disciplines “foundational to data science” (the other two being database management and distributed and parallel systems, providing a “computational infrastructure”). The statement concludes with “The next generation [of statisticians] must include more researchers with skills that cross the traditional boundaries of statistics, databases and distributed systems; there will be an ever-increasing demand for such ‘multi-lingual’ experts.”

In other words, if you aspire to a $200,000+ salary, better call yourself a data scientist and start coding.

Posted in Data Science Careers