The Big Data Interview: Sanjay Mirchandani, CIO, EMC

If data sits on a desk somewhere and is not being used, it’s an opportunity wasted

Sanjay Mirchandani believes IT has to take the lead in adding value to the business in the form of big data “addictive analytics.” Mirchandani is Chief Information Officer and COO, Global Centers of Excellence, at EMC Corporation. He has been recognized as one of Computerworld’s Premier 100 IT Leaders and Boston Business Journal’s CIOs of the Year. The following is an edited transcript of our recent phone conversation.

What would you say to a CIO who dismisses big data as just another buzzword?

I would say that for too long we have been trying to manage down information. The IT world that we have become comfortable with for many years was mostly within the enterprise, maybe connecting to some partners and customers. It was also mostly structured, basically revolving around transactional data. Today, the volume, variety, velocity and complexity of information have changed the IT landscape. These are the four things I challenge CIOs to really think about. We all know how to do structured information. But the moment you throw in unstructured and semi-structured information, life changes. This is where the value is for organizations today.

Does this also change the relationships between IT and the business?

Only IT has a complete picture of all the data in the enterprise. At the same time, IT today cannot have a monopoly on information. That changes the role and responsibilities of IT and the business. We in IT want to deliver more as a service and the business wants to consume more as a service. And IT and the business increasingly share tools and capabilities. For example, I can offer a tool like Greenplum Chorus, which is a community-based BI-data warehousing-analytics tool, where data scientists in IT work collaboratively with data scientists sitting in the business. If there’s something we can do better, we’ll take it on ourselves; if there’s something they can do better, like creating their own wrappers around the analytics, they will do it. What’s clear is that IT and the business have never been better aligned.

Posted in Big Data Analytics, Data Science, Data Scientists | Leave a comment

The Data Science Interview: Mok Oh, PayPal

To Do Data Science, You Need a Team of Specialists

Currently the Chief Scientist at PayPal, Mok Oh came on board when eBay acquired WHERE, where he was Chief Innovation Officer.  Prior to WHERE, Mok founded EveryScape, a data visualization company.  The following is an edited transcript of our recent phone conversation.

How do you define a data scientist?

Posted in Data Science, Data Scientists | Leave a comment

Mingsheng Hong: The Data Scientist is the New Product Manager

Boston’s new data science-related meetup, The Data Scientist, got off to a great start yesterday with a presentation titled “The Scientist, The Team and The Purpose,” entertainingly delivered by Mingsheng Hong, Chief Data Scientist at Hadapt.  

Posted in Big Data Analytics, Data Science, Data Scientists | Leave a comment

What Has Steve Jobs Wrought?

Steve Jobs had an insanely great ride on the waves of digitization that have transformed the way we work and play over the last few decades. But taking a cursory look at the hundreds of tributes published to commemorate the anniversary of his passing, I was surprised to find lots of trees but not a single forest. The big-picture view of Jobs’ life is sorely missing.

We hear about a lot of specific things that he did or stimulated: He was “a genius toymaker,” a “genuine human being,” a “patent warrior.” He invented this, pushed for that, and denounced the other thing. All true. But wasn’t there something bigger that connected all the dots besides his creativity and drive?

Posted in Digitization | Leave a comment

What Will Make You a Big Data Leader?

The IBM Institute for Business Value’s 2013 analytics survey polled 900 business and IT executives from 70 countries. “Leaders” (19% of the sample) were respondents who identified themselves as “substantially outperforming their market or industry peers” in response to a question the IBM Institute for Business Value has used for years across a wide variety of surveys.

The full report is here

Posted in Big Data Analytics, Big Data Practice, Stats | Leave a comment

Gartner’s Hype Cycle for Big Data

Louis Columbus at Forbes.com surveys key big data forecasts and market size estimates, including Gartner’s recent Hype Cycle for Big Data. The winning technologies in the immediate future? “Column-Store DBMS, Cloud Computing, In-Memory Database Management Systems will be the three most transformational technologies in the next five years.  Gartner goes on to predict that Complex Event Processing, Content Analytics, Context-Enriched Services, Hybrid Cloud Computing, Information Capabilities Framework and Telematics round out the technologies the research firm considers transformational.”

More on the report from Beth Schultz at AllAnalytics:

Gartner’s Hype Cycle is extremely crowded, with nearly 50 technologies represented on it. Many of them are clustered at what the firm calls the peak of inflated expectations, which it says indicates the high level of interest and experimentation in this area. As experimentation increases, many technologies will slide into the “trough of disillusionment,” as MapReduce, text analytics, and in-memory data grids have already done, the report says. This reflects the fact that, even though these technologies have been around for a while, their use as big-data technologies is a newer development.

Interestingly, Gartner says it doesn’t believe big-data will be a hyped term for too long. “Unlike other Hype Cycles, which are published year after year, we believe it is possible that within two to three years, the ability to address new sources and types, and increasing volumes of information will be ‘table stakes’ — part of the cost of entry of playing in the global economy,” the report says. “When the hype goes, so will the Hype Cycle.”

Posted in Big Data Analytics | Leave a comment

Data Science is so 1996!

 

Source: A History of the International Federation of Classification Societies

Posted in Data Science, Data Science History, Data Scientists | Leave a comment

Big Data Quotes

“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”—Dan Ariely

“I’m a data janitor. That’s the sexiest job of the 21st century. It’s very flattering, but it’s also a little baffling”–Josh Wills, a senior director of data science at Cloudera

“Given enough data, everything is statistically significant”–Douglas Merrill

Posted in Big Data Analytics, Quotes | Leave a comment

Gartner on Big Data

In its just-published Hype Cycle for Cloud Computing 2012, Gartner predicts that “Big Data will deliver transformational benefits to enterprises within 2 to 5 years, and by 2015 will enable enterprises adopting this technology to outperform competitors by 20% in every available financial metric.” The “transformational benefits,” however, will be delivered to very few enterprises according to another Gartner prediction, from December 2011: “Through 2015, more than 85 percent of Fortune 500 organizations will fail to effectively exploit big data for competitive advantage.”

Gartner currently positions Big Data just below “the peak of inflated expectations.”

Posted in Big Data Analytics, Big Data Bubble, Big Data Futures, Predictions | Leave a comment

IBM Watson and Healthcare Big Data Analytics

Presiding over the ceremonial opening of the new IBM Watson Health global headquarters in Cambridge, Mass., IBM’s senior vice president Mike Rhodin highlighted the sometimes-neglected focus of the effort to mine the ever-increasing quantities of health data. “We know that technology alone isn’t the answer,” said Rhodin. “At its core, Watson Health provides the means to orient the entire system around us.”

In a telephone conversation before the event, Dr. Lynda Chin, associate vice chancellor for health transformation at the University of Texas system, voiced a similar perspective: “Technology and innovation are the instigators for change, but they alone won’t do it. We have to think about implementation, about translating the technology into desired outcomes. Implementation is never just a technology play.”

Before assuming her current position in April, Dr. Chin was the founding chair of Genomic Medicine and scientific director of the Institute for Applied Cancer Science at The University of Texas MD Anderson Cancer Center. Two years ago, IBM and MD Anderson announced the Oncology Expert Advisor (OEA), based on IBM’s Watson data analytics engine, an expert system enabling clinicians to “uncover valuable insights from the cancer center’s rich patient and research databases.”

Dr. Chin reports that MD Anderson has by now developed two “apps,” each dealing with a different type of cancer, and is in the process of developing a third one, with each successive cancer-specific solution taking less time to develop. The ultimate goal is to make these solutions available to MD Anderson’s national and international network, so general oncologists in remote hospitals and clinics could tap into its accumulated and evolving expertise.  “To show that the OEA is a knowledge democratization tool, we have to build a network cloud infrastructure to support it. The OEA will not be useful if it doesn’t fit into the everyday life of the general oncologist.”

To achieve that goal, MD Anderson has also partnered with PwC for the development of the cloud information interchange and with AT&T for a secure, dedicated network. It is now piloting its first network link, to one of its network partners in New Jersey.

The integration with the general oncologist’s workflow is moving the expert system from a research reference resource and clinical decision support tool to helping manage the care of specific patients. “The OEA is trained to simulate the exchange between a physician and an expert,” says Dr. Chin. “So for the OEA to work, it has to be connected to the EHR system so we can learn about the patient. The OEA is trained not only to understand the profile of the patient in terms of what is the appropriate evidence-based treatment options but also sharing the experience in managing patients on that type of therapy and helping the general oncologist manage it. It’s as if the oncologist has the ability to call up the expert 24/7 to ask for advice.”

Still, one of the lessons learned so far is that “there will always be a question the OEA was not trained on,” so a teleconferencing component has been built into the system.  Other lessons include the need to provide mobile-device-based solutions, the challenge of teaching the OEA the relative value of each piece of information, and that the expert system “is very valuable from a learning perspective,” as a teaching tool for doctors in training. It also turned out that the OEA is useful in helping research nurses screen patients for clinical trials. Before, the nurses were often considering only the trials they knew about. Now they have at their disposal a clinical trial recommendation engine that screens through all the available trials and an expert system that helps with monitoring the patients participating in the clinical trial.

The development of the OEA is a never-ending journey. Healthcare is a complex and constantly changing endeavor involving research and practice, experiments and established procedures, professionals, institutions, and providers of all sorts, and most important of all, the people they serve—both patients and everybody else trying to keep themselves healthy.  Over the last decade healthcare has gone, at long last, through rapid digitization, transforming mounds of paper into electronic records and introducing computers to many aspects of the physician’s work.

As in other fields, the introduction of computer technology provides opportunities for reducing costs and increasing quality and effectiveness, while at the same time increasing the potential for errors caused by over-reliance on technology and automation. Similarly, while digitization facilitates the collection and sharing of practical knowledge and research expertise, it also produces mountains of data that threaten to impede rather than accelerate progress.

IBM Watson helps in processing and analyzing this data and presenting it as confidence-level-ranked suggestions and recommendations.  At the IBM event on September 10, Dr. Jeff Burns described how OPENPediatrics doesn’t tell the physician “do this,” but rather “tells the doctor what to think about.” OPENPediatrics is a Boston Children’s Hospital-led initiative bringing medical knowledge to pediatric caregivers worldwide (currently reaching 900 hospitals in 127 countries). IBM and Boston Children’s Hospital plan to develop “solutions for commercialization, initially pursuing applications in personalized medicine, heart health and critical care,” leveraging Watson’s genomic, image, and streaming analytics capabilities.

At the new Watson Health headquarters in Cambridge, Mass., Dr. Watson, along with 700 other IBM employees, will be joining more than 600 Massachusetts-based life sciences companies and research organizations employing about 60,000 people. IBM plans to open an interactive Watson Health Experience Center there (a demonstration center for IBM customers) and to establish a dedicated Health Research lab.

“We have to do it as a community,” Mike Rhodin declared at the event. Other participants, executives from Yale University, Sage Bionetworks, Medtronic, CVS Health, Modernizing Medicine and Teva Pharmaceuticals, echoed the sentiment. Like Rhodin, they stressed putting people at the center of their efforts to improve healthcare, highlighting the specific goal of helping patients manage their disease.

Digitization, and smart analysis of the data it generates, helps in building a community around shared knowledge. The competition for profits and prestige among healthcare providers, however, while driving the innovation that may lead to better healthcare, could also stand in the way of cooperation and knowledge sharing. It may also lead to hasty development of technology-based solutions without a careful evaluation of the actual benefits and potential risks. Let’s hope that IBM and its partners will do everything they can to uphold the medical community’s tradition of conducting controlled experiments and contributing to the ever-growing public repository of knowledge of what works and what doesn’t work in healthcare.

Originally published on Forbes.com

Posted in Misc | Leave a comment