Gartner’s Hype Cycle for Big Data

Louis Columbus at Forbes.com surveys key big data forecasts and market size estimates, including Gartner’s recent Hype Cycle for Big Data. The winning technologies in the immediate future? “Column-Store DBMS, Cloud Computing, In-Memory Database Management Systems will be the three most transformational technologies in the next five years.  Gartner goes on to predict that Complex Event Processing, Content Analytics, Context-Enriched Services, Hybrid Cloud Computing, Information Capabilities Framework and Telematics round out the technologies the research firm considers transformational.”

More on the report from Beth Schultz at AllAnalytics:

Gartner’s Hype Cycle is extremely crowded, with nearly 50 technologies represented on it. Many of them are clustered at what the firm calls the peak of inflated expectations, which it says indicates the high level of interest and experimentation in this area. As experimentation increases, many technologies will slide into the “trough of disillusionment,” as MapReduce, text analytics, and in-memory data grids have already done, the report says. This reflects the fact that, even though these technologies have been around for a while, their use as big-data technologies is a newer development.

Interestingly, Gartner says it doesn’t believe big data will be a hyped term for too long. “Unlike other Hype Cycles, which are published year after year, we believe it is possible that within two to three years, the ability to address new sources and types, and increasing volumes of information will be ‘table stakes’ — part of the cost of entry of playing in the global economy,” the report says. “When the hype goes, so will the Hype Cycle.”

Posted in Big Data Analytics | Leave a comment

Data Science is so 1996!


Source: A History of the International Federation of Classification Societies

Posted in Data Science, Data Science History, Data Scientists | Leave a comment

Big Data Quotes

“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”—Dan Ariely

“I’m a data janitor. That’s the sexiest job of the 21st century. It’s very flattering, but it’s also a little baffling.”–Josh Wills, a senior director of data science at Cloudera

“Given enough data, everything is statistically significant.”–Douglas Merrill
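Merrill’s quip reflects a real statistical pitfall: as the sample size grows, even a trivially small difference yields a tiny p-value. A minimal Python sketch (the 0.01 effect size and the sample sizes are arbitrary, chosen only for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Two groups whose true means differ by a practically meaningless 0.01.
for n in (100, 1_000_000):
    a = rng.normal(0.00, 1.0, n)
    b = rng.normal(0.01, 1.0, n)
    _, p = stats.ttest_ind(a, b)
    print(f"n={n:>9,}  p-value={p:.4g}")
# At n=100 the difference is invisible (large p); at n=1,000,000 the same
# trivial difference is "statistically significant" (p far below 0.05).
```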

Posted in Big Data Analytics, Quotes | Leave a comment

Gartner on Big Data

In its just-published Hype Cycle for Cloud Computing 2012, Gartner predicts that “Big Data will deliver transformational benefits to enterprises within 2 to 5 years, and by 2015 will enable enterprises adopting this technology to outperform competitors by 20% in every available financial metric.” The “transformational benefits,” however, will be delivered to very few enterprises according to another Gartner prediction, from December 2011: “Through 2015, more than 85 percent of Fortune 500 organizations will fail to effectively exploit big data for competitive advantage.”

Gartner currently positions Big Data just below “the peak of inflated expectations.”

Posted in Big Data Analytics, Big Data Bubble, Big Data Futures, Predictions | Leave a comment

IBM Watson and Healthcare Big Data Analytics

Presiding over the ceremonial opening of the new IBM Watson Health global headquarters in Cambridge, Mass., IBM’s senior vice president Mike Rhodin highlighted the sometimes-neglected focus of the effort to mine the ever-increasing quantities of health data. “We know that technology alone isn’t the answer,” said Rhodin. “At its core, Watson Health provides the means to orient the entire system around us.”

In a telephone conversation before the event, Dr. Lynda Chin, associate vice chancellor for health transformation at the University of Texas system, voiced a similar perspective: “Technology and innovation are the instigators for change, but they alone won’t do it. We have to think about implementation, about translating the technology into desired outcomes. Implementation is never just a technology play.”

Before assuming her current position in April, Dr. Chin was the founding chair of Genomic Medicine and scientific director of the Institute for Applied Cancer Science at The University of Texas MD Anderson Cancer Center. Two years ago, IBM and MD Anderson announced the Oncology Expert Advisor (OEA), an expert system based on IBM’s Watson data analytics engine that enables clinicians to “uncover valuable insights from the cancer center’s rich patient and research databases.”

Dr. Chin reports that MD Anderson has by now developed two “apps,” each dealing with a different type of cancer, and is in the process of developing a third one, with each successive cancer-specific solution taking less time to develop. The ultimate goal is to make these solutions available to MD Anderson’s national and international network, so general oncologists in remote hospitals and clinics could tap into its accumulated and evolving expertise.  “To show that the OEA is a knowledge democratization tool, we have to build a network cloud infrastructure to support it. The OEA will not be useful if it doesn’t fit into the everyday life of the general oncologist.”

To achieve that goal, MD Anderson has also partnered with PwC for the development of the cloud information interchange and with AT&T for a secure, dedicated network. It is now piloting its first network link, to one of its network partners in New Jersey.

The integration with the general oncologist’s workflow is moving the expert system from a research reference resource and clinical decision support tool to helping manage the care of specific patients. “The OEA is trained to simulate the exchange between a physician and an expert,” says Dr. Chin. “So for the OEA to work, it has to be connected to the EHR system so we can learn about the patient. The OEA is trained not only to understand the profile of the patient in terms of what is the appropriate evidence-based treatment options but also sharing the experience in managing patients on that type of therapy and helping the general oncologist manage it. It’s as if the oncologist has the ability to call up the expert 24/7 to ask for advice.”

Still, one of the lessons learned so far is that “there will always be a question the OEA was not trained on,” so a teleconferencing component has been built into the system.  Other lessons include the need to provide mobile-device-based solutions, the challenge of teaching the OEA the relative value of each piece of information, and that the expert system “is very valuable from a learning perspective,” as a teaching tool for doctors in training. It also turned out that the OEA is useful in helping research nurses screen patients for clinical trials. Before, the nurses were often considering only the trials they knew about. Now they have at their disposal a clinical trial recommendation engine that screens through all the available trials and an expert system that helps with monitoring the patients participating in the clinical trial.

The development of the OEA is a never-ending journey. Healthcare is a complex and constantly changing endeavor involving research and practice, experiments and established procedures, professionals, institutions, and providers of all sorts, and most important of all, the people they serve—both patients and everybody else trying to keep themselves healthy.  Over the last decade healthcare has gone, at long last, through rapid digitization, transforming mounds of paper into electronic records and introducing computers to many aspects of the physician’s work.

As in other fields, the introduction of computer technology provides opportunities for reducing costs and increasing quality and effectiveness, while at the same time increasing the potential for errors caused by over-reliance on technology and automation. Similarly, while digitization facilitates the collection and sharing of practical knowledge and research expertise, it also produces mountains of data that threaten to impede rather than accelerate progress.

IBM Watson helps in processing and analyzing this data and presenting it as confidence-level-ranked suggestions and recommendations.  At the IBM event on September 10, Dr. Jeff Burns described how OPENPediatrics doesn’t tell the physician “do this,” but rather “tells the doctor what to think about.” OPENPediatrics is a Boston Children’s Hospital-led initiative bringing medical knowledge to pediatric caregivers worldwide (currently reaching 900 hospitals in 127 countries). IBM and Boston Children’s Hospital plan to develop “solutions for commercialization, initially pursuing applications in personalized medicine, heart health and critical care,” leveraging Watson’s genomic, image, and streaming analytics capabilities.

At the new Watson Health headquarters in Cambridge, Mass., Dr. Watson—and 700 other IBM employees—will be joining more than 600 Massachusetts-based life sciences companies and research organizations employing about 60,000 people. IBM plans to open an interactive Watson Health Experience Center there (a demonstration center for IBM customers) and establish a dedicated Health Research lab.

“We have to do it as a community,” Mike Rhodin declared at the event. Other participants, executives from Yale University, Sage Bionetworks, Medtronic, CVS Health, Modernizing Medicine and Teva Pharmaceuticals, echoed the sentiment. And like Rhodin, they stressed putting people at the center of their efforts to improve healthcare, highlighting the specific goal of helping patients manage their disease.

Digitization—and smart analysis of the data it generates—helps in building a community around shared knowledge. The competition for profits and prestige among healthcare providers, however, while driving the innovation that may lead to better healthcare, also could stand in the way of cooperation and knowledge sharing. It may also lead to hasty development of technology-based solutions without a careful evaluation of the actual benefits and potential risks. Let’s hope that IBM and its partners will do everything they can to uphold the medical community’s tradition of controlled experiments and contributing to the ever-growing public repository of knowledge of what works and what doesn’t work in healthcare.

Originally published on Forbes.com

Posted in Misc | Leave a comment

Predictions for CMOs and Digital Marketing in 2015

In 2015, digital marketing budgets will increase by 8%, according to Gartner’s recent CMO Spend Report, a survey of 315 marketing decision makers representing organizations with more than $500 million in annual revenue.

Customer experience is the top innovation project for 2015, continuing its role as the top priority for marketing investment in 2014. The survey also found that

  • In 79% of companies, marketing has a budget for capital expenditures — primarily, for infrastructure and software
  • Marketers are managing a P&L and generating revenue from digital advertising, digital commerce and sale of data
  • 68% of organizations have a separate digital marketing budget — it averages a quarter of the total marketing budget
  • Two-thirds of companies are funding digital marketing via reinvestment of existing marketing budgets

Earlier this year, IBM found in its worldwide survey of CMOs that CEOs increasingly call on them for strategic input. Furthermore, the CMO now comes second only to the CFO in terms of the influence he or she exerts on the CEO. The survey also found, however, that very few CMOs have made much progress in building a robust digital marketing capability. Only 20%, for example, have set up social networks for the purpose of engaging with customers, and even fewer have integrated their company’s interactions with customers across different channels, installed analytical programs to mine customer data, or created digitally enabled supply chains to respond rapidly to changes in customer demand. The great majority of CMOs, 82% of survey respondents, felt underprepared to deal with the explosion of data.

With this as a background, here’s a summary of what digital marketing and the CMO will look like in 2015, based on observations by Scott Brinker, a leading commentator on marketing technology, Forrester, TopRank online marketing blog, Wheelhouse Advisors, and Brian Solis.

CMOs will take charge of focusing their companies on the customer

CMOs and their marketing teams will become the primary driver behind customer-centric company growth. Leveraging their knowledge of the customer and the competitive landscape, CMOs will advise and counsel CEOs on how to win, serve, and retain customers to grow the business. They will also lead organizational changes and new collaboration initiatives aimed at unifying all customer engagement activities across the enterprise.

CMOs will poach IT staff to help them manage a rapidly expanding digital marketing landscape

The number of digital marketing tools will grow in 2015, with new startups and large, established tech companies confusing CMOs even more with their numerous offerings. To help manage this embarrassment of riches and move their companies further on their digital marketing journey, CMOs will be poaching IT staff looking for new challenges and better salaries.

CMOs should expect heavy rains from proliferating digital marketing clouds

Digital marketing tools will be increasingly offered as a cloud-based solution (“marketing-as-a-service”) rather than licensed software. Cloud-based solutions will continue to expand their ecosystems, with many small software developers adding apps to existing cloud-based digital marketing platforms.

CMOs will invest in new digital marketing hot areas

Content marketing and predictive analytics will continue to be hot areas of interest and investment for CMOs, but they will be joined in 2015 by sales enablement, post-sale customer marketing, marketing finance, marketing talent management, and new tools based on the Internet of Things, allowing for the integration of offline and online experiences.

CMOs will become brand publishers

CMOs in 2015 will act as heads of a publishing house, overseeing the entire spectrum of brand engagement, increasing the quality of their output, and improving the perceived value of digital interactions with customers and prospects.

[First published on Forbes.com]

Posted in Predictions | Leave a comment

2015 Predictions for the Big Data Analytics Market

The big data and analytics market will reach $125 billion worldwide in 2015, according to IDC. Both IDC and The International Institute of Analytics (IIA) discussed their big data and analytics predictions for 2015 in separate webcasts last month. Here are the highlights:

Security will become the killer app for big data analytics

Big data analytics tools will be the first line of defense, combining machine learning, text mining and ontology modeling to provide holistic and integrated security threat prediction, detection, and deterrence and prevention programs. (IIA)

IoT analytics will be hot, with a five-year CAGR of 30%

The Internet of Things (IoT) will be the next critical focus for data/analytics services. (IDC) While the IoT trend has focused on the data generation and production (sensors) side of the equation, the “Analytics” of Things is a particular form of big data analytics that often involves anomaly detection and “bringing the data to the analytics.” (IIA)

Adoption of technology to continuously analyze streams of events will accelerate in 2015—it’s all about speed and small units of data. IoT back end as a service (BaaS) will emerge, as players—including Amazon, IBM, and Microsoft—continue to stitch together a wider variety of platform as a service (PaaS) services, including stream processing, data triggers, indexing and synchronization, and notifications, into more tightly integrated offerings directly marketed to the growing community of IoT developers. (IDC)
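To make the “Analytics of Things” idea concrete, here is a minimal, hypothetical sketch of stream anomaly detection using a rolling z-score; the window size, warm-up length, and threshold are illustrative choices, not anything prescribed by IDC or IIA:

```python
from collections import deque
import math

class RollingZScoreDetector:
    """Flags readings far outside the recent distribution; a minimal
    stand-in for the stream anomaly detection IIA describes."""
    def __init__(self, window=100, threshold=3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, reading):
        is_anomaly = False
        if len(self.values) >= 10:  # wait for a minimal history
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = math.sqrt(var)
            if std > 0 and abs(reading - mean) / std > self.threshold:
                is_anomaly = True
        self.values.append(reading)
        return is_anomaly

# Usage: feed each event from the stream as it arrives.
detector = RollingZScoreDetector()
for reading in [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.1, 1.0, 0.9, 1.1, 9.9]:
    if detector.observe(reading):
        print(f"anomaly: {reading}")
```

In a production IoT back end, the same observe-and-flag logic would typically run inside a stream-processing service rather than a Python loop.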

Buying and selling data will become the new business bread and butter

70% of large organizations already purchase external data and 100% will do so by 2019. In parallel, more organizations will begin to monetize their data by selling it or providing value-added content. (IDC) Companies will double their investment in generating new and unique data. “You can’t go into a data-based business without some unique data that gives you competitive differentiation.” 2015 will mark an inflection point of intentional investment by mainstream firms in generating and monetizing new and unique data sources. (IIA)

Companies will invest in self-service, automation, and augmentation to answer the skills shortage  

Shortage of skilled staff will persist. In the U.S. alone there will be 181,000 deep analytics roles in 2018 and 5x that many positions requiring related skills in data management and interpretation. (IDC—note that data was not provided for the supply side of the equation). Visual data discovery, an important enabler of end user self-service, will grow 2.5x faster than the rest of the market, becoming by 2018 a requirement for all enterprises. (IDC)

Automated decision-making will come of age in 2015 and the organizational implications will be profound. The very way that firms operate and organize themselves will be questioned this year as common workflows become rationalized through analytics. Key to success is the transparency of the automated systems and preparing managers “to occasionally look under the cover” of established models and algorithms.  (IIA)

Google announced on Tuesday an automated statistician research project, which aims to build an “artificial intelligence for data science.” But augmentation, rather than automation, may be the better option with knowledge workers. In 2015, companies will begin considering how to augment knowledge work jobs rather than automating them—moving from artificial intelligence to intelligent augmentation. Analytics, machine learning, and cognitive computing will increasingly take over the jobs of knowledge workers, and we will become more conscious of this in 2015. (IIA)

By 2018, half of all consumers will interact with services based on cognitive computing on a regular basis. Current personal services such as Apple Siri, Microsoft Cortana, and Google Now will raise expectations for employees to seek access to similar services in the enterprise. In 2015, PaaS competitors will step up their efforts to compete in the cognitive space. (IDC)

Image, video, and audio analytics will become pervasive

Rich media analytics will at least triple in 2015 and emerge as the key driver for big data technology investment. Already half of large organizations in North America are reporting use of rich media (video, audio, image) data as part of their big data analytics projects, and all large organizations will analyze rich media in five years. (IDC)

Storytelling will be the hot new job in analytics

The most important attribute sought in candidates for big data analytics jobs is communications skills. As organizations run into obstacles in understanding and adopting analytics, they rightly place more emphasis on communication, which is not a strength of most analysts. Companies will increasingly recognize the value of putting an experienced storyteller into the mix (IIA)… possibly looking to fill these positions from the large pool of unemployed journalists?

Posted in Big Data Analytics, Predictions | Leave a comment

How to Become a Unicorn Data Scientist and Make More than $240,000

What makes a good data scientist? And if you are a good data scientist, how much should you expect to get paid?

Owen Zhang, ranked #1 on Kaggle, the online stadium for data science competitions, lists his skills on his Kaggle profile as “excessive effort,” “luck,” and “other people’s code.” An engineer by training, Zhang says in this ODSC interview that data science is finding “practical solutions to not very well-defined problems,” similar to engineering. He believes that good data scientists, “otherwise known as unicorn data scientists,” have three types of expertise. Since data science deals with practical problems, the first one is being familiar with a specific domain and knowing how to solve a problem in that domain. The second is the ability to distinguish signal from noise, or understanding statistics. The third skill is software engineering.

Not having formal education in statistics or software engineering, Zhang explains that he acquired his data science skills by competing in Kaggle and learning from its community. No doubt being very good at learning on your own is a required skill, to say nothing about hanging out with the right people, preferably unicorn data scientists. Galit Shmueli, Professor of Business Analytics at NTHU, told rjmetrics that her one piece of advice for data scientists just getting started is to “attend a conference or two, see what people are working on, what are the challenges, and what’s the atmosphere.”

Recent data shows that unicorn data scientists can make more than $240,000 annually. This is according to the 2015 Data Science Salary Survey, in which O’Reilly Media’s John King and Roger Magoulas report the results of a survey of 600 “data practitioners” (reflecting the recency of the term, only one-quarter of the respondents have job titles that explicitly identify them as “data scientists”).

The median annual base salary of the survey sample is $91,000, and among U.S. respondents is $104,000, similar to last year’s results. 23% said that it would be “very easy” for them to find another position.

Keep in mind that “23% of the sample hold a doctorate degree,” and an additional 44% hold a master’s. The word “sample” here means, as it does in almost all other surveys today, “the people that wanted to answer our survey.” But unlike other survey report authors, King and Magoulas make sure to issue this warning: “We should be careful when making conclusions about survey data from a self-selecting sample—it is a major assumption to claim it is an unbiased representation of all data scientists and engineers… the O’Reilly audience tends to use more newer, open source tools, and underrepresents non-tech industries such as insurance and energy.”

Still, we can learn quite a lot about the background and skills required for admission into this well-paid group of data masters. Two-thirds of respondents had academic backgrounds in computer science, mathematics, statistics, or physics.

Beyond the initial training, it is important to keep abreast of the ever-changing landscape of data science tools: “It seems likely that in the long run knowing the highest paying tools will increase your chances of joining the ranks of the highest paid,” say King and Magoulas. And the most recent additions to the data science tool pantheon provide the greatest boost to salaries: “…learning Spark could apparently have more of an impact on salary than getting a PhD. Scala is another bonus: those who use both are expected to earn over $15,000 more than an otherwise equivalent data professional.”
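For readers who have never seen it, a few lines of PySpark give the flavor of the tool the survey singles out (a hypothetical word count over an assumed input file, using the RDD API current at the time of the survey):

```python
from pyspark import SparkContext

sc = SparkContext("local", "wordcount")         # run locally for illustration
lines = sc.textFile("some_text_file.txt")       # assumed input file
counts = (lines.flatMap(lambda line: line.split())  # split lines into words
               .map(lambda word: (word, 1))         # pair each word with 1
               .reduceByKey(lambda a, b: a + b))    # sum counts per word
print(counts.takeOrdered(10, key=lambda kv: -kv[1]))  # ten most frequent words
sc.stop()
```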

The bad news is that the more time spent in meetings (even for non-managers), the more money a data scientist makes. Another widely discussed unpleasant part of the job—data cleaning—is the #2 task on which data scientists spend the most time, with 39% of survey participants spending at least one hour per day on this task. The good news is that exploratory data analysis is what occupies them most, with 46% spending one to three hours per day on this task and 12% spending four hours or more.

More data on the skills employed by practicing data scientists comes from an AnalyticsWeek survey of 410 data professionals. In Optimizing Your Data Science Team, Bob E. Hayes reports that respondents were asked to indicate their level of proficiency for 25 different skills. “Solving problems with data,” says Hayes, “requires expertise across different skill areas: 1) Business, 2) Technology, 3) Programming, 4) Math & Modeling and 5) Statistics. Proficiency in each skill area is related to job role.”

All of these skills may not present themselves in a single data scientist but it’s possible to assemble all of them by putting together a top-notch data science team. In “Tips for building a data science capability” from consulting firm Booz Allen Hamilton, we learn that “rather than illuminate a single data science rock star, it is important to highlight a diversity of talent at all levels to help others self-identify with the capability. It is also a more realistic version of the truth. Very rarely will you find ‘magical unicorns’ that embody the full breadth of math and computer science skills along with the requisite domain knowledge. More often, you will build diverse teams that when combined provide you with the ‘triple-threat’ (computer science, math/statistics, and domain expertise) model needed for the toughest data science problems.”

The concept of a data science team, combining various skills and educational backgrounds, is high on the agenda of the 175-year-old American Statistical Association (ASA) which is probably looking in dismay at the oodles of funds going to establishing new data science programs and research centers at American universities, to say nothing about the salaries of data scientists as opposed to the salaries of statisticians.

The ASA issued a “policy statement” on October 1, reminding the world that statistics is one of the three disciplines “foundational to data science” (the other two being database management and distributed and parallel systems, providing a “computational infrastructure”). The statement concludes: “The next generation [of statisticians] must include more researchers with skills that cross the traditional boundaries of statistics, databases and distributed systems; there will be an ever-increasing demand for such ‘multi-lingual’ experts.”

In other words, if you aspire to a $200,000+ salary, better call yourself a data scientist and start coding.

Posted in Data Science Careers | Tagged , , | Leave a comment

3 Recent Books on Data Mining, Data Science and Big Data Analytics

Now that most of the hype around big data has died down, overtaken by the buzz over the Internet of Things, we are sometimes treated to serious discussions of the state-of-the-art (or science, for that matter) in data analysis. If you are planning a career as a data scientist or you are a business executive trying to understand what the data scientists are telling you, three recent books provide excellent and accessible overviews:

The Analytics Revolution: How to Improve Your Business By Making Analytics Operational In The Big Data Era by Bill Franks

Data Mining For Dummies by Meta S. Brown

Data Science For Dummies by Lillian Pierson

Bill Franks is the Chief Analytics Officer for Teradata, and his specialty is translating complex analytics into terms that business users can understand. The Analytics Revolution follows Franks’ Taming the Big Data Tidal Wave, which was listed on Tom Peters’ 2014 list of “Must Read” books.

“With all the hype around big data, it is easy to assume that nothing of interest was happening in the past if you don’t know better from experience,” says Franks. The over-excitement about big data caused many organizations to re-create solutions that already exist and build new groups dedicated to big data analysis, separate from their traditional analytics functions. As a correction, Franks advocates “a new, integrated, and evolved analytics paradigm,” combining traditional analytics on traditional data with big data analytics on big data.

The focus of this new approach–and the book–is Operational Analytics. It takes us from the descriptive and predictive analytics of traditional and big data analytics to prescriptive analytics. It pays close attention to the numerous decisions and actions, mostly tactical, taking place every day in your business. Most important, it places great emphasis on the process of analytics, on embedding it everywhere, and on automating the required response to events and changing conditions.

“Of course,” says Franks, “it takes human intervention to decide that an operational analytics process is needed and to build the process.”  But once the process is designed and turned on, the process accesses data, performs analysis, makes decisions, and then actually causes actions to occur. And humans are crucial to the success of this new brand of automated analytics, not only at the design phase, but also in the on-going monitoring and tweaking of the process.

An example of operational analytics is the development of an improved maintenance schedule using sensor data. There will be no value in the Internet of Things without an automated process for data analysis and action based on that analysis. “As traditional manufacturers suddenly find themselves embedding sensors, collecting data, and producing analytics for their customers, industry lines blur. Not only are new competencies needed, but the reason customers choose a product may have less to do with traditional selection criteria than with the data and analytics offered with the product,” says Franks.
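Schematically, such an operational analytics process is a sense-analyze-act loop. The sketch below only illustrates that shape; the sensor feed, the threshold, and the moving-average “model” are stand-ins, not anything Franks prescribes:

```python
import random

def read_vibration_sensor(age):
    # Stand-in for a live sensor feed; vibration drifts up as a part wears.
    return random.gauss(1.0 + 0.002 * age, 0.1)

def needs_maintenance(history):
    # A trained predictive model would sit here; a simple moving-average
    # threshold rule stands in for it in this sketch.
    return sum(history) / len(history) > 1.5

t_installed = 0
history = []
for t in range(1000):
    history.append(read_vibration_sensor(t - t_installed))  # access data
    history = history[-20:]                                 # keep recent readings
    if len(history) == 20 and needs_maintenance(history):   # analyze, decide
        print(f"t={t}: work order created automatically")   # act, no human in loop
        t_installed = t                                     # part replaced
        history = []
```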

The practical advice Franks provides in the book ranges from how to set up an analytics organization to developing and maintaining a corporate culture dedicated to discovery (finding new insights in the data and quickly acting on them) to implementing operational analytics. The Analytics Revolution is an excellent guide to the new business world of blurred industry lines and innovative data products.

If you are ready to move on from understanding the why of analytics today and how to think about it in a broad business and organizational context to a more specific understanding of the how of analyzing data, Data Mining for Dummies by Meta Brown should be your first step. The book was written for “average business people,” showing them that you don’t need to be a data scientist and “you don’t need to be an expert in statistics, a scientist, or a computer programmer to be a data miner.”

Brown is a consultant, speaker and writer with hands-on experience in business analytics. She’s the creator of the Storytelling for Data Analysts and Storytelling for Tech workshops. In Data Mining for Dummies, Brown tells the story of what data miners do.

It starts with a description of a day in the life of a data miner and goes on to discuss in clear, easy-to-understand prose all the key data mining concepts, how to plan and organize for data mining, getting data from internal, public and commercial sources, how to prepare data for exploration and predictive modeling, building predictive models, and selecting software and dealing with vendors. Data Mining for Dummies is an excellent step-by-step guide to understanding data mining and how to become a data miner.
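To give a flavor of where that workflow ends up, here is a minimal predictive-modeling sketch in Python’s scikit-learn, with a bundled public dataset standing in for the business data a data miner would prepare:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Get data, hold out a test set, fit a simple model, evaluate it.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)
print(f"held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")
```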

If you are ready to move on from understanding data mining and being a data miner to more advanced tools and applications for data analysis, Data Science for Dummies by Lillian Pierson should be your first step. The book was written for readers with some technical and math skills and experience, but it aims to provide a general introduction to one and all: “Although data science may be a new topic for many, it’s a skill that any individual who wants to stay relevant in her career field and industry needs to know.”

Pierson is a data scientist and environmental engineer and the founder of Data-Mania, a start-up that focuses mainly on web analytics, data-driven growth services, data journalism, and data science training services. “Data scientists,” she explains, “use coding, quantitative methods (mathematical, statistical, and machine learning), and highly specialized [domain] expertise in their study area to derive solutions to complex business and scientific problems.”

Data Science for Dummies is an excellent practical introduction to the fundamentals of data science.  It provides a guided tour of the data science landscape today, from data engineering and processing tools such as Hadoop and MapReduce to supervised and unsupervised machine learning, statistics and mathematical modeling, using open-source applications such as Python and the R statistical programming language, finding resources for publicly-available data, and data visualization techniques for showcasing the results of your analysis. Stressing the importance of domain expertise for data scientists, Pierson provides detailed examples of applying data science in specific domains such as journalism, environmental intelligence, and e-commerce.

“A lot of times,” says Pierson, “data scientists get so caught up analyzing the bark of the trees that they simply forget to look for their way out of the forest.” The three books reviewed here provide a handy map to the maze of data analysis and a safe conduct pass for business executives, IT staff, and students, ensuring that they successfully get in and out of the data forest. Remember, as ones and zeros eat the world, data is the new product, and operational analytics, data mining, and data science are the new process of innovation.

Posted in Big Data Analytics, Data Science | Leave a comment

Will Google Own AI? (4)

Norm Jouppi, Google:

We’ve been using compute-intensive machine learning in our products for the past 15 years. We use it so much that we even designed an entirely new class of custom machine learning accelerator, the Tensor Processing Unit. Just how fast is the TPU, actually? Today, in conjunction with a TPU talk for a National Academy of Engineering meeting at the Computer History Museum in Silicon Valley, we’re releasing a study that shares new details on these custom chips, which have been running machine learning applications in our data centers since 2015. This first generation of TPUs targeted inference (the use of an already trained model, as opposed to the training phase of a model, which has somewhat different characteristics), and here are some of the results we’ve seen:

  • On our production AI workloads that utilize neural network inference, the TPU is 15x to 30x faster than contemporary GPUs and CPUs.
  • The TPU also achieves much better energy efficiency than conventional chips, achieving 30x to 80x improvement in TOPS/Watt measure (tera-operations [trillion or 10¹² operations] of computation per Watt of energy consumed).
  • The neural networks powering these applications require a surprisingly small amount of code: just 100 to 1500 lines. The code is based on TensorFlow, our popular open-source machine learning framework.
  • More than 70 authors contributed to this report. It really does take a village to design, verify, implement and deploy the hardware and software of a system like this.
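To make the “surprisingly small amount of code” point concrete, here is a toy-scale sketch of inference, the forward-pass-only workload the first-generation TPU accelerates. It assumes a TensorFlow release with the Keras API; the model shape and inputs are invented for illustration:

```python
import numpy as np
import tensorflow as tf  # assumes a TensorFlow release with the Keras API

# A tiny stand-in network; production models are larger, but per the
# post often amount to only ~100-1500 lines of TensorFlow code.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Inference: applying an already-trained model to new inputs.
# No gradients, no weight updates; this is the workload the TPU speeds up.
batch = np.random.rand(32, 784).astype("float32")
predictions = model.predict(batch)
print(predictions.shape)  # (32, 10): one score per class per example
```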
Posted in AI, deep learning | Tagged | Leave a comment