3 Big Data Milestones

If you were asked to name the top three events in the history of the IT industry, which ones would you choose? Here’s my list:

June 30, 1945: John von Neumann published the “First Draft of a Report on the EDVAC,” the first documented discussion of the stored-program concept and the blueprint for computer architecture to this day.

May 22, 1973: Bob Metcalfe “banged out the memo inventing Ethernet” at Xerox Palo Alto Research Center (PARC).

March 1989: Tim Berners-Lee circulated “Information Management: A Proposal” at CERN, in which he outlined a global hypertext system.

[Note: if round numbers are your passion, you may opt—without changing the substance of this condensed history—for the ENIAC proposal of April 1943, Ethernet in 1973, and CERN making the World Wide Web available to the world free of charge in April 1993, so that 2013 marks the 70th, 40th, and 20th anniversaries of these events.]

Why bother at all to look back? And why did I select these as the top three milestones in the evolution of information technology?

Most observers of the IT industry prefer, and are expected, to talk about what’s coming, not what’s happened. But to make educated guesses about the future of the IT industry, it helps to understand its past. Here I depart from most commentators who, if they talk at all about the industry’s past, divide it into hardware-defined “eras,” usually labeled “mainframes,” “PCs,” “Internet,” and “Post-PC.”

Another way of looking at the evolution of IT is to focus on the specific contributions of technological inventions and advances to the industry’s key growth driver: digitization and the resulting growth in the amount of digital data created, shared, and consumed. Each of these three events represents a leap forward, a quantitative and qualitative change in the growth trajectory of what we now call big data.

The industry was born with the first giant calculators digitally processing and manipulating numbers and then expanded to digitize other, mostly transaction-oriented activities, such as airline reservations. But until the 1980s, all computer-related activities revolved around interactions between a person and a computer. That did not change when the first PCs arrived on the scene.

The PC was simply a mainframe on your desk. Of course, it unleashed a wonderful stream of personal productivity applications that in turn contributed greatly to the growth of enterprise data and the start of digitizing leisure-related, home-based activities. But I would argue that the major quantitative and qualitative leap occurred only when work PCs were connected to each other via Local Area Networks (LANs)—where Ethernet became the standard—and then over long distances via Wide Area Networks (WANs). With the PC, you could digitally create the memo you previously typed on a typewriter, but to distribute it, you still had to print it and make paper copies. Computer networks (and their “killer app,” email) made the entire process digital, ensuring the proliferation of the message and drastically increasing the amount of data created, stored, moved, and consumed.

Connecting people in a vast and distributed network of computers not only increased the amount of data generated but also led to numerous new ways of getting value out of it, unleashing many new enterprise applications and a new passion for “data mining.” This in turn changed the nature of competition and gave rise to new “horizontal” players, each focused on a single IT component, as opposed to the vertically integrated, “end-to-end solution” business model that had dominated the industry until then. Intel in semiconductors, Microsoft in operating systems, Oracle in databases, Cisco in networking, Dell in PCs (or rather, build-to-order PCs), and EMC in storage made the 1990s the decade of “best-of-breed,” in which many IT buyers assembled their infrastructures from components sold by focused, specialized vendors.

The next phase in the evolution of the industry, the next quantitative and qualitative leap in the amount of data generated, came with the invention of the World Wide Web (commonly mislabeled as “the Internet”). It led to the proliferation of new applications that were no longer limited to enterprise-related activities but digitized almost any activity in our lives. Most important, it provided us with tools that greatly facilitated the creation and sharing of information by anyone with access to the Internet (the open and almost free wide area network that few people cared or knew about before the invention of the World Wide Web). The work memo I once typed on a typewriter, which had become a digital document sent across the enterprise and beyond, now became my life journal, which I could discuss with others, including people on the other side of the globe I had never met. While computer networks took IT from the accounting department to all corners of the enterprise, the World Wide Web took IT to all corners of the globe, connecting millions of people. Interactive conversations and sharing of information among these millions replaced and augmented broadcasting and drastically increased (again) the amount of data created, stored, moved, and consumed. And just as in the previous phase, a crop of new players emerged, all of them born on the Web, all of them regarding IT not as a specific function responsible for running the infrastructure but as the essence of their business, with data and its analysis becoming their competitive edge.

We are probably going to see soon—and maybe are already experiencing—a new phase in the evolution of IT and a new quantitative and qualitative leap in the growth of data. The cloud (a new way to deliver IT), big data (a new attitude toward data and its potential value), and the Internet of Things (billions of connected monitoring and measurement devices, including wearable computers such as Google Glass, quantifying everything) combine to sketch for us the future of IT.

[Originally published on Forbes.com]

Posted in Big Data Analytics, Big Data Futures, Big Data History, Data Growth

The Real World of Big Data (Infographic)


Source: Wikibon Infographics

Posted in Big Data Analytics, Infographics

The Digital Marketing Landscape: 2 Views

Gartner Digital Marketing Transit Map

Source: Gartner

Marketing Technology Landscape 2012

Source: chiefmartec.com

Posted in Misc

Big Data Bytes of the Week: The End of Big Data?

The end of big data? Derrick Harris at GigaOm reports that, based on his discussions with CIOs, Opera Solutions’ CEO Arnab Gupta “thinks the analytics market will crest around the end of next year as CIOs face enormous data spikes.” Is this what he means by “turning Big Data into Small Data”? Apparently saying “crest” is a very convincing way to raise $84 million, but does he really believe that the big data flood is going to start tapering off next year?


Posted in Data Growth, Data Scientists, Predictions

The Landscape of the Internet of Things

Source: Entrepreneur and Media Lab researcher David Rose talks ‘enchanted objects’

The book on Amazon: Enchanted Objects: Design, Human Desire, and the Internet of Things

Posted in Internet of Things

Scenarios for the Future of the IT Industry

In November 1998, I sent my then-colleagues at EMC an email with the subject line “The Demise of Dell.” I wrote:

“My fail-proof crystal ball just talked to me again: By the end of 2000, Dell’s market cap (today at $80B) will be cut in half.

“Dell’s only strength, as we all know, is in low-cost distribution. Distribution (of everything) is going to undergo a radical change in the near future because of the Internet. There will be new players in the PC market that will figure out how to sell PCs over the Internet at half the cost of Dell’s distribution infrastructure. On top of that, the corporate PC market will grind to a halt and we may even see a slight drop in PC revenues in the year 2000. On the consumer side, appliances is where the action will be—led by new players.”

After I sent my email, Dell’s stock went on to almost double to a peak of just over $56 in March 2000. It closed yesterday at $14.09, about half of where it was in late 1998.


Posted in Predictions

The Digitization of IT

In many companies today, the “consumerization of IT” is turning into the “digitization of IT.” The spread of consumer technologies and services into the workplace is expanding into a larger set of IT practices, borrowed from Silicon Valley innovators and adapted to the needs of enterprises in a variety of industries.

The old IT was analog IT: a single-purpose function designed to automate specific business activities, provide support and governance, and “keep the trains running on time.” The new IT is digital: multi-purpose, extremely flexible, woven into every aspect of the business, and gushing with unexplored and previously unknown opportunities.

The digitization of IT means that the IT organization is both stable and innovative, fault-tolerant and fast-learning, reliable and experimental. It solves the paradox of “safe is risky, stable is dangerous.” It promotes a culture of constant change, which ensures resilience, and of experimentation, which safeguards continuity. Yes, you can have the best of both worlds.


Posted in Misc

Big Data: Who, Why, and How (Infographic)

“Early adopters of Big Data analytics have gained a significant lead over the rest of the corporate world. Examining more than 400 large companies, we found that those with the most advanced analytics capabilities are outperforming competitors by wide margins.”

Source: Bain & Company

Posted in Big Data Analytics, Stats

Big Data Analytics and Data Science at Netflix (Video)

Chris Pouliot, the Director of Analytics and Algorithms at Netflix: “…my team does not only do personalization for movies; we also deal with content demand prediction, helping our buyer down in Beverly Hills figure out how much we pay for a piece of content. The personalization recommendations for helping users find good movies and TV shows. Marketing analytics: how do we optimize our marketing spend? Streaming platform: how do we optimize the user experience once I press play? There’s a wide range of data, so there’s a lot of diversity. We have a lot of scale, a lot of challenging problems. The question then is, how do we attract great data scientists who can see this as a playground, a sandbox of really exciting things? Challenging problems, challenging data, great tools, and then just the ability to have fun and create great products.”
Video: http://www.youtube.com/watch?v=pJd3PKm9XUk

Posted in Big Data Analytics, Data Science, Data Science Careers, Data Scientists

The Data Science Interview: Yun Xiong, Fudan University

The Goal of Data Science is to Study the Phenomena and Laws of Datanature

Yun Xiong is an Associate Professor of Computer Science and the Associate Director of the Center for Data Science and Dataology at Fudan University, Shanghai, China. She received her Ph.D. in Computer and Software Theory from Fudan University in 2008. Her research interests include dataology and data science, data mining, and big data analysis, with a focus on developing effective and efficient data analysis techniques for applications in finance, economics, insurance, bioinformatics, and sociology. The following is an edited version of our recent email exchange.

How has data science developed in China?

Posted in Data Scientists