Using Deep Learning in Medicine (Video)
https://youtu.be/3WBpJKDv1U8
Jeremy Howard, founder and CEO of Enlitic, argues that the release of Google’s TensorFlow will have an impact similar to the release of the C programming language and that Deep Learning will have an impact similar to that of the advent of the “Internet” (I’m sure he knows it’s actually the World Wide Web) in the 1990s.
7 Predictions for 2016 from IDC
Issuing IDC’s top 10 predictions for 2016, chief analyst Frank Gens advised enterprises to transform or die, noting that the overarching theme for 2016 is “digital transformation scales up.” Scale is the “critical ingredient in the unfolding battle for digital success,” said Gens, warning that while for some enterprises, “scale wins,” for others “scale kills.”
Digital transformation (DX) will drive “everything that matters in IT” over the next several years. Succeeding in what IDC calls the DX economy means using technologies such as mobile, cloud, big data analytics, IoT, AI and robotics to “create competitive advantage through new offerings, new business models, and new customer, supplier, and distributor relationships.”
Here’s my edited version of IDC’s 2016 predictions, based on the IT Industry, Digital Transformation, and CIO Agenda webcasts.
Digital transformation will reach massive scale
Over the next three to five years enterprises will commit to digital transformation on a massive scale, leading to the emergence of the DX Economy. Acknowledging the seemingly paradoxical nature of the term “customer intimacy at scale,” IDC predicts it will be the biggest, most complex enterprise-wide DX initiative that organizations will have to face, requiring a fundamental cultural and operational transformation. In addition, enterprises will pursue a rapid expansion of their customer base and they will have to deliver dramatically more personalized customer service.
Because the DX economy is depressing prices, scaling the customer base is a must, says IDC, prompting enterprises to “serve the global 50 million.” Scaling up your customer population, and how you interact with it, will be mandatory for increasing revenue and sustaining market share.
“IT spending will be driven by doing entirely new things rather than using new technologies to do old things,” said Gens on the IDC webcast.
Every company will be a software company
The DX economy – operating at scale – will be “driven primarily by code.” Enterprises’ ability to grow and compete will increasingly depend on their digital “innovation capacity”—the size and talent of their software development teams. In the DX economy, “code plus data equal innovation.”
This will make “software developer” the sexiest job of the 21st century. Enterprises will compete to hire for these essential jobs, and software developers will be key to creating the new revenue-generating, productivity-enhancing applications that tap into the potential of new technologies (e.g., IoT) and embed data analytics.
The roles and activities of CxOs will be re-defined
In many companies, CEOs will be directly involved in digital transformation initiatives, ensuring digital transformation is a primary component of the company’s overall strategy. CEOs will actively participate in the recruitment of the best software developers and “must understand that being a technology company takes more than bold statements.” Line-of-Business (LOB) executives will manage software developers and will have to become adept at technology governance.
CIOs will have to adapt to an increasingly market-facing role for them and their organizations and also work on developing and maintaining internal relationships, partnering with business executives and reaching out to the software developers working in the business units.
A new executive position, that of the Chief Digital Officer (CDO), is needed, but it could be filled by business-savvy CIOs. The CIO must have a plan for transforming IT from custodian of the infrastructure into more of a service provider. While digital transformation threatens to marginalize IT, IT (and the CIO) could instead be “a true change agent” and a “transformation engine.”
The Cloud will be the new IT
“Cloud First” will become the new mantra for enterprise IT as the “cloud is the new core of enterprise IT,” said IDC’s Gens. The most “functionally-rich IT offerings” will be found in the cloud.
Enterprises with advanced DX initiatives will create and/or partner with industry cloud platforms to scale up their digital supply and distribution networks. While there will be major consolidation in the public cloud market (down to six “mega-platforms”), there will be rapid proliferation of industry cloud platforms. With industry cloud platforms, GE and others are building epicenters of growth, creating innovation communities, and re-inventing their industries and how to source and distribute innovation at massive scale.
Big data becomes bigger and richer (and enriching)
Success in the DX economy will depend on the ability to build robust “data pipelines” that flow both in and out of the enterprise. Data analytics (Cognitive Services) will be embedded in new apps, and the top new investment areas over the next couple of years will be Contextual Understanding and Automated Next Best Action capabilities. Companies will look to monetize their own data, participating in a “data race” to fuel innovation.
Mastering “cognitive” is a must, says IDC, recommending making machine learning a top priority for 2016—“lots of startups in your industry are already using it to disrupt you.”
The IoT will be a key driver of the DX Economy
IoT devices and solutions have the potential to redefine competitive advantage in virtually every industry. IDC predicts that the most active IoT development will cluster around the manufacturing, transportation, retail, and healthcare industries.
The coming IT Industry shakeout
A significant portion of today’s IT suppliers will be acquired, merged, downsized, or significantly repositioned. In this environment, enterprises will have to constantly monitor and assess the solutions offered by their suppliers and partners and be prepared to realign these relationships as needed. Possibly more than in any other economic sector, the players in the IT industry need to transform or die.
In the face of the increased volatility of the IT industry, IDC advises IT buyers to increase investments in vendor/partner management, pursue open and multi-source strategies, and find ways to share risks and rewards.
———————
As Gens reminded listeners to this year’s webcast, IDC predicted last year that “by 2018, one third of the top 20 in every industry will be disrupted by digitally transformed competitors.” Gens then advised companies in all industries to “Amazon” themselves, but also predicted that the best job of “Amazoning” will be done by Amazon itself.
Indeed, Amazon and other Web-born companies (also called “digital natives”) such as Google and Facebook are not “digitally transformed”; they define “digital.” This raises the question: is the “DX Economy” going to be dominated by relatively young companies that have been in the business of digitally transforming everything in their path, regarding “IT” as their business rather than as a “function” or a “service”?
IDC predicts that digital transformation will be a key strategy for 67% of the Global 2000 by 2018 and that by 2017, over 50% of the IT budget will be spent on new technologies. Judging by CIO surveys, this implies an unprecedentedly rapid transformation in the current allocation of IT dollars. As just one example, Deloitte found in a recent survey of more than 1,200 senior technology executives, working mostly for large companies worldwide, that only 15% of CIOs are investing in emerging technologies and only 16% of the IT budget is spent on “business innovation.” That’s a stark contrast to Gens’ declaration that over the next few years “IT spending will be driven by doing entirely new things.”
There is also almost no reference in IDC’s predictions to possible challenges and roadblocks on the way to a full-blown DX Economy. The only factors mentioned that could possibly slow down the predicted rapid transformation (and they were not presented as speed bumps) were a privacy backlash and security (the latter only in the context of IoT).
But that’s the nature of the business. Soothsayers are not expected to equivocate, to talk about possible scenarios, or to explain why their forecasts could be wrong or unrealistic. They are expected to provide quantitative certainties and warn us that if we don’t follow the one and only path to the future, we will be “disrupted.” Still, I believe that what they tell us is very valuable as a summary of contemporary conventional wisdom and of what has happened over the previous few years, which can serve as a solid foundation for discussion and debate.
IDC sees the DX Economy in its crystal ball, Gartner conjures the Programmable Economy (“a massive technology-enabled transformation of traditional concepts of value exchange, empowering individuals and smart machines to both define value and determine how it is exchanged”).
Industry analyst firms such as IDC and Gartner are expected to tell us what the future will look like and they provide the goods. Inventors such as Tim Berners-Lee and ambitious entrepreneurs such as the people that founded Amazon, Google, and Facebook, create the future. And most of the time, it is an unexpected future that no one has predicted.
Originally published on Forbes.com
28 Numbers from IDC about IT Futures

By 2020, almost 50% of IT budgets will be tied into DX (digital transformation) initiatives.
By 2018, Line of Business (LOB) executives will control 45%+ of all IT spending worldwide, over 60% in the U.S.
By 2017, over 50% of IT spending will be for new technologies (mobile, cloud, big data, etc.).
By 2018, 35% of IT resources will be spent to support the creation of new digital revenue streams.
By 2017, 40% of services managed by IT will be business services oriented to augmented experience and smart products.
By 2018, at least 50% of IT spending will be cloud based.
By 2018, 65% of all enterprise IT assets will be housed offsite and 33% of IT staff will be employed by third-party managed service providers.
By 2020, more than 30% of current IT vendors will not exist as we know them today.
By 2018, 75-80% of public cloud services will be consolidated to 6 platform vendors, including Amazon, Google, IBM, Microsoft, and Salesforce.
By 2018, over 50% of enterprises will create and/or partner with Industry Cloud Platforms (ICPs) to distribute their own innovations and source others’.
By 2018, over 80% of enterprises with advanced DX initiatives will plug into these communities, and the number of ICPs will grow 4-5X, from about 100 today to close to 500.
By 2020, the percentage of enterprises creating advanced DX initiatives will more than double from today’s 22% to almost 50%.
By 2018, 67% of the CEOs of Global 2000 enterprises will have DX at the center of their corporate strategy.
By 2017, 60% of enterprises with a DX strategy will deem it too critical for any one functional area and will create an independent corporate executive position to oversee the implementation.
By 2017, 80% of global CIOs will initiate a data transformation and governance framework to turn information into a competitive business differentiator.
By 2018, 75% of the G2000 will deploy “Digital Twins” of their products/services, supply network, sales channels, and operations.
By 2020, 60% of the G2000 will double their productivity by digitally transforming processes from human-based to software-based delivery.
By 2018, 80% of B2C and 60% of B2B enterprises will overhaul their “Digital Front Door” to support 1,000X to 10,000X more customers and customer touch points.
By 2018, enterprises with DX initiatives will double the size of their software development teams.
By 2018, 67% of developers will be focused on business innovation, up from less than 33% today.
By 2018, enterprises with DX strategies will expand external data sources by at least 3X to 5X and delivery of data to the market by 100X or more.
By 2018, 67% of Software-as-a-Service (SaaS) vendors will offer data as part of their service.
By 2018, over 50% of developer teams will embed “Cognitive Services” (i.e., data analytics) in their apps, up from 1% today, providing U.S. enterprises $60+ billion in annual savings by 2020.
By 2018, at least 20% of all workers will use automated assistance technologies to make decisions and get work done.
By 2018, there will be 22 billion IoT devices installed, driving the development of over 200,000 new IoT apps and services.
By 2018, IoT spending will grow 1.5X.
By 2018, 66% of networks will have an IoT security breach.
By 2020, there will be a 5X increase in the capability of robots in manufacturing.
Source: IDC’s IT Industry, Digital Transformation, and CIO Agenda webcasts.
Originally published on Forbes.com
How to evaluate a data scientist

What’s commonly expected from a data scientist is a combination of subject matter expertise, mathematics, and computer science. This is a tall order, and it makes sense that there would be a shortage of people who fit the description. The more knowledge you have, the better. However, I’ve found that the skill set you need to be effective in practice tends to be more specific and much more attainable. This approach changes both what you look for from data science and what you look for in a data scientist.
A background in computer science helps with understanding software engineering, but writing working data products requires specific techniques for writing solid data science code. Subject matter expertise is needed to pose interesting questions and interpret results, but this is often done in collaboration between the data scientist and subject matter experts (SMEs). In practice, it is much more important for data scientists to be skilled at engaging SMEs in agile experimentation. A background in mathematics and statistics is necessary to understand the details of most machine learning algorithms, but to be effective at applying those algorithms requires a more specific understanding of how to evaluate hypotheses…
We tend to judge data scientists by how much they’ve stored in their heads. We look for detailed knowledge of machine learning algorithms, a history of experiences in a particular domain, and an all-around understanding of computers. I believe it’s better, however, to judge the skill of a data scientist based on their track record of shepherding ideas through funnels of evidence and arriving at insights that are useful in the real world.
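To make “shepherding ideas through funnels of evidence” a bit more concrete, here is a minimal sketch, in Python with scikit-learn, of what evaluating a single hypothesis can look like in practice: frame the idea, test it against a naive baseline on held-out data, and promote it only if the evidence supports it. The dataset and the promotion margin are illustrative choices, not a prescribed methodology.

from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothesis: a simple model can beat the naive baseline on unseen data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
model_acc = accuracy_score(y_test, model.predict(X_test))
print(f"baseline: {baseline_acc:.3f}, model: {model_acc:.3f}")

# Promote the idea only if it clearly outperforms the baseline
# (the 5-point margin is an illustrative choice).
if model_acc > baseline_acc + 0.05:
    print("Evidence supports the hypothesis; worth a deeper experiment.")

The point is not the particular model; it is the habit of treating every idea as a hypothesis that must survive a test on data it has not seen.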
65% growth in mobile data traffic over last 12 months driven by video

- Video dominates data traffic: Global mobile data traffic is forecast to grow ten-fold by 2021, and video is forecast to account for 70 percent of total mobile traffic in the same year. In many networks today, YouTube accounts for up to 70 percent of all video traffic, while Netflix’s share of video traffic can reach as high as 20 percent in markets where it is available.
- Mainland China overtakes the US as world’s largest LTE market: By the end of 2015, Mainland China will have 350 million LTE subscriptions – nearly 35 percent of the world’s total LTE subscriptions. The market is predicted to have 1.2 billion LTE subscriptions by 2021.
- Africa becomes an increasingly connected continent: Five years ago (2010) there were 500 million mobile subscriptions across Africa; by the end of 2015 this number will double to 1 billion. Increased connectivity improves the prospects of financial inclusion for the 70 percent of the population that is unbanked, through mobile money services starting to take shape across Africa.
Only 16% of IT budgets are allocated to investments in innovation and growth
While 45% of CIOs identify “innovation” and 44% point to “growth” as their organizations’ most important priorities, only 15% are investing in emerging technologies. Only 16% of IT budgets are allocated to investments in innovation and growth, with the balance spent on running day-to-day operations and incremental change.
This gap between aspirations and reality is one of the key findings of the just-published Deloitte 2015 Global CIO Survey. The study is based on interviews with 1,271 CIOs from 43 countries, with the majority of the participants working for organizations with revenues of more than $1 billion.

“Every company today is a technology company,” says Khalid Kark, U.S. CIO research director at Deloitte Services, and the business priorities reported by CIOs are the same regardless of industry, geography, and company size. All companies are embracing digital technologies and see them as critical for their future.
The study uncovered, however, differences in how CIOs see themselves, what is expected of them, and how they would like their career to progress. In contrast to other surveys that focus on spending priorities, the Deloitte study is mostly about different roles CIOs play today in their organizations and the impact they would like to make in the coming years.
The CIOs were asked detailed questions along the four dimensions that frame the impact of a CIO—the organization’s priorities, competencies and strengths of the CIO, building relationships internally and externally, and technology investments. Here are some of the more interesting results of the survey:
Only 9% of CIOs say they have all the skills they need to succeed
CIOs have to be ambidextrous, mixing business strategy skills with operations management: Out of 12 leadership capabilities, CIOs selected six as the most important for success in their role—influence with internal stakeholders, communication skills, understanding strategic business priorities, talent management, technology vision and leadership, and the ability to lead in complex, fast-changing environments.
CIOs think they need especially to improve their leadership skills
The CIOs were asked to select the top five competencies that a successful technology leader needs and to identify their own top five strengths. The skills with the largest gaps were the ability to influence internal stakeholders, talent management, and technology vision and leadership. Conversely, CIOs think they are strong in operations and execution, running large-scale projects, and leveraging external partners, but they do not consider these differentiating skills for successful technology leaders.

Strong relationships with other executives do not necessarily mean strong influence on the business
48% of CIOs report “strong relationships” with their CEO and interaction at least once a week, and an additional 17% report daily interactions. But only 42% of the CIOs were co-leaders in business strategy decisions and only 19% in M&A activities.
No common definition for “digital”
Digital (cited by 71% of respondents), along with analytics and business intelligence (77%), is expected to have the most impact on the business over the next two years. But when asked further to describe their digital initiatives, the answers ranged from analyzing customer data and developing new products and services to improving customer experience and enabling the workforce to better collaborate or be more productive. The lack of a common definition, says the study’s report, “is often confusing for business leaders and can lead to misunderstandings and conflicting goals.”

Analyzing the answers to the questions about CIO performance and impact, Deloitte uncovered three distinct CIO “archetypes,” describing how CIOs are delivering value today—and how they are preparing for what comes next:
Trusted Operators keep the lights on. They focus on cost, operational efficiency, and performance reliability. They also provide enabling technologies, support business transformation efforts, and align to business strategy. Their core competency is to drive down costs by rationalizing, renewing, and consolidating technology, and they focus on internal customers. 42% of the CIOs surveyed fall into this category.
Change Instigators drive transformation. They take the lead on technology-enabled business transformation and change initiatives. They look outside the organization for partners and are focused on the end-customer of the business. They are 21% more likely than other CIOs to call technology vision a strength. 22% of the CIOs surveyed fall into this category.
Business Co-Creators perform a balancing act, handling both business strategy and efficient operations. They operate across multiple dimensions of creating and delivering value, and are 24% more likely than other CIOs to cite the ability to influence internal stakeholders as a top-five strength. They invest in emerging technologies as a way to drive new sources of revenue or to transform the way they deliver value to customers. 36% of the CIOs surveyed fall into this category.
“Change Instigators try to bring enhancements to existing business models, while Business Co-Creators often have the mandate to find new business opportunities and define new business models,” says Deloitte’s Kark. It’s the Business Co-Creators that tend to invest more in emerging technologies and co-create new business models with internal business partners.
Kark thinks about the three archetypes as a self-diagnostic tool for CIOs to examine where they are and how they fit the needs of their organization. It can also help identify shifting business needs and with them, an emerging shift in how the business defines the CIO role.
Many of the CIOs surveyed indeed see a transformation of their roles in the near future, or would like to see such a transformation. The proportion of Change Instigators is expected to remain the same, at 22%. A big shift will occur, however, with the other two roles: the proportion of Trusted Operators will drop from 42% to 12%, and the proportion of Business Co-Creators will expand from 36% to 66%.
Almost a third of the CIOs surveyed aspire to shift their role into a business leadership position, working with other business executives to define and pursue new business opportunities, while maintaining their reputation as top-notch IT operators. But, says Kark, “if they don’t build the right skill set, if they don’t build the relationships, it’s going to be hard for them to make that transition. CIOs have to drive technology into the core of the business and if they are not able to do it, someone else will.”
The good news is that for those making the transition, career opportunities abound. “Over the next 3 years, more than half of all businesses will need CIOs of the Business Co-Creators type,” predicts Kark.
Originally published on Forbes.com
Artificial Intelligence Machines to Replace Physicians and Transform Healthcare

10 New Big Data Observations from Tom Davenport
https://www.youtube.com/watch?v=DdHhD4n3iFE
The term “big data” has become nearly ubiquitous. Indeed, it seems that every day we hear new reports of how some company is using big data and sophisticated analytics to become increasingly competitive. The topic first began to take off in late 2010 (at least according to search results from Google Trends) and, now that we’re approaching a five-year anniversary, perhaps it’s a good time to take a step back and reflect on this major approach to doing business. This article describes 10 of my observations about big data.
See also Tom Davenport’s Guide to Big Data
Google’s RankBrain Outranks the Best Brains in the Industry
Bloomberg recently broke the news that Google is “turning its lucrative Web search over to AI machines.” Google revealed to the reporter that for the past few months, a very large fraction of the millions of search queries Google responds to every second have been “interpreted by an artificial intelligence system, nicknamed RankBrain.”
The company that has tried hard to automate its mission to organize the world’s information was happy to report that its machines have again triumphed over humans. When Google search engineers “were asked to eyeball some pages and guess which they thought Google’s search engine technology would rank on top,” RankBrain had an 80% success rate compared to “the humans [who] guessed correctly 70 percent of the time.”
There you have it. Google’s AI machine RankBrain, after only a few months on the job, already outranks the best brains in the industry, the elite engineers that Google typically hires.
Or maybe not. Is RankBrain really “smarter than your average engineer” and already “living up to its AI hype,” as the Bloomberg article informs us, or is this all just, well, hype?
Desperate to find out how far our future machine overlords are already ahead of the best and the brightest (certainly not “average”), I asked Google to shed more light on the test, e.g., how do they determine the “success rate”?
Here’s the answer I got from a Google spokesperson:
“That test was fairly informal, but it was some of our top search engineers looking at search queries and potential search results and guessing which would be favored by users. (We don’t have more detail to share on how that’s determined; our evaluations are a pretty complex process).”
I guess both RankBrain and the Google search engineers were given possible search results for a given query, and RankBrain outperformed the humans in guessing which were the “better” results, according to some undisclosed criteria.
I don’t know about you, but my TinyBrain is still confused. Wouldn’t Google’s search engine, with or without RankBrain, outperform any human being, including the smartest people on earth, at “guessing” which search results “would be favored by users”? Hasn’t Google been mining the entire corpus of human knowledge for more than fifteen years and, by definition, produced a search engine that “understands” relevance better than any individual human being?
The key to the competition, I guess, is that the search queries used in it were not just any search queries but complex queries containing words that have different meanings in different contexts. These are the kinds of queries that stump most human beings, and it’s quite surprising that Google engineers scored 70% on queries that presumably require deep domain knowledge across all human endeavors, in addition to search expertise.
The only example of a complex query given in the Bloomberg article is “What’s the title of the consumer at the highest level of a food chain?” The word “consumer” in this context is a scientific term for something that consumes food, and the label (the “title”) at the highest level of the food chain is “predator.”
This explanation comes from search guru Danny Sullivan who has come to the rescue of perplexed humans like me, providing a detailed RankBrain FAQ, up to the limits imposed by Google’s legitimate reluctance to fully share its secrets. Sullivan: “From emailing with Google, I gather RankBrain is mainly used as a way to interpret the searches that people submit to find pages that might not have the exact words that were searched for.”
Sullivan points out that a lot of the work behind Google’s outstanding search results is done by humans (e.g., creating a synonym list or a database of connections between “entities”—places, people, ideas, objects, etc.). But Google now needs to respond to some 450 million new queries per day, queries that have never before been entered into its search engine.
RankBrain “can see patterns between seemingly unconnected complex searches to understand how they’re actually similar to each other,” writes Sullivan. In addition, “RankBrain might be able to better summarize what a page is about than Google’s existing systems have done.”
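Google has not published RankBrain’s internals, but the word-vector idea Sullivan describes can be illustrated with a toy sketch in Python. The three-dimensional vectors below are made up for illustration; in a real system they are learned from vast amounts of text, and similarity scores like these are what let a page about “predators” match a query about “consumers,” even though the exact words differ.

import numpy as np

# Toy word vectors; real systems learn hundreds of dimensions from text.
vectors = {
    "consumer": np.array([0.9, 0.1, 0.3]),
    "predator": np.array([0.8, 0.2, 0.4]),  # near "consumer" in this toy space
    "keyboard": np.array([0.1, 0.9, 0.1]),  # an unrelated concept
}

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, near 0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = vectors["consumer"]
for word, vec in vectors.items():
    print(f"consumer vs {word}: {cosine(query, vec):.2f}")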
Finding out the “unknown unknowns,” discovering previously unknown (to humans) links between words and concepts is the marriage of search technology with the hottest trend in big data analysis—deep learning. The real news about RankBrain is that it is the first time Google applied deep learning, the latest incarnation of “neural networks” and a specific type of machine learning, to its most prized asset—its search engine.
Google has been doing machine learning since its inception. The first published paper listed in the AI and machine learning section of its research page is from 2001, and, to use just one example, Gmail is so good at detecting spam because of machine learning. But Google hadn’t applied machine learning to search. We learn that there has been internal opposition to doing so from a summary of a 2008 conversation between Anand Rajaraman and Peter Norvig, co-author of the most popular AI textbook and leader of Google search R&D since 2001. Here’s the most relevant excerpt:
The big surprise is that Google still uses the manually-crafted formula for its search results. They haven’t cut over to the machine learned model yet. Peter suggests two reasons for this. The first is hubris: the human experts who created the algorithm believe they can do better than a machine-learned model. The second reason is more interesting. Google’s search team worries that machine-learned models may be susceptible to catastrophic errors on searches that look very different from the training data. They believe the manually crafted model is less susceptible to such catastrophic errors on unforeseen query types.
This was written three years after Microsoft had applied machine learning to its search technology. But now, Google has gotten over its hubris. 450 million unforeseen query types per day are probably too much for “manually crafted models,” and Google has decided that a “deep learning” system such as RankBrain provides good enough protection against “catastrophic errors.”
Deep learning has taken the computer science community by storm since it was used to win an image recognition competition in 2012, performing better than traditional approaches to teaching computers to identify images.
With deep learning, the computer “learns” by putting together the pieces of a puzzle (e.g., an image of a cat), moving up a hierarchy created by the computer scientist, from simple concepts to more complex ones (see here and here for overviews of deep learning). Decades ago this idea got the unfortunate name “neural networks,” under the misguided (and hype-generating) notion that these computer networks were “mimicking the brain” (what they were actually mimicking were speculations about how neurons work in the human brain). The hype did not produce the promised results, but starting about ten years ago, with the availability of greater computing power, much larger data sets, and more sophisticated algorithms, neural networks were reincarnated as deep learning.
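For readers who want to see the “hierarchy” in code rather than prose, here is a minimal sketch, in Python with NumPy, of a three-layer network’s forward pass. The weights here are random, so the output is meaningless; a real network learns its weights from data via backpropagation. The layer sizes and the “cat score” framing are purely illustrative.

import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

x = rng.random(64)                  # stand-in for raw pixel values
W1 = rng.standard_normal((32, 64))  # layer 1: simple features (e.g., edges)
W2 = rng.standard_normal((16, 32))  # layer 2: combinations of simple features
W3 = rng.standard_normal((1, 16))   # layer 3: a single "is it a cat?" score

h1 = relu(W1 @ x)                   # each layer transforms the previous one
h2 = relu(W2 @ h1)
score = 1 / (1 + np.exp(-(W3 @ h2)))  # sigmoid squashes to a 0..1 value
print(f"cat score (untrained, random weights): {score[0]:.2f}")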
In 2012, Google engineers made their first deep learning splash when they announced that Google computers had detected the image of a cat after processing zillions of unlabeled still frames from YouTube videos.
In their post on this deep learning experiment, Jeff Dean, a Google Fellow, and Andrew Ng, a Stanford professor on leave at Google at the time, wrote:
“And this isn’t just about images—we’re actively working with other groups within Google on applying this artificial neural network approach to other areas such as speech recognition and natural language modeling.”
And in 2013, Google engineers announced an open source toolkit called word2vec “that aims to learn the meaning behind words.” They wrote: “Now we apply neural networks to understanding words by having them ‘read’ vast quantities of text on the web. We’re scaling this approach to datasets thousands of times larger than what has been possible before, and we’ve seen a dramatic improvement of performance — but we think it could be even better.”
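You can get a feel for what word2vec produces without Google-scale data. The sketch below uses gensim, a third-party Python library that reimplements word2vec (parameter names follow gensim’s 4.x API), on a toy corpus; with only a few sentences the “meanings” it learns are crude, which is exactly why the Google team emphasizes vast quantities of text.

from gensim.models import Word2Vec

# A toy corpus: each sentence is a list of tokens.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

print(model.wv["cat"][:5])                   # first few dimensions of "cat"
print(model.wv.most_similar("cat", topn=3))  # nearest words in the toy space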
2013 was also the year Google hired Geoffrey Hinton of the University of Toronto, “widely known as the godfather of neural networks,” according to Wired. But the two other widely known members of the (self-labeled) “deep learning conspiracy” went to Google’s competitors: Yann LeCun to Facebook (leading a new AI research lab) and Yoshua Bengio to IBM (teaching Watson a few deep learning tricks).
Then there’s Apple, Yelp, Twitter and others—all of Google’s competitors are rushing to adopt deep learning.
This creates serious competition for talent: all the graduate students who three or four years ago switched the topic of their dissertations to something related to deep learning, and all the others who have recently joined this “computers can learn on their own” movement. Hence the need to tell the world via Bloomberg that Google is in the game, and for Google’s CEO to insist on its latest earnings call that “machine learning is a core transformative way by which we are rethinking everything we are doing.”
But beyond PR and prestige, future profits could be the most important incentive for Google to add deep learning to its search technology. It’s not only reducing costs by reducing the need to rely on humans and their “manually crafted models.” It’s also search quality, the reason Google has become the dominant search engine and a verb.
A Search Engine Land columnist, Kristine Schachinger, sheds further light on RankBrain in the context of search quality and Google’s shift in 2013 (the “Hummingbird” overhaul of their search algorithms) from providing search results based on words (strings of letters) to search results based on its knowledge of “things” (entities, facts):
Google has become really excellent at telling you all about the weather, the movie, the restaurant and what the score of last night’s game happened to be. It can give you definitions and related terms and even act like a digital encyclopedia. It is great at pulling back data points based around entity understanding.
Therein lies the rub. Things Google returns well are known and have known, mapped or inferred relationships. However, if the item is not easily mapped or the items are not mapped to each other, Google has difficulty in understanding the query…
While Google has been experimenting with RankBrain, they have lost market share — not a lot, but still, their US numbers are down. In fact, Google has lost approximately three percent of share since Hummingbird launched, so it seems these results were not received as more relevant or improved (and in some cases, you could say they are worse)…
Google might have to decide whether it is an answer engine or a search engine, or maybe it will separate these and do both.
I will go even further and speculate that Google is seeing the end of search as we know it (and as Google perfected it): the possibility that in the future we will not enter search queries into search boxes but will rely on “knowledge navigators” (to use the term Apple coined in 1987), going beyond today’s answer engines to systems that communicate with us, provide relevant information and news, and anticipate our needs by linking things in our past, present, and future.
Now, is it possible that, with Facebook’s investment in AI and deep learning, it will be the first to provide us with a futuristic knowledge navigator? And what will happen to Google’s advertising revenues if the social network consists not only of people but also of deep learning machines?
Given its past performance and the competitive people running it (and its parent company), it’s obvious that RankBrain is just one of the many investments Google is making in “disrupting itself before others do” (I’m pretty sure that’s how they talk about it). Google will continue to provide outstanding, free, advertising-supported service to its users, no matter what form this service will take in the future.
Or maybe not. Being a devoted and admiring Google search user, I was a bit skeptical when I read Schachinger’s words quoted above that Google’s search results “were not received as more relevant or improved (and in some cases, you could say they are worse).” But one very surprising search result I recently got from Google led me to think that, indeed, sometimes when you invest in the future, you sacrifice the present.
I Googled the address “75 Amherst Street, Cambridge, MA 02139.” What I got (a number of times, over three days) at the top of the search results was a map of 75 Amherst Alley, Cambridge, MA 02139.
There is such a place, but I have never heard about it or ever been there. What’s more, 75 Amherst Street is the home of MIT’s Media Lab, so this is not only a very simple query but also one that probably has been entered into Google numerous times (the Media Lab’s contact page appears as the second result, just under the erroneous map).
Time to invest in more humans working diligently on “manually crafted models”?
