Monica Rogati in The LinkedIn Blog on Coursera, which offers free online courses: “Coursera’s approach to feedback and assessment is a very interesting application of data science. Tests are either computer-graded or peer-graded — the latter following industry tested crowdsourcing best practices (clear instructions, gold standards, training, qualification tasks, assessor agreement monitoring, etc.). Peer grading isn’t just treated as a means for scaling — it is part of the learning process. One of Daphne’s charts showed that students significantly improved on subsequent tests after peer- and self-grading. Interestingly, the better students learned even more from self-grading than from grading others…
The Web at 25: Tim Berners-Lee on the Web of Data
In 2009, on the occasion of the 20th anniversary of the Web, Jason Rubin and I talked to Tim Berners-Lee about his invention and its future, the Semantic Web, which he described as “the Web of data.”
Twenty years on, the World Wide Web has proven itself both ubiquitous and indispensable. Did you anticipate it would reach this status, and in this time frame?
Tim Berners-Lee: I think while it’s very tempting for us to look at the Web and say, “Well, here it is, and this is what it is,” it has, of course, been constantly growing and changing—and it will continue to do so. So to think of this as a static “This is how the Web is” sort of thing is, I think, unwise. In fact, it’s changed in the last few years faster than it changed before, and it’s crazy for us to imagine this acceleration will suddenly stop. So yes, the 20-year point goes by in a flash, but we should realize that, and we are constantly changing it, and it’s very important that we do so.
I believe that 20 years from now, people will look back at where we are today as being a time when the Web of documents was fairly well established, such that if someone wanted to find a document, there’s a pretty good chance it could be found on the Web. The Web of data, though, which we call the Semantic Web, would be seen as just starting to take off. We have the standards but still just a small community of true believers who recognize the value of putting data on the Web for people to share and mash up and use at will. And there are other aspects of the online world that are still fairly “pre-Web.” Social networking sites, for example, are still siloed; you can’t share your information from one site with a contact on another site. Hopefully, in a few years’ time, we’ll see that quite large category of social information truly Web-ized, rather than being held in individual lockdown applications.
You mentioned a “small community” of people who see the value of the Semantic Web. Is that a repeat occurrence of the struggle 20 years ago to get people to understand the scope and potential impact of the World Wide Web?
It’s remarkably similar. It’s very funny. You’d think that once people had seen the effect of Web-izing documents to produce the World Wide Web, doing likewise with their data would seem the next logical step. But for one thing, the Web was a paradigm shift. A paradigm shift is when you don’t have in your vocabulary the concepts and the ideas with which to understand the new world. Today, the idea that a web link could connect to a document that originates anywhere on the planet is completely second nature, but back then it took a very strong imagination for somebody to understand it.
Now, with data, almost all the data you come across is locked in a database. The idea that you could access and combine data anywhere in the world and immediately make it part of your spreadsheet is another paradigm shift. It’s difficult to get people to buy into it. But in the same way as before, those who do get it become tremendously fired up. Once somebody has realized what it would be like to have linked data across the world, then they become very enthusiastic, and so we now have this corps of people in many countries all working together to make it happen.
Do you see the Semantic Web as enabling greater collaboration between and among parties, as opposed to the point-to-point or point-to-many communication that seems more prevalent in the current Web?
The original web browser was a browser editor and it was supposed to be a collaborative tool, but it only ran on the NeXT workstation on which it was developed. However, the idea that the Web should be a collaborative place has always been a very important goal for me. I think harnessing the creative energy of people is really important. When you get people who are trying to solve big problems like cure AIDS, fight cancer, and understand Alzheimer’s disease, there are a huge number of people involved, all of them with half-formed ideas in their minds. How do we get them communicating so that the half of an idea in one person’s head will connect with half of an idea in somebody else’s head, and they’ll come up with the solution?
That’s been a goal for the Web of documents, and it’s certainly a goal for the Web of data, where different pieces of data can be used for all kinds of different things. For example, a genomist may suspect that a particular protein is connected to a certain syndrome in a cell line, search for and find data relating to each area, and then suddenly put together the different strains of data and discover something new. And this is something he can do with the owners of the respective pieces of data, who might never have found each other or known that their data was connected. So the Web of data will absolutely lead to greater collaboration.
Is your vision of the Semantic Web one in which data is freely available, or are there access rights attached to it?
A lot of information is already public, so one of the simple things to do in building the new Web of data is to start with that information. And recently, I’ve been working with both the U.K. government and the U.S. government in trying not only to get more information on the Web, but also to make it linked data. But it’s also very important that systems are aware of the social aspects of data. And it’s not just access control, because an authorized user can still use the right data for the wrong purpose. So we need to focus on what are the purposes for accessing different kinds of data, and for that we’ve been looking at accountable systems.
Accountable systems are aware of the appropriate use of data, and they allow you to make sure that certain kinds of information that you are comfortable sharing with people in a social context, for example, are not able to be accessed and considered by people looking to hire you. For example, I have a GPS trail that I took on vacation. Certainly, I want to give it to my friends and my family, but I don’t necessarily wish to license people I don’t know who are curious about me and my work and let them see where I’ve been. Companies may want to do the same thing. They might say, “We’re going to give you access to certain product information because you’re part of our supply chain and you can use it to fine-tune your manufacturing schedule to meet our demand. However, we do not license you to use it to give to our competition to modify their pricing.”
You need to be able to ask the system to show you just the data that you can use for a given task, because how you wish to use it will be the difference in whether you can use it. So we need systems for recording what the appropriate use of data is, and we need systems for helping people use data in an appropriate way so they can meet an ethical standard.
Ultimately, what is one of the most significant things the Semantic Web will enable?
One thing I think we’ll be able to do is to write intelligent programs that run across the Web of data looking for patterns when something went wrong—like when a company failed, or when a product turned out to be dangerous, or when an ecological catastrophe happened. We can then identify patterns in a broad range of data types that resulted in something serious happening, and that will allow us to identify when these patterns recur, and we’ll be better able to prepare for or prevent the situation.
I think when we have a lot of data available on the Web about the world, including social data, ecological data, meteorological data, and financial data, we’ll be able to make much better models. It’s been quite evident over the last year, for example, that we have a really bad grasp of the financial system. Part of the reason for that might be that we have insufficient data from which to draw conclusions, or that the experts are too selective in which data they use. The more data we have, the more accurate our models will be.
After 20 years, what about the Web—either its current or future capabilities—excites you the most?
One of the things that gets me the most excited are the mash-ups, where there’s one market of people providing data and there’s a second layer of people mashing up the data, picking from a rich variety of data sources to create a useful new application or service. A classic example of a mash-up is when I find a seminar I want to go to, and the web page has information about the sponsor, the presenter, the topic, and the logistics. I have to write all that down on the back of an envelope and then go and put it in my address book; I have to put it in my calendar; I have to enter the address in my GPS—basically, I have to copy this information into every device I use to manage my life, which is inefficient and time-consuming. This is because there is no common format for this data to become integrated into my devices.
Now, the vision of Semantic Web is that the seminar’s web page has information pointed at data about the event. So I just tell my computer I’m going to be attending that seminar and then, automatically, there is a calendar that shows things that I’m attending. And automatically, an address book I define as having in it the people who have given seminars that I’ve attended within the last six months appears, with a link to the presenter’s public profile. And automatically, my PDA starts pointing towards somewhere I need to be at an appropriate time to get me there. All I need to do is say, “I’m going to that seminar,” and then the rest should follow.
The Web is such a mélange of useful, noble content and stuff that runs the gamut from the mundane to the grotesque. Do you think humanity is using this incredible invention of yours appropriately?
Yes. The Web, after all, is just a tool. It’s a powerful one, and it reconfigures what we can do, but it’s just a tool, a piece of white paper, if you will. So what you see on it reflects humanity—or at least the 20 percent of humanity that currently has access to the Web.
As a standards body, the W3C is not interested in policing the Web or in censoring content, nor should we be. No one owns the World Wide Web, no one has a copyright for it, and no one collects royalties from it. It belongs to humanity, and when it comes to humanity, I’m tremendously optimistic. After 20 years, I’m still very excited and extremely hopeful.
[First published in ON magazine]
What is the Internet of Things? (Infographic)

What Has Steve Jobs Wrought?

Steve Jobs had an insanely great ride on the waves of digitization that have transformed the way we work and play over the last few decades. But taking a cursory look at the hundreds of tributes published to commemorate the anniversary of his passing, I was surprised to find lots of trees but not a single forest. The big picture view of Jobs’ life is sorely missing.
We hear about a lot of specific things that he did or stimulated: He was “a genius toymaker,” a “genuine human being,” a “patent warrior.” He invented this, pushed for that, and denounced the other thing. All true. But wasn’t there something bigger that connected all the dots besides his creativity and drive?
IBM Watson and Healthcare Big Data Analytics
Presiding over the ceremonial opening of the new IBM Watson Health global headquarters in Cambridge, Mass., IBM’s senior vice president Mike Rhodin highlighted the sometimes-neglected focus of the effort to mine the ever-increasing quantities of health data. “We know that technology alone isn’t the answer,” said Rhodin. “At its core, Watson Health provides the means to orient the entire system around us.”
In a telephone conversation before the event, Dr. Lynda Chin, associate vice chancellor for health transformation at the University of Texas system, voiced a similar perspective: “Technology and innovation are the instigators for change, but they alone won’t do it. We have to think about implementation, about translating the technology into desired outcomes. Implementation is never just a technology play.”
Before assuming her current position in April, Dr. Chin was the founding chair of Genomic Medicine and scientific director of the Institute for Applied Cancer Science at The University of Texas MD Anderson Cancer Center. Two years ago, IBM and MD Anderson announced the Oncology Expert Advisor (OEA), based on IBM’s Watson data analytics engine, an expert system enabling clinicians to “uncover valuable insights from the cancer center’s rich patient and research databases.”
Dr. Chin reports that MD Anderson has by now developed two “apps,” each dealing with a different type of cancer, and is in the process of developing a third one, with each successive cancer-specific solution taking less time to develop. The ultimate goal is to make these solutions available to MD Anderson’s national and international network, so general oncologists in remote hospitals and clinics could tap into its accumulated and evolving expertise. “To show that the OEA is a knowledge democratization tool, we have to build a network cloud infrastructure to support it. The OEA will not be useful if it doesn’t fit into the everyday life of the general oncologist.”
To achieve that goal, MD Anderson has also partnered with PwC for the development of the cloud information interchange and with AT&T for a secure, dedicated network. It is now piloting its first network link, to one of its network partners in New Jersey.
The integration with the general oncologist’s workflow is moving the expert system from a research reference resource and clinical decision support tool to helping manage the care of specific patients. “The OEA is trained to simulate the exchange between a physician and an expert,” says Dr. Chin. “So for the OEA to work, it has to be connected to the EHR system so we can learn about the patient. The OEA is trained not only to understand the profile of the patient in terms of what is the appropriate evidence-based treatment options but also sharing the experience in managing patients on that type of therapy and helping the general oncologist manage it. It’s as if the oncologist has the ability to call up the expert 24/7 to ask for advice.”
Still, one of the lessons learned so far is that “there will always be a question the OEA was not trained on,” so a teleconferencing component has been built into the system. Other lessons include the need to provide mobile-device-based solutions, the challenge of teaching the OEA the relative value of each piece of information, and that the expert system “is very valuable from a learning perspective,” as a teaching tool for doctors in training. It also turned out that the OEA is useful in helping research nurses screen patients for clinical trials. Before, the nurses were often considering only the trials they knew about. Now they have at their disposal a clinical trial recommendation engine that screens through all the available trials and an expert system that helps with monitoring the patients participating in the clinical trial.
The development of the OEA is a never-ending journey. Healthcare is a complex and constantly changing endeavor involving research and practice, experiments and established procedures, professionals, institutions, and providers of all sorts, and most important of all, the people they serve—both patients and everybody else trying to keep themselves healthy. Over the last decade healthcare has gone, at long last, through rapid digitization, transforming mounds of paper into electronic records and introducing computers to many aspects of the physician’s work.
As in other fields, the introduction of computer technology provides opportunities for reducing costs and increasing quality and effectiveness, while at the same time increasing the potential for errors caused by over-reliance on technology and automation. Similarly, while digitization facilitates the collection and sharing of practical knowledge and research expertise, it also produces mountains of data that threaten to impede rather than accelerate progress.
IBM Watson helps in processing and analyzing this data and presenting it as confidence-level-ranked suggestions and recommendations. At the IBM event on September 10, Dr. Jeff Burns described how OPENPediatrics doesn’t tell the physician “do this,” but rather “tells the doctor what to think about.” OPENPediatrics is a Boston Children’s Hospital-led initiative bringing medical knowledge to pediatric caregivers worldwide (currently reaching 900 hospitals in 127 countries). IBM and Boston Children’s Hospital plan to develop “solutions for commercialization, initially pursuing applications in personalized medicine, heart health and critical care,” leveraging Watson’s genomic, image, and streaming analytics capabilities.
At the new Watson Health headquarters in Cambridge, Mass., Dr. Watson—and 700 other IBM employees—will be joining more than 600 Massachusetts-based life sciences companies and research organizations employing about 60,000 people. IBM plans to open there an interactive Watson Health Experience Center (a demonstration center for IBM customers) and establish a dedicated Health Research lab.
“We have to do it as a community,” Mike Rhodin declared at the event. Other participants, executives from Yale University, Sage Bionetworks, Medtronic, CVS Health, Modernizing Medicine and Teva Pharmaceuticals, echoed the sentiment. And like Rhodin, they stressed putting people at the center of their efforts to improve healthcare, highlighting the specific goal of helping patients manage their disease.
Digitization—and smart analysis of the data it generates—helps in building a community around shared knowledge. The competition for profits and prestige among healthcare providers, however, while driving the innovation that may lead to better healthcare, also could stand in the way of cooperation and knowledge sharing. It may also lead to hasty development of technology-based solutions without a careful evaluation of the actual benefits and potential risks. Let’s hope that IBM and its partners will do everything they can to uphold the medical community’s tradition of controlled experiments and contributing to the ever-growing public repository of knowledge of what works and what doesn’t work in healthcare.
Originally published on Forbes.com
Predictions for CMOs and Digital Marketing in 2015
In 2015, digital marketing budgets will increase by 8%, according to Gartner’s recent CMO Spend Report, a survey of 315 marketing decision makers representing organizations with more than $500 million in annual revenue.
Customer experience is the top innovation project for 2015, continuing its role as the top priority for marketing investment in 2014. The survey also found that
- In 79% of companies, marketing has a budget for capital expenditures — primarily, for infrastructure and software
- Marketers are managing a P&L and generating revenue from digital advertising, digital commerce and sale of data
- 68% of organizations have a separate digital marketing budget — it averages a quarter of the total marketing budget
- Two-thirds of companies are funding digital marketing via reinvestment of existing marketing budgets
Earlier this year, IBM found in its worldwide survey of CMOs that CEOs increasingly call on them for strategic input. Furthermore, the CMO now comes second only to the CFO in terms of the influence he or she exerts on the CEO. The survey also found, however, that very few CMOs have made much progress in building a robust digital marketing capability: Only 20%, for example, have set up social networks for the purpose of engaging with customers, and the percentage of CMOs who have integrated their company’s interactions with customers across different channels, installed analytical programs to mine customer data and created digitally enabled supply chains to respond rapidly to changes in customer demand is even smaller. Almost all CMOs, 82% of survey respondents, felt underprepared to deal with the explosion of data.
With this as a background, here’s a summary of what digital marketing and the CMO will look like in 2015, based on observations by Scott Brinker, a leading commentator on marketing technology, Forrester, TopRank online marketing blog, Wheelhouse Advisors, and Brian Solis.
CMOs will take charge of focusing their companies on the customer
CMOs and their marketing teams will become the primary driver behind customer-centric company growth. Leveraging their knowledge of the customer and the competitive landscape, CMOs will advise and counsel CEOs on how to win, serve, and retain customers to grow the business. They will also lead organizational changes and new collaboration initiatives aimed at unifying all customer engagement activities across the enterprise.
CMOs will poach IT staff to help them manage a rapidly expanding digital marketing landscape
The number of digital marketing tools will grow in 2015, with new startups and large, established tech companies confusing the CMO even more with their numerous offerings. To help manage this embarrassment of riches and move their companies further on their digital marketing journey, CMOs will be poaching IT staff looking for new challenges and better salaries.
CMOs should expect heavy rains from proliferating digital marketing clouds
Digital marketing tools will be increasingly offered as a cloud-based solution (“marketing-as-a-service”) rather than licensed software. Cloud-based solutions will continue to expand their ecosystems, with many small software developers adding apps to existing cloud-based digital marketing platforms.
CMOs will invest in new digital marketing hot areas
Content marketing and predictive analytics will continue to be hot areas of interest and investment for CMOs, but they will be joined in 2015 by sales enablement, post-sale customer marketing, marketing finance, marketing talent management, and new tools based on the Internet of Things, allowing for the integration of offline and online experiences.
CMOs will become brand publishers
CMOs in 2015 will act as heads of a publishing house, overseeing the entire spectrum of brand engagement, increasing the quality of their output, and improving the perceived value of digital interactions with customers and prospects.
[First published on Forbes.com]
Will Google Own AI? (4)
We’ve been using compute-intensive machine learning in our products for the past 15 years. We use it so much that we even designed an entirely new class of custom machine learning accelerator, the Tensor Processing Unit. Just how fast is the TPU, actually? Today, in conjunction with a TPU talk for a National Academy of Engineering meeting at the Computer History Museum in Silicon Valley, we’re releasing a study that shares new details on these custom chips, which have been running machine learning applications in our data centers since 2015. This first generation of TPUs targeted inference (the use of an already trained model, as opposed to the training phase of a model, which has somewhat different characteristics), and here are some of the results we’ve seen:
- On our production AI workloads that utilize neural network inference, the TPU is 15x to 30x faster than contemporary GPUs and CPUs.
- The TPU also achieves much better energy efficiency than conventional chips, achieving 30x to 80x improvement in TOPS/Watt measure (tera-operations [trillion or 10^12 operations] of computation per Watt of energy consumed).
- The neural networks powering these applications require a surprisingly small amount of code: just 100 to 1500 lines. The code is based on TensorFlow, our popular open-source machine learning framework.
- More than 70 authors contributed to this report. It really does take a village to design, verify, implement and deploy the hardware and software of a system like this.
Industrial IoT Market to Reach $151.01 Billion by 2020
The Industrial Internet of Things (IIoT) market was valued at $93.99 billion in 2014 and is expected to reach $151.01 billion by 2020, growing at a CAGR of 8.03% between 2015 and 2020.
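For readers who want to check the arithmetic behind these figures, here is a minimal Python sketch of the standard compound annual growth rate (CAGR) formula. The function names are illustrative. The 2015 base value is not stated in the text, so the sketch back-computes it on the assumption that the quoted 8.03% compounds over five annual steps from 2015 to 2020.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.08 means 8%)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def implied_start(end_value: float, rate: float, years: float) -> float:
    """Starting value that grows to end_value at the given annual rate."""
    return end_value / (1.0 + rate) ** years


# Figures quoted in the text (billions of US dollars).
value_2014 = 93.99
value_2020 = 151.01
stated_cagr = 0.0803  # quoted for 2015 to 2020

# Growth rate implied by the two quoted endpoints, 2014 to 2020 (six years).
rate_2014_2020 = cagr(value_2014, value_2020, 6)

# Assumption: the quoted CAGR compounds over five annual steps from a 2015
# base that the text does not state; this back-computes that base.
base_2015 = implied_start(value_2020, stated_cagr, 5)

print(f"2014-2020 CAGR from endpoints: {rate_2014_2020:.2%}")
print(f"Implied 2015 base at 8.03%: ${base_2015:.2f}B")
```

Under these assumptions, the two quoted endpoints imply a six-year growth rate of roughly 8.2%, close to the stated 8.03%. The small gap is consistent with the stated rate being measured from a 2015 base rather than from the 2014 valuation.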
IIoT is the integration of complex physical machinery with industrial networks and data analytics solutions to improve operational efficiency and reduce costs. It comprises advanced sensor technologies, machine-to-machine communication, real-time data analytics, and machine learning algorithms to enhance the decision-making capabilities of the industries. The need to identify potential failures in machinery in advance to avoid unplanned outages by the use of predictive maintenance techniques is a key influencing factor for the adoption of IIoT solutions. Advancements in sensor technologies as well as improved reliability, coverage area, and bandwidth of cellular technologies are enabling IIoT in sectors such as manufacturing, energy & power, and healthcare among others. The implementation of IIoT is expected to give rise to new business models and provide opportunities to a wide range of new and established companies in the market.
The key players in the market include General Electric (U.S.), Cisco Inc. (U.S.), Intel Corporation (U.S.), Rockwell Automation (U.S.), ARM Holdings plc. (U.K.), ABB Ltd. (Switzerland), Siemens AG (Germany), Honeywell International Inc. (U.S.), Dassault Systèmes SA (France), Huawei Technology Co., Ltd. (China), Zebra Technologies (U.S.), IBM Corporation (U.S.), and Robert Bosch GmbH (Germany) among others.



