Pictured above: South Korea’s Lee Sedol, the world’s top Go player, bows during a news conference ahead of matches against Google’s artificial intelligence program AlphaGo, in Seoul, South Korea, March 8, 2016. REUTERS/Kim Hong-Ji
There are a few moments in the history of artificial intelligence (AI) that are considered major breakthroughs – events that showed the power of machine intelligence in matching or surpassing human performance.
Two examples are Deep Blue versus Kasparov in 1997 and Watson versus Jennings in 2011. Most recently, AlphaGo versus Lee Sedol became another major victory, this time driven by a fast-developing field known as “deep learning.”
What is deep learning?
Deep learning, a machine learning technique based on artificial neural networks, is growing in popularity due to a series of developments in the science and business of data mining.
Prior to AlphaGo’s victory over Lee Sedol, one of the world’s best Go players, computer programs that played Go had only been able to beat average players. The accomplishment can be seen as a major milestone in developing computer programs that match, and even exceed, human levels of intelligent behavior.
Why is this happening now in particular, considering neural networks have been around for at least 50 years?
Some history of deep learning
In the late 1950s, Frank Rosenblatt created one of the first artificial neural networks, inspired by neuroscience findings from the 1940s. Rosenblatt developed the so-called “perceptron,” which can learn from a set of input data much as biological neurons learn from stimuli.
An artificial neuron consists of a set of weighted inputs, analogous to the dendrites of a neuron. The biological neuron processes electrical input signals and produces an output that travels along the axon, which in turn connects to other neurons.
Figure 1: A biological neuron consists of dendrites, the nucleus and the axon that transmits electrical impulses to other neurons.
Figure 2: An artificial neuron can be seen as an abstraction of a biological neuron.
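As a rough sketch of Rosenblatt’s idea, the snippet below implements a single perceptron in plain Python and trains it on the logical AND function. The task, learning rate and epoch count are illustrative choices, not from the article:

```python
# A minimal perceptron: a weighted sum of inputs passed through a step
# function, trained with Rosenblatt's error-driven update rule.

def predict(weights, bias, x):
    """Fire (return 1) if the weighted input sum exceeds the threshold."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=10, lr=0.1):
    """Learn weights from labeled input data, like a neuron adapting to stimuli."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Logical AND is linearly separable, so a single perceptron can learn it.
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_data)
print([predict(weights, bias, x) for x, _ in and_data])  # [0, 0, 0, 1]
```

The update rule nudges the weights only when the neuron misclassifies an input, which is what makes the learning “error-driven.”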
Interest in this research, however, was dampened when Marvin Minsky and Seymour Papert published a book on perceptrons that highlighted various shortcomings. They showed, for example, that a single artificial neuron cannot model the logical operator XOR (exclusive or, whose output is true only if the two input values differ).
More complex neural networks with dense connectivity between neurons can model such Boolean operators (operators that combine one or two truth values into a single value that is either true or false), as was already known at the time. But the findings still had a chilling effect on the research community and strengthened the so-called symbolic artificial intelligence approaches, which relied more on rule-based systems.
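To make this concrete: with one hidden layer of just two neurons, XOR becomes easy to model. The weights below are hand-picked for clarity rather than learned:

```python
# XOR via a two-layer network of threshold neurons. A single neuron cannot
# draw the required decision boundary, but two hidden neurons (computing OR
# and AND) plus an output neuron can.

def step(z):
    return 1 if z > 0 else 0

def neuron(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor(x1, x2):
    h_or = neuron([1, 1], -0.5, [x1, x2])    # fires if x1 OR x2
    h_and = neuron([1, 1], -1.5, [x1, x2])   # fires if x1 AND x2
    return neuron([1, -1], -0.5, [h_or, h_and])  # OR but not AND

print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

The hidden layer effectively re-represents the inputs so that the final neuron faces a linearly separable problem, which is exactly the capability Minsky and Papert noted a single neuron lacks.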
Despite this climate, a few researchers continued to develop these techniques in the 1980s and 1990s, and most current algorithms are based on those networks. Researchers like Geoffrey Hinton, professor at the University of Toronto and researcher at Google, and Yann LeCun, director of Facebook AI Research, persisted in this area even though it was generally not recommended to graduate students at the time.
At the beginning of the new millennium, a perfect storm developed, and neural networks were suddenly back in the limelight.
What has fueled this development?
First, the models depend on copious amounts of training data to perform well. An image recognition system, for example, requires millions of labeled or captioned images, which were previously available only in small, digitized collections of books and news articles. The advent of Web 2.0 and social media platforms such as Flickr, Instagram and Facebook supplied these models with data at the scale and scope they needed.
Second, deep networks often require immense processing power. The advent of powerful processing technologies (GPUs) and architectural paradigms that facilitate distributed processing (cluster/cloud computing) allowed deep learning models to thrive. These developments, in turn, led to advancements in deep learning algorithms. Since the early 2010s, several new models and network configurations have emerged.
Recent breakthroughs in deep learning
Deep learning uses the power of abstraction to derive meaning from data. When processing a collection of portraits, for instance, instead of focusing on individual pixels, a deep network identifies recurring patterns such as eyes, noses, the silhouette of the face, etc.
Each layer of the network contributes to constructing these abstractions from data (see image below). As a result, image processing has benefited from the advent of deep learning models.
Figure 3: Each layer in a network recognizes a different aspect of the image.
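As a toy sketch of what an early layer might compute, here is a small hand-written convolution that responds to vertical edges. The image and filter values are invented for illustration; real networks learn such filters from data:

```python
# Sliding a small filter over an image, the basic operation behind the
# layered feature detectors described above. (Like most deep learning
# libraries, this computes cross-correlation: the kernel is not flipped.)

def convolve2d(image, kernel):
    """Valid-mode 2D convolution: no padding, output shrinks by kernel size - 1."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# 4x6 grayscale image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 9, 9, 9]] * 4

# Sobel-like filter that responds to left-to-right brightness changes.
kernel = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]

# Large responses appear only where the edge is; flat regions give 0.
print(convolve2d(image, kernel))  # [[0, 36, 36, 0], [0, 36, 36, 0]]
```

A deep network stacks many such learned filters, so later layers can combine edge responses into corners, textures and eventually whole object parts like eyes and noses.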
Text mining is another area of success. Automated language translation systems traditionally tried to align source text with target text using word-to-word or phrase-to-phrase mappings. Deep models, on the other hand, can recognize one-to-many and many-to-one relations between source and target languages.
Figure 4: Deep networks are able to translate one language to another by recognizing sophisticated alignments.
There are many further areas of success in applying deep learning to data mining. Google famously improved its voice transcription service thanks to a deep network that could recognize complicated speech patterns. Twitter uses deep learning to identify and remove inappropriate content. Houzz uses it to help customers find relevant products in images and buy them. GE Healthcare uses it to classify organs in CT scans. The list goes on.
One of the exciting promises of deep learning is the potential for natural human-machine interaction: a machine that can understand language, reason, and interact with humans in a more natural way. Given the advances in natural language processing and vision detection, it seems only a matter of time until machines are smart enough to reach this level.
However, there will be roadblocks. Even though speech recognition software now has an error rate of less than 5 percent, compared to more than 10 percent just a few years ago, that still means roughly 40 words of an 800-word article would be recognized incorrectly. Andrew Ng, a leading machine learning expert, has suggested that natural interaction with a computer will not happen until error rates drop below 1 percent.
Although neural networks can already carry out complex cognitive tasks, one needs to bear in mind that the complexity of current artificial networks is somewhere between the neural capacity of a bee’s and a frog’s brain. Human brains are at least an order of magnitude larger than current neural networks and one should not expect performance in complex cognitive tasks at a human level anytime soon.
There are a couple of indicators in support of this view. For example, Anh Nguyen and colleagues showed that it is very easy to fool a neural network with artificially created images, like those pictured here:
Some other recent studies have shown that neural networks are very good at learning human biases. So-called word embeddings, which are often used as input to a neural network, are learned from large text collections and can solve analogy puzzles such as “Paris is to France as Tokyo is to ?,” producing the correct answer, “Japan.”
Unfortunately, it also comes up with the following analogy: “man is to computer programmer as woman is to homemaker.” A research team at Boston University and Microsoft Research addressed this problem by re-shaping the vector space, but since our biases are embedded in the data we feed computer algorithms, this problem is probably only the tip of the iceberg of hidden biases that make their way into advertising programs, CV pre-screening apps, and surveillance systems.
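The analogy trick itself can be sketched with a few toy vectors and cosine similarity. The 3-dimensional vectors below are invented for illustration; real word embeddings have hundreds of dimensions and are learned from large text collections:

```python
# Solving "a is to b as c is to ?" via vector arithmetic: the answer is the
# word whose vector is closest (by cosine similarity) to b - a + c.
import math

vectors = {
    "paris":   [1.0, 0.2, 0.0],
    "france":  [1.0, 1.0, 0.0],
    "tokyo":   [0.0, 0.2, 1.0],
    "japan":   [0.0, 1.0, 1.0],
    "berlin":  [1.0, 0.2, 0.5],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Return the word best completing 'a is to b as c is to ?'."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("paris", "france", "tokyo"))  # japan
```

Because the answer is whatever vector happens to be nearest, any regularity present in the training text, including the biased analogies quoted above, is reproduced by exactly the same arithmetic.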
Given these limitations, it makes sense to focus on narrow tasks that provide value. There are already interesting narrow applications where deep learning excels, including board games. Neural networks are increasingly applied to problems in areas ranging from medicine and engineering to finance:
- Melanoma detection and screening
- Detecting cyclone activity
- Short-term energy market price forecasting
- Predicting corporate bankruptcies
Another nice, narrow application is Google’s WaveNet, which focuses on a well-defined sub-task: making text-to-speech applications sound more natural.
Given a well-defined task, enough annotated data and sufficient computing power, many complex tasks that once required human expertise and reasoning now seem ripe for a neural network approach. Luckily, there are many open-source tools available, which makes this machine learning technique very accessible.
Companies like Google, Facebook and Microsoft have open-sourced their deep learning libraries, providing the tools for building neural networks, and various learning materials are available to anybody who wants to climb the learning curve.
Given this ecosystem of easily available software, it shouldn’t come as a surprise that deep learning is an ideal breeding ground for innovative start-ups.
Startups and growing business potential
As with any novel technology, the advent of deep learning has led to a booming startup market across different verticals, projected to generate more than $10 billion in revenue by 2024. This in turn has opened a potential $2.4 billion market for chip-makers such as Nvidia and Intel, and created vast opportunities for investment in infrastructure, software, and services related to deep learning.
The recent wave of acquisitions by major technology companies (DeepMind by Google, MetaMind by Salesforce, Madbits by Twitter, Lookflow by Flickr, and AlchemyAPI by IBM, to name a few) signals the growing business potential of this space.
Leveraging the power of deep learning
At Thomson Reuters, we leverage massive corpora of archival content and real-time data streams to provide effective and timely answers to our customers’ questions. This content naturally lends itself to scalable analytics and mining technologies like deep learning.
For example, neural language models can be applied to our massive archives of news, legal, financial and other text documents to derive intra- and cross-domain insights. Reuters Pictures provides a rich corpus of annotated images reflecting current events, which has already been used in deep image processing projects. And our quantitative data around finance and risk can be mined against unstructured sources to identify common patterns and indicators.
We are currently exploring all of these avenues in order to leverage the power of deep learning to provide the right answers at the right time for our customers.
Visit Innovation @ ThomsonReuters.com for more on how we bring together smart data and human expertise to find trusted answers.