The Deep Learning Revolution and Its Implications
for Computer Architecture and Chip Design
Jeffrey Dean
Google Research
jeff@google.com
Abstract
The past decade has seen a remarkable series of advances in machine learning, and in particular deep
learning approaches based on artificial neural networks, to improve our abilities to build more accurate
systems across a broad range of areas, including computer vision, speech recognition, language
translation, and natural language understanding tasks. This paper is a companion paper to a keynote talk
at the 2020 International Solid-State Circuits Conference (ISSCC) discussing some of the advances in
machine learning, and their implications for the kinds of computational devices we need to build,
especially in the post-Moore’s Law era. It also discusses some of the ways that machine learning may
be able to help with some aspects of the circuit design process. Finally, it provides a sketch of at
least one interesting direction towards much larger-scale multi-task models that are sparsely activated
and employ much more dynamic, example- and task-based routing than the machine learning models of
today.
Introduction
The past decade has seen a remarkable series of advances in machine learning (ML), and in particular
deep learning approaches based on artificial neural networks, to improve our abilities to build more
accurate systems across a broad range of areas [LeCun et al. 2015]. Major areas of significant advances
include computer vision [Krizhevsky et al. 2012, Szegedy et al. 2015, He et al. 2016, Real et al. 2017, Tan
and Le 2019], speech recognition [Hinton et al. 2012, Chan et al. 2016], language translation [Wu et al.
2016] and other natural language tasks [Collobert et al. 2011, Mikolov et al. 2013, Sutskever et al. 2014,
Shazeer et al. 2017, Vaswani et al. 2017, Devlin et al. 2018]. The machine learning research community
has also been able to train systems to accomplish some challenging tasks by learning from interacting
with environments, often using reinforcement learning, showing success and promising advances in areas
such as playing the game of Go [Silver et al. 2017], playing video games such as Atari games [Mnih et al.
2013, Mnih et al. 2015] and Starcraft [Vinyals et al. 2019], accomplishing robotics tasks such as
substantially improved grasping for unseen objects [Levine et al. 2016, Kalashnikov et al. 2018],
emulating observed human behavior [Sermanet et al. 2018], and navigating complex urban environments
using autonomous vehicles [Angelova et al. 2015, Bansal et al. 2018].
As an illustration of the dramatic progress in the field of computer vision, Figure 1 shows a graph of the
improvement over time for the ImageNet challenge, an annual contest run by Stanford University [Deng et
al. 2009] where contestants are given a training set of one million color images across 1000 categories,
and then use this data to train a model to generalize to an evaluation set of images across the same
categories. In 2010 and 2011, prior to the use of deep learning approaches in this contest, the winning
entrants used hand-engineered computer vision features and the top-5 error rate was above 25%. In
2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton used a deep neural network, commonly
referred to as “AlexNet”, to take first place in the contest with a major reduction in the top-5 error rate to
16% [Krizhevsky et al. 2012]. Theirs was the only entry that used a neural network in 2012. The
next year, the deep learning computer vision revolution was in full force with the vast majority of entries
from teams using deep neural networks, and the winning error rate again dropped substantially to 11.7%.
A careful study by Andrej Karpathy showed that human error on this task is just above 5% if the
person practices for ~20 hours, or 12% if a different person practices for just a few hours
[Karpathy 2014]. From 2011 to 2017, the winning ImageNet top-5 error rate dropped sharply from 26%
to 2.3%.
Figure 1: ImageNet classification contest winner accuracy over time
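The top-5 error rate quoted throughout these results counts a prediction as correct whenever the true label appears among the model's five highest-scoring classes. A minimal sketch of this metric (illustrative only; the function name, scores, and labels below are hypothetical, not from the paper or any specific library):

```python
def top5_error(scores, true_labels):
    """Fraction of examples whose true label is NOT among the
    five highest-scoring classes.

    scores: list of per-example lists of class scores.
    true_labels: list of correct class indices, one per example.
    """
    errors = 0
    for class_scores, label in zip(scores, true_labels):
        # Indices of the five highest-scoring classes for this example.
        top5 = sorted(range(len(class_scores)),
                      key=lambda c: class_scores[c],
                      reverse=True)[:5]
        if label not in top5:
            errors += 1
    return errors / len(true_labels)

# Toy example with 10 classes and 2 examples (made-up scores):
scores = [
    # True class 6 ranks first here, so this example counts as correct.
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.90, 0.03, 0.02, 0.02],
    # True class 9 falls outside the top five here, so this is an error.
    [0.90, 0.02, 0.02, 0.02, 0.02, 0.01, 0.005, 0.005, 0.001, 0.001],
]
print(top5_error(scores, [6, 9]))  # 1 error out of 2 examples -> 0.5
```

On the 1000-class ImageNet evaluation set, the same computation is applied over the full set of held-out images; the 25%-to-2.3% drop described above is this quantity.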
These advances in fundamental areas like computer vision, speech recognition, language understanding,
and large-scale reinforcement learning have dramatic implications for many fields. We have seen a
steady series of results in many different fields of science and medicine by applying the basic research
results that have been generated over the past decade to these problem areas. Examples include
promising areas of medical imaging diagnostic tasks including for diabetic retinopathy [Gulshan et al.
2016, Krause et al. 2018], breast cancer pathology [Liu et al. 2017], lung cancer CT scan interpretation
[Ardila et al. 2019], and dermatology [Esteva et al. 2017]. Sequential prediction methods that are useful
for language
…(Full text truncated)…