ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models


Authors: Minsuk Kahng, Pierre Y. Andrews, Aditya Kalro, and Duen Horng (Polo) Chau

Minsuk Kahng, Pierre Y. Andrews, Aditya Kalro, and Duen Horng (Polo) Chau

Fig. 1. ActiVis integrates several coordinated views to support exploration of complex deep neural network models, at both instance- and subset-level. 1. Our user Susan starts exploring the model architecture, through its computation graph overview (at A). Selecting a data node (in yellow) displays its neuron activations (at B). 2. The neuron activation matrix view shows the activations for instances and instance subsets; the projected view displays the 2-D projection of instance activations. 3. From the instance selection panel (at C), she explores individual instances and their classification results. 4. Adding instances to the matrix view enables comparison of activation patterns across instances, subsets, and classes, revealing causes for misclassification.

Abstract—While deep learning models have achieved state-of-the-art accuracies for many prediction tasks, understanding these models remains a challenge. Despite the recent interest in developing visual tools to help users interpret deep learning models, the complexity and wide variety of models deployed in industry, and the large-scale datasets that they use, pose unique design challenges that are inadequately addressed by existing work. Through participatory design sessions with over 15 researchers and engineers at Facebook, we have developed, deployed, and iteratively improved ActiVis, an interactive visualization system for interpreting large-scale deep learning models and results. By tightly integrating multiple coordinated views, such as a computation graph overview of the model architecture, and a neuron activation view for pattern discovery and comparison, users can explore complex deep neural network models at both the instance- and subset-level.
ActiVis has been deployed on Facebook's machine learning platform. We present case studies with Facebook researchers and engineers, and usage scenarios of how ActiVis may work with different models.

Index Terms—Visual analytics, deep learning, machine learning, information visualization.

• Minsuk Kahng and Duen Horng (Polo) Chau are with Georgia Institute of Technology. E-mail: {kahng,polo}@gatech.edu. This work was done while Minsuk Kahng was at Facebook.
• Pierre Y. Andrews and Aditya Kalro are with Facebook. E-mail: {mortimer,adityakalro}@fb.com.
• This paper will be presented at the IEEE Conference on Visual Analytics Science and Technology (VAST) in October 2017 and published in the IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. 24, No. 1, January 2018.

1 INTRODUCTION

Deep learning has led to major breakthroughs in various domains, such as computer vision, natural language processing, and healthcare. Many technology companies, like Facebook, have been increasingly adopting deep learning models for their products [1, 2, 11]. While powerful deep neural network models have significantly improved prediction accuracy, understanding these models remains a challenge. Deep learning models are more difficult to interpret than most existing machine learning models, because they capture nonlinear hidden structures of data using a huge number of parameters. Therefore, in practice, people often use them as "black boxes", which could be detrimental because when the models do not perform satisfactorily, users would not understand the causes or know how to fix them [23, 33]. Despite the recent increasing interest in developing visual tools to help users interpret deep learning models [10, 26, 35, 38], the complexity and wide variety of models deployed in industry, and the large-scale datasets that they use, pose unique challenges that are inadequately addressed by existing work.
For example, deep learning tasks in industry often involve different types of data, including text and numerical data; however, most existing visualization research targets image datasets [38]. Furthermore, in designing interpretation tools for real-world use and deployment at technology companies, it is a high priority that the tools be flexible and generalizable to the wide variety of models and datasets that the companies use for their many products and services. These observations motivate us to design and develop a visualization tool for interpreting industry-scale deep neural network models, one that can work with a wide range of models, and can be readily deployed on Facebook's machine learning platform.

Through participatory design with researchers, data scientists, and engineers at Facebook, we have identified common analysis strategies that they use to interpret machine learning models. Specifically, we learned that both instance- and subset-based exploration approaches are common and effective. Instance-based exploration (e.g., how individual instances contribute to a model's accuracy) has demonstrated success in a number of machine learning tasks [3, 23, 29]. As individual instances are familiar to users, exploring by instances accelerates model understanding. Another effective strategy is to leverage input features or instance subsets specified by users [21, 23]. Slicing results by features helps reveal relationships between data attributes and machine learning algorithms' outputs [17, 28, 29]. Subset-based exploration is especially beneficial when dealing with huge datasets in industry, which may consist of millions or billions of data points. Interpreting model results at a higher, more abstract level helps drive down computation time, and helps users develop a general sense of the models.
Our tool, called ActiVis, aims to support both interpretation strategies for visualization and comparison of multiple instances and subsets. ActiVis is an interactive visualization system for deep neural network models that (1) unifies instance- and subset-level inspections, (2) tightly integrates overview of complex models and localized inspection, and (3) scales to a variety of industry-scale datasets and models. ActiVis visualizes how neurons are activated by user-specified instances or instance subsets, to help users understand how a model derives its predictions. Users can freely define subsets with raw data attributes, transformed features, and output results, enabling model inspection from multiple angles. While many existing deep learning visualization tools support instance-based exploration [10, 14, 18, 35, 38], ActiVis is the first tool that simultaneously supports instance- and subset-based exploration of deep neural network models. In addition, to help users get a high-level overview of the model, ActiVis provides a graph-based representation of the model architecture, from which the user can drill down to perform localized inspection of activations at each model layer (node).

Illustrative scenario. To illustrate how ActiVis works in practice, consider our user Susan who is training a word-level convolutional neural network (CNN) model [19] to classify question sentences into one of six categories (e.g., whether a question asks about numeric values, as in "what is the diameter of a golf ball?"). Her dataset is part of the TREC question answering data collections [25]. Susan is new to using this CNN model, so she decides to start by using its default training parameters. After training completes, she launches ActiVis, which runs in a web browser. ActiVis provides an overview of the model by displaying its architecture as a computation graph (Fig.
1A, top), summarizing the model structure. By exploring the graph, Susan learns about the kinds of operations (e.g., convolution) that are performed, and how they are combined in the model. Based on her experience working with other deep learning models, she knows that a model's performance is strongly correlated with its last hidden layer, thus it would be informative to analyze that layer. In ActiVis, a layer is represented as a rounded rectangular node (highlighted in yellow, in Fig. 1A, bottom). Susan clicks the node for the last hidden layer, and ActiVis displays the layer's neuron activation in a panel (Fig. 1B): the neuron activation matrix view on the left shows how neurons (shown as columns) respond to instances from different classes (rows); and the projected view on the right shows the 2-D projection of instance activations. In the matrix view, stronger neuron activations are shown in darker gray. Susan sees that the activation patterns for the six classes (rows) are quite visually distinctive, which may indicate satisfactory classification. However, in the projected view, instances from different classes are not clearly separated, which suggests some degree of misclassification.

To examine the misclassified instances and to investigate why they are mislabeled, Susan brings up the instance selection panel (Fig. 1C). The classification results for the NUMber class alarm Susan, as many instances in that class are misclassified (shown in the right column). She examines their associated question text by mousing over them, which shows the text in popup tooltips. She wants to compare the activation patterns of the correctly classified instances with those of the misclassified. So she adds two correct instances (#38, #47) and two misclassified instances (#120, #126) to the neuron activation matrix view — indeed, their activation patterns are very different (Fig. 1.4).

1 http://cogcomp.cs.illinois.edu/Data/QA/QC/
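The comparison Susan performs — lining up the activation vectors of correct and misclassified instances — can be sketched in a few lines of numpy. This is a toy illustration on synthetic data, not ActiVis code; the instance roles mirror the scenario, but the values are random:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a last-hidden-layer activation matrix:
# 6 instances x 8 neurons (real layers have hundreds of neurons).
acts = rng.random((6, 8))

def cosine(u, v):
    """Cosine similarity between two activation vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Rows 0-1 play the role of correctly classified instances,
# rows 2-3 the misclassified ones; compare each pair.
correct, wrong = acts[0:2], acts[2:4]
for i, u in enumerate(correct):
    for j, v in enumerate(wrong):
        # Low similarity suggests the misclassified instances
        # activate a different set of neurons.
        print(f"correct #{i} vs wrong #{j}: {cosine(u, v):.2f}")
```

In ActiVis the analogous comparison is done visually, by scanning rows of the activation matrix rather than computing a similarity score.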
Taking a closer look at the instance selection panel, Susan sees that many instances have blue borders, meaning they are misclassified as DESCription. Inspecting the instances' text reveals that they often begin with "What is", which is typical for questions asking for descriptions, though it is also common for other question types, as in "What is the diameter of a golf ball?", which is a numeric question (Fig. 1.3). To understand the extent to which instances starting with "What is" are generally misclassified by the model, Susan creates an instance subset for them, and ActiVis adds this subset as a new row in the neuron activation matrix view. Susan cannot discern any visual patterns from the subset's seemingly scattered, random neuron activations, suggesting that the model may not yet have learned effective ways to distinguish between the different intents of "What is" questions. Based on this finding, she proceeds to train more models with different parameters (e.g., considering longer n-grams) to better classify these questions. ActiVis integrates multiple coordinated views to enable Susan to work with complex models, and to flexibly explore them at instance- and subset-level, helping her discover and narrow in on specific issues.

Deployment. ActiVis has been deployed on the machine learning platform at Facebook. A developer can visualize a deep learning model using ActiVis by adding only a few lines of code, which instructs the model's training process to generate the data needed for ActiVis. ActiVis users at Facebook (e.g., data scientists) can then train models and use ActiVis via FBLearner Flow [4, 12], Facebook's internal machine learning web interface, without writing any additional code.
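The paper does not show the actual instrumentation API, but the "few lines of code" idea can be sketched as a hypothetical logging hook dropped into a training script. All names below (ActivationLogger, log, dump) are invented for illustration and are not Facebook's real API:

```python
# Hypothetical sketch of instrumenting a training process to dump the
# activation data a tool like ActiVis needs; names are invented.
import json

class ActivationLogger:
    """Collects per-layer activations for selected instances."""

    def __init__(self, layers_to_log):
        self.layers_to_log = set(layers_to_log)
        self.records = []

    def log(self, layer_name, instance_id, activations):
        # Keep only the layers the developer asked to visualize.
        if layer_name in self.layers_to_log:
            self.records.append({
                "layer": layer_name,
                "instance": instance_id,
                "activations": list(activations),
            })

    def dump(self, path):
        # Serialize for the visualization frontend to load later.
        with open(path, "w") as f:
            json.dump(self.records, f)

# The "few lines of code" added to a training script might then be:
logger = ActivationLogger(layers_to_log=["fc_last"])
logger.log("fc_last", 38, [0.1, 0.0, 0.7])
logger.log("conv1", 38, [0.2])      # ignored: not a logged layer
print(len(logger.records))          # only the fc_last record is kept
```

The point of such a hook is that visualization data is produced as a side effect of training, so end users never write extra code themselves.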
ActiVis's main contributions include:
• A novel visual representation that unifies instance- and subset-level inspections of neuron activations, which facilitates comparison of activation patterns for multiple instances and instance subsets. Users can flexibly specify subsets using input features, labels, or any intermediate outcomes in a machine learning pipeline (Sect. 4.2).
• An interface that tightly integrates an overview of graph-structured complex models and local inspection of neuron activations, allowing users to explore the model at different levels of abstraction (Sect. 4.3).
• A deployed system scaling to large datasets and models (Sect. 4.4).
• Case studies with Facebook engineers and data scientists that highlight how ActiVis helps them with their work, and usage scenarios that describe how ActiVis may work with different models (Sect. 6).

2 RELATED WORK

2.1 Machine Learning Interpretation through Visualization

As the complexity of machine learning algorithms increases, many researchers have recognized the importance of model interpretation and developed interactive tools to help users better understand them [9, 13, 21, 24, 33, 37]. While overall model accuracy can be used to select models, users often want to understand why and when a model would perform better than others, so that they can trust the model and know how to further improve it. In developing interpretation tools, revealing relationships between data and models is one of the most important design goals [29, 30]. Below we present two important analytics strategies that existing works adopt to help users understand how data respond to machine learning models.

Instance-based exploration. A widely-used approach to understanding complex algorithms is to track how an example (i.e., training or test instance) behaves inside the models. Kulesza et al.
[23] presented an interactive system that explains how models made predictions for each instance. Amershi et al. [3] developed ModelTracker, a visualization tool that shows the distribution of instance scores for binary classification tasks and allows users to examine each instance individually. Researchers from the same group recently extended this work to multi-class classification tasks [32]. While the above-mentioned tools were designed to be model-agnostic, there are also tools designed specifically for neural network models [14, 18, 34]. These tools enable users to pick an instance, feed it to the models, and see how the parameters of the models change. We will describe them in more detail shortly, in Sect. 2.2.

Feature- and subset-based exploration. While instance-based exploration is helpful for tracking how models respond to individual examples, feature- or subset-based exploration enables users to better understand the relationships between data and models, as machine learning features make it possible for instances to be grouped and sliced in multiple ways. Researchers have utilized features to visually describe how the models captured the structure of datasets [8, 20, 21, 23]. Kulesza et al. [23] used the importance weight of each feature in the Naive Bayes algorithm, and Krause et al. [21] used partial dependence to show the relationships between features and results. To enable users to analyze results not only by predefined features, researchers have developed tools that enable users to specify instance subsets. Specifying groups can be a good first step for analyzing machine learning results [22], as it provides users with an effective way to analyze complex multidimensional data. In particular, people in the medical domain often perform similar processes, called cohort construction, and Krause et al. [22] developed an interactive tool that supports this process. McMahan et al.
[28] presented their internal tool that allows users to visually compare performance differences between models by subsets. MLCube [17] enabled users to interactively explore and define instance subsets using both raw data attributes and transformed features, and to compute evaluation metrics over the subsets.

2.2 Interactive Visualization of Deep Learning Models

Deep learning has become very popular, largely thanks to the state-of-the-art performance achieved by convolutional neural network models, commonly used for analyzing image datasets in computer vision. Since deep neural network models typically consist of many parameters, researchers have recognized deep learning interpretation as an important research area. A common approach is to show filters or activations for each neural network layer. This helps users understand what the models have learned in the hidden structure throughout the layers.

Interactive visualization tools. A number of interactive tools have been developed to effectively visualize activation information. Tzeng and Ma [36] developed one of the first visualization tools designed for neural network models. While it did not target deep networks, it represented each neuron as a node and visualized a given instance's activations. This idea has been extended to deep neural networks. Karpathy [18] visualized the activations for each layer of a neural network on his website. Harley [14] developed an interactive prototype that shows activations for a given instance. Smilkov et al. [34] developed an interactive prototype for educational purposes, called TensorFlow Playground, which visualizes training parameters to help users explore how models process a given instance to make predictions. However, these tools do not scale to the large datasets or the complex models commonly used in industry.

Towards scalable visualization systems.
CNNVis [26] is an interactive visual analytics system designed for convolutional networks. It models neurons as a directed graph and employs several techniques to make it scalable. For example, it uses hierarchical clustering to group neurons and bi-directional edge bundling to summarize edges among neurons. It also computes average activations for instances from the same class. However, users cannot feed instances into the system to perform instance-based analysis, which is an effective strategy for understanding machine learning models.

Another way of handling a large number of neurons is to employ dimensionality reduction techniques. By projecting a high-dimensional vector into two-dimensional space, we can better represent the high-dimensional nature of deep neural network models. Rauber et al. [31] studied how 2-D projected views of instance activations and neuron filters can help users better understand neural network models. Google's Embedding Projector [35] tool, which is integrated into their TensorFlow deep learning framework [1], provides an interactive 3-D projection with some additional features (e.g., similar instance search). ReVACNN [10] is an interactive visual analytics system that uses dimensionality reduction for convolutional networks. While CNNVis [26] uses clustering to handle a large number of neurons, ReVACNN shows both individual neurons and a 2-D embedded space (through t-SNE). The individual neuron view helps users explore how individual neurons respond to a user-selected instance; the projected view can help them get a visual summary of instance activations. However, these two views work independently. It is difficult for users to combine their analyses, or to compare multiple instances' neuron activations.

3 ANALYTICS NEEDS FOR INDUSTRY-SCALE PROBLEMS

The ActiVis project started in April 2016.
Since its inception, we have conducted participatory design sessions with over 15 Facebook engineers, researchers, and data scientists across multiple teams to learn about their visual analytics needs. Together, we collaboratively design and develop ActiVis and iteratively improve it. In Sect. 3.1, we describe the workflow of how machine learning models are typically trained and used at Facebook, and how results are interpreted. This discussion provides the background information and context for which visualization tools may help improve deep learning model interpretation. In Sect. 3.2, we summarize the main findings from our participatory design sessions, highlighting six key design challenges that stem from Facebook's need to work with large-scale datasets, complex deep learning model architectures, and diverse analytics needs. These challenges have been inadequately addressed by current deep learning visualization tools, and they motivate and shape our design goals for ActiVis, which we describe in Sect. 4.1.

3.1 Background: Machine Learning Practice at Facebook

Facebook uses machine learning for some of their products. Researchers, engineers, and data scientists from different teams at Facebook perform a wide range of machine learning tasks. We first describe how Facebook's machine learning platform helps users train models and interpret their results. Then, we present findings from our discussions with machine learning users and their common analytics patterns in interpreting machine learning models. These findings guide our discovery of the design challenges that ActiVis aims to address.

3.1.1 FBLearner Flow: Facebook's Machine Learning Platform

To help engineers, including machine learning non-experts, more easily reuse algorithms in different products and manage experiments with ease, Facebook built a unified machine learning platform called FBLearner Flow [4, 12].
It supports many machine learning workflows. Users can easily train models and see their results using the FBLearner Flow interface without writing any code. For example, users can train a model by picking a relevant workflow from a collection of existing workflows and specifying several input parameters for the selected workflow (e.g., location of the training dataset, learning parameters). The FBLearner Flow interface is particularly helpful for users who want to apply existing machine learning models to their datasets without knowing their internal details. Once the training process is done, the interface provides high-level information to aid result analysis (e.g., precision, accuracy). To help users interpret the results from additional aspects, several other statistics are available in the interface (e.g., partial dependence plots). Users can also inspect models' internal details via interactive visualization (e.g., for decision trees) [4]. As deep neural network models gain popularity, developing visualizations for their interpretation is a natural next step for FBLearner Flow.

3.1.2 Analytics Patterns for Interpretation

To better understand how machine learning users at Facebook interpret model results, and how we may design ActiVis to better support their analysis, we conducted participatory design sessions with over 15
Instance-le vel exploration is especially useful when an instance is easy to interpret. For example, an instance consisting of text only is much easier to understand than an instance consisting of thousands of numerical features e xtracted from an end user’ s data. Subset-based analysis. Instance-based analysis, ho wev er , is insuf- ficient for all cases. Inspecting instances individually can be tedious, and sometimes hinder insight discov ery , such as when instances are associated with many hard-to-interpret numerical features. W e learned that some Facebook researchers find subset-based analysis to be more helpful for their work. F or example, suppose an instance represents an article that consists of many numerical features extracted from its attributes (e.g., length, popularity). Some users would like to under- stand ho w the models behave at higher-le vel categorization (e.g., by topic, publication date). In addition, some users have curated instance subsets. Understanding model behavior through such familiar subsets promotes their understanding. 3.2 Design Challenges Besides reaffirming the importance of two analysis strategies discussed abov e, and the need to support them simultaneously in A C T I V I S , we hav e identified additional design challenges through the participatory design sessions. W e summarize them into six key design challenges. Thus far , they hav e not been adequately addressed by existing deep learning visualization tools. And they shape the main design goals of A C T I V I S , which we will describe in Sect. 4.1. W e have labeled the six challenges C1 – C6 and have grouped them into three categories with the labels data , model , and analytics , which indicate the causes for which the challenges arise. C1. 
Diverse input sources and formats [DATA]
While deep learning has become popular because of its superior performance on image data, it has also been applied to many other data formats, including text and numerical features [2, 11, 16, 19]. Furthermore, a single model may jointly use multiple types of data at a time. For example, to classify a Facebook post, a model may jointly leverage its textual content, attached photos, and user information, each of which may be associated with many data attributes [2]. Working with such a variety of data sources and formats opens up many opportunities for model interpretation; for example, we may be able to more easily categorize instances using their associated numerical features, which can be more readily understood, instead of going the harder route of using image-based features.

C2. High data volume [DATA]
Facebook, like many other companies, has a large amount of data. The size of training data often reaches billions of rows and thousands of features. This sheer size of data renders many existing visualization tools unusable, as they are often designed to visualize the whole dataset.

C3. Complex model architecture [MODEL]
Many existing visualization tools for deep learning models assume simple linear architectures where data flow linearly from the input layer to the output layer (e.g., a series of convolution and max-pooling layers in AlexNet) [10, 26, 38]. However, most practical model architectures deployed in industry are very complex [11]; they are often deep and wide, consisting of many layers, neurons, and operations.

C4. A great variety of models [MODEL]
Researchers and engineers at Facebook develop and evaluate models for products every day. It is important for visualization tools to be generalizable so they can work with many different kinds of models.
A visualization system would likely be impractical to use or to deploy if a small change to a model required significant changes to existing code or special-case handling.

C5. Diverse subset definitions [ANALYTICS]
When performing subset-based analysis, users may want to define subsets in many different ways. Since there are a large number of input formats and input features, there are numerous ways to specify subsets. Instead of providing a fixed set of ways to define subsets, it is desirable to make this process flexible so that users can define subsets that are relevant to their tasks and goals.

C6. Simultaneous need for performing instance- and subset-level analysis [ANALYTICS]
Instance- and subset-based analyses are complementary strategies, and it is important to support both at the same time. Instance-based analysis helps users track how an individual instance behaves in the models, but it is tedious to inspect many instances one by one. By specifying subsets and enabling their comparison with individual instances, users can learn how the models respond to many different slices of the data.

4 ACTIVIS: VISUAL EXPLORATION OF NEURAL NETWORKS

Through the design challenges we identified (in Sect. 3.2) in our participatory design sessions with researchers, engineers, and data scientists at Facebook, we design and develop ActiVis, a novel interactive visual tool for exploring a wide range of industry-scale deep neural network models. In this section, we first present three main design goals distilled from our conversations with Facebook participants (Sect. 4.1). Then, for each design goal, we elaborate on how ActiVis achieves it through its system design and visual exploration features (Sects. 4.2–4.4). We label the three design goals G1–G3.

4.1 Design Goals

G1. Unifying instance- and subset-based analysis to facilitate comparison of multiple instance activations.
From our participatory design sessions, we learned that both instance- and subset-based analysis are useful and complementary. We aim to support subset-level exploration by enabling users to flexibly define instance subsets for different data types (C1, C5), e.g., a set of documents that contain a specific word. Subset-based analysis also allows users to explore datasets at a higher level of abstraction, scaling to billion-scale data or larger (C2). Furthermore, we would like to unify instance- and subset-level inspections to facilitate comparison of multiple instances and groups of instances in a single view (C6).

G2. Tight integration of overview of model architecture and localized inspection of activations.
Industry-scale deep neural network models are often very complex, consisting of many operations (C3). Visualizing every detail and activation value for all intermediate layers can overwhelm users. Therefore, we aim to present the architecture of the models as a starting point of exploration, and let users switch to the detailed inspection of activations.

G3. Scaling to industry-scale datasets and models through flexible system design.
For ActiVis to work with the many different large-scale models and datasets used in practice, it is important for the system to be flexible and scalable. We aim to support as many different kinds of data types and classification models as FBLearner currently does (e.g., image, text, numerical) (C1, C4). We would like to achieve this by developing a flexible, modularized system that allows developers to use ActiVis for their models with simple API functions, while addressing visual and computational scalability challenges through a multipronged approach (C2, C3).

Fig. 2. ActiVis integrates multiple coordinated views. A. The computation graph summarizes the model architecture. B.
The neuron activation panel's matrix view displays activations for instances, subsets, and classes (at B1), and its projected view shows a 2-D t-SNE projection of the instance activations (at B2). C. The instance selection panel displays instances and their classification results; correctly classified instances are shown on the left, misclassified on the right. Clicking an instance adds it to the neuron activation matrix view. The dataset used is from the public TREC question answering data collections [25]. The trained model is a word-level convolutional model based on [19].

4.2 Exploring Neuron Activations by Instance Subsets

Drawing inspiration from existing visualizations [14, 18, 26, 38], ActiVis supports visualization for individual instances. However, it is difficult for users to spot interesting patterns and insights if they can only visualize one instance at a time. For example, consider a hidden layer consisting of 100 neurons. The neuron activations for an instance form a 100-dimensional vector of numerical values, where each element in the vector does not have any specific meaning on its own. Instead, if multiple vectors of activation values are presented together, the user may more readily derive meaning by comparing them. For example, users may find that some dimensions respond more strongly to certain instances, or that some dimensions are negatively correlated with certain classes.

A challenge in supporting the comparison of multiple instances stems from the sheer number of data instances; it is impossible to present activations for all instances. To tackle this challenge, we enable users to define instance subsets. We then compute the average activations for instances within each subset. The vector of average activations for a subset can then be placed next to the vectors of other instances or subsets for comparison. The neuron activation matrix, shown at Fig.
2B.1, illustrates this concept of comparing multiple instances and instance subsets, using the TREC question classification dataset [25], available at http://cogcomp.cs.illinois.edu/Data/QA/QC/. The dataset consists of 5,500 question sentences, each labeled with one of six categories (e.g., is a question asking about a location?). Fig. 2B shows the activations for the last hidden layer of the word-level CNN model [7, 19]. Each row represents either an instance or a subset of instances. For example, the first row represents the subset of instances whose true class is 'DESC' (descriptions). Each column represents a neuron. Each cell (circle) is a neuron activation value for a subset; a darker circle indicates stronger activation. This matrix view exposes the hidden relationships between neurons and data. For instance, a user may find that a certain neuron is highly activated by instances whose true class is 'LOC'.

Flexible subset definition. In ACTIVIS, users can flexibly define instance subsets. A subset can be specified using multiple properties of the instances, in many different ways. Example properties include raw data attributes, labels, features, textual content, output scores, and predicted labels. Our datasets consist of instances with many features and a combination of different data types. Flexible subset definition enables users to analyze models from different angles. For example, for instances representing text documents, the user may create a subset of documents that contain a specific phrase. For instances containing numerical features, users can specify conditions using operations similar to relational selections in databases (e.g., age > 20, topic = 'sports'). By default, a subset is created for each class (e.g., a subset for the 'DESC' class).

Sorting to reveal patterns. The difficulty of recognizing patterns increases with the number of neurons.
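The relational-selection style of subset definition described above can be sketched as a simple predicate filter. This is a minimal illustration only; the data fields and helper function are assumptions, not ACTIVIS's actual API:

```python
# Minimal sketch of flexible subset definition via predicates.
# Instances are plain dicts; the field names here are illustrative
# assumptions, since ACTIVIS's real data model is not published.

def make_subset(instances, predicate):
    """Return the indices of instances satisfying the predicate."""
    return [i for i, inst in enumerate(instances) if predicate(inst)]

instances = [
    {"text": "How far is it to Denver?", "label": "LOC",  "age": 25},
    {"text": "What is a cab?",           "label": "DESC", "age": 19},
    {"text": "Where is the Louvre?",     "label": "LOC",  "age": 42},
]

# Relational-style condition, e.g., age > 20 AND label = 'LOC'
subset = make_subset(instances, lambda x: x["age"] > 20 and x["label"] == "LOC")

# Text condition, e.g., documents containing a specific phrase
where_questions = make_subset(instances, lambda x: "Where" in x["text"])
```

Each resulting index list identifies one subset, whose average activations can then be displayed as a single row in the matrix view.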
ACTIVIS allows users to sort neurons (i.e., columns) by their activation values. For example, in Fig. 3, the neurons are sorted by the average activation values for the class 'LOC'. Sorting facilitates activation comparison and helps reveal patterns, such as instances whose activation patterns are positively correlated with their true class (e.g., instances #94 and #30 correlate with the 'LOC' class in Fig. 3).

Fig. 3. Sorting neurons (columns) by their average activation values for the LOC (location) class helps users more easily spot instances whose activation patterns are positively correlated with that of the class, e.g., instances #94 and #30 (see green arrows).

Fig. 4. Hovering over an instance subset (e.g., for the NUMber class) highlights its instances (purple dots) in the t-SNE projected view.

2-D projection of activations. To help users visually examine instance subsets, ACTIVIS provides a 2-D projected view of instance activations. Projection of high-dimensional data into 2-D space has been considered an effective exploration approach [9, 10, 31, 35]. ACTIVIS performs t-distributed stochastic neighbor embedding (t-SNE) [27] on instance activations. Fig. 2B.2 shows an example where each dot represents an instance (colored by its true class), and instances with similar activation values are placed closer together by t-SNE. The projected view complements the neuron activation matrix view (Fig. 2B.1). Hovering over a subset's row in the matrix highlights the subset's instances in the projected view, allowing the user to see how instances within the subset are distributed. In the projected view, hovering over an instance displays its activations; clicking that instance adds it to the matrix view as a new row.
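The projected view's computation can be approximated with scikit-learn's t-SNE implementation, which the backend described in Sect. 4.4.3 also uses. This is a sketch under stated assumptions: the activation matrix here is random stand-in data, not real model output.

```python
# Sketch of the 2-D t-SNE projection of instance activations,
# using scikit-learn (the library ACTIVIS's backend uses for t-SNE).
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for real activations: 200 instances x 100 neurons
activations = rng.normal(size=(200, 100))

tsne = TSNE(n_components=2, perplexity=30, init="random", random_state=0)
coords = tsne.fit_transform(activations)  # shape: (200, 2)

# Each row of `coords` is the (x, y) position of one instance dot in the
# projected view; instances with similar activations land close together.
```

In the interface, each point would additionally be colored by its instance's true class, as in Fig. 2B.2.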
4.3 Interface: Tight Integration of Model, Instances, and Activation Visualization

The above visual representation of activations is the core of our visual analytics system. To help users interactively specify where to start their exploration of a large model, we designed and developed an integrated system interface. As depicted in Fig. 2, the interface consists of multiple panels, which we describe below.

A: Overview of Model Architecture. Deep learning models often consist of many operations, which makes it difficult for users to fully understand their structure. We aim to provide users with an overview of the model architecture, so they can first make sense of a model before moving on to the parts they are interested in. Deep neural network models are often represented as computation graphs (DAGs), as in many deep learning frameworks such as Caffe2 (https://caffe2.ai/), TensorFlow [1], and Theano [6]. The frameworks provide a set of operators (e.g., convolution, matrix multiplication, concatenation) for building machine learning programs, and model developers (who create new machine learning workflows for FBLearner Flow) write programs using these building blocks. Presenting this graph to users helps them first understand the structure of a model and find interesting layers whose activations they want to explore in detail.

There are several possible ways to visualize computation graphs. One approach is to represent operators as nodes and variables as edges. This approach has gained popularity thanks to its adoption by TensorFlow. Another is to treat both operators and variables as nodes. The graph then becomes bipartite: the direct neighbors of an operator node are always variable nodes, and the neighbors of a variable node are always operator nodes. Both approaches have their pros and cons.
While the first approach yields a compact representation by reducing the number of nodes, the second, a classical way to represent programs and diagrams, makes it easier to track data. For ACTIVIS, variable nodes should be easy to locate, since we present activations for a selected variable. Therefore, we decided to represent the graph using the second approach.

The visualization of the computation graph is shown in the top panel (Fig. 2A). Data flows from left (input) to right (output). Each node represents either an operator (dark rectangle) or a tensor (circle). To explore this medium-sized graph (often > 100 nodes), users can zoom and pan with the mouse. Hovering over a node shows its full name; clicking it shows its corresponding activations in the neuron activation panel.

B: Activation for Selected Node. When users select a node of interest in the computation graph, the corresponding neuron activation panel (Fig. 2B) is added below the computation graph panel. The neuron activation panel has three subpanels: (0) the names of the selected node and its neighbors, (1) the neuron activation matrix view, and (2) the projected view. The left subpanel shows the name of the selected variable node and its neighbors; users can hover over a node to highlight its location in the computation graph above. The neuron matrix view (Fig. 2B.1) and projected view (Fig. 2B.2) show instance activations for the selected node, as described in Sect. 4.2. Users can select multiple nodes and visually compare their activation patterns. Fig. 5 illustrates how users can visually explore the hidden structure that models learn across multiple layers. The figure shows three layers, from top to bottom: the second-to-last hidden layer, which concatenates multiple maxpool layers [19]; the last hidden layer; and the output layer.
As shown in the figure, the layers' projected views show that as data flows through the network, from input (top) to output (bottom), neuron activation patterns gradually become more discernible and clustered.

C: Instance Selection. The instance selection panel helps users get an overview of instances and their prediction results, and determine which ones to add to the neuron activation view for further exploration and comparison. The panel is located on the right side of the interface. It visually summarizes prediction results: each square represents an instance, and instances are vertically grouped by their true label. Within a true-label row group, the left column shows correctly classified instances, sorted by prediction score in descending order (top to bottom, and left to right within each row); the right column shows misclassified instances. An instance's fill color represents its true label; its border color, the predicted label. Hovering over an instance shows a tooltip with basic information about the instance (e.g., textual content, prediction scores).

The panel also helps users determine which instances to add to the activation view for further exploration. By hovering over an instance box, users can see the instance's activations: a new row is added to the activation view presenting the activation values for the selected instance. When the mouse leaves the box, the added row disappears. To make a row persistent, users can simply click the box.

Fig. 5. Users can simultaneously visualize and compare multiple layers' activations. Shown here, from top to bottom, are: the second-to-last hidden layer, the last hidden layer, and the output layer. Their projected views show that as instances flow through the network from input (top) to output (bottom), their activation patterns gradually become more discernible and clustered (in the projected view).
In a similar fashion, users can add many rows by clicking instance boxes. They can then compare activations across multiple instances, and between individual instances and groups of instances.

4.4 Deploying ACTIVIS: Scaling to Industry-scale Datasets and Models

We have deployed ACTIVIS on Facebook's machine learning platform. Developers who want to use ACTIVIS for their model can easily do so by adding only a few lines of code, which instruct the model's training process to generate the information needed for ACTIVIS's visualization. Once model training has completed, the FBLearner Flow interface provides the user with a link to ACTIVIS to visualize and explore the model; the link opens in a new web browser window.

ACTIVIS is designed to work with classification tasks that use deep neural network models. As complex models and large datasets are commonly used at Facebook, it is important that ACTIVIS be scalable and flexible, so that engineers can easily adopt it for their models. This section describes our approaches to building and deploying ACTIVIS on FBLearner, Facebook's machine learning platform.

4.4.1 Generalizing to Different Models and Data Types

One of our main goals is to support as many different kinds of data types and models as FBLearner currently does (e.g., images, text, numerical). The key challenge is to enable existing deployed models to generate the data needed for ACTIVIS with as little modification as possible. Without careful design, we would have to add a large amount of model-specific code to make ACTIVIS work with different models. To tackle this challenge, we modularize the data generation process and define API functions that model developers can simply call in their code to activate ACTIVIS for their models.
In practice, for a developer to use ACTIVIS for a model, only three function calls need to be added (i.e., calling the preprocess, process, and postprocess methods). For example, developers can specify the list of variable nodes that users can explore as an argument of the preprocess function (described in detail in Sect. 4.4.2). Furthermore, developers can leverage user-defined functions to specify how subsets are defined in ACTIVIS, a capability particularly helpful for more abstract, unstructured data types such as image and audio. For example, developers may leverage the output of an object recognition algorithm that detects objects (e.g., cats, dogs) to define image subsets (e.g., the subset of images that contain dogs).

4.4.2 Scaling to Large Data and Models

ACTIVIS addresses visual and computational scalability challenges through multiple complementary approaches. Some were introduced in earlier sections (e.g., Sect. 4.2), such as ACTIVIS's overarching subset-based analysis, and the simultaneous use of the neuron matrix (for individual neuron inspection) and the projected view (for the case of many neurons). We elaborate on some of our other key ideas below.

Selective precomputation for variable nodes of interest. Industry-scale models often consist of a large number of operations (i.e., variable nodes), up to hundreds. Although any variable node can be visualized in the activation visualization, computing activations for all of them would require significant computation time and storage space. We learned from our discussions with experts and design sessions with potential users that typically only a few variable nodes in a model are of particular interest (e.g., the last hidden layer in a CNN). Therefore, instead of generating activations for all variable nodes, we let model developers specify their own default set of variable nodes.
Model developers can simply specify them as an argument of the preprocess method. To explore variable nodes not included in the default set, a user can add them by specifying the variable nodes in the FBLearner Flow interface; such nodes are then made available in the computation graph (highlighted in yellow).

User-guided sampling and visual instance selection. For billion-scale datasets, it is undesirable to display all data points in the instance selection panel. Furthermore, we learned from our design sessions that researchers and engineers are primarily interested in a small number of representative examples, such as "test cases" that they have curated (e.g., instances that should be labeled as class 'LOC' by all well-performing models). To meet such needs, by default we present a sample of instances in the interface (around 1,000), which meets the practical needs of most Facebook engineers. Users may also guide the sampling to include arbitrary examples that they specify (e.g., their test cases).

Computing the neuron activation matrix for large datasets. The main computational challenge of ACTIVIS is computing the neuron activation matrix over large datasets. Here we describe our scalable approach, whose time complexity is linear in the number of data instances. We first create a matrix S (#instances × #subsets) that describes all instance-to-subset mappings. Once a model predicts labels for instances, it produces an activation matrix A (#instances × #neurons) for each variable node. By multiplying these two matrices (i.e., S^T A), followed by normalization, we obtain a matrix containing all subsets' average neuron activation values, which are visualized in the neuron matrix view. As the number of instances dominates, the computation's time complexity is linear in the number of instances. In practice, this computation takes roughly the same amount of time as testing a model.
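The S^T A computation, followed by per-subset normalization and the neuron sorting described in Sect. 4.2, can be sketched in a few lines of NumPy. The matrices here are random stand-in data (the paper does not publish this code), but the shapes and operations follow the description above:

```python
import numpy as np

rng = np.random.default_rng(0)
n_instances, n_subsets, n_neurons = 1000, 6, 100

# S: instance-to-subset membership indicator (#instances x #subsets);
# here, each instance belongs to exactly one randomly chosen subset.
S = np.zeros((n_instances, n_subsets))
S[np.arange(n_instances), rng.integers(0, n_subsets, n_instances)] = 1.0

# A: activation matrix for one variable node (#instances x #neurons)
A = rng.random((n_instances, n_neurons))

# S^T A sums activations per subset; dividing by subset sizes
# normalizes to per-subset average activations (#subsets x #neurons).
subset_sizes = S.sum(axis=0)             # number of instances per subset
avg = (S.T @ A) / subset_sizes[:, None]  # average activation matrix

# Sorting neurons (columns) by one subset's average activation,
# as in Fig. 3 (there, sorted by the 'LOC' class; here, subset 0):
order = np.argsort(-avg[0])              # descending order
sorted_avg = avg[:, order]
```

Both S^T A and the normalization touch each instance once, which is why the overall cost stays linear in the number of instances.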
We have tested ACTIVIS with many datasets (e.g., one with 5 million training instances). ACTIVIS can now scale to any data size that FBLearner supports (e.g., billion-scale or larger).

4.4.3 Implementation Details

The visualization and interactions are implemented mainly with React.js (https://facebook.github.io/react/); we additionally use a few D3.js v4 components (https://d3js.org/). The computation graph is visualized using Dagre (https://github.com/cpettitt/dagre), a JavaScript library for rendering directed graphs. All backend code is implemented in Python (including scikit-learn, http://scikit-learn.org/, for t-SNE), and the activation data generated by the backend are passed to the interface in JSON format.

Fig. 6. Version 1 of ACTIVIS, showing an instance's neuron activation strengths, encoded using color intensity. A main drawback of this design was that users could only see the activations for a single instance at a time; activation comparison across multiple instances was not possible.

Fig. 7. Version 2 of ACTIVIS, which unified instance- and subset-level activation visualization. This design was too visually overwhelming and did not scale to complex models, as it allocated a matrix block for each operator; a complex model could have close to a hundred operators.

5 Informed Design Through Iterations

The current design of ACTIVIS is the result of twelve months of investigation and development effort through many iterations.

Unifying instances and subsets to facilitate comparison of multiple instances. The first version of ACTIVIS, depicted in Fig. 6, visualizes activations for all layers (each column group represents a single layer). A main drawback of this design is that users can only see the activations for a single instance at a time; they cannot compare multiple instances' activations.
While, for the subsets, we use an approach similar to ACTIVIS's current design (each dot represents the average values for the subset), we encode the activations for a given instance using background color (here, in green). This means the visualization cannot support activation comparison across multiple instances. This finding prompted us to unify the treatment of instances and subsets to enable comparison across them. Fig. 7 shows our next design iteration, which implements this idea.

Separating program and data to handle complex models. Although the updated version (Fig. 7) shows activations for multiple instances, which helps users explore more information at once, it becomes visually overwhelming when visualizing large, complex models. Some engineers expressed concern that this design might not generalize well to different models. Also, engineers are often interested in only a few variable nodes, rather than many. Therefore, we decided to separate the visualization of the model architecture from the activations for a specific variable node.

Presenting a 2-D projection of instances. One researcher suggested that ACTIVIS provide more detail for each neuron, in addition to average activations. Our first solution was to present statistics (e.g., variance) and distributions for each neuron. However, some researchers cautioned that this approach could be misleading, because these summaries might not fully capture high-dimensional activation patterns. This prompted us to add the projected view (t-SNE), which enables users to better explore the high-dimensional patterns (see Fig. 4).
6 Case Studies & Usage Scenarios

To better understand how ACTIVIS may help Facebook machine learning users interpret deep neural network models, we recruited three Facebook engineers and data scientists to use the latest version of ACTIVIS to explore text classification models relevant to their work. We summarize key observations from these studies to highlight ACTIVIS's benefits (Sect. 6.1). Then, based on observations and feedback from these users and others who participated in our earlier participatory design sessions, we present example usage scenarios for ranking models to illustrate how ACTIVIS would generalize (Sect. 6.2).

6.1 Case Studies: Exploring Text Classification Models with ACTIVIS

6.1.1 Participants and Study Protocol

We recruited three Facebook engineers and data scientists to use our tools (names substituted for privacy):

Bob is a software engineer with expertise in natural language processing. He is experimenting with applying text classification models to some Facebook experiences, such as detecting intents from a text snippet, like understanding when the user may want to go somewhere [2]. For example, if a user writes "I need a ride", Bob may want the models to discover whether the user needs transportation to reach a destination. He is interested in selecting the best models by experimenting with many parameters and a few different models, as in [16, 19].

Dave is a relatively new software engineer. Like Bob, he is also working with text classification models for user intent detection, but unlike Bob, he is more interested in preparing training datasets from large collections of databases.

Carol is a data scientist who holds a Ph.D. in natural language processing. Unlike Bob and Dave, she works with many different machine learning tasks, focusing on textual data.
We had a 60-minute session with each of the three participants. For the first 20 minutes, we asked them a few questions about their typical workflows, and how they train models and interpret results. Then we introduced them to ACTIVIS by describing its components. The participants used their own datasets and models, available from FBLearner Flow. After the introduction, the participants used ACTIVIS while thinking aloud; they also gave us feedback on how we could further improve ACTIVIS. We recorded audio during the entire session and video for the last part.

6.1.2 Key Observations

We summarize our key observations from interacting with the three participants into three themes, each highlighting how our tool helped them with their analysis.

Spot-checking models with user-defined instances and subsets. ACTIVIS supports flexible subset definition. This feature was developed based on the common model development pattern in which practitioners curate "test cases" that they are familiar with, and for which they know the associated labels. For example, the text snippet "Let's take a cab" should be classified as a positive instance of transportation-related intent. Both Bob and Dave indeed found this feature useful (i.e., they also had their own "test cases"), and they appreciated the ability to specify and use their own cases. This helps them better understand whether their models are working well, by comparing the activation patterns of their own instances with those of other instances in the positive or negative classes. Bob's usage of ACTIVIS and his comments echo and support the need for subset-level visualization and exploration, currently inadequately supported by existing tools.

Graph overview as a crucial entry point to model exploration.
From our early participatory design sessions, we learned that ACTIVIS's graph overview was important for practitioners who work with complex models but whose tasks require them to focus on only specific components of those models. Bob, who works with many different variations of text classification models, knew that his model mainly uses convolution operations, and was curious to see how the convolution works in detail. When he launched ACTIVIS, he first examined the model architecture around the convolution operators using the computation graph panel. He appreciated that he could see how model training parameters are used in the model, which helped him develop a better understanding of its internal working mechanism. For example, by exploring the graph, he found how and where padding is used in the model [7]. After he got a better sense of how the model functions around the convolution operators, he examined the activation patterns of the convolution output layer. This example shows that the graph overview is important for understanding complex architectures and locating the parts relevant to the user's tasks. In other words, the graph serves as an important entry point for Bob's analysis. The assumption made by existing tools that users are familiar with their models may not hold in real-world large-scale deployment scenarios.

Visual exploration of activation patterns for evaluating model performance and for debugging hints. One of the main components of ACTIVIS is the visual representation of activations that helps users easily recognize patterns and anomalies. As Carol interacted with the visualization, she gleaned a number of new insights, and a few hints for how to debug deep learning models in general. She interactively selected many different instances and added them to the neuron activation matrix to see how they activated neurons.
She found that the activation patterns for some instances are unexpectedly similar, even though the textual content of the instances seems very different. She also spotted that some neurons were not activated at all. She hypothesized that the model could be further improved by changing some of the training parameters, so she decided to modify them to improve the model. While the neuron activation panel helped Carol find models that could be further improved, Bob found interesting patterns in the activations of the convolution output layer. He quickly found that some particular words are highly activated, while other words, which he thought could be highly activated, do not respond much. This helped him identify words that are potentially more effective for classification. The examples above demonstrate the power of visual exploration: ACTIVIS helps users recognize patterns by interacting with instances and instance subsets they are familiar with.

6.2 Usage Scenario: Exploring Ranking Models

As there are many potential uses for ACTIVIS at Facebook, we also talked with a number of researchers and engineers on different teams to understand how they might adopt ACTIVIS. Below, we present a usage scenario of ACTIVIS for exploring ranking models, based on those discussions. We note that the scenario strongly resembles others that we have discussed so far; this is encouraging because enabling ACTIVIS to generalize across teams and models is one of our main goals.

Alice is a research scientist working with ranking models, one of the important machine learning tasks in industry. Ranking models can be used to recommend relevant content to users by analyzing a large number of numerical features extracted from databases [5, 15]. Alice is experimenting with deep neural network models to evaluate how they work for a number of ranking tasks.
She often performs subset-based analysis when examining model performance, such as defining subsets based on categories of page content. Subset-based analysis is essential for Alice, because she works with very large amounts of training data (billions of data points, thousands of features). ACTIVIS's instance-based exploration feature is not yet helpful for Alice, since she is still familiarizing herself with the data and has not identified instances that she would like to use for spot-checking the model. In ACTIVIS, Alice is free to use either or both of instance- and subset-based exploration. For new, unfamiliar datasets, Alice finds it much easier to start her analysis at a high level, then drill down into subsets using attributes or features.

Alice has trained a fully-connected deep neural network model with some default parameters. When she launches ACTIVIS, she first examines the output layer to see how the activation patterns for the positive and negative classes may differ. To her surprise, they look similar. Furthermore, by inspecting the neuron activation matrix view, she realizes that many neurons are not activated at all; their activation values are close to 0. This signals that the model may be using more neurons than necessary. So, she decides to train additional models with different parameter combinations (e.g., fewer neurons) to relieve this issue. The performance of some of these models indeed improves.

Happy with this improvement, Alice moves on to perform deeper analysis of the trained models. She first creates a number of instance subsets using features. She utilizes the 50 top features known to be important for ranking. For categorical features, she defines a subset for each category value. For numerical features, she quantizes them into a small number of subsets based on the feature value distribution.
ACTIVIS's neuron activation matrix view visualizes how the subsets that Alice has defined activate the neurons. Maximizing the matrix view to take up the entire screen (and minimizing the computation graph view), Alice visually explores the activation matrix and identifies a number of informative, distinguishing activation patterns. For example, one neuron is highly activated for a single subset, and much less so for other subsets, suggesting that neuron's potential predictive power. With ACTIVIS, Alice can train models that perform well and understand how the models capture the structure of datasets by examining the relationships between features and neurons.

7 Discussion and Future Work

Visualizing gradients. Examining gradients is one of the effective ways to explore deep learning models [10, 18]. It is straightforward to extend ACTIVIS to visualize gradients by replacing activations with gradients. While activations represent forward data flow from input to output layers, gradients represent backward flow. Gradients would help developers locate neurons or datasets where the models do not perform well.

Real-time subset definition. For ACTIVIS to work with a new subset, it needs to load the dataset into RAM to check which instances satisfy the subset's conditions. Currently, it is not a high priority for this process to be performed in real time, because users often have pre-determined subsets to explore. We plan to integrate dynamic filtering and searching capabilities, to speed up both subset definition and instance selection.

Automatic discovery of interesting subsets. With ACTIVIS, users can flexibly specify subsets in infinitely many ways. One of the engineers commented that ACTIVIS could help suggest interesting subsets for exploration, based on heuristics or measures.
For example, for text datasets, such a subset could include phrases whose activation patterns are very similar or very different to those of a given instance or class.

Supporting input-dependent models. An interesting research direction is to extend ACTIVIS to support models that contain variable nodes whose number of neurons changes depending on the input (e.g., the number of words in a document), and to study the relationships between neurons and subsets in such cases.

Understanding how ACTIVIS informs model training. We plan to conduct a longitudinal study to better understand ACTIVIS's impact on Facebook's machine learning workflows, such as how ACTIVIS may inform the model training process. For example, a sparse neuron matrix may indicate that a model is using more neurons than needed, which could inform engineers' decisions about hyperparameter tuning.

8 Conclusion

We presented ACTIVIS, a visual analytics system for deep neural network models. We conducted participatory design sessions with over 15 researchers and engineers across many teams at Facebook to identify key design challenges, and based on them, we distilled three main design goals: (1) unifying instance- and subset-level exploration; (2) tightly integrating the model architecture overview with localized activation inspection; and (3) scaling to industry-scale data and models. ACTIVIS has been deployed on Facebook's machine learning platform. We presented case studies with Facebook engineers and data scientists, and usage scenarios of how ACTIVIS may be used in different applications.

Acknowledgments

We thank the Facebook Applied Machine Learning Group, especially Yangqing Jia, Andrew Tulloch, Liang Xiong, and Zhao Tan, for their advice and feedback. This work is partly supported by the NSF Graduate Research Fellowship Program under Grant No. DGE-1650044.

References

[1] M. Abadi, A. Agarwal, P. Barham, E.
Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint, 2016.

[2] A. Abdulkader, A. Lakshmiratan, and J. Zhang. Introducing DeepText: Facebook's text understanding engine. https://code.facebook.com/posts/181565595577955/introducing-deeptext-facebook-s-text-understanding-engine/, 2016. Accessed: 2017-06-26.

[3] S. Amershi, M. Chickering, S. M. Drucker, B. Lee, P. Simard, and J. Suh. ModelTracker: Redesigning performance analysis tools for machine learning. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI), pages 337–346. ACM, 2015.

[4] P. Andrews, A. Kalro, H. Mehanna, and A. Sidorov. Productionizing machine learning pipelines at scale. In ML Systems Workshop at the 33rd International Conference on Machine Learning (ICML), 2016.

[5] L. Backstrom. Serving a billion personalized news feeds. In 12th International Workshop on Mining and Learning with Graphs at the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016. Available at https://youtu.be/Xpx5RYNTQvg.

[6] J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio. Theano: A CPU and GPU math expression compiler. In Proceedings of the Python for Scientific Computing Conference (SciPy), 2010.

[7] D. Britz. Implementing a CNN for text classification in TensorFlow. http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow, 2015. Accessed: 2017-06-26.

[8] M. Brooks, S. Amershi, B. Lee, S. M. Drucker, A. Kapoor, and P. Simard. FeatureInsight: Visual support for error-driven feature ideation in text classification. In IEEE Conference on Visual Analytics Science and Technology (VAST), pages 105–112. IEEE, 2015.

[9] J. Choo, H. Lee, J. Kihm, and H. Park. iVisClassifier: An interactive visual analytics system for classification based on supervised dimension reduction. In IEEE Symposium on Visual Analytics Science and Technology (VAST), pages 27–34. IEEE, 2010.

[10] S. Chung, C. Park, S. Suh, K. Kang, J. Choo, and B. C. Kwon. ReVACNN: Steering convolutional neural network via real-time visual analytics. In Future of Interactive Learning Machines Workshop at the 30th Annual Conference on Neural Information Processing Systems (NIPS), 2016.

[11] P. Covington, J. Adams, and E. Sargin. Deep neural networks for YouTube recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, pages 191–198. ACM, 2016.

[12] J. Dunn. Introducing FBLearner Flow: Facebook's AI backbone. https://code.facebook.com/posts/1072626246134461/introducing-fblearner-flow-facebook-s-ai-backbone/, 2016. Accessed: 2017-06-26.

[13] M. Gleicher. Explainers: Expert explorations with crafted projections. IEEE Transactions on Visualization and Computer Graphics, 19(12):2042–2051, 2013.

[14] A. W. Harley. An interactive node-link visualization of convolutional neural networks. In Proceedings of the 11th International Symposium on Visual Computing, pages 867–877, 2015.

[15] X. He, J. Pan, O. Jin, T. Xu, B. Liu, T. Xu, Y. Shi, A. Atallah, R. Herbrich, S. Bowers, and J. Q. Candela. Practical lessons from predicting clicks on ads at Facebook. In Proceedings of the 8th International Workshop on Data Mining for Online Advertising, pages 1–9. ACM, 2014.

[16] A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov. Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759, 2016.

[17] M. Kahng, D. Fang, and D. H. P. Chau. Visual exploration of machine learning results using data cube analysis. In Proceedings of the Workshop on Human-In-the-Loop Data Analytics at the ACM SIGMOD International Conference on Management of Data. ACM, 2016.

[18] A. Karpathy. ConvNetJS. http://cs.stanford.edu/people/karpathy/convnetjs/, 2016. Accessed: 2017-06-26.

[19] Y. Kim. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.

[20] J. Krause, A. Perer, and E. Bertini. INFUSE: Interactive feature selection for predictive modeling of high dimensional data. IEEE Transactions on Visualization and Computer Graphics, 20(12):1614–1623, 2014.

[21] J. Krause, A. Perer, and K. Ng. Interacting with predictions: Visual inspection of black-box machine learning models. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 5686–5697. ACM, 2016.

[22] J. Krause, A. Perer, and H. Stavropoulos. Supporting iterative cohort construction with visual temporal queries. IEEE Transactions on Visualization and Computer Graphics, 22(1):91–100, 2016.

[23] T. Kulesza, M. Burnett, W.-K. Wong, and S. Stumpf. Principles of explanatory debugging to personalize interactive machine learning. In Proceedings of the 20th International Conference on Intelligent User Interfaces (IUI), pages 126–137. ACM, 2015.

[24] T. Kulesza, S. Stumpf, W.-K. Wong, M. M. Burnett, S. Perona, A. Ko, and I. Oberst. Why-oriented end-user debugging of naive Bayes text classification. ACM Transactions on Interactive Intelligent Systems (TiiS), 1(1):2, 2011.

[25] X. Li and D. Roth. Learning question classifiers. In Proceedings of the 19th International Conference on Computational Linguistics, pages 1–7. Association for Computational Linguistics (ACL), 2002.

[26] M. Liu, J. Shi, Z. Li, C. Li, J. Zhu, and S. Liu. Towards better analysis of deep convolutional neural networks. IEEE Transactions on Visualization and Computer Graphics, 23(1):91–100, 2017.

[27] L. v. d. Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.

[28] H. B. McMahan, G. Holt, D. Sculley, M. Young, D. Ebner, J. Grady, L. Nie, T. Phillips, E. Davydov, D. Golovin, S. Chikkerur, D. Liu, M. Wattenberg, A. M. Hrafnkelsson, T. Boulos, and J. Kubica. Ad click prediction: A view from the trenches. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1222–1230. ACM, 2013.

[29] K. Patel, N. Bancroft, S. M. Drucker, J. Fogarty, A. J. Ko, and J. Landay. Gestalt: Integrated support for implementation and analysis in machine learning. In Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology (UIST), pages 37–46. ACM, 2010.

[30] K. Patel, J. Fogarty, J. A. Landay, and B. Harrison. Investigating statistical machine learning as a tool for software development. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 667–676. ACM, 2008.

[31] P. E. Rauber, S. G. Fadel, A. X. Falcao, and A. C. Telea. Visualizing the hidden activity of artificial neural networks. IEEE Transactions on Visualization and Computer Graphics, 23(1):101–110, 2017.

[32] D. Ren, S. Amershi, B. Lee, J. Suh, and J. D. Williams. Squares: Supporting interactive performance analysis for multiclass classifiers. IEEE Transactions on Visualization and Computer Graphics, 23(1):61–70, 2017.

[33] M. T. Ribeiro, S. Singh, and C. Guestrin. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144. ACM, 2016.

[34] D. Smilkov, S. Carter, D. Sculley, F. B. Viégas, and M. Wattenberg. Direct-manipulation visualization of deep networks. In Workshop on Visualization for Deep Learning at the 33rd International Conference on Machine Learning (ICML), 2016.

[35] D. Smilkov, N. Thorat, C. Nicholson, E. Reif, F. B. Viégas, and M. Wattenberg. Embedding Projector: Interactive visualization and interpretation of embeddings. In Workshop on Interpretable Machine Learning in Complex Systems at the 30th Annual Conference on Neural Information Processing Systems (NIPS), 2016.

[36] F.-Y. Tzeng and K.-L. Ma. Opening the black box: Data driven visualization of neural networks. In IEEE Visualization, pages 383–390. IEEE, 2005.

[37] S. Van Den Elzen and J. J. Van Wijk. BaobabView: Interactive construction and analysis of decision trees. In IEEE Conference on Visual Analytics Science and Technology (VAST), pages 151–160. IEEE, 2011.

[38] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson. Understanding neural networks through deep visualization. In Workshop on Visualization for Deep Learning at the 33rd International Conference on Machine Learning (ICML), 2016.