Infrastructure for Usable Machine Learning: The Stanford DAWN Project

Peter Bailis, Kunle Olukotun, Christopher Ré, Matei Zaharia
Stanford DAWN Project
http://dawn.cs.stanford.edu/
21 May 2017

Abstract

Despite incredible recent advances in machine learning, building machine learning applications remains prohibitively time-consuming and expensive for all but the best-trained, best-funded engineering organizations. This expense comes not from a need for new and improved statistical models but instead from a lack of systems and tools for supporting end-to-end machine learning application development, from data preparation and labeling to productionization and monitoring. In this document, we outline opportunities for infrastructure supporting usable, end-to-end machine learning applications in the context of the nascent DAWN (Data Analytics for What's Next) project at Stanford.

1 Introduction and DAWN Project Goals

A Gilded Dawn for Machine Learning and Artificial Intelligence. We are in a golden age of machine learning and artificial intelligence. Sustained algorithmic advances, coupled with the availability of massive datasets and fast parallel computing, have led to breakthroughs in applications that would have been considered science fiction even a few years ago. Over the past five years, voice-driven personal assistants have become commonplace, image recognition systems have reached human quality, and autonomous vehicles are rapidly becoming a reality. Given these successes, there is no doubt that machine learning will transform most areas of our economy and society. Businesses, governments, and scientific labs are clamoring to see how machine learning can tackle their problems.

Unfortunately, although new machine learning (ML) applications are impressive, they are very expensive to build.
Every major new ML product, such as Apple Siri, Amazon Alexa, or Tesla Autopilot, requires large and costly teams of domain experts, data scientists, data engineers, and DevOps engineers. Even within organizations that have successfully employed ML, ML remains a rare and expensive commodity reserved for a small subset of teams and applications. Moreover, many ML models require huge amounts of training data, and obtaining such training data is highly challenging in many application domains. For example, even though an ML algorithm may achieve human accuracy in identifying pictures of dogs on the Internet (thanks to millions of available labeled images), the algorithm will not achieve the same accuracy identifying cancer in medical images unless an organization spends years of human expert time creating labeled training data. Finally, once an ML product is built, it requires substantial effort to deploy, operate, and monitor at scale, especially if critical business processes will rely on it. For example, how can a company make guarantees about its new automated diagnosis system, or monitor whether the system is performing as expected in practice? In summary, ML technology is at a stage similar to that of early digital computers, when armies of white-clad technicians labored to keep a small handful of machines operating in production: ML technology clearly has tremendous potential, but today, ML-powered applications are far too expensive to build for most domains.

Our Response: The DAWN Stack for End-to-End ML Development. To address this potential, our group at Stanford is beginning a new, five-year research project to design systems infrastructure and tools for usable machine learning, called DAWN (Data Analytics for What's Next).
Our goal is not to improve ML algorithms, which are almost always "good enough" for many important applications, but instead to make ML usable, so that small teams of non-ML experts can apply ML to their problems, achieve high-quality results, and deploy production systems that can be used in critical applications. Whereas today's ML successes have required large and costly teams of statisticians and engineers, we would like to make similar successes attainable for domain experts, such as a medical lab optimizing clinical procedures or a business group applying ML to its domain-specific problems. Major improvements in the usability of machine learning are mandatory to realize its potential. We ask: How can we enable anyone with domain expertise to build their own production-quality data products (without requiring a team of PhDs in machine learning, big data, or distributed systems, and without understanding the latest hardware)?

At first, our goal of usable machine learning might appear too ambitious: how can we expect one or two domain experts to match work that today requires teams of hundreds? Our observation is that such revolutions "democratizing" computing technology have happened before. For example, although textual search is a complex field requiring sophisticated algorithms and data structures, today search is ubiquitous. Non-expert users rely on search engines every day, and any developer can add search to an application by linking a library such as Lucene or Solr. These libraries offer good-enough results out of the box, as well as simple enough tuning options, to be usable by non-experts. Similarly, in the 1970s, relational databases revolutionized data management. Before these modern databases, organizations built computer applications using low-level code that had to directly manipulate on-disk data structures and implement complex processing algorithms.
Databases encapsulated this complexity behind simple interfaces that any developer can use, and that most users can even tune without understanding system internals. As a result, organizations need to spend far less effort to build a data-intensive application, and, instead of running one or two custom, legacy database-backed applications, many organizations run thousands of database-backed applications every day.

With history as a guide, our key observation is that most of the effort in industrial ML applications is not spent devising new learning algorithms or models but is instead spent in other areas that are especially in need of better tools and infrastructure: data preparation, feature selection and extraction, and productionization (cf. [23]). Data preparation means acquiring, producing, and cleaning enough training data to feed into an ML algorithm: without this quality data, ML algorithms fall flat. Feature selection and extraction means identifying the data characteristics and behaviors of interest: what aspects of the data are most important, and what would a domain expert implicitly or explicitly say about a given data point? Productionization means deploying, monitoring, and debugging a robust product: how can an organization check that the deployed ML algorithm is working, debug issues that arise, and make the system robust to changes in data? In the large teams that build ML products such as Siri, most of the individuals work on data preparation, feature selection and extraction, and productionization, as well as on the distributed systems infrastructure that drives these tasks at scale, not on training ML models.
However, thus far, these critical steps in the ML product pipeline have received far less attention than model training and new model tweaks, both from the research community and from the open source software community. Based on our prior work in this area, we see substantial opportunity to greatly reduce the effort these tasks require via the development of new software tools and systems infrastructure.

2 The DAWN Project and Systems Research in Usable ML

To capitalize on the opportunity represented by our project goals, and drawing on our prior experience building large-scale analytics systems such as Apache Spark [27], Apache Mesos [11], Delite [5], and DeepDive [6], we are spending the next five years researching and building tools to address these end-to-end problems in usable machine learning. By combining expertise spanning algorithms to systems to hardware, working closely with a small team of collaborative partners in some of the most challenging data-intensive domains, and producing and validating high-quality research prototypes in real application scenarios, we plan to tackle DAWN's goals at multiple stages of the ML lifecycle and levels of the systems stack (see Figure 1).

Our design philosophy in the DAWN stack centers around three main tenets:

a) Target end-to-end ML workflows. ML-powered application development consists of far more than model training. As a result, today, the bulk of the challenges in developing new ML-powered applications lie not in model training but in data preparation, feature selection/extraction, and productionization (serving, monitoring, debugging, etc.). Systems should target the entire, end-to-end ML workflow.

b) Empower domain experts. The highest-impact ML applications will have to be developed by domain experts, not ML experts.
However, today, few systems allow these domain experts to encode their domain knowledge so that it can be leveraged via automation and machine learning models. Systems should empower users who are not ML experts by providing them tools for onerous tasks such as labeling, feature engineering, and data augmentation.

c) Optimize end-to-end. Execution speed is important in ML both for model training, where it allows building better models (e.g., through more input data or wider parameter search), and for production serving, where it allows deploying these models cost-effectively in practice. However, today's ML tools often perform 10–100× below hardware limits, requiring expensive software engineering to build production systems. Our early results show that by architecting tools to optimize ML pipelines end-to-end, and by leveraging statistical properties of the algorithms such as tolerance to inexact execution, we can accelerate ML applications by 10–100× on both current and emerging hardware.

[Figure 1: The DAWN Stack for Usable Machine Learning. In the Stanford DAWN project, we are addressing the need for infrastructure for usable ML by building a research stack of software and tools spanning each stage of the ML lifecycle (data acquisition, feature extraction, model training, productionizing) and abstractions from new interfaces to new hardware. The stack includes interface-level tools (Snorkel, DeepDive, ModelQA, ModelSnap); algorithms and systems (MacroBase for streaming data, NoScope for video, AutoRec and SimDex for recommendation, data fusion, Mulligan for SQL+graph+ML, and the end-to-end compilers Weld and Delite); and hardware targets (CPU, GPU, FPGA, cluster, mobile, plus new hardware: FuzzyBit and the Plasticine CGRA). We believe this parallel end-to-end and interfaces-to-hardware approach is necessary to fully realize the potential of more usable ML.]
In summary, we believe that systems that target full application development and real non-expert needs, and that optimize all levels of the software and hardware stack, are critical to fully realizing the potential of usable ML.

DAWN Research Directions

To embody these principles, we are pursuing research along several directions. We provide an overview below, with citations to early results in each area:

I) New Interfaces to ML. To empower domain experts who are not ML experts, we need to develop new interfaces to ML technologies, from model specification to model monitoring:

a) Easing model specification via observational ML (data preparation, feature engineering): Can we build ML systems that learn high-quality models simply by observing domain experts? For example, when labeling data, domain experts often apply a set of heuristic rules to determine the label for a given data point (e.g., if the phrase "preheat the oven" appears repeatedly in a document collection, label the collection as likely pertaining to cooking). By providing simple interfaces for these users to specify their beliefs about data in rule form (e.g., regular expressions), we can combine a small number of these rules and apply them to massive datasets. We use unsupervised ML to denoise the rules and learn their accuracies, then train supervised ML models with the resulting probabilistic labels, in a new paradigm we call data programming [18]. We have obtained promising early results with a new system called Snorkel [17] that produces high-quality models from low-quality rules. We are also pursuing new lines of research in weakly supervised ML to improve model quality without manual user intervention, such as feature discovery [24, 25] and structure learning [2].

b) Explaining results to humans (feature engineering, productionization): Given an ML deployment, how can we explain ML model results to humans?
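The data-programming recipe described above, writing noisy labeling rules, learning their accuracies from agreement, and producing probabilistic labels, can be sketched in a few lines. The labeling functions below and the simple agreement-based weighting are illustrative assumptions for exposition, not Snorkel's actual API; Snorkel instead fits a generative model over the rules' votes.

```python
import numpy as np

# Hypothetical labeling rules: each votes +1 (cooking), -1 (not cooking),
# or 0 (abstain). These names and heuristics are invented for illustration.
def lf_preheat(doc):
    return 1 if "preheat the oven" in doc else 0

def lf_tablespoon(doc):
    return 1 if "tablespoon" in doc else 0

def lf_stock_ticker(doc):
    return -1 if "NASDAQ" in doc else 0

def probabilistic_labels(docs, lfs):
    """Combine noisy rule votes into a probabilistic label per document.

    A real data-programming system learns each rule's accuracy without
    ground truth from the agreement structure of the votes; here we
    approximate that by weighting each rule by how often its non-abstain
    votes agree with the majority vote.
    """
    votes = np.array([[lf(d) for lf in lfs] for d in docs])  # (docs, rules)
    majority = np.sign(votes.sum(axis=1))
    weights = np.array([
        (votes[:, j][votes[:, j] != 0] == majority[votes[:, j] != 0]).mean()
        if (votes[:, j] != 0).any() else 0.5
        for j in range(votes.shape[1])
    ])
    scores = votes @ weights
    return 1 / (1 + np.exp(-scores))  # estimated P(label = cooking)

docs = ["preheat the oven, add a tablespoon of butter",
        "shares fell on the NASDAQ today"]
probs = probabilistic_labels(docs, [lf_preheat, lf_tablespoon, lf_stock_ticker])
```

The resulting probabilistic labels (here, high for the recipe text and low for the finance text) would then be used to train a downstream supervised model.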
As models are used in increasingly business-critical applications, the ability to explain a classification decision in a human-interpretable manner is critical. This is challenging: large, complex models deliver highly accurate results but are far from interpretable or explainable. One promising observation is that ML predictions are not made in a vacuum: each user has tens to hundreds of attributes that can be used to segment, correlate, and contextualize predictions (e.g., users running version v47 of the software are abnormally likely to be flagged as spammers). Preliminary results with even basic correlation-based analyses [4] have been extremely promising, and we plan to expand this suite of functionality to other domains, including textual, visual, and time-series data [21].

c) Debugging and observability (feature engineering, productionization): ML model "drift," in which phenomena evolve but models do not, can be catastrophic: for example, Google Flu Trends, which used common search terms as a signal for influenza prevalence, was prominently featured in a 2008 Nature paper, only to later miss the peak of the 2013 flu season by an extremely large margin [13]. As ML models are deployed, they must be monitored and updated. We are interested in developing and deploying inexpensive, useful tools for monitoring the quality of ML model predictions, especially as new models are released to potentially heterogeneous user and device platforms. Subsequently surfacing and correcting deviations from expected behavior will require advances in both interfaces and model training.

d) Assessing and enriching data quality (data preparation, feature engineering): High-quality models are produced by training on a diverse diet of high-quality data.
As more and more data sources are captured in digital form, integrating structured (e.g., data warehouse, CSV) and unstructured (e.g., text, image, and time-series) data will become increasingly important to extracting signal during model construction. Given a menu of diverse data sources, which sources can be most trusted? Which sources should be augmented and enriched, whether via additional human labeling or via augmentation with existing knowledge bases? Our early results [20] indicate that, if we explicitly model the quality of each data source, we can automatically identify the data sources most in need of enrichment, reducing the cost of data cleaning and acquisition.

II) End-to-End ML Systems. We believe that in many important domains, it is possible to design end-to-end systems that encapsulate the whole ML workflow and hide internals from users, similar to a search engine or a SQL database. We are pursuing several such areas:

a) Classification over massive streams (data preparation, feature engineering, productionization): Classification and ranking are core operators behind every modern search engine. However, how can we go beyond classifying static text or images in batch and start to classify sensor data, time series, and other data streams as they change, in real time and at scales of tens of millions of events per second? We are interested in developing high-quality but extremely optimized operators for classification and aggregation of diverse data, combining feature transformation, classification, and aggregation over streams. Preliminary prototyping in the MacroBase engine [3] has revealed that a small number of operators can be reused at scale across domains including manufacturing sensors, mobile analytics, and automotive data.
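The classify-then-aggregate pattern described above can be illustrated with a toy streaming operator: maintain exponentially weighted statistics of a metric, flag points that deviate sharply, and aggregate flagged events by attribute so the most anomalous groups surface first. This is a deliberately simplified sketch; MacroBase's actual operators and statistical machinery are more sophisticated.

```python
from collections import Counter

class StreamingClassifier:
    """Toy classify-then-aggregate operator over an event stream.

    Tracks an exponentially weighted mean and variance of a metric,
    flags observations far from the mean, and counts flagged events
    per attribute value. Illustrative only.
    """
    def __init__(self, alpha=0.05, threshold=3.0):
        self.alpha, self.threshold = alpha, threshold
        self.mean, self.var = 0.0, 1.0
        self.flagged = Counter()

    def observe(self, value, attribute):
        # Score the point against current statistics, then update them.
        z = abs(value - self.mean) / (self.var ** 0.5 + 1e-9)
        delta = value - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        is_outlier = z > self.threshold
        if is_outlier:
            self.flagged[attribute] += 1
        return is_outlier

clf = StreamingClassifier()
for v in [1.0, 1.1, 0.9, 1.0, 1.05, 0.95] * 50:
    clf.observe(v, attribute="device=A")      # normal readings
outlier = clf.observe(50.0, attribute="device=B")  # far outside normal range
```

After the normal readings warm up the statistics, the extreme reading is flagged and attributed to `device=B`, mimicking how a streaming engine can prioritize operator attention by group.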
We are interested in expanding this functionality to domains such as video processing, where a $0.50 image sensor currently requires a $1200 graphics card to process in real time; exploiting classic systems techniques, including caching, incremental memoization, branch-and-bound pruning, and adaptive specialization (e.g., training a scene-specific object detector), within a unified systems framework and "classifier toolbox" will enable line speed without compromising accuracy [8, 12].

b) Personalized recommendations (feature engineering, productionization): Personalization is key to many popular ML-powered applications, and the literature is replete with algorithms for personalized recommendation. However, despite the simple inputs and outputs of recommendation engines, practitioners still have to build each engine from scratch, chaining together low-level algorithms and tools. We plan to build a general end-to-end platform for recommendation, including a simple interface for inputs (e.g., clicks or ratings from users), automatic model tuning, and automatic serving, monitoring, and model retraining. Our early results suggest that it is possible to perform all these tasks incrementally as inputs arrive, creating a "plug-and-play" personalized recommendation system where users can simply input user interactions and request up-to-date recommendations in real time.

c) Combining inference and actuation (feature engineering, data preparation, productionization): If ML is powerful because it delivers broader insights and decision-making ability, then how do we actually automate the decision-making process? Today, this combination of inference/prediction (i.e., predicting what will occur) and actuation/decision-making (i.e., taking action based on a prediction) is almost always performed by separate systems (often an automated inference engine and a human "decider"), except in a small handful of applications such as autonomous vehicles.
How do we integrate actuation and decision-making as first-class citizens in ML pipelines? With the advent of automated APIs, taking action has never been easier (e.g., send a POST request to an automated control center); what is missing is the "glue" required to integrate ML with these automated APIs, as well as the logic for reasoning about the combination. We are developing a series of integrations for this kind of actuated inference, from alerting and notifications to physical manipulation of the environment (e.g., sending an Internet-powered Roomba to verify the presence of a student in the office).

d) Unifying SQL, graphs, and linear algebra (productionization): ML product pipelines consist of a diverse set of operations, including SQL, graph computations, and ML training and evaluation. Unfortunately, most execution engines optimize for only one of these computational patterns; how can we build an engine that is optimized for each of them? Perhaps surprisingly, many of these patterns can be cast as instances of the classic relational join operator, and PI Ré recently developed an asymptotically faster join algorithm [14]. In practice, we have found that, when combined with SIMD-optimized execution, this optimized join is fast, matching specialized engines for both SQL and graphs [1]. What about ML? We believe we can do the same for many ML workloads by extending these theoretical results to classic ML patterns, including linear algebra and sparse matrix operations [9]. By combining these operators within a single engine, we can optimize end-to-end pipelines of SQL, graph computations, linear algebra, and more.

III) New Substrates for ML. Training and deploying ML quickly and cost-effectively requires the development of new computational substrates, from language support to distributed runtimes and accelerated hardware.
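The idea of casting graph patterns as relational joins, discussed in (d) above, can be made concrete with triangle counting: a triangle is exactly the join Edge(a,b) ⋈ Edge(b,c) ⋈ Edge(a,c). The naive pairwise-join evaluation below illustrates the formulation only; a worst-case-optimal join, as used in engines like EmptyHeaded [1], instead intersects adjacency sets one variable at a time.

```python
from collections import defaultdict

def triangles(edges):
    """Count triangles in an undirected graph by evaluating the join
    Edge(a,b) JOIN Edge(b,c) JOIN Edge(a,c) over the edge list.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    count = 0
    for a, b in edges:                  # bind (a, b) from Edge(a,b)
        for c in adj[b]:                # join on b via Edge(b,c)
            if c != a and c in adj[a]:  # check the closing edge Edge(a,c)
                count += 1
    # Each triangle is discovered once per participating edge.
    return count // 3

n = triangles([(0, 1), (1, 2), (0, 2), (2, 3)])  # one triangle: {0, 1, 2}
```

Expressing the pattern relationally is what lets a single join-based engine optimize SQL, graph queries, and (by extension to sparse linear algebra) ML kernels uniformly.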
a) Compilers for end-to-end optimization (feature engineering, productionization): Modern ML applications comprise an increasingly diverse mix of libraries and systems such as TensorFlow, Apache Spark, scikit-learn, and Pandas. Even if each of these libraries is optimized in isolation, real pipelines combine multiple libraries, so production use at scale usually requires a software engineering team to rewrite the whole application in low-level code. We are developing Weld [15], a new runtime that can optimize data-intensive code across different libraries and functions, automatically generating fast implementations for both ML training and serving. Perhaps surprisingly, Weld can already accelerate modern data analysis tools such as Apache Spark, Pandas, and TensorFlow by 10× by optimizing across the operators within them, and can accelerate cross-library workloads by up to 30×. Moreover, Weld is designed for portability to heterogeneous hardware; we will therefore also be able to run these libraries on GPUs, mobile processors, or FPGAs. Apart from Weld, we are also developing new compiler technology for ML in Delite [5], a framework for developing domain-specific languages, and Splinter [26], a privacy-preserving data analysis platform.

b) Reduced precision and inexact processing (productionization): ML operators are stochastic and probabilistic; how can we leverage this fact in our execution substrates? Our earlier work (HogWild! [19]) was the first to show that asynchrony in execution can actually improve convergence time, and the basic algorithms are now running daily in production at companies including Google, Microsoft, and other large-scale technology companies.
However, we believe it is possible to go even further, lowering power and increasing performance by leveraging stochasticity at the bit level: we can design chips specialized for ML that operate at lower precision, allowing fabrication at high yield and execution at extremely low power. Our recent theoretical results illustrate that low-precision execution is possible without compromising accuracy [7], with promising results in practice [22].

c) Reconfigurable hardware for core kernels (feature engineering, productionization): Computer architects commonly proclaim that year N+1 is the year of the FPGA. However, FPGAs remain notoriously difficult to program and expensive to utilize. Nevertheless, ML may be a turning point: in 2017, compute is an increasingly critical bottleneck for data-hungry ML analyses, both at training time and at inference time. Given the impending integration of CPUs and on-chip FPGAs, reconfigurable hardware with high-level programmability will be increasingly important. In addition, we are developing new substrates in the form of reconfigurable architectures for easily specifying modular and efficient compute kernels [16], which will be critical to realizing performance-per-watt gains, especially as the upper levels of the software stack continue to evolve.

d) Distributed runtimes (productionization): As models continue to grow, scale-out execution of training and inference is becoming increasingly important. Combining ML with distributed systems is a real headache: is a model misbehaving because it is distributed across too many servers, or because it is poorly specified? What is the optimal amount of asynchrony? What does the optimal distributed training framework really look like?
We are extremely interested in harnessing both intra-device (e.g., FPGA, GPU, vectorized) and inter-device (e.g., cluster compute) parallelism to consume all available resources (i.e., automatically and dynamically offloading to different hardware within a cluster). And, perhaps surprisingly, some of our recent theory [10] shows that we can explicitly and automatically tune the underlying learning algorithms for optimal execution on a given set of hardware and a given computer network. Many questions remain: how can distributed asynchronous execution benefit us at inference time (i.e., in model serving)? Can we leverage new computational substrates such as serverless computing (e.g., AWS Lambda) to further scale out inference? What is the unified programming model for distributed execution? We plan to build tools (and integrate with existing frameworks such as TensorFlow and Spark) to answer these questions.

Research Roadmap and Success Metrics

The DAWN research roadmap represents an exciting potential future for systems and ML research and practice. Within DAWN, we will pursue the above research objectives in collaboration with target research partners and on- and off-campus use cases. Our primary success metric will be usability, comprising i) the time and cost to specify an ML application (including data sources and features of interest), ii) the time and cost to execute the application in production (including the hardware and human resources required to monitor the ML models), and iii) the benefit to the end-user expert. We intend to make all our work available as open source software, enabling practitioners throughout industry and science to try our ideas and benefit from DAWN's successes.

Our focus on end-to-end, real-world problems lends itself naturally to integration between components of the DAWN project stack.
Simply solving one problem, such as observational ML, without accounting for the hardware costs at the opposite end of the stack will lead to sub-optimal results according to the above metrics (especially metric ii) in an end-to-end validation of the DAWN project output. Thus, we plan regular hackathons and contact points (via students and collaborators) at multiple levels of the stack, from interfaces to pipelines to substrates. We believe the goals and research questions we have outlined here are of broad technical merit to the software systems and computer architecture communities and form a promising roadmap for future data-intensive systems research at large. Our early results deploying systems including Snorkel, MacroBase, and DeepDive in production have confirmed our belief in the opportunity represented by the DAWN project, and, looking forward, even incremental progress towards these goals promises to radically improve upon the state of the art.

References

[1] C. R. Aberger, S. Tu, K. Olukotun, and C. Ré. EmptyHeaded: A relational engine for graph processing. In SIGMOD, 2016.
[2] S. H. Bach, B. D. He, A. Ratner, and C. Ré. Learning the structure of generative models without labeled data. In ICML, 2017.
[3] P. Bailis, E. Gan, S. Madden, D. Narayanan, K. Rong, and S. Suri. MacroBase: Prioritizing attention in fast data. In SIGMOD, 2017. https://github.com/stanford-futuredata/macrobase.
[4] P. Bailis, E. Gan, K. Rong, and S. Suri. Prioritizing attention in fast data: Principles and promise. In CIDR, 2017.
[5] H. Chafi, A. K. Sujeeth, K. J. Brown, H. Lee, A. R. Atreya, and K. Olukotun. A domain-specific approach to heterogeneous parallelism. In PPoPP, 2011.
[6] C. De Sa, A. Ratner, C. Ré, J. Shin, F. Wang, S. Wu, and C. Zhang. DeepDive: Declarative knowledge base construction.
In SIGMOD, 2016. http://deepdive.stanford.edu/.
[7] C. M. De Sa, C. Zhang, K. Olukotun, and C. Ré. Taming the wild: A unified analysis of HogWild!-style algorithms. In NIPS, 2015.
[8] E. Gan and P. Bailis. Scalable kernel density classification via threshold-based pruning. In SIGMOD, 2017.
[9] A. Gu, R. Puttagunta, C. Ré, and A. Rudra. Recurrence width for structured dense matrix vector multiplication. arXiv:1611.01569, 2016.
[10] S. Hadjis, C. Zhang, I. Mitliagkas, and C. Ré. Omnivore: An optimizer for multi-device deep learning on CPUs and GPUs. arXiv:1606.04487, 2016.
[11] B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. H. Katz, S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing in the data center. 2011. http://mesos.apache.org.
[12] D. Kang, J. Emmons, F. Abuzaid, P. Bailis, and M. Zaharia. Optimizing deep CNN-based queries over video streams at scale. 2017.
[13] D. Lazer and R. Kennedy. What we can learn from the epic failure of Google Flu Trends. Wired, October 2015. https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/.
[14] H. Q. Ngo, E. Porat, C. Ré, and A. Rudra. Worst-case optimal join algorithms. In PODS, 2012. Best Paper Award.
[15] S. Palkar, J. J. Thomas, A. Shanbhag, D. Narayanan, H. Pirk, M. Schwarzkopf, S. Amarasinghe, and M. Zaharia. Weld: A common runtime for high performance data analytics. In CIDR, 2017.
[16] R. Prabhakar, Y. Zhang, D. Koeplinger, M. Feldman, T. Zhao, S. Hadjis, A. Pedram, C. Kozyrakis, and K. Olukotun. Plasticine: A reconfigurable architecture for parallel patterns. In ISCA, 2017.
[17] A. J. Ratner, S. H. Bach, H. Ehrenberg, and C. Ré. Snorkel: Fast training set generation for information extraction. In SIGMOD, 2017. https://github.com/HazyResearch/snorkel.
[18] A. J. Ratner, C. M. De Sa, S. Wu, D. Selsam, and C. Ré. Data programming: Creating large training sets, quickly.
In NIPS, 2016.
[19] B. Recht, C. Ré, S. Wright, and F. Niu. HogWild!: A lock-free approach to parallelizing stochastic gradient descent. In NIPS, 2011.
[20] T. Rekatsinas, M. Joglekar, H. Garcia-Molina, A. Parameswaran, and C. Ré. SLiMFast: Guaranteed results for data fusion and source reliability. In SIGMOD, 2017.
[21] K. Rong and P. Bailis. ASAP: Automatic smoothing for attention prioritization in streaming time series visualization. arXiv:1703.00983, 2017.
[22] C. De Sa, M. Feldman, C. Ré, and K. Olukotun. Understanding and optimizing asynchronous low-precision stochastic gradient descent. In ISCA, 2017.
[23] D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, and M. Young. Machine learning: The high interest credit card of technical debt. In SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop), 2014.
[24] P. Varma, D. Iter, C. De Sa, and C. Ré. Flipper: A systematic approach to debugging training sets. In HILDA, 2017.
[25] P. Varma, R. Yu, D. Iter, C. De Sa, and C. Ré. Socratic learning: Correcting misspecified generative models using discriminative models. 2017.
[26] F. Wang, C. Yun, S. Goldwasser, V. Vaikuntanathan, and M. Zaharia. Splinter: Practical private queries on public data. In NSDI, 2017.
[27] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, 2012. http://spark.apache.org.