“The Human Body is a Black Box”: Supporting Clinical Decision-Making with Deep Learning

Reading time: 6 minutes

📝 Abstract

Machine learning technologies are increasingly developed for use in healthcare. While research communities have focused on creating state-of-the-art models, there has been less focus on real-world implementation and the associated challenges to accuracy, fairness, accountability, and transparency that come from actual, situated use. Serious questions remain underexamined regarding how to ethically build models, interpret and explain model output, recognize and account for biases, and minimize disruptions to professional expertise and work cultures. We address this gap in the literature and provide a detailed case study covering the development, implementation, and evaluation of Sepsis Watch, a machine learning-driven tool that assists hospital clinicians in the early diagnosis and treatment of sepsis. We, the team that developed and evaluated the tool, discuss our conceptualization of the tool not as a model deployed in the world but instead as a socio-technical system requiring integration into existing social and professional contexts. Rather than focusing on model interpretability to ensure fair and accountable machine learning, we point toward four key values and practices that should be considered when developing machine learning to support clinical decision-making: rigorously define the problem in context, build relationships with stakeholders, respect professional discretion, and create ongoing feedback loops with stakeholders. Our work has significant implications for future research regarding mechanisms of institutional accountability and considerations for designing machine learning systems. It underscores the limits of model interpretability as a solution to ensure transparency, accuracy, and accountability in practice, and demonstrates other means and goals for achieving FATML values in design and in practice.


📄 Content

“The Human Body is a Black Box”: Supporting Clinical Decision-Making with Deep Learning

Mark Sendak, Duke Institute for Health Innovation, Durham, NC, USA (mark.sendak@duke.edu)
Joseph Futoma†, Engineering & Applied Sciences, Harvard University, Cambridge, MA, USA (jfutoma@seas.harvard.edu)
Armando Bedoya, Pulmonology and Critical Care, Duke School of Medicine, Durham, NC, USA (armando.bedoya@duke.edu)
Madeleine Clare Elish, Data & Society Research Institute, New York, NY, USA (mcelish@datasociety.net)
William Ratliff, Duke Institute for Health Innovation, Durham, NC, USA (william.ratliff@duke.edu)
Suresh Balu, Duke Institute for Health Innovation, Durham, NC, USA (suresh.balu@duke.edu)
Michael Gao, Duke Institute for Health Innovation, Durham, NC, USA (michael.gao@duke.edu)
Marshall Nichols, Duke Institute for Health Innovation, Durham, NC, USA (marshall.nichols@duke.edu)
Cara O’Brien, Hospital Medicine, Duke School of Medicine, Durham, NC, USA (cara.obrien@duke.edu)

† Joseph Futoma also retains a research position in the Department of Statistics at Duke University.

ABSTRACT

Machine learning technologies are increasingly developed for use in healthcare. While research communities have focused on creating state-of-the-art models, there has been less focus on real-world implementation and the associated challenges to fairness, transparency, and accountability that come from actual, situated use. Serious questions remain underexamined regarding how to ethically build models, interpret and explain model output, recognize and account for biases, and minimize disruptions to professional expertise and work cultures. We address this gap in the literature and provide a detailed case study covering the development, implementation, and evaluation of Sepsis Watch, a machine learning-driven tool that assists hospital clinicians in the early diagnosis and treatment of sepsis. Sepsis is a severe infection that can lead to organ failure or death if not treated in time and is the leading cause of inpatient deaths in US hospitals. We, the team that developed and evaluated the tool, discuss our conceptualization of the tool not as a model deployed in the world but instead as a socio-technical system requiring integration into existing social and professional contexts. Rather than focusing solely on model interpretability to ensure fair and accountable machine learning, we point toward four key values and practices that should be considered when developing machine learning to support clinical decision-making: rigorously define the problem in context, build relationships with stakeholders, respect professional discretion, and create ongoing feedback loops with stakeholders. Our work has significant implications for future research regarding mechanisms of institutional accountability and considerations for responsibly designing machine learning systems. Our work underscores the limits of model interpretability as a solution to ensure transparency, accuracy, and accountability in practice. Instead, our work demonstrates other means and goals to achieve FATML values in design and in practice.

CCS CONCEPTS

• Computing methodologies → Machine learning; • Human-centered computing → Field study; • Social and professional topics → Government technology policy

KEYWORDS

Deep learning; Interpretability; Medicine; Trust; Expertise

ACM Reference format:
Mark Sendak, Madeleine Elish, Michael Gao, Joseph Futoma, William Ratliff, Marshall Nichols, Armando Bedoya, Suresh Balu, Cara O’Brien. 2020. “The Human Body is a Black Box”: Supporting Clinical Decision-Making with Deep Learning. In Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAT* 2020), January 27–30, 2020, Barcelona, Spain. ACM, 10 pages. https://doi.org/10.1145/3351095.3372827

1 INTRODUCTION

Machine learning technologies are increasingly developed for use in healthcare. From consumer-facing apps to hospital readmission predictors, the healthcare industry includes a rapidly expanding set of use cases for machine learning applications [59]. The machine learning community has focused much research on creating state-of-the-art models, but there has been less focus on real-world implementation and the associated challenges to fairness, transparency, and accountability that come from actual, situated use.
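To give a concrete sense of how a tool like the one described might sit in a clinical workflow, the sketch below shows a hypothetical early-warning loop: a risk model scores each patient's recent vitals and labs, and patients whose scores cross a threshold are queued, highest risk first, for review by a clinical team. This is an illustration only, not the actual Sepsis Watch implementation; all names, the model interface, and the threshold are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class PatientSnapshot:
    patient_id: str
    features: Sequence[float]  # e.g., recent vitals and lab values (hypothetical encoding)

@dataclass
class RiskAlert:
    patient_id: str
    risk: float

def screen_patients(
    snapshots: Sequence[PatientSnapshot],
    risk_model: Callable[[Sequence[float]], float],  # returns a risk score in [0, 1]
    threshold: float = 0.6,  # invented cutoff; in practice tuned with clinicians
) -> list[RiskAlert]:
    """Score every patient and surface those above threshold for clinician review.

    An alert is a prompt for professional judgment, not an automated diagnosis:
    clinicians decide whether and how to act on each flagged patient.
    """
    alerts = [
        RiskAlert(s.patient_id, risk)
        for s in snapshots
        if (risk := risk_model(s.features)) >= threshold
    ]
    # Highest-risk patients first, so a rapid response team can triage efficiently.
    return sorted(alerts, key=lambda a: a.risk, reverse=True)

# Usage with a trivial stand-in "model" (mean of features), purely for illustration:
snapshots = [PatientSnapshot("pt-001", [0.9, 0.7]), PatientSnapshot("pt-002", [0.2, 0.1])]
print(screen_patients(snapshots, risk_model=lambda f: sum(f) / len(f)))
```

The deliberate gap between alert and action in this sketch mirrors one of the paper's key practices, respecting professional discretion: the system surfaces risk, and clinicians retain the decision.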

This content was AI-processed from ArXiv data.
