Causal Inference in Observational Data
Pranjul Yadav, Dept. of Computer Science, University of Minnesota, Minneapolis, MN, yadav023@umn.edu
Michael Steinbach, Dept. of Computer Science, University of Minnesota, Minneapolis, MN, stei0062@umn.edu
Vipin Kumar, Dept. of Computer Science, University of Minnesota, Minneapolis, MN, kumar001@umn.edu
Bonnie Westra, School of Nursing, University of Minnesota, Minneapolis, MN, westr006@umn.edu
Alexander Hoff, Dept. of Computer Science, University of Minnesota, Minneapolis, MN, hoffx231@umn.edu
Connie Delaney, School of Nursing, University of Minnesota, Minneapolis, MN, delaney@umn.edu
Lisiane Prunelli, School of Nursing, University of Minnesota, Minneapolis, MN, pruin001@umn.edu
Gyorgy Simon, Dept. of Health Sciences Research, Mayo Clinic, Rochester, MN, simon.gyorgy@mayo.edu

ABSTRACT
Our aging population increasingly suffers from multiple chronic diseases simultaneously, necessitating the comprehensive treatment of these conditions. Finding the optimal set of drugs and interventions for a combinatorial set of diseases is a combinatorial pattern exploration problem. Association rule mining is a popular tool for such problems, but health care's requirement for causal, rather than associative, patterns renders association rule mining unsuitable. One purpose of this study was to apply SSC guideline recommendations to EHR data for patients with severe sepsis or septic shock and determine guideline compliance as well as its impact on inpatient mortality and sepsis complications. Propensity score matching in conjunction with bootstrap simulation was used to match patients with and without exposure to the SSC recommendations. Findings showed that EHR data could be used to estimate compliance with SSC recommendations as well as the effect of compliance on outcomes.
Further, we propose a novel framework based on the Rubin-Neyman causal model for extracting causal rules from observational data, correcting for a number of common biases. Specifically, given a set of interventions and a set of items that define subpopulations (e.g., diseases), we wish to find all subpopulations in which effective intervention combinations exist and, in each such subpopulation, we wish to find all intervention combinations such that dropping any intervention from the combination will reduce the efficacy of the treatment. A key aspect of our framework is the concept of closed intervention sets, which extend the quantification of the effect of a single intervention to a set of concurrent interventions. Closed intervention sets also allow for a pruning strategy that is strictly more efficient than the traditional pruning strategy used by the Apriori algorithm. To implement our ideas, we introduce and compare five methods of estimating causal effect from observational data and rigorously evaluate them on synthetic data to mathematically prove (when possible) why they work.

© 2016 ACM. ISBN 978-1-4503-2138-9. DOI: 10.1145/1235
We also evaluated our causal rule mining framework on the electronic health record (EHR) data of a large cohort of patients from Mayo Clinic and showed that the patterns we extracted are sufficiently rich to explain the controversial findings in the medical literature regarding the effect of a class of cholesterol drugs on Type-II Diabetes Mellitus (T2DM).

Keywords: Causal Inference, Confounding, Counterfactual Estimation.

1. RELATED WORK
Causation has received substantial research interest in many areas. In computer science, Pearl [5] and Rosenbaum [6] laid the foundation for causal inference, upon which several fields, including cognitive science, econometrics, epidemiology, philosophy, and statistics, have built their respective methodologies [7, 8, 9].

At the center of causation is a causal model. Arguably, one of the earliest and most popular models is the Rubin-Neyman causal model [3]. Under this model, X causes Y if X occurs before Y and, without X, Y would be different. Besides the Rubin-Neyman model, there are several other causal models, including Granger causality [10] for time series, Bayesian networks [11], structural equation modeling [8], causal graphical models [12], and, more generally, probabilistic graphical models [13]. In our work, we use the potential outcome framework from the Rubin-Neyman model, and we use causal graphical models to identify and correct for biases.

Causal graphical models are tools for visualizing causal relationships among variables: nodes of the causal graph are variables and edges are causal relationships. Most methods assume that the causal graph structure is given a priori; however, methods have been proposed for discovering the structure of the causal graph [14, 15]. In our work, the structure is partially given: we know the relationships among groups of variables, but we have to assign each variable to the correct group based on data.
Knowing the correct graph structure is important because substructures in the graph are suggestive of sources of bias. To correct for biases, we look for specific substructures. For example, causal chains can be sources of overcorrection bias, and "V"-shaped structures can be indicative of confounding or endogenous selection bias [9]. Many other interesting substructures have been studied [16, 17, 18]. In our work, we consider three fundamental structures: direct causal effect, indirect causal effect, and confounding. Of these, confounding is the most severe and has received the most research interest.

Numerous methods exist to handle confounding, including propensity score matching (PSM) [19], structural marginal models [9], and g-estimation [8]. The latter two extend PSM for various situations, for example, for time-varying interventions [9]. Propensity score matching is used to estimate the effect of an intervention on an outcome. The propensity score is the propensity (probability) of a patient receiving the intervention given his baseline characteristics, and the propensity score is used to create a new population that is free of confounding. Many PSM techniques exist; they typically differ in how they use the propensity score to create this new population [20, 21, 22, 23].

Applications of causal modeling are not exclusive to the social and life sciences. In data mining, Lambert et al. [24] investigated the causal effect of new features on click-through rates, and Chan et al. [25] used doubly robust estimation techniques to determine the efficacy of display advertisements. Even extending association rule mining to causal rule mining has been attempted before [26, 27, 28]. Li et al. [26] used the odds ratio to identify causal patterns and later extended their technique [28] to handle large data sets.
Their technique, however, is not rooted in a causal model and hence offers no protection against computing systematically biased estimates. In their proposed causal decision trees [29], they used the potential outcome framework, but still did not address correction for various biases, including confounding.

2. SIMPLE CAUSAL RULE MINING IN IRREGULAR TIME-SERIES DATA

2.1 Introduction
According to the Centers for Disease Control and Prevention, the incidence of sepsis or septicemia doubled from 2000 through 2008, and hospitalizations for these diagnoses increased by 70%. In addition, severe sepsis and shock have higher mortality rates than other sepsis diagnoses, accounting for an estimated mortality between 18% and 40%. During the first 30 days of hospitalization, mortality can range from 10% to 50% depending on the patient's risk factors. Patients with severe sepsis or septic shock are sicker, have longer hospital stays, are more frequently discharged to other short-term hospitals or long-term care institutions, and represented the most expensive hospital condition treated in 2011.

The use of evidence-based practice (EBP) guidelines, such as the Surviving Sepsis Campaign (SSC), could lead to an earlier diagnosis and, consequently, earlier treatment. However, these guidelines have not been widely incorporated into clinical practice. The SSC is a compilation of international recommendations for the management of severe sepsis and shock. Many of these recommendations are interventions to prevent further system deterioration during and after diagnosis. Even when the presence of sepsis or progression to sepsis is suspected early in the course of treatment, timely implementation of adequate treatment management and guideline compliance are still a challenge.
Therefore, the effectiveness of the guideline in preventing clinical complications for this population is still unclear to clinicians and researchers alike.

The majority of studies have focused on early detection and prevention of sepsis, and little is known about the compliance rate with the SSC and the impact of compliance on the prevention of sepsis-related complications. Further, the measurement of adherence to individual SSC recommendations, rather than the entire SSC, is, to our knowledge, limited. The majority of studies have used traditional randomized controlled trials with analytic techniques such as regression modeling to adjust for risk factors known from previous research. Data-driven methodologies, such as data mining techniques and machine learning, have the potential to identify new insights from electronic health records (EHRs) that can strengthen existing EBP guidelines.

The national mandate for all health professionals to implement interoperable EHRs by 2015 provides an opportunity for the reuse of potentially large amounts of EHR data to address new research questions that explore patterns of patient characteristics, evidence-based guideline interventions, and improvement in health. Furthermore, expanding the range of variables documented in EHRs to include team-based assessment and intervention data can increase our understanding of compliance with EBP guidelines and the influence of these guidelines on patient outcomes. In the absence of such data elements, adherence to guidelines can only be inferred; it cannot be directly observed.

In this section, we present a methodology for using EHR data to estimate compliance with the SSC guideline recommendations and to estimate the effect of the individual recommendations in the guideline on the prevention of in-hospital mortality and sepsis-related complications in patients with severe sepsis and septic shock.
2.2 Methods
Data from the EHR of a health system in the Midwest was transferred to a clinical data repository (CDR) at the University of Minnesota, which is funded through a Clinical Translational Science Award. After IRB approval, de-identified data for all adult patients hospitalized between 1/1/09 and 12/31/11 with a severe sepsis or shock diagnosis was obtained for this study.

2.2.1 Data and cohort selection
The sample included 186 adult patients age 18 years or older with an ICD-9 diagnosis code of severe sepsis or shock (995.92 and 785.5*) identified from billing data. Since 785.* codes corresponding to shock can capture patients without sepsis, patients without severe sepsis or septic shock, and patients who did not receive antibiotics, were excluded. These exclusions aimed to capture only those patients who had severe sepsis or septic shock and were treated for that clinical condition. The final sample consisted of 177 patients.

2.2.2 Variables of interest
Fifteen predictor variables (baseline characteristics) were collected. These include sociodemographics and health disparities data: age, gender, race, ethnicity, and payer (Medicaid represents low income); laboratory results: lactate and white blood cell count (WBC); vital signs: heart rate (HR), respiratory rate (RR), temperature (Temp), and mean arterial blood pressure (MAP); and diagnoses for respiratory, cardiovascular, cerebrovascular, and kidney-related comorbid conditions. ICD-9 codes for comorbid conditions were selected according to evidence in the literature. Comorbidities were aggregated from the patient's prior problem list to detect preexisting (upon admission) respiratory, cardiovascular, cerebrovascular, and kidney problems. Each category was treated as yes/no depending on whether any of the ICD-9 codes in that category were present.
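The yes/no aggregation of the prior problem list can be sketched as below. The ICD-9 code prefixes used here are illustrative placeholders, not the study's actual code list.

```python
# Hypothetical sketch: collapse a patient's prior problem list (ICD-9 codes)
# into four yes/no comorbidity flags, as described in Section 2.2.2.
# The prefix lists are made up for illustration only.

COMORBIDITY_PREFIXES = {
    "respiratory":     ("490", "491", "492", "493"),
    "cardiovascular":  ("410", "414", "428"),
    "cerebrovascular": ("430", "434", "436"),
    "kidney":          ("585", "586"),
}

def comorbidity_flags(problem_list):
    """Return a yes/no flag per category if any ICD-9 code matches."""
    return {
        category: any(code.startswith(prefixes) for code in problem_list)
        for category, prefixes in COMORBIDITY_PREFIXES.items()
    }

flags = comorbidity_flags(["428.0", "585.9"])
```

In a real study the prefix table would be derived from the literature-based code list the authors mention; only the flag-per-category structure is taken from the text.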
The outcomes of interest were in-hospital mortality and development of new complications (respiratory, cardiovascular, cerebrovascular, and kidney) during the hospital encounter. New complications were determined as the presence of ICD-9 codes on the patient's billing data that did not exist at the time of admission.

2.2.3 Study design
This study aimed to analyze compliance with the SSC guideline recommendations in patients with severe sepsis or septic shock. Therefore, the baseline (TimeZero) was defined as the onset of sepsis, and the patients were under observation until discharged. Unfortunately, the timestamp for the diagnoses is dated back to the time of admission; hence the onset of sepsis needs to be estimated. The onset time for sepsis was defined as the earliest time during a hospital encounter when the patient meets at least two of the following six criteria: MAP < 65, HR > 100, RR > 20, temperature < 95 or > 100.94, WBC < 4 or > 12, and lactate > 2.0. The onset time was established based on current clinical practice and the literature on sepsis. At the earliest time when two or more of these conditions were met, a TimeZero flag was added at the time of first occurrence of that abnormality, and the timing of SSC compliance commenced.

2.2.4 Guideline compliance
SSC guideline recommendations were translated into a readily computable set of rules. These rules have conditions related to an observation (e.g., MAP < 65 mmHg) and an intervention to administer (e.g., give vasopressors) if the patient meets the condition of the rule. The SSC guideline was transformed into 15 rules in a computational format, one for each recommendation in the SSC guideline, and each rule was evaluated for each patient (see Figure 1). After each rule is an abbreviated name subsequently used in this paper.
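The TimeZero rule above is simple enough to state directly in code. This is only a schematic rendering of the two-of-six criterion; the observation records and their shape are hypothetical.

```python
# Illustrative sketch of the TimeZero definition in Section 2.2.3: the onset
# of sepsis is the earliest observation time at which at least two of the
# six abnormality criteria are met. Observations are hypothetical
# (time, {vital: value}) pairs; vitals not measured at a time are absent.

CRITERIA = {
    "MAP":     lambda v: v < 65,
    "HR":      lambda v: v > 100,
    "RR":      lambda v: v > 20,
    "Temp":    lambda v: v < 95 or v > 100.94,
    "WBC":     lambda v: v < 4 or v > 12,
    "Lactate": lambda v: v > 2.0,
}

def time_zero(observations):
    """observations: list of (time, {vital: value}) sorted by time."""
    for t, vitals in observations:
        met = sum(1 for name, test in CRITERIA.items()
                  if name in vitals and test(vitals[name]))
        if met >= 2:
            return t
    return None  # patient never meets the onset definition

obs = [(1, {"HR": 95, "MAP": 70}),
       (2, {"HR": 104, "MAP": 70}),
       (3, {"HR": 108, "Lactate": 2.4})]
onset = time_zero(obs)  # two criteria first co-occur at t = 3
```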
Figure 1: SSC rules for measuring guideline compliance.

We call the treatment of a patient compliant (exposed) with respect to a specific recommendation if the patient meets the condition of the corresponding rule any time after TimeZero and the required intervention was administered; the treatment is non-compliant (unexposed) if the patient meets the condition of the corresponding rule after TimeZero but the intervention was not administered (any time after TimeZero); and the recommendation is not applicable to a treatment if the patient does not meet the condition of the corresponding rule. In estimating compliance (as a metric) with a specific recommendation, we simply measure the number of compliant encounters among those to which the recommendation is applicable. In this phase of the study, the time when a recommendation was administered was not incorporated into the analysis.

We also estimate the effect of the recommendation on the outcomes. We call a patient exposed to a recommendation if the recommendation is applicable to the patient and the corresponding intervention was administered. We call a patient unexposed to a recommendation if the recommendation is applicable but was not applied (the treatment was non-compliant). The incidence fraction in exposed patients with respect to an outcome is the fraction of patients with the outcome among the exposed patients. The incidence fraction of the unexposed patients is defined analogously. We define the effect of the recommendation on an outcome as the difference in the incidence fractions between the unexposed and exposed patients. The recommendation is beneficial (protective against an outcome) if the effect is positive, namely, the incidence fraction in the unexposed patients is higher than the incidence fraction in the exposed patients.
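The effect definition above reduces to a one-line computation; a minimal sketch with made-up outcome data:

```python
# Minimal sketch of the effect definition in Section 2.2.4: the effect of a
# recommendation on an outcome is the incidence fraction among unexposed
# patients minus the incidence fraction among exposed patients, so a
# positive effect means the recommendation is protective.

def incidence_fraction(patients):
    """Fraction of patients in the group who developed the outcome."""
    return sum(p["outcome"] for p in patients) / len(patients)

def recommendation_effect(exposed, unexposed):
    return incidence_fraction(unexposed) - incidence_fraction(exposed)

# Hypothetical data: outcome = 1 means the complication occurred.
exposed   = [{"outcome": 0}, {"outcome": 1}, {"outcome": 0}, {"outcome": 0}]
unexposed = [{"outcome": 1}, {"outcome": 1}, {"outcome": 0}, {"outcome": 1}]
effect = recommendation_effect(exposed, unexposed)  # 0.75 - 0.25 = 0.5
```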
2.2.5 Data quality
Included variables were assessed for data quality regarding accuracy and completeness based on the literature and domain knowledge. Constraints were determined for plausible values; e.g., a CVP reading could not be greater than 50. Values outside of the constraints were recoded as missing values. Any observation that took place before the estimated onset of sepsis (TimeZero) was considered a baseline observation. Simple mean imputation was the method of choice for imputing missing values. Imputation was necessary for lactate (7.7%), temperature (3%), and WBC (3%). There was no missing data for the other variables or for the outcomes of interest. Central venous pressure was not included as a baseline characteristic due to the high number of missing values (54%).

2.2.6 Propensity score matching
Patients who received SSC recommendations may be in worse health than patients who did not. For example, patients whose lactate was measured may have more apparent (and possibly advanced) sepsis than patients whose lactate was not measured. To compensate for such disparities, propensity score matching (PSM) was employed. The goal of PSM is to balance the data set in terms of the covariates between patients exposed and unexposed to the SSC guideline recommendations. This is achieved by matching exposed patients with unexposed patients on their propensity (probability) of receiving the recommendations. This ensures that at TimeZero, pairs of patients, one exposed and one unexposed, are in the same state of health and differ only in their exposure to the recommendation. PSM is a popular technique for estimating treatment effects.

To compute the propensity of patients to receive treatment, a logistic regression model was used, where the dependent variable is exposure to the recommendation and the independent variables are the covariates.
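The propensity model just described, together with the caliper matching and bootstrap of Sections 2.2.6 and 2.2.7, can be sketched end to end as below. The study presumably used a statistics package; this pure-Python stand-in (plain gradient descent, a greedy matcher, toy data) only illustrates the structure of the computation, and every covariate and parameter value is an assumption.

```python
# Sketch of the PSM-inside-bootstrap pipeline of Sections 2.2.6-2.2.7:
# fit logistic regression of exposure on covariates, greedily 1:1-match
# exposed to unexposed patients within a 0.1 caliper on the score, take the
# matched difference in incidence fractions (unexposed - exposed), and
# repeat over bootstrap resamples to approximate the effect's sampling
# distribution.

import math
import random

def fit_logistic(X, y, lr=0.1, steps=300):
    """Weights (incl. intercept) of P(exposed = 1 | x) by gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - 1.0 / (1.0 + math.exp(-z))
            w[0] += lr * err
            for j, xj in enumerate(xi):
                w[j + 1] += lr * err * xj
    return w

def propensity(w, xi):
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

def matched_effect(sample, caliper=0.1):
    """Caliper-match on the score, then difference the incidence fractions."""
    w = fit_logistic([p["x"] for p in sample], [p["exposed"] for p in sample])
    scored = [(propensity(w, p["x"]), p) for p in sample]
    exposed = [(s, p) for s, p in scored if p["exposed"]]
    pool = [(s, p) for s, p in scored if not p["exposed"]]
    diffs = []
    for s_e, e in exposed:
        cands = [c for c in pool if abs(c[0] - s_e) <= caliper]
        if cands:
            best = min(cands, key=lambda c: abs(c[0] - s_e))
            pool.remove(best)
            diffs.append(best[1]["outcome"] - e["outcome"])  # unexposed - exposed
    return sum(diffs) / len(diffs) if diffs else None

def bootstrap_effects(patients, n_boot=500, seed=1):
    rng = random.Random(seed)
    effects = []
    for _ in range(n_boot):
        sample = [rng.choice(patients) for _ in patients]
        eff = matched_effect(sample)
        if eff is not None:
            effects.append(eff)
    return effects  # approximates the sampling distribution of the effect

# Hypothetical cohort: one severity covariate drives both exposure and outcome.
rng = random.Random(0)
patients = [{"x": [sev], "exposed": rng.random() < 0.3 + 0.4 * sev,
             "outcome": int(rng.random() < 0.2 + 0.3 * sev)}
            for sev in (rng.random() for _ in range(60))]
effects = bootstrap_effects(patients, n_boot=10)
```

In the paper the number of bootstrap iterations is 500 and the caliper is 0.1; the small `n_boot` here only keeps the toy run fast.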
The linear prediction (propensity score) of this model was computed for every patient. A new (matched) population was created from pairs of exposed and unexposed patients with matching propensity scores. Two scores match if they differ by no more than a certain caliper (0.1 in our study). The effect of the recommendation was estimated by comparing the incidence fractions among the exposed and unexposed patients in the matched population.

2.2.7 PSM nested inside bootstrap simulation
In order to incorporate the effect of additional sources of variability, arising from estimation in the propensity score model and from variability in the propensity-score-matched sample, 500 bootstrap samples were drawn from the original sample. In each of these bootstrap iterations, the propensity score model was estimated using the above caliper matching technique, and the effect of the recommendation was computed with respect to all outcomes. In recent years, bootstrap simulation has been widely employed in conjunction with PSM to better handle bias and confounding variables. For each recommendation and outcome, the 500 bootstrap iterations result in 500 estimates of the effect (of the recommendation on the outcome), approximating the sampling distribution of the effect.

2.3 Results
Table 1 shows the baseline characteristics of the study population. Results are reported as total counts for categorical variables and as means with interquartile (25% to 75%) ranges for continuous variables. As shown in Table 1, the majority of patients were male, Caucasian, and had Medicaid as the payer. Before the onset of sepsis, cardiovascular comorbidities (56.4%) were common; the mean HR (101.3) was slightly above normal, as were lactate (2.8) and WBC (15.8). The mean length of stay for the sample was 15 days, ranging from less than 24 hours to 6 months.
TimeZero was within the first 24 hours of admission, and patients at that time were primarily (86.4%) in the emergency department.

Table 1: Demographic statistics of the patient population.

  Feature                    Mean / Count
  Total Number of Patients   177
  Average Age                61
  Gender (Male)              102
  Race (Caucasian)           97
  Ethnicity (Latino)         11
  Payer (Medicaid)           102
  White Blood Cell           15.8
  Lactate                    2.8
  Mean Blood Pressure        73.9
  Temperature                98.4
  Heart Rate                 101.3
  Respiratory Rate           20.6
  Cardiovascular             100
  Cerebrovascular            66
  Respiratory                69
  Kidney                     62

In Figure 2, the effects of various rule-complication pairs are depicted. An effect is defined as the difference in the mean rate of progression to complications between the exposed and unexposed groups. Since we used bootstrap simulation, 500 replications were performed for each rule-complication pair, resulting in a sampling distribution for the effect. The sampling distribution for each rule-complication pair is presented as a boxplot. The boxplots represent the statistic measured, i.e., in this study, the differential impact of a recommendation on mortality between the exposed and unexposed populations. When this statistic is 0, the recommendation has no effect. If it is greater than 0, the recommendation is protective for that specific condition; if it is below 0, the recommendation may even increase the risk of the outcome for that specific condition.

The panes (groups of boxplots) correspond to the complications, and the boxes within each pane correspond to the recommendations (rules). For example, the effect of the Ventilator rule (Recommendation 15: patients in respiratory distress should be put on a ventilator) on mortality (Death) is shown in the rightmost box (Ventilator) of the bottom-most pane (Death).
Since all effects in the boxplot are above 0, namely, the number of observed complications in the unexposed group is higher than in the exposed group, compliance with the Ventilator rule reduces the number of deaths. Therefore, the corresponding recommendation is beneficial in protecting patients from Death (mortality). In Table 3, we present the 95% confidence intervals for various rule-outcome pairs.

Figure 2: Box plots of the mean difference between the groups (unexposed - exposed) for the guideline recommendations and each of the outcomes of interest.

To further ensure the validity of the results, we examined the propensity score distribution in the exposed and unexposed groups. As an example, Figure 3 illustrates the propensity score distribution for a randomly selected bootstrap iteration measuring the effect of Ventilator on Death. The horizontal axis represents the propensity score, which is the probability of receiving the intervention, and the vertical axis represents the density, namely the proportion of patients in each group with a particular propensity for being put on a ventilator. Figure 3 shows substantial overlap between the propensity scores in the exposed and unexposed groups, indicating that balancing on the propensity score was successful for this population. Other rule-complication pairs exhibit similar propensity score distributions.

Figure 3: Distribution of the propensity scores between the exposed and unexposed groups for the outcome Death and the SSC recommendation Ventilator.

3. CONCLUSION
The overall purpose of this study was to use EHR data to determine compliance with the Surviving Sepsis Campaign
(SSC) guideline and to measure its impact on inpatient mortality and sepsis complications in patients with severe sepsis and septic shock. Results showed that compliance with many of the recommendations was > 95% for MAP and CVP with fluid resuscitation given for low readings. Other high-compliance (greater than 80%) recommendations were: insulin given for high blood glucose and evaluating respiratory distress. The recommendations with the lowest compliance (< 30%) were: vasopressor or albumin for continuing low MAP or CVP readings. This may be due to a study design artifact, where the rule only considered interventions initiated after TimeZero (the estimated onset of sepsis), while fluid resuscitation may have taken place earlier. Alternatively, the apparently poor compliance could also be explained by issues related to the coding of fluids: during data validation, we found that it was difficult to track fluids.

Our study also demonstrates that retrospective EHR data can be used to evaluate the effect of compliance with guideline recommendations on outcomes. We found a number of SSC recommendations that were significantly protective against more than one complication: Ventilator was protective against cardiovascular and respiratory complications as well as Death; use of vasopressors was protective against respiratory complications.

Other recommendations (BCulture, Antibiotic, Vasopressor, Lactate, CVP, and RespDistress) showed results less consistent with our expectations. For instance, Vasopressor, used to treat low MAP, appears to increase cerebrovascular complications. While this finding is not statistically significant, it may be congruent with the fact that small brain vessels are very sensitive to changes in blood pressure. Low MAP can cause oxygen deprivation and, consequently, brain damage.
Ventilator, Vasopressor, and BGlucose showed protective effects against respiratory complications. The SSC guideline recommends the implementation of ventilator therapy as soon as any change in respiratory status is noticed. This intervention aims to protect the patient against further system stress, resolve hypoxia, help with perfusion across the main respiratory-cardiac vessels, and decrease the release of toxins due to respiratory effort.

Our study is a proof-of-concept study demonstrating that EHR data can be used to estimate the effect of guideline recommendations. However, for several combinations of recommendations and outcomes, the effect was not significant. We believe that the reason is that guidelines represent workflows, and the effect of the workflow goes beyond the effects of the individual guideline recommendations. For example, by considering the recommendations outside the context of the workflow, we may ignore whether the intervention addressed the condition that triggered its administration. If low MAP triggered the administration of vasopressors, then without considering the workflow, we do not know whether MAP returned to normal levels thereafter. Thus we cannot equate an adverse outcome with the failure of the guideline; it may be the result of the insufficiency of the intervention. Moving forward, we are going to model the workflows behind the guidelines and apply the same principles that we developed in this work to estimate the effect of the entire workflow.

This phase of our study did not address the timing of recommendations, nor the time prior to TimeZero. For this analysis, guideline compliance was considered only after TimeZero (the estimated onset), since compliance with the SSC is only necessary in the presence of suspected or confirmed sepsis. There is no reason to suspect sepsis before TimeZero. However, some interventions may have started earlier, without respect to sepsis.
For example, 100% of the patients in this sample had antibiotics (potentially preventive antibiotics), but only 99 (55%) patients received them after TimeZero.

The EHR does not provide a date and time for certain ICD-9 diagnoses. During a hospital stay, all new diagnoses are recorded with the admission date. We know whether a diagnosis was present on admission or not, and thus whether it is a preexisting or new condition, but we do not know precisely when the patient developed the condition during the hospitalization. For this reason, we are unable to detect whether the SSC guideline was applied before or after a complication occurred; thus we may underestimate the beneficial effect of some of the recommendations. For example, high levels of lactate are highly related to hypoxia and pulmonary damage. If such patients were checked for lactate after pulmonary distress, we would consider the treatment compliant with the Lactate recommendation, but we would not know that the respiratory distress was already present at the time of the lactate measurement, and we would incorrectly count it as a complication that the guideline failed to prevent.

4. COMPLEX CAUSAL RULE MINING IN IRREGULAR TIME-SERIES DATA

4.1 Introduction
Effective management of human health remains a major societal challenge, as evidenced by the rapid growth in the number of patients with multiple chronic conditions. Type-II Diabetes Mellitus (T2DM), one of those conditions, affects 25.6 million (11.3%) Americans age 20 or older and is the seventh leading cause of death in the United States [1]. Effective treatment of T2DM is frequently complicated by diseases comorbid with T2DM, such as high blood pressure, high cholesterol, and abdominal obesity. Currently, these diseases are treated in isolation, which leads to wasteful duplicate treatments and suboptimal outcomes.
The recent rise in the number of patients with multiple chronic conditions necessitates comprehensive treatment of these conditions to reduce medical waste and improve outcomes.

Finding optimal treatment for patients who suffer from multiple associated diseases, each of which can have multiple available treatments, is a complex problem. We could simply use techniques based on association, but a reasonable algorithm would likely find that the use of a drug is associated with some unfavorable outcome. This does not mean that the drug is harmful; in many cases, it simply means that patients who take the drug are sicker than those who do not, and thus they have a higher chance of the unfavorable outcome. What we really wish to know is whether a treatment causes an unfavorable outcome, as opposed to being merely associated with it.

The difficulty in quantifying the effect of interventions on outcomes stems from subtle biases. Suppose we wish to quantify the effect of a cholesterol-lowering agent, a statin, on diabetes. We could simply compare the proportion of diabetic patients in the subpopulation that takes the statin and the subpopulation that does not, and estimate the effect of the statin as the difference between the two proportions. This method would give the correct answer only if the statin-taking and non-statin-taking patients were identical in all respects that influence the diabetes outcome. We refer to this situation as the treated and untreated patients being comparable. Unfortunately, statin-taking patients are not comparable to non-statin-taking patients, because they take the statin to treat high cholesterol, which in and of itself increases the risk of diabetes. High cholesterol confounds the effect of the statin. Many different sources of bias exist; confounding is just one of many. In this manuscript, we address several different sources of bias, including confounding.
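The statin story above can be made concrete with a toy simulation, using entirely made-up probabilities: high cholesterol both raises the chance of receiving a statin and raises the risk of diabetes, while the statin itself has (in this simulation) no effect at all. The naive exposed-vs-unexposed comparison still "finds" an effect; stratifying on the confounder recovers the true null.

```python
# Toy confounding demonstration: statin assignment and diabetes risk both
# depend on high cholesterol, but statin has no causal effect on diabetes.
# All probabilities are invented for illustration.

import random

rng = random.Random(0)
patients = []
for _ in range(20000):
    high_chol = rng.random() < 0.5
    statin = rng.random() < (0.8 if high_chol else 0.1)
    diabetes = rng.random() < (0.30 if high_chol else 0.05)  # statin plays no role
    patients.append((high_chol, statin, diabetes))

def rate(group):
    return sum(d for _, _, d in group) / len(group)

# Naive comparison: statin takers vs non-takers, ignoring cholesterol.
naive = rate([p for p in patients if p[1]]) - rate([p for p in patients if not p[1]])

# Stratify on the confounder, then average the within-stratum differences.
strata_diffs = []
for chol in (True, False):
    s = [p for p in patients if p[0] == chol]
    diff = rate([p for p in s if p[1]]) - rate([p for p in s if not p[1]])
    strata_diffs.append(diff)
adjusted = sum(strata_diffs) / len(strata_diffs)
# naive is clearly positive; adjusted is close to the true effect of zero
```

Stratification is the simplest correction; the propensity score methods discussed in this paper generalize the same idea to many covariates.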
Techniques to address such biases in causal effect estimation exist. However, these techniques have been designed to quantify the effect of a single intervention. In trying to apply them to our problem of finding optimal treatment for patients suffering from varying sets of diseases, we face two challenges.

First, patients with multiple conditions will likely need a combination of drugs. Quantifying the effect of multiple concurrent interventions is semantically different from considering only a single intervention. The key concept in estimating the effect of an intervention is comparability: to estimate the effect of an intervention, we need two groups of patients who are identical in all relevant aspects except that one group receives the intervention and the other group does not. For a single intervention, the first group typically consists of the sickest patients who still do not get treated and the second group consists of the healthiest patients who do get treated; they are in reasonably similar states of health. However, when we go from a single intervention to multiple interventions and try to estimate their joint effect, comparability no longer exists. A patient requiring multiple simultaneous interventions is so fundamentally different from a patient who does not need any intervention that the two are not comparable.

The other key challenge in finding optimal intervention sets for patients with combinatorial sets of diseases is the combinatorial search space. Even if we could trivially extend the methods for quantifying the effect of a single intervention to a set of concurrent interventions, we would have to systematically explore a combinatorially large search space. The association rule mining framework [2] provides an efficient solution for exploring combinatorial search spaces; however, it only detects associative relationships. Our interest is in causal relationships.
In this manuscript, we propose causal rule mining, a framework for transitioning from association rule mining towards causal inference in subpopulations. Specifically, given a set of interventions and a set of items to define subpopulations, we wish to find all subpopulations in which effective intervention combinations exist, and in each such subpopulation, we wish to find all intervention combinations such that dropping any intervention from the combination will reduce the efficacy of the treatment. We call these closed intervention sets, not to be confused with closed itemsets. As a concrete example, interventions can be drugs and subpopulations can be defined in terms of diseases; for each subpopulation (set of diseases), our algorithm would return effective drug cocktails of increasing numbers of constituent drugs, where leaving out any drug from a cocktail reduces the efficacy of the treatment. Closed intervention sets allow us to go from estimating a single intervention to estimating multiple interventions.

To address the exploration of the combinatorial search space, we propose a novel frequency-based antimonotonic pruning strategy enabled by the closed intervention set concept. The essence of the antimonotonic property is that if a set I of interventions does not satisfy a criterion, none of its supersets will. The proposed pruning strategy based on closed interventions is strictly more efficient than the traditional pruning strategy used by the Apriori algorithm [2].

Underneath our combinatorial exploration algorithm, we utilize the Rubin-Neyman model of causation [3]. This model sets two conditions for causation: a set X of interventions causes a change in Y iff X happens before Y and Y would be different had X not occurred. The unobservable outcome of what would have happened had a treated patient not received treatment is a potential outcome and needs to be estimated.
We present and compare five methods for estimating these potential outcomes and describe the biases these methods can correct. Typically, the ground truth for the effect of drugs is not known. In order to assess the quality of the estimates, we conduct a simulation study utilizing five different synthetic data sets, each of which introduces a new source of bias. We evaluate the effect of each bias on the five proposed methods, underscoring the statements with rigorous proofs when possible.

We also evaluate our work on a real clinical data set from Mayo Clinic. We have data for over 52,000 patients with 13 years of follow-up time. Our outcome of interest is 5-year incident T2DM, and we wish to extract patterns of interventions for patients suffering from combinations of common comorbidities of T2DM. First, we evaluate our methodology in terms of computational cost, demonstrating the effectiveness of the pruning methodologies. Next, we evaluate the patterns qualitatively, using patterns involving statins. We show that our methodology extracted patterns that allow us to explain the controversial findings surrounding statins [4].

Contributions. (1) We propose a novel framework for extracting causal rules from observational data, correcting for a number of common biases. (2) We introduce the concept of closed intervention sets to extend the quantification of the effect of a single intervention to a set of concurrent interventions, sidestepping the patient comparability problem. Closed intervention sets also allow for a pruning strategy that is strictly more efficient than the traditional pruning strategy used by the Apriori algorithm [2]. (3) We compare five methods of estimating causal effect from observational data that are applicable to our problem and rigorously evaluate them on synthetic data to mathematically prove (when possible) why they work.
4.2 Background: Association Rule Mining

We first briefly review the fundamental concepts of association rule mining and extend these concepts to causal rule mining in the next section.

Consider a set I of items, which are single-term predicates evaluating to 'true' or 'false'. For example, {age > 55} can be an item. A k-itemset is a set of k items, evaluated as the conjunction (logical 'and') of its constituent items. Consider a dataset D = {d_1, d_2, ..., d_n}, which consists of n observations. Each observation, denoted by D_j, is a set of items. An itemset I = {i_1, i_2, ..., i_k} (I ⊂ I) supports an observation D_j if all items in I evaluate to 'true' in the observation. The support of I is the fraction of the observations in D that support I. An itemset is frequent if its support exceeds a pre-defined minimum support threshold. An association rule is a logical implication of the form X ⇒ Y, where X and Y are disjoint itemsets. The support of a rule is support(X ∪ Y) and the confidence of the rule is

  conf(X ⇒ Y) = support(X ∪ Y) / support(X) = P(Y | X).

4.2.1 Causal Rule Mining

Given an intervention itemset X and an outcome item Y, such that X and Y are disjoint, a causal rule is an implication of the form X → Y, suggesting that X causes a change in Y. Let the itemset S define a subpopulation, consisting of all observations that support S; that is, all observations for which all items in S evaluate to 'true'. The causal rule X → Y | S implies that the intervention X has a causal effect on Y in the subpopulation defined by S. The quantity of interest is the causal effect, which is the change in Y in the subpopulation S caused by X. We will formally define the metric used to quantify the causal effect shortly.

Rubin-Neyman Causal Model. X has a causal effect on Y if (i) X happens earlier than Y and (ii) if X had not happened, Y would be different [3].
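The support and confidence computations defined in Section 4.2 are easy to state in code. The following is a minimal Python sketch; the item names and toy dataset are invented for illustration:

```python
def support(itemset, data):
    """Fraction of observations in which every item of `itemset` is 'true'."""
    itemset = set(itemset)
    return sum(1 for obs in data if itemset <= obs) / len(data)

def confidence(x, y, data):
    """conf(X => Y) = support(X union Y) / support(X) = P(Y | X)."""
    return support(set(x) | set(y), data) / support(x, data)

# Toy dataset: each observation is the set of items that evaluate to 'true'.
data = [
    {"age>55", "statin"},
    {"age>55", "statin", "T2DM"},
    {"age>55"},
    {"statin", "T2DM"},
]
print(support({"statin"}, data))               # 0.75
print(confidence({"statin"}, {"T2DM"}, data))  # 0.666...
```

Representing each observation as the set of items that hold in it makes support a simple subset test, which is all the later pruning conditions require.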
Our study design ensures that the intervention X precedes the outcome Y, but fulfilling the second condition requires that we estimate the outcome for the same patient both under intervention and without intervention.

Potential Outcomes. Every patient in the dataset has two potential outcomes: Y_0 denotes their outcome had they not had the intervention X, and Y_1 denotes their outcome had they had the intervention. Typically, only one of the two potential outcomes can be observed. The observable outcome is the actual outcome (denoted by Y) and the unobservable potential outcome is called the counterfactual outcome.

Using the definition of the counterfactual outcome, we can now define the metric for estimating the change in Y caused by X. Average Treatment effect on the Treated (ATT) is a widely known metric in the causal literature and is computed as follows:

  ATT(X → Y | S) = E[Y_1 − Y_0]_{X=1} = E[Y_1]_{X=1} − E[Y_0]_{X=1},

where E denotes the expectation and the X = 1 in the subscript signals that we only evaluate the expectation over the treated patients (X = 1). ATT aims to compute the average per-patient change caused by the intervention; Y_0 = Y_1 indicates that the intervention resulted in no change in outcome for the patient.

Biases. Besides X, numerous other variables can also exert influence over Y, leading to biases in the estimates. To correct for these biases, we have to correctly account for these other effects. The quintessential tool for this purpose is the causal graph, depicted in Figure 4. The nodes of this graph are sets of variables that play a causal role and the edges are causal effects. This is not a correlation graph (or dependence graph) because, for example, U and Z are dependent given X, yet there is no edge between them.
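To make the ATT definition concrete, the sketch below computes it over a toy cohort in which, unlike in real observational data, both potential outcomes are known for every patient; the records are invented for illustration:

```python
# Each record: (treated, y1, y0), i.e. the outcome with and without the
# intervention. In observational data one of the two is always counterfactual;
# here we pretend both are known so the ATT can be computed directly.
records = [
    (1, 0, 1),  # treated; would have developed the outcome without treatment
    (1, 0, 0),
    (1, 1, 1),  # Y0 == Y1: the intervention changed nothing for this patient
    (0, 1, 0),  # untreated patients do not enter the ATT
]
treated = [(y1, y0) for x, y1, y0 in records if x == 1]
# ATT = E[Y1 - Y0 | X = 1]
att = sum(y1 - y0 for y1, y0 in treated) / len(treated)
print(att)  # -1/3: on average the intervention prevented the outcome
```

The estimation methods of Section 4.3 differ only in how they fill in the unobservable column of this table.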
Variables (items in I) can exert influence on the effect of X on Y in three ways: they may only influence X, they may only influence Y, or they may influence both X and Y. Accordingly, variables can be categorized into four categories:
- V are variables that directly influence Y and thus have a direct effect on Y;
- U are variables that only influence Y through X and thus have an indirect effect on Y;
- Z are variables that influence both X and Y and are called confounders; and finally
- O are variables that influence neither X nor Y and hence can be safely ignored.

Figure 4: Rubin-Neyman Causal Model

Most of the causal inference literature assumes that the causal graph is known and true. In other words, we know a priori which variables fall into each of the categories U, Z, V and O. In our case, only X and Y are specified and we have to infer which category each other variable (item) belongs to. Since this inference relies on association (dependence) rather than causation, the discovered graph may contain errors: misclassifications of variables into the wrong category. For example, because of the marginal dependence between U and Y, variables in U can easily be misclassified as Z. Such misclassifications do not necessarily lead to biases, but they can cause loss of efficiency.

Problem Formulation. Given a data set D, a set S of subpopulation-defining items, a set X of intervention items, a minimum support threshold θ and a minimum effect threshold η, we wish to find all subpopulations S (S ⊂ S) and all interventions X (X ⊂ X), with X and S disjoint, such that the causal rule X → Y | S is frequent and its intervention set X is closed w.r.t. our metric of causal effect, ATT.

Note that the meaning of θ, the minimum support threshold, is different from that in the association rule mining literature.
Typically, rules with support less than θ are considered uninteresting, or the threshold is simply a computational convenience; in our case, we set θ to a minimum value such that ATT is estimable for the discovered patterns.

We call a causal rule frequent iff its support exceeds the user-specified minimum threshold θ:

  support(X → Y | S) = support(X ∪ Y ∪ S) = P(X, Y, S) > θ,

and we call an intervention set X closed w.r.t. ATT iff

  ∀ x ∈ X: |ATT(x → Y | S, X \ x)| > η,

where η is the user-specified minimum causal effect threshold. In other words, a causal rule is closed in a subpopulation if its (absolute) effect is greater than that of any of its sub-rules.

Example. In a medical setting, X may be drugs and S could be comorbid diseases. Then X is a drug combination that hopefully treats the set of diseases S. This set of drugs being closed w.r.t. ATT means that dropping any drug from X will reduce the overall efficacy of the treatment; the patient is not taking unnecessary drugs. An itemset is closed if its support is strictly higher than that of all of its superitemsets. Analogously, an intervention set is closed if its absolute causal effect is strictly higher than that of all of its subitemsets.

4.2.2 Frequent Causal Pattern Mining Algorithm

We can now present our algorithm for causal pattern mining. At a very high level, the algorithm comprises two nested frequent pattern enumeration [30] loops. The outer loop enumerates subpopulation-defining itemsets S using items in S, while the inner loop enumerates intervention combinations using items in X \ S. More generally, X and S can overlap, but we do not consider that case in this paper. Effective algorithms to this end exist [31, 32]; we simply use Apriori [2].

Once the patterns are discovered, the ATT of the interventions is computed using one of the methods from Section 4.3, and the frequent, effective patterns are returned.
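Given precomputed single-intervention ATT values within a subpopulation, checking whether an intervention set is closed w.r.t. ATT reduces to a simple test. A minimal sketch, with hypothetical drug names and ATT values (not results from our data):

```python
def is_closed(interventions, att, eta):
    """An intervention set is closed w.r.t. ATT iff dropping any single
    intervention x would change the effect: |ATT(x -> Y | S, X\\x)| > eta."""
    s = frozenset(interventions)
    return all(abs(att[(x, s - {x})]) > eta for x in s)

# Hypothetical ATT of each intervention given the rest of the combination,
# all within one fixed subpopulation S.
att = {
    ("statin", frozenset()): -0.04,
    ("statin", frozenset({"metformin"})): -0.03,
    ("metformin", frozenset({"statin"})): -0.05,
}
print(is_closed({"statin"}, att, 0.02))                # True
print(is_closed({"statin", "metformin"}, att, 0.035))  # False
```

In the second call the combination is not closed: under these made-up numbers, statin adds only 0.03 in absolute effect on top of metformin, below the threshold η = 0.035, so the patient could drop it.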
On the surface, this approach appears very expensive; however, several novel, extremely effective pruning strategies are possible, which we describe below.

Potential Outcome Support Pruning. Let X be an intervention k-itemset, S be a subpopulation-defining itemset, and let X and S be disjoint. Further, let X_{−i} be an itemset that evaluates to 'true' iff all items of X except the i-th are 'true' and the i-th item is 'false'. Using association rule mining terminology, all items in X except the i-th are present in the transaction.

Definition 1 (Potential Outcome Support Pruning). We only need to consider itemsets X such that

  min{support(S, X), support(S, X_{−1}), ..., support(S, X_{−k})} > θ.

In order to be able to estimate the effect of x ∈ X in the subpopulation S, we need observations with x 'true' and also with x 'false' in S.

Lemma 1. Potential Outcome Support Pruning is anti-monotonic.

Proof: Consider a causal rule X → Y | S. If the causal rule X → Y | S is infrequent, then support(X ∪ S) < θ ∨ ∃i: support(X_{−i} ∪ S) < θ. If support(X_{−i} ∪ S) was insufficient, then any extension of it with an intervention item x will continue to have insufficient support, and thus the rule X ∪ x → Y | S will have insufficient support. Likewise, if support(X ∪ S) was insufficient, then any extension of it with an intervention item x will also have insufficient support.

Pruning based on Causal Effect.

Proposition 1. The effective causal rule pruning condition is anti-monotonic.

Rationale: To explain the rationale, let us return to the medical example, where X is a combination of drugs forming a treatment. Assuming that the effects of drugs are additive, if a causal rule X → Y | S is ineffective because ∃ x_i ∈ X: |ATT(x_i → Y | S, X \ x_i)| < η, then a new rule X ∪ x_j → Y | S will also be ineffective, because |ATT(x_i → Y | S, x_j, X \ x_i)| will also fall below the threshold.
In the presence of positive interactions among the drugs (drugs that reinforce each other's effect), this statement may not hold true. Beyond the statistical reasoning, one can also question why a patient should receive a drug that has no effect within a combination.

4.3 Causal Estimation Methods

ATT, our metric of interest, with respect to a single intervention x in a subpopulation S is defined as

  ATT(x → Y | S) = E[Y_1 − Y_0]_{S,X=1},

which is the expected difference between the potential outcome under treatment, Y_1, and the potential outcome without treatment, Y_0, in patients with S who actually received treatment. Since we consider treated patients, the potential outcome Y_1 can be observed; the potential outcome Y_0 cannot. Thus at least one of the two must be estimated. The methods we present below differ in which potential outcome they estimate and how they estimate it.

For the discussion below, we consider the variables X, Z, U and V from the causal graph in Figure 4. X is a single intervention; U, V and Z can be sets of items. For regression models, we denote the matrices defined by U, V and Z in the subpopulation S as U, V and Z (same letter as the variable sets).

Counterfactual Confidence (CC). This is the simplest method. We simply assume that the patients who receive the intervention (X = 1) and those who do not (X = 0) do not differ in any important respect that would influence Y. Under this assumption, Y_1 in the treated is simply the actual outcome in the treated, and the potential outcome Y_0 is simply the actual outcome in the non-treated (X = 0). Thus

  ATT = conf((X = 1) → Y | S) − conf((X = 0) → Y | S)
      = P(Y | S, X = 1) − P(Y | S, X = 0).

In what follows, to improve readability, we drop the S subscript; all evaluations take place in the subpopulation S.

Direct Adjustment (DA).
We cannot estimate Y_0 in the treated (X = 1) as the actual outcome Y in the untreated, because the treated and untreated populations can significantly differ in variables such as Z and V that influence Y. In Direct Adjustment, we attempt to directly remove the effect of V and Z by including them in a regression model. Since a regression model relates the means of the predictors to the mean of the outcome, we can remove the effect of V and Z by setting their means to 0. Let R be a generalized linear regression model predicting Y via a link function g:

  g(Y | V, Z, X) = β_0 + β_V V + β_Z Z + β_X X.

Then the (link-transformed) potential outcome under treatment is g(Y_1) = β_0 + β_V V + β_Z Z + β_X and the potential outcome without treatment is g(Y_0) = β_0 + β_V V + β_Z Z. The ATT is then

  ATT = E[g^{-1}(Y_1 | V, Z, X = 1)]_{X=1} − E[g^{-1}(Y_0 | V, Z, X = 0)]_{X=1},

where g^{-1}(Y_1 | V, Z, X = 1) is the prediction for an observation with the observed V and Z but with X set to 1. The E(·)_{X=1} notation signifies that the expectations of the predictions are taken only over patients who actually received the treatment.

The advantages of DA (over CC) are manifold. First, it can adjust for Z and V as long as the model specification is correct, namely that the interaction terms that may exist among Z and V are specified correctly. Second, we get correct estimates even if we ignore U, because U is conditionally independent of Y given X. Unfortunately, this is only a theoretical advantage: we have to infer from the data whether a variable is a predictor of Y, and U is marginally dependent on Y, so we will likely adjust for U even if we do not need to.

Counterfactual Model (CM). In this technique, we build an explicit model for the potential outcome without treatment, Y_0, using patients with X = 0. Specifically, we build a model

  g(Y | V, Z, X = 0) = β_0 + β_V V + β_Z Z
and estimate the potential outcome as g(Y_0 | V, Z) = g(Y | V, Z, X = 0). The ATT is then

  ATT = P(Y | X = 1) − E[g^{-1}(Y_0 | V, Z)]_{X=1}.

Similarly to Direct Adjustment, the Counterfactual Model does not depend on U. However, in the case of the Counterfactual Model, we only consider the population with X = 0. In this population, U and Y are independent, thus we will not include the U variables in the model.

Propensity Score Matching (PSM). The central idea of Propensity Score Matching is to create a new population such that patients in this new population are comparable in all relevant respects, and thus the expectation of the potential outcome in the untreated equals the expectation of the actual outcome in the untreated.

Patients are matched based on their propensity of receiving treatment. This propensity is computed by a logistic regression model with treatment as the dependent variable:

  log [ P(X) / (1 − P(X)) ] = β_0 + β_V V + β_Z Z.

Patient pairs are formed such that in each pair one patient received treatment and the other did not, and their propensities for treatment differ by no more than a user-defined caliper difference ρ. The matched population has an equal number of treated and untreated patients and is balanced on V and Z; thus the patients are comparable in terms of their baseline risk of Y. Hopefully, the only factor causing a difference in outcome is the treatment.

For estimating ATT, the potential outcome without treatment is estimated from the actual outcomes of the patients in the matched population who did not receive treatment:

  ATT = E[Y_1 − Y_0] = P(Y | X = 1, M) − P(Y | X = 0, M),

where M denotes the matched population.
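The matching step can be sketched as greedy 1:1 caliper matching. To keep the sketch self-contained we use a single binary confounder z, for which the propensity P(X = 1 | z) can be estimated by counting instead of fitting the logistic model described above; the records are invented for illustration:

```python
def psm_att(data, caliper=0.05):
    """Greedy 1:1 propensity score matching; returns the ATT estimate
    P(Y | X=1, M) - P(Y | X=0, M) over the matched population M."""
    # With one binary confounder, the propensity P(X=1 | z) is a count ratio.
    ps = {}
    for z in (0, 1):
        grp = [d for d in data if d["z"] == z]
        ps[z] = sum(d["x"] for d in grp) / len(grp)
    pool = [d for d in data if d["x"] == 0]  # candidate controls
    diffs = []
    for t in (d for d in data if d["x"] == 1):
        # closest control by propensity, accepted only within the caliper
        best = min(pool, key=lambda c: abs(ps[c["z"]] - ps[t["z"]]), default=None)
        if best is not None and abs(ps[best["z"]] - ps[t["z"]]) <= caliper:
            pool.remove(best)  # match without replacement
            diffs.append(t["y"] - best["y"])
    return sum(diffs) / len(diffs)

data = ([{"z": 1, "x": 1, "y": y} for y in (0, 0, 0, 1)]
        + [{"z": 1, "x": 0, "y": y} for y in (1, 1, 0, 0)]
        + [{"z": 0, "x": 1, "y": y} for y in (0, 0)]
        + [{"z": 0, "x": 0, "y": y} for y in (0, 0, 0)])
print(round(psm_att(data), 3))  # negative: treated fare better than matched controls
```

Because both propensity levels differ by more than the caliper, every treated patient here is matched to a control with the same z, which is exactly the balance PSM aims for.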
Among the methods we consider, Propensity Score Matching most strictly enforces the patient comparability criterion; however, it is susceptible to misspecification of the propensity regression model, which can erode the quality of the matching.

Stratified Non-Parametric (SN). In the stratified estimation, we directly compute the expectation via stratification. The assumption is that the patients in each stratum are comparable in all relevant respects and only differ in the presence or absence of the intervention. In each stratum, we can estimate the potential outcome Y_0 in the treated as the actual outcome Y in the untreated:

  ATT = E[Y_1 − Y_0]_{X=1}
      = Σ_l P(l | X = 1) [P(Y_1 | l, X = 1) − P(Y_0 | l, X = 1)]
      = Σ_l P(l | X = 1) [P(Y | l, X = 1) − P(Y | l, X = 0)],

where l iterates over the combined levels of V and Z. If we can identify the items that fall into U, then we can ignore them; otherwise, we should include them in the stratification as well.

The stratified method makes very few assumptions and should arrive at the correct estimate as long as each of the strata is sufficiently large. The key disadvantage of the stratified method lies in stratification itself: when the number of items across which we need to stratify is too large, we may end up dividing the population into excessively many small subpopulations (strata) and become unable to estimate the causal effect in many of them, thus introducing bias into the estimate.

4.4 Results

After describing our data and study design, we present three evaluations of the proposed methodology. The first evaluation demonstrates the computational efficiency of our pruning methodologies, isolating the effect of each pruning method: (i) Apriori support-based pruning, (ii) Potential Outcome Support Pruning, and (iii) Potential Outcome Support Pruning in conjunction with Effective Causal Rule Pruning.
In the second evaluation, we provide a qualitative assessment, looking at patterns involving statin. We attempt to use the extracted patterns to explain the controversial findings that exist in the literature regarding the effect of statin on diabetes. Finally, in order to compare the treatment effect estimates to a ground truth, which does not exist for real drugs, we simulate a data set using proportions derived from the Mayo Clinic data set.

Data and Study Design. In this study we utilized a large cohort of Mayo Clinic patients with data between 1999 and 2013. We included all adult patients (69,747) with research consent. The baseline of our study was set at Jan. 1, 2005. We collected lab results, medications, vital signs and status, and medication orders during a 6-year retrospective period between 1999 and the baseline to ascertain the patients' baseline comorbidities. From this cohort, we excluded all patients with a diagnosis of diabetes before the baseline (478 patients), patients with missing fasting plasma glucose measurements (14,559 patients), patients whose lipid health could not be determined (1,023 patients) and patients with unknown hypertension status (498 patients). Our final study cohort consists of 52,139 patients who were followed until the summer of 2013.

Patients were phenotyped during the retrospective period. Comorbidities of interest include Impaired Fasting Glucose (IFG), abdominal obesity, Hypertension (HTN; high blood pressure) and Hyperlipidemia (HLP; high cholesterol). For each comorbidity, the phenotyping algorithm classified patients into three broad levels of severity: normal, mild and severe. Normal patients show no sign of disease; mild patients are either untreated and out of control or are controlled using first-line therapy; severe patients require more aggressive therapy.
IFG is categorized into normal and pre-diabetic, the latter indicating impaired fasting plasma glucose levels that do not yet meet the diabetes criteria. For this study, progression to T2DM within 5 years from baseline (i.e., Jan. 1, 2005) was chosen as our outcome of interest. Out of the 52,139 patients, 3,627 progressed to T2DM, 41,028 did not, and the remaining 7,484 patients dropped out of the study. In Table 2 we present statistics about our patient population.

4.4.1 Pruning Efficiency

In our work, we proposed two new pruning methods. First, we have Potential Outcome Support Pruning, which aims to eliminate patterns for which the ATT is not estimable. Second, we have Effective Causal Rule Pruning, where we eliminate patterns that do not improve treatment effectiveness relative to their subitemsets.

In Figure 5 we present the number of patterns discovered using (i) the traditional Apriori support-based pruning, (ii) our proposed Potential Outcome Support Pruning (POSP), and (iii) POSP in conjunction with Effective Causal Rule Pruning (ECRP). The number of patterns discovered by POSP is strictly less than the number of patterns discovered by Apriori pruning. POSP in conjunction with ECRP is very effective.

4.4.2 Statin

In this section, we demonstrate that the proposed causal rule mining methodology can be used to discover non-trivial patterns from the above diabetes data set. In recent years, statins, a class of cholesterol-lowering agents, have been prescribed increasingly. High cholesterol (hyperlipidemia) is linked to cardiovascular mortality, and the efficacy of statins in reducing cardiovascular mortality is well documented.
How ever, as evidenced by a 2013 BMJ editorial [4] devoted to this topic, statins are sur- T2DM Present Absent T otal Number of Patien ts 3627 41028 Averag e Age 44.73 35.58 Male(%) 51 41 F emale(%) 49 59 Patien t Diagnosis Status (%) NormFG 42 84 PreDM 58 16 Normal Ob esity 29 59 Mild Obesity 25 30 Severe Obesity 46 11 Normal Hyp ertension 48 69 Mild Hypertension 33 23 Severe Hypertension 19 08 Normal Hyp erlipidemia 12 29 Mild Hyperlipidemia 72 64 Severe Hyperlipidemia 16 07 Patien t Medication Status(%) Statin 26 11 Fibrates 03 01 Cholesterol.Other 02 01 Acerab 17 07 Diuret 18 07 CCB 08 04 BetaBlocke rs 22 10 HTN.Others 01 01 T able 2: Demographics statistics of patien t p opula- tion Figure 5: Comparison of Pruning T echniques rounded in contro versy . In patien ts with normal bloo d sugar lev els (lab eled as NormalFG), statins hav e a detrimental ef- fect, they increase the risk of diab etes; y et in pre-diabetic patien ts (PreDM), it appears to hav e no effect. What we demonstrate b elo w is that this phenomenon is simply dis- ease heterogeneity . First, w e describe how this problem maps to the causal rule mining problem. Our set of interv ent ions ( X ) consists of statin and our subp opulation defining v ariables consist of the v arious levels of HTN, HLP and IF G ( S ). Our inter- est is the effect of statin ( x ) on T2DM ( Y ) in all p ossible subpopulations S , S ⊂ S . In this setup, HTN, which is associated with b oth hy- perlipidemia (and statin use), as w ell as with T2DM, is a confounder ( Z ). A c holesterol drug, other than statin, (sa y) fibrates, are in the U category: they are predictiv e of statin (patients on monotherap y who take fibrates do not tak e statins), but hav e no effect on Y , b ecause its effect is already incorp orated into the hyperlipidemia sev erity v ari- ables that defined the subp opulation. V ariables that only influence diabetes but not statin use (say a diab etes drug) w ould fall into the V category . 
All subpopulations have variables that fall into Z and U, and some subpopulations may also have V.

The HLP variable in Table 2 uses statin as part of its definition, thus we constructed two new variables. The first one is HLP1, a variable at the borderline between HLP-Normal and HLP-Mild, consisting of untreated patients with mildly abnormal lab results (these fall into HLP-Normal) and patients who are diagnosed and receive a first-line treatment (these fall into HLP-Mild). Comparability is the central concept in estimating causal effects, and these patients are comparable at baseline. Similarly, we also created another variable, HLP2, which is at the border of HLP-Mild and HLP-Severe, again consisting of patients who are comparable in the relevant aspects of their health at baseline.

Table 3: ATT due to statin in various subpopulations S as estimated by the five proposed methods

  S              CC      DA      CM      PSM     SN
  PreDM          0.145   0.022   0.010   0.022   0.017
  NormFG         0.060   0.023   0.034   0.017   0.029
  HLP1           0.078   0.019   0.014   0.010   0.010
  HLP2           0.021  -0.013  -0.010  -0.021  -0.015
  PreDM,HLP1     0.067   0.018   0.021   0.004   0.002
  PreDM,HLP2     0.001  -0.038  -0.031  -0.048  -0.043
  NormFG,HLP1    0.043   0.020   0.015   0.014   0.013
  NormFG,HLP2    0.017  -0.002  -0.002  -0.005  -0.004

Table 3 presents the ATT estimates obtained by the various methods proposed in Section 4.3 for some of the most relevant subpopulations. Negative ATT indicates a beneficial effect and positive ATT indicates a detrimental effect.

Counterfactual Confidence (CC) estimates statin to be detrimental in all subpopulations. While statins are known to have a detrimental effect in patients with normal glucose levels [4], it is unlikely that statins are universally detrimental, even in patients with severe hyperlipidemia, the very disease they are supposed to treat. The results from DA, CM, PSM and SN are similar, with PSM and SN having larger effect sizes in general.
The picture that emerges from these results is that patients with severe hyperlipidemia appear to benefit from statin treatment even in terms of their diabetes outcomes, while statin treatment is moderately detrimental for patients with mild hyperlipidemia.

Bootstrap estimation was used to compute the statistical significance of these results. For brevity, we report the results only for PSM. The estimates are significant in the following subpopulations: NormFG and PreDM+HLP2 (p-values < .001) and NormFG+HLP1 (p-value .05). The true ATT in these subpopulations is not known. To investigate the accuracy that the various methods achieve, we use simulated data that is largely based on this example [4, 33].

4.4.3 Synthetic Data

In this section, we describe four experiments utilizing synthetic data sets, each of which introduces a new potential source of bias. Our objective is to illustrate the ability of the five methods from Section 4.3 to adjust for these biases. We compare their ATT estimates to the true ATT used to generate the data set and discuss the reasons for their success or failure.

The rows of Table 4 correspond to the synthetic data sets in increasing order of the biases we introduced, and the columns correspond to the methods: Conf (confidence), CC (Counterfactual Confidence), DA (Direct Adjustment), CM (Counterfactual Model), PSM (Propensity Score Matching) and SNP (Stratified Non-Parametric). Some of these methods, namely DA, CM, PSM and SNP, take the causal graph structure into account while estimating ATT. Specifically, they require information on whether a variable is a confounder (Z), has a direct effect (V), an indirect effect (U), or no effect (O). PSM and SNP are not sensitive to the inclusion of superfluous variables; such variables simply decrease the methods' efficiency.
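The bootstrap significance estimation used above for the PSM results can be sketched as a percentile bootstrap over matched-pair outcome differences. The pair counts below are hypothetical and our actual procedure may differ in detail; this is only an illustration of the idea:

```python
import random

def bootstrap_ci(diffs, n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for the mean of `diffs`."""
    means = []
    for _ in range(n_boot):
        resample = [random.choice(diffs) for _ in diffs]  # sample w/ replacement
        means.append(sum(resample) / len(resample))
    means.sort()
    return (means[int(alpha / 2 * n_boot)],
            means[int((1 - alpha / 2) * n_boot) - 1])

random.seed(0)
# Hypothetical matched-pair differences (treated outcome minus control outcome):
# 12 pairs where only the treated patient had the outcome, 28 where only the
# control did, and 70 with no difference; the mean plays the role of the ATT.
diffs = [1] * 12 + [0] * 70 + [-1] * 28
lo, hi = bootstrap_ci(diffs)
print(round(lo, 3), round(hi, 3))  # the effect is significant if 0 lies outside
```

The same resampling distribution can equally be used to attach a p-value to the null hypothesis of zero ATT.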
In all of the data sets, we use notation consistent with Figure 1: Z is the central disease with outcome Y; X is the intervention of interest that treats Z; V is another disease with a direct causal effect on Y, but V is independent of X; and U is a third disease, which can be treated with X but has no impact on Y. All data sets contain 5000 observations.

I. Direct Causal Effect from V. We assume that every patient in the cohort has disease Z at the same severity; they are all comparable w.r.t. Z. 30% of the patients are subject to the intervention X aimed at treating Z, while the others are not. Untreated patients face a 25% chance of having Y, while treated patients only have a 10% chance. Some patients, 20% of the population, also have disease V, which directly affects Y: it increases the probability of Y by 5%. In this example the true ATT is -.15, as X reduces the chance of Y by 15%. Our causal graph dictates that X and V be marginally independent, hence this effect is homogeneous across the levels of V. (Otherwise V would become predictive of X and it would become a confounder. Confounding is discussed in experiments III-V.)

All methods estimated the ATT correctly, because the ATT does not depend on V. We can demonstrate this by stratifying on V and using the marginal independence of X and V:

  ATT = E[P(Y | X=1) - P(Y | X=0)]
      = Σ_v P(V=v) [P(Y | V=v, X=1) - P(Y | V=v, X=0)]
      = Σ_v [P(Y, V=v | X=1) - P(Y, V=v | X=0)]
      = P(Y | X=1) - P(Y | X=0),

where v denotes the levels of V. The marginal independence of X and V is used in the third step:

  P(Y | V, X) = P(Y, V, X) / P(V, X) = P(Y, V | X) P(X) / (P(X) P(V)) = P(Y, V | X) / P(V).

II. Indirect Causal Effect. The setup for this experiment is the same as for the Direct Causal Effect experiment, except that we have disease U instead of V.
Just like Z, disease U is also treated by X, but U has no direct effect on Y; its effect is indirect, through X. U is thus independent of Y given X. The true ATT continues to be -.15. Again, the ATT does not depend on U, hence all methods estimated it correctly. To demonstrate that the ATT does not depend on U, we use stratification and the conditional independence of Y and U:

  ATT = E[P(Y | X=1) - P(Y | X=0)]
      = Σ_u [P(Y | U=u, X=1) P(U=u | X=1) - P(Y | U=u, X=0) P(U=u | X=0)]
      = Σ_u [P(Y | X=1) P(U=u | X=1) - P(Y | X=0) P(U=u | X=0)]
      = P(Y | X=1) Σ_u P(U=u | X=1) - P(Y | X=0) Σ_u P(U=u | X=0)
      = P(Y | X=1) - P(Y | X=0).

III. Confounding. In this experiment, we consider the simplest case of confounding, involving a single disease Z, a single treatment X and an outcome Y. 20% of the patients have disease Z, and 95% of the diseased patients are treated with X, while 5% are not. All treated patients have Z. 25% of the untreated patients (Z=1 and X=0) have outcome Y; 10% of the treated patients (Z=1 and X=1) have the outcome; and only 5% of the healthy patients (Z=0) have it. The true ATT is -.15.

In the presence of confounding, the counterfactual confidence and the ATT do not coincide. With z denoting the levels of Z and P(z) being a shorthand for P(Z=z),

  ATT = E[P(Y | X=1) - P(Y | X=0)]
      = Σ_z P(z) [P(Y | X=1, z) - P(Y | X=0, z)],

while the counterfactual confidence (CC) is

  CC = P(Y | X=1) - P(Y | X=0)
     = Σ_z [P(Y | X=1, z) P(z | X=1) - P(Y | X=0, z) P(z | X=0)].

When P(z | X) ≠ P(z), these quantities do not coincide. However, any method that can estimate P(Y | X, Z) for all levels of Z and X will arrive at the correct ATT estimate. We used logistic regression in our implementation of the Direct Adjustment method; it can estimate P(Y | X, Z) when X and Z have no interactions.
Note that the causal graph admits interaction between X and Z, thus model misspecification can bias the estimate.

IV. Confounding with Indirect Effect. In addition to the setup of the Confounding experiment, we also have an indirect causal effect from U. We now have two diseases, Z and U, each of which can be treated with X. 20% of the population has Z and, independently, 20% has U. 25% of the patients who have Z and receive no treatment (X=0) have Y, while only 10% of the treated (X=1) patients have it, regardless of whether the patient has U. (If the probability of Y were affected by U, U would be another confounder, rather than have an indirect effect.) X has a beneficial ATT of -.15 in patients with Z=1 (and X=1) and has no effect in patients with Z=0 (who get X because of U). Thus the true ATT is -.0833.

In this experiment, the counterfactual model was the best-performing model. The counterfactual model estimates the ATT through the definition

  ATT = E[P(Y_1 | X=1) - P(Y_0 | X=1)],

where Y_0 is the potential outcome the patient would have without treatment (X=0) and P(Y_0 | X=1) is the counterfactual probability of Y (the probability of Y had they not received X) in the population that actually got X=1. Note that the potential outcome Y_1 | X=1 in the patients who actually got X=1 is the observed outcome Y | X=1. With u and z denoting the levels of U and Z, respectively, and P(u) being a shorthand for P(U=u),

  ATT = E[P(Y | X=1) - P(Y_0 | X=1)]
      = Σ_u Σ_z P(u, z) [P(Y | X=1, u, z) - P(Y_0 | X=1, u, z)]
      = Σ_z P(z) [P(Y | X=1, z) - P(Y_0 | X=1, z)]
      = Σ_z P(z) [P(Y | X=1, z) - P(Y | X=0, z)],

which coincides with the data generation mechanism, hence the estimate is correct.
In the derivation, the second step holds because U and Z are independent given X, and the third step uses the fact that the counterfactual model estimates P(Y_0 | X=1, z, u) from the untreated patients, thus P(Y_0 | X=1, z, u) = P(Y | X=0, z, u) = P(Y | X=0, z).

V. Confounding with Direct and Indirect Effects. In this experiment, we have three diseases: our index disease Z, which is a confounder; U, which has an indirect effect on Y via X; and V, which has a direct effect on Y. 20% of the population has each of Z, V and U, independently. 95% of patients with Z or U get the intervention X. 25% of the untreated patients with Z get Y, while only 10% of the treated patients do, regardless of whether they have U. Patients with V face a 5% increase in their chance of experiencing the outcome Y. X has a beneficial ATT of -.15 in patients with Z=1 and has no effect in patients with Z=0 (who get X because of U). Whether a patient has V does not influence the effect of X. The true ATT is thus -.0833.

None of the methods estimated the effect correctly, but Propensity Score Matching came closest. An analytic derivation of why it performed well is outside the scope of this paper; in essence, its success is driven by its ability to maximally exploit the independence relationships encoded in the causal graph. It can ignore V when it constructs the propensity score model, because X and V are independent (when Y is not given), and it can ignore U and V when it computes the ATT in the propensity-matched population. On the other hand, the causal graph admits interaction among U, Z and X, thus a logistic regression model used as the propensity score model can be subject to model misspecification.

The Stratified Non-Parametric method, which is essentially just a direct implementation of the definition of the ATT, underestimated the ATT by almost 25%.
The reason lies in the excessive stratification across all combinations of the levels of U, V, and Z. Even with just three variables, most strata did not have sufficiently many patients (either treated or untreated) to estimate P(Y | X, u, v, z). In the discussion, we describe remedies to overcome this problem.

          Conf     CC       DA       CM       PSM      SN
  I.      +.110    -.150    -.150    -.150    NA       -.150
  II.     +.099    -.150    -.150    -.150    -.151    -.149
  III.    +.099    +.047    -.136    -.136    -.136    -.136
  IV.     +.077    +.024    -.019    -.083    -.068    -.064
  V.      +.072    +.038    -.037    -.105    -.074    -.067

Table 4: The ATT estimates by the six methods in the five experiments. The experimental conditions, the full names of the methods and the true ATT values are specified in the text.

5. DISCUSSION AND CONCLUSION

We proposed the causal rule mining framework, which transitions pattern mining from finding patterns that are associated with an outcome towards finding patterns that cause changes in the outcome. Finding causal relationships instead of associations is absolutely critical in health care, but the approach also has appeal beyond health care. The numerous biases that arise in establishing causation make quantifying causal effects difficult. We use the Neyman-Rubin causal model to define causation and the potential outcome framework to estimate the causal effects. We correct for three kinds of potential biases: those stemming from direct causal effects, indirect causal effects and confounding. We compared five different methods for estimating the causal effect, evaluated them on real and synthetic data, and found that three of these methods gave very similar results. We have demonstrated on real clinical data that our proposed method can effectively enumerate causal patterns in a large combinatorial search space, thanks to the two new pruning methods we developed for this work.
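The sparse-strata failure of the Stratified Non-Parametric method noted above is easy to reproduce. In the sketch below, the disease prevalences and the treatment rule mirror experiment V; the variable names, sample size and counting logic are illustrative assumptions, not the authors' code:

```python
import random

# With binary U, V, Z there are 8 strata. Once treatment is strongly
# confounded (patients with Z or U are treated 95% of the time), some
# strata contain no treated or no untreated patients, so the
# treated-vs-untreated contrast cannot be estimated there.
random.seed(1)
n = 2000
strata = {}  # (u, v, z) -> [treated count, untreated count]
for _ in range(n):
    z = random.random() < 0.20
    u = random.random() < 0.20
    v = random.random() < 0.20
    x = (z or u) and random.random() < 0.95   # treatment rule of experiment V
    cell = strata.setdefault((u, v, z), [0, 0])
    cell[0 if x else 1] += 1

unusable = sum(1 for c in strata.values() if c[0] == 0 or c[1] == 0)
print(f"{unusable} of {len(strata)} strata lack treated or untreated patients")
```

The strata with neither Z nor U contain no treated patients by construction, and the rarest strata frequently lack untreated patients, which is the fragmentation problem the stratified estimator runs into.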
We also demonstrated that the patterns discovered from the data were very rich, and we illustrated how the effect of statin differs across subpopulations. The results we found are consistent with the literature but go beyond what is already known about statin's effect on the risk of diabetes.

The discussion and experimental results in this paper provide some general guidance on when to use the different methods we described. We recommend counterfactual confidence if no confounding is suspected, as counterfactual confidence is computationally efficient and can arrive at the correct solution even when direct and indirect effects are present. In the presence of confounding, propensity score matching gave the most accurate results, but due to the need to create a matched population, it has built-in randomness, which increases its variance. Moreover, the counterfactual model as well as the propensity score model are susceptible to model misspecification. If unknown interactions among variables are suspected, we recommend the stratified non-parametric method. With this technique, model misspecification is virtually impossible; however, its sample size requirement is high. The stratified model is suboptimal if we need to stratify across many variables: doing so can fragment the population into strata too small to estimate the effects correctly, and if the estimates use some strata but not others, they may be biased.

Acknowledgments
This study is supported by National Science Foundation (NSF) grant IIS-1344135. Contents of this document are the sole responsibility of the authors and do not necessarily represent the official views of the NSF.

6. ADDITIONAL AUTHORS

7. REFERENCES
[1] National diabetes fact sheet: national estimates and general information on diabetes and prediabetes in the United States, 2011.
Atlanta, GA: US Department of Health and Human Services, Centers for Disease Control and Prevention, 201, 2011.
[2] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB, volume 1215, pages 487-499, 1994.
[3] J. Sekhon. The Neyman-Rubin model of causal inference and estimation via matching methods. The Oxford Handbook of Political Methodology, pages 271-299, 2008.
[4] R. Huupponen and J. Viikari. Statins and the risk of developing diabetes. BMJ, 346, 2013.
[5] V. Didelez and I. Pigeot. Judea Pearl: Causality: Models, reasoning, and inference. Politische Vierteljahresschrift, 42(2):313-315, 2001.
[6] P. Rosenbaum. Observational Studies. Springer, 2002.
[7] D. Freedman. From association to causation via regression. Advances in Applied Mathematics, 18(1):59-110, 1997.
[8] W. Bielby and R. Hauser. Structural equation models. Annual Review of Sociology, pages 137-161, 1977.
[9] J. Robins, M. Hernan, and B. Brumback. Marginal structural models and causal inference in epidemiology. Epidemiology, pages 550-560, 2000.
[10] C. Granger. Some recent development in a concept of causality. Journal of Econometrics, 39(1):199-211, 1988.
[11] F. Jensen. An Introduction to Bayesian Networks, volume 210. UCL Press, London, 1996.
[12] F. Elwert. Graphical causal models. In Handbook of Causal Analysis for Social Research, pages 245-273. Springer, 2013.
[13] K. Murphy. Dynamic Bayesian networks. Probabilistic Graphical Models, M. Jordan, 2002.
[14] D. Heckerman. A Bayesian approach to learning causal networks. In UAI, pages 285-295. Morgan Kaufmann Publishers Inc., 1995.
[15] D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79-119, 1997.
[16] G. Cooper. A simple constraint-based algorithm for efficiently mining observational databases for causal relationships.
Data Mining and Knowledge Discovery, 1(2):203-224, 1997.
[17] C. Silverstein, S. Brin, R. Motwani, and J. Ullman. Scalable techniques for mining causal structures. Data Mining and Knowledge Discovery, 4(2-3):163-192, 2000.
[18] S. Mani, P. Spirtes, and G. Cooper. A theoretical study of Y structures for causal discovery. arXiv preprint arXiv:1206.6853, 2012.
[19] P. Austin. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3):399-424, 2011.
[20] J. Lunceford and M. Davidian. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statistics in Medicine, 23(19):2937-2960, 2004.
[21] P. Austin and E. Stuart. Estimating the effect of treatment on binary outcomes using full matching on the propensity score. Statistical Methods in Medical Research, page 0962280215601134, 2015.
[22] P. Rosenbaum and D. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41-55, 1983.
[23] P. Austin and E. Stuart. Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies. Statistics in Medicine, 34(28):3661-3679, 2015.
[24] D. Lambert and D. Pregibon. More bang for their bucks: Assessing new features for online advertisers. In Workshop on Data Mining and Audience Intelligence for Advertising, pages 7-15. ACM, 2007.
[25] D. Chan, R. Ge, O. Gershony, T. Hesterberg, and D. Lambert. Evaluating online ad campaigns in a pipeline: causal models at scale. In ACM-SIGKDD, pages 7-16. ACM, 2010.
[26] J. Li, T. Le, L. Liu, J. Liu, Z. Jin, and B. Sun. Mining causal association rules. In Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on, pages 114-123. IEEE, 2013.
[27] P.
Holland and D. Thayer. Differential item performance and the Mantel-Haenszel procedure. Test Validity, pages 129-145, 1988.
[28] J. Li, T. Le, L. Liu, J. Liu, Z. Jin, B. Sun, and S. Ma. From observational studies to causal rule mining. ACM Transactions on Intelligent Systems and Technology (TIST), 7(2):14, 2015.
[29] J. Li, S. Ma, T. Le, L. Liu, and J. Liu. Causal decision trees. arXiv preprint arXiv:1508.03812, 2015.
[30] B. Goethals. Survey on frequent pattern mining. Univ. of Helsinki, 2003.
[31] J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M. Hsu. FreeSpan: frequent pattern-projected sequential pattern mining. In ACM-SIGKDD, pages 355-359. ACM, 2000.
[32] J. Han, J. Pei, Y. Yin, and R. Mao. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 8(1):53-87, 2004.
[33] M. R. Castro, G. Simon, S. S. Cha, B. P. Yawn, L. J. Melton III, and P. J. Caraballo. Statin use, diabetes incidence and overall mortality in normoglycemic and impaired fasting glucose patients. Journal of General Internal Medicine, pages 1-7.