An Exploratory Data Survey of Drug Name Incidence and Prevalence From the FDA's Adverse Event Reporting System, 2004 to 2012Q2

Drug Names, Population Level Surveillance and the FDA's Adverse Event Reporting System: An Exploratory Data Survey of Drug Name Incidence and Prevalence, 2004-2012Q2 Purpose: To count and monitor the drug names reported in the publicly available vers…

Authors: Nick Williams

Title Page

Title: An Exploratory Data Survey of Drug Name Incidence and Prevalence From the FDA's Adverse Event Reporting System, 2004-2012Q2
Running Head: Drug Names Surveillance and the FDA Adverse Event Reporting System
Authors: Nick Williams
Institutions: New York University Langone Medical Center
Corresponding Author: Nick Williams
Address: NYU School of Medicine, Department of Emergency Medicine, Bellevue Hospital Center, 462 First Avenue, OBV A345, New York, NY 10016
Telephone: 1-347-327-8684
Fax: 212.562.3001
Email: nick.williams@nyumc.org
Key Words: FAERS, FDA, Adverse Event, Data Mining, Surveillance, Epidemic, Historical Trends

Key Points
1. Historical reporting trends for AE are relatively easy to produce from the publicly available version of FAERS.
2. Data mining approaches should mirror established epidemiology population level surveillance operations.
3. Specificity aside, there is much to learn from a maximum sensitivity analysis of drug names in an AE database.
4. Pharmacoepidemiology efforts should be driven by epidemiology surveillance of population level values to maximize public health benefits for the largest segment of the affected population.

Conflict of Interest: The author has no competing interests to declare. This survey is unfunded.
Word Count: 2510
Prior Postings and Publication: The author declares no prior presentation of this work of any kind or in any format.

Abstract

Purpose: To count and monitor the drug names reported in the publicly available version of the FDA Adverse Event Reporting System (FAERS) from 2004-2012Q2 in a maximized sensitivity relational model.

Methods: Data mining and data modeling were conducted, and event based summary statistics with plots were created from nine continuous years of FAERS data.
Results: This FAERS model contains 344,452 individual drug names and 432,541,994 drug name count references, which occurred across 4,148,761 human subjects in the 34 quarter study period. FAERS has several trending outbreaks of drug name incidence reported for Adverse Events (AE). Plots for the top 100 scoring drug name references are reported by year and quarter; the top 100 drug names contain 143,384,240 references, or 33% of all drug name references over 34 quarters of continuous FAERS data.

Conclusions: While FAERS contains many drugs and adverse event reports, its data pertains to very few of them. Drug name incidence lends itself to timely and effective surveillance of large populations of Adverse Event Reports and requires neither the cause of the AE nor its validity to be known in order to detect a mass poisoning. Drug name surveillance and incidence reporting may serve as a viable alternative to odds ratios and other Gaussian based statistical approaches when a maximized sensitivity relational model is used.

Introduction

The FDA Adverse Event Reporting System (FAERS) data set is a massive, publicly available pharmacoepidemiology repository for worldwide post marketing drug surveillance [i]. FAERS is populated with provider reported data points describing Adverse Events (AE) experienced by patients who consume marketed pharmaceutical drugs in patient care settings. As more drugs are marketed over time and access to western pharmaceuticals continues to grow, it stands to reason that we will see more adverse events. How many adverse events is too many [ii] for a single drug name is a question of active debate, with obvious cases like (contaminated) Heparin Sodium Injection (HSI) [iii] and VIOXX [iv] on one hand and well debated yet indecisive cases like Aspirin [v] on the other.
While these drug names are often explored in FAERS in terms of historical trends [vi] of individual drug names, odds ratio signal disproportionality and mortality surveillance [vii], the literature fundamentally lacks summary statistics for all drug names in FAERS or a drug name to drug names comparison. Here, a maximum sensitivity model is crafted from FAERS reports from 2004-2012Q2, utilizing the FDA's publicly available database. This data source is largely discarded as a research resource because FAERS contains over 300,000 drug names for what are most likely about 10,000 substances [viii]. This inflation is largely due to spelling errors and open input data string fields. Further, there is widespread criticism that FAERS does not contain meaningful data in the public version and that the most impressive data elements are reserved for government investigators in the name of patient information protections [ix]. These concerns are accurate, but must be tempered by efforts to develop surveillance methodologies that can resolve these static roadblocks, which have shown no sign of moving despite years of publicly available surveillance data.

FAERS uses a three tier index model where drug names are tied to clinical indications and observed reactions by the reporting provider. A patient may have several drugs (polypharmacy [x]), indications (co-morbidity) and reactions (adverse or clinical, known or unknown) in any single subject level report. A host of secondary variables, including patient outcome, is also available. Several common statistical and epidemiological methods of mining FAERS may well be inappropriate given the structure, distribution and shape of FAERS drug name incidence and prevalence. For a direct example, a patient on ten drugs with four indications and two reactions returns six drug name counts for each of the ten drugs for the individual subject in this model, inflated by indication and reaction.
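The arithmetic of the ten-drug example can be sketched as follows. This is a minimal illustration rather than the author's code: the function name `references_per_drug` and the placeholder drug, indication and reaction labels are hypothetical, and the additive reading (one count per indication plus one per reaction) is an assumption chosen because it reproduces the six counts per drug stated in the text. A plain cross join of indications and reactions on the subject number would instead give 4 × 2 = 8 rows per drug, which the sketch computes for comparison.

```python
from itertools import product


def references_per_drug(drugs, indications, reactions):
    """Return drug name reference counts for one subject level report.

    Assumption: each drug name is counted once per indication and once
    per reaction (additive reading), matching the six counts in the text.
    """
    per_drug = len(indications) + len(reactions)
    return {drug: per_drug for drug in drugs}


# Hypothetical single-subject report: ten drugs, four indications, two reactions.
drugs = [f"DRUG_{n}" for n in range(1, 11)]
indications = ["IND_A", "IND_B", "IND_C", "IND_D"]
reactions = ["REAC_A", "REAC_B"]

counts = references_per_drug(drugs, indications, reactions)
total_references = sum(counts.values())

# Alternative reading, for comparison only: rows per drug under a cross join.
cross_join_rows = len(list(product(indications, reactions)))

print(counts["DRUG_1"], total_references, cross_join_rows)
```

Under the additive assumption, this single report contributes 60 drug name references to the model, six for each of the ten drug names.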
To become a high count relational drug name, this inflation must happen to a specific drug disproportionately across bodies or time. Although this model sacrifices the clinical sensitivity that many providers and toxicologists look to FAERS to provide, it allows epidemiologists to describe in explicit detail the incidence and relevance of reported drug name relationships over time. More complex FAERS reports suggest more complex management and therefore spontaneous or unknown reactions and severe subject level clinical complications. Population level events outside of bedside matters are well suited to maximum sensitivity detection methods like this.

Knowing the noise from the pharmacokinetics is a major undertaking of data science and pharmacology. Signal based reporting was supposed to solve this problem [xi], yet signal work is largely derived from single drug odds ratios [xii] [xiii] that assume proportional incidence and can be subject to false positives and noise [xiv]. By assuming FAERS reports are false until proven true (as if FAERS reports were populated by clinical providers by accident [xv]) we obscure and deliberately underpower FAERS signals with overly complicated mathematical models. Different and multiple approaches to AE surveillance may resolve longstanding dissatisfaction with championed single method approaches.

Methods

Subject numbers (ISR) were left joined to their reported drug names, then clinical indications and finally reactions. This set was then stripped of its subject level identifiers in Microsoft Access 2008. Further, clinical indication and specific reaction were also stripped from the model to create a relational count data model of drug names by quarter using the Google BigQuery API. This model assumes every reported drug name in FAERS actually influenced an adverse event and that, from a population level perspective, higher count values indicate more problematic substances.
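The join-then-count procedure described in the Methods can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions, not the Access or BigQuery workflow used in the survey: the table names (`demo`, `drug`, `indi`, `reac`), the column names other than the ISR subject number, and every row of data are hypothetical, and joining each table on the subject number alone is an assumption, since the text does not specify the join keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demo (isr INTEGER, quarter TEXT);     -- one row per subject report
CREATE TABLE drug (isr INTEGER, drugname TEXT);    -- reported drug names
CREATE TABLE indi (isr INTEGER, indication TEXT);  -- clinical indications
CREATE TABLE reac (isr INTEGER, reaction TEXT);    -- observed reactions
""")

# Hypothetical toy data; subject 3 reports no indication and no reaction.
conn.executemany("INSERT INTO demo VALUES (?, ?)",
                 [(1, "2008Q1"), (2, "2008Q1"), (3, "2008Q2")])
conn.executemany("INSERT INTO drug VALUES (?, ?)",
                 [(1, "HEPARIN SODIUM INJECTION"), (1, "ASPIRIN"),
                  (2, "HEPARIN SODIUM INJECTION"), (3, "ASPIRIN")])
conn.executemany("INSERT INTO indi VALUES (?, ?)",
                 [(1, "THROMBOSIS"), (2, "DIALYSIS")])
conn.executemany("INSERT INTO reac VALUES (?, ?)",
                 [(1, "HYPOTENSION"), (1, "NAUSEA"), (2, "HYPOTENSION")])

# Left join subjects to drug names, then indications, then reactions.
# Grouping by quarter and drug name drops the subject identifier, the
# indication and the reaction, leaving only a count of references.
query = """
SELECT demo.quarter, drug.drugname, COUNT(*) AS refs
FROM demo
LEFT JOIN drug ON drug.isr = demo.isr
LEFT JOIN indi ON indi.isr = demo.isr
LEFT JOIN reac ON reac.isr = demo.isr
GROUP BY demo.quarter, drug.drugname
ORDER BY demo.quarter, drug.drugname
"""
quarterly_counts = conn.execute(query).fetchall()
conn.close()

for quarter, drugname, refs in quarterly_counts:
    print(quarter, drugname, refs)
```

Because the joins are left joins, a drug name reported without any indication or reaction still contributes one reference. With joins keyed on the subject number alone, each drug name row is repeated once per indication and reaction combination, so the exact counts depend on the join keys actually used.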
While FAERS probably contains false positives, they most likely did not happen across the available, international and historical patient population contained in FAERS. Plots were constructed using Microsoft Excel 2007, RED-R (R programming language [xvi]) and CIRCOS [xvii]. Summary statistics were computed in SPSS 19.

Results

Table One: Model Values

Over 75% of the 344,452 drug names in FAERS contained fewer than forty relational counts across 4,148,761 human subjects and 34 reporting quarters. Despite maximum sensitivity, most drug names failed to capture a meaningful volume of references over time, suggesting individual patient AE rather than population level AE. Some drug names returned millions of counts, suggesting mass poisonings.

Table Two: Quarterly (Q) Measures for the Top Ten Scoring Drug Names

HSI is the largest scoring drug name in the model, followed by the widely used and debated Aspirin. The distinction between their QSUM is telling, as HSI is nearly three times larger than Aspirin. Several drugs beat out VIOXX for the third spot on the list and warrant further investigation. The range between the QMIN and QMAX may prove adequate to detect departures from the norm if the median and average are taken as baseline values. All values are quarterly except QSUM. Mass poisoning events are detectable when historical surveillance is utilized.

Graph One: Box Plot of Drug Name Reference Counts by Quarter

VIOXX (Graph 1 Box Plot Left) and HEPARIN SODIUM INJECTION (Graph 1 Box Plot Right) are clearly legible and account for the top scoring values over several quarters of documented historical incidence. This data model can clearly detect departure from the trend with simple incidence survey work.
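The quarterly measures named in Table Two can be computed for a single drug name as in the sketch below. The quarterly counts are invented for illustration and are not FAERS values. The key names `QMEDIAN` and `QMEAN`, the helper names, and the departure rule (a quarter exceeding three times the median) are assumptions; the text only suggests that the median and average may serve as baselines.

```python
from statistics import mean, median


def quarterly_measures(counts_by_quarter):
    """Summarize one drug name's reference counts across quarters."""
    values = list(counts_by_quarter.values())
    return {
        "QMIN": min(values),
        "QMAX": max(values),
        "QSUM": sum(values),
        "QMEDIAN": median(values),
        "QMEAN": mean(values),
    }


def departures(counts_by_quarter, factor=3.0):
    """List quarters whose count exceeds `factor` times the median.

    The threshold rule is a hypothetical choice, not taken from the paper.
    """
    baseline = median(list(counts_by_quarter.values()))
    return [q for q, n in counts_by_quarter.items() if n > factor * baseline]


# Hypothetical reference counts for one drug name (not real FAERS data).
example = {
    "2007Q3": 120, "2007Q4": 130, "2008Q1": 900,
    "2008Q2": 1100, "2008Q3": 140, "2008Q4": 125,
}

measures = quarterly_measures(example)
flagged = departures(example)
print(measures)
print(flagged)
```

In this toy series the median is 135, so only the two quarters above 405 references are flagged. Using the median rather than the mean as the baseline keeps the outbreak quarters themselves from raising the threshold too far.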
Graph Two: Population Level Ratio Trend Variation in FAERS

Here we see the total population from the study period by quarter divided by the number of drug name reference events by quarter and plotted against the percentage of reporting subjects and drug name reference events from the study period. We clearly see that while there is no stable rate or neutral state in FAERS, there are strong departures from the norm, especially coinciding with HSI contamination. VIOXX and other poisoning events are not readily legible here, suggesting that some mass poisonings are obscured by proportional surveillance. If FAERS is a natural pattern without variation, the affected human population and drug name counts should return similar percentages over time. There are several departures from the expected 1:2 ratio, where the complexity of the adverse event outpaced the human population experiencing it.

Top 100 Sum Drug Name References Set Plotted by Year and Quarter 2004-2012

This set contains 100 drug names and 33% of FAERS drug name count references over nine continuous years.

[Plot panels: 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, Full Set 2004-2012Q2]

Discussion

There is an order of operations to population level epidemiology: sensitivity, outbreak identification, specificity, case definition and then automated surveillance. This maximum sensitivity model may be complicated to understand as it has no specificity controls. In the first step specificity or cause is not required; rather, detection is key. AE case definitions then emerge from drilling down into initial sensitivity surveillance. This model can detect signals of AE despite the equal power that relational modeling lends to assumed false positives. Further, historical epidemics of AE (VIOXX and HSI) are detectable. Most importantly, this model does not rely on any base measure of a drug over time but rather compares a drug to drugs across clinical complexity within and across quarters.
VIOXX may be a successful yet late intervention, as most surveillance schemes look for increases in reporting from a baseline that was never natural. Further, online supplement Graph FAERS 2006 clearly demonstrates the power and utility of FAERS by highlighting the near evaporation of VIOXX cases by the second quarter of 2006, for which mass legal action [xviii] rather than marketing practices may well be responsible. Further utility can be seen in countless label changes, black box warnings and direct actions taken by the FDA. Increasingly, FAERS has fallen under somewhat unwarranted criticism for not detailing epidemics or naming drugs as endemic causes of AE fast enough. This model details the kinds of utility that may be found in an adverse event reporting system like FAERS and suggests drug names for further investigation. Maximum sensitivity seems to coincide with major epidemics of AE including VIOXX and HSI, suggesting that other model values may also be valid. High scoring drugs warrant further epidemiological investigation, as these signals are equally powered in this maximum sensitivity model as VIOXX and HSI.

Conclusion

FAERS presents challenges and rewards in an endless waltz of suspicion, corroboration and false positives in traditional surveillance schemes. Novel approaches as well as model, relational and dimensional analysis may supplement formal surveillance programs. While more invention, collaboration and expertise are often called upon to supplement surveillance efforts, an old fashioned closed set count data model has demonstrated some utility.

[i] US Food and Drug Administration (12/01/2013) 'FDA Adverse Event Reporting System (FAERS) (formerly AERS)' Last Updated: 09/10/2012; Retrieved from: http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Surveillance/AdverseDrugEffects/default.htm
[ii] Lester, J., Neyarapally, G. A., Lipowski, E., Graham, C. F., Hall, M. and Dal Pan, G. (2013), Evaluation of FDA safety-related drug label changes in 2010. Pharmacoepidem. Drug Safe., 22: 302–305. doi: 10.1002/pds.3395
[iii] Rodriguez, E. M., Staffa, J. A. and Graham, D. J. (2001), The role of databases in drug postmarketing surveillance. Pharmacoepidem. Drug Safe., 10: 407–410. doi: 10.1002/pds.615
[iv] Eric J. Topol, M.D. Failing the Public Health — Rofecoxib, Merck, and the FDA. N Engl J Med 2004; 351:1707-1709. October 21, 2004. DOI: 10.1056/NEJMp048286
[v] Kenneth R. McQuaid, MD, Loren Laine, MD. Systematic Review and Meta-analysis of Adverse Events of Low-dose Aspirin and Clopidogrel in Randomized Controlled Trials. The American Journal of Medicine, Volume 119, Issue 8, August 2006, Pages 624–638. http://dx.doi.org/10.1016/j.amjmed.2005.10.039
[vi] AM Hochberg and M Hauben. Time-to-Signal Comparison for Drug Safety Data-Mining Algorithms vs. Traditional Signaling Criteria. Clinical Pharmacology & Therapeutics (2009); 85, 6, 600–606. doi:10.1038/clpt.2009.26
[vii] Jennifer Jacobs, Peter Fisher. Polypharmacy, multimorbidity and the value of integrative medicine in public health. European Journal of Integrative Medicine, Volume 5, Issue 1, February 2013, Pages 4–7. http://dx.doi.org/10.1016/j.eujim.2012.09.001
[viii] Bilker, W., Gogolak, V., Goldsmith, D., Hauben, M., Herrera, G., Hochberg, A., Jolley, S., Kulldorff, M., Madigan, D., Nelson, R., Shapiro, A. and Shmueli, G. (2006), Accelerating statistical research in drug safety. Pharmacoepidem. Drug Safe., 15: 687–688. doi: 10.1002/pds.1267
[ix] Bilker, W., Gogolak, V., Goldsmith, D., Hauben, M., Herrera, G., Hochberg, A., Jolley, S., Kulldorff, M., Madigan, D., Nelson, R., Shapiro, A. and Shmueli, G. (2006), Accelerating statistical research in drug safety. Pharmacoepidem. Drug Safe., 15: 687–688. doi: 10.1002/pds.1267
[x] Raymond L. Woosley MD, PhD. Discovering adverse reactions: Why does it take so long? Clinical Pharmacology & Therapeutics (2004) 76, 287–289; doi: 10.1016/j.clpt.2004.06.006
[xi] Stephenson, W. P. and Hauben, M. (2007), Data mining for signals in spontaneous reporting databases: proceed with caution. Pharmacoepidem. Drug Safe., 16: 359–365. doi: 10.1002/pds.1323
[xii] Poluzzi, E., Raschi, E., Moretti, U. and De Ponti, F. (2009), Drug-induced torsades de pointes: data mining of the public version of the FDA Adverse Event Reporting System (AERS). Pharmacoepidem. Drug Safe., 18: 512–518. doi: 10.1002/pds.1746
[xiii] Almenoff, J. S., DuMouchel, W., Kindman, L. A., Yang, X. and Fram, D. (2003), Disproportionality analysis using empirical Bayes data mining: a tool for the evaluation of drug interactions in the post-marketing setting. Pharmacoepidem. Drug Safe., 12: 517–521. doi: 10.1002/pds.885
[xiv] Brown, J. S., Kulldorff, M., Petronis, K. R., Reynolds, R., Chan, K. A., Davis, R. L., Graham, D., Andrade, S. E., Raebel, M. A., Herrinton, L., Roblin, D., Boudreau, D., Smith, D., Gurwitz, J. H., Gunter, M. J. and Platt, R. (2009), Early adverse drug event signal detection within population-based health networks using sequential methods: key methodologic considerations. Pharmacoepidem. Drug Safe., 18: 226–234. doi: 10.1002/pds.1706
[xv] Tannert C, Elvers HD, Jandrig B. The ethics of uncertainty. In the light of possible dangers, research becomes a moral duty. EMBO Rep. 2007 Oct;8(10):892-6. PMID: 17906667; PMCID: PMC2002561
[xvi] Covington, K. R. and A. Parikh (2011, August). The Red-R framework for integrated discovery. The Red-R Journal 1-08/08/2011.
[xvii] Krzywinski M, Schein J, Birol I, et al. Circos: an information aesthetic for comparative genomics. Genome Research. 2009;19:1639–1645.
[xviii] Thomas, W. John. Vioxx Story: Would It Have Ended Differently in the European Union, The; 32 Am. J.L. & Med. 366 (2006)
