Bagging multiple comparisons from microarray data
The problem of large-scale simultaneous hypothesis testing is revisited. Bagging and subagging procedures are put forth with the purpose of improving the discovery power of the tests, and are implemented on both simulated and real data. It is shown that bagging and subagging significantly improve power at the cost of a small increase in false discovery rate, with the proposed 'maximum contrast' subagging having an edge over bagging, i.e., yielding similar power but significantly smaller false discovery rates.
💡 Research Summary
The paper revisits the classic problem of large‑scale simultaneous hypothesis testing that arises in microarray experiments, where thousands of genes are examined for differential expression at once. Traditional multiple‑testing corrections such as Benjamini‑Hochberg (BH) false discovery rate (FDR) control are conservative: they keep the proportion of false positives low but often sacrifice statistical power, leaving many truly differentially expressed genes undetected. To address this trade‑off, the authors introduce two resampling‑based strategies—bagging (bootstrap aggregating) and subagging (subsample aggregating)—and evaluate how they can boost discovery power while keeping the increase in FDR modest.
Methodology.
Bagging proceeds by drawing many bootstrap samples (with replacement) from the original expression matrix. For each bootstrap replicate a full set of gene‑wise tests (e.g., two‑sample t‑tests) is performed, and the resulting p‑values are aggregated across replicates. The aggregation can be a simple average, the minimum, or a weighted scheme; the paper experiments with several options. Subagging differs in that each replicate is formed by randomly selecting a fixed proportion (typically 50‑80 %) of the original observations without replacement. The key innovation is the “maximum‑contrast” subagging rule: for each gene, the statistic that exhibits the largest absolute contrast between the two experimental groups across all sub‑samples is identified, and the corresponding p‑value is used for the final decision. This rule emphasizes the most extreme evidence for differential expression, reducing the dilution effect that can occur when averaging over many noisy sub‑samples.
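The resampling loop and the maximum‑contrast rule described above can be sketched as follows. This is a schematic reconstruction, not the authors' code: the helper name `subag_pvalues`, the subsample fraction, and the number of replicates are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def subag_pvalues(x, y, B=100, frac=0.6, rule="mean"):
    """Aggregate gene-wise two-sample t-test p-values over B subsamples.

    x, y : (n_samples, n_genes) expression matrices for the two groups.
    rule : "mean" averages the p-values across replicates;
           "max_contrast" keeps, per gene, the p-value from the
           replicate with the largest absolute t statistic.
    """
    n_genes = x.shape[1]
    pvals = np.empty((B, n_genes))
    tstats = np.empty((B, n_genes))
    for b in range(B):
        # Subagging: sample a fixed fraction WITHOUT replacement.
        # (Bagging would instead resample the full size WITH replacement.)
        ix = rng.choice(len(x), size=int(frac * len(x)), replace=False)
        iy = rng.choice(len(y), size=int(frac * len(y)), replace=False)
        t, p = stats.ttest_ind(x[ix], y[iy], axis=0)
        tstats[b], pvals[b] = t, p
    if rule == "mean":
        return pvals.mean(axis=0)
    # Maximum contrast: per gene, keep the replicate showing the most
    # extreme standardized group difference.
    best = np.abs(tstats).argmax(axis=0)
    return pvals[best, np.arange(n_genes)]
```

Swapping `replace=False` for `replace=True` with `frac=1.0` gives the bagging variant; the paper also experiments with other aggregation schemes (minimum, weighted) not shown here.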
Theoretical insights.
The authors provide a variance‑reduction argument: aggregating over many resampled test statistics reduces the sampling variability of each gene’s test statistic, thereby increasing the chance that a true signal exceeds the significance threshold. They also demonstrate that, despite the dependence among bootstrap/sub‑sample replicates, the standard Benjamini‑Hochberg procedure remains valid for controlling the expected proportion of false discoveries when applied to the aggregated p‑values. A mathematical comparison shows that the maximum‑contrast rule yields a tighter bound on the tail probability of the null distribution, which translates into lower empirical FDR for a given power level.
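The Benjamini‑Hochberg step‑up procedure that the aggregated p‑values are fed into can be written in a few lines. This is a generic textbook implementation for reference, not code from the paper:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean rejection mask controlling FDR at level q."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    # Step-up thresholds: i/m * q for the i-th smallest p-value.
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Reject all hypotheses up to the largest index meeting its threshold.
        k = below.nonzero()[0].max()
        reject[order[: k + 1]] = True
    return reject
```

In the aggregated setting, this function is simply applied to the per‑gene bagged or subagged p‑values in place of the raw single‑test p‑values.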
Simulation study.
Synthetic microarray data are generated under a multivariate normal model with controlled signal‑to‑noise ratios, varying effect sizes, and optional non‑linear perturbations. Across a grid of settings, the authors compare four approaches: (1) conventional gene‑wise tests with BH correction, (2) bagging with mean‑aggregated p‑values, (3) subagging with mean aggregation, and (4) subagging with the maximum‑contrast rule. Results consistently show that both bagging and subagging raise true‑positive rates by roughly 10‑25 % relative to the baseline. The maximum‑contrast subagging method attains comparable power to the other resampling schemes but achieves substantially lower observed FDR—often below 5 % even when the baseline method’s FDR is set at the same nominal level.
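As an illustration of how such a simulation arm is scored, the sketch below generates two‑group normal data with a block of truly shifted genes, runs the baseline gene‑wise tests with BH correction, and computes the realized true‑positive rate and false‑discovery proportion. Sample sizes and the effect size are made‑up illustrative values, not the paper's settings:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two-group expression data: the first m1 of m genes carry a mean shift.
n, m, m1, effect = 20, 1000, 50, 1.2
x = rng.standard_normal((n, m))
y = rng.standard_normal((n, m))
y[:, :m1] += effect

# Baseline arm: gene-wise two-sample t-tests with BH step-up at q = 0.05.
_, p = stats.ttest_ind(x, y, axis=0)
order = np.argsort(p)
below = p[order] <= 0.05 * np.arange(1, m + 1) / m
reject = np.zeros(m, dtype=bool)
if below.any():
    reject[order[: below.nonzero()[0].max() + 1]] = True

# Score against the known truth.
truth = np.zeros(m, dtype=bool)
truth[:m1] = True
tpr = (reject & truth).sum() / m1                 # true-positive rate
fdp = (reject & ~truth).sum() / max(reject.sum(), 1)  # observed FDP
print(f"TPR={tpr:.2f}, FDP={fdp:.2f}")
```

Replacing the raw p‑values in this pipeline with bagged or subagged ones reproduces the comparison made in the paper, averaging TPR and FDP over many repetitions per setting.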
Real‑data application.
The authors apply the methods to a publicly available breast‑cancer microarray dataset (GEO accession GSE1456), contrasting tumor versus normal tissue. Using the same statistical pipeline as in the simulations, bagging and subagging identify many more genes as significant than the standard approach. Importantly, the gene list selected by maximum‑contrast subagging has a markedly smaller estimated FDR, and pathway enrichment analysis confirms that the discovered genes are enriched for known cancer‑related processes (e.g., cell‑cycle regulation, apoptosis). This empirical validation underscores the practical utility of the proposed techniques for biomedical discovery.
Contributions and implications.
- Power‑FDR trade‑off mitigation: By leveraging resampling aggregation, the paper demonstrates a systematic way to increase detection power without a proportional rise in false discoveries.
- Maximum‑contrast innovation: The new aggregation rule focuses on the most extreme evidence across sub‑samples, offering a principled way to suppress noise and improve FDR control.
- Broad applicability: Although illustrated with microarray data, the framework is readily extensible to other high‑dimensional omics platforms such as RNA‑seq, proteomics, or single‑cell transcriptomics, where the same multiple‑testing challenges arise.
- Future directions: The authors suggest extensions to non‑Gaussian data, hierarchical testing structures, and multi‑group comparisons, as well as integration with modern shrinkage estimators or Bayesian hierarchical models.
In summary, the study provides a compelling statistical toolkit that blends classic resampling ideas with modern multiple‑testing corrections. The empirical results—both simulated and on real cancer data—show that bagging and especially maximum‑contrast subagging can deliver a meaningful boost in scientific discovery while keeping the false discovery rate at an acceptable level. This makes the approach attractive for researchers who need to sift through thousands of simultaneous tests and wish to maximize the yield of biologically relevant findings.