Empirical Bayes Shrinkage of Functional Effects, with Application to Analysis of Dynamic eQTLs


We introduce functional adaptive shrinkage (FASH), an empirical Bayes method for joint analysis of observation units in which each unit estimates an effect function at several values of a continuous condition variable. The ideas in this paper are motivated by dynamic expression quantitative trait locus (eQTL) studies, which aim to characterize how genetic effects on gene expression vary with time or another continuous condition. FASH integrates a broad family of Gaussian processes defined through linear differential operators into an empirical Bayes shrinkage framework, enabling adaptive smoothing and borrowing of information across units. This provides improved estimation of effect functions and principled hypothesis testing, allowing straightforward computation of significance measures such as local false discovery and false sign rates. To encourage conservative inferences, we propose a simple prior-adjustment method that has theoretical guarantees and can be more broadly used with other empirical Bayes methods. We illustrate the benefits of FASH by reanalyzing dynamic eQTL data on cardiomyocyte differentiation from induced pluripotent stem cells. FASH identified novel dynamic eQTLs, revealed diverse temporal effect patterns, and provided improved power compared with the original analysis. More broadly, FASH offers a flexible statistical framework for joint analysis of functional data, with applications extending beyond genomics. To facilitate use of FASH in dynamic eQTL studies and other settings, we provide an accompanying R package at https://github.com/stephenslab/fashr.


💡 Research Summary

The paper introduces Functional Adaptive Shrinkage (FASH), a novel empirical Bayes framework designed for the joint analysis of large numbers of observation units that each provide noisy estimates of an effect function across a continuous condition such as time. Traditional dynamic eQTL studies either treat each gene‑variant pair independently or impose restrictive parametric interaction models (e.g., linear or quadratic time effects). Such approaches ignore the opportunity to share information across units and often lack calibrated hypothesis testing for non‑linear dynamics. FASH addresses these gaps by combining a flexible family of Gaussian processes—called the L‑GP family—with adaptive shrinkage ideas.

In the L‑GP construction a linear differential operator L defines a baseline function space (e.g., constant functions when L = D¹, linear functions when L = D²). A scalar variance parameter σ controls how far the true effect function deviates from this baseline. When σ is small the prior forces strong shrinkage toward the baseline; when σ is large the prior allows more flexible shapes. Crucially, σ and the choice of L are learned from the data across all units using maximum‑likelihood or variational methods, yielding a “global‑local” shrinkage: units whose observed patterns closely follow the baseline are shrunk heavily, while those that deviate are shrunk less.
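The role of L and σ can be illustrated numerically. The sketch below (not the authors' implementation; the covariance formulas are the standard kernels for these operators) builds the covariance of a zero-mean Gaussian process whose sample paths deviate from the L-baseline by an amount controlled by σ: L = D¹ gives a Brownian-motion kernel k(s,t) = σ² min(s,t) (baseline: constants), and L = D² gives the integrated-Wiener kernel (baseline: linear functions).

```python
import numpy as np

def lgp_cov(t, sigma, order=1):
    """Covariance matrix of a zero-mean L-GP at time points t (a sketch).

    order=1 (L = D^1): Brownian motion, k(s,t) = sigma^2 * min(s,t);
        the baseline space is the constant functions.
    order=2 (L = D^2): integrated Wiener process,
        k(s,t) = sigma^2 * (min(s,t)^3/3 + |s-t| * min(s,t)^2/2);
        the baseline space is the linear functions.
    """
    s, u = np.meshgrid(t, t, indexing="ij")
    m = np.minimum(s, u)
    if order == 1:
        return sigma**2 * m
    return sigma**2 * (m**3 / 3.0 + np.abs(s - u) * m**2 / 2.0)

# Small sigma pins draws to the baseline; large sigma permits flexible shapes.
t = np.linspace(0.1, 2.0, 20)
rng = np.random.default_rng(0)
for sigma in (0.1, 2.0):
    K = lgp_cov(t, sigma, order=1)
    draw = rng.multivariate_normal(np.zeros_like(t), K + 1e-10 * np.eye(len(t)))
```

In FASH the key difference is that σ is not fixed by hand: its distribution across units is estimated from all units jointly, which is what produces the "global-local" shrinkage described above.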

Given the observed effect estimates β̂_j(t_jr) and their standard errors s_jr, the model assumes independent normal sampling (conditional on the true function). The posterior distribution for each unit’s effect function is then obtained analytically (or via efficient Gaussian‑process computations). The posterior mean provides a smoothed estimate, and the posterior variance yields credible intervals. Because the prior is data‑driven, the amount of smoothing adapts automatically to each unit’s signal‑to‑noise ratio and to the overall similarity of the dataset.
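Because both the prior and the sampling model are Gaussian, the per-unit posterior is available in closed form. The sketch below shows the standard conjugate update, assuming estimates β̂ ~ N(β, diag(s²)) and a GP prior β ~ N(0, K) at the observed points (the general recipe; not code from the fashr package):

```python
import numpy as np

def gp_posterior(beta_hat, se, K):
    """Posterior of the true effect values beta at the observed points,
    given beta_hat | beta ~ N(beta, diag(se^2)) and prior beta ~ N(0, K).

    Standard Gaussian conjugacy:
      mean = K (K + S)^{-1} beta_hat,   cov = K - K (K + S)^{-1} K.
    """
    S = np.diag(se**2)
    gain = K @ np.linalg.inv(K + S)        # (small n per unit, so explicit inverse is fine)
    post_mean = gain @ beta_hat            # noisy estimates shrunk toward the prior mean (baseline)
    post_cov = K - gain @ K                # pointwise variances give credible intervals
    return post_mean, post_cov
```

With K = I and unit standard errors, for example, every estimate is shrunk exactly halfway to zero; with an L-GP covariance, the same formula instead shrinks toward the smooth baseline shapes.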

For hypothesis testing, the baseline model defines a null set S₀ (e.g., all constant functions). The posterior probability that a unit’s function lies in S₀ is used to compute local false discovery rates (lfdr) and local false sign rates (lfsr). These quantities give a calibrated, unit‑specific measure of evidence against the null while controlling the overall false discovery rate in the multiple‑testing setting. The authors also propose a Bayes‑factor‑based prior adjustment that inflates the prior weight on the null when the alternative prior is misspecified, thereby guaranteeing conservative inference. This adjustment is generic and could be applied to other empirical Bayes shrinkage methods.
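The lfdr computation has the same shape in any two-group empirical Bayes model: it is the posterior probability of the null given the data, computed from the estimated null weight π₀ and the null/alternative marginal densities. The toy below uses scalar normal means rather than FASH's functional null, purely to make the formula concrete (all names and the specific densities are illustrative):

```python
import numpy as np

def _npdf(x, sd=1.0):
    """Normal density N(0, sd^2) evaluated at x."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def lfdr(z, pi0, sd_alt):
    """Local false discovery rate in a two-group normal-means toy model
    (a scalar stand-in for FASH's functional null set S0):
      null:        z ~ N(0, 1)
      alternative: z ~ N(0, 1 + sd_alt^2)  (effect prior N(0, sd_alt^2))
    lfdr = pi0 * f0(z) / (pi0 * f0(z) + (1 - pi0) * f1(z)).
    """
    f0 = _npdf(z)
    f1 = _npdf(z, sd=np.sqrt(1.0 + sd_alt**2))
    return pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)
```

In this form the prior adjustment described above amounts to replacing π₀ with a larger value before computing lfdr, which can only make the reported significance more conservative.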

The methodology is applied to a publicly available dynamic eQTL dataset from Strober et al. (2019), which tracks cardiomyocyte differentiation over 16 days in induced pluripotent stem cells. The dataset contains over one million gene‑variant pairs, each measured at 16 time points. Using FASH, the authors fit two separate baseline models: a constant baseline (testing “any change over time?”) and a linear baseline (testing “non‑linear change?”). The adaptive shrinkage yields smoothed effect curves that respect the data while borrowing strength across all pairs. Compared with the original analysis that relied on fixed interaction terms, FASH discovers many additional dynamic eQTLs, reveals diverse temporal patterns (e.g., transient spikes, delayed activation, oscillations), and provides tighter credible intervals. Importantly, the method works directly on summary statistics, making it applicable when individual‑level data are unavailable due to privacy constraints.

All procedures are implemented in the R package fashr, which includes functions for defining L‑GP priors, estimating hyper‑parameters, computing posterior summaries, and calculating lfdr/lfsr. The package is well‑documented, includes a vignette reproducing the cardiomyocyte analysis, and is designed to be extensible to other functional data contexts such as drug‑response curves, environmental exposure time‑courses, or any setting where effects vary smoothly over a continuous covariate.

The authors acknowledge limitations: the method requires a moderate number of measurement points per unit (ideally > 3) to estimate smoothness reliably, and the current L‑GP family may struggle with highly periodic or abrupt change‑point patterns without further kernel extensions. Future work may explore richer differential operators, multivariate condition variables, and integration with deep learning representations.

In summary, FASH provides a statistically principled, computationally efficient, and broadly applicable solution for estimating and testing functional effects in high‑dimensional, noisy, time‑course data. By learning a data‑driven prior and performing adaptive shrinkage, it simultaneously improves effect‑function estimation, increases power for detecting dynamic eQTLs, and delivers well‑calibrated inference, representing a substantial advance over existing methods in functional genomics and beyond.

