Bayesian Analysis for miRNA and mRNA Interactions Using Expression Data
MicroRNAs (miRNAs) are small RNA molecules composed of 19-22 nt, which play important regulatory roles in post-transcriptional gene regulation by inhibiting the translation of the mRNA into proteins or otherwise cleaving the target mRNA. Inferring miRNA targets provides useful information for understanding the roles of miRNA in biological processes that are potentially involved in complex diseases. Statistical methodologies for point estimation, such as the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm, have been proposed to identify the interactions of miRNA and mRNA based on sequence and expression data. In this paper, we propose using the Bayesian LASSO (BLASSO) and the non-negative Bayesian LASSO (nBLASSO) to analyse the interactions between miRNA and mRNA using expression data. The proposed Bayesian methods explore the posterior distributions for those parameters required to model the miRNA-mRNA interactions. These approaches can be used to observe the inferred effects of the miRNAs on the targets by plotting the posterior distributions of those parameters. For comparison purposes, the Least Squares Regression (LSR), Ridge Regression (RR), LASSO, non-negative LASSO (nLASSO), and the proposed Bayesian approaches were applied to four public datasets. We concluded that nLASSO and nBLASSO perform best in terms of sensitivity and specificity. Compared to the point estimate algorithms, which only provide single estimates for those parameters, the Bayesian methods are more meaningful and provide credible intervals, which take into account the uncertainty of the inferred interactions of the miRNA and mRNA. Furthermore, Bayesian methods naturally provide statistical significance to select convincing inferred interactions, while point estimate algorithms require a manually chosen threshold, which is less meaningful, to choose the possible interactions.
💡 Research Summary
The paper addresses the problem of inferring functional interactions between microRNAs (miRNAs) and messenger RNAs (mRNAs) from expression data. Traditional point‑estimate methods such as least squares regression (LSR), ridge regression (RR), LASSO, and non‑negative LASSO (nLASSO) have been widely used, but they produce a single coefficient per miRNA‑mRNA pair and require a manually chosen threshold to decide which interactions to report. To overcome these limitations, the authors propose two Bayesian extensions: Bayesian LASSO (BLASSO) and non‑negative Bayesian LASSO (nBLASSO). Both models place a Laplace prior on the regression coefficients (a prior that can be written as a scale mixture of normals), enabling ℓ1‑type shrinkage while generating full posterior distributions via Gibbs sampling. The nBLASSO further imposes a non‑negativity constraint on the coefficients, reflecting the biological expectation that miRNAs down‑regulate their target mRNAs.
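To make the sampling step concrete, here is a minimal sketch of a Gibbs sampler for the unconstrained Bayesian LASSO. It assumes the standard hierarchical formulation of Park and Casella (2008) with a fixed penalty `lam`; the paper's own sampler, its treatment of the penalty, and its handling of the non‑negativity constraint may differ. The function name `blasso_gibbs` and all of its parameters are hypothetical.

```python
import numpy as np

def blasso_gibbs(X, y, lam=1.0, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for the Bayesian LASSO with a fixed penalty lam (assumed).

    X: (n, p) predictor matrix, e.g. miRNA expression (assumed standardized).
    y: (n,) response, e.g. expression of one mRNA.
    Returns an array of posterior draws of the coefficients, shape
    (n_iter - burn_in, p).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    y = y - y.mean()                      # centre the response; no intercept
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = 1.0                          # noise variance
    inv_tau2 = np.ones(p)                 # 1 / tau_j^2, the latent scales
    draws = np.empty((n_iter - burn_in, p))

    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}), A = X'X + diag(1/tau^2)
        A = XtX + np.diag(inv_tau2)
        L = np.linalg.cholesky(A)         # A = L L'
        mean = np.linalg.solve(A, Xty)
        z = rng.standard_normal(p)
        beta = mean + np.sqrt(sigma2) * np.linalg.solve(L.T, z)

        # sigma2 | rest ~ Inverse-Gamma(shape, rate)
        resid = y - X @ beta
        shape = (n - 1) / 2 + p / 2
        rate = resid @ resid / 2 + beta @ (inv_tau2 * beta) / 2
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)

        # 1/tau_j^2 | rest ~ Inverse-Gaussian(mean=mu_j, shape=lam^2)
        mu = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        inv_tau2 = rng.wald(mu, lam**2)

        if it >= burn_in:
            draws[it - burn_in] = beta
    return draws
```

Each column of the returned array holds the posterior draws for one miRNA coefficient, so a histogram of a column gives the kind of posterior plot the paper describes.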
Four publicly available miRNA‑mRNA expression datasets, covering different tissue types and disease contexts, were used for empirical evaluation. For each dataset the authors fitted LSR, RR, LASSO, nLASSO, BLASSO, and nBLASSO models and compared them using receiver‑operating‑characteristic (ROC) curves, area under the curve (AUC), sensitivity, specificity, and F1‑score. The results consistently showed that the non‑negative approaches (nLASSO and nBLASSO) achieved the highest sensitivity and specificity. Importantly, nBLASSO provided posterior means together with 95% credible intervals for each coefficient, allowing researchers to visualize the magnitude of miRNA effects and the associated uncertainty. The Bayesian framework also yields a natural measure of statistical significance: an interaction can be declared significant when the posterior supports a non‑zero effect at a pre‑specified level (e.g., 0.95), rather than by tuning a cut‑off on the point estimates by hand.
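As an illustration of how posterior draws turn into credible intervals, a selection rule, and the sensitivity and specificity figures used in the comparison, the sketch below may help. It assumes an array of posterior draws with one column per miRNA coefficient and a boolean vector of known true interactions; both function names are hypothetical. The rule "select when the interval excludes zero" is one common convention rather than the paper's confirmed rule, and it would need adapting for a non‑negative model, where all draws sit on one side of zero.

```python
import numpy as np

def summarize_posterior(draws, level=0.95):
    """Posterior mean, equal-tailed credible interval, and a selection flag.

    draws: (n_draws, p) array of posterior samples, one column per coefficient.
    A coefficient is flagged when its interval excludes zero (assumed rule).
    """
    alpha = 1.0 - level
    lower = np.quantile(draws, alpha / 2, axis=0)
    upper = np.quantile(draws, 1 - alpha / 2, axis=0)
    mean = draws.mean(axis=0)
    selected = (lower > 0) | (upper < 0)
    return mean, lower, upper, selected

def sensitivity_specificity(selected, truth):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    selected = np.asarray(selected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(selected & truth)
    fn = np.sum(~selected & truth)
    tn = np.sum(~selected & ~truth)
    fp = np.sum(selected & ~truth)
    sens = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    spec = tn / (tn + fp) if (tn + fp) > 0 else float("nan")
    return float(sens), float(spec)
```

The `level` argument plays the role of the pre‑specified significance level, so no separate cut‑off on coefficient magnitude is needed.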
Beyond performance metrics, the paper highlights several conceptual advantages of the Bayesian methods. First, the credible intervals convey the reliability of each inferred interaction, guiding experimental validation priorities. Second, the posterior distribution captures the full uncertainty, which is lost in point‑estimate approaches. Third, the non‑negative constraint aligns the statistical model with known biology, ruling out coefficients of the wrong sign that would imply miRNA‑mediated up‑regulation.
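The effect of the non‑negativity constraint is easiest to see in the point‑estimate setting. The sketch below fits a non‑negative LASSO by coordinate descent, minimizing (1/2)·||y − Xβ||² + λ·Σβ_j subject to β_j ≥ 0. It assumes a sign convention in which the response is coded so that down‑regulation corresponds to a positive coefficient; the paper's exact parameterization is not given in this summary, and the function name `nonneg_lasso` is hypothetical.

```python
import numpy as np

def nonneg_lasso(X, y, lam=1.0, n_sweeps=200):
    """Non-negative LASSO via cyclic coordinate descent (illustrative sketch).

    Minimizes 0.5 * ||y - X @ beta||^2 + lam * sum(beta) with beta >= 0.
    X is assumed to have no all-zero columns.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)         # squared norm of each column
    resid = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes j
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            # Soft-threshold, then clamp at zero to enforce the constraint
            new = max(0.0, (rho - lam) / col_sq[j])
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta
```

Any coefficient whose unconstrained estimate would be negative is clamped to exactly zero, so the fitted model cannot report an effect in the biologically implausible direction.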
The authors acknowledge computational cost as a limitation, since Gibbs sampling can be time‑consuming for high‑dimensional data. They also note that the choice of prior hyper‑parameters was not extensively explored, and that integration with other omics layers (e.g., proteomics, methylation) remains an open avenue. Future work could adopt variational Bayesian inference to speed up computation, perform sensitivity analyses on prior specifications, and extend the model to incorporate network‑based structural priors.
In summary, this study demonstrates that Bayesian LASSO, particularly its non‑negative variant, offers a statistically rigorous and biologically coherent framework for miRNA‑mRNA interaction inference. By delivering both point estimates and credible intervals, it enables more informed decision‑making in downstream functional studies and biomarker discovery. In the reported comparison, nBLASSO and nLASSO shared the best sensitivity and specificity, with nBLASSO adding the uncertainty quantification that point‑estimate methods lack.