Comparative Analysis of RNA Families Reveals Distinct Repertoires for Each Domain of Life

The RNA world hypothesis, that RNA genomes and catalysts preceded DNA genomes and genetically-encoded protein catalysts, has been central to models for the early evolution of life on Earth. A key part of such models is continuity between the earliest stages in the evolution of life and the RNA repertoires of extant lineages. Some assessments seem consistent with a diverse RNA world, yet direct continuity between modern RNAs and an RNA world has not been demonstrated for the majority of RNA families, and, anecdotally, many RNA functions appear restricted in their distribution. Despite much discussion of the possible antiquity of RNA families, no systematic analyses of RNA family distribution have been performed. To chart the broad evolutionary history of known RNA families, we performed comparative genomic analysis of over 3 million RNA annotations spanning 1446 families from the Rfam 10 database. We report that 99% of known RNA families are restricted to a single domain of life, revealing discrete repertoires for each domain. For the 1% of RNA families/clans present in more than one domain, over half show evidence of horizontal gene transfer, and the rest show a vertical trace, indicating the presence of a complex protein synthesis machinery in the Last Universal Common Ancestor (LUCA) and consistent with the evolutionary history of the most ancient protein-coding genes. However, with limited interdomain transfer and few RNA families exhibiting demonstrable antiquity as predicted under RNA world continuity, our results indicate that the majority of modern cellular RNA repertoires have primarily evolved in a domain-specific manner.


💡 Research Summary

The authors set out to test a central tenet of the RNA‑world hypothesis: that a substantial portion of today’s cellular RNA repertoire descends directly from an ancient, universal set of RNAs that existed before the three domains of life diverged. To do this, they performed a large‑scale comparative analysis using the Rfam 10 database, which contains 1446 curated RNA families. By mapping more than three million individual RNA annotations onto the three domains (Bacteria, Archaea, and Eukarya), they quantified how many families are shared across domains and how many are restricted to a single domain.
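The counting step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which this summary does not describe. It assumes each annotation has already been reduced to a (family, domain) pair; the family IDs, the function names, and the toy records are hypothetical and are not Rfam data.

```python
from collections import defaultdict

# The three cellular domains; anything else (e.g. viral hits) is ignored here.
# That filtering rule is an assumption of this sketch, not a detail from the paper.
DOMAINS = {"Bacteria", "Archaea", "Eukarya"}


def domains_per_family(annotations):
    """Map each family ID to the set of domains in which it is annotated."""
    seen = defaultdict(set)
    for family, domain in annotations:
        if domain in DOMAINS:
            seen[family].add(domain)
    return dict(seen)


def restricted_fraction(family_domains):
    """Return the fraction of families annotated in exactly one domain."""
    if not family_domains:
        return 0.0
    single = sum(1 for doms in family_domains.values() if len(doms) == 1)
    return single / len(family_domains)


# Toy, made-up records with hypothetical family IDs.
toy = [
    ("fam_A", "Bacteria"),
    ("fam_A", "Bacteria"),  # repeated annotations collapse into one domain
    ("fam_B", "Eukarya"),
    ("fam_C", "Archaea"),
    ("fam_C", "Bacteria"),  # fam_C spans two domains
    ("fam_D", "Viruses"),   # dropped: not one of the three domains
]

by_family = domains_per_family(toy)
fraction = restricted_fraction(by_family)
print(f"{fraction:.2f}")
```

In this toy input, two of the three retained families occur in a single domain. A family that turns up in more than one domain would still need separate phylogenetic scrutiny to tell horizontal transfer from vertical inheritance, which a simple count like this cannot do.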

The most striking result is that 99% of the known RNA families are found in only one domain. This overwhelming domain specificity runs against the expectation that a broad “RNA world” signature would be detectable across all life. Only about 1% of families or clans appear in more than one domain. The authors split this minority into two groups. The first group, representing just over half of the cross‑domain families, shows evidence of horizontal gene transfer (HGT). These RNAs are often associated with mobile genetic elements such as plasmids, bacteriophages, or transposons, and include small regulatory RNAs, introns, and certain ribosomal RNA variants. Their distribution shows that RNA genes can move between domains, but that such interdomain transfer has been limited.

The second group of cross‑domain RNAs lacks HGT signatures and instead displays a vertical inheritance pattern that can be traced back to the Last Universal Common Ancestor (LUCA). This set includes the core components of the translation machinery—5S rRNA, 16S/23S rRNA, transfer RNAs, RNase P RNA, and a few catalytic ribozymes. Their presence in all three domains provides strong evidence that LUCA already possessed a sophisticated protein‑synthesis apparatus, consistent with phylogenetic reconstructions of ancient protein‑coding genes.

The authors also discuss potential biases in the analysis. Rfam’s coverage is skewed toward well‑studied model organisms and experimentally validated RNAs, so rare or environmentally derived RNAs may be under‑represented. Nevertheless, the sheer dominance of domain‑restricted families in the curated dataset suggests that the observed pattern is real rather than a sampling artifact.

In sum, the study concludes that the majority of modern cellular RNAs have evolved in a domain‑specific manner, with only a tiny fraction showing either horizontal transfer or genuine antiquity. While a few essential RNAs can be traced back to LUCA, the broad continuity envisioned by the RNA‑world hypothesis is not supported by the current inventory of known RNA families. The findings highlight the importance of domain‑specific evolutionary trajectories and the limited role of interdomain HGT in shaping the RNA complement of contemporary organisms, and they call for further study of uncharacterized environmental RNAs that might yet reveal deeper evolutionary connections.

