Characteristics of transposable element exonization within human and mouse

Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidating the evolutionary constraints that have shaped the fixation of transposed elements within human and mouse protein-coding genes, and their subsequent exonization, is important for understanding how the exonization process has affected transcriptome and proteome complexity. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on transposed element (TE) fixation and the exonization process within human and mouse genes.


💡 Research Summary

The study investigates how transposable elements (TEs) become alternatively spliced cassette exons—a process termed exonization—in human and mouse protein‑coding genes, and examines the evolutionary forces shaping this phenomenon. By mapping TE insertions across the entire set of annotated coding genes in the human (GRCh38) and mouse (GRCm38) genomes, the authors discovered a striking positional bias: exonized TEs are disproportionately located near the 5′ end of the coding sequence, i.e., close to the start codon. This pattern suggests that selection may favor exonizations that minimally disrupt translation initiation and early protein domains, thereby preserving essential functions while still allowing novel sequence incorporation.
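The positional bias described above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' pipeline: it assumes each exonized TE has already been mapped to spliced-transcript coordinates, and the function names `relative_cds_position` and `fraction_in_first_half` are hypothetical helpers introduced only for illustration.

```python
from typing import Iterable


def relative_cds_position(insert_pos: int, cds_start: int, cds_end: int,
                          strand: str = "+") -> float:
    """Return where an insertion falls in the CDS as a fraction in [0, 1].

    0.0 corresponds to the start codon and 1.0 to the stop codon.
    Coordinates are assumed to be on the spliced transcript (an assumption
    of this sketch), with cds_start < cds_end on the forward axis.
    """
    if cds_end <= cds_start:
        raise ValueError("cds_end must be greater than cds_start")
    if not cds_start <= insert_pos <= cds_end:
        raise ValueError("insertion lies outside the coding sequence")
    frac = (insert_pos - cds_start) / (cds_end - cds_start)
    # On the minus strand the start codon sits at the high coordinate.
    return frac if strand == "+" else 1.0 - frac


def fraction_in_first_half(rel_positions: Iterable[float]) -> float:
    """Share of exonizations that fall in the first half of the CDS."""
    positions = list(rel_positions)
    if not positions:
        raise ValueError("no positions supplied")
    return sum(p < 0.5 for p in positions) / len(positions)


# Toy coordinates, invented for illustration only (not data from the paper).
toy = [
    relative_cds_position(150, 100, 300),        # plus strand
    relative_cds_position(150, 100, 300, "-"),   # minus strand
    relative_cds_position(120, 100, 300),
]
print(fraction_in_first_half(toy))
```

A value well above 0.5 across many genes would be consistent with a bias towards the beginning of the coding sequence; a proper analysis would also need a null expectation, for example from the distribution of intron positions.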

To assess population‑level dynamics, the authors integrated single‑nucleotide polymorphism (SNP) data from the 1000 Genomes Project (human) and the Mouse Genomes Project. They identified TE‑derived exons that are present in the transcriptome of some populations but absent in others, indicating that exonization can be population‑specific. For example, several Alu‑derived exons are spliced in African cohorts but not in East Asian groups, whereas the corresponding mouse loci remain non‑exonic across all examined strains. This population specificity implies that TE exonizations may contribute to genetic divergence and potentially to speciation events.

A comparative SNP density analysis revealed distinct mutational landscapes for different TE families. Alu elements, which dominate the human genome, exhibit a lower overall SNP density than other TEs, reflecting stronger purifying selection. Paradoxically, Alu sequences that have undergone exonization show a markedly higher SNP density than the surrounding intronic regions (approximately 1.8‑fold increase). In contrast, LINE‑derived exonized elements display relatively modest SNP enrichment, while other (non‑Alu) SINE elements fall in between. These differences indicate that the selective pressures acting on exonized TEs vary with TE type and that exonization may relax constraints, permitting the accumulation of mutations that could further diversify the resulting transcripts.
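The fold differences quoted in this kind of comparison come down to a ratio of two per-base SNP densities. The sketch below shows that arithmetic under simple assumptions: SNP counts and region lengths are already tabulated, and `snp_density` and `fold_enrichment` are hypothetical names. The numbers in the example are invented and do not come from the paper.

```python
def snp_density(snp_count: int, length_bp: int) -> float:
    """SNPs per base pair for one region."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    if snp_count < 0:
        raise ValueError("snp_count must be non-negative")
    return snp_count / length_bp


def fold_enrichment(exon_snps: int, exon_bp: int,
                    flank_snps: int, flank_bp: int) -> float:
    """Ratio of SNP density in an exonized TE to that in flanking intron."""
    flank = snp_density(flank_snps, flank_bp)
    if flank == 0:
        raise ValueError("flanking region has no SNPs; ratio is undefined")
    return snp_density(exon_snps, exon_bp) / flank


# Illustrative counts only: 6 SNPs in a 120 bp exonized element versus
# 50 SNPs in 2000 bp of surrounding intron.
print(round(fold_enrichment(6, 120, 50, 2000), 2))
```

With small regions the ratio is noisy, so in practice counts would be pooled across many elements of the same family before comparing densities.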

A particularly novel finding concerns primate‑specific Alu elements that depend on adenosine‑to‑inosine (A→I) RNA editing for their exonization. By cross‑referencing RNA‑seq data with known ADAR editing sites, the authors showed that editing converts otherwise non‑canonical splice sites into functional donor or acceptor motifs, enabling the inclusion of the Alu fragment as a new exon. This mechanism demonstrates that post‑transcriptional editing can create splice‑competent sites de novo, adding an additional layer of regulatory potential to TE‑driven transcriptome evolution.
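The editing mechanism can be illustrated at the sequence level. Because inosine is read as guanosine, an edited `AA` can act as an `AG` acceptor dinucleotide and an edited `AT` as a `GT` donor dinucleotide. The following sketch assumes DNA-alphabet sequences and 0-based edit positions; `apply_a_to_i` and `editing_dependent_sites` are hypothetical helpers, and the scan only checks the canonical dinucleotides, not the surrounding splice-site context that real splice-site prediction would require.

```python
from typing import Iterable, List, Tuple

CANONICAL = ("AG", "GT")  # acceptor and donor dinucleotides


def apply_a_to_i(seq: str, edit_positions: Iterable[int]) -> str:
    """Return seq with each edited adenosine written as G.

    Inosine is treated as guanosine, so A-to-I editing is modelled
    here as an A-to-G substitution at the given 0-based positions.
    """
    bases = list(seq.upper())
    for pos in edit_positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} is {bases[pos]}, not A")
        bases[pos] = "G"
    return "".join(bases)


def editing_dependent_sites(seq: str,
                            edit_positions: Iterable[int]) -> List[Tuple[int, str]]:
    """List (index, dinucleotide) pairs that are canonical only after editing."""
    original = seq.upper()
    edited = apply_a_to_i(original, edit_positions)
    hits = []
    for i in range(len(original) - 1):
        before, after = original[i:i + 2], edited[i:i + 2]
        # Keep sites that editing created, not ones that already existed.
        if after in CANONICAL and before != after:
            hits.append((i, after))
    return hits


# Toy sequence, invented for illustration: editing the second A of "AA"
# yields an "AG" that was absent from the unedited sequence.
print(editing_dependent_sites("TTCAACC", [4]))
```

Comparing the unedited and edited sequences, rather than scanning the edited one alone, is what separates editing-dependent sites from splice sites already encoded in the genome.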

Overall, the paper argues that TE exonization is not a random by‑product of genome instability but a multifaceted process shaped by (1) the genomic context of insertion (preferentially early coding regions), (2) population‑specific SNP landscapes, (3) the intrinsic mutational properties of distinct TE families, and (4) RNA‑level modifications such as ADAR‑mediated editing. By integrating genomic, population‑genetic, and transcriptomic analyses, the authors provide compelling evidence that TE fixation and subsequent exonization have substantially contributed to the expansion of transcriptomic and proteomic complexity in mammals. Their findings lay a foundation for future investigations into how TE‑derived exons influence phenotypic diversity, disease susceptibility, and the broader mechanisms of vertebrate evolution.

