Integrative genomics analysis identifies pericentromeric regions of human chromosomes affecting patterns of inter-chromosomal interactions
Genome-wide analysis of the density distributions of long-range interactions of human chromosomes with each other, with nucleoli, and with the nuclear lamina, together with those of binding sites of the chromatin state regulatory proteins CTCF and STAT1, identifies non-random, highly correlated patterns along the chromosome length for all of these features. Marked co-enrichment and clustering of all these interactions are detected at discrete genomic regions on selected chromosomes, which are located within pericentromeric heterochromatin and designated Centromeric Regions of Interphase Chromatin Homing (CENTRICH). CENTRICH manifest a 199- to 716-fold higher density of inter-chromosomal binding sites compared to genome-wide or chromosomal averages (p = 2.10E-101 to 1.08E-292). Sequence alignment analysis shows that CENTRICH represent unique DNA sequences of 3.9 to 22.4 kb in size which: 1) are associated with the nucleolus; 2) exhibit a highly diverse set of DNA-bound chromatin state regulators, including marked enrichment of CTCF and STAT1 binding sites; 3) bind multiple intergenic disease-associated genomic loci (IDAGL) with documented long-range enhancer activities and established links to increased risk of developing epithelial malignancies and other common human disorders. Using the distances of SNP loci homing sites within the genomic coordinates of CENTRICH as a proxy for the likelihood of disease-linked SNP loci binding to CENTRICH, we demonstrate statistically significant correlations between the probability of SNP loci binding to CENTRICH and GWAS-defined odds ratios of increased disease risk for cancer, coronary artery disease, and type 2 diabetes. Our analysis suggests that centromeric sequences and pericentromeric heterochromatin may play an important role in human cells beyond their critical functions in chromosome segregation.
💡 Research Summary
This study presents a comprehensive, genome‑wide integration of several high‑throughput datasets to uncover non‑random, highly correlated patterns of long‑range inter‑chromosomal contacts, nucleolar and lamina associations, and binding sites for the chromatin regulators CTCF and STAT1 across the human genome. By constructing density profiles for each feature in 1 kb–100 kb windows and performing pairwise correlation analyses, the authors demonstrate that these diverse genomic interactions co‑localize in discrete hotspots. Remarkably, on several chromosomes these hotspots are concentrated within pericentromeric heterochromatin, leading the authors to define a novel class of loci they term “Centromeric Regions of Interphase Chromatin Homing” (CENTRICH).
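The density-profile comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the window size, and the toy coordinates are all assumptions made for illustration. It bins feature positions along one chromosome into fixed-size windows and computes the Pearson correlation between two of the resulting profiles.

```python
from math import sqrt


def window_density(positions, chrom_length, window):
    """Count features falling in each fixed-size window of one chromosome."""
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if 0 <= pos < chrom_length:  # ignore out-of-range coordinates
            counts[pos // window] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


# Toy, made-up coordinates (0-based bp) on a hypothetical 1 Mb chromosome.
chrom_length = 1_000_000
window = 100_000  # assumed window size, chosen only for this example
contacts = [5_000, 12_000, 410_000, 415_000, 420_000, 430_000, 900_000]
ctcf_sites = [8_000, 405_000, 412_000, 418_000, 425_000, 433_000, 950_000]

contact_density = window_density(contacts, chrom_length, window)
ctcf_density = window_density(ctcf_sites, chrom_length, window)
r = pearson(contact_density, ctcf_density)
print(contact_density, ctcf_density, round(r, 3))
```

In this toy case both features pile up in the same window, so the two profiles correlate strongly. On real data the same comparison would be repeated for every pair of features and every chromosome.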
CENTRICH are short (3.9–22.4 kb) DNA segments with an extraordinary enrichment of inter-chromosomal binding sites: their density is 199- to 716-fold higher than genome-wide or chromosomal averages (p = 2.10E-101 to 1.08E-292). Sequence alignment shows that these regions are unique DNA sequences. They are associated with the nucleolus, carry a highly diverse set of DNA-bound chromatin state regulators with marked enrichment of CTCF and STAT1 binding sites, and bind multiple intergenic disease-associated genomic loci (IDAGL) that have documented long-range enhancer activities.
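To make the fold-enrichment figures concrete, the sketch below computes a density ratio and an upper-tail probability for a single region. It assumes a Poisson null model in which sites fall uniformly along the genome. The source does not state which statistical test the authors used, so the model, the function names, and the toy counts are all illustrative assumptions. The tail is summed in log space because p-values on the order of 1E-292 sit close to the limit of double-precision floats.

```python
from math import lgamma, log


def fold_enrichment(region_sites, region_bp, total_sites, total_bp):
    """Ratio of site density in a region to the background density."""
    return (region_sites / region_bp) / (total_sites / total_bp)


def poisson_log10_sf(k, lam, max_terms=10_000):
    """log10 of P(X >= k) for X ~ Poisson(lam), with lam > 0.

    Intended for the enrichment case (k well above lam), where the
    terms of the tail sum shrink quickly.
    """
    if k <= 0:
        return 0.0  # P(X >= 0) = 1
    # Natural log of the Poisson probability mass at k.
    log_pmf_k = -lam + k * log(lam) - lgamma(k + 1)
    # Sum pmf(k), pmf(k+1), ... relative to pmf(k) via the ratio lam/(i+1).
    total, term, i = 1.0, 1.0, k
    for _ in range(max_terms):
        term *= lam / (i + 1)
        i += 1
        total += term
        if term < 1e-17 * total:  # further terms no longer change the sum
            break
    return (log_pmf_k + log(total)) / log(10)


# Toy, made-up numbers: 50 sites in a 10 kb region, against 30,000 sites
# spread over a hypothetical 3 Gb genome.
region_sites, region_bp = 50, 10_000
total_sites, total_bp = 30_000, 3_000_000_000

fold = fold_enrichment(region_sites, region_bp, total_sites, total_bp)
expected = total_sites / total_bp * region_bp  # sites expected by chance
log10_p = poisson_log10_sf(region_sites, expected)
print(f"fold = {fold:.0f}, log10(p) = {log10_p:.1f}")
```

Reporting log10(p) rather than p itself avoids underflow when comparing regions whose enrichment is extreme.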
To assess functional relevance, the authors intersected GWAS-identified single-nucleotide polymorphisms (SNPs) linked to cancer, coronary artery disease, and type 2 diabetes with the genomic coordinates of CENTRICH. Using the distance of each SNP's "homing" site within a CENTRICH as a proxy for binding likelihood, they found a statistically significant positive correlation between the probability of a SNP locus binding to a CENTRICH and the disease-specific odds ratio reported in GWAS (p < 0.001). This suggests that CENTRICH may act as genomic "hubs" at which risk-associated loci converge, potentially amplifying their regulatory impact on gene expression.
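The kind of association test described here can be sketched with a rank correlation. The example assumes each SNP already has a numeric binding-likelihood proxy derived from its homing-site distance. How that proxy is computed is not specified in the source, so the scores, the odds ratios, and the choice of Spearman correlation are all hypothetical and stand in for whatever statistic the authors applied.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Toy, made-up values for five hypothetical SNPs (not data from the paper).
binding_proxy = [0.10, 0.35, 0.20, 0.80, 0.55]  # assumed likelihood scores
odds_ratios = [1.05, 1.20, 1.15, 1.45, 1.18]    # assumed GWAS odds ratios

rho = spearman(binding_proxy, odds_ratios)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based statistic is used in this sketch because odds ratios and distance-derived scores need not be linearly related. Assessing significance would additionally require a permutation test or a library routine.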
The findings challenge the traditional view that pericentromeric heterochromatin is solely a structural element required for chromosome segregation. Instead, the data support a model in which centromeric and pericentromeric DNA actively participates in three‑dimensional genome organization, serving as anchoring platforms for CTCF‑mediated loops, STAT1‑driven signaling, and nucleolar tethering. By concentrating diverse regulatory proteins and disease‑linked enhancers, CENTRICH may influence transcriptional programs across the nucleus and modulate susceptibility to common complex diseases.
The authors propose several avenues for future research: (1) CRISPR‑Cas9–mediated deletion or mutation of CENTRICH to test effects on inter‑chromosomal contact frequency and gene expression; (2) live‑cell super‑resolution imaging to monitor dynamic positioning of CENTRICH relative to the nucleolus and lamina throughout the cell cycle; (3) perturbation of CTCF or STAT1 binding within CENTRICH to dissect their contributions to loop formation and disease‑variant activity. Such experiments will be essential to validate the mechanistic role of CENTRICH and to determine whether their activity is cell‑type specific or altered in disease states.
In summary, this paper identifies pericentromeric regions—CENTRICH—as highly enriched, functionally versatile hubs that integrate structural nuclear architecture with regulatory protein binding and disease‑associated genetic variation. The work opens a new conceptual framework for understanding how “non‑coding” heterochromatic DNA can influence genome function and human disease, and it provides a solid foundation for future functional genomics and therapeutic investigations.