From DNA sequence analysis to modeling replication in the human genome
We explore the large-scale behavior of nucleotide compositional strand asymmetries along human chromosomes. As we observe for 7 of 9 origins of replication experimentally identified so far, the (TA+GC) skew displays rather sharp upward jumps, with a linear decreasing profile in between two successive jumps. We present a model of replication with well positioned replication origins and random terminations that accounts for the observed characteristic serrated skew profiles. We succeed in identifying 287 pairs of putative adjacent replication origins with an origin spacing approximately 1-2 Mbp that are likely to correspond to replication foci observed in interphase nuclei and recognized as stable structures that persist throughout subsequent cell generations.
💡 Research Summary
The paper investigates large‑scale nucleotide compositional strand asymmetries across human chromosomes, focusing on the (TA+GC) skew as a proxy for replication dynamics. By calculating the skew in sliding 10 kb windows, the authors observe that, at seven of the nine experimentally validated replication origins, the skew exhibits a sharp upward jump followed by a linear decline until the next jump. This “jagged” pattern suggests that replication forks initiate at fixed origins, advance at a roughly constant speed, and terminate at random positions, causing the skew to increase abruptly at each origin and then decay linearly as the newly synthesized strand dilutes the asymmetry.
To formalize this observation, the authors construct a stochastic replication model. In the model, origins are placed at predetermined genomic coordinates (derived from known origin clusters) while termination events are drawn from a Poisson process, reflecting the random collision of converging forks or stochastic fork collapse. The replication speed is assumed uniform, and the model incorporates real‑world parameters such as GC content, transcription directionality, and known ORC binding sites to calibrate the expected skew contribution of each fork. Monte‑Carlo simulations (10 000 replicates) generate synthetic skew profiles that closely match the empirical data, confirming that the combination of fixed origins and random terminations reproduces the observed serrated shape.
Applying the calibrated model genome‑wide, the authors scan for abrupt skew increases (jumps) and define the intervals between successive jumps as replication domains. Adjacent jumps are paired to infer putative origin pairs, yielding 287 candidate origin pairs distributed across all autosomes and the sex chromosomes. The average spacing between paired origins is approximately 1.3 Mbp (range 0.8–2.1 Mbp), which aligns with the size of replication foci visualized in interphase nuclei by microscopy. Moreover, a substantial fraction (≈68 %) of the predicted origins overlap with known ORC binding sites and other experimentally derived origin maps, providing independent validation of the approach.
The study demonstrates that strand‑asymmetry analysis, a computationally inexpensive method that relies solely on reference genome sequences, can be harnessed to infer replication origin locations at megabase resolution. The model’s success underscores the biological relevance of random termination events in shaping replication timing profiles and suggests that the observed skew pattern is a robust signature of the underlying replication program. Limitations include the influence of transcription‑associated mutational biases on skew calculations and the coarse resolution imposed by the 10 kb window size; integrating chromatin accessibility, histone modification, and replication timing data could further refine origin predictions.
In conclusion, the authors provide a quantitative framework linking nucleotide composition asymmetry to replication dynamics, identify a substantial set of putative replication origins, and propose that these origins correspond to the replication foci that persist through cell generations. Their methodology offers a scalable avenue for constructing high‑resolution replication maps in human cells, with potential applications in studying replication stress, genome instability in cancer, and the evolution of replication programs across species.
Comments & Academic Discussion
Loading comments...
Leave a Comment