Comment on "Evidence of Abundant and Purifying Selection in Humans for Recently Acquired Regulatory Functions"
Ward and Kellis (Reports, September 5 2012) identify regulatory regions in the human genome exhibiting lineage-specific constraint and estimate the extent of purifying selection. There is no statistical rationale for the examples they highlight, and their estimates of the fraction of the genome under constraint are biased by arbitrary designations of completely constrained regions.
Research Summary
Ward and Kellis (2012) argued that recently acquired regulatory functions in the human genome are abundant and subject to strong purifying selection. They integrated epigenomic datasets from ENCODE and the Roadmap Epigenomics Project, such as histone modifications, DNase I hypersensitivity, and transcription-factor binding, to delineate human-specific regulatory elements. By comparing these regions across a panel of primate genomes, they estimated lineage-specific constraint and reported that roughly 5–7% of the genome could be classified as "completely constrained," with the remaining regulatory regions showing elevated constraint signals relative to neutral expectations.
The present comment challenges the statistical and methodological foundations of those conclusions. First, the highlighted functional categories (e.g., neurodevelopment, immune response) are presented without pre-specified hypotheses, and no correction for multiple hypothesis testing is applied across the many Gene Ontology or pathway groups examined. Consequently, the apparent enrichment of constraint in selected categories may simply reflect random fluctuations. Second, the definition of "completely constrained" as 100% conserved bases across the primate panel is arbitrary; such perfect conservation is common among closely related species and does not necessarily indicate functional constraint in the human lineage. By treating any fully conserved site as a proxy for human-specific purifying selection, the authors inflate the proportion of the genome they claim is under constraint.
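The multiple-testing concern can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure, one standard way to control the false discovery rate. This is not the analysis of either Ward and Kellis or the comment; the function name and the p-values are hypothetical and chosen only to show how an uncorrected threshold and an FDR-controlled one can disagree.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at FDR level alpha."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical enrichment p-values for ten functional categories.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = sum(x < 0.05 for x in p)      # categories passing p < 0.05 uncorrected
fdr_hits = sum(benjamini_hochberg(p))      # categories surviving FDR control
print(naive_hits, fdr_hits)
```

With these made-up values, five categories pass the uncorrected 0.05 threshold but only two survive FDR control, which is the kind of gap the comment argues should have been checked.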
Third, the authors' reliance on mutation-rate-based metrics overlooks well-documented heterogeneity in mutation rates driven by replication timing, GC content, chromatin state, and local recombination rate. Their model does not adequately correct for these confounders, raising the possibility that observed reductions in polymorphism are driven by mutational cold spots rather than selection. Fourth, while the paper includes compelling visualizations, the underlying raw data, code, and detailed pipeline are not made publicly available, preventing independent replication and verification of the results.
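To see why a mutational cold spot can mimic selection, consider a toy neutral calculation. It uses the textbook expectations that per-site diversity in a diploid population is about 4·N_e·μ and that divergence between two lineages is about 2·μ·T (ignoring ancestral polymorphism and multiple hits). All parameter values below are hypothetical, and normalizing diversity by divergence is one commonly used correction assumed here for illustration, not a method described in the source.

```python
import math


def neutral_diversity(n_e: float, mu: float) -> float:
    """Expected per-site nucleotide diversity under neutrality (diploid): 4 * N_e * mu."""
    return 4.0 * n_e * mu


def neutral_divergence(mu: float, generations: float) -> float:
    """Approximate per-site divergence between two lineages: 2 * mu * T."""
    return 2.0 * mu * generations


# Hypothetical parameters, for illustration only.
N_E = 10_000          # effective population size
T = 250_000           # generations since the lineages split
MU_TYPICAL = 1.2e-8   # per-site, per-generation mutation rate in a typical region
MU_COLD = 0.6e-8      # mutation rate in a cold spot (half the typical rate)

pi_typical = neutral_diversity(N_E, MU_TYPICAL)
pi_cold = neutral_diversity(N_E, MU_COLD)

# Raw diversity is halved in the cold spot even though no selection is acting.
raw_ratio = pi_cold / pi_typical

# Dividing by divergence cancels the local mutation rate in this toy model.
norm_typical = pi_typical / neutral_divergence(MU_TYPICAL, T)
norm_cold = pi_cold / neutral_divergence(MU_COLD, T)

print(raw_ratio, norm_typical, norm_cold)
```

In this sketch the cold spot shows half the raw diversity of the typical region, yet the divergence-normalized values are identical. The normalization only works here because the toy model holds the local mutation rate constant over time; real data would also need the covariates listed above.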
Taken together, these issues suggest that the estimate of the fraction of the human genome under recent purifying selection is substantially biased upward. A more rigorous approach would require (i) explicit pre-registration of functional hypotheses and appropriate false-discovery-rate control, (ii) a biologically justified definition of constrained regions that distinguishes between deep evolutionary conservation and human-specific constraint, (iii) comprehensive modeling of mutation-rate heterogeneity, and (iv) full transparency of data and analytical workflows. Implementing these improvements would enable the community to more accurately assess the evolutionary dynamics of human regulatory DNA and to distinguish genuine recent functional innovation from artifacts of methodology.