Correcting pervasive errors in RNA crystallography through enumerative structure prediction
Three-dimensional RNA models fitted into crystallographic density maps exhibit pervasive conformational ambiguities, geometric errors and steric clashes. To address these problems, we present enumerative real-space refinement assisted by electron density under Rosetta (ERRASER), coupled to Python-based hierarchical environment for integrated ‘xtallography’ (PHENIX) diffraction-based refinement. On 24 data sets, ERRASER automatically corrects the majority of MolProbity-assessed errors, improves the average Rfree factor, resolves functionally important discrepancies in noncanonical structure and refines low-resolution models to better match higher-resolution models.
💡 Research Summary
RNA crystallography has produced a wealth of three‑dimensional structures, yet most models are solved at modest resolution (worse than 2.5 Å) and suffer from pervasive geometric problems: bond‑length and bond‑angle outliers, steric clashes, non‑rotameric backbone “suites,” and ambiguous sugar‑pucker assignments. Existing tools such as RNABC and RCrane can correct a subset of backbone errors but rely on manually placed anchors and do not address the full spectrum of problems.
In this work the authors introduce ERRASER (Enumerative Real‑space Refinement Assisted by Electron density under Rosetta), a fully automated pipeline that couples exhaustive conformational sampling of each nucleotide with a Rosetta energy function augmented by an electron‑density correlation term. For every residue the method enumerates plausible combinations of the backbone torsion angles (α, β, γ, δ, ε, ζ) together with both common sugar puckers (2′‑endo, 3′‑endo), scores each candidate using a weighted sum of the Rosetta physics‑based energy and the electron‑density fit term, and selects the lowest‑scoring conformation. The resulting model is then handed to PHENIX for conventional diffraction‑based refinement, so that real‑space correction is followed by reciprocal‑space optimization.
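The enumerate, score, and select step described above can be illustrated with a minimal Python sketch. This is not the ERRASER implementation: the torsion grid, the two scoring functions, and the weight are hypothetical stand-ins, since the summary does not give the sampling step or the actual Rosetta energy terms. The sketch also samples only two torsions to keep the toy search small.

```python
from itertools import product

# Hypothetical coarse torsion grid in degrees (assumption; not ERRASER's sampling).
TORSION_GRID = (-60.0, 60.0, 180.0)
PUCKERS = ("2'-endo", "3'-endo")


def enumerate_candidates(n_torsions=2):
    """Yield every (torsions, pucker) combination on the toy grid."""
    for torsions in product(TORSION_GRID, repeat=n_torsions):
        for pucker in PUCKERS:
            yield torsions, pucker


def select_best(candidates, energy_fn, density_fn, density_weight=1.0):
    """Return the candidate minimizing energy + density_weight * density penalty."""
    return min(
        candidates,
        key=lambda cand: energy_fn(cand) + density_weight * density_fn(cand),
    )


def toy_energy(candidate):
    """Stand-in for a physics-based energy: favors torsions near 180 degrees."""
    torsions, _pucker = candidate
    return sum(abs(t - 180.0) for t in torsions) / 100.0


def toy_density_penalty(candidate):
    """Stand-in for a map-fit term: pretends the map favors 3'-endo."""
    _torsions, pucker = candidate
    return 0.0 if pucker == "3'-endo" else 1.0


best = select_best(enumerate_candidates(), toy_energy, toy_density_penalty)
print(best)
```

On this toy grid there are 3² × 2 = 18 candidates. Sampling all six torsions on the same three-value grid would give 3⁶ × 2 = 1,458 candidates per residue, which is why exhaustive enumeration is practical one nucleotide at a time but not for a whole chain at once.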
The authors benchmarked the ERRASER‑PHENIX workflow on 24 RNA‑containing crystal structures ranging from small pseudoknots to entire ribosomal subunits (resolution 2.0–3.7 Å). Starting PDB models displayed MolProbity‑detected errors at the following average rates: bond‑length outliers 0.53 %, angle outliers 1.18 %, serious steric clashes 18.0 per 1,000 atoms (the clash score), backbone‑suite rotamer outliers 19 %, and sugar‑pucker errors 5 %. After ERRASER‑PHENIX refinement, bond‑length/angle outliers were eliminated entirely, the average clash score dropped to 7.0 (roughly a 60 % reduction), backbone‑suite rotamer outliers fell to 8 %, and sugar‑pucker errors were reduced to 0.2 % (zero errors in 19 of 24 cases).
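As a quick arithmetic check on the averages quoted above, the relative reductions can be recomputed directly. The short Python sketch below uses only the numbers given in this summary; the helper name `percent_reduction` is illustrative.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Averages quoted in the summary: (starting model, ERRASER-PHENIX model).
reported = {
    "clash score": (18.0, 7.0),
    "suite outliers (%)": (19.0, 8.0),
    "pucker errors (%)": (5.0, 0.2),
}

reductions = {
    name: round(percent_reduction(before, after), 1)
    for name, (before, after) in reported.items()
}
for name, value in reductions.items():
    print(f"{name}: {value}% lower")
```

The clash-score reduction works out to about 61 %, consistent with the rounded figure of 60 % in the text.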
Crucially, the pipeline improved agreement with experimental diffraction data. The average R factor decreased from 0.210 to 0.199 and the free‑R (Rfree) from 0.255 to 0.243, with Rfree improving in 22 of the 24 cases. Compared with PHENIX alone, RNABC‑PHENIX, and RCrane‑PHENIX, ERRASER‑PHENIX consistently yielded the lowest MolProbity scores and the most favorable Rfree values, indicating that the method adds genuine structural information rather than over‑fitting.
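For readers unfamiliar with these statistics, the sketch below shows the standard crystallographic R factor, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. Rfree is the same quantity computed on a set of reflections held out of refinement. The amplitudes and free-set flags are made up for illustration, and the scale factor between observed and calculated amplitudes is assumed to have been applied already. This is not how PHENIX computes its reported values internally.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), amplitudes assumed pre-scaled."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Made-up structure-factor amplitudes (illustrative only).
f_obs = [100.0, 80.0, 60.0, 40.0, 50.0]
f_calc = [90.0, 85.0, 66.0, 37.0, 42.0]
# True marks a reflection held out of refinement (the "free" set).
free_flags = [False, False, False, False, True]

work = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if not is_free]
held_out = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if is_free]

r_work = r_factor(*zip(*work))
r_free = r_factor(*zip(*held_out))
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the free reflections never influence refinement, a drop in Rfree alongside a drop in R is the usual evidence that a model change reflects real structure rather than over-fitting.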
Functional validation was provided by correcting a discrepancy in the active site of a group I ribozyme. Two independent low‑resolution models of the same ribozyme had opposite sugar puckers for a key adenine, leading to divergent hydrogen‑bonding patterns. ERRASER‑PHENIX unified both models to the same 2′‑endo conformation and restored the biologically relevant hydrogen‑bond network, a result later confirmed by a higher‑resolution crystal structure and mutagenesis data.
The authors also applied the workflow to a newly solved hepatitis C virus IRES subdomain (PDB 3TZR). The ERRASER‑PHENIX model exhibited fewer MolProbity violations and lower R/Rfree, and was deposited as the final structure. An independent assessment comparing low‑resolution models refined with ERRASER‑PHENIX to their high‑resolution counterparts showed increased similarity in backbone torsions (average 80.5 % vs 64.9 % for PHENIX alone) and sugar‑pucker agreement (97 % vs 91.5 %).
Overall, ERRASER‑PHENIX provides a robust, automated solution for improving the geometric quality and diffraction fit of RNA crystal structures across a wide range of resolutions and sizes. The software is available in the current Rosetta release (3.4), as a ROSIE web server application, and integrated within the PHENIX suite, making it readily accessible to the structural biology community for routine use and for challenging cases such as RNA‑protein complexes or large ribosomal assemblies.