Universal geometrical factor of protein conformations as a consequence of energy minimization

The biological activity and functional specificity of proteins depend on their native three-dimensional structures determined by inter- and intra-molecular interactions. In this paper, we investigate the geometrical factor of protein conformation as a consequence of energy minimization in protein folding. Folding simulations of 10 polypeptides with chain lengths ranging from 183 to 548 residues show that the dimensionless ratio V/(A⟨r⟩) of the van der Waals volume V to the product of the surface area A and the average atomic radius ⟨r⟩ of the folded structures, calculated with the atomic radii settings used in SMMP [Eisenmenger F., et al., Comput. Phys. Commun., 138 (2001) 192], approaches 0.49 quickly during the course of energy minimization. A large-scale analysis of protein structures shows that the ratio for real and well-designed proteins is universal and equal to 0.491 ± 0.005. The fractional composition of hydrophobic and hydrophilic residues does not affect the ratio substantially. The ratio also holds for intrinsically disordered proteins, while it ceases to be universal for polypeptides with bad folding properties.

💡 Research Summary

The paper investigates a universal geometric property of protein structures that emerges as a consequence of energy minimization during folding. Using the atomic radii settings of SMMP, the authors assign a van der Waals radius to each atom (C 1.70 Å, N 1.55 Å, O 1.52 Å, H 1.20 Å, etc.) and compute the total van der Waals volume (V) and surface area (A) of a protein model. They then define a dimensionless ratio R = V/(A·⟨r⟩), where ⟨r⟩ is the average atomic radius of the model. Dividing the volume by both the surface area and the atomic size yields a single number that reflects how “compact” a structure is relative to a sphere.
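As a rough illustration of how R = V/(A·⟨r⟩) could be evaluated, the sketch below treats a structure as a union of spheres, estimates A by sampling points on each sphere, and obtains V from the same points via the divergence theorem. It is a stand-in under stated assumptions, not the routine used by the authors or by SMMP; the names `fibonacci_sphere` and `geometric_ratio` and the parameter `n_points` are hypothetical.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors (golden-spiral points)."""
    golden = math.pi * (1.0 + math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        s = math.sqrt(max(0.0, 1.0 - z * z))
        phi = golden * (k + 0.5)
        points.append((s * math.cos(phi), s * math.sin(phi), z))
    return points

def geometric_ratio(centers, radii, n_points=2000):
    """Estimate (R, V, A) for a union of spheres, with R = V / (A * <r>)."""
    unit = fibonacci_sphere(n_points)
    m = len(centers)
    # Reference point for the volume integral; any point is valid,
    # the centroid simply keeps rounding errors small.
    origin = [sum(c[d] for c in centers) / m for d in range(3)]
    area = 0.0
    volume = 0.0
    for i in range(m):
        ci, ri = centers[i], radii[i]
        d_area = 4.0 * math.pi * ri * ri / n_points  # area per sample point
        for u in unit:
            p = (ci[0] + ri * u[0], ci[1] + ri * u[1], ci[2] + ri * u[2])
            # A sample point lies on the outer surface only if no other
            # sphere contains it.
            buried = any(
                j != i and math.dist(p, centers[j]) < radii[j]
                for j in range(m)
            )
            if buried:
                continue
            area += d_area
            # Divergence theorem: V = (1/3) * integral of (x - origin) . n dA,
            # where the outward normal n on sphere i is the unit vector u.
            dot = sum((p[d] - origin[d]) * u[d] for d in range(3))
            volume += d_area * dot / 3.0
    mean_radius = sum(radii) / m
    return volume / (area * mean_radius), volume, area

# Two unit spheres whose centers are one radius apart.
ratio, vol, surf = geometric_ratio([(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)], [1.0, 1.0])
print(ratio, vol, surf)
```

For a single isolated sphere the ratio is exactly 1/3, and for the two overlapping unit spheres in the example the analytic value is (9π/4)/(6π) = 0.375, so in this example overlap between neighbouring spheres raises R above the single-sphere value. The pure-Python loops scale with the square of the number of atoms; a real structure would call for a neighbour list or a vectorized implementation.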

First, ten polypeptides with lengths ranging from 183 to 548 residues are subjected to steepest‑descent followed by conjugate‑gradient energy minimization. Starting from random coil conformations, the ratio R rises from ~0.35–0.40 and rapidly converges to ~0.49 within ~10⁴ minimization steps, after which it remains stable within ±0.005. The convergence occurs irrespective of chain length, secondary‑structure composition, or the proportion of hydrophobic versus hydrophilic residues, indicating that the minimization drives the system toward a common geometric optimum.
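The two-stage protocol described above (steepest descent, then conjugate gradient) can be sketched on a toy system. Everything in the block is an assumption made for illustration: the energy is a small 2D bead chain with harmonic bonds and Lennard-Jones contacts rather than the force field used in the paper, the gradient is numerical, the conjugate-gradient variant is Polak-Ribière with a simple backtracking line search, and all function names and step counts are hypothetical.

```python
import math

def energy(x):
    """Toy energy of a 2D bead chain; x = [x0, y0, x1, y1, ...]."""
    n = len(x) // 2
    e = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(x[2 * i] - x[2 * j], x[2 * i + 1] - x[2 * j + 1])
            d = max(d, 1e-6)  # guard against coinciding beads
            if j == i + 1:
                e += 10.0 * (d - 1.0) ** 2        # harmonic bond
            else:
                e += 4.0 * (d ** -12 - d ** -6)   # Lennard-Jones contact
    return e

def gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def line_search(f, x, direction, f0, step=1.0):
    """Backtracking: halve the step until the energy decreases."""
    while step > 1e-12:
        trial = [xi + step * di for xi, di in zip(x, direction)]
        ft = f(trial)
        if ft < f0:
            return trial, ft
        step *= 0.5
    return x, f0  # no improving step found; stay in place

def minimize(f, x, sd_steps=50, cg_steps=200):
    """Steepest descent followed by Polak-Ribiere conjugate gradient."""
    fx = f(x)
    for _ in range(sd_steps):                      # stage 1
        g = gradient(f, x)
        x, fx = line_search(f, x, [-gi for gi in g], fx)
    f_sd = fx
    g = gradient(f, x)
    d = [-gi for gi in g]
    for _ in range(cg_steps):                      # stage 2
        gg = sum(gi * gi for gi in g)
        if gg < 1e-16:
            break
        x, fx = line_search(f, x, d, fx)
        g_new = gradient(f, x)
        # beta is clipped at zero, which restarts along the plain gradient.
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g)) / gg)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x, f_sd, fx

# Stretched zigzag chain of six beads as a stand-in starting conformation.
start = [c for i in range(6) for c in (1.3 * i, 0.3 * (-1) ** i)]
final, e_after_sd, e_final = minimize(energy, start)
print(energy(start), e_after_sd, e_final)
```

Because the line search only accepts steps that lower the energy, the energy never increases from one stage to the next. Tracking R along such a trajectory would mean evaluating the volume and surface area of each intermediate conformation.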

To test whether this behavior is present in real proteins, the authors analyze a large, non‑redundant set of >5,000 high‑resolution crystal structures from the Protein Data Bank. For each structure they compute V, A, and the average atomic radius ⟨r⟩ using the same atomic radii and obtain an average R of 0.491 with a standard deviation of only 0.005. The narrow distribution holds for globular proteins, membrane proteins, and even intrinsically disordered proteins (IDPs), whose average R is 0.492, showing that the ratio is not limited to well‑folded, compact domains. Conversely, proteins known to misfold or aggregate (e.g., amyloid‑forming peptides, prion mutants) display lower R values (0.45–0.48), suggesting that deviation from the universal ratio correlates with poor folding propensity.

The authors interpret the result in terms of a physical optimization problem: a body that minimizes surface energy for a given volume tends toward a spherical shape, which has the smallest possible A/V ratio. Proteins, despite their complex topology and heterogeneous interactions, appear to obey a similar principle when they reach a low‑energy native state. The ratio therefore reflects a balance between internal packing efficiency and exposure of surface residues, a balance that is largely independent of the detailed amino‑acid composition.

From a practical standpoint, the universal ratio can serve as a quick quality‑control metric for computational models, de‑novo designs, and homology‑based predictions. By checking whether a candidate structure yields R ≈ 0.49, modelers can assess whether the geometry is physically plausible before investing in more expensive simulations or experimental validation. The authors propose that incorporating this geometric constraint into protein‑design algorithms could improve the stability and functional reliability of engineered proteins.
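A quality-control check of this kind could look like the sketch below, which compares a model's R against the mean and standard deviation quoted in this summary (0.491 and 0.005). The three-sigma cutoff, the function names, and the example model values are assumptions chosen for illustration and do not come from the paper.

```python
# Mean and standard deviation of R as quoted in this summary.
R_MEAN = 0.491
R_SD = 0.005

def r_z_score(r_value, mean=R_MEAN, sd=R_SD):
    """Number of standard deviations between r_value and the reported mean."""
    return (r_value - mean) / sd

def is_plausible(r_value, max_sigma=3.0):
    """Flag a model as plausible if R lies within max_sigma SDs of the mean.

    The default cutoff of 3 is an arbitrary illustrative choice.
    """
    return abs(r_z_score(r_value)) <= max_sigma

# Hypothetical candidate models and their (made-up) R values.
for name, r in [("model_a", 0.489), ("model_b", 0.455)]:
    print(name, round(r_z_score(r), 1), is_plausible(r))
```

A cutoff expressed in standard deviations keeps the check tied to the spread reported for real structures; a stricter or looser threshold may suit a given modeling pipeline better.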

In conclusion, the study establishes four key points: (1) energy minimization drives diverse polypeptide chains toward a common V/(A·⟨r⟩) ratio of ~0.49; (2) this ratio is observed universally across a broad spectrum of natural proteins, including IDPs; (3) the ratio is insensitive to the overall hydrophobic/hydrophilic composition; and (4) deviations from the ratio are indicative of misfolded or aggregation‑prone sequences. By revealing a simple, quantitative geometric invariant of protein structures, the work adds a valuable tool for structural biology, computational protein design, and the study of folding diseases.