Minimal models for proteins and RNA: From folding to function

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

We present a panoramic view of the utility of coarse-grained (CG) models to study folding and functions of proteins and RNA. Drawing largely on the methods developed in our group over the last twenty years, we describe a number of key applications ranging from folding of proteins with disulfide bonds to functions of molecular machines. After presenting the theoretical basis that justifies the use of CG models, we explore the biophysical basis for the emergence of a finite number of folds from lattice models. The lattice model simulations of approach to the folded state show that non-native interactions are relevant only early in the folding process, a finding that rationalizes the success of structure-based models that emphasize native interactions. Applications of off-lattice $C_{\alpha}$ models and models that explicitly consider side chains ($C_{\alpha}$-SCM) to folding of a $\beta$-hairpin and the effects of macromolecular crowding are briefly discussed. Successful application of a new class of off-lattice model, referred to as the Self-Organized Polymer (SOP), is shown by describing the response of Green Fluorescent Protein (GFP) to mechanical force. The utility of the SOP model is further illustrated by applications that clarify the functions of the chaperonin GroEL and motion of the molecular motor kinesin. We also present two distinct models for RNA, namely, the Three Site Interaction (TIS) model and the SOP model, that probe forced unfolding and force quench refolding of a simple hairpin and the Azoarcus ribozyme. The predictions based on the SOP model show that force-induced unfolding pathways of the ribozyme can be dramatically changed by varying the loading rate. We conclude with a discussion of future prospects for the use of coarse-grained models in addressing problems of outstanding interest in biology.


💡 Research Summary

This review provides a panoramic overview of coarse‑grained (CG) modeling approaches that have been developed and refined over the past two decades to investigate protein and RNA folding as well as functional dynamics. The authors begin by establishing the theoretical justification for CG models: by reducing the degrees of freedom while preserving the essential physics of native interactions, one can capture the dominant features of the energy landscape without the computational cost of all‑atom simulations. Lattice models are employed to demonstrate that the number of viable protein folds is intrinsically limited; simulations reveal that non‑native contacts play a significant role only during the earliest stages of collapse, thereby rationalizing the success of structure‑based models that focus exclusively on native contacts.
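To make the lattice-model idea concrete, here is a minimal sketch of a structure-based (native-contact-only) energy on a toy lattice. It assumes a 2D square lattice and a short chain purely for illustration; the lattice dimensionality, chain length, and energy scale are not taken from the review, and the names `self_avoiding_walks`, `contacts`, and `go_energy` are hypothetical.

```python
from typing import List, Set, Tuple

Coord = Tuple[int, int]
MOVES: List[Coord] = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n_beads: int) -> List[List[Coord]]:
    """Enumerate every self-avoiding walk of n_beads sites starting at the origin."""
    walks: List[List[Coord]] = []

    def grow(path: List[Coord], occupied: Set[Coord]) -> None:
        if len(path) == n_beads:
            walks.append(list(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:  # excluded volume: one bead per site
                path.append(nxt)
                occupied.add(nxt)
                grow(path, occupied)
                occupied.remove(nxt)
                path.pop()

    grow([(0, 0)], {(0, 0)})
    return walks


def contacts(path: List[Coord]) -> Set[Tuple[int, int]]:
    """Pairs (i, j) with j > i + 1 whose beads sit on adjacent lattice sites."""
    index = {site: i for i, site in enumerate(path)}
    found: Set[Tuple[int, int]] = set()
    for i, (x, y) in enumerate(path):
        for dx, dy in MOVES:
            j = index.get((x + dx, y + dy))
            if j is not None and j > i + 1:  # skip bonded neighbors
                found.add((i, j))
    return found


def go_energy(path: List[Coord], native: Set[Tuple[int, int]], eps: float = 1.0) -> float:
    """Structure-based energy: -eps for each native contact that is formed.

    Non-native contacts contribute nothing (an assumption of this toy model).
    """
    return -eps * len(contacts(path) & native)


# A compact 3x3 serpentine chosen here as a hypothetical "native" fold.
NATIVE_PATH: List[Coord] = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1),
                            (0, 1), (0, 2), (1, 2), (2, 2)]
NATIVE_CONTACTS = contacts(NATIVE_PATH)
```

With these definitions, `self_avoiding_walks(5)` enumerates the 100 four-step walks on the square lattice, and `go_energy(NATIVE_PATH, NATIVE_CONTACTS)` evaluates the energy of the chosen native fold. Scoring every 9-bead walk against `NATIVE_CONTACTS` gives a small, exactly enumerable energy landscape of the kind such arguments rely on.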

The review then moves to off‑lattice representations. The simple Cα model, which treats each residue as a single bead, is contrasted with the Cα‑Side‑Chain Model (Cα‑SCM) that explicitly includes side‑chain geometry and electrostatics. Using a β‑hairpin as a test case, the authors show that Cα‑SCM captures subtle heterogeneity in transition‑state ensembles, while both models reproduce the effects of macromolecular crowding on folding rates and stability.

A major portion of the paper is devoted to the Self‑Organized Polymer (SOP) model, a versatile framework that represents each amino acid (or nucleotide) as a single interaction site but retains chain connectivity, native contact potentials, and non‑bonded repulsion. The SOP model successfully reproduces force‑extension curves for Green Fluorescent Protein (GFP) under mechanical pulling, matching single‑molecule force spectroscopy data and revealing intermediate states that are invisible to purely elastic models. The same model is applied to the chaperonin GroEL, elucidating the allosteric transitions that accompany ATP binding and hydrolysis, and to the molecular motor kinesin, where it captures the coordinated stepping motion driven by ATP‑dependent conformational changes.
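The ingredients listed above can be sketched as a single energy function. The summary does not give the functional form, so this sketch assumes the commonly quoted SOP form: FENE bonds between consecutive sites, a Lennard-Jones-like attraction for native contacts, and a soft repulsion for all other pairs. All parameter values (`k`, `r_max`, `eps_native`, `eps_rep`, `sigma`, `cutoff`) are illustrative assumptions, and the separate treatment of `i, i+2` pairs used in published versions is omitted for brevity.

```python
import math
from typing import Dict, Sequence, Tuple

Vec = Tuple[float, float, float]


def dist(a: Vec, b: Vec) -> float:
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def native_contact_map(native: Sequence[Vec], cutoff: float = 8.0,
                       min_sep: int = 3) -> Dict[Tuple[int, int], float]:
    """Map (i, j) -> native distance for pairs closer than `cutoff` in the native state."""
    n = len(native)
    return {(i, j): dist(native[i], native[j])
            for i in range(n) for j in range(i + min_sep, n)
            if dist(native[i], native[j]) < cutoff}


def sop_energy(coords: Sequence[Vec], native: Sequence[Vec],
               k: float = 20.0, r_max: float = 2.0,
               eps_native: float = 1.0, eps_rep: float = 1.0,
               sigma: float = 3.8, cutoff: float = 8.0) -> float:
    """SOP-style energy of `coords` relative to a `native` reference (assumed form)."""
    n = len(coords)
    native_pairs = native_contact_map(native, cutoff)
    energy = 0.0

    # Chain connectivity: FENE bonds centered on the native bond lengths.
    for i in range(n - 1):
        dr = dist(coords[i], coords[i + 1]) - dist(native[i], native[i + 1])
        if abs(dr) >= r_max:
            return math.inf  # bond stretched beyond its finite extensibility
        energy += -0.5 * k * r_max ** 2 * math.log(1.0 - (dr / r_max) ** 2)

    # Non-bonded terms: attraction for native pairs, repulsion otherwise.
    for i in range(n):
        for j in range(i + 3, n):
            r = dist(coords[i], coords[j])
            if (i, j) in native_pairs:
                x = (native_pairs[(i, j)] / r) ** 6
                energy += eps_native * (x * x - 2.0 * x)  # minimum -eps at r = r0
            else:
                energy += eps_rep * (sigma / r) ** 6
    return energy
```

Evaluated at the native coordinates themselves, the bond term vanishes and each native contact contributes `-eps_native`, which is a quick sanity check on any implementation of this kind.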

RNA modeling is addressed through two complementary CG schemes. The Three‑Site Interaction (TIS) model treats each nucleotide as three interaction sites (phosphate, sugar, base), allowing explicit representation of electrostatic screening, base‑stacking, and hydrogen‑bonding. In parallel, an RNA‑specific SOP model simplifies the ribozyme to a polymer chain with native contact potentials derived from the crystal structure. Both models are used to simulate forced unfolding and force‑quench refolding of a simple hairpin and the Azoarcus ribozyme. A striking prediction is that the unfolding pathways of the ribozyme are highly sensitive to the loading rate: slow pulling yields a stepwise unraveling through well‑defined intermediates, whereas rapid loading forces a more cooperative, abrupt transition.
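For reference, the loading rate mentioned above is conventionally defined for constant-speed pulling as follows. The symbols $r_f$ (loading rate), $k_s$ (effective spring constant of the pulling device and linker), and $v$ (pulling speed) are introduced here and are not taken from the summary; the linear force ramp is an approximation that assumes the spring is much softer than the molecule.

```latex
r_f = \frac{df}{dt} = k_s \, v , \qquad f(t) \approx r_f \, t
```

Under this definition the loading rate can be varied by changing either the pulling speed or the spring stiffness.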

The authors conclude by highlighting future prospects: integration of machine‑learning techniques for systematic parameter optimization, development of hybrid multi‑scale schemes that couple CG and atomistic regions, and expansion of CG models to address intrinsically disordered proteins, phase‑separating systems, and drug‑target interactions. Overall, the review underscores that coarse‑grained models, when judiciously chosen and calibrated, provide powerful, computationally efficient tools for probing the folding mechanisms and functional motions of biomolecules, bridging the gap between experimental observations and molecular‑level understanding.

