Harnessing the Peripheral Surface Information Entropy from Globular Protein-Peptide Complexes
Predicting favorable protein-peptide binding events remains a central challenge in biophysics, with continued uncertainty surrounding how nonlocal effects shape the global energy landscape. Here, we introduce peripheral surface information (PSI) entropy, a quantitative measure of the statistical variability in apolar and charged non-interacting surface (NIS) proportions across conformational ensembles. Using energy-directed molecular docking via HADDOCK3 and explicit-solvent molecular dynamics simulations, we demonstrate that favorable binding partners exhibit emergent, low-entropy N-states (discrete macrostates in NIS state space) indicative of preferential apolar/charged surface configurations. Across dozens of peptides and multiple receptor systems (WW, PDZ, and MDM2 domains), dominant N-states persisted under varied docking parameters and initial conditions. An experimental meta-ensemble of WW domains from 36 high-resolution structures confirmed the presence of dominant NIS modes independent of in silico methodology, suggesting an evolutionary selection pressure toward specific NIS fingerprints. These findings establish PSI entropy as a thermoinformatic descriptor that encodes favorable binding constraints into unique statistical signatures of the NIS.
💡 Research Summary
The authors address a long‑standing problem in biophysics: predicting which protein‑peptide pairs will bind favorably. Traditional approaches focus on the interface and often ignore non‑local contributions from surface regions that do not directly contact the partner. To capture these effects, the paper introduces “peripheral surface information (PSI) entropy” (SΨ), a statistical‑mechanical descriptor that quantifies the variability of apolar (Nₐ) and charged (N_c) fractions of the non‑interacting surface (NIS) across an ensemble of conformations.
Methodologically, each complex is processed as follows. The solvent‑accessible surface area (SASA) of every residue is converted to relative solvent accessibility (RSA). Residues with RSA ≥ 0.05 are classified as apolar (A), charged (C), or polar (P). Interface residues (those within a 5 Å heavy‑atom distance of the binding partner) are excluded, leaving a tuple of counts (n_A, n_C, n_P) that defines a macrostate. The authors generate large docking ensembles using HADDOCK3, which proceeds through three stages: rigid‑body docking, semi‑flexible refinement, and explicit‑solvent molecular dynamics. Energy terms (van der Waals, electrostatics, desolvation, restraint violations) are weighted differently at each stage, and the OPLS force field provides the underlying potentials.
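The macrostate assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the residue grouping in `RESIDUE_CLASS`, the input tuple layout, and the function names are all assumptions made for the example. Only the RSA cutoff of 0.05 and the exclusion of interface residues come from the summary.

```python
from collections import Counter

# Illustrative residue grouping (an assumption, not taken from the paper).
RESIDUE_CLASS = {
    **dict.fromkeys(
        ["ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO", "GLY", "CYS", "TYR"], "A"
    ),
    **dict.fromkeys(["ASP", "GLU", "LYS", "ARG", "HIS"], "C"),
    **dict.fromkeys(["SER", "THR", "ASN", "GLN"], "P"),
}

RSA_CUTOFF = 0.05  # threshold quoted in the summary


def nis_macrostate(residues):
    """Count apolar, charged, and polar residues on the non-interacting surface.

    `residues` is an iterable of (resname, rsa, at_interface) tuples; this
    layout is hypothetical and chosen only for the example.
    """
    counts = Counter()
    for resname, rsa, at_interface in residues:
        # Skip interface residues and residues below the RSA cutoff.
        if at_interface or rsa < RSA_CUTOFF:
            continue
        counts[RESIDUE_CLASS[resname]] += 1
    return counts["A"], counts["C"], counts["P"]


def nis_fractions(macrostate):
    """Convert (n_A, n_C, n_P) counts into apolar and charged NIS fractions."""
    total = sum(macrostate)
    if total == 0:
        return 0.0, 0.0
    n_a, n_c, _ = macrostate
    return n_a / total, n_c / total
```

For a single model, `nis_fractions(nis_macrostate(model))` would give the pair of apolar and charged proportions that locates that model in NIS state space.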
From each ensemble, the set of microstates Ω (all generated models) is partitioned into distinct macrostates (N‑states), each grouping the models that share identical (Nₐ, N_c) coordinates. The multiplicity g_i = g(N_i) counts how many microstates belong to macrostate i. Using Shannon’s information entropy, an unnormalized PSI entropy S′Ψ = −∑_i (g_i/|Ω|) log₂(g_i/|Ω|) is computed, where |Ω| is the total number of models. To relate this entropy to physical binding quality, the authors introduce a normalization factor K = Q/M, where Q is the number of distinct residue‑residue contacts in the ensemble and M is the class‑weighted total contact mass (weights γ(c_i,c_j) reflect favorable vs. unfavorable contact types). The final descriptor, SΨ = K · S′Ψ, can therefore be read as the ensemble entropy scaled by the number of distinct contacts per unit of weighted contact mass.
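As a rough sketch of the entropy calculation, the Python below groups models by their N-state label, computes the Shannon entropy of the resulting occupancies in bits, and applies the Q/M scaling. The function names are hypothetical, and Q and M are taken as precomputed inputs because the summary does not give the contact weights γ.

```python
import math
from collections import Counter


def psi_entropy_unnormalized(n_states):
    """Shannon entropy (bits) of N-state occupancies across an ensemble.

    `n_states` holds one hashable N-state label per model, for example an
    (apolar_fraction, charged_fraction) tuple.
    """
    multiplicity = Counter(n_states)    # g_i for each macrostate
    total = sum(multiplicity.values())  # total number of models
    return -sum(
        (g / total) * math.log2(g / total) for g in multiplicity.values()
    )


def psi_entropy(n_states, q_contacts, m_contact_mass):
    """Scale the unnormalized entropy by K = Q / M.

    Q and M are supplied by the caller; computing M requires the contact
    weights, which are not specified in the summary.
    """
    k = q_contacts / m_contact_mass
    return k * psi_entropy_unnormalized(n_states)
```

An ensemble split evenly between two N-states carries exactly one bit of unnormalized entropy, while an ensemble that collapses onto a single N-state carries none, which matches the low-entropy signature attributed to favorable binders.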
The central hypothesis is that energetically favorable complexes occupy a reduced subset of N‑states, yielding low SΨ, whereas unfavorable or promiscuous binders explore many N‑states and have high SΨ. To test this, the authors docked cognate (proper) and non‑cognate (improper) peptides to three receptor families (WW, PDZ, MDM2) under identical protocols. Across dozens of peptide variants and multiple docking parameter sweeps (varying the number of rigid‑body models and the number retained for semi‑flexible refinement), the favorable complexes consistently displayed a small number of dominant N‑states (often <30) and low SΨ values. In contrast, unfavorable complexes populated hundreds of N‑states and exhibited markedly higher SΨ.
Importantly, the phenomenon persisted when the starting peptide conformation was varied (NMR‑derived, AlphaFold2‑ or AlphaFold3‑predicted, or briefly MD‑equilibrated) and when the ensembles were generated by explicit‑solvent MD alone, indicating that the emergence of low‑entropy N‑states is robust to initial structural variation.
To validate that PSI entropy is not an artifact of the in‑silico pipeline, the authors assembled a meta‑ensemble of 36 experimentally solved WW‑peptide complexes (high‑resolution X‑ray or NMR structures). Even without any docking, the same dominant N‑states appeared, suggesting that evolutionary pressure has selected specific peripheral surface fingerprints (Nₐ, N_c patterns) that are conserved across homologous complexes.
The authors further propose three regimes for interpreting SΨ and the contact‑density ratio Q/M:
- Regime I (low SΨ, low Q/M): A focused set of high‑affinity contacts dominates; peripheral surface patterns converge to a few macrostates—typical of tight, specific binders.
- Regime II (high SΨ, high Q/M): Contact mass is distributed over many weak interactions; many distinct N‑states are sampled—characteristic of hub‑like or transient interactions.
- Regime III (intermediate SΨ, low Q/M): Strong core contacts coexist with peripheral flexibility, allowing multiple N‑states while maintaining overall affinity—useful for adaptable recognition.
These regimes provide a practical framework for protein design and drug discovery: low SΨ can be used as a filter to prioritize docking poses or engineered peptides, while the Q/M ratio indicates the breadth of the interaction landscape.
In conclusion, the study presents peripheral surface information entropy as a descriptor that captures non‑local contributions to binding thermodynamics and whose signal persists across docking, MD, and experimental ensembles. By quantifying how narrowly a complex samples the space of peripheral apolar/charged surface configurations, SΨ offers a complementary metric to traditional energy scores, may help discriminate favorable from unfavorable binders, and points to evolutionary conservation of surface fingerprints. The work opens avenues for integrating information‑theoretic measures into structure‑based design pipelines, with potential applications to peptide therapeutics and synthetic biology.