Physical principles of building protein megacomplexes in a crowded milieu
Multiple phenotypic protein expressions arising from one genome represent variations in relative protein abundance and stoichiometry. The lack of a definite composition challenges the modeling of protein megacomplexes and cellular architectures. Despite advances in AI-based protein structure prediction, the mechanisms of protein interactions and the emergence of the megacomplexes they assemble remain unclear. Here, we present a statistical physics framework based on the grand canonical ensemble to explore the protein interactions that drive the emergent assembly of a megacomplex, using observational mass spectrometry datasets that include relative protein abundances and cross-linked connections. Using the chromatin remodeler megacomplex INO80 as an example, we discovered a class of divergent proteins that play a critical role in orchestrating the assembly beyond nearest neighbors, dependent on the excluded volumes exerted by others. Under the excluded-volume constraints imposed by varying crowding contents, these divergent subunits orchestrate and form clusters with selective components that grow into configurationally distinct architectures. We propose a machinery view of the INO80 chromatin remodeler complex in which each loosely associated subunit can occasionally be recruited as an attachment to a core assembly, driven by excluded volumes. Our computational framework provides mechanistic insight into treating macromolecular crowding as a necessary physicochemical variable that represents cell states and remodels the configurations of protein megacomplexes with structurally loose modules.
💡 Research Summary
The manuscript tackles a fundamental problem in cellular biology: how thousands of proteins encoded by a single genome self‑assemble into large, often loosely defined megacomplexes despite the lack of a fixed stoichiometry. While recent advances in AI‑driven structure prediction (e.g., AlphaFold‑Multimer) have dramatically improved our ability to model individual proteins and small assemblies, they do not explain the physical principles that drive the emergence of megacomplexes in the crowded intracellular milieu.
To address this gap, the authors develop a statistical‑physics framework based on the grand canonical ensemble (μVT). In this formulation, the system is characterized by a chemical potential μ, a volume V, and a temperature T, which together define the probability of adding or removing any protein species. Crucially, the model incorporates experimentally measured protein relative abundances and chemical cross‑linking data obtained from mass‑spectrometry (MS) as hard constraints. By treating the observed MS data as the ensemble’s macroscopic observables, the authors can generate a thermodynamically consistent set of microscopic configurations that reproduce the measured abundance and connectivity patterns.
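For reference, the textbook grand canonical weight for a mixture of species can be written as below. Here N_i is the copy number of species i, μ_i its chemical potential, and E(x) the energy of configuration x at fixed volume V. The summary uses a single symbol μ, so the per-species index i and the symbols E, x, and Ξ are notation assumed here, not taken from the paper. The specific energy function, and the way the MS abundances fix the chemical potentials, are not given in this summary.

```latex
P(\{N_i\}, x) = \frac{1}{\Xi}
  \exp\!\left[-\beta\left(E(x) - \sum_i \mu_i N_i\right)\right],
\qquad
\Xi = \sum_{\{N_i\}} \sum_{x}
  \exp\!\left[-\beta\left(E(x) - \sum_i \mu_i N_i\right)\right],
\qquad
\beta = \frac{1}{k_B T}
```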
A central hypothesis of the work is that excluded‑volume effects—arising from macromolecular crowding—play a decisive role in shaping megacomplex architecture. In a dense cytoplasm, each protein occupies a finite volume that limits the configurational space available to its neighbors. The authors introduce a class of “divergent” subunits that, because of their relatively large excluded volume, influence not only their immediate binding partners but also more distant components of the assembly. These divergent proteins act as physical “organizers,” promoting the formation of clusters that extend beyond nearest‑neighbor interactions.
The INO80 chromatin‑remodeling complex serves as a test case. INO80 is known to consist of a relatively rigid core surrounded by many loosely associated subunits, many of which have eluded high‑resolution structural characterization. The authors feed the measured INO80 MS abundance profile and cross‑link map into a Monte‑Carlo simulation of the grand canonical ensemble. By systematically varying the crowding parameter (i.e., the total excluded volume contributed by the background proteome), they explore two regimes: high crowding (large excluded volume) and low crowding (small excluded volume).
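The insertion and deletion moves of a grand canonical Monte Carlo simulation can be illustrated with a toy model. The sketch below is not the authors' simulation: it assumes a single protein species, a non-interacting lattice gas, and inert crowders that simply block lattice sites. The names `gcmc_lattice`, `blocked`, `mu`, and `beta` are hypothetical and chosen only for illustration.

```python
import math
import random


def gcmc_lattice(n_sites, blocked, mu, beta=1.0, n_steps=200_000, seed=0):
    """Grand canonical Metropolis sampling of a non-interacting lattice gas.

    n_sites: number of lattice sites (each holds at most one particle).
    blocked: indices of sites taken by inert crowders (toy excluded volume).
    mu, beta: chemical potential and inverse temperature (assumed units).
    Returns the mean particle number over the second half of the run.
    """
    rng = random.Random(seed)
    blocked = frozenset(blocked)
    occupied = [False] * n_sites
    n_particles = 0
    activity = math.exp(beta * mu)  # z = exp(beta * mu)
    total, samples = 0, 0

    for step in range(n_steps):
        site = rng.randrange(n_sites)
        if site not in blocked:  # crowded sites reject every move
            if occupied[site]:
                # Deletion: statistical weight changes by a factor 1/z.
                if rng.random() < min(1.0, 1.0 / activity):
                    occupied[site] = False
                    n_particles -= 1
            else:
                # Insertion: statistical weight changes by a factor z.
                if rng.random() < min(1.0, activity):
                    occupied[site] = True
                    n_particles += 1
        if step >= n_steps // 2:  # discard the first half as burn-in
            total += n_particles
            samples += 1
    return total / samples


if __name__ == "__main__":
    low = gcmc_lattice(100, blocked=set(), mu=0.0)
    high = gcmc_lattice(100, blocked=set(range(40)), mu=0.0)
    print(f"mean occupancy, no crowders: {low:.2f}")
    print(f"mean occupancy, 40 blocked sites: {high:.2f}")
```

For this toy case the mean occupancy is known analytically as (number of free sites) × z / (1 + z), which gives a simple check on the sampler. A model closer to the one described would add several species, cross-link restraints, and pairwise interactions, none of which are specified in this summary.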
Simulation results reveal strikingly different organizational states. Under high‑crowding conditions, divergent subunits preferentially aggregate around the core, forming distinct peripheral clusters that are thermodynamically metastable. These clusters are not random; they selectively incorporate subunits involved in DNA binding, ATP hydrolysis, or histone interaction, suggesting functional compartmentalization driven purely by physical constraints. In low‑crowding simulations, the same divergent proteins are more dispersed, leading to a more fluid, less clustered megacomplex. Free‑energy analyses show that both states correspond to local minima separated by low barriers, implying that modest changes in cellular conditions (e.g., stress‑induced protein synthesis, osmotic shifts) could trigger rapid re‑organization of the complex.
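The summary does not say how the free energies were computed. One common approach, shown here only as a hedged sketch, is Boltzmann inversion of a sampled histogram: F(x) = -kT ln P(x) for some order parameter x. The function name `free_energy_profile` and the choice of order parameter (for example, the number of subunits in a peripheral cluster) are assumptions for illustration.

```python
import math
from collections import Counter


def free_energy_profile(samples, kT=1.0):
    """Estimate F(x) = -kT * ln P(x) from sampled order-parameter values.

    samples: discrete order-parameter values from a simulation.
    Returns a dict mapping each observed value to its free energy,
    shifted so that the lowest minimum is at zero.
    """
    counts = Counter(samples)
    total = len(samples)
    profile = {x: -kT * math.log(c / total) for x, c in counts.items()}
    f_min = min(profile.values())
    return {x: f - f_min for x, f in sorted(profile.items())}


if __name__ == "__main__":
    # Synthetic bimodal data: two equally populated states (x=1 and x=3)
    # separated by a rarely visited intermediate (x=2).
    toy_samples = [1] * 40 + [2] * 5 + [3] * 40
    for x, f in free_energy_profile(toy_samples).items():
        print(x, round(f, 3))
```

In the synthetic example the two populated states sit at equal free energy and the intermediate value acts as the barrier, with a height in units of kT equal to the log of the population ratio. Real trajectories would need enough sampling of the barrier region for this estimate to be meaningful.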
By treating excluded volume as a tunable “cell‑state variable,” the authors propose a mechanistic link between global physicochemical conditions and specific megacomplex architectures. This perspective reframes the megacomplex not as a static scaffold but as a dynamic ensemble whose composition and geometry can be remodeled by altering crowding. The framework also offers a practical route to integrate AI‑based structural predictions with experimental MS data: predicted inter‑subunit distances can be weighted by crowding‑dependent potentials, allowing the placement of loosely bound subunits that are otherwise invisible to cryo‑EM or X‑ray crystallography.
The discussion extends these findings to broader biological contexts. For instance, transcriptional co‑activator complexes, spliceosomal assemblies, and disease‑associated aggregates (e.g., in neurodegeneration) may all exploit excluded‑volume‑driven clustering to achieve rapid, reversible re‑configuration. The authors suggest that pharmacological modulation of intracellular crowding—through osmolytes, crowding agents, or targeted degradation of specific divergent subunits—could become a novel strategy to influence megacomplex function.
In summary, the paper delivers a quantitative, physics‑based model that bridges high‑throughput proteomics with thermodynamic theory to explain how protein megacomplexes emerge, adapt, and function in a crowded cellular environment. The approach is generalizable, experimentally grounded, and opens new avenues for both basic research on macromolecular organization and applied efforts to manipulate megacomplexes for therapeutic benefit.