Quantifying the Role of OpenFold Components in Protein Structure Prediction
📝 Abstract
Models such as AlphaFold2 and OpenFold have transformed protein structure prediction, yet their inner workings remain poorly understood. We present a methodology to systematically evaluate the contribution of individual OpenFold components to structure prediction accuracy. We identify several components that are critical for most proteins, while others vary in importance across proteins. We further show that the contribution of several components is correlated with protein length. These findings provide insight into how OpenFold achieves accurate predictions and highlight directions for interpreting protein prediction networks more broadly.
📄 Content
Recent protein structure prediction models such as AlphaFold2 [11] and OpenFold [2] have transformed biology, enabling breakthroughs in protein folding [23], drug discovery [12,13], and protein synthesis [7,23]. Yet, despite their success, it remains unclear how these models achieve such accuracy, i.e., which architectural components are most essential for prediction.

At the core of several protein structure prediction models lies a transformer that iteratively refines two key internal representations: the multiple sequence alignment (MSA) representation and the pairwise residue (Pair) representation. In AlphaFold2 and OpenFold, this main transformer is called the Evoformer, and it comprises components such as attention layers, transition MLPs, and triangular update operations. However, the relative importance of these components for structure prediction is not well understood: previous ablations of AlphaFold2 and OpenFold focused mainly on auxiliary losses, training regimes, or coarse architectural changes (e.g., “no triangles, biasing, or gating”), leaving the role of individual Evoformer components largely unexplored.

In this paper, we address this gap by systematically evaluating component-level contributions across proteins. Our study reveals which components are broadly critical, which are dispensable, and how their importance varies with properties such as protein length.
Our main contributions are as follows:
1. We propose a methodology to systematically examine the contribution of individual OpenFold components to structure prediction accuracy, revealing protein-specific reliance on different architectural components.
2. We identify several components that are critical for accurate predictions across most proteins, including MSA Column Attention, both MLP Transition layers, and the final Pair representation used for structure prediction, providing biological insights into how OpenFold achieves accurate predictions.
3. We analyze how these contributions vary with protein length and find that the importance of several components is statistically significantly correlated with sequence length.
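The component-level ablation idea in the first contribution can be sketched as follows. This is a hedged illustration, not the paper's actual protocol: `toy_predict` and its score values are hypothetical stand-ins for running OpenFold with one component disabled and scoring the predicted structure (e.g., with lDDT).

```python
# Hedged sketch of a component-ablation protocol (illustrative names only).

def contribution(predict_fn, component, protein):
    """Accuracy drop when `component` is disabled; larger = more critical."""
    baseline = predict_fn(protein, disabled=frozenset())
    ablated = predict_fn(protein, disabled=frozenset({component}))
    return baseline - ablated

# Toy stand-in for "predict a structure and score it"; the drop values
# are invented for illustration and are not results from the paper.
def toy_predict(protein, disabled=frozenset()):
    drops = {"msa_col_attn": 0.30, "msa_transition": 0.10, "pair_transition": 0.05}
    return 0.90 - sum(drops.get(c, 0.0) for c in disabled)

score_drop = contribution(toy_predict, "msa_col_attn", "example_protein")
```

Repeating this per component and per protein yields the component-by-protein contribution profiles the paper analyzes.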
Several recent works have studied the inner workings of AlphaFold2. For example, [21] proposes an objective function to solve for a set of residue deletions or substitutions required to change the network’s structure predictions. In [8], the impact of different templates and the recycling mechanism are studied in more detail. In [19], a framework is proposed to study the contribution of different amino acids to final structure prediction using SHAP. Our study differs from existing studies since we seek to characterize the contribution of architectural components to structure prediction, rather than the contribution of individual residues, templates, or recycling iterations. Beyond structure prediction models, there has also been interpretability work on protein language models (see [16] for a survey).
In [6,20], sparse auto-encoders are used to study the ESM-2 [15] representation space. There has also been work to study the intrinsic dimensionality of protein language model representation spaces [5,22,26]. In [24], a methodology is proposed to relate the attention maps of protein language models to protein properties such as structure or function. More broadly, the interpretability of protein models has been proposed as a promising future direction by several researchers [3,4,13,16].
Model Components. OpenFold [2] is an open-source reproduction of AlphaFold2 [11], making it a suitable model for our study. Despite AlphaFold2’s advances in protein structure prediction [18], many architectural contributions remain unclear [13]. Since recent models such as AlphaFold3 [1] and Boltz [17,25] retain the same transformer-based architecture with triangular operations on Pair representations, our findings extend beyond OpenFold. OpenFold operates in three distinct phases to predict structures from amino acid sequences: i) a pre-processing phase that produces a Multiple Sequence Alignment (MSA) representation and a Pair representation; ii) Evoformer processing via 48 blocks to refine these representations; and iii) the Structure Module, which outputs a 3D structure from these representations. The MSA representation is produced by comparing the input sequence to existing sequences from nature, while the Pair representation is produced by comparing residue-residue pair relationships in the sequence.
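The three phases described above can be sketched schematically. Every function body below is an illustrative placeholder (random features, identity Evoformer blocks, zero coordinates) under assumed tensor shapes; it shows the data flow only, not OpenFold's actual computation or API.

```python
import numpy as np

def preprocess(sequence, n_seq=4, c_m=8, c_z=4, seed=0):
    """Phase i: build MSA and Pair representations (random placeholders)."""
    rng = np.random.default_rng(seed)
    n_res = len(sequence)
    msa = rng.standard_normal((n_seq, n_res, c_m))   # MSA representation
    pair = rng.standard_normal((n_res, n_res, c_z))  # Pair representation
    return msa, pair

def evoformer(msa, pair, n_blocks=48):
    """Phase ii: 48 blocks iteratively refine both representations."""
    for _ in range(n_blocks):
        msa = msa + 0.0    # placeholder for the MSA-pathway updates
        pair = pair + 0.0  # placeholder for the Pair-pathway updates
    return msa, pair

def structure_module(msa, pair):
    """Phase iii: emit one 3D coordinate per residue (zeros as placeholder)."""
    n_res = pair.shape[0]
    return np.zeros((n_res, 3))

msa, pair = preprocess("MKTAYIAK")          # 8-residue example sequence
msa, pair = evoformer(msa, pair)
coords = structure_module(msa, pair)
```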
Here, we study the impact of various Evoformer components on structural accuracy (see Fig. 1 for an overview). Specifically, each Evoformer block consists of two pathways that operate on the MSA and Pair representations. Within the MSA pathway, MSA Row Attention integrates information across homologous sequences for each residue position, followed by MSA Column Attention, which correlates residues along each sequence. An MSA Transition (feedforward MLP) further transforms these features. The Outer Product Mean then maps the MSA representation into an update for the Pair representation by averaging residue-pair outer products over sequences.
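As a concrete illustration of how the MSA pathway feeds the Pair representation, here is a minimal NumPy sketch of an Outer-Product-Mean-style update. The shapes, the truncated feature slices standing in for learned input projections, and the `w_out` weight are illustrative assumptions, not OpenFold's real implementation.

```python
import numpy as np

def outer_product_mean(msa, w_out, c=3):
    """Map MSA features (n_seq, n_res, c_m) to a Pair-representation update.

    For every residue pair (i, j), average the outer product of two small
    per-residue feature vectors over all sequences, flatten the result, and
    apply an output projection w_out of shape (c * c, c_z).
    """
    n_seq, n_res, _ = msa.shape
    a = msa[..., :c]  # stand-in for a learned input projection
    b = msa[..., :c]  # stand-in for a second learned input projection
    outer = np.einsum("sic,sjd->ijcd", a, b) / n_seq  # (n_res, n_res, c, c)
    return outer.reshape(n_res, n_res, c * c) @ w_out  # (n_res, n_res, c_z)

rng = np.random.default_rng(0)
msa = rng.standard_normal((2, 5, 8))  # 2 sequences, 5 residues, 8 channels
w_out = rng.standard_normal((9, 4))   # project 3*3 outer features to c_z=4
pair_update = outer_product_mean(msa, w_out)
```

The key design point visible even in this sketch is that the update is quadratic in residue count: every residue pair (i, j) receives information aggregated across the whole alignment.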
This content is AI-processed based on ArXiv data.