On the optimal contact potential of proteins

📝 Original Info

  • Title: On the optimal contact potential of proteins
  • ArXiv ID: 0709.0346
  • Date: 2008-01-03
  • Authors: Not available (the source does not provide author information)

📝 Abstract

We analytically derive the lower bound of the total conformational energy of a protein structure by assuming that the total conformational energy is well approximated by the sum of sequence-dependent pairwise contact energies. The condition for the native structure achieving the lower bound leads to the contact energy matrix that is a scalar multiple of the native contact matrix, i.e., the so-called Go potential. We also derive spectral relations between contact matrix and energy matrix, and approximations related to one-dimensional protein structures. Implications for protein structure prediction are discussed.

📄 Full Content

Proteins' biological functions are made possible by their precise three-dimensional (3D) structures, and each 3D structure is determined by its amino acid sequence through the laws of thermodynamics [1]. Predicting protein structures from amino acid sequences is therefore important not only for inferring biological function, but also for understanding how 3D structures are encoded in one-dimensional information such as an amino acid sequence. Protein structure prediction is naturally cast as an optimization problem in which a potential function is minimized. Given an appropriate potential function, conformational optimization should yield the native structure as the unique global minimum of the potential function. The problem has thus traditionally been divided into two sub-problems: one is to establish an appropriate potential function [2], and the other is to develop methods that efficiently search the vast conformational space of a protein [3].

Among the various forms of effective energy functions, statistical contact potentials [4,5] have been widely used. In this Letter, we treat exclusively this class of contact potentials, neglecting other contributions such as electrostatics and local interactions. Accordingly, a protein conformation is represented as a contact matrix whose (i, j) element is 1 if residues i and j are in contact in space and 0 otherwise. Although the contact matrix is a coarse-grained representation of protein conformation, it is known to contain sufficient information to recover the three-dimensional (native) structure of a protein [6]. Note that, for lattice models of proteins [7], these representations of the conformation and the energy function are exact.
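The contact-matrix representation described above can be sketched in a few lines. This is only an illustration: the function name `contact_matrix`, the distance `cutoff`, and the `min_separation` rule that skips residues adjacent along the chain are assumptions made here, not definitions taken from the text.

```python
import math

def contact_matrix(coords, cutoff, min_separation=2):
    """Build a symmetric 0/1 contact matrix from residue coordinates.

    coords: list of (x, y, z) positions, one per residue.
    cutoff and min_separation are illustrative assumptions; the text
    does not specify how "in contact" is decided.
    """
    n = len(coords)
    delta = [[0] * n for _ in range(n)]
    for i in range(n):
        # Skip pairs closer than min_separation along the chain.
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                delta[i][j] = delta[j][i] = 1
    return delta

# Toy four-residue chain folded around a unit square (lattice-like).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
delta = contact_matrix(coords, cutoff=1.0)

for row in delta:
    print(row)
```

In this toy chain only residues 0 and 3 are non-bonded nearest neighbours, so the matrix has a single contact, stored symmetrically.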

Our fundamental assumption is that the conformational energy of a protein can be expressed in terms of a contact matrix. Specifically, we assume that the total energy of a protein is well approximated by the sum of pairwise contact energies between amino acid residues, and that each pairwise contact energy can be decomposed into a sequence-dependent term and a conformation-dependent term. The sequence-dependent term is expressed as a matrix $E(S) = (E_{ij})$, which we call the contact energy matrix, or E-matrix for short. Each element $E_{ij}$ of the E-matrix represents the energy between residues $i$ and $j$ when they are in contact. This form of the E-matrix is very general: each element $E_{ij}$ may depend on the entire sequence $S$, or it may depend only on the types of the interacting amino acid residues $i$ and $j$, as in conventional contact potentials. The conformation-dependent term is expressed as another matrix $\Delta(C) = (\Delta_{ij})$, which we call the contact matrix, or C-matrix. Each element $\Delta_{ij}$ of the C-matrix is 1 if residues $i$ and $j$ are in contact and 0 otherwise. Hence the total energy $E(C, S)$ of a protein with sequence $S$ of $N$ residues and conformation $C$ is given by

$$E(C, S) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} E_{ij}(S)\,\Delta_{ij}(C) = \frac{1}{2}\,[E(S), \Delta(C)] \tag{1}$$

where $[\cdot, \cdot]$ denotes the Frobenius inner product between two matrices [8,9]. Based on this assumption, we derive the lower bound for the conformational energy and the conditions for the native structure and the E-matrix to achieve the bound. The Frobenius inner product leads to the matrix $l_2$ norm defined, for a matrix $M$, as

$$\|M\| \equiv [M, M]^{1/2} = \Big( \sum_{i,j} M_{ij}^2 \Big)^{1/2}. \tag{2}$$

In the case of the C-matrix, since $\Delta_{ij} = 0$ or 1, we have

$$\|\Delta(C)\| = \Big( \sum_{i,j} \Delta_{ij} \Big)^{1/2} = \sqrt{2 N_c} \tag{3}$$

where $N_c \equiv (1/2) \sum_{i,j} \Delta_{ij}$ is the total number of contacts. As for any inner product, the Frobenius inner product satisfies the Cauchy-Schwarz inequality, which for the total energy gives

$$E(C, S) = \frac{1}{2}\,[E(S), \Delta(C)] \geq -\frac{1}{2}\,\|E(S)\|\,\|\Delta(C)\| \tag{4}$$

where the equality holds if and only if

$$E(S) = \varepsilon\,\Delta(C) \tag{5}$$

for some scalar $\varepsilon < 0$. Although the inequality (Eq. 4) holds for any pair of matrices, we now regard it as a lower bound on the conformational energy for a given E-matrix. For simplicity, we first consider the energy minimization problem for conformations with $\|\Delta(C)\|$ fixed to the value of the native conformation. It is desirable for the native conformation to attain the lower bound and hence to satisfy the condition Eq. (5). If the native conformation indeed satisfies Eq. (5), then each element of the E-matrix is either 0 or $\varepsilon$, so that only the contacts present in the native conformation are stabilizing. Thus, the native conformation satisfying Eq. (5) is actually a global minimum energy conformation (GMEC) among all conformations with arbitrary values of $\|\Delta(C)\|$. An E-matrix that satisfies Eq. (5) for the native C-matrix is a kind of so-called Gō potential [10,11], which has been essential for studying the protein folding problem. At this point, it is still possible that the native structure is not the unique GMEC. For example, if a conformation contains all the native contacts together with some other contacts, it has the same energy as the native conformation. In order for the native conformation to be the unique GMEC, it is required that the total number of contacts of the native conformation is larger than that of any other
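The bound and its equality condition can be checked numerically on a toy example. The sketch below assumes the total energy is half the Frobenius inner product of the E-matrix and the C-matrix (so each contact is counted once, consistent with the definition of the number of contacts), and that the lower bound is minus half the product of the two norms. The 4-residue matrices and the value of `eps` are hypothetical and chosen only for illustration.

```python
import math

def frobenius(a, b):
    """Frobenius inner product [A, B] = sum over i, j of A_ij * B_ij."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def norm(m):
    """Matrix l2 norm induced by the Frobenius inner product."""
    return math.sqrt(frobenius(m, m))

def energy(e, delta):
    """Total contact energy, assumed to be (1/2) [E, Delta]."""
    return 0.5 * frobenius(e, delta)

def lower_bound(e, delta):
    """Cauchy-Schwarz lower bound, assumed to be -(1/2) ||E|| ||Delta||."""
    return -0.5 * norm(e) * norm(delta)

# Hypothetical native C-matrix with contacts (0,2) and (0,3).
native = [[0, 0, 1, 1],
          [0, 0, 0, 0],
          [1, 0, 0, 0],
          [1, 0, 0, 0]]

# Go-type E-matrix: a negative scalar multiple of the native C-matrix.
eps = -1.0
go = [[eps * d for d in row] for row in native]

# Decoy with the same number of contacts: (0,3) and (1,3).
decoy = [[0, 0, 0, 1],
         [0, 0, 0, 1],
         [0, 0, 0, 0],
         [1, 1, 0, 0]]

# Conformation with all native contacts plus the extra contact (1,3).
superset = [[0, 0, 1, 1],
            [0, 0, 0, 1],
            [1, 0, 0, 0],
            [1, 1, 0, 0]]

n_c = frobenius(native, native) / 2  # number of native contacts

for name, conf in [("native", native), ("decoy", decoy), ("superset", superset)]:
    print(name, energy(go, conf), lower_bound(go, conf))
```

In this example the native conformation attains its bound, with energy equal to `eps * n_c`; the decoy lies strictly above its bound; and the superset conformation has the same energy as the native one, which is the degeneracy discussed in the text.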

This content is AI-processed based on open access ArXiv data.
