m →Gaussian graphical models of protein structures: replaced: neghborhood → neighborhood using AWB |
Amkilpatrick (talk | contribs) Disambiguated: closed form → Closed-form expression; Unlinked: Continuous; Help needed: Potential function |
Line 2:
[[Graphical model]]s have become powerful frameworks for [[protein structure prediction]], [[protein–protein interaction]] and [[Thermodynamic free energy|free energy]] calculations for protein structures. Representing a protein structure as a graphical model makes it possible to address many problems, including secondary structure prediction, protein–protein interaction, protein–drug interaction, and free energy calculation.
There are two main approaches to using graphical models in protein structure modeling. The first approach uses [[Discrete mathematics|discrete]] variables to represent the coordinates or [[dihedral angle]]s of the protein structure. Because these variables are originally continuous, a discretization process is typically applied to convert them into discrete values. The second approach uses
==Discrete graphical models for protein structure==
Line 24:
:<math>p(X_s = x_s|X_b = x_b) = \frac{1}{Z} \prod_{c\in C(G)}\Phi_c (x_s^c,x_b^c)</math>
where ''C''(''G'') is the set of all cliques in ''G'', <math>\Phi</math> is a [[potential function]]{{dn|date=August 2013}} defined over the variables, and ''Z'' is the [[partition function (mathematics)|partition function]].
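The normalized product of clique potentials can be illustrated with a minimal brute-force sketch. It assumes only cliques of size 2 and a tiny state space, so that ''Z'' can be computed by enumeration. The function names, the three-residue chain and the toy potential are hypothetical and are not taken from any published model; any dependence on the fixed backbone <math>x_b</math> is assumed to be folded into the potential.

```python
from itertools import product
from math import exp


def conditional_distribution(states, edges, potential):
    """Return (p(x_s | x_b), Z) for a pairwise MRF by brute-force enumeration.

    states: dict mapping each side-chain variable to its allowed discrete values.
    edges: list of (u, v) pairs, i.e. the size-2 cliques of the graph.
    potential: function (u, x_u, v, x_v) -> positive clique potential.
    """
    names = sorted(states)
    weights = {}
    for values in product(*(states[n] for n in names)):
        assignment = dict(zip(names, values))
        w = 1.0
        for u, v in edges:
            # Multiply the potentials of all cliques for this configuration
            w *= potential(u, assignment[u], v, assignment[v])
        weights[values] = w
    Z = sum(weights.values())  # partition function
    return {cfg: w / Z for cfg, w in weights.items()}, Z


# Hypothetical toy example: three residues with two rotamer states each
states = {"r1": [0, 1], "r2": [0, 1], "r3": [0, 1]}
edges = [("r1", "r2"), ("r2", "r3")]


def toy_potential(u, xu, v, xv):
    # Made-up energy: neighbouring residues prefer the same rotamer state
    energy = 0.0 if xu == xv else 1.0
    return exp(-energy)


probs, Z = conditional_distribution(states, edges, toy_potential)
```

Enumeration grows exponentially with the number of variables, so this only serves to make the formula concrete; realistic protein models require approximate inference.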
To completely characterize the MRF, it is necessary to define the potential function <math>\Phi</math>. For simplicity, the cliques of the graph are usually restricted to cliques of size 2, which means the potential function is defined only over pairs of variables. In [[Goblin System]], these pairwise functions are defined as
Line 57:
To learn the graph structure as a multivariate Gaussian graphical model, either [[L-1 regularization]] or [[neighborhood selection]] algorithms can be used. These algorithms simultaneously learn the graph structure and the edge strengths of the connected nodes. An edge strength corresponds to the potential function defined on the corresponding two-node [[clique]]. A training set of PDB structures is used to learn <math>\mu</math> and <math>\Sigma^{-1}</math>.
Once the model is learned, the same steps as in the discrete case can be repeated to obtain the density function at each node, and the analytical form can be used to calculate the free energy. Here, the [[Partition function (mathematics)|partition function]] already has a [[Closed-form expression|closed form]], so [[inference]] is trivial, at least for Gaussian graphical models. If an analytical form of the partition function is not available, [[particle filtering]] or [[expectation propagation]] can be used to approximate ''Z'', after which inference and the free energy calculation can be performed.
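As an illustration of neighborhood selection, the following sketch regresses each variable on all the others with an L1 penalty and places an edge wherever a coefficient is non-zero. It is a minimal, pure-Python version under stated assumptions: the lasso is solved by plain coordinate descent, edges are combined with an "or" rule, and the data are synthetic numbers rather than coordinates or angles from PDB structures. All function names and the penalty value are hypothetical.

```python
import random


def standardize(col):
    """Centre a column and scale it to unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]


def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(columns, target, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1.

    Assumes every column is standardized, so each update is a soft threshold.
    """
    n = len(target)
    beta = [0.0] * len(columns)
    residual = list(target)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of column j with the partial residual
            rho = sum(c * (r + beta[j] * c) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new
    return beta


def neighborhood_selection(data, lam):
    """Return the set of edges (i, j), i < j, found by per-node lasso regressions."""
    p = len(data[0])
    cols = [standardize([row[j] for row in data]) for j in range(p)]
    edges = set()
    for j in range(p):
        others = [k for k in range(p) if k != j]
        beta = lasso([cols[k] for k in others], cols[j], lam)
        for k, b in zip(others, beta):
            if b != 0.0:  # "or" rule: one non-zero coefficient is enough
                edges.add((min(j, k), max(j, k)))
    return edges


# Synthetic data: variables 0 and 1 are strongly coupled, variable 2 is independent
rng = random.Random(0)
data = []
for _ in range(300):
    a = rng.gauss(0, 1)
    data.append([a, a + 0.3 * rng.gauss(0, 1), rng.gauss(0, 1)])

edges = neighborhood_selection(data, lam=0.2)
```

The penalty `lam` controls sparsity: a sufficiently large value removes every edge, while a value near zero approaches a fully connected graph. In practice it would be chosen by cross-validation or a similar criterion.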
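The closed form of the Gaussian partition function can be made concrete. For the unnormalized density <math>\exp\left(-\tfrac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)</math> over ''n'' variables, <math>Z = (2\pi)^{n/2} \det(\Sigma^{-1})^{-1/2}</math>. The sketch below evaluates <math>\log Z</math> from a precision matrix using a Cholesky factorization. The function names and the example matrix are hypothetical, and identifying the free energy with <math>-\log Z</math> assumes unit temperature and this particular energy function.

```python
from math import log, pi, sqrt


def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def log_partition(precision):
    """log Z for the unnormalized density exp(-0.5 (x-mu)^T P (x-mu))."""
    n = len(precision)
    L = cholesky(precision)
    log_det = 2.0 * sum(log(L[i][i]) for i in range(n))  # log det(P)
    return 0.5 * n * log(2.0 * pi) - 0.5 * log_det


def gaussian_entropy(precision):
    """Differential entropy 0.5 * log det(2*pi*e*Sigma), written via log Z."""
    return log_partition(precision) + 0.5 * len(precision)


# Hypothetical 2x2 precision matrix (Sigma^{-1}) with determinant 3
precision = [[2.0, -1.0], [-1.0, 2.0]]
log_Z = log_partition(precision)
free_energy = -log_Z  # assumption: unit temperature, energy = quadratic form
```

Note that <math>\mu</math> does not appear in the result: shifting the mean moves the density without changing its normalization.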
{{No footnotes|date=August 2010}}