{{For|the partition function in number theory|Partition (number theory)}}
The '''partition function''' or '''configuration integral''', as used in [[probability theory]], [[information science]] and [[dynamical systems]], is a generalization of the definition of a [[partition function in statistical mechanics]]. It is a special case of a [[normalizing constant]] in probability theory, for the [[Boltzmann distribution]]. The partition function occurs in many problems of probability theory because, in situations where there is a natural symmetry, its associated [[probability measure]], the [[Gibbs measure]], has the [[Markov property]]. This means that the partition function occurs not only in physical systems with translation symmetry, but also in such varied settings as neural networks (the [[Hopfield network]]) and applications such as [[genomics]], [[corpus linguistics]] and [[artificial intelligence]], which employ [[Markov network]]s and [[Markov logic network]]s. The Gibbs measure is also the unique measure that has the property of maximizing the [[entropy (general concept)|entropy]] for a fixed expectation value of the energy; this underlies the appearance of the partition function in [[maximum entropy method]]s and the algorithms derived therefrom.
 
The partition function ties together many different concepts, and thus offers a general framework in which many different kinds of quantities may be calculated. In particular, it shows how to calculate [[expectation value]]s and [[Green's function]]s, forming a bridge to [[Fredholm theory]]. It also provides a natural setting for the [[information geometry]] approach to [[information theory]], where the [[Fisher information metric]] can be understood to be a [[correlation function]] derived from the partition function; this metric gives the underlying parameter space the structure of a [[Riemannian manifold]].
 
When the setting for random variables is on [[complex projective space]] or [[projective Hilbert space]], geometrized with the [[Fubini–Study metric]], the theory of [[quantum mechanics]] and more generally [[quantum field theory]] results. In these theories, the partition function is heavily exploited in the [[path integral formulation]], with great success, leading to many formulas nearly identical to those reviewed here. However, because the underlying measure space is complex-valued, as opposed to the real-valued [[simplex]] of probability theory, an extra factor of ''i'' appears in many formulas. Tracking this factor is troublesome, and is not done here. This article focuses primarily on classical probability theory, where the probabilities sum to one.
 
==Definition==
Given a set of [[random variable]]s <math>X_i</math> taking on values <math>x_i</math>, and some sort of potential function or Hamiltonian <math>H(x_1,x_2,\dots)</math>, the partition function is defined as

:<math>Z(\beta) = \sum_{x_i} \exp \left(-\beta H(x_1,x_2,\dots) \right)</math>

The function <math>H</math> is understood to be a real-valued function on the space of states, while <math>\beta</math> is a real-valued free parameter (conventionally, the [[inverse temperature]]). The sum over the <math>x_i</math> is understood to run over all values that each of the random variables may take; for continuous-valued random variables, the sum is replaced by an integral.
The role or meaning of the parameter <math>\beta</math> can be understood in a variety of different ways. In classical thermodynamics, it is an [[inverse temperature]]. More generally, one would say that it is the variable that is [[Conjugate variables (thermodynamics)|conjugate]] to some (arbitrary) function <math>H</math> of the random variables <math>X</math>. The word ''conjugate'' here is used in the sense of conjugate [[generalized coordinates]] in [[Lagrangian mechanics]]; thus, properly, <math>\beta</math> is a [[Lagrange multiplier]]. It is sometimes called the [[generalized force]]. All of these concepts have in common the idea that one value is meant to be kept fixed, as others, interconnected in some complicated way, are allowed to vary. In the current case, the value to be kept fixed is the [[expectation value]] of <math>H</math>, even as many different [[probability distribution]]s can give rise to exactly this same (fixed) value.
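
To make the Lagrange-multiplier role explicit, here is a brief sketch of the maximum entropy argument for a single constrained observable, with <math>\lambda</math> denoting the multiplier that enforces normalization. Maximizing the entropy <math>-\sum_x P(x)\log P(x)</math> subject to <math>\sum_x P(x)=1</math> and a fixed value of <math>\sum_x P(x)H(x)</math> requires

:<math>\frac{\partial}{\partial P(x)}\left[-\sum_{x'} P(x')\log P(x') - \lambda\left(\sum_{x'} P(x') - 1\right) - \beta\left(\sum_{x'} P(x')H(x') - \langle H\rangle\right)\right] = 0</math>

so that

:<math>P(x) = e^{-1-\lambda}\,e^{-\beta H(x)} = \frac{e^{-\beta H(x)}}{Z(\beta)},</math>

with the normalization constraint fixing <math>e^{1+\lambda} = Z(\beta)</math>.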
 
For the general case, one considers a set of functions <math>\{H_k(x_1,\cdots)\}</math> that each depend on the random variables <math>X_i</math>. These functions are chosen because one wants to hold their expectation values constant, for one reason or another. To constrain the expectation values in this way, one applies the method of [[Lagrange multiplier]]s. In the general case, [[maximum entropy method]]s illustrate the manner in which this is done.
 
Some specific examples are in order. In basic thermodynamics problems, when using the [[canonical ensemble]], the use of just one parameter <math>\beta</math> reflects the fact that there is only one expectation value that must be held constant: the average [[energy]] (due to [[conservation of energy]]). For chemistry problems involving chemical reactions, the [[grand canonical ensemble]] provides the appropriate foundation, and there are two Lagrange multipliers. One holds the energy constant, and the other, the [[fugacity]], holds the particle count constant (as chemical reactions involve the recombination of a fixed number of atoms).
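
For instance, in the grand canonical case just described, the partition function takes the schematic form

:<math>Z(\beta,\mu) = \sum_{x} \exp\left(-\beta\left(H(x) - \mu N(x)\right)\right)</math>

where <math>N(x)</math> is the particle count of the configuration <math>x</math> and <math>\mu</math> is the [[chemical potential]]; the second multiplier enters through the fugacity <math>e^{\beta\mu}</math>.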
 
For the general case, one has
:<math>Z(\beta) = \sum_{x_i} \exp \left(-\sum_k\beta_k H_k(x_i) \right)</math>
 
with <math>\beta=(\beta_1, \beta_2,\cdots)</math> a point in a space of parameters, one multiplier for each constrained observable.
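
As a purely illustrative numerical sketch, the following Python snippet evaluates <math>Z(\beta)</math> by direct summation for a toy system; the four states, the two observables and the values of <math>\beta_k</math> are all invented for the example.

<syntaxhighlight lang="python">
import math

# Toy system: four discrete states and two observables H_1, H_2.
# The states and observable values are arbitrary, chosen only for illustration.
states = [0, 1, 2, 3]
observables = [
    lambda x: float(x),      # H_1(x): plays the role of an "energy"
    lambda x: float(x % 2),  # H_2(x): a second constrained quantity
]

def partition_function(betas, states, observables):
    """Z(beta) = sum over states x of exp(-sum_k beta_k * H_k(x))."""
    return sum(
        math.exp(-sum(b * h(x) for b, h in zip(betas, observables)))
        for x in states
    )

print(partition_function([1.0, 0.5], states, observables))
</syntaxhighlight>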
 
For a collection of observables <math>H_k</math>, one would write
:<math>Z(\beta) = \operatorname{tr}\left[\exp\left(-\sum_k\beta_k H_k\right)\right]</math>

with the argument of the trace presumed to be [[trace class]].

==The potential function==
The potential function itself commonly takes the form of a sum
:<math>H(x_1,x_2,\dots) = \sum_s V(s)\,</math>
 
where the sum over ''s'' is a sum over some subset of the [[power set]] ''P''(''X'') of the set <math>X=\lbrace x_1,x_2,\dots \rbrace</math>. For example, in [[statistical mechanics]], such as the [[Ising model]], the sum is over pairs of nearest neighbors. In probability theory, such as [[Markov networks]], the sum might be over the [[clique (graph theory)|cliques]] of a graph; so, for the Ising model and other [[lattice model (physics)|lattice models]], the maximal cliques are edges.
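
As a minimal illustration of such a nearest-neighbour sum, the following Python sketch enumerates every spin configuration of a short one-dimensional Ising chain with free boundaries and accumulates the Boltzmann weight of each; the chain length, coupling <code>J</code> and <code>beta</code> are arbitrary choices made for the example.

<syntaxhighlight lang="python">
import itertools
import math

def ising_partition_function(n, beta, J=1.0):
    """Brute-force Z for a 1D Ising chain of n spins with free boundaries,
    where H(s) = -J * sum of s_i * s_(i+1) over nearest-neighbour pairs."""
    Z = 0.0
    for spins in itertools.product((-1, +1), repeat=n):
        H = -J * sum(spins[i] * spins[i + 1] for i in range(n - 1))
        Z += math.exp(-beta * H)
    return Z

print(ising_partition_function(n=6, beta=0.5))
# Agrees with the closed form 2 * (2*cosh(beta*J))**(n-1) for this chain:
print(2 * (2 * math.cosh(0.5)) ** 5)
</syntaxhighlight>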
 
The fact that the potential function can be written as a sum usually reflects the fact that it is invariant under the [[group action|action]] of a [[group (mathematics)|group symmetry]], such as [[translational invariance]]. Such symmetries can be discrete or continuous; they materialize in the [[correlation function]]s for the random variables (discussed below). Thus a symmetry in the Hamiltonian becomes a symmetry of the correlation function (and vice versa).

The quantity
:<math>\exp \left(-\beta H(x_1,x_2,\dots) \right)</math>
 
can be interpreted as the relative likelihood that a specific [[configuration space|configuration]] of values <math>(x_1,x_2,\dots)</math> occurs in the system. Thus, given a specific configuration <math>(x_1,x_2,\dots)</math>,
 
:<math>P(x_1,x_2,\dots) = \frac{1}{Z(\beta)} \exp \left(-\beta H(x_1,x_2,\dots) \right)</math>
is the probability that the configuration <math>(x_1,x_2,\dots)</math> occurs in the system. The partition function thus plays the role of a [[normalizing constant]], and the resulting probability measure is the [[Gibbs measure]] discussed above.
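
That the probabilities indeed total to one is a one-line check, since the partition function is, by construction, the sum of the Boltzmann weights:

:<math>\sum_{x_1,x_2,\dots} P(x_1,x_2,\dots) = \frac{1}{Z(\beta)} \sum_{x_1,x_2,\dots} \exp\left(-\beta H(x_1,x_2,\dots)\right) = \frac{Z(\beta)}{Z(\beta)} = 1.</math>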
 
==Expectation values==
The partition function is commonly used as a [[generating function]] for [[expectation value]]s of various functions of the random variables. So, for example, taking <math>\beta</math> as an adjustable parameter, the derivative of <math>\log Z(\beta)</math> with respect to <math>\beta</math>
 
:<math>\mathbf{E}[H] = \langle H \rangle = -\frac{\partial \log Z(\beta)}{\partial \beta}</math>
 
gives the average (expectation value) of ''H''. In physics, this would be called the average [[energy]] of the system.
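
This identity follows by differentiating under the summation sign (assuming the sum converges well enough to justify the exchange):

:<math>-\frac{\partial \log Z(\beta)}{\partial \beta} = -\frac{1}{Z(\beta)}\frac{\partial Z(\beta)}{\partial \beta} = \frac{1}{Z(\beta)} \sum_{x_i} H(x_1,x_2,\dots) \exp \left(-\beta H(x_1,x_2,\dots) \right) = \langle H \rangle.</math>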
 
Given the definition of the probability measure above, the expectation value of any function ''f'' of the random variables ''X'' may now be written as expected: so, for discrete-valued ''X'', one writes
:<math>\mathbf{E}[f] = \langle f \rangle = \sum_x f(x)\, P(x) = \frac{1}{Z(\beta)} \sum_x f(x) \exp\left(-\beta H(x)\right)</math>
where <math>P(x)</math> has been written for <math>P(x_1,x_2,\dots)</math> and the summation is understood to run over all values of all the random variables <math>X_k</math>. For continuous-valued random variables, the summations are replaced by integrals.
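
Since both sides of the identity <math>\langle H \rangle = -\partial \log Z(\beta) / \partial\beta</math> are elementary to compute for a small discrete system, it can be checked numerically; in the Python sketch below, the list of energies and the value of <code>beta</code> are invented solely for the check.

<syntaxhighlight lang="python">
import math

# Arbitrary toy energies, invented only to illustrate the identity.
energies = [0.0, 1.0, 1.0, 2.5]
beta = 0.7

def log_Z(b):
    return math.log(sum(math.exp(-b * E) for E in energies))

# Direct expectation value: <H> = sum over states of H(x) * P(x)
Z = math.exp(log_Z(beta))
avg_H = sum(E * math.exp(-beta * E) / Z for E in energies)

# Central finite-difference estimate of -d(log Z)/d(beta)
eps = 1e-6
deriv = -(log_Z(beta + eps) - log_Z(beta - eps)) / (2 * eps)

print(avg_H, deriv)  # the two numbers agree to high precision
</syntaxhighlight>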
 
Curiously, the [[Fisher information metric]] can also be understood as the flat-space [[Euclidean metric]], after an appropriate change of variables, as described in the main article on it. When the <math>\beta</math> are complex-valued, the resulting metric is the [[Fubini–Study metric]]. When written in terms of [[mixed state (physics)|mixed states]], instead of [[pure state]]s, it is known as the [[Bures metric]].
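
Concretely, for distributions of the exponential-family form used above, a standard identity expresses the Fisher information metric in the coordinates <math>\beta_k</math> as the Hessian of the log-partition function, that is, as a covariance of the observables:

:<math>g_{ij}(\beta) = \frac{\partial^2 \log Z(\beta)}{\partial \beta_i \, \partial \beta_j} = \langle H_i H_j \rangle - \langle H_i \rangle\langle H_j \rangle.</math>

Being a covariance matrix, it is automatically positive semi-definite, which is what allows it to serve as a Riemannian metric.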
 
==Correlation functions==

==See also==
* [[Exponential family]]
* [[Partition function (statistical mechanics)]]
 