{{Statistical mechanics|cTopic=Models}}
 
The '''Ising model''' (or '''Lenz–Ising model'''), named after the physicists [[Ernst Ising]] and [[Wilhelm Lenz]], is a [[mathematical models in physics|mathematical model]] of [[ferromagnetism]] in [[statistical mechanics]]. The model consists of [[discrete variables]] that represent [[Nuclear magnetic moment|magnetic dipole moments of atomic "spins"]] that can be in one of two states (+1 or −1). The spins are arranged in a [[Graph (abstract data type)|graph]], usually a [[lattice (group)|lattice]] (where the local structure repeats periodically in all directions), allowing each spin to interact with its neighbors. Neighboring spins that agree have a lower energy than those that disagree; the system tends to the lowest energy but heat disturbs this tendency, thus creating the possibility of different structural phases. The model allows the identification of [[phase transition]]s as a simplified model of reality. The two-dimensional [[square-lattice Ising model]] is one of the simplest statistical models to show a [[phase transition]].<ref>See {{harvtxt|Gallavotti|1999}}, Chapters VI-VII.</ref> Though it is a highly simplified model of a magnetic material, the Ising model can still provide qualitative and sometimes quantitative results applicable to real physical systems.
 
The Ising model was invented by the physicist {{harvs|txt|authorlink=Wilhelm Lenz|first=Wilhelm|last=Lenz|year=1920}}, who gave it as a problem to his student Ernst Ising. The one-dimensional Ising model was solved by {{harvtxt|Ising|1925}} alone in his 1924 thesis;<ref>[http://www.hs-augsburg.de/~harsch/anglica/Chronology/20thC/Ising/isi_fm00.html Ernst Ising, ''Contribution to the Theory of Ferromagnetism'']</ref> it has no phase transition. The two-dimensional square-lattice Ising model is much harder and was only given an analytic description much later, by {{harvs|txt|authorlink=Lars Onsager|first=Lars |last=Onsager|year=1944}}. It is usually solved by a [[Transfer-matrix method (statistical mechanics)|transfer-matrix method]], although there exists a very simple approach relating the model to a non-interacting fermionic [[quantum field theory]].<ref>{{Cite journal |last1=Samuel |first1=Stuart|date=1980 |title=The use of anticommuting variable integrals in statistical mechanics. I. The computation of partition functions|url=https://doi.org/10.1063/1.524404 |journal=Journal of Mathematical Physics |language=en |volume=21|issue=12 |pages= 2806–2814 |doi=10.1063/1.524404|url-access=subscription }}</ref>
 
In dimensions greater than four, the phase transition of the Ising model is described by [[mean-field theory]]. The Ising model for greater dimensions was also explored with respect to various tree topologies in the late 1970s, culminating in an exact solution of the zero-field, time-independent {{harvtxt|Barth|1981}} model for closed Cayley trees of arbitrary branching ratio, and thereby, arbitrarily large dimensionality within tree branches. The solution to this model exhibited a new, unusual phase transition behavior, along with non-vanishing long-range and nearest-neighbor spin-spin correlations, deemed relevant to large neural networks as one of its possible {{pslink|Ising model|applications|nopage=y}}.
<math display="block">H(\sigma) = -\sum_{\langle ij\rangle} J_{ij} \sigma_i \sigma_j - \mu \sum_j h_j \sigma_j,</math>
 
where the first sum is over pairs of adjacent spins (every pair is counted once). The notation <math>\langle ij\rangle</math> indicates that sites <math>i</math> and <math>j</math> are nearest neighbors. The [[magnetic moment]] is given by <math>\mu</math>. Note that the sign in the second term of the Hamiltonian above should actually be positive because the electron's magnetic moment is antiparallel to its spin, but the negative term is used conventionally.<ref>See {{harvtxt|Baierlein|1999}}, Chapter 16.</ref> The Ising Hamiltonian is an example of a [[pseudo-Boolean function]]; tools from the [[analysis of Boolean functions]] can be applied to describe and study it.

The ''configuration probability'' is given by the [[Boltzmann distribution]] with [[inverse temperature]] <math>\beta\geq0</math>:
 
<math display="block">P_\beta(\sigma) = \frac{e^{-\beta H(\sigma)}}{Z_\beta},</math>
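As a concrete illustration (using a hypothetical uniform coupling ''J'', moment ''μ'' and field ''h'', which the general Hamiltonian above does not assume), the energy and Boltzmann weights of a small periodic chain can be enumerated directly:

```python
import itertools
import math

# Energy of a spin configuration on a small periodic 1D chain:
# H(sigma) = -J * sum_<ij> sigma_i sigma_j - mu * h * sum_j sigma_j
# (uniform J and h are illustrative simplifications, not from the text).
def energy(spins, J=1.0, mu=1.0, h=0.0):
    N = len(spins)
    pair_term = sum(spins[i] * spins[(i + 1) % N] for i in range(N))
    field_term = sum(spins)
    return -J * pair_term - mu * h * field_term

# Boltzmann probabilities over all 2^N configurations at inverse temperature beta.
def boltzmann_distribution(N=4, beta=1.0):
    configs = list(itertools.product([-1, +1], repeat=N))
    weights = [math.exp(-beta * energy(s)) for s in configs]
    Z = sum(weights)  # the partition function Z_beta
    return {s: w / Z for s, w in zip(configs, weights)}

dist = boltzmann_distribution()
# The probabilities sum to one, and the aligned configurations,
# which minimize the energy, are the most likely ones.
print(sum(dist.values()))
```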
For the Ising model without an external field on a graph G, the Hamiltonian becomes the following sum over the graph edges E(G)
 
:<math>H(\sigma) = -\sum_{ij\in E(G)} J_{ij}\sigma_i\sigma_j</math>.
 
Here each vertex i of the graph is a spin site that takes a spin value <math>\sigma_i = \pm 1 </math>. A given spin configuration <math>\sigma</math> partitions the set of vertices <math>V(G)</math> into two <math>\sigma</math>-dependent subsets, those with spin up <math>V^+</math> and those with spin down <math>V^-</math>. We denote by <math>\delta(V^+)</math> the <math>\sigma</math>-dependent set of edges that connects the two complementary vertex subsets <math>V^+</math> and <math>V^-</math>. The ''size'' <math>\left|\delta(V^+)\right|</math> of the cut <math>\delta(V^+)</math>, which [[bipartite graph|bipartitions]] the weighted undirected graph G, can be defined as
 
==== Simon–Lieb inequality ====
The Simon–Lieb inequality<ref>{{Cite journal |last=Simon |first=Barry |date=1980-10-01 |title=Correlation inequalities and the decay of correlations in ferromagnets |url=https://doi.org/10.1007/BF01982711 |journal=Communications in Mathematical Physics |language=en |volume=77 |issue=2 |pages=111–126 |doi=10.1007/BF01982711 |bibcode=1980CMaPh..77..111S |s2cid=17543488 |issn=1432-0916|url-access=subscription }}</ref> states that for any set <math>S</math> disconnecting <math>x</math> from <math>y</math> (e.g. the boundary of a box with <math>x</math> being inside the box and <math>y</math> being outside),
 
<math display="block">\langle \sigma_x \sigma_y \rangle \leq \sum_{z\in S} \langle \sigma_x \sigma_z \rangle \langle \sigma_z \sigma_y \rangle.</math>
 
==Historical significance==
One of [[Democritus]]' arguments in support of [[atomism]] was that atoms naturally explain the sharp phase boundaries observed in materials{{citation needed|date=July 2014}}, as when ice melts to water or water turns to steam. His idea was that small changes in atomic-scale properties would lead to big changes in the aggregate behavior. Others believed that matter is inherently continuous, not atomic, and that the large-scale properties of matter are not reducible to basic atomic properties.
 
While the laws of chemical bonding made it clear to nineteenth century chemists that atoms were real, among physicists the debate continued well into the early twentieth century. Atomists, notably [[James Clerk Maxwell]] and [[Ludwig Boltzmann]], applied Hamilton's formulation of Newton's laws to large systems, and found that the [[statistical mechanics|statistical behavior]] of the atoms correctly describes room temperature gases. But classical statistical mechanics did not account for all of the properties of liquids and solids, nor of gases at low temperature.
 
Once modern [[quantum mechanics]] was formulated, atomism was no longer in conflict with experiment, but this did not lead to a universal acceptance of statistical mechanics, which went beyond atomism. [[Josiah Willard Gibbs]] had given a complete formalism to reproduce the laws of thermodynamics from the laws of mechanics. But many faulty arguments survived from the 19th century, when statistical mechanics was considered dubious. The lapses in intuition mostly stemmed from the fact that the limit of an infinite statistical system has many [[Zero–one law|zero-one laws]] which are absent in finite systems: an infinitesimal change in a parameter can lead to big differences in the overall, aggregate behavior, as Democritus expected.
 
===No phase transitions in finite volume===
<math display="block">M = \frac{1}{N} \sum_{i=1}^N \sigma_i.</math>
 
A bogus argument analogous to the argument in the last section now establishes that the ''average'' magnetization in the Ising model is always zero.
# Every configuration of spins has equal energy to the configuration with all spins flipped.
# So for every configuration with magnetization ''M'' there is a configuration with magnetization −''M'' with equal probability.
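The symmetry behind this argument can be checked exhaustively on a small system: pairing each configuration with its globally flipped partner makes the Boltzmann-weighted magnetization cancel exactly (a toy enumeration with illustrative parameter values):

```python
import itertools
import math

# Exhaustive check of the spin-flip symmetry on a small periodic chain:
# at zero field, every configuration and its global flip have the same
# energy, so the Boltzmann-weighted average magnetization is exactly zero.
# (Toy parameters N = 6, beta = 0.7, J = 1 are chosen only for illustration.)
def average_magnetization(N=6, beta=0.7, J=1.0):
    total_weight = 0.0
    total_m = 0.0
    for spins in itertools.product([-1, +1], repeat=N):
        E = -J * sum(spins[i] * spins[(i + 1) % N] for i in range(N))
        w = math.exp(-beta * E)
        total_weight += w
        total_m += w * sum(spins) / N
    return total_m / total_weight

print(average_magnetization())  # zero, up to floating-point rounding
```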
=== Artificial neural network ===
{{Main|Hopfield network}}
The Ising model was instrumental in the development of the [[Hopfield network]]. The original Ising model is a model for equilibrium. In 1963, [[Roy J. Glauber]] studied the Ising model evolving in time, as a process towards thermal equilibrium ([[Glauber dynamics]]), adding in the component of time.<ref name=":222">{{cite journal |last1=Glauber |first1=Roy J. |date=February 1963 |title=Time-Dependent Statistics of the Ising Model |url=https://aip.scitation.org/doi/abs/10.1063/1.1703954 |journal=Journal of Mathematical Physics |volume=4 |issue=2 |pages=294–307 |doi=10.1063/1.1703954 |access-date=2021-03-21|url-access=subscription }}</ref> Kaoru Nakano (1971)<ref name="Nakano1971">{{cite book |last1=Nakano |first1=Kaoru |title=Pattern Recognition and Machine Learning |date=1971 |isbn=978-1-4615-7568-9 |pages=172–186 |chapter=Learning Process in a Model of Associative Memory |doi=10.1007/978-1-4615-7566-5_15}}</ref><ref name="Nakano1972">{{cite journal |last1=Nakano |first1=Kaoru |date=1972 |title=Associatron-A Model of Associative Memory |journal=IEEE Transactions on Systems, Man, and Cybernetics |volume=SMC-2 |issue=3 |pages=380–388 |doi=10.1109/TSMC.1972.4309133}}</ref> and [[Shun'ichi Amari]] (1972)<ref name="Amari19722">{{cite journal |last1=Amari |first1=Shun-Ichi |date=1972 |title=Learning patterns and pattern sequences by self-organizing nets of threshold elements |journal=IEEE Transactions |volume=C |issue=21 |pages=1197–1206}}</ref> proposed modifying the weights of an Ising model by the [[Hebbian theory|Hebbian learning]] rule as a model of associative memory. The same idea was published by {{ill|William A. Little (physicist)|lt=William A. Little|de|William A. Little}} (1974),<ref name="little74">{{cite journal |last=Little |first=W. A. |year=1974 |title=The Existence of Persistent States in the Brain |journal=Mathematical Biosciences |volume=19 |issue=1–2 |pages=101–120 |doi=10.1016/0025-5564(74)90031-5}}</ref> who was cited by Hopfield in his 1982 paper.
 
The [[Spin glass#Sherrington–Kirkpatrick model|Sherrington–Kirkpatrick model]] of spin glass, published in 1975,<ref>{{Cite journal |last1=Sherrington |first1=David |last2=Kirkpatrick |first2=Scott |date=1975-12-29 |title=Solvable Model of a Spin-Glass |url=https://link.aps.org/doi/10.1103/PhysRevLett.35.1792 |journal=Physical Review Letters |volume=35 |issue=26 |pages=1792–1796 |bibcode=1975PhRvL..35.1792S |doi=10.1103/PhysRevLett.35.1792 |issn=0031-9007|url-access=subscription }}</ref> is the Hopfield network with random initialization. Sherrington and Kirkpatrick found that it is highly likely for the energy function of the SK model to have many local minima. In the 1982 paper, Hopfield applied this recently developed theory to study the Hopfield network with binary activation functions.<ref name="Hopfield1982">{{cite journal |last1=Hopfield |first1=J. J. |date=1982 |title=Neural networks and physical systems with emergent collective computational abilities |journal=Proceedings of the National Academy of Sciences |volume=79 |issue=8 |pages=2554–2558 |bibcode=1982PNAS...79.2554H |doi=10.1073/pnas.79.8.2554 |pmc=346238 |pmid=6953413 |doi-access=free}}</ref> In a 1984 paper he extended this to continuous activation functions.<ref name=":03">{{cite journal |last1=Hopfield |first1=J. J. |date=1984 |title=Neurons with graded response have collective computational properties like those of two-state neurons |journal=Proceedings of the National Academy of Sciences |volume=81 |issue=10 |pages=3088–3092 |bibcode=1984PNAS...81.3088H |doi=10.1073/pnas.81.10.3088 |pmc=345226 |pmid=6587342 |doi-access=free}}</ref> It became a standard model for the study of neural networks through statistical mechanics.<ref>{{Cite book |last1=Engel |first1=A. |title=Statistical mechanics of learning |last2=Broeck |first2=C. van den |date=2001 |publisher=Cambridge University Press |isbn=978-0-521-77307-2 |___location=Cambridge, UK; New York, NY}}</ref><ref>{{Cite journal |last1=Seung |first1=H. 
S. |last2=Sompolinsky |first2=H. |last3=Tishby |first3=N. |date=1992-04-01 |title=Statistical mechanics of learning from examples |url=https://journals.aps.org/pra/abstract/10.1103/PhysRevA.45.6056 |journal=Physical Review A |volume=45 |issue=8 |pages=6056–6091 |bibcode=1992PhRvA..45.6056S |doi=10.1103/PhysRevA.45.6056 |pmid=9907706|url-access=subscription }}</ref>
 
===Sea ice===
[[File:Cayley Tree Branch with Branching Ratio = 2.jpg|thumb|An Open Cayley Tree or Branch with Branching Ratio = 2 and k Generations]]
 
In order to investigate an Ising model with potential relevance for large (e.g. with <math>10^4</math> or <math>10^5</math> interactions per node) neural nets, at the suggestion of Krizan in 1979, {{harvtxt|Barth|1981}} obtained the exact analytical expression for the free energy of the Ising model on the closed [[Cayley tree]] (with an arbitrarily large branching ratio) for a zero external magnetic field (in the thermodynamic limit) by applying the methodologies of {{harvtxt|Glasser|1970}} and {{harvtxt|Jellito|1979}}:
 
<math display="block">-\beta f = \ln 2 + \frac{2\gamma}{(\gamma+1)} \ln (\cosh J) + \frac{\gamma(\gamma-1)}{(\gamma+1)} \sum_{i=2}^z\frac{1}{\gamma^i}\ln J_i (\tau) </math>
The [[Metropolis–Hastings algorithm]] is the most commonly used Monte Carlo algorithm for estimating Ising model quantities.<ref name="Newman" /> The algorithm first chooses ''selection probabilities'' ''g''(μ, ν), which represent the probability that state ν is selected by the algorithm out of all states, given that one is in state μ. It then uses acceptance probabilities ''A''(μ, ν) so that [[detailed balance]] is satisfied. If the new state ν is accepted, then we move to that state and repeat with selecting a new state and deciding to accept it. If ν is not accepted then we stay in μ. This process is repeated until some stopping criterion is met, which for the Ising model is often when the lattice becomes [[ferromagnetic]], meaning all of the sites point in the same direction.<ref name="Newman" />
 
When implementing the algorithm, one must ensure that ''g''(μ, ν) is selected such that [[ergodicity]] is met. In [[thermal equilibrium]] a system's energy only fluctuates within a small range.<ref name="Newman" /> This is the motivation behind the concept of '''single-spin-flip dynamics''',<ref name="pre0">{{cite journal|url= http://journals.aps.org/pre/abstract/10.1103/PhysRevE.90.032141|title= Effective ergodicity in single-spin-flip dynamics|journal= Physical Review E|date= 29 September 2014|volume= 90|issue= 3|page= 032141|doi= 10.1103/PhysRevE.90.032141|language=en-US|access-date=2022-08-09|last1= Süzen|first1= Mehmet|pmid= 25314429|arxiv= 1405.4497|bibcode= 2014PhRvE..90c2141S|s2cid= 118355454}}</ref> which states that in each transition, we will only change one of the spin sites on the lattice.<ref name="Newman" /> Furthermore, by using single-spin-flip dynamics, one can get from any state to any other state by flipping each site that differs between the two states one at a time. The maximum difference between the energy of the present state ''H''<sub>μ</sub> and that of any new state ''H''<sub>ν</sub> reachable by a single spin flip is 2''J'' per bond between the flipped spin and each of its neighbors.<ref name="Newman" /> Thus, in a 1D Ising model, where each site has two neighbors (left and right), the maximum difference in energy would be 4''J''. Let ''c'' represent the ''lattice coordination number'': the number of nearest neighbors that any lattice site has. We assume that all sites have the same number of neighbors due to [[periodic boundary conditions]].<ref name="Newman" /> It is important to note that the Metropolis–Hastings algorithm does not perform well around the critical point due to critical slowing down. Other techniques such as multigrid methods, Niedermayer's algorithm, the [[Swendsen–Wang algorithm]], or the [[Wolff algorithm]] are required in order to resolve the model near the critical point, a requirement for determining the critical exponents of the system.
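The single-spin-flip Metropolis procedure described above can be sketched as follows (a minimal illustration with hypothetical parameter choices, not a production simulation):

```python
import math
import random

# Minimal single-spin-flip Metropolis sweep for a 2D Ising lattice with
# periodic boundary conditions (parameter values below are illustrative).
def metropolis_sweep(spins, beta, J=1.0):
    L = len(spins)
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        # Energy cost of flipping spin (i, j): dE = 2 J s_ij * (sum of neighbors).
        nbrs = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j] +
                spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2.0 * J * spins[i][j] * nbrs
        # Metropolis acceptance: take downhill moves always, uphill moves
        # with probability exp(-beta * dE); this enforces detailed balance.
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            spins[i][j] = -spins[i][j]

random.seed(0)
L = 16
spins = [[1] * L for _ in range(L)]   # start fully magnetized
for _ in range(200):
    metropolis_sweep(spins, beta=1.0)  # beta well above the critical value
m = sum(map(sum, spins)) / (L * L)
print(m)  # stays near +1: the ordered phase is stable at low temperature
```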
 
Specifically for the Ising model and using single-spin-flip dynamics, one can establish the following. Since there are ''L'' total sites on the lattice, using single-spin-flip as the only way we transition to another state, we can see that there are a total of ''L'' new states ν from our present state μ. The algorithm assumes that the selection probabilities are equal for the ''L'' states: ''g''(μ, ν) = 1/''L''. [[Detailed balance]] tells us that the following equation must hold:
==== Renormalization ====
 
When there is no external field, we can derive a functional equation that <math>f(\beta, 0) = f(\beta)</math> satisfies using renormalization.<ref>{{Cite journal |last1=Maris |first1=Humphrey J. |last2=Kadanoff |first2=Leo P. |date=June 1978 |title=Teaching the renormalization group |url=https://pubs.aip.org/aapt/ajp/article/46/6/652-657/1045608 |journal=American Journal of Physics |language=en |volume=46 |issue=6 |pages=652–657 |doi=10.1119/1.11224 |bibcode=1978AmJPh..46..652M |issn=0002-9505|url-access=subscription }}</ref> Specifically, let <math>Z_N(\beta, J)</math> be the partition function with <math>N</math> sites. Now we have:<math display="block">Z_N(\beta, J) = \sum_{\sigma} e^{K \sigma_2(\sigma_1 + \sigma_3)}e^{K \sigma_4(\sigma_3 + \sigma_5)}\cdots</math>where <math>K := \beta J</math>. We sum over each of <math>\sigma_2, \sigma_4, \cdots</math>, to obtain<math display="block">Z_N(\beta, J) = \sum_{\sigma} (2\cosh(K(\sigma_1 + \sigma_3))) \cdot (2\cosh(K(\sigma_3 + \sigma_5))) \cdots</math>Now, since the cosh function is even, we can solve <math>Ae^{K'\sigma_1\sigma_3} = 2\cosh(K(\sigma_1+\sigma_3))</math> as <math display="inline">A = 2\sqrt{\cosh(2K)}, K' = \frac 12 \ln\cosh(2K)</math>. Now we have a self-similarity relation:<math display="block">\frac 1N \ln Z_N(K) = \frac 12 \ln\left(2\sqrt{\cosh(2K)}\right) + \frac 12 \frac{1}{N/2} \ln Z_{N/2}(K')</math>Taking the limit, we obtain<math display="block">f(\beta) = \frac 12 \ln\left(2\sqrt{\cosh(2K)}\right) + \frac 12 f(\beta')</math>where <math>\beta' J = \frac 12 \ln\cosh(2\beta J)</math>.
 
When <math>\beta</math> is small, we have <math>f(\beta)\approx \ln 2</math>, so we can numerically evaluate <math>f(\beta)</math> by iterating the functional equation until <math>K</math> is small.
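The iteration can be carried out numerically. Unrolling the recursion and terminating with <math>f \approx \ln 2</math> once <math>K</math> is small reproduces the known one-dimensional result <math>f = \ln(2\cosh K)</math>, used below purely as a check:

```python
import math

# Iterate the decimation recursion derived above for the 1D zero-field model:
#   f(K) = (1/2) ln(2 sqrt(cosh 2K)) + (1/2) f(K'),  K' = (1/2) ln cosh(2K),
# terminating with f -> ln 2 once K is negligible.  The exact 1D answer,
# f = ln(2 cosh K), is used as an independent check.
def free_energy(K, tol=1e-12):
    pieces = []
    weight = 1.0
    while K > tol:
        pieces.append(weight * 0.5 * math.log(2.0 * math.sqrt(math.cosh(2.0 * K))))
        weight *= 0.5
        K = 0.5 * math.log(math.cosh(2.0 * K))  # the renormalized coupling K'
    return sum(pieces) + weight * math.log(2.0)

K = 0.8
print(free_energy(K), math.log(2.0 * math.cosh(K)))  # the two agree
```

The loop converges quickly because <math>K' \approx K^2</math> for small <math>K</math>, so only a handful of iterations are needed.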
=== Two dimensions ===
 
* In the ferromagnetic case there is a phase transition. At low temperature, the [[Peierls argument]] proves positive magnetization for the nearest neighbor case and then, by the [[Griffiths inequality]], also when longer range interactions are added. Meanwhile, at high temperature, the [[cluster expansion]] gives analyticity of the thermodynamic functions. In the nearest-neighbor case, the free energy was exactly computed by Onsager. The spin-spin correlation functions were computed by McCoy and Wu.
 
==== Onsager's exact solution ====
 
When the interaction energies <math>J_1</math>, <math>J_2</math> are both negative, the Ising model becomes an antiferromagnet. Since the square lattice is bi-partite, it is invariant under this change when the magnetic field <math>h=0</math>, so the free energy and critical temperature are the same for the antiferromagnetic case. For the triangular lattice, which is not bi-partite, the ferromagnetic and antiferromagnetic Ising model behave notably differently. Specifically, around a triangle, it is impossible to make all 3 spin-pairs antiparallel, so the antiferromagnetic Ising model cannot reach the minimal energy state. This is an example of [[geometric frustration]].
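The frustration can be seen by brute force on a single triangle: enumerating all eight spin states shows that the pair sum never reaches the value an unfrustrated antiferromagnet would attain.

```python
import itertools

# Enumerate all spin states on one triangle.  With antiferromagnetic
# coupling (J < 0), minimizing the energy H = -J * sum_<ij> s_i s_j means
# minimizing sum_<ij> s_i s_j; "all pairs antiparallel" would give -3,
# but geometric frustration makes -1 the best achievable value.
pair_sums = []
for s in itertools.product([-1, +1], repeat=3):
    pairs = s[0] * s[1] + s[1] * s[2] + s[2] * s[0]
    pair_sums.append(pairs)

print(min(pair_sums))  # -1, not -3: one bond always stays "unsatisfied"
```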
 
===== Transfer matrix =====
 
Start with an analogy with quantum mechanics. The Ising model on a long periodic lattice has a partition function
 
<math display="block">\sum_{\{S\}} \exp\biggl(\sum_{ij} S_{i,j} \left( S_{i,j+1} + S_{i+1,j} \right)\biggr).</math>
 
Think of the ''i'' direction as ''space'', and the ''j'' direction as ''time''. This is an independent sum over all the values that the spins can take at each time slice. This is a type of [[path integral formulation|path integral]]; it is the sum over all spin histories.
 
A path integral can be rewritten as a Hamiltonian evolution. The Hamiltonian steps through time by performing a unitary rotation between time ''t'' and time ''t'' + Δ''t'':
<math display="block"> U = e^{i H \Delta t}</math>
 
The product of the U matrices, one after the other, is the total time evolution operator, which is the path integral we started with.
 
<math display="block"> U^N = (e^{i H \Delta t})^N = \int DX e^{iL}</math>
 
where ''N'' is the number of time slices. The sum over all paths is given by a product of matrices; each matrix element is the transition probability from one slice to the next.
 
Similarly, one can divide the sum over all partition function configurations into slices, where each slice is the one-dimensional configuration at time ''t''. This defines the [[Transfer-matrix method (statistical mechanics)|transfer matrix]]:
<math display="block">T_{C_1 C_2}.</math>
 
The configuration in each slice is a one-dimensional collection of spins. At each time slice, ''T'' has matrix elements between two configurations of spins, one in the immediate future and one in the immediate past. These two configurations are ''C''<sub>1</sub> and ''C''<sub>2</sub>, and they are all one-dimensional spin configurations. We can think of the vector space that ''T'' acts on as all complex linear combinations of these. Using quantum mechanical notation:
<math display="block">|A\rangle = \sum_S A(S) |S\rangle</math>
 
where each basis vector <math>|S\rangle</math> is a spin configuration of a one-dimensional Ising model.
 
Like the Hamiltonian, the transfer matrix acts on all linear combinations of states. The partition function is a matrix function of T, which is defined by the [[Trace (linear algebra)|sum]] over all histories which come back to the original configuration after ''N'' steps:
<math display="block">Z= \mathrm{tr}(T^N).</math>
 
Since this is a matrix equation, it can be evaluated in any basis. So if we can diagonalize the matrix ''T'', we can find ''Z''.
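For the one-dimensional chain (a simpler special case than the two-dimensional setup discussed here), the identity <math>Z = \mathrm{tr}(T^N)</math> is easy to verify numerically against a brute-force sum over histories:

```python
import itertools
import math
import numpy as np

# Z = tr(T^N) for the 1D Ising chain with periodic boundary conditions and
# dimensionless coupling K = beta*J.  T has one matrix element e^{K s s'}
# per pair of neighboring spin values s, s' in {+1, -1}.
K = 0.6
N = 8
T = np.array([[math.exp(K),  math.exp(-K)],
              [math.exp(-K), math.exp(K)]])
Z_transfer = np.trace(np.linalg.matrix_power(T, N))

# Brute-force sum over all 2^N spin histories for comparison.
Z_direct = sum(
    math.exp(K * sum(s[i] * s[(i + 1) % N] for i in range(N)))
    for s in itertools.product([-1, 1], repeat=N)
)
print(Z_transfer, Z_direct)  # the two agree
```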
 
===== ''T'' in terms of Pauli matrices =====
 
The contribution to the partition function for each past/future pair of configurations on a slice is the sum of two terms. There is the number of spin flips in the past slice and there is the number of spin flips between the past and future slice. Define an operator on configurations which flips the spin at site i:
 
<math display="block">\sigma^x_i.</math>
 
In the usual Ising basis, acting on any linear combination of past configurations, it produces the same linear combination but with the spin at position i of each basis vector flipped.
 
Define a second operator which multiplies the basis vector by +1 and −1 according to the spin at position ''i'':
 
<math display="block">\sigma^z_i.</math>
 
''T'' can be written in terms of these:
 
<math display="block">\sum_i A \sigma^x_i + B \sigma^z_i \sigma^z_{i+1}</math>
 
where ''A'' and ''B'' are constants which are to be determined so as to reproduce the partition function. The interpretation is that the statistical configuration at this slice contributes according to both the number of spin flips in the slice, and whether or not the spin at position ''i'' has flipped.
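The operator construction can be made concrete with Kronecker products. The constants ''A'' and ''B'' below are placeholders (the text fixes them only by matching the partition function):

```python
import numpy as np

# Site operators sigma^x_i and sigma^z_i on a chain of L sites, built as
# Kronecker products (identity on every site except i) -- a sketch of the
# operator algebra described above.
sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])

def site_operator(op, i, L):
    out = np.eye(1)
    for k in range(L):
        out = np.kron(out, op if k == i else np.eye(2))
    return out

L = 3
sx1 = site_operator(sx, 1, L)  # flips the spin at site 1
sz1 = site_operator(sz, 1, L)  # reads off the spin at site 1

# The expression sum_i (A sx_i + B sz_i sz_{i+1}), with A and B left as
# illustrative placeholder constants.
A, B = 0.5, 0.3
M = sum(A * site_operator(sx, i, L) +
        B * site_operator(sz, i, L) @ site_operator(sz, (i + 1) % L, L)
        for i in range(L))
print(M.shape)  # (8, 8): it acts on all 2^L one-dimensional spin configurations
```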
 
===== Spin flip creation and annihilation operators =====
 
Just as in the one-dimensional case, we will shift attention from the spins to the spin-flips. The σ<sup>''z''</sup> term in ''T'' counts the number of spin flips, which we can write in terms of spin-flip creation and annihilation operators:
 
<math display="block"> \sum_i C \psi^\dagger_i \psi_i.</math>
 
The first term flips a spin, so depending on the basis state it either:
#moves a spin-flip one unit to the right
#moves a spin-flip one unit to the left
#produces two spin-flips on neighboring sites
#destroys two spin-flips on neighboring sites.
 
Writing this out in terms of creation and annihilation operators:
<math display="block"> \sigma^x_i = D {\psi^\dagger}_i \psi_{i+1} + D^* {\psi^\dagger}_i \psi_{i-1} + C\psi_i \psi_{i+1} + C^* {\psi^\dagger}_i {\psi^\dagger}_{i+1}.</math>
 
Ignore the constant coefficients, and focus attention on the form. They are all quadratic. Since the coefficients are constant, this means that the ''T'' matrix can be diagonalized by Fourier transforms.
 
Carrying out the diagonalization produces the Onsager free energy.
 
===== Onsager's formula for spontaneous magnetization =====
 
=== Four dimensions and above ===
{{main article|High-dimensional Ising model}}
{{unreferenced section|date=November 2024}}
In any dimension, the Ising model can be productively described by a locally varying [[mean field theory|mean field]]. The field is defined as the average spin value over a large region, but not so large so as to include the entire system. The field still has slow variations from point to point, as the averaging volume moves. These fluctuations in the field are described by a continuum field theory in the infinite system limit. The accuracy of this approximation improves as the dimension becomes larger. A deeper understanding of how the Ising model behaves, going beyond mean-field approximations, can be achieved using [[renormalization group]] methods.
{{overly detailed|section|date=November 2024}}
 
 
==== Local field ====
 
The field ''H'' is defined as the long wavelength Fourier components of the spin variable, in the limit that the wavelengths are long. There are many ways to take the long wavelength average, depending on the details of how the short wavelengths are cut off. The details are not too important, since the goal is to find the statistics of ''H'' and not the spins. Once the correlations in ''H'' are known, the long-distance correlations between the spins will be proportional to the long-distance correlations in ''H''.
 
For any value of the slowly varying field ''H'', the free energy (log-probability) is a local analytic function of ''H'' and its gradients. The free energy ''F''(''H'') is defined to be the sum over all Ising configurations which are consistent with the long wavelength field. Since ''H'' is a coarse description, there are many Ising configurations consistent with each value of ''H'', so long as not too much exactness is required for the match.
 
Since the allowed range of values of the spin in any region only depends on the values of ''H'' within one averaging volume from that region, the free energy contribution from each region only depends on the value of ''H'' there and in the neighboring regions. So ''F'' is a sum over all regions of a local contribution, which only depends on ''H'' and its derivatives.
 
By symmetry in ''H'', only even powers contribute. By reflection symmetry on a square lattice, only even powers of gradients contribute. Writing out the first few terms in the free energy:
 
<math display="block">\beta F = \int d^dx \left[ A H^2 + \sum_{i=1}^d Z_i (\partial_i H)^2 + \lambda H^4 +\cdots \right].</math>
 
On a square lattice, symmetries guarantee that the coefficients ''Z<sub>i</sub>'' of the derivative terms are all equal. But even for an anisotropic Ising model, where the ''Z<sub>i</sub>''{{'}}s in different directions are different, the fluctuations in ''H'' are isotropic in a coordinate system where the different directions of space are rescaled.
 
On any lattice, the derivative term
<math display="block">Z_{ij} \, \partial_i H \, \partial_j H </math>
is a positive definite [[quadratic form]], and can be used to ''define'' the metric for space. So any translationally invariant Ising model is rotationally invariant at long distances, in coordinates that make ''Z<sub>ij</sub>'' = δ<sub>''ij''</sub>. Rotational symmetry emerges spontaneously at large distances just because there aren't very many low order terms. At higher order multicritical points, this [[accidental symmetry]] is lost.
 
Since β''F'' is a function of a slowly spatially varying field, the probability of any field configuration is (omitting higher-order terms):
 
<math display="block">P(H) \propto e^{ - \int d^dx \left[ AH^2 + Z |\nabla H|^2 + \lambda H^4 \right]} = e^{-\beta F[H]}. </math>
 
The statistical average of any product of ''H'' terms is equal to:
 
<math display="block">\langle H(x_1) H(x_2)\cdots H(x_n) \rangle = { \int DH \, e^{ - \int d^dx \left[ A H^2 + Z |\nabla H|^2 + \lambda H^4 \right]} H(x_1) H(x_2) \cdots H(x_n) \over \int DH \, e^{ - \int d^dx \left[ A H^2 + Z |\nabla H|^2 + \lambda H^4 \right]} }.</math>
 
The denominator in this expression is called the ''partition function'':<math display="block">Z = \int DH \, e^{ - \int d^dx \left[ A H^2 + Z |\nabla H|^2 + \lambda H^4 \right]}</math>and the integral over all possible values of ''H'' is a statistical path integral. It integrates exp(−β''F'') over all values of ''H'', over all the long wavelength Fourier components of the spins. ''F'' is a "Euclidean" Lagrangian for the field ''H''. It is similar to the Lagrangian of a scalar field in [[quantum field theory]], the difference being that all the derivative terms enter with a positive sign, and there is no overall factor of ''i'' (thus "Euclidean").
 
==== Dimensional analysis ====
 
The form of ''F'' can be used to predict which terms are most important by dimensional analysis. Dimensional analysis is not completely straightforward, because the scaling of ''H'' needs to be determined.
 
In the generic case, choosing the scaling law for ''H'' is easy, since the only term that contributes is the first one,
 
<math display="block">F = \int d^dx \, A H^2.</math>
 
This term is the most significant, but it gives trivial behavior. This form of the free energy is ultralocal, meaning that it is a sum of an independent contribution from each point. This is like the spin-flips in the one-dimensional Ising model. Every value of ''H'' at any point fluctuates completely independently of the value at any other point.
 
The scale of the field can be redefined to absorb the coefficient ''A'', and then it is clear that ''A'' only determines the overall scale of fluctuations. The ultralocal model describes the long wavelength high temperature behavior of the Ising model, since in this limit the fluctuation averages are independent from point to point.
 
To find the critical point, lower the temperature. As the temperature goes down, the fluctuations in ''H'' go up because the fluctuations are more correlated. This means that the average of a large number of spins does not become small as quickly as if they were uncorrelated, because they tend to be the same. This corresponds to decreasing ''A'' in the system of units where ''H'' does not absorb ''A''. The phase transition can only happen when the subleading terms in ''F'' can contribute, but since the first term dominates at long distances, the coefficient ''A'' must be tuned to zero. This is the ___location of the critical point:
 
<math display="block">F= \int d^dx \left[ t H^2 + \lambda H^4 + Z (\nabla H)^2 \right],</math>
 
where ''t'' is a parameter which goes through zero at the transition.
 
Since ''t'' is vanishing, fixing the scale of the field using this term makes the other terms blow up. Once ''t'' is small, the scale of the field can either be set to fix the coefficient of the ''H''<sup>4</sup> term or the (∇''H'')<sup>2</sup> term to 1.
 
==== Magnetization ====
 
To find the magnetization, fix the scaling of ''H'' so that λ is one. Now the field ''H'' has dimension −''d''/4, so that ''H''<sup>4</sup>''d<sup>d</sup>x'' is dimensionless, and ''Z'' has dimension 2&nbsp;−&nbsp;''d''/2. In this scaling, the gradient term is only important at long distances for ''d'' ≤ 4. Above four dimensions, at long wavelengths, the overall magnetization is only affected by the ultralocal terms.
 
There is one subtle point. The field ''H'' is fluctuating statistically, and the fluctuations can shift the zero point of ''t''. To see how, consider ''H''<sup>4</sup> split in the following way:
 
<math display="block">H(x)^4 = -\langle H(x)^2\rangle^2 + 2\langle H(x)^2\rangle H(x)^2 + \left(H(x)^2 - \langle H(x)^2\rangle\right)^2</math>
 
The first term is a constant contribution to the free energy, and can be ignored. The second term is a finite shift in ''t''. The third term is a quantity that scales to zero at long distances. This means that when analyzing the scaling of ''t'' by dimensional analysis, it is the shifted ''t'' that is important. This was historically very confusing, because the shift in ''t'' at any finite ''λ'' is finite, but near the transition ''t'' is very small. The fractional change in ''t'' is very large, and in units where ''t'' is fixed the shift looks infinite.
 
The magnetization is at the minimum of the free energy, and the equation determining it is analytic. In terms of the shifted ''t'',
 
<math display="block">{\partial \over \partial H } \left( t H^2 + \lambda H^4 \right ) = 2t H + 4\lambda H^3 = 0</math>
 
For ''t'' < 0, the minima are at ''H'' proportional to the square root of −''t''. So Landau's [[catastrophe theory|catastrophe]] argument is correct in dimensions 5 and higher. The magnetization exponent in these dimensions is equal to the mean-field value.
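The ___location of the minima can be checked directly. A minimal numerical sketch (the values of ''t'' and λ are arbitrary illustrative choices):

```python
import math

# Free energy density f(H) = t*H^2 + lam*H^4, with t < 0 below the transition
t, lam = -1.0, 1.0

# Setting the derivative 2*t*H + 4*lam*H^3 to zero gives H^2 = -t/(2*lam),
# so the spontaneous magnetization grows as the square root of -t
H = math.sqrt(-t / (2 * lam))

print(H)                             # magnitude of the spontaneous magnetization
print(2 * t * H + 4 * lam * H ** 3)  # derivative at the minimum, vanishes
```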
 
When ''t'' is negative, the fluctuations about the new minimum are described by a new positive quadratic coefficient. Since this term always dominates, at temperatures below the transition the fluctuations again become ultralocal at long distances.
 
==== Fluctuations ====
 
To find the behavior of fluctuations, rescale the field to fix the gradient term. Then the length scaling dimension of the field is 1&nbsp;−&nbsp;''d''/2. Now the field has constant quadratic spatial fluctuations at all temperatures. The scale dimension of the ''H''<sup>2</sup> term is 2, while the scale dimension of the ''H''<sup>4</sup> term is 4&nbsp;−&nbsp;''d''. For ''d'' < 4, the ''H''<sup>4</sup> term has positive scale dimension. In dimensions higher than 4 it has negative scale dimension.
 
This is an essential difference. In dimensions higher than 4, fixing the scale of the gradient term means that the coefficient of the ''H''<sup>4</sup> term is less and less important at longer and longer wavelengths. The dimension at which nonquadratic contributions begin to contribute is known as the critical dimension. In the Ising model, the critical dimension is 4.
 
In dimensions above 4, the critical fluctuations are described by a purely quadratic free energy at long wavelengths. This means that the correlation functions are all computable as [[Gaussian distribution|Gaussian]] averages:
 
<math display="block">\langle S(x)S(y)\rangle \propto \langle H(x)H(y)\rangle = G(x-y) = \int {dk \over (2\pi)^d} { e^{ik(x-y)}\over k^2 + t }</math>
 
valid when ''x''&nbsp;−&nbsp;''y'' is large. The function ''G''(''x''&nbsp;−&nbsp;''y'') is the analytic continuation to imaginary time of the [[propagator|Feynman propagator]], since the free energy is the analytic continuation of the quantum field action for a free scalar field. For dimensions 5 and higher, all the other correlation functions at long distances are then determined by [[S-matrix#Wick's theorem|Wick's theorem]]. All the odd moments are zero, by ± symmetry. The even moments are the sum over all partitions into pairs of the product of ''G''(''x''&nbsp;−&nbsp;''y'') for each pair.
 
<math display="block">\langle S(x_1) S(x_2) \cdots S(x_{2n})\rangle = C^n \sum G(x_{i1},x_{j1}) G(x_{i2},x_{j2}) \ldots G(x_{in},x_{jn})</math>
 
where ''C'' is the proportionality constant. So knowing ''G'' is enough. It determines all the multipoint correlations of the field.
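Wick's theorem can be illustrated numerically: for a zero-mean Gaussian vector, the fourth moment equals the sum over the three pairings of two-point functions. A sketch, where the covariance matrix is an arbitrary positive-definite stand-in for ''G'':

```python
import numpy as np

# Exponential covariance: a positive-definite stand-in for G(x - y)
n = 4
C = np.exp(-np.abs(np.subtract.outer(np.arange(n), np.arange(n))) / 2.0)

rng = np.random.default_rng(0)
x = rng.multivariate_normal(np.zeros(n), C, size=500_000)

# Sampled fourth moment <x0 x1 x2 x3>
sampled = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])

# Wick's theorem: sum over the three pairings of two-point functions
wick = C[0, 1] * C[2, 3] + C[0, 2] * C[1, 3] + C[0, 3] * C[1, 2]

print(sampled, wick)  # agree up to sampling noise
```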
 
==== The critical two-point function ====
 
To determine the form of ''G'', consider that the fields in a path integral obey the classical equations of motion derived by varying the free energy:
 
<math display="block">\begin{align}
&&\left(-\nabla_x^2 + t\right) \langle H(x)H(y) \rangle &= 0 \\
\rightarrow {} && \nabla^2 G(x) - tG(x) &= 0
\end{align}</math>
 
This is valid at noncoincident points only, since the correlations of ''H'' are singular when points collide. ''H'' obeys classical equations of motion for the same reason that quantum mechanical operators obey them—its fluctuations are defined by a path integral.
 
At the critical point ''t'' = 0, this is [[Laplace's equation]], which can be solved by [[Gaussian surface|Gauss's method]] from electrostatics. Define an electric field analog by
 
<math display="block">E = \nabla G</math>
 
Away from the origin:
 
<math display="block">\nabla \cdot E = 0</math>
 
since ''G'' is spherically symmetric in ''d'' dimensions, and ''E'' is the radial gradient of ''G''. Integrating over a large ''d''&nbsp;−&nbsp;1 dimensional sphere,
 
<math display="block">\int d^{d-1}S E_r = \mathrm{constant}</math>
 
This gives:
 
<math display="block">E = {C \over r^{d-1} }</math>
 
and ''G'' can be found by integrating with respect to ''r''.
 
<math display="block">G(r) = {C \over r^{d-2} }</math>
 
The constant ''C'' fixes the overall normalization of the field.
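That this power law solves the radial Laplace equation can be verified symbolically. A sketch using SymPy:

```python
import sympy as sp

r, d, C = sp.symbols('r d C', positive=True)

# Candidate critical two-point function G(r) = C / r^(d-2)
G = C / r**(d - 2)

# Radial Laplacian in d dimensions: r^(1-d) * d/dr( r^(d-1) * dG/dr )
laplacian = sp.diff(r**(d - 1) * sp.diff(G, r), r) / r**(d - 1)

print(sp.simplify(laplacian))  # 0: G is harmonic away from the origin
```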
 
==== ''G''(''r'') away from the critical point ====
 
When ''t'' does not equal zero, so that ''H'' is fluctuating at a temperature slightly away from critical, the two point function decays at long distances. The equation it obeys is altered:
 
<math display="block">\nabla^2 G - t G = 0 \to {1 \over r^{d - 1}} {d \over dr} \left( r^{d-1} {dG \over dr} \right) - t G(r) = 0</math>
 
For ''r'' small compared with <math>1/\sqrt{t}</math>, the solution diverges in exactly the same way as in the critical case, but the long distance behavior is modified.
 
To see how, it is convenient to represent the two point function as an integral, introduced by Schwinger in the quantum field theory context:
 
<math display="block">G(x) = \int d\tau {1 \over \left(\sqrt{2\pi\tau}\right)^d} e^{-{x^2 \over 4\tau} - t\tau}</math>
 
This is ''G'', since the Fourier transform of this integral is easy. Each fixed τ contribution is a Gaussian in ''x'', whose Fourier transform is another Gaussian of reciprocal width in ''k''.
 
<math display="block">G(k) = \int d\tau e^{-(k^2 + t)\tau} = {1 \over k^2 + t}</math>
 
This is the inverse of the operator −∇<sup>2</sup>&nbsp;+&nbsp;''t'' in ''k''-space, acting on the unit function in ''k''-space, which is the Fourier transform of a delta function source localized at the origin. So it satisfies the same equation as ''G'' with the same boundary conditions that determine the strength of the divergence at 0.
 
The interpretation of the integral representation over the ''proper time'' τ is that the two point function is the sum over all random walk paths that link position 0 to position ''x'' over time τ. The density of these paths at time τ at position ''x'' is Gaussian, but the random walkers disappear at a steady rate proportional to ''t'' so that the Gaussian at time τ is diminished in height by a factor that decreases steadily exponentially. In the quantum field theory context, these are the paths of relativistically localized quanta in a formalism that follows the paths of individual particles. In the pure statistical context, these paths still appear by the mathematical correspondence with quantum fields, but their interpretation is less directly physical.
 
The integral representation immediately shows that ''G''(''r'') is positive, since it is represented as a weighted sum of positive Gaussians. It also gives the rate of decay at large ''r'': the typical proper time for a random walk to reach position ''r'' is τ ∝ ''r''<sup>2</sup>, and in this time the Gaussian height has decayed by <math>e^{-t\tau} = e^{-tr^2}</math>. But the integral is dominated by the slowest-decaying paths that still reach ''r'': minimizing the exponent ''r''<sup>2</sup>/4τ + ''t''τ over τ gives τ ∝ ''r''/<math>\sqrt{t}</math>, so the decay factor appropriate for position ''r'' is <math>e^{-\sqrt t r}</math>.
 
A heuristic approximation for ''G''(''r'') is:
 
<math display="block">G(r) \approx { e^{-\sqrt t r} \over r^{d-2}}</math>
 
This is not an exact form, except in three dimensions; in lower dimensions the interactions between paths also become important. The exact forms in high dimensions are variants of [[Bessel functions]].
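In three dimensions the proper-time representation can be checked numerically: with the heat-kernel normalization (4πτ)<sup>−3/2</sup> (slightly different from the normalization used above), the integral reproduces the exact Yukawa form ''e''<sup>−√''t'' ''r''</sup>/(4π''r''). A sketch, with arbitrary illustrative values of ''t'' and ''r'':

```python
import math
from scipy.integrate import quad

t, r = 0.7, 1.3  # arbitrary illustrative values

# Schwinger proper-time representation of G(r) in d = 3
def integrand(tau):
    if tau < 1e-12:
        return 0.0  # the Gaussian factor vanishes faster than the prefactor diverges
    return (4 * math.pi * tau) ** -1.5 * math.exp(-r**2 / (4 * tau) - t * tau)

schwinger, _ = quad(integrand, 0, math.inf)

# Exact three-dimensional propagator (Yukawa form)
exact = math.exp(-math.sqrt(t) * r) / (4 * math.pi * r)

print(schwinger, exact)  # the two agree
```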
 
==== Symanzik polymer interpretation ====
 
The interpretation of the correlations as fixed size quanta travelling along random walks gives a way of understanding why the critical dimension of the ''H''<sup>4</sup> interaction is 4. The term ''H''<sup>4</sup> can be thought of as the square of the density of the random walkers at any point. In order for such a term to alter the finite order correlation functions, which only introduce a few new random walks into the fluctuating environment, the new paths must intersect. Otherwise, the square of the density is just proportional to the density and only shifts the ''H''<sup>2</sup> coefficient by a constant. But the intersection probability of random walks depends on the dimension, and random walks in dimension higher than 4 do not intersect.
 
The [[fractal dimension]] of an ordinary random walk is 2. The number of balls of size ε required to cover the path increases as ε<sup>−2</sup>. Two objects of fractal dimension 2 will intersect with reasonable probability only in a space of dimension 4 or less, the same condition as for a generic pair of planes. [[Kurt Symanzik]] argued that this implies that the critical Ising fluctuations in dimensions higher than 4 should be described by a free field. This argument eventually became a mathematical proof.
 
==== 4&nbsp;−&nbsp;''ε'' dimensions – renormalization group ====
 
The Ising model in four dimensions is described by a fluctuating field, but now the fluctuations are interacting. In the polymer representation, intersections of random walks are marginally possible. In the quantum field continuation, the quanta interact.
 
The negative logarithm of the probability of any field configuration ''H'' is the [[Thermodynamic free energy|free energy]] function
 
<math display="block">F= \int d^4 x \left[ {Z \over 2} |\nabla H|^2 + {t\over 2} H^2 + {\lambda \over 4!} H^4 \right] \,</math>
 
The numerical factors are there to simplify the equations of motion. The goal is to understand the statistical fluctuations. Like any other non-quadratic path integral, the correlation functions have a [[Feynman diagram|Feynman expansion]] as particles travelling along random walks, splitting and rejoining at vertices. The interaction strength is parametrized by the classically dimensionless quantity λ.
 
Although dimensional analysis shows that both λ and ''Z'' are dimensionless, this is misleading. The long wavelength statistical fluctuations are not exactly scale invariant, and only become scale invariant when the interaction strength vanishes.
 
The reason is that there is a cutoff used to define ''H'', and the cutoff defines the shortest wavelength. Fluctuations of ''H'' at wavelengths near the cutoff can affect the longer-wavelength fluctuations. If the system is scaled along with the cutoff, the parameters will scale by dimensional analysis, but then comparing parameters doesn't compare behavior because the rescaled system has more modes. If the system is rescaled in such a way that the short wavelength cutoff remains fixed, the long-wavelength fluctuations are modified.
 
===== Wilson renormalization =====
 
A quick heuristic way of studying the scaling is to cut off the ''H'' wavenumbers at a point Λ. Fourier modes of ''H'' with wavenumbers larger than Λ are not allowed to fluctuate. A rescaling of length that makes the whole system smaller increases all wavenumbers, and moves some fluctuations above the cutoff.
 
To restore the old cutoff, perform a partial integration over all the wavenumbers which used to be forbidden, but are now fluctuating. In Feynman diagrams, integrating over a fluctuating mode at wavenumber ''k'' links up lines carrying momentum ''k'' in a correlation function in pairs, with a factor of the inverse propagator.
 
Under rescaling, when the system is shrunk by a factor of (1+''b''), the ''t'' coefficient scales up by a factor (1+''b'')<sup>2</sup> by dimensional analysis. The change in ''t'' for infinitesimal ''b'' is 2''bt''. The other two coefficients are dimensionless and do not change at all.
 
The lowest order effect of integrating out can be calculated from the equations of motion:
 
<math display="block">-\nabla^2 H + t H = - {\lambda \over 6} H^3.</math>
 
This equation is an identity inside any correlation function away from other insertions. After integrating out the modes with Λ < ''k'' < (1+''b'')Λ, it will be a slightly different identity.
 
Since the form of the equation will be preserved, to find the change in coefficients it is sufficient to analyze the change in the ''H''<sup>3</sup> term. In a Feynman diagram expansion, the ''H''<sup>3</sup> term inserted in a correlation function has three dangling lines. Joining two of them at large wavenumber ''k'' gives a change in ''H''<sup>3</sup> with one dangling line, so proportional to ''H'':
 
<math display="block">\delta H^3 = 3H \int_{\Lambda<|k|<(1 + b)\Lambda} {d^4k \over (2\pi)^4} {1\over (k^2 + t)}</math>
 
The factor of 3 comes from the fact that the loop can be closed in three different ways.
 
The integral should be split into two parts:
 
<math display="block">\int dk {1 \over k^2} - t \int dk { 1\over k^2(k^2 + t)} = A\Lambda^2 b - B b t</math>
 
The first part is not proportional to ''t'', and in the equation of motion it can be absorbed by a constant shift in ''t''. It is caused by the fact that the ''H''<sup>3</sup> term has a linear part. Only the second term, which varies with ''t'', contributes to the critical scaling.
 
This new linear term adds to the ''tH'' term on the left hand side, changing ''t'' by an amount proportional to ''t''. The total change in ''t'' is the sum of the term from dimensional analysis and this second term from [[operator product expansion|operator products]]:
 
<math display="block">\delta t = \left(2 - {B\lambda \over 2} \right)b t</math>
 
So ''t'' is rescaled, but its dimension is [[anomalous dimension|anomalous]], it is changed by an amount proportional to the value of λ.
 
But λ also changes. The change in λ requires considering the lines splitting and then quickly rejoining. The lowest order process is one where one of the three lines from ''H''<sup>3</sup> splits into three, which quickly joins with one of the other lines from the same vertex. The correction to the vertex is

<math display="block">\delta \lambda = - {3 \lambda^2 \over 2} \int_{\Lambda<|k|<(1 + b)\Lambda} {d^4 k \over (2\pi)^4} {1 \over (k^2 + t)^2} = -{3 B \lambda^2 \over 2} b</math>

The numerical factor of three counts the three ways of choosing which pair of dangling lines is contracted into the loop.
 
These two equations together define the renormalization group equations in four dimensions:
 
<math display="block">\begin{align}
{dt \over t} &= \left(2 - {B\lambda \over 2}\right) b \\
{d\lambda \over \lambda} &= {-3 B \lambda \over 2} b
\end{align}</math>
 
The coefficient ''B'' is determined by the formula
<math display="block">B b = \int_{\Lambda<|k|<(1+b)\Lambda} {d^4k\over (2\pi)^4} {1 \over k^4}</math>
 
and is proportional to the area of a three-dimensional sphere of radius Λ, times the width of the integration region ''b''Λ, divided by Λ<sup>4</sup>:
<math display="block">B= (2 \pi^2 \Lambda^3) {1\over (2\pi)^4} { b \Lambda} {1 \over b\Lambda^4} = {1\over 8\pi^2} </math>
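The value of ''B'' can be confirmed by doing the shell integral in radial coordinates, using the surface area 2π<sup>2</sup> of the unit three-sphere. A quick numerical sketch (the shell width ''b'' is an arbitrary small illustrative value; the cutoff Λ drops out):

```python
import math

b = 1e-4  # infinitesimal shell width (illustrative)

# Shell integral of d^4k/(2*pi)^4 * 1/k^4 in radial coordinates:
# surface area of S^3 times the integral of k^3/k^4 = log of the shell ratio,
# independent of the cutoff Lambda
Bb = (2 * math.pi**2) / (2 * math.pi) ** 4 * math.log(1 + b)

B = Bb / b
print(B, 1 / (8 * math.pi**2))  # B approaches 1/(8*pi^2) as b -> 0
```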
 
In other dimensions, the constant ''B'' changes, but the same constant appears both in the ''t'' flow and in the coupling flow. The reason is that the derivative with respect to ''t'' of the closed loop with a single vertex is a closed loop with two vertices. This means that the only difference between the scaling of the coupling and the ''t'' is the combinatorial factors from joining and splitting.
 
===== Wilson–Fisher fixed point =====
 
It should be possible to investigate three dimensions starting from the four-dimensional theory, because the intersection probabilities of random walks depend continuously on the dimensionality of the space. In the language of Feynman graphs, the coupling does not change very much when the dimension is changed.
 
The process of continuing away from dimension 4 is not completely well defined without a prescription for how to do it. The prescription is only well defined on diagrams. It replaces the Schwinger representation in dimension 4 with the Schwinger representation in dimension 4&nbsp;−&nbsp;ε defined by:
<math display="block"> G(x-y) = \int d\tau {1 \over \left(\sqrt{2\pi\tau}\right)^d} e^{-{x^2 \over 4\tau} - t\tau} </math>
 
In dimension 4&nbsp;−&nbsp;ε, the coupling λ has positive scale dimension ε, and this must be added to the flow.
 
<math display="block">\begin{align}
{d\lambda \over \lambda} &= \varepsilon - {3 B \lambda \over 2} \\
{dt \over t} &= 2 - {\lambda B \over 2}
\end{align}</math>
 
The coefficient ''B'' is dimension dependent, but it will cancel. The fixed point for λ is no longer zero, but at:
<math display="block">\lambda = {2\varepsilon \over 3B} </math>
where the scale dimension of ''t'' is altered by an amount λ''B''/2 = ε/3.
 
The magnetization exponent is altered proportionately to:
<math display="block">\tfrac{1}{2} \left( 1 - {\varepsilon \over 3}\right)</math>
 
which is 0.333 in 3 dimensions (ε = 1) and 0.167 in 2 dimensions (ε = 2). This is not so far off from the measured exponent 0.308 and the Onsager two-dimensional exponent 0.125.
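The quoted values follow directly from the first-order formula. A trivial check:

```python
def beta_exponent(eps):
    # First-order epsilon-expansion magnetization exponent: (1/2)(1 - eps/3)
    return 0.5 * (1 - eps / 3)

print(beta_exponent(1))  # 3 dimensions: 1/3
print(beta_exponent(2))  # 2 dimensions: 1/6
```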
 
==== Infinite dimensions – mean field ====
{{Main|Mean-field theory}}
 
The behavior of an Ising model on a fully connected graph may be completely understood by [[mean-field theory]]. This type of description is appropriate to very-high-dimensional square lattices, because then each site has a very large number of neighbors.
 
The idea is that if each spin is connected to a large number of spins, only the average ratio of + spins to − spins is important, since the fluctuations about this mean will be small. The [[mean field]] ''H'' is the average fraction of spins which are + minus the average fraction of spins which are&nbsp;−. The energy cost of flipping a single spin in the mean field ''H'' is ±2''JNH''. It is convenient to redefine ''J'' to absorb the factor ''N'', so that the limit ''N'' → ∞ is smooth. In terms of the new ''J'', the energy cost for flipping a spin is ±2''JH''.
 
This energy cost gives the ratio of probability ''p'' that the spin is + to the probability 1−''p'' that the spin is&nbsp;−. This ratio is the Boltzmann factor:
<math display="block">{p\over 1-p} = e^{2\beta JH}</math>
 
so that
<math display="block">p = {1 \over 1 + e^{-2\beta JH} }</math>
 
The mean value of the spin is given by averaging 1 and −1 with the weights ''p'' and 1&nbsp;−&nbsp;''p'', so the mean value is 2''p''&nbsp;−&nbsp;1. But this average is the same for all spins, and is therefore equal to ''H''.
<math display="block"> H = 2p - 1 = { 1 - e^{-2\beta JH} \over 1 + e^{-2\beta JH}} = \tanh (\beta JH)</math>
 
The solutions to this equation are the possible consistent mean fields. For β''J'' < 1 there is only the one solution, at ''H'' = 0. For β''J'' > 1 there are three solutions, and the solution at ''H'' = 0 is unstable.
 
The instability means that increasing the mean field above zero a little bit produces a statistical fraction of spins which are + which is bigger than the value of the mean field. So a mean field which fluctuates above zero will produce an even greater mean field, and will eventually settle at the stable solution. This means that for temperatures below the critical value β''J'' = 1 the mean-field Ising model undergoes a phase transition in the limit of large ''N''.
 
Above the critical temperature, fluctuations in ''H'' are damped because the mean field restores the fluctuation to zero field. Below the critical temperature, the mean field is driven to a new equilibrium value, which is either the positive ''H'' or negative ''H'' solution to the equation.
 
For β''J'' = 1 + ε, just below the critical temperature, the value of ''H'' can be calculated from the Taylor expansion of the hyperbolic tangent:
<math display="block">H = \tanh(\beta J H) \approx (1+\varepsilon)H - {(1+\varepsilon)^3H^3\over 3}</math>
 
Dividing by ''H'' to discard the unstable solution at ''H'' = 0, the stable solutions are, to leading order in ε:
<math display="block">H = \pm\sqrt{3\varepsilon}</math>
 
The spontaneous magnetization ''H'' grows near the critical point as the square root of the change in temperature. This is true whenever ''H'' can be calculated from the solution of an analytic equation which is symmetric between positive and negative values, which led [[Lev Landau|Landau]] to suspect that all Ising type phase transitions in all dimensions should follow this law.
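The square-root growth can be checked by solving the self-consistency equation numerically just below the transition. A sketch (the helper name and the bisection bracket are illustrative choices):

```python
import math

def mean_field(beta_J, lo=1e-6, hi=5.0, iters=200):
    """Solve H = tanh(beta_J * H) for the positive root by bisection."""
    f = lambda H: math.tanh(beta_J * H) - H
    # For beta_J > 1, f > 0 just above zero and f < 0 for large H
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps = 0.01
H = mean_field(1 + eps)
print(H, math.sqrt(3 * eps))  # both ~0.17: H grows as sqrt(3*eps)
```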
 
The mean-field exponent is [[Universality (dynamical systems)|universal]] because changes in the character of solutions of analytic equations are always described by [[catastrophe theory|catastrophes]] in the [[Taylor series]], which is a polynomial equation. By symmetry, the equation for ''H'' must only have odd powers of ''H'' on the right hand side. Changing β should only smoothly change the coefficients. The transition happens when the coefficient of ''H'' on the right hand side is 1. Near the transition:
<math display="block">H = {\partial (\beta F) \over \partial h} = (1+A\varepsilon) H + B H^3 + \cdots</math>
 
Whatever ''A'' and ''B'' are, so long as neither of them is tuned to zero, the spontaneous magnetization will grow as the square root of ε. This argument can only fail if the free energy β''F'' is either non-analytic or non-generic at the exact β where the transition occurs.
 
But the spontaneous magnetization in magnetic systems and the density in gases near the critical point are measured very accurately. The density and the magnetization in three dimensions have the same power-law dependence on the temperature near the critical point, but the behavior from experiments is:
<math display="block">H \propto \varepsilon^{0.308}</math>
 
The exponent is also universal, since it is the same in the Ising model as in the experimental magnet and gas, but it is not equal to the mean-field value. This was a great surprise.
 
This is also true in two dimensions, where
<math display="block">H \propto \varepsilon^{0.125}</math>
 
But there it was not a surprise, because it was predicted by [[Lars Onsager|Onsager]].
 
==== Low dimensions&nbsp;– block spins ====
 
In three dimensions, the perturbative series from the field theory is an expansion in a coupling constant λ which is not particularly small. The effective size of the coupling at the fixed point is one over the branching factor of the particle paths, so the expansion parameter is about 1/3. In two dimensions, the perturbative expansion parameter is 2/3.
 
But renormalization can also be productively applied to the spins directly, without passing to an average field. Historically, this approach is due to [[Leo Kadanoff]] and predated the perturbative ε expansion.
 
The idea is to integrate out lattice spins iteratively, generating a flow in couplings. But now the couplings are lattice energy coefficients. The fact that a continuum description exists guarantees that this iteration will converge to a fixed point when the temperature is tuned to criticality.
 
===== Migdal–Kadanoff renormalization =====
 
Write the two-dimensional Ising model with an infinite number of possible higher order interactions. To keep spin reflection symmetry, only even powers contribute:
<math display="block">E = \sum_{ij} J_{ij} S_i S_j + \sum J_{ijkl} S_i S_j S_k S_l \ldots.</math>
 
By translation invariance, ''J<sub>ij</sub>'' is only a function of ''i''&nbsp;−&nbsp;''j''. By the accidental rotational symmetry, at large ''i'' and ''j'' its size only depends on the magnitude of the two-dimensional vector ''i''&nbsp;−&nbsp;''j''. The higher order coefficients are also similarly restricted.
 
The renormalization iteration divides the lattice into two parts – even spins and odd spins. The odd spins live on the odd-checkerboard lattice positions, and the even ones on the even-checkerboard. When the spins are indexed by the position (''i'',''j''), the odd sites are those with ''i''&nbsp;+&nbsp;''j'' odd and the even sites those with ''i''&nbsp;+&nbsp;''j'' even, and even sites are only connected to odd sites.
 
The two possible values of the odd spins will be integrated out, by summing over both possible values. This will produce a new free energy function for the remaining even spins, with new adjusted couplings. The even spins are again in a lattice, with axes tilted at 45 degrees to the old ones. Unrotating the system restores the old configuration, but with new parameters. These parameters describe the interaction between spins at distances <math display="inline">\sqrt{2}</math> larger.
 
Starting from the Ising model and repeating this iteration eventually changes all the couplings. When the temperature is higher than the critical temperature, the couplings will converge to zero, since the spins at large distances are uncorrelated. But when the temperature is critical, there will be nonzero coefficients linking spins at all orders. The flow can be approximated by only considering the first few terms. This truncated flow will produce better and better approximations to the critical exponents when more terms are included.
 
The simplest approximation is to keep only the usual ''J'' term, and discard everything else. This will generate a flow in ''J'', analogous to the flow in ''t'' at the fixed point of λ in the ε expansion.
 
To find the change in ''J'', consider the four neighbors of an odd site. These are the only spins which interact with it. The multiplicative contribution to the partition function from the sum over the two values of the spin at the odd site is:
<math display="block"> e^{J (N_+ - N_-)} + e^{J (N_- - N_+)} = 2 \cosh(J[N_+ - N_-])</math>
 
where ''N''<sub>±</sub> is the number of neighbors which are ±. Ignoring the factor of 2, the free energy contribution from this odd site is:
<math display="block"> F = \ln(\cosh[J(N_+ - N_-)]).</math>
 
This includes nearest neighbor and next-nearest neighbor interactions, as expected, but also a four-spin interaction which is to be discarded. To truncate to nearest neighbor interactions, consider that the difference in energy between all spins the same and equal numbers + and – is:
<math display="block"> \Delta F = \ln(\cosh[4J]).</math>
 
From nearest neighbor couplings, the difference in energy between all spins equal and staggered spins is 8''J''. The difference in energy between all spins equal and nonstaggered but net zero spin is 4''J''. Ignoring four-spin interactions, a reasonable truncation is the average of these two energies or 6''J''. Since each link will contribute to two odd spins, the right value to compare with the previous one is half that:
<math display="block">3J' = \ln(\cosh[4J]).</math>
 
For small ''J'', this quickly flows to zero coupling. Large values of ''J'' flow to large couplings. The magnetization exponent is determined from the slope of the equation at the fixed point.
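The flow can be iterated numerically: the nontrivial fixed point of 3''J''′ = ln cosh 4''J'' and the slope there take only a few lines. A sketch, assuming the length rescaling factor √2 of this decimation when converting the eigenvalue to an exponent:

```python
import math

def flow(J):
    # One decimation step of the truncated flow: 3 J' = ln cosh(4 J)
    return math.log(math.cosh(4 * J)) / 3

# Find the nontrivial fixed point J* = flow(J*) by bisection:
# flow(J) - J is negative at 0.3 and positive at 2.0
lo, hi = 0.3, 2.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if flow(mid) < mid:
        lo = mid
    else:
        hi = mid
J_star = 0.5 * (lo + hi)

# Slope of the flow at the fixed point: d/dJ [ln cosh(4J)/3] = (4/3) tanh(4J)
slope = (4 / 3) * math.tanh(4 * J_star)
nu = math.log(math.sqrt(2)) / math.log(slope)

print(J_star, slope, nu)  # J* ~ 0.69, slope > 1 (relevant), nu ~ 1.2
```

The crude truncation gives ν ≈ 1.24, compared with the exact two-dimensional value ν = 1; as the text notes, keeping more couplings improves the estimate.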
 
Variants of this method produce good numerical approximations for the critical exponents when many terms are included, in both two and three dimensions.
 
==See also==
{{div col|colwidth=25em}}
{{too many see alsos|date=November 2024}}
* [[ANNNI model]]
* [[Binder parameter]]
* [[Boltzmann machine]]
* [[Conformal bootstrap]]
* [[Construction of an irreducible Markov chain in the Ising model]]
* [[Geometrical frustration]]
* [[Geometrically frustrated magnet]]
* [[Heisenberg model (classical)|Classical Heisenberg model]]
* [[Heisenberg model (quantum)|Quantum Heisenberg model]]
* [[Hopfield net]]
* [[Ising critical exponents]]
* [[John Clive Ward|J. C. Ward]]
* [[Kuramoto model]]
* [[Maximal evenness]]
* [[Order operator]]
* [[Potts model]] (common with [[Ashkin–Teller model]])
* [[Spin model]]s
* [[Swendsen–Wang algorithm]]
* [[t-J model]]
* [[Classical XY model]]
* [[ZN model]]
{{div col end}}
 
==Footnotes==
 
==References==
{{Refbegin|30em}}
*{{Citation | last1=Barth | first1=P. F. |author-link1=Peter F. Barth | year=1981 | title= Cooperativity and the Transition Behavior of Large Neural Nets | pages=1–118 | journal= Master of Science Thesis | publisher= University of Vermont | ___location= Burlington |oclc=8231704 }}
*{{Citation | last1=Baxter | first1=Rodney J. | title=Exactly solved models in statistical mechanics | url=https://physics.anu.edu.au/theophys/baxter_book.php | publisher=Academic Press Inc. [Harcourt Brace Jovanovich Publishers] | ___location=London | isbn=978-0-12-083180-7 | mr=690578 | year=1982 }}
* {{Citation | last = Lenz | first = W. | author-link = Wilhelm Lenz | year = 1920 | title = Beiträge zum Verständnis der magnetischen Eigenschaften in festen Körpern | journal = Physikalische Zeitschrift | volume = 21 | pages = 613–615 }}
* Barry M. McCoy and Tai Tsun Wu (1973), ''The Two-Dimensional Ising Model''. Harvard University Press, Cambridge Massachusetts, {{ISBN|0-674-91440-6}}
*{{Citation | last1=Montroll | first1=Elliott W. | last2=Potts | first2=Renfrey B. | last3=Ward | first3=John C. | author-link3=John Clive Ward | title=Correlations and spontaneous magnetization of the two-dimensional Ising model | url=http://link.aip.org/link/?JMAPAQ%2F4%2F308%2F1 | doi=10.1063/1.1703955 | mr=0148406 | year=1963 | journal=[[Journal of Mathematical Physics]] | issn=0022-2488 | volume=4 | pages=308–322 | bibcode=1963JMP.....4..308M | issue=2 | url-status=dead | archive-url=https://archive.today/20130112095848/http://link.aip.org/link/?JMAPAQ/4/308/1 | archive-date=2013-01-12 | access-date=2009-10-25 | url-access=subscription }}
*{{Citation | last1=Onsager | first1=Lars | author-link1= Lars Onsager|title=Crystal statistics. I. A two-dimensional model with an order-disorder transition | doi=10.1103/PhysRev.65.117 | mr=0010315 | year=1944 | journal= Physical Review | series = Series II | volume=65 | pages=117–149|bibcode = 1944PhRv...65..117O | issue=3–4 }}
*{{Citation |last=Onsager |first=Lars |author-link=Lars Onsager|title=Discussion|journal=Supplemento al Nuovo Cimento | volume=6|page=261|year=1949}}
* John Palmer (2007), ''Planar Ising Correlations''. Birkhäuser, Boston, {{ISBN|978-0-8176-4248-8}}.
*{{Citation | last1=Istrail | first1=Sorin | title=Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing | chapter-url=https://www.cs.brown.edu/~sorin/pdfs/Ising-paper.pdf | publisher=ACM | mr=2114521 | year=2000 | chapter=Statistical mechanics, three-dimensionality and NP-completeness. I. Universality of intractability for the partition function of the Ising model across non-planar surfaces (extended abstract) | pages=87–96 | doi=10.1145/335305.335316 | isbn=978-1581131840 | s2cid=7944336 }}
*{{Citation | last1=Yang | first1=C. N. | author-link1=C. N. Yang| title=The spontaneous magnetization of a two-dimensional Ising model | doi=10.1103/PhysRev.85.808 | mr=0051740 | year=1952 | journal=Physical Review | series = Series II | volume=85 | pages=808–816|bibcode = 1952PhRv...85..808Y | issue=5 }}
*{{Citation | last1=Glasser | first1=M. L. | year=1970 | title= Exact Partition Function for the Two-Dimensional Ising Model | journal=American Journal of Physics | volume=38 | issue=8 | pages=1033–1036 | doi=10.1119/1.1976530 | bibcode=1970AmJPh..38.1033G }}
* [http://ibiblio.org/e-notes/Perc/contents.htm Phase transitions on lattices]
* [http://www.sandia.gov/media/NewsRel/NR2000/ising.htm Three-dimensional proof for Ising Model impossible, Sandia researcher claims]
* [http://isingspinwebgl.com Interactive Monte Carlo simulation of the Ising, XY and Heisenberg models with 3D graphics (requires WebGL compatible browser)]
* [https://github.com/AmazaspShumik/BayesianML-MCMC/blob/master/Gibbs%20Ising%20Model/GibbsIsingModel.m Ising Model code], [https://github.com/AmazaspShumik/BayesianML-MCMC/blob/master/Gibbs%20Ising%20Model/imageDenoisingExample.m image denoising example with Ising Model]
* [http://www.damtp.cam.ac.uk/user/tong/statphys/five.pdf David Tong's Lecture Notes] provide a good introduction