{{Use dmy dates|date=October 2017}}
{{Mcn|date=May 2025}}

'''Maximal entropy random walk''' ('''MERW''') is a popular type of [[biased random walk on a graph]], in which transition probabilities are chosen according to the [[principle of maximum entropy]], which says that the probability distribution that best represents the current state of knowledge is the one with the largest entropy. While a standard [[random walk]] chooses for every vertex a uniform probability distribution among its outgoing edges, locally maximizing the [[entropy rate]], MERW maximizes it globally (average entropy production) by assuming a uniform probability distribution among all paths in a given graph.

MERW is used in various fields of science. A direct application is choosing probabilities to maximize transmission rate through a constrained channel, analogously to [[Fibonacci coding]]. Its properties also made it useful for example in analysis of complex networks,<ref name="SinatraGómez-Gardeñes2011">{{cite journal|last1=Sinatra|first1=Roberta|last2=Gómez-Gardeñes|first2=Jesús|last3=Lambiotte|first3=Renaud|last4=Nicosia|first4=Vincenzo|last5=Latora|first5=Vito|title=Maximal-entropy random walks in complex networks with limited information|journal=Physical Review E|volume=83|issue=3|pages=030103|year=2011|issn=1539-3755|doi=10.1103/PhysRevE.83.030103|pmid=21517435|url=http://www.robertasinatra.com/pdf/sinatra_maximal_entropy.pdf|bibcode=2011PhRvE..83c0103S|arxiv=1007.4936|s2cid=6984660 }}</ref> like [[link prediction]],<ref name="LiYu2011">{{cite conference|last1=Li|first1=Rong-Hua|last2=Yu|first2=Jeffrey Xu|last3=Liu|first3=Jianquan|title=Link prediction: the power of maximal entropy random walk|year=2011|pages=1147|doi=10.1145/2063576.2063741|s2cid=15309519 |url=https://pdfs.semanticscholar.org/f185/bc2499be95b4312169a7e722bac570c2d509.pdf|archive-url=https://web.archive.org/web/20170212090812/https://pdfs.semanticscholar.org/f185/bc2499be95b4312169a7e722bac570c2d509.pdf|url-status=dead|archive-date=2017-02-12|conference=Association for Computing Machinery Conference on Information and Knowledge Management|conference-url=http://www.cikm2011.org/}}</ref> community detection,<ref name="OchabBurda2013">{{cite journal|last1=Ochab|first1=J.K.|last2=Burda|first2=Z.|title=Maximal entropy random walk in community detection|journal=The European Physical Journal Special Topics|volume=216|issue=1|year=2013|pages=73–81|issn=1951-6355|doi=10.1140/epjst/e2013-01730-6|arxiv=1208.3688|bibcode=2013EPJST.216...73O|s2cid=56409069 }}</ref>
robust transport over networks<ref name="CGPT2016">{{cite journal|last1=Chen|first1=Y.|last2=Georgiou|first2=T.T.|last3=Pavon|first3=M.|last4=Tannenbaum|first4=A.|title=Robust transport over networks|journal=IEEE Transactions on Automatic Control|volume=62|issue=9|year=2016|pages=4675–4682|doi=10.1109/TAC.2016.2626796|pmid=28924302|pmc=5600536|arxiv=1603.08129|bibcode=2016arXiv160308129C}}</ref> and [[centrality]] measures.<ref name="DelvenneLibert2011">{{cite journal|last1=Delvenne|first1=Jean-Charles|last2=Libert|first2=Anne-Sophie|title=Centrality measures and thermodynamic formalism for complex networks|journal=Physical Review E|volume=83|issue=4|pages=046117|year=2011|issn=1539-3755|doi=10.1103/PhysRevE.83.046117|pmid=21599250|arxiv=0710.3972|bibcode=2011PhRvE..83d6117D|s2cid=25816198 }}</ref> It is also used in [[image analysis]], for example for detecting visual saliency regions,<ref name="saliency">{{cite journal | title=Maximal Entropy Random Walk for Region-Based Visual Saliency | journal=IEEE Transactions on Cybernetics | publisher=Institute of Electrical and Electronics Engineers (IEEE) | volume=44 | issue=9 | year=2014 | issn=2168-2267 | doi=10.1109/tcyb.2013.2292054 | pmid=25137693 | pages=1661–1672| last1=Jin-Gang Yu | last2=Ji Zhao | last3=Jinwen Tian | last4=Yihua Tan | s2cid=20962642 }}</ref> object localization,<ref name="local">[https://ieeexplore.ieee.org/abstract/document/6678551/ L. Wang, J. Zhao, X. Hu, J. Lu, ''Weakly supervised object localization via maximal entropy random walk''], ICIP, 2014.</ref> tampering detection<ref name="tamp">{{cite journal | last1=Korus | first1=Pawel | last2=Huang | first2=Jiwu | title=Improved Tampering Localization in Digital Image Forensics Based on Maximal Entropy Random Walk | journal=IEEE Signal Processing Letters | publisher=Institute of Electrical and Electronics Engineers (IEEE) | volume=23 | issue=1 | year=2016 | issn=1070-9908 | doi=10.1109/lsp.2015.2507598 | pages=169–173| bibcode=2016ISPL...23..169K | s2cid=16305991 }}</ref> or [[tractography]] problem.<ref name="trac">{{cite journal | last1=Galinsky | first1=Vitaly L. | last2=Frank | first2=Lawrence R. | title=Simultaneous Multi-Scale Diffusion Estimation and Tractography Guided by Entropy Spectrum Pathways | journal=IEEE Transactions on Medical Imaging | publisher=Institute of Electrical and Electronics Engineers (IEEE) | volume=34 | issue=5 | year=2015 | issn=0278-0062 | doi=10.1109/tmi.2014.2380812 | pmid=25532167 | pmc=4417445 | pages=1177–1193}}</ref>
Additionally, it recreates some properties of [[quantum mechanics]], suggesting a way to repair the discrepancy between [[diffusion]] models and quantum predictions, like [[Anderson localization]].<ref name=prl>[http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.102.160602 Z. Burda, J. Duda, J. M. Luck, and B. Waclaw, ''Localization of the Maximal Entropy Random Walk''], Phys. Rev. Lett., 2009.</ref>
== Basic model ==
[[Image:General picture for Maximal Entropy Random Walk.png|thumb|upright=2.65|right|'''Left:''' basic concept of the generic random walk (GRW) and maximal entropy random walk (MERW)<br>'''Right:''' example of their evolution on the same inhomogeneous 2D lattice with cyclic boundary conditions – probability density after 10, 100 and 1000 steps while starting from the same vertex. The small boxes represent defects: all vertices but the marked ones have an additional [[self-loop]] (edge to itself). For regular lattices (no defects), GRW and MERW are identical. While defects do not strongly affect the local behavior, they lead to a completely different global stationary probability here. While GRW (and standard [[diffusion]], which is based on it) leads to a nearly uniform stationary density, MERW has a strong localization property, imprisoning the walkers in entropic wells, in analogy to electrons in a defected lattice of a [[semiconductor]].]]
Consider a graph with <math>n</math> vertices, defined by an adjacency matrix <math>A \in \{0,1\}^{n \times n}</math>: <math>A_{ij}=1</math> if there is an edge from vertex <math>i</math> to vertex <math>j</math>, and <math>A_{ij}=0</math> otherwise. We would like to choose a random walk as a [[Markov process]] on this graph: for every vertex <math>i</math> and its outgoing edge to <math>j</math>, choose the probability <math>S_{ij}</math> of the walker randomly using this edge after visiting <math>i</math>. Formally, find a [[stochastic matrix]] <math>S</math> (containing the transition probabilities of a Markov chain) such that
* <math>0\leq S_{ij} \leq A_{ij}</math> for all <math>i, j</math> and
* <math>\sum_{j=1}^n S_{ij}=1</math> for all <math>i</math>.
Assuming this graph is connected and not [[Periodic graph (graph theory)|periodic]], [[ergodic theory]] says that evolution of this [[stochastic process]] leads to some stationary probability distribution <math>\rho</math> such that <math>\rho S = \rho</math>.
Using [[Shannon entropy]] for every vertex and averaging over the probability of visiting this vertex (to be able to use its entropy), we get the following formula for the average entropy production ([[entropy rate]]) of the stochastic process:
:<math>H(S)=\sum_{i=1}^n \rho_i \sum_{j=1}^n S_{ij} \log(1/S_{ij})</math>
This definition turns out to be equivalent to the asymptotic average entropy (per step) of the probability distribution in the space of paths for this stochastic process.
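For a concrete check of this formula, the entropy rate can be evaluated numerically from <math>S</math> and <math>\rho</math>. The following is a minimal Python/NumPy sketch (the function name is illustrative, and the convention <math>0\log(1/0)=0</math> is applied for absent edges):
<syntaxhighlight lang="python">
import numpy as np

def entropy_rate(S, rho):
    """Average entropy production H(S) = sum_i rho_i sum_j S_ij log(1/S_ij)."""
    S = np.asarray(S, dtype=float)
    rho = np.asarray(rho, dtype=float)
    # treat 0 * log(1/0) as 0 for absent edges (S_ij = 0)
    logs = np.log(np.where(S > 0, S, 1.0))
    return float(rho @ (-(S * logs)).sum(axis=1))
</syntaxhighlight>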
In the standard random walk, referred to here as generic random walk (GRW), we naturally choose that each outgoing edge is equally probable:
:<math>S_{ij} = \frac{A_{ij}}{\sum\limits_{k=1}^n A_{ik}}</math>.
For a symmetric <math>A</math> it leads to a stationary probability distribution <math>\rho</math> with
:<math>\rho_i = \frac{\sum\limits_{j=1}^n A_{ij}}{\sum\limits_{j=1}^n \sum\limits_{k=1}^n A_{jk}}</math>,
i.e. proportional to the degree of vertex <math>i</math>.
It locally maximizes entropy production (uncertainty) for every vertex, but usually leads to a suboptimal averaged global entropy rate <math>H(S)</math>.
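As a minimal illustration (not part of the referenced literature), the GRW transition matrix and its stationary distribution for a symmetric adjacency matrix can be computed as follows; the variable and function names are illustrative:
<syntaxhighlight lang="python">
import numpy as np

def grw(A):
    """Generic random walk (GRW): every outgoing edge equally probable."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1)
    S = A / degree[:, None]         # S_ij = A_ij / sum_k A_ik
    rho = degree / degree.sum()     # stationary distribution for symmetric A
    return S, rho
</syntaxhighlight>
One can verify numerically that <code>rho @ S</code> reproduces <code>rho</code>, and compare <code>entropy_rate(S, rho)</code> with the MERW entropy rate <math>\log(\lambda)</math> below.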
MERW chooses the stochastic matrix which maximizes <math>H(S)</math>, or equivalently assumes uniform probability distribution among all paths in a given graph. Its formula is obtained by first calculating the dominant [[eigenvalue]] <math>\lambda</math> and corresponding [[eigenvector]] <math>\psi</math> of the adjacency matrix, i.e. the largest <math>\lambda \in \mathbb{R}</math> with corresponding <math>\psi \in \mathbb{R}^n</math> such that <math>\psi A = \lambda \psi</math>. Then the stochastic matrix and stationary probability distribution are given by
:<math>S_{ij} = \frac{A_{ij}}{\lambda} \frac{\psi_j}{\psi_i}</math>
for which every possible path of length <math>l</math> from the <math>i</math>-th to <math>j</math>-th vertex has probability
:<math>\frac{1}{\lambda^l} \frac{\psi_j}{\psi_i}</math>.
Its entropy rate is <math>\log(\lambda)</math> and the stationary probability distribution <math>\rho</math> is
:<math>\rho_i = \frac{\psi_i^2}{\|\psi\|_2^2}</math>.
In contrast to GRW, the MERW transition probabilities generally depend on the structure of the entire graph; they are nonlocal.
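These formulas translate directly into a short numerical procedure. The sketch below assumes a symmetric adjacency matrix of a connected graph, uses NumPy's symmetric eigensolver, and the names are illustrative:
<syntaxhighlight lang="python">
import numpy as np

def merw(A):
    """MERW for a symmetric adjacency matrix of a connected graph."""
    A = np.asarray(A, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(A)           # symmetric A: real spectrum, ascending order
    lam = eigvals[-1]                              # dominant eigenvalue
    psi = np.abs(eigvecs[:, -1])                   # Perron vector (all entries positive)
    S = (A / lam) * (psi[None, :] / psi[:, None])  # S_ij = (A_ij / lam) * (psi_j / psi_i)
    rho = psi**2 / np.sum(psi**2)                  # rho_i = psi_i^2 / ||psi||_2^2
    return S, rho, lam
</syntaxhighlight>
Its entropy rate can then be checked against <math>\log(\lambda)</math> using the <code>entropy_rate</code> sketch above.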
=== Sketch of derivation ===
Assume for simplicity that the considered graph is undirected, connected and aperiodic, so that the [[Perron–Frobenius theorem]] applies: the dominant eigenvalue <math>\lambda</math> is unique and its eigenvector <math>\psi</math> can be chosen with all entries positive. Taking <math>\psi</math> normalized (<math>\|\psi\|_2 = 1</math>), <math>A^l</math> is for large <math>l</math> dominated by <math>\lambda^l \psi \psi^\top</math>.
MERW requires a uniform distribution along paths. The number <math>m_{il}</math> of paths with length <math>2l</math> and vertex <math>i</math> in the center is
:<math>m_{il} = \sum_{j=1}^n \sum_{k=1}^n \left(A^l\right)_{ji} \left(A^l\right)_{ik} \approx \sum_{j=1}^n \sum_{k=1}^n \left(\lambda^l \psi \psi^\top\right)_{ji} \left(\lambda^l \psi \psi^\top\right)_{ik} = \sum_{j=1}^n \sum_{k=1}^n \lambda^{2l} \psi_j \psi_i \psi_i \psi_k = \lambda^{2l} \psi_i^2 \underbrace{\sum_{j=1}^n \psi_j \sum_{k=1}^n \psi_k}_{=: b}</math>,
hence for all <math>i</math>,
:<math>\rho_i = \lim_{l \rightarrow \infty} \frac{m_{il}}{\sum\limits_{k=1}^n m_{kl}} = \lim_{l \rightarrow \infty} \frac{\lambda^{2l} \psi_i^2 b}{\sum\limits_{k=1}^n \lambda^{2l} \psi_k^2 b} = \lim_{l \rightarrow \infty} \frac{\psi_i^2}{\sum\limits_{k=1}^n \psi_k^2} = \frac{\psi_i^2}{\sum\limits_{k=1}^n \psi_k^2} = \frac{\psi_i^2}{\|\psi\|_2^2}</math>.
Analogously, calculating the probability distribution for two successive vertices, one obtains that the probability of being at the <math>i</math>-th vertex and next at the <math>j</math>-th vertex is
:<math>\frac{\psi_i A_{ij} \psi_j}{\sum\limits_{i'=1}^n \sum\limits_{j'=1}^n \psi_{i'} A_{i'j'} \psi_{j'}} = \frac{\psi_i A_{ij} \psi_j}{\psi A \psi^\top} = \frac{\psi_i A_{ij} \psi_j}{\lambda \|\psi\|_2^2}</math>.
Dividing by the probability <math>\rho_i = \psi_i^2 / \|\psi\|_2^2</math> of being at the <math>i</math>-th vertex gives the conditional probability of <math>j</math> being the next vertex after <math>i</math>:
:<math>S_{ij} = \frac{A_{ij}}{\lambda} \frac{\psi_j}{\psi_i}</math>.
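The limit above can be checked numerically on a small graph; the following sketch uses an arbitrary illustrative example (a connected, aperiodic, undirected graph), and shows that the fraction of length-<math>2l</math> paths having vertex <math>i</math> in the center approaches <math>\psi_i^2/\|\psi\|_2^2</math>:
<syntaxhighlight lang="python">
import numpy as np

# arbitrary small connected, aperiodic, undirected example graph
A = np.array([[1, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

eigvals, eigvecs = np.linalg.eigh(A)
psi = np.abs(eigvecs[:, -1])

l = 50
Al = np.linalg.matrix_power(A, l)
m = Al.sum(axis=0) * Al.sum(axis=1)   # m_il = (sum_j (A^l)_ji) * (sum_k (A^l)_ik)
print(m / m.sum())                    # fraction of length-2l paths centered at each vertex
print(psi**2 / np.sum(psi**2))        # rho_i = psi_i^2 / ||psi||_2^2, nearly identical
</syntaxhighlight>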
=== Weighted MERW: Boltzmann path ensemble ===
We have assumed that <math>A_{ij} \in \{0,1\} </math>, yielding a MERW corresponding to the uniform ensemble among paths. However, the above derivation works for any real nonnegative <math>A</math> for which the Perron-Frobenius theorem applies. Given <math>A_{ij} = \exp(-E_{ij}) </math>, the probability of a particular length-<math>l </math> path <math>(\gamma_0, \ldots,\gamma_l) </math> is as follows:
:<math>\textrm{Pr}(\gamma_0, \ldots,\gamma_l)=\rho_{\gamma_0} S_{\gamma_0 \gamma_1}\ldots S_{\gamma_{l-1}\gamma_l}= \psi_{\gamma_0} \frac{A_{\gamma_0 \gamma_1}\ldots A_{\gamma_{l-1}\gamma_l}}{\lambda^l} \psi_{\gamma_l}=\psi_{\gamma_0}\frac{\exp(-(E_{\gamma_0 \gamma_1}+\ldots +E_{\gamma_{l-1}\gamma_l}))}{\lambda^l} \psi_{\gamma_l} </math>,
which is the same as the [[Boltzmann distribution]] of paths with energy defined as the sum of <math>E_{ij} </math> over the edges of the path. For example, this can be used with the transfer matrix to calculate the probability distribution of patterns in the [[Ising model]].
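For illustration of the weighted case (a sketch with arbitrary example energies, not taken from the cited sources), the same eigenvector construction applies with <math>A_{ij}=\exp(-E_{ij})</math>:
<syntaxhighlight lang="python">
import numpy as np

# illustrative symmetric edge energies E_ij on a small complete graph with self-loops
E = np.array([[1.0, 0.2, 0.7],
              [0.2, 0.5, 0.3],
              [0.7, 0.3, 0.1]])
A = np.exp(-E)                                   # Boltzmann weights A_ij = exp(-E_ij)

eigvals, eigvecs = np.linalg.eigh(A)
lam, psi = eigvals[-1], np.abs(eigvecs[:, -1])
S = (A / lam) * (psi[None, :] / psi[:, None])
rho = psi**2 / np.sum(psi**2)

# probability of the particular path (0, 1, 2): rho_0 * S_01 * S_12
path = [0, 1, 2]
p = rho[path[0]] * np.prod([S[a, b] for a, b in zip(path, path[1:])])
print(p)
</syntaxhighlight>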
== Examples ==
[[Image:Examples of using MERW, Fibonacci coding(left) and 1D defected lattice (right).png|thumb|320px|right|Left: choosing the optimal probability after symbol 0 in [[Fibonacci coding]]. Right: a one-dimensional defected lattice and its stationary density for the defected cyclic lattice of 1000 nodes discussed in the text.]]
Let us first look at a simple nontrivial example: [[Fibonacci coding]], where we want to transmit a message as a sequence of 0s and 1s, but without using two successive 1s – after a 1 there has to be a 0. To maximize the amount of information transmitted in such a sequence, we should choose a uniform probability distribution on the space of all sequences satisfying this constraint.
To practically use such long sequences, after 1 we have to use 0, but there remains the freedom of choosing the probability of 0 after 0. Let us denote this probability <math>q</math>. [[Entropy coding]] allows encoding a message using this chosen probability distribution. The stationary probability distribution of symbols for a given <math>q</math> turns out to be <math>\rho=(1/(2-q),1-1/(2-q)) </math>. Hence, entropy produced is <math>H(S)=\rho_0 \left(q\log(1/q)+(1-q)\log(1/(1-q))\right)</math>, which is maximized for <math>q=(\sqrt{5}-1)/2\approx 0.618</math>, known as the [[golden ratio]]. In contrast, a standard random walk would choose the suboptimal <math>q=0.5</math>. While choosing a larger <math>q</math> reduces the amount of information produced after 0, it also reduces the frequency of 1, after which we cannot write any information.
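This optimum can also be obtained directly from the MERW construction on the two-state constraint graph (state 0: the last symbol was 0; state 1: the last symbol was 1). A minimal numerical sketch:
<syntaxhighlight lang="python">
import numpy as np

# constraint graph of Fibonacci coding: after 0 either symbol, after 1 only 0
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])

eigvals, eigvecs = np.linalg.eigh(A)
lam, psi = eigvals[-1], np.abs(eigvecs[:, -1])
S = (A / lam) * (psi[None, :] / psi[:, None])

print(lam)      # golden ratio (1 + sqrt(5)) / 2 ~ 1.618
print(S[0, 0])  # optimal q = (sqrt(5) - 1) / 2 ~ 0.618
</syntaxhighlight>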
A more complex example is a defected one-dimensional cyclic lattice: say, 1000 nodes connected in a ring, where all nodes except the defects have a [[self-loop]] (edge to itself). In a standard random walk (GRW) the stationary probability of a defect would be 2/3 of the probability of the remaining vertices – there is nearly no localization; the same holds for standard [[diffusion]], which is the infinitesimal limit of GRW. For MERW we first have to find the dominant eigenvector of the adjacency matrix – maximizing <math>\lambda</math> in:
:<math>(\lambda \psi)_x = (A\psi)_x = \psi_{x-1}+ (1-V_x) \psi_x +\psi_{x+1}</math>
for all positions <math>x</math>, where <math>V_x = 1</math> for the defects and <math>V_x = 0</math> otherwise. This eigenvalue equation can be rewritten as
:<math>E \psi_x =-(\psi_{x-1} -2 \psi_x +\psi_{x+1}) + V_x \psi_x</math>
where <math>E = 3-\lambda</math> is now minimized, becoming the analog of energy. The formula inside the bracket is the [[discrete Laplace operator]], making this equation a discrete analogue of the stationary [[Schrödinger equation]], with <math>\psi</math> playing the role of the ground-state wavefunction. Accordingly, the MERW stationary density <math>\rho_x \propto \psi_x^2</math> is strongly localized in the largest defect-free region, in contrast to the nearly uniform GRW density.
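A numerical sketch of this example (the defect positions are illustrative choices; the lattice size of 1000 matches the text) compares the nearly uniform GRW density with the localized MERW density:
<syntaxhighlight lang="python">
import numpy as np

n = 1000
defects = [100, 300, 350]           # illustrative defect positions

A = np.zeros((n, n))
for x in range(n):
    A[x, (x - 1) % n] = A[x, (x + 1) % n] = 1.0  # ring edges
    if x not in defects:
        A[x, x] = 1.0                            # self-loop on every regular node

# GRW: stationary density proportional to vertex degree (nearly uniform)
degree = A.sum(axis=1)
rho_grw = degree / degree.sum()

# MERW: stationary density is the squared dominant eigenvector (strongly localized)
eigvals, eigvecs = np.linalg.eigh(A)
psi = np.abs(eigvecs[:, -1])
rho_merw = psi**2 / np.sum(psi**2)

print(rho_grw[defects[0]] / rho_grw[0])   # ~ 2/3, as stated in the text
print(rho_merw.max() / rho_merw.mean())   # large: walkers concentrated in the largest defect-free region
</syntaxhighlight>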
==See also==
==External links==
* [http://www.nature.com/articles/srep05365 Y. Lin, Z. Zhang, "Mean first-passage time for maximal-entropy random walks in complex networks"]. Scientific Reports, 2014.
* [http://demonstrations.wolfram.com/ElectronConductanceModelsUsingMaximalEntropyRandomWalks/ Electron Conductance Models Using Maximal Entropy Random Walks] Wolfram Demonstration Project
{{Stochastic processes}}
[[Category:Network theory]]