{{Use dmy dates|date=October 2017}}
'''Maximal entropy random walk''' ('''MERW''') is a popular type of [[biased random walk on a graph]], in which transition probabilities are chosen according to the [[principle of maximum entropy]], which says that the [[probability distribution]] which best represents the current state of knowledge is the one with largest entropy. While a standard [[random walk]] chooses, for every vertex, a uniform probability distribution among its outgoing edges, locally maximizing the [[entropy rate]], MERW maximizes the entropy rate globally (the average [[entropy production]]) by assuming a uniform probability distribution among all paths in a given graph.
{{Mcn|date=May 2025}}
MERW is used in various fields of science. A direct application is choosing probabilities to maximize transmission rate through a constrained channel, analogously to [[Fibonacci coding]]. Its properties also made it useful for example in analysis of complex networks,<ref name="SinatraGómez-Gardeñes2011">{{cite journal|last1=Sinatra|first1=Roberta|last2=Gómez-Gardeñes|first2=Jesús|last3=Lambiotte|first3=Renaud|last4=Nicosia|first4=Vincenzo|last5=Latora|first5=Vito|title=Maximal-entropy random walks in complex networks with limited information|journal=Physical Review E|volume=83|issue=3|pages=030103|year=2011|issn=1539-3755|doi=10.1103/PhysRevE.83.030103|pmid=21517435|url=http://www.robertasinatra.com/pdf/sinatra_maximal_entropy.pdf|bibcode=2011PhRvE..83c0103S|arxiv=1007.4936|s2cid=6984660 }}</ref> like [[link prediction]],<ref name="LiYu2011">{{cite conference|last1=Li|first1=Rong-Hua|last2=Yu|first2=Jeffrey Xu|last3=Liu|first3=Jianquan|title=Link prediction: the power of maximal entropy random walk|year=2011|pages=1147|doi=10.1145/2063576.2063741|s2cid=15309519 |url=https://pdfs.semanticscholar.org/f185/bc2499be95b4312169a7e722bac570c2d509.pdf|archive-url=https://web.archive.org/web/20170212090812/https://pdfs.semanticscholar.org/f185/bc2499be95b4312169a7e722bac570c2d509.pdf|url-status=dead|archive-date=2017-02-12|conference=Association for Computing Machinery Conference on Information and Knowledge Management|conference-url=http://www.cikm2011.org/}}</ref> community detection,<ref name="OchabBurda2013">{{cite journal|last1=Ochab|first1=J.K.|last2=Burda|first2=Z.|title=Maximal entropy random walk in community detection|journal=The European Physical Journal Special Topics|volume=216|issue=1|year=2013|pages=73–81|issn=1951-6355|doi=10.1140/epjst/e2013-01730-6|arxiv=1208.3688|bibcode=2013EPJST.216...73O|s2cid=56409069 }}</ref>
robust transport over networks<ref name="CGPT2016">{{cite journal|last1=Chen|first1=Y.|last2=Georgiou|first2=T.T.|last3=Pavon|first3=M.|last4=Tannenbaum|first4=A.|title=Robust transport over networks|journal=IEEE Transactions on Automatic Control|volume=62|issue=9|year=2016|pages=4675–4682|doi=10.1109/TAC.2016.2626796|pmid=28924302|pmc=5600536|arxiv=1603.08129|bibcode=2016arXiv160308129C}}</ref> and [[centrality]] measures.<ref name="DelvenneLibert2011">{{cite journal|last1=Delvenne|first1=Jean-Charles|last2=Libert|first2=Anne-Sophie|title=Centrality measures and thermodynamic formalism for complex networks|journal=Physical Review E|volume=83|issue=4|pages=046117|year=2011|issn=1539-3755|doi=10.1103/PhysRevE.83.046117|pmid=21599250|arxiv=0710.3972|bibcode=2011PhRvE..83d6117D|s2cid=25816198 }}</ref>
Additionally, it recreates some properties of [[quantum mechanics]], suggesting a way to repair the discrepancy between [[diffusion]] models and quantum predictions, like [[Anderson localization]].<ref name=prl>{{cite journal | last1=Burda | first1=Z. | last2=Duda | first2=J. | last3=Luck | first3=J. M. | last4=Waclaw | first4=B. | title=Localization of the Maximal Entropy Random Walk | journal=Physical Review Letters | volume=102 | issue=16 | date=2009-04-23 | issn=0031-9007 | doi=10.1103/physrevlett.102.160602 | pmid=19518691 | page=160602| bibcode=2009PhRvL.102p0602B | arxiv=0810.4113 | s2cid=32134048 }}</ref>
==Basic model==
[[Image:General picture for Maximal Entropy Random Walk.png|thumb|upright=2.65|right|'''Left:''' basic concept of the generic random walk (GRW) and maximal entropy random walk (MERW)<br>'''Right:''' example of their evolution on the same inhomogeneous 2D lattice with cyclic boundary conditions – probability density after 10, 100 and 1000 steps while starting from the same vertex. The small boxes represent defects: all vertices but the marked ones have an additional [[self-loop]] (edge to itself). For regular lattices (no defects), GRW and MERW are identical. While defects do not strongly affect the local behavior, they lead to a completely different global stationary probability here. While GRW (and standard [[diffusion]] based on it) leads to nearly uniform stationary density, MERW has a strong localization property, imprisoning the walkers in entropic wells in analogy to electrons in the defective lattice of a [[semiconductor]].]]
Consider a [[Graph (discrete mathematics)|graph]] with <math>n</math> vertices, defined by an [[adjacency matrix]] <math>A \in \left\{0, 1\right\}^{n \times n}</math>: <math>A_{ij}=1</math> if there is an edge from vertex <math>i</math> to <math>j</math>, 0 otherwise. For simplicity, assume it is an [[Graph (discrete mathematics)#Undirected graph|undirected graph]], which corresponds to a symmetric <math>A</math>; however, MERW can also be generalized to directed and weighted graphs.
We would like to choose a random walk as a [[Markov process]] on this graph: for every vertex <math>i</math> and its outgoing edge to <math>j</math>, choose the probability <math>S_{ij}</math> that the walker uses this edge after visiting <math>i</math>. Formally, find a [[stochastic matrix]] <math>S</math> (containing the transition probabilities of a Markov chain) such that
The generic random walk (GRW) chooses a uniform probability distribution among the outgoing edges of each vertex: <math>S_{ij} = \frac{A_{ij}}{k_i}</math>, where <math>k_i = \sum_{j=1}^n A_{ij}</math> is the degree of vertex <math>i</math>.
It locally maximizes entropy production (uncertainty) for every vertex, but usually leads to a suboptimal averaged global entropy rate <math>H(S)</math>.
MERW chooses the stochastic matrix which maximizes <math>H(S)</math>, or equivalently assumes uniform probability distribution among all paths in a given graph. Its formula is obtained by first calculating the dominant [[eigenvalue]] <math>\lambda</math> and corresponding [[eigenvector]] <math>\psi</math> of the adjacency matrix, i.e. the largest <math>\lambda \in \mathbb{R}</math> with corresponding <math>\psi \in \mathbb{R}^n</math> such that <math>\psi A = \lambda \psi</math>. Then the stochastic matrix and stationary probability distribution are given by
:<math>S_{ij} = \frac{A_{ij}}{\lambda} \frac{\psi_j}{\psi_i}</math>
for which every possible path of length <math>l</math> from the <math>i</math>-th to <math>j</math>-th vertex has probability
:<math>\frac{1}{\lambda^l} \frac{\psi_j}{\psi_i}</math>,
since the intermediate <math>\psi</math> factors cancel in the product of transition probabilities along the path. Its stationary probability distribution is
:<math>\rho_i = \frac{\psi_i^2}{\|\psi\|_2^2}</math>.
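The construction above can be sketched numerically with NumPy. The graph used here (a 4-cycle) is an arbitrary illustrative choice, not one from the article; since a cycle is regular, it also illustrates the statement that MERW and GRW coincide on regular graphs.

```python
import numpy as np

def merw(A):
    """MERW transition matrix S and stationary distribution rho
    for a symmetric 0/1 adjacency matrix A of a connected graph."""
    eigvals, eigvecs = np.linalg.eigh(A)  # symmetric A: real spectrum, ascending order
    lam = eigvals[-1]                     # dominant eigenvalue
    psi = np.abs(eigvecs[:, -1])          # Perron eigenvector, entries made positive
    S = (A / lam) * np.outer(1.0 / psi, psi)  # S_ij = (A_ij / lam) * (psi_j / psi_i)
    rho = psi**2 / np.sum(psi**2)             # rho_i = psi_i^2 / ||psi||_2^2
    return S, rho

# Example: a 4-cycle (2-regular), for which MERW reduces to GRW with S = A/2
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
S, rho = merw(A)
```

The rows of <code>S</code> sum to 1, and <code>rho</code> is stationary under it.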
In contrast to GRW, the MERW transition probabilities generally depend on the structure of the entire graph (they are nonlocal).
=== Sketch of derivation ===
Assume for simplicity that the considered graph is undirected, connected and aperiodic, so that by the [[Perron–Frobenius theorem]] the dominant eigenvalue is unique and its eigenvector <math>\psi</math> can be chosen to have all positive entries and unit norm, <math>\|\psi\|_2 = 1</math>.
MERW requires a uniform distribution along paths. The number <math>m_{il}</math> of paths with length <math>2l</math> and vertex <math>i</math> in the center is
:<math>m_{il} = \sum_{j=1}^n \sum_{k=1}^n \left(A^l\right)_{ji} \left(A^l\right)_{ik} \approx \sum_{j=1}^n \sum_{k=1}^n \left(\lambda^l \psi \psi^\top\right)_{ji} \left(\lambda^l \psi \psi^\top\right)_{ik} = \sum_{j=1}^n \sum_{k=1}^n \lambda^{2l} \psi_j \psi_i \psi_i \psi_k = \lambda^{2l} \psi_i^2 \underbrace{\sum_{j=1}^n \psi_j \sum_{k=1}^n \psi_k}_{=: b}</math>,
hence for all <math>i</math>,
:<math>\rho_i = \lim_{l \to \infty} \frac{m_{il}}{\sum_{j=1}^n m_{jl}} = \frac{\psi_i^2}{\sum_{j=1}^n \psi_j^2} = \frac{\psi_i^2}{\|\psi\|_2^2}</math>.
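This path-counting argument can be verified numerically: the fraction of long paths passing through each vertex converges to <math>\psi_i^2</math>. The small graph below is an arbitrary illustrative choice (it contains a triangle, so it is aperiodic).

```python
import numpy as np

# Arbitrary small connected, aperiodic graph (triangle on vertices 1, 2, 3)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
w, V = np.linalg.eigh(A)
psi = np.abs(V[:, -1])          # unit-norm dominant eigenvector

l = 50
Al = np.linalg.matrix_power(A, l)
# m_il = sum_j (A^l)_ji * sum_k (A^l)_ik: number of length-2l paths centred at i
m = Al.sum(axis=0) * Al.sum(axis=1)
rho_paths = m / m.sum()         # fraction of such paths centred at each vertex
rho_merw = psi**2               # psi has unit norm, so rho_i = psi_i^2
```

The two vectors agree up to corrections of order <math>(|\lambda_2|/\lambda)^l</math>, which is negligible here.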
== Examples ==
[[Image:Examples of using MERW, Fibonacci coding(left) and 1D defected lattice (right).png|thumb|320px|right|Left: choosing the optimal probability after symbol 0 in [[Fibonacci coding]]. Right: a one-dimensional defected lattice and its stationary probability density for MERW.]]
Let us first look at a simple nontrivial situation: Fibonacci coding, where we want to transmit a message as a sequence of 0s and 1s, but without using two successive 1s: after a 1 there has to be a 0. To maximize the amount of information transmitted in such a sequence, we should assume a uniform probability distribution over the space of all possible sequences fulfilling this constraint.
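This constraint corresponds to a walk on a two-vertex graph: state 0 ("last symbol was 0", may emit either symbol) and state 1 ("last symbol was 1", must emit 0), with a self-loop only at state 0. A minimal numerical sketch, using the generic MERW formulas from above:

```python
import numpy as np

# States: 0 = last symbol was 0 (may emit 0 or 1), 1 = last symbol was 1 (must emit 0)
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
w, V = np.linalg.eigh(A)
lam = w[-1]                       # dominant eigenvalue: the golden ratio phi
psi = np.abs(V[:, -1])            # proportional to (phi, 1)
S = (A / lam) * np.outer(1.0 / psi, psi)

phi = (1 + 5**0.5) / 2
# Optimal probability of emitting 1 after a 0 is S[0,1] = 1/phi^2 ~ 0.382 (not 1/2);
# the resulting entropy rate log2(phi) ~ 0.694 bits/symbol is the channel capacity.
```

The biased choice 1/φ² after a 0 is exactly what makes every allowed sequence of a given length equally likely.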
A more complex example is a defected one-dimensional cyclic lattice: every vertex is connected to its two neighbours and, except for a few marked defect vertices, also has a [[self-loop]]. Encoding the defects by a potential <math>V_x</math>, equal to 0 everywhere except at the defects where <math>V_x = 1</math>, the dominant eigenvector equation becomes
<math>(\lambda \psi)_x = (A\psi)_x = \psi_{x-1}+ (1-V_x) \psi_x +\psi_{x+1}</math>
which can be rewritten as
<math>E \psi_x =-(\psi_{x-1} -2 \psi_x +\psi_{x+1}) + V_x \psi_x</math>
where <math>E = 3-\lambda</math> is minimized now, becoming the analog of energy. The formula inside the bracket is the [[discrete Laplace operator]], making this equation a discrete analogue of the stationary [[Schrödinger equation]].
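The resulting localization can be reproduced numerically; the lattice size and defect positions below are arbitrary illustrative choices. The MERW stationary density concentrates in the defect-free regions, while the GRW density (proportional to vertex degree) stays nearly uniform.

```python
import numpy as np

N = 200
defects = (50, 120)                # arbitrary defect positions (unequal regions)
A = np.eye(N)                      # self-loop at every vertex ...
for d in defects:
    A[d, d] = 0.0                  # ... except at the defects (V_x = 1 there)
for i in range(N):                 # cyclic lattice: edges to both neighbours
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0

w, V = np.linalg.eigh(A)
psi = np.abs(V[:, -1])             # ground-state-like dominant eigenvector
rho = psi**2 / np.sum(psi**2)      # MERW stationary density: localized away from defects

deg = A.sum(axis=1)                # GRW stationary density ~ degree: nearly uniform
rho_grw = deg / deg.sum()
```

Plotting <code>rho</code> against <code>rho_grw</code> reproduces the qualitative contrast shown in the figures: entropic wells for MERW, near-uniformity for GRW.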
==See also==