Data processing inequality

{{Short description|Concept in information processing}}
The '''data processing inequality''' is an [[information theory|information theoretic]] concept that states that the information content of a signal cannot be increased via a local physical operation. This can be expressed concisely as 'post-processing cannot increase information'.<ref name=BeaudryArxiv>{{citation |journal=Quantum Information & Computation |volume=12 |issue=5–6 |pages=432–441 |last1=Beaudry |first1=Normand |title=An intuitive proof of the data processing inequality |date=2012 |doi=10.26421/QIC12.5-6-4 |arxiv=1107.0740 |bibcode=2011arXiv1107.0740B |s2cid=9531510}}</ref>
==Statement==
Let three random variables form the [[Markov chain]] <math>X \rightarrow Y \rightarrow Z</math>, implying that the conditional distribution of <math>Z</math> depends only on <math>Y</math> and is [[Conditional independence|conditionally independent]] of <math>X</math>. Specifically, we have such a Markov chain if the joint probability mass function can be written as
:<math>p(x,y,z) = p(x)p(y|x)p(z|y)=p(y)p(x|y)p(z|y)</math>
 
In this setting, no processing of <math>Y</math>, deterministic or random, can increase the information that <math>Y</math> contains about <math>X</math>. Using the [[mutual information]], this can be written as:
:<math> I(X;Y) \geqslant I(X;Z),</math>
 
with equality <math>I(X;Y) = I(X;Z)</math> if and only if <math>I(X;Y\mid Z)=0</math>. That is, <math>Z</math> and <math>Y</math> contain the same information about <math>X</math>, and <math>X \rightarrow Z \rightarrow Y</math> also forms a Markov chain.<ref>{{cite book |title=Elements of Information Theory |last1=Cover |last2=Thomas |date=2012 |publisher=John Wiley & Sons}}</ref>
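
As a numerical illustration (a sketch, not part of the formal statement), the following Python snippet builds the chain <math>X \rightarrow Y \rightarrow Z</math> by cascading two [[binary symmetric channel]]s with arbitrarily chosen crossover probabilities, and compares <math>I(X;Y)</math> with <math>I(X;Z)</math>:

<syntaxhighlight lang="python">
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in bits, computed from a joint distribution matrix pxy."""
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X (column vector)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y (row vector)
    mask = pxy > 0
    return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

# X ~ Bernoulli(1/2); Y is X sent through a binary symmetric channel with
# crossover probability p; Z is Y sent through another with probability q.
# The values p = 0.1 and q = 0.2 are arbitrary illustrative choices.
bsc = lambda e: np.array([[1 - e, e], [e, 1 - e]])  # rows: P(output | input)
p, q = 0.1, 0.2
px = np.array([0.5, 0.5])

pxy = np.diag(px) @ bsc(p)             # joint P(X, Y)
pxz = np.diag(px) @ (bsc(p) @ bsc(q))  # joint P(X, Z): Z sees X only via Y

print(mutual_information(pxy))  # I(X;Y) ≈ 0.531 bits
print(mutual_information(pxz))  # I(X;Z) ≈ 0.173 bits, smaller as required
</syntaxhighlight>

Any further processing of <math>Z</math>, such as a third channel, could only decrease the mutual information with <math>X</math> again.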
 
==Proof==
One can apply the [[Conditional mutual information#Chain rule for mutual information|chain rule for mutual information]] to obtain two different decompositions of <math>I(X;Y,Z)</math>:
 
:<math>
I(X;Z) + I(X;Y\mid Z) = I(X;Y,Z) = I(X;Y) + I(X;Z\mid Y)
</math>
 
By the relationship <math>X \rightarrow Y \rightarrow Z</math>, we know that <math>X</math> and <math>Z</math> are conditionally independent given <math>Y</math>, which means the [[conditional mutual information]] <math>I(X;Z\mid Y)=0</math>. The data processing inequality then follows from the non-negativity of <math>I(X;Y\mid Z)\ge0</math>.
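
Explicitly, substituting <math>I(X;Z\mid Y)=0</math> into the two decompositions above gives

:<math>I(X;Y) = I(X;Z) + I(X;Y\mid Z) \geq I(X;Z).</math>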
 
==See also==