Locally testable code: Difference between revisions

Content deleted Content added
m removing orphan tags, removed orphan tag using AWB
Rewrote article to no longer be a stub
Line 1:
A '''Locally testable code''' is a type of [[error-correcting code]] for which it can be determined if a [[String (computer science)|string]] is a [[Code word|word]] in that code by looking at a small (frequently constant) number of bits of the string. In some situations, it is useful to know if the data is corrupted without decoding all of it so that appropriate action can be taken in response. For example, in communication, if the receiver encounters a corrupted code, it can request the data be re-sent, which could increase the accuracy of said data. Similarly, in data storage, these codes can allow for damaged data to be recovered and rewritten properly.
{{Unreferenced|date=March 2009}}
In [[theoretical computer science]], a '''locally testable code''' is an [[error correcting code]] for which membership can be tested by a non-adaptive [[property testing|property testing algorithm]]. That is, locally testable codes have an efficient probabilistic algorithm that, based on its random bits, non-adaptively looks at few (e.g. a constant number of) positions of the message and can then determine with high probability whether the message is close to a codeword.
 
In contrast, [[locally decodable code]]s use a small number of bits of the codeword to [[Probabilistic Turing Machine|probabilistically]] recover the original information. The fraction of errors determines how likely it is that the decoder correctly recovers the original bit; however, not all locally decodable codes are locally testable.<ref name=decNotTest>{{cite web|url=http://eccc.hpi-web.de/report/2010/130/revision/1/download/|title=Locally Testable vs. Locally Decodable Codes|author=Tali Kaufman, Michael Viderman}}</ref>
Locally testable codes have applications in [[average-case complexity]] and the design of [[probabilistically checkable proof (complexity)|probabilistically checkable proofs]].
 
Clearly, any valid codeword should be accepted as a codeword, but strings that are not codewords could be only one bit off, which would require many (certainly more than a constant number) of probes. To account for this, testing failure is only defined if the string is at least off by a set fraction of its bits. This implies words of the code must be longer than the input strings by adding some redundancy.
[[Locally decodable code]]s are closely related to locally testable codes, in that they make it possible to decode single bits of the message by only looking at few positions of a possibly corrupted codeword.
 
== Definition ==
To measure the distance between two strings, the [[Hamming distance]] is used
A family of error-correcting codes ''C<sub>n</sub>'' ⊆ {0, 1}<sup>''n''</sup> is '''locally testable''' with ''soundness'' ''s''(δ) (where ''s''(δ) < 1 for every δ > 0), and ''query complexity'' ''q''(''n'') if there exists a non-adaptive [[oracle machine|oracle algorithm]] ''T'' (the ''tester'') that on input 1<sup>''n''</sup> and given oracle access to a string ''y ∈ {0, 1}<sup>''n''</sup> satisfies the following:
:<math>\Delta(x,y)=|\{i:x_i \neq y_i\}|</math>
* '''Completeness''': If ''y ∈ C'', then ''T''<sup>''y''</sup>(1<sup>''n''</sup>) accepts with probability 1.
The distance of a string <math>w</math> from a code <math>C:\{0,1\}^k\to\{0,1\}^n</math> is computed by
* '''Soundness''': If the [[Hamming distance]] between ''y'' and ''C'' is δ''n'', then ''T''<sup>''y''</sup>(1<sup>''n''</sup>) accepts with probability at most ''s''(δ).
:<math>\Delta(w,C)=\min_x\{w,C(x)\}</math>
Relative distances are computed as a fraction of the number of bits
:<math>\delta(x,y)=\Delta(x,y)/n \text{ and } \delta(w,C)=\Delta(w,C)/n</math>
A code <math>C:\{0,1\}^k\to{0,1}^n</math> is called <math>q</math>-local <math>\delta</math>-testable if there exists a Turing machine M given [[random access]] to an input <math>w</math> that makes at most <math>q</math> non-adaptive queries of <math>w</math> and satisfies the following:<ref name=robustLTC>{{cite web|url=http://people.csail.mit.edu/madhu/papers/2004/rltc-conf.pdf|title=Robust Locally Testable Codes and Products of Codes|author=Eli Ben-Sasson, Madhu Sudan}}</ref>
:* For any <math>x\in\{0,1\}^k</math> and <math>w=C(x)</math>, <math>Pr[M^w(1^k)=1]=1</math>. In other words, M accepts given access to any codeword of C.
:* For <math>w\in\{0,1\}^n</math> such that <math>\delta(w,C)>\delta</math>, <math>Pr[M^w(1^k)=1]\leq1/2</math>. M must reject strings <math>\delta</math>-far from C at least half the time.
 
In theoretical computer science, one is typically interested in locally testable codes whose query complexity is constant or at most logarithmic.
 
==Examples Limits ==
It remains an open question whether there are any locally testable codes of linear size, but there are several constructions that come close. There are 3 different goals for "nearly linear":<ref name=shortLTC>{{cite web|url=http://eccc.hpi-web.de/eccc-reports/2005/TR05-014/index.html|title=Short Locally Testable Codes and Proofs (Survey)|author=Oded Goldreich}}</ref>
An example of a locally testable code is the [[Hadamard code]]. The Hadamard code is testable with 3 queries and soundness 1 - δ.
 
# Polynomial arbitrarily close to linear; for any <math>\epsilon>0</math>, <math>n=k^{1+\epsilon}</math>.
[[Category:Error detection and correction]]
# Functions of the form <math>n=k^{1+\epsilon(k)}</math> for <math>\epsilon(k)\in o(1)</math>, meaning a given function gets closer to linear as k increases. For example:
#* <math>1/\log \log k</math>
#* <math>1/(\log k)^c</math> for some <math>c\in (0,1)</math>
#* <math>\exp((\log \log \log k)^c)/\log k</math> for <math>c\in (0,1)</math>
# Linear up to [[polylogarithmic]] factor; <math>n=\text{poly}(\log k)*k</math>
 
The first and second have been achieved, even with constant query complexity and a binary [[Alphabet (computer science)|alphabet]], such as with <math>n=k^{1+1/(\log k)^c}</math> for any <math>c\in (0,1)</math>, but there have yet to be seen any linearly testable codes of size even the third nearly linear function.<ref name=shortLTC/>
 
== Parallels ==
{{comp-sci-theory-stub}}
 
Locally testable codes have a lot in common with [[Probabilistically checkable proofs]] (PCPs). This should be apparent from the similarities of their construction. In both, we are given <math>q</math> random nonadaptive queries into a large string and if we want to accept, we must with probability 1, and if not, we must accept no more than half the time. The major difference is that PCPs are interested in accepting <math>x</math> if there exists a <math>w</math> so that <math>M^w(x)=1</math>. Locally testable codes, on the other hand, accept <math>w</math> if it is part of the code. Many things can go wrong in assuming a PCP proof encodes a locally testable code. For example, the PCP definition says nothing about invalid proofs, only invalid inputs.
 
Despite this difference, locally testable codes and PCPs are similar enough that frequently to construct one, a prover will construct the other along the way.<ref name=LTCandPCP>{{cite web|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.9955&rep=rep1&type=pdf|title=Locally Testable Codes|author=Mahdi Cheraghchi-Bashi-Astaneh}}</ref>
 
== Examples ==
 
=== Hadamard code ===
One of the most famous error-correcting codes, the [[Hadamard code]] is a locally testable code. A codeword x is encoded in the Hadamard code to be the linear function <math>f(y)={\sum_i {x_i y_i}}</math> (mod 2). This requires listing out the result of this function for every possible y, which requires exponentially more bits than its input. To test if a string w is a codeword of the Hadamard code, all we have to do is test if the function it encodes is linear. This means simply checking if <math>w(x)\oplus w(y)=w(x\oplus y)</math> for x and y [[Uniform distribution (discrete)|uniformly random]] vectors.
 
It is easy to see that for any valid encoding <math>w</math>, this equation is true, as that is the definition of a linear function. Somewhat harder, however, is showing that a string that is <math>\delta</math>-far from C will have an upper bound on its error in terms of <math>\delta</math>. One bound is found by the direct approach of approximating the chances of exactly one of the three probes yielding an incorrect result. Let A, B, and C be the events of <math>w(x)</math>, <math>w(y)</math>, and <math>w(x\oplus y)</math> being incorrect. Let E be the event of exactly one of these occurring. This comes out to
:<math>\begin{align}P(E) &\geq P(A\cup B\cup C)-3*P(A\cup B)\\
&\geq 3*P(A)-3*P(A\cup B)-3*P(A\cup B)\\
&\geq 3\delta-6\delta^2\end{align}</math>
This works for <math>0<\delta\leq5/16</math>, but shortly after, <math>3\delta-6\delta^2<\delta</math>. With additional work, it can be shown that the error is bounded by
:<math>f(x) = \begin{cases}
3\delta-6\delta^2 & : 0\leq\delta\leq 5/16 \\
45/128 & : 5/16\leq \delta\leq45/128 \\
\delta & :45/128 \leq \delta\leq 1/2
\end{cases}
</math>
For any given <math>\delta</math>, this only has a constant chance of false positives, so we can simply check a constant number of times to get the probability below 1/2.<ref name=Hadamard>{{cite web|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.110.2530&rep=rep1&type=pdf|title=Short Locally Testable Codes and Proofs (Survey)|author=Oded Goldreich}}</ref>
 
=== Long code ===
The [[Long code]] is another code with very large blowup which is close to locally testable. Given an input <math>0\leq i\leq2^k</math> (note, this takes <math>k</math> bits to represent), the function that returns the <math>i^{th}</math> bit of the input, <math>f_i(x)=x_i</math>, is evaluated on all possible <math>2^k</math>-bit inputs <math>0\leq x\leq2^{2^k}</math>, and the codeword is the concatenation of these (of length <math>n=2^{2^k}</math>). The way to locally test this with some errors is to pick a uniformly random input <math>x</math> and set <math>y=x</math>, but with a small chance of flipping each bit, <math>\mu>0</math>. Accept a function <math>f</math> as a codeword if <math>f(x)=f(y)</math>. If <math>f</math> is a codeword, this will accept <math>f</math> as long as <math>x_i</math> was unchanged, which happens with probability <math>1-\mu</math>. This violates the requirement that codewords are always accepted, but may be good enough for some needs.<ref name=LongCode>{{cite web|url=www.wisdom.weizmann.ac.il/~ranraz/publications/Punique_codes.pdf|title=Bounds on Locally Testable Codes with Unique Tests|author=Gillat Kol, Ran Raz}}</ref>
 
Other locally testable codes include [[Reed-Muller codes]] (see [[locally decodable code]]s for a decoding algorithm), [[Reed-Solomon code]]s, and the short code.
 
== References ==
{{reflist}}
 
[[Category:Error detection and correction]]