Locally decodable code

A '''locally decodable code''' (LDC) is an [[error-correcting code]] that allows a single bit of the original message to be decoded with high probability by only examining (or querying) a small number of bits of a possibly corrupted [[codeword]].
<ref name=LDCSurvey>{{cite web|url=http://research.microsoft.com/en-us/um/people/yekhanin/papers/survey_iwcc.pdf |format=PDF |title=Locally decodable codes: a brief survey |author=Sergey Yekhanin}}</ref><ref name=PrivateLDC>{{cite web|url=http://eprint.iacr.org/2007/025.pdf|format=PDF |title=Private Locally Decodable Codes|author=Rafail Ostrovsky, Omkant Pandey, Amit Sahai}}</ref><ref name=newLDCPIR>Sergey Yekhanin. New locally decodable codes and private information retrieval schemes, [http://www.eccc.hpi-web.de/eccc-reports/2006/TR06-127/index.html Technical Report ECCC TR06-127], 2006.</ref>
This property is useful when, for example, information is transmitted over a noisy channel and only a small portion of the data is needed at a given time, so there is no need to decode the entire message at once. Note that locally decodable codes are not a subset of [[locally testable code]]s, though the two notions overlap.<ref name=LTCvsLDC>{{cite web|url=http://eccc.hpi-web.de/report/2010/130/revision/1/download/ |title=Locally Testable vs. Locally Decodable Codes |author=Tali Kaufman, Michael Viderman}}</ref>
 
More formally, a <math>(q, \delta, \epsilon)</math>-locally decodable code encodes an <math>n</math>-bit message <math>x</math> to an <math>N</math>-bit codeword <math>C(x)</math> such that any bit <math>x_i</math> of the message can be recovered with probability at least <math>1 - \epsilon</math> by a randomized decoding algorithm that queries only <math>q</math> bits of the codeword <math>C(x)</math>, even if up to <math>\delta N</math> locations of the codeword have been corrupted.
 
Furthermore, a perfectly smooth local decoder is a decoder such that, in addition to always generating the correct output given access to an uncorrupted codeword, for every <math>j \in [q]</math> and <math>i \in [n]</math> the <math>j^{th}</math> query to recover the <math>i^{th}</math> bit is uniform over <math>[N]</math>.<ref name=AppCodingTheory>{{cite web|url=http://theory.stanford.edu/~trevisan/pubs/codingsurvey.pdf |format=PDF |title=Some Applications of Coding Theory in Computational Complexity |author=Luca Trevisan}}</ref>
(The notation <math>[y]</math> denotes the set <math>\{1,\ldots, y\}</math>.) Informally, this means that each individual query used to decode any given bit is uniformly distributed over the codeword, even though the queries taken together may be dependent.
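
As an illustration, the 2-query decoder for the Hadamard code described later in this article picks a uniformly random <math>y</math> and queries positions <math>y</math> and <math>y \oplus e_i</math>; each query on its own is uniform over the codeword, so the decoder is perfectly smooth. A minimal Python sketch of this query-distribution check (the parameter <code>n = 4</code> is an illustrative choice):

<syntaxhighlight lang="python">
import random
from collections import Counter

n = 4  # message length; the Hadamard codeword has length N = 2**n

def decoder_queries(i):
    """The two positions the 2-query Hadamard decoder reads to recover
    bit i of the message (positions are n-bit vectors, stored as ints)."""
    y = random.randrange(2 ** n)   # uniform over [N]
    return y, y ^ (1 << i)         # y and y' = y + e_i over GF(2)

# Each marginal is (approximately) uniform over [N], even though the
# two queries are highly dependent -- this is perfect smoothness.
first = Counter(decoder_queries(2)[0] for _ in range(200000))
second = Counter(decoder_queries(2)[1] for _ in range(200000))
print(min(first.values()), max(first.values()))    # both near 200000/16
print(min(second.values()), max(second.values()))
</syntaxhighlight>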
 
Local list decoders are another interesting subset of local decoders. List decoding is useful when a codeword is corrupted in more than <math>\delta/2</math> places, where <math>\delta</math> is the minimum [[Hamming distance]] between two codewords. In this case, it is no longer possible to identify exactly which original message was encoded, since there may be multiple codewords within distance <math>\delta</math> of the corrupted codeword. However, given a radius <math>\epsilon</math>, it is possible to identify the set of messages that encode to codewords within distance <math>\epsilon</math> of the corrupted codeword. An upper bound on the size of this set can be determined from <math>\delta</math> and <math>\epsilon</math>.<ref>{{Cite book |chapter=Section 19.5 |last1=Arora |first1=Sanjeev |authorlink1=Sanjeev Arora |last2=Barak |first2=Boaz |title=Computational Complexity: A Modern Approach |url=http://www.cs.princeton.edu/theory/complexity/ |publisher=[[Cambridge University Press]] |year=2009 |isbn=978-0-521-42426-4}}</ref>
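
For small parameters, the message set recovered by list decoding can be computed by brute force, which makes the definition concrete even though real local list decoders never enumerate all messages. A minimal Python sketch, where the <code>encode</code> function and the repetition-code example are illustrative placeholders:

<syntaxhighlight lang="python">
from itertools import product

def hamming(a, b):
    """Hamming distance between two equal-length bit sequences."""
    return sum(u != v for u, v in zip(a, b))

def list_decode(received, encode, n, radius):
    """All n-bit messages whose codewords lie within `radius` of the
    received word.  Brute force: exponential in n, for illustration only."""
    return [msg for msg in product((0, 1), repeat=n)
            if hamming(encode(msg), received) <= radius]

# Example with a 3-fold repetition code on 2-bit messages.
encode = lambda msg: [b for b in msg for _ in range(3)]
print(list_decode([1, 1, 0, 0, 1, 0], encode, n=2, radius=2))  # [(1, 0)]
</syntaxhighlight>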
 
Locally decodable codes can also be concatenated, where a message is encoded first with one scheme and the resulting codeword is encoded again with a different scheme. (Note that, in this context, [[concatenation]] is the term scholars use for what is usually called [[function composition|composition]]; see<ref name=AppCodingTheory/>.) This can be useful if, for example, the first code has desirable properties with respect to rate but an undesirable property such as producing codewords over a non-binary alphabet; the second code can then transform the result of the first encoding into a binary alphabet. The combined encoding is still locally decodable, although decoding requires additional steps to handle both layers.<ref>{{Cite book |chapter=Section 19.4.3 |last1=Arora |first1=Sanjeev |authorlink1=Sanjeev Arora |last2=Barak |first2=Boaz |title=Computational Complexity: A Modern Approach |url=http://www.cs.princeton.edu/theory/complexity/ |publisher=[[Cambridge University Press]] |year=2009 |isbn=978-0-521-42426-4}}</ref>
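
A minimal sketch of such a composition, where <code>outer_encode</code>, <code>inner_encode</code>, and <code>inner_decode</code> are placeholders for any two compatible schemes:

<syntaxhighlight lang="python">
def concatenated_encode(msg, outer_encode, inner_encode):
    """Encode with the outer code, then re-encode each symbol of the
    outer codeword with the inner (binary) code."""
    return [bit
            for symbol in outer_encode(msg)
            for bit in inner_encode(symbol)]

def read_outer_symbol(codeword, j, inner_len, inner_decode):
    """Answer one query of the outer local decoder by decoding the
    j-th inner block; each outer query touches only one block."""
    return inner_decode(codeword[j * inner_len : (j + 1) * inner_len])
</syntaxhighlight>

Because each outer query is answered by decoding a single inner block, the query complexity of the composition is roughly the product of the two codes' query complexities.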
 
The rate of a code is the ratio of message length to codeword length, <math>\frac{|x|}{|C(x)|}</math>, and the number of queries required to recover one bit of the message is called the query complexity of the code.
 
The rate of a code is inversely related to its query complexity, but the exact shape of this tradeoff is a major open problem.<ref name=LDC1>{{cite web|url=http://research.microsoft.com/en-us/um/people/yekhanin/Papers/LDC_now.pdf |format=PDF |title=Locally Decodable Codes |author=Sergey Yekhanin}}</ref><ref name=LDC2>{{cite web|url=http://www.iacr.org/workshops/tcc2012/survey_tcc.pdf |format=PDF |title=Locally Decodable Codes |author=Sergey Yekhanin}}</ref> It is known that there are no LDCs that query the codeword in only one position, and that the optimal codeword length for query complexity 2 is exponential in the size of the original message.<ref name=LDC1/> However, no tight lower bounds are known for codes with query complexity greater than 2. Approaching the tradeoff from the side of codeword length, the only known codes with codeword length proportional to the message length <math>n</math> have query complexity <math>n^\epsilon</math> for some constant <math>\epsilon > 0</math>.<ref name=LDC1/> There are also codes in between, with codeword length polynomial in the message length and polylogarithmic query complexity.<ref name=LDC1/>
 
== Applications ==
 
=== Data transmission and storage ===
Locally decodable codes are especially useful for data transmission over noisy channels. The Hadamard code (a special case of Reed–Muller codes) was used in 1971 by [[Mariner 9]] to transmit pictures of Mars back to Earth. It was chosen over a 5-repeat code (where each bit is repeated 5 times) because, for roughly the same number of bits transmitted per pixel, it had a higher capacity for error correction. (The Hadamard code falls under the general umbrella of [[forward error correction]], and just happens to be locally decodable; the actual algorithm used to decode the transmission from Mars was a generic error-correction scheme.)<ref>{{cite web |title=Combinatorics in Space: The Mariner 9 Telemetry System |url=http://www-math.ucdenver.edu/~wcherowi/courses/m7409/mariner9talk.pdf |format=PDF}}</ref>
 
LDCs are also useful for data storage, where the medium may become partially corrupted over time or the reading device is subject to errors. In both cases, an LDC allows the recovery of information despite errors, provided there are relatively few of them. In addition, LDCs do not require that the entire original message be decoded; a user can decode a specific portion of the original message without needing to decode the entire thing.<ref name=PIR>{{cite web|url=http://research.microsoft.com/pubs/141305/cacm_2010.pdf |title=Private Information Retrieval |author=Sergey Yekhanin}}</ref>
One of the applications of locally decodable codes in [[Computational complexity theory|complexity theory]] is hardness amplification. Using LDCs with polynomial codeword length and polylogarithmic query complexity, one can take a function <math>L: \{0,1\}^n \rightarrow \{0,1\}</math> that is hard to solve on worst-case inputs and design a function <math>L': \{0,1\}^N \rightarrow \{0,1\}</math> that is hard to compute on average-case inputs.
 
Consider <math>L</math> limited to inputs of length <math>t</math>. Then <math>L</math> can be viewed as a binary string of length <math>2^t</math> (its truth table), where each bit is <math>L(x)</math> for each <math>x \in \{ 0, 1\}^t</math>. We can use a locally decodable code <math>C</math> with polynomial codeword length and polylogarithmic query complexity that tolerates some constant fraction of errors to encode the string that represents <math>L</math>, creating a new string of length <math>2^{O(t)} = 2^{t'}</math>. We think of this new string as defining a new problem <math>L'</math> on inputs of length <math>t'</math>. If <math>L'</math> is easy to solve on average, that is, if we can solve <math>L'</math> correctly on a large fraction <math>1 - \epsilon</math> of inputs, then by the properties of the LDC used to encode it, we can use <math>L'</math> to probabilistically compute <math>L</math> on all inputs: an average-case solver for <math>L'</math> behaves like an oracle for a codeword with few corruptions, so running the local decoder against it recovers any bit of the truth table of <math>L</math>. Thus, a solution to <math>L'</math> for most inputs would allow us to solve <math>L</math> on all inputs, contradicting our assumption that <math>L</math> is hard on worst-case inputs.<ref name=AppCodingTheory/><ref name=LDC1/><ref>{{Cite book |chapter=Section 19.4 |last1=Arora |first1=Sanjeev |authorlink1=Sanjeev Arora |last2=Barak |first2=Boaz |title=Computational Complexity: A Modern Approach |url=http://www.cs.princeton.edu/theory/complexity/ |publisher=[[Cambridge University Press]] |year=2009 |isbn=978-0-521-42426-4}}</ref>
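
A minimal sketch of the construction, where <code>ldc_encode</code> is a placeholder for a code with polynomial length and polylogarithmic query complexity:

<syntaxhighlight lang="python">
def truth_table(L, t):
    """L restricted to length-t inputs, as a 2**t-bit string:
    bit x holds L(x) for each x in {0,1}^t (encoded as an integer)."""
    return [L(x) for x in range(2 ** t)]

def amplify(L, t, ldc_encode):
    """Define L' as the LDC encoding of L's truth table; input i of the
    new problem L' is simply bit i of the codeword."""
    codeword = ldc_encode(truth_table(L, t))   # length 2**O(t)
    return lambda i: codeword[i]
</syntaxhighlight>

In the amplification argument, the local decoder is then run with an average-case solver for <math>L'</math> in place of <code>codeword</code>.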
 
 
Since <math>y</math> and <math>y'</math> are uniformly distributed (even though they are dependent), the [[union bound]] implies that <math>f(y) = x \odot y</math> and <math>f(y') = x \odot y'</math> with probability at least <math>1 - 2\rho</math>. Note: to amplify the probability of success, one can repeat the procedure with different random vectors and take the majority answer.
<ref>{{Cite book |chapter=Section 11.5.2 |last1=Arora |first1=Sanjeev |authorlink1=Sanjeev Arora |last2=Barak |first2=Boaz |title=Computational Complexity: A Modern Approach |url=http://www.cs.princeton.edu/theory/complexity/ |publisher=[[Cambridge University Press]] |year=2009 |isbn=978-0-521-42426-4}}</ref>
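
A minimal Python sketch of this decoder, including the majority-vote amplification (the message, the number of trials, and the corruption pattern are illustrative):

<syntaxhighlight lang="python">
import random

def hadamard_encode(x, n):
    """Codeword of n-bit message x: position y holds <x, y> over GF(2)."""
    return [bin(x & y).count("1") % 2 for y in range(2 ** n)]

def decode_bit(f, i, n, trials=25):
    """Recover x_i from a (possibly corrupted) codeword f.  Each trial
    makes the two queries y and y ^ e_i and XORs the answers; the
    majority vote over independent trials amplifies the success
    probability beyond the single-trial bound of 1 - 2*rho."""
    votes = sum(f[y] ^ f[y ^ (1 << i)]
                for y in (random.randrange(2 ** n) for _ in range(trials)))
    return int(votes > trials // 2)

# Corrupt one of the 16 positions and still recover bit 1 of x.
n, x = 4, 0b1011
word = hadamard_encode(x, n)
word[random.randrange(2 ** n)] ^= 1
print(decode_bit(word, 1, n), (x >> 1) & 1)   # equal with high probability
</syntaxhighlight>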
 
More formally, let <math>\mathbb{F}</math> be a finite field, and let <math>l, d</math> be numbers with <math>d < |\mathbb{F}|</math>. The Reed–Muller code with parameters <math>\mathbb{F}, l, d</math> is the function RM : <math>\mathbb{F}^{\binom{l+d}{d}} \rightarrow \mathbb{F}^{|\mathbb{F}|^l}</math> that maps every <math>l</math>-variable polynomial <math>P</math> over <math>\mathbb{F}</math> of total degree at most <math>d</math> to the values of <math>P</math> on all the inputs in <math>\mathbb{F}^l</math>. That is, the input is a polynomial of the form
<math>P(x_1, \ldots, x_l) = \sum\limits_{i_1+\cdots+i_l\le d}c_{i_1,\ldots,i_l}x_1^{i_1}x_2^{i_2}\cdots x_l^{i_l}</math>
specified by its <math>\binom{l+d}{d}</math> coefficients (equivalently, by its values on a predefined set of <math>\binom{l+d}{d}</math> interpolation points), and the output is the sequence <math>\{P(x_1, \ldots, x_l)\}</math> for every <math>x_1, \ldots, x_l \in \mathbb{F}</math>.<ref name=AB1942>{{Cite book |chapter=Section 19.4.2 |last1=Arora |first1=Sanjeev |authorlink1=Sanjeev Arora |last2=Barak |first2=Boaz |title=Computational Complexity: A Modern Approach |url=http://www.cs.princeton.edu/theory/complexity/ |publisher=[[Cambridge University Press]] |year=2009 |isbn=978-0-521-42426-4}}</ref>
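
A minimal Python sketch of this encoding over a small prime field (the field size, the number of variables, and the example polynomial are illustrative choices):

<syntaxhighlight lang="python">
from itertools import product

def rm_encode(coeffs, p, l):
    """Reed-Muller encoding over the prime field F_p: `coeffs` maps
    exponent tuples (i_1, ..., i_l), with i_1 + ... + i_l <= d, to
    coefficients; the codeword lists the polynomial's value at every
    point of F_p^l in a fixed order."""
    def evaluate(point):
        total = 0
        for exps, c in coeffs.items():
            term = c
            for x, e in zip(point, exps):
                term = term * pow(x, e, p) % p
            total = (total + term) % p
        return total
    return [evaluate(pt) for pt in product(range(p), repeat=l)]

# P(x1, x2) = 3*x1^2 + x1*x2 + 5 over F_7 (l = 2, d = 2).
word = rm_encode({(2, 0): 3, (1, 1): 1, (0, 0): 5}, p=7, l=2)
print(len(word))   # 7**2 == 49 evaluation points
</syntaxhighlight>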