Locally recoverable code

Locally recoverable codes are a family of error correction codes that were first introduced by D. S. Papailiopoulos and A. G. Dimakis^[1] and have been widely studied in information theory due to their applications to distributed and cloud storage systems.^[2]^[3]^[4]^[5]

An $[n,k,d,r]_{q}$ LRC is an $[n,k,d]_{q}$ linear code such that there is a function $f_{i}$ that takes as input $i$ and a set of $r$ other coordinates of a codeword $c=(c_{1},\ldots ,c_{n})\in C$ different from $c_{i}$ , and outputs $c_{i}$ .

Definition

Let $C$ be an $[n,k,d]_{q}$ linear code. For $i\in \{1,\ldots ,n\}$, let $r_{i}$ denote the minimum number of other coordinates that must be read to recover an erasure in coordinate $i$. The number $r_{i}$ is said to be the locality of the $i$-th coordinate of the code. The locality of the code is defined as

r={\textrm {max}}\{r_{i}|i\in \{1,\ldots ,n\}\}

An $[n,k,d,r]_{q}$ locally recoverable code (LRC) is an $[n,k,d]_{q}$ linear code $C\subseteq \mathbb {F} _{q}^{n}$ with locality $r$.

Let $C$ be an $[n,k,d]_{q}$ -locally recoverable code. Then an erased component can be recovered linearly,^[6] i.e. for every $i\in \{1,\ldots ,n\}$ , the space of linear equations of the code contains elements of the form $x_{i}=f(x_{i_{1}},\ldots ,x_{i_{r}})$ , where $i_{j}\neq i$ .
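Because recovery can be taken to be linear, the locality of a small code can be computed by brute force from a generator matrix: coordinate $i$ is a linear function of the coordinates in a set $S$ exactly when column $i$ of the generator matrix lies in the span of the columns indexed by $S$. The following Python sketch illustrates this under stated assumptions: the field is a prime field $\mathbb {F} _{p}$, and the function names (`rank_mod_p`, `coordinate_locality`, `code_locality`) and the toy $[6,4]$ binary code are hypothetical and do not come from the cited sources. The search is exponential in $n$, so it is only meant for very small codes.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix (given as a list of rows) over the prime field F_p."""
    rows = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a pivot in this column among the rows not yet used.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p
                           for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def coordinate_locality(G, i, p):
    """Smallest number of other coordinates whose columns span column i of G."""
    n = len(G[0])

    def column(j):
        return [row[j] for row in G]

    others = [j for j in range(n) if j != i]
    for size in range(n):
        for subset in combinations(others, size):
            cols = [column(j) for j in subset]
            # Column i is in the span iff adding it does not raise the rank.
            if rank_mod_p(cols + [column(i)], p) == rank_mod_p(cols, p):
                return size
    return None  # coordinate i cannot be recovered from the others


def code_locality(G, p):
    """Locality r = max_i r_i (assumes every coordinate is recoverable)."""
    return max(coordinate_locality(G, i, p) for i in range(len(G[0])))


# Hypothetical toy code over F_2: codewords (a, b, a+b, c, d, c+d).
G = [
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
]
print(code_locality(G, 2))  # 2
```

In the toy code every symbol belongs to a parity group of three symbols, so each erased symbol is the sum of the two others in its group.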

Optimal locally recoverable codes

Theorem^[7] Let $n=(r+1)s$ and let $C$ be an $[n,k,d]_{q}$ -locally recoverable code having $s$ disjoint locality sets of size $r+1$ . Then

d\leq n-k-\left\lceil {\frac {k}{r}}\right\rceil +2

An $[n,k,d,r]_{q}$ -LRC $C$ is said to be optimal if the minimum distance of $C$ satisfies

d=n-k-\left\lceil {\frac {k}{r}}\right\rceil +2
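As a quick illustration of the bound, the following Python sketch evaluates the right-hand side and tests a parameter set for optimality. The function names are hypothetical and chosen only for this example.

```python
def lrc_distance_bound(n, k, r):
    """Upper bound n - k - ceil(k / r) + 2 on the minimum distance."""
    ceil_k_over_r = (k + r - 1) // r  # integer ceiling of k / r
    return n - k - ceil_k_over_r + 2


def is_optimal(n, k, d, r):
    """True if the parameters [n, k, d, r] meet the bound with equality."""
    return d == lrc_distance_bound(n, k, r)


print(lrc_distance_bound(15, 8, 4))  # 7
print(is_optimal(15, 8, 7, 4))       # True
```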

Tamo–Barg codes

Let $f\in \mathbb {F} _{q}[x]$ be a polynomial and let $\ell$ be a positive integer. Then $f$ is said to be ( $r$ , $\ell$ )-good if

• $f$ has degree $r+1$,

• there exist distinct subsets $A_{1},\ldots ,A_{\ell }$ of $\mathbb {F} _{q}$ such that

  – for any $i\in \{1,\ldots ,\ell \}$, $f(A_{i})=\{t_{i}\}$ for some $t_{i}\in \mathbb {F} _{q}$, i.e., $f$ is constant on $A_{i}$,

  – $\#A_{i}=r+1$,

  – $A_{i}\cap A_{j}=\varnothing$ for any $i\neq j$.

We say that $\{A_{1},\ldots ,A_{\ell }\}$ is a splitting covering for $f$.^[8]
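For a small prime field, the definition can be checked by exhaustive evaluation. The sketch below groups the elements of $\mathbb {F} _{p}$ by the value of $f$ and keeps the fibers of size $r+1$; fibers over distinct values are automatically disjoint, so any $\ell$ of them form a splitting covering. The function names and the coefficient-list representation (`coeffs[i]` is the coefficient of $x^{i}$) are assumptions made for this illustration, and the approach only applies to prime fields of modest size.

```python
from collections import defaultdict


def splitting_covering(coeffs, p, r):
    """Map each value t to the fiber f^{-1}(t), keeping fibers of size r + 1."""
    fibers = defaultdict(list)
    for x in range(p):
        value = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
        fibers[value].append(x)
    return {t: pts for t, pts in fibers.items() if len(pts) == r + 1}


def is_good(coeffs, p, r, ell):
    """Check that f has degree r + 1 and at least ell fibers of size r + 1."""
    degree = max(i for i, c in enumerate(coeffs) if c % p)
    return degree == r + 1 and len(splitting_covering(coeffs, p, r)) >= ell


# f(x) = x^5 over F_41
x5 = [0, 0, 0, 0, 0, 1]
cover = splitting_covering(x5, 41, 4)
print(cover[1])              # [1, 10, 16, 18, 37]
print(is_good(x5, 41, 4, 8))  # True
```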

Tamo–Barg construction

The Tamo–Barg construction utilizes good polynomials.^[9]

• Suppose that an $(r,\ell )$-good polynomial $f(x)$ over $\mathbb {F} _{q}$ is given with splitting covering $\{A_{1},\ldots ,A_{\ell }\}$.

• Let $s\leq \ell -1$ be a positive integer.

• Consider the following $\mathbb {F} _{q}$-vector space of polynomials: $V=\left\{\sum _{i=0}^{s}g_{i}(x)f(x)^{i}:\deg(g_{i}(x))\leq \deg(f(x))-2\right\}$.

• Let $T=\bigcup _{i=1}^{\ell }A_{i}$.

• The code $\{ev_{T}(g):g\in V\}$ is an $((r+1)\ell ,(s+1)r,d,r)$-optimal locally recoverable code, where $ev_{T}$ denotes evaluation of $g$ at all points in the set $T$.
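The construction can be sketched in Python over a prime field $\mathbb {F} _{p}$. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and the message layout (the first $r$ symbols are the coefficients of $g_{0}$, the next $r$ those of $g_{1}$, and so on) is an assumption of this sketch, not something fixed by the construction.

```python
def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (coeffs[i] is the coefficient of x^i) at x mod p."""
    result = 0
    for c in reversed(coeffs):  # Horner's rule
        result = (result * x + c) % p
    return result


def tamo_barg_encode(message, f_coeffs, sets, r, s, p):
    """Evaluate g = sum_i g_i(x) f(x)^i on T, the union of the sets A_i.

    message holds (s + 1) * r symbols; message[i*r : (i+1)*r] are the
    coefficients of g_i, so deg(g_i) <= r - 1 = deg(f) - 2.
    """
    if len(message) != (s + 1) * r:
        raise ValueError("message must contain (s + 1) * r symbols")
    g = [message[i * r:(i + 1) * r] for i in range(s + 1)]
    codeword = []
    for A in sets:
        for x in A:
            fx = poly_eval(f_coeffs, x, p)
            value = sum(poly_eval(g[i], x, p) * pow(fx, i, p)
                        for i in range(s + 1)) % p
            codeword.append(value)
    return codeword


# f(x) = x^5 over F_41 with three of its fibers (r = 4, s = 1).
sets = [[1, 10, 16, 18, 37], [2, 20, 32, 33, 36], [3, 7, 13, 29, 30]]
codeword = tamo_barg_encode([1] * 8, [0, 0, 0, 0, 0, 1], sets, r=4, s=1, p=41)
print(len(codeword))  # 15
```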

Parameters of Tamo–Barg codes

• Length. The length is the number of evaluation points. Because the sets $A_{i}$ are disjoint for $i\in \{1,\ldots ,\ell \}$, the length of the code is $|T|=(r+1)\ell$.

• Dimension. The dimension of the code is $(s+1)r$, for $s\leq \ell -1$, as each $g_{i}$ has degree at most $\deg(f(x))-2$, covering a vector space of dimension $\deg(f(x))-1=r$, and by the construction of $V$, there are $s+1$ distinct $g_{i}$.

• Distance. The distance follows from the fact that $V\subseteq \mathbb {F} _{q}[x]_{\leq k}$, where $k=r+1-2+s(r+1)$ (here $k$ denotes a degree bound, not the dimension), so the obtained code is a subcode of the Reed–Solomon code of polynomials of degree at most $k$ evaluated on $T$. Its minimum distance is therefore at least $(r+1)\ell -((r+1)-2+s(r+1))$, and this value meets the upper bound for locally recoverable codes with equality.

• Locality. After the erasure of a single component, the evaluation at $a_{i}\in A_{i}$, where $|A_{i}|=r+1$, is unknown, but the evaluations at all other $a\in A_{i}$ are known, so at most $r$ evaluations are needed to uniquely determine the erased component, which gives a locality of $r$.

To see this, $g$ restricted to $A_{j}$ can be described by a polynomial $h$ of degree at most $\deg(f(x))-2=r+1-2=r-1$ thanks to the form of the elements in $V$ (i.e., thanks to the fact that $f$ is constant on $A_{j}$, and the $g_{i}$'s have degree at most $\deg(f(x))-2$). On the other hand, $|A_{j}\backslash \{a_{j}\}|=r$, and $r$ evaluations uniquely determine a polynomial of degree at most $r-1$. Therefore $h$ can be constructed and evaluated at $a_{j}$ to recover $g(a_{j})$.
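The recovery step can be sketched with Lagrange interpolation over a prime field $\mathbb {F} _{p}$. The function name `recover_symbol` is hypothetical, and Lagrange interpolation is used here simply as one convenient way to construct $h$. The numbers come from the example with $f(x)=x^{5}$ over $\mathbb {F} _{41}$: the symbols at the points $1,10,16,18$ of $A_{1}$ are used to rebuild the erased symbol at the point $37$.

```python
def recover_symbol(points, values, target, p):
    """Interpolate the polynomial of degree < len(points) through the given
    (point, value) pairs over F_p and evaluate it at target."""
    total = 0
    for j, (xj, yj) in enumerate(zip(points, values)):
        num, den = 1, 1
        for m, xm in enumerate(points):
            if m != j:
                num = num * (target - xm) % p
                den = den * (xj - xm) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+).
        total = (total + yj * num * pow(den, -1, p)) % p
    return total


# Surviving symbols of the locality set A_1 = {1, 10, 16, 18, 37}.
known_points = [1, 10, 16, 18]
known_values = [8, 8, 5, 9]
print(recover_symbol(known_points, known_values, 37, 41))  # 21
```

Only the $r=4$ surviving symbols of the same locality set are read; the remaining symbols of the codeword are not needed.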

Example of Tamo–Barg construction

We will use $x^{5}\in \mathbb {F} _{41}[x]$ to construct a $[15,8,7,4]$-LRC. Notice that the degree of this polynomial is 5, and it is constant on $A_{i}$ for $i\in \{1,\ldots ,8\}$, where $A_{1}=\{1,10,16,18,37\}$, $A_{2}=2A_{1}$, $A_{3}=3A_{1}$, $A_{4}=4A_{1}$, $A_{5}=5A_{1}$, $A_{6}=6A_{1}$, $A_{7}=11A_{1}$, and $A_{8}=15A_{1}$: $A_{1}^{5}=\{1\}$, $A_{2}^{5}=\{32\}$, $A_{3}^{5}=\{38\}$, $A_{4}^{5}=\{40\}$, $A_{5}^{5}=\{9\}$, $A_{6}^{5}=\{27\}$, $A_{7}^{5}=\{3\}$, $A_{8}^{5}=\{14\}$. Hence, $x^{5}$ is a $(4,8)$-good polynomial over $\mathbb {F} _{41}$ by the definition. Now, we will use this polynomial to construct a code of dimension $k=8$ and length $n=15$ over $\mathbb {F} _{41}$, using the evaluation points $A_{1}\cup A_{2}\cup A_{3}$. The locality of this code is 4, which will allow us to recover a single server failure by looking at the information contained in at most 4 other servers.

Next, let us define the encoding polynomial: $f_{a}(x)=\sum _{i=0}^{r-1}f_{i}(x)x^{i}$, where $f_{i}(x)=\sum _{j=0}^{{\frac {k}{r}}-1}a_{i,j}g(x)^{j}$ and $g(x)=x^{5}$ is the good polynomial. So, $f_{a}(x)=a_{0,0}+a_{0,1}x^{5}+a_{1,0}x+a_{1,1}x^{6}+a_{2,0}x^{2}+a_{2,1}x^{7}+a_{3,0}x^{3}+a_{3,1}x^{8}$.

Thus, we can use the obtained encoding polynomial if we take our data to encode as the row vector $a=(a_{0,0},a_{0,1},a_{1,0},a_{1,1},a_{2,0},a_{2,1},a_{3,0},a_{3,1})$. Writing $m=a$ for the message, the vector $m$ is encoded to a length 15 codeword $c$ by multiplying $m$ by the generator matrix

$G={\begin{pmatrix}1&1&1&1&1&1&1&1&1&1&1&1&1&1&1\\1&1&1&1&1&32&32&32&32&32&38&38&38&38&38\\1&10&16&18&37&2&20&32&33&36&3&7&13&29&30\\1&10&16&18&37&23&25&40&31&4&32&20&2&36&33\\1&18&10&37&16&4&31&40&23&25&9&8&5&21&39\\1&18&10&37&16&5&8&9&39&21&14&17&26&19&6\\1&16&37&10&18&8&5&9&21&39&27&15&24&35&22\\1&16&37&10&18&10&37&1&16&18&1&37&10&18&16\end{pmatrix}}$

For example, the encoding of information vector $m=(1,1,1,1,1,1,1,1)$ gives the codeword $c=mG=(8,8,5,9,21,3,36,0,32,12,2,20,37,33,21)$.
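The generator matrix can be rebuilt from the evaluation points, since row $j$ of $G$ is the monomial attached to the $j$-th message symbol evaluated at the points of $A_{1}\cup A_{2}\cup A_{3}$. The Python sketch below does this and multiplies a message by $G$ modulo 41. The constant and function names are hypothetical; the order of the points and of the exponents is read off from the matrix above.

```python
P = 41

# Evaluation points A_1, A_2, A_3 in the column order of G.
POINTS = [1, 10, 16, 18, 37, 2, 20, 32, 33, 36, 3, 7, 13, 29, 30]

# Exponents of the monomials multiplying
# a00, a01, a10, a11, a20, a21, a30, a31 in the encoding polynomial.
EXPONENTS = [0, 5, 1, 6, 2, 7, 3, 8]


def generator_matrix():
    """Row j holds x^EXPONENTS[j] mod 41 for every evaluation point x."""
    return [[pow(x, e, P) for x in POINTS] for e in EXPONENTS]


def encode(m):
    """Compute the codeword c = m G over F_41."""
    G = generator_matrix()
    return [sum(mi * row[j] for mi, row in zip(m, G)) % P
            for j in range(len(POINTS))]


for row in generator_matrix():
    print(row)
print(encode([1] * 8))
```

Printing the rows gives a direct way to compare the computed matrix with the one displayed above, entry by entry.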

Observe that we constructed an optimal LRC; therefore, using the Singleton-type bound for locally recoverable codes, we have that the distance of this code is $d=n-k-\left\lceil {\frac {k}{r}}\right\rceil +2=15-8-2+2=7$. Thus, we can recover any 6 erasures from our codeword by looking at no more than 8 other components.

Locally recoverable codes with availability

Definition^[10] A code $C$ has all-symbol locality $r$ and availability $t$ if every code symbol can be recovered from $t$ disjoint repair sets of other symbols, each set of size at most $r$ symbols. Such codes are called $(r,t)_{a}$ -LRC.

Theorem^[11] The minimum distance of an $[n,k,d]_{q}$-LRC having locality $r$ and availability $t$ satisfies the upper bound

d\leq n-\sum _{i=0}^{t}\left\lfloor {\frac {k-1}{r^{i}}}\right\rfloor .

If the code is systematic and locality and availability apply only to its information symbols, then the code has information locality $r$ and availability $t$ , and is called $(r,t)_{i}$ -LRC.

Theorem^[12] The minimum distance $d$ of an $[n,k,d]_{q}$ linear $(r,t)_{i}$ -LRC satisfies the upper bound

d\leq n-k-\left\lceil {\frac {t(k-1)+1}{t(r-1)+1}}\right\rceil +2.
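Both availability bounds are simple to evaluate numerically. The following Python sketch computes the right-hand sides of the all-symbol and information-symbol bounds stated above; the function names are hypothetical. For $t=1$ both expressions reduce to $n-k-\lceil k/r\rceil +2$.

```python
def all_symbol_bound(n, k, r, t):
    """n - sum_{i=0}^{t} floor((k - 1) / r^i), for (r, t)_a-LRCs."""
    return n - sum((k - 1) // r**i for i in range(t + 1))


def information_bound(n, k, r, t):
    """n - k - ceil((t(k-1)+1) / (t(r-1)+1)) + 2, for (r, t)_i-LRCs."""
    num = t * (k - 1) + 1
    den = t * (r - 1) + 1
    return n - k - (num + den - 1) // den + 2  # integer ceiling


print(all_symbol_bound(15, 8, 4, 1))   # 7
print(information_bound(15, 8, 4, 1))  # 7
```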

References

[1] Papailiopoulos, Dimitris S.; Dimakis, Alexandros G. (2012), "Locally repairable codes", 2012 IEEE International Symposium on Information Theory Proceedings, Cambridge, MA, USA, pp. 2771–2775, arXiv:1206.3804, doi:10.1109/ISIT.2012.6284027, ISBN 978-1-4673-2579-0
[2] Barg, A.; Tamo, I.; Vlăduţ, S. (2015), "Locally recoverable codes on algebraic curves", 2015 IEEE International Symposium on Information Theory, Hong Kong, China, pp. 1252–1256, arXiv:1603.08876, doi:10.1109/ISIT.2015.7282656, ISBN 978-1-4673-7704-1
[3] Cadambe, V. R.; Mazumdar, A. (2015), "Bounds on the Size of Locally Recoverable Codes", IEEE Transactions on Information Theory, 61 (11): 5787–5794, doi:10.1109/TIT.2015.2477406
[4] Dukes, A.; Ferraguti, A.; Micheli, G. (2022), "Optimal selection for good polynomials of degree up to five", Designs, Codes and Cryptography, 90 (6): 1427–1436, doi:10.1007/s10623-022-01046-y
[5] Haymaker, K.; Malmskog, B.; Matthews, G. (2022), "Locally Recoverable Codes with Availability t≥2 from Fiber Products of Curves", doi:10.3934/amc.2018020
[6] Papailiopoulos, Dimitris S.; Dimakis, Alexandros G. (2012), "Locally repairable codes", 2012 IEEE International Symposium on Information Theory, Cambridge, MA, USA, pp. 2771–2775, arXiv:1206.3804, doi:10.1109/ISIT.2012.6284027, ISBN 978-1-4673-2579-0
[7] Cadambe, V.; Mazumdar, A. (2013), "An upper bound on the size of locally recoverable codes", 2013 International Symposium on Network Coding, Calgary, AB, Canada, pp. 1–5, arXiv:1308.3200, doi:10.1109/NetCod.2013.6570829, ISBN 978-1-4799-0823-3
[8] Micheli, G. (2020), "Constructions of Locally Recoverable Codes Which are Optimal", IEEE Transactions on Information Theory, 66: 167–175, arXiv:1806.11492, doi:10.1109/TIT.2019.2939464
[9] Tamo, I.; Barg, A. (2014), "A family of optimal locally recoverable codes", 2014 IEEE International Symposium on Information Theory, Honolulu, HI, USA, pp. 686–690, doi:10.1109/ISIT.2014.6874920, ISBN 978-1-4799-5186-4
[10] Huang, P.; Yaakobi, E.; Uchikawa, H.; Siegel, P.H. (2015), "Linear locally repairable codes with availability", IEEE International Symposium on Information Theory, Hong Kong, China, pp. 1871–1875, doi:10.1109/ISIT.2015.7282780, ISBN 978-1-4673-7704-1
[11] Tamo, I.; Barg, A. (2014), "Bounds on locally recoverable codes with multiple recovering sets", 2014 IEEE International Symposium on Information Theory, Honolulu, HI, USA, pp. 691–695, arXiv:1402.0916, doi:10.1109/ISIT.2014.6874921, ISBN 978-1-4799-5186-4
[12] Wang, A.; Zhang, Z. (2014), "Repair locality with multiple erasure tolerance", IEEE Transactions on Information Theory, 60 (11): 6979–6987, arXiv:1306.4774, doi:10.1109/TIT.2014.2351404
