Homomorphic signatures for network coding: Difference between revisions

Content deleted Content added
Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5
 
(30 intermediate revisions by 18 users not shown)
Line 1:
[[Network coding]] has been shown to optimally use [[Bandwidth (computing)|bandwidth]] in a network, maximizing information flow but the scheme is very inherently vulnerable to pollution attacks by malicious nodes in the network. A node injecting garbage can quickly affect many receivers. The pollution of [[Networknetwork packet]]s spreads quickly since the output of (even an) honest node is corrupted if at least one of the incoming packets is corrupted.

An attacker can easily corrupt a packet even if it is encrypted by either forging the signature or by producing a collision under the [[Hashhash function]]. This will give an attacker access to the packets and the ability to corrupt them. Denis Charles, Kamal Jain and Kristin Lauter designed a new [[Homomorphichomomorphic encryption]] signature scheme for use with network coding to prevent pollution attacks.<ref>http{{Cite journal |citeseerx = 10.1.1.60.4738 |title = Signatures for Network Coding |year = 2006 |url = https://citeseerx.ist.psu.edu/viewdoc/downloadsummary?doi=10.1.1.60.4738&rep |archive-url =rep1&type https://web.archive.org/web/20211121060603/https://citeseerx.ist.psu.edu/viewdoc/summary?doi=pdf10.1.1.60.4738 |archive-date = 2021-11-21 |url-status = bot: unknown |access-date = 2021-11-21 }}</ref>.

The homomorphic property of the signatures allows nodes to sign any linear combination of the incoming packets without contacting the signing authority. In this scheme it is computationally infeasible for a node to sign a linear combination of the packets without disclosing what [[Linearlinear combination]] was used in the generation of the packet. Furthermore, we can prove that the signature scheme is secure under well known [http://en.wikipedia.org/wiki/[Cryptography |cryptographic]] assumptions of the hardness of the [[Discretediscrete logarithm]] problem and the computational [[Elliptic_curve_Diffie-HellmanElliptic curve Diffie–Hellman]].
 
==Network coding==
Let <math>G = (V, E)</math> be a [[Directeddirected graph]] where <math>V</math> is a set, whose elements are called vertices or [http://en.wikipedia.org/wiki/Node_%28networking%29[Node nodes(networking)|node]]s, and <math>E</math> is a set of [[Orderedordered pair]]s of vertices, called arcs, directed edges, or arrows. A source <math>s \in V</math> wants to transmit a file <math>D</math> to a set <math>T \subseteq V</math> of the vertices. One chooses a [[Vectorvector space]] <math>W/\mathbb{F}_p</math>(say of dimension <math>d</math>), where <math>p</math> is a prime, and views the data to be transmitted as a bunch of vectors <math>w_1 . . .,\ldots , w_k \in W</math>. The source then creates the augmented vectors <math>v_1, . . .\ldots , v_k</math> by setting <math> v_i = (0, . . .\ldots , 0, 1, . . .\ldots , 0, w_{i_1}, . . .\ldots , w{i_d})</math> where <math>w_{i_j}</math> is the <math>j</math>-th coordinate of the vector <math>w_i</math>. There are <math>(i-1)</math> zeros before the first '1' appears in <math>v_i</math>. One can assume without loss of generality that the vectors <math>v_i</math> are [http://en.wikipedia.org/wiki/Linear_independence[Linear independence|linearly independent]]. We denote the [[Linearlinear subspace]] (of <math>\mathbb{F}_p^{k+d}</math> ) spanned by these vectors by <math>V</math> . Each outgoing edge <math>e \in E</math> computes a linear combination, <math>y(e)</math>, of the vectors entering the vertex <math>v = in(e)</math> where the edge originates, that is to say
 
: <math>y(e) = \sum_{f:\mathrm{out}(f)=v}(m_e(f)y(f))</math>
 
where <math>m_e(f) \in \mathbb{F}_p</math>. We consider the source as having <math>k</math> input edges carrying the <math>k</math> vectors <math>w_i</math>. By [http://en.wikipedia.org/wiki/Mathematical_induction[Mathematical induction|induction]], one has that the vector <math>y(e)</math> on any edge is a linear combination <math>y(e) = \sum_{1 \le i \le k}(g_i(e)v_i)</math> and is a vector in <math>V</math> . The k-dimensional vector <math>g(e) = (g_1(e), . . .\ldots , g_k(e))</math> is simply the first ''k-'' coordinates of the vector <math>y(e)</math>. We call the [http://en.wikipedia.org/wiki/Matrix_%28mathematics%29[Matrix (mathematics)|matrix]] whose rows are the vectors <math>g(e_1), . . .\ldots , g(e_k)</math>, where <math>e_i</math> are the incoming edges for a vertex <math>t \in T</math>, the global encoding matrix for <math>t</math> and denote it as <math>G_t</math>. In practice the encoding vectors are chosen at random so the matrix <math>G_t</math> is invertible with high probability. Thus, any receiver, on receiving <math>y_1, . . .\ldots , y_k</math> can find <math>w_1, . . .\ldots ,w_k</math> by solving
 
: <math>\begin{bmatrix} y'\\ y_2' \\. \\. \\.vdots \\ y_k' \end{bmatrix} = G_t \begin{bmatrix} w_1\\ w_2 \\. \\. \\.vdots \\ w_k \end{bmatrix}</math>
 
where the <math>y_i'</math> are the vectors formed by removing the first <math>k</math> coordinates of the vector <math>y_i</math>.
 
==Decoding at the receiver==
Each [http://en.wikipedia.org/wiki/Receiver_%28information_theory%29[Receiver (information theory)|receiver]], <math>t \in T</math>, gets <math>k</math> vectors <math>y_1, . . .\ldots , y_k</math> which are random linear combinations of the <math>v_i</math>’s.
In fact, if
 
: <math>y_i = (\alpha_{i_1}, . . .\ldots , \alpha_{i_k}, a_{i_1}, . . .\ldots , a_{i_d})</math>
 
then
 
: <math>y_i = \sum_{1 \le j \le k}(\alpha_{ij}v_j).</math>.
 
Thus we can invert the linear transformation to find the <math>v_i</math>’s with high [[Probabilityprobability]].
 
==History==
Krohn, Freedman and Mazieres proposed a theory<ref>http{{cite book |last1=Krohn |first1=Maxwell N. |last2=Freedman |first2=Michael J |last3=Mazières |first3=David |title=IEEE Symposium on Security and Privacy, 2004. Proceedings. 2004 |chapter=On-the-fly verification of rateless erasure codes for efficient content distribution |date=2004 |pages=226–240 |doi=10.1109/SECPRI.2004.1301326 |chapter-url=https://www.cs.princeton.edu/~mfreed/docs/authcodes-ieee04sp04.pdf |access-date=17 November 2022 |___location=Berkeley, California, USA |language=en-us |issn=1081-6011|isbn=0-7695-2136-3|s2cid=6976686 }}</ref> in 2004 that if we have a hash function
<math>H : V \longrightarrow G</math> such that:
* <math>H</math> is [http://en.wikipedia.org/wiki/Collision_resistance[Collision resistance|collision resistant]] – it is hard to find <math>x</math> and <math>y</math> such that <math>H(x)</math> = <math>H(y)</math>;
* <math>H</math> is a [[Homomorphismhomomorphism]] – <math>H(x</math> + <math>y</math>) = <math>H(x) + H(y)</math>.
 
Then server can securely distribute <math>H(v_i)</math> to each receiver, and to check if
 
: <math>y = \sum_{1 \leq i\leq k} (\alpha_iv_i)</math>
 
we can check ifwhether
 
: <math>H(y) = \sum_{1 \leq i\leq k} (\alpha_iH(v_i))</math>
 
The problem with this method is that the server needs to transfer secure information to each of the receivers. The hash functions <math>H</math> needs to be transmitted to all the nodes in the network through a separate secure channel.<math>H</math> is expensive to compute and secure transmission of <math>H</math> is not economical either.
 
==Advantages of Homomorphichomomorphic Signaturessignatures==
# Establishes authentication in addition to detecting pollution.
# No need for distributing secure hash digests.
Line 46 ⟶ 51:
 
==Signature scheme==
The homomorphic property of the signatures allows nodes to sign any linear cominationcombination of the incoming packets without contacting the signing authority.
 
===Elliptic curves cryptography over a finite field===
[[Elliptic curve cryptography]] over a finite field is an approach to [[public-key cryptography]] based on the algebraic structure of [[Ellipticelliptic curves]] over [[Finitefinite field]]s.
 
Let <math>\mathbb{F}_q</math> be a finite field such that <math>q</math> is not a power of 2 or 3. Then an elliptic curve <math>E</math> over <math>\mathbb{F}_q</math> is a curve given by an equation of the form
: <math> y^2 = x^3 + ax + b, \, </math>,
 
where <math>a, b \in \mathbb{F}_q</math> such that <math>4a^3 + 27b^2 \not= 0</math>
 
Let <math>K \supseteq \mathbb{F}_q</math> , then,
 
: <math>E(K) = \{(x, y) | y^2 = x^3 + ax + b\} \bigcup \{O\}</math>

forms an [[Abelianabelian group]] with O as identity. The [http://en.wikipedia.org/wiki/Group_%28mathematics%29[Group (mathematics)|group operations]] can be performed efficiently.
 
===Weil pairing===
[[Weil pairing]] is a construction of [[Rootsroots of unity]] by means of functions on an [[Ellipticelliptic curve]] <math>E</math>, in such a way as to constitute a [http://en.wikipedia.org/wiki/Pairing [pairing]] ([http://en.wikipedia.org/wiki/Bilinear_form [bilinear form]], though with [[Multiplicativemultiplicative notation]]) on the [[Torsiontorsion subgroup]] of <math>E</math>. Let <math>E/\mathbb{F}_q</math> be an elliptic curve and let <math>\mathbb{\bar{F}}_q</math> be an algebraic closure of <math>\mathbb{F}_q</math>. If <math>m</math> is an integer, relatively prime to the characteristic of the field <math>\mathbb{F}_q</math>, then the group of <math>m</math>-torsion points,
<math>E[m] = {P \in E(\mathbb{\bar {F}}_q) : mP = O}</math>.
 
If <math>E/\mathbb{F}_q</math> is an elliptic curve and <math>\gcd(m, q) = 1</math> then
 
: <math>E[m] \cong (\mathbb{Z}/m\mathbb{Z}) * (\mathbb{Z}/m\mathbb{Z})</math>
 
There is a map <math>e_m : E[m] * E[m] \rightarrow \mu_m(\mathbb{F}_q)</math> such that:
 
#(Bilinear) <math>e_m(P + R,Q) = e_m(P,Q)e_m(R,Q)</math>\text{ &and <math>}e_m(P,Q + R) = e_m(P,Q)e_m(P, R)</math>.
#(Non-degenerate) <math>e_m(P,Q) = 1</math> for all ''P'' implies that <math>Q = O</math>.
#(Alternating) <math>e_m(P, P) = 1</math>.
 
Also, <math>e_m</math> can be computed efficiently.<ref>{{Cite journal |citeseerx = 10.1.1.88.8848|title = Improved Weil and Tate pairings for elliptic and hyperelliptic curves|url = https://archive.org/details/arxiv-math0311391|pages = 169–183|year = 2004|bibcode = 2003math.....11391E|last1 = Eisentraeger|first1 = Kirsten|last2 = Lauter|first2 = Kristin|last3 = Montgomery|first3 = Peter L.|arxiv = math/0311391}}</ref>
Also, <math>e_m</math> can be computed efficiently<ref>http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBsQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.88.8848%26rep%3Drep1%26type%3Dpdf&rct=j&q=Improved%20Weil%20and%20Tate%20pairings%20for%20elliptic%20and%20hyperelliptic%20curves&ei=Fyi-Te_nMdCitgejkIzGBQ&usg=AFQjCNFTlKZaFlQ_BZyXDh6nKSDpntTTlA&sig2=VDWSZSuKNVeefL0EhmOCqA&cad=rja</ref>.
 
===Homomorphic signatures===
 
Let <math>p</math> be a prime and <math>q</math> a prime power. Let <math>V/\mathbb{F}_p</math> be a vector space of dimension <math>D</math> and <math>E/\mathbb{F}_q</math> be an elliptic curve such that <math>P_1, . . .\ldots , P_D \in E[p]</math>.
Define <math>h : V \longrightarrow E[p]</math> as follows:
<math>h(u_1, . . .\ldots , u_D) = \sum_{1 \leq i\leq D} (u_iP_i)</math>.
The function <math>h</math> is an arbitrary homomorphism from <math>V</math> to <math>E[p]</math>.
 
The server chooses <math>s_1, . . .\ldots , s_D</math> secretly in <math>\mathbb{F}_p</math> and publishes a point <math>Q</math> of p-torsion such that <math>e_p(P_i,Q) \not= 1</math> and also publishes <math>(P_i, s_iQ)</math> for <math>1 \leq i \leq D</math>.
The signature of the vector <math>v = u_1, . . .\ldots , u_D</math> is
<math>\sigma(v) = \sum_{1 \leq i\leq D} (u_is_iP_i)</math>
Note: This signature is homomorphic since the computation of h is a homomorphism.
Line 89 ⟶ 97:
===Signature verification===
 
Given <math>v = u_1, . . .\ldots , u_D</math> and its signature <math>\sigma</math>, verify that
 
: <math>
<math>e_p(\sigma,Q) = e_p \sum_{1 \leq i \leq D} (u_is_iP_i,Q)</math>
\begin{align}
<math>e_p(\sigma,Q) & = e_p \left(\sum_{1 \leq i \leq D} (u_is_iP_iu_i s_i P_i), Q \right)</math> \\
<math> & = \prod_i e_p(u_i s_i P_i,Q)</math> \\
<math> & = \prod_i e_p(u_i P_i, s_iQ)</math>
\end{align}
</math>
 
The verification cruiciallycrucially uses the bilinearity of the Weil-pairing.
<math> = \prod_i e_p(u_i s_i P_i,Q)</math>
 
<math> = \prod_i e_p(u_i P_i, s_iQ)</math>
 
The verification cruicially uses the bilinearity of the Weil-pairing.
 
==System setup==
The server computes <math>\sigma(v_i)</math> for each <math>1 \leq i \leq k</math>. Transmits <math>v_i, \sigma(v_i)</math>.
At each edge <math>e</math> while computing
<math>y(e) = \sum_{f \in E:\mathrm{out}(f)=\mathrm{in}(e)} (m_e(f)y(f))</math>
also compute
<math>\sigma(y(e)) = \sum_{f \in E:\mathrm{out}(f)=\mathrm{in}(e)} (m_e(f)\sigma(y(f)))</math>
on the elliptic curve <math>E</math>.
 
The signature is a point on the elliptic curve with coordinates in <math>\mathbb{F}_q</math>. Thus the size of the signature is <math>2 \log q</math> bits (which is some constant times <math>log(p)</math> bits, depending on the relative size of <math>p</math> and <math>q</math>), and this is the transmission overhead. The computation of the signature <math>h(e)</math> at each vertex requires <math>O(d_{in} \log p \log^{1+\epsilon} q)</math> bit operations, where <math>d_{in}</math> is the in-degree of the vertex <math>in(e)</math>. The verification of a signature requires <math>O((d + k) \log^{2+\epsilon} q)</math> bit operations.
 
==Proof of security==
Line 113 ⟶ 123:
Attacker can produce a collision under the hash function.
 
If given <math>(P_1, . . .\ldots , P_r)</math> points in <math>E[p]</math> find
<math>a = (a_1, . . .\ldots , a_r) \in \mathbb{F}_p^r</math> and <math>b = (b_1, . . .\ldots , b_r) \in \mathbb{F}_p^r</math>
 
such that <math>a \not= b</math> and
<math>\sum_{1 \leq i \leq r} (a_iP_i) = \sum_{1 \leq j \leq r} (b_jP_j)</math>.
 
: <math>\sum_{1 \leq i \leq r} (a_iP_i) = \sum_{1 \leq j \leq r} (b_jP_j).</math>.
Proposition: There is a polynomial time reduction from Discrete Log on the cyclic group of order <math>p</math> on elliptic curves to Hash-Collision.
 
Proposition: There is a polynomial time reduction from Discretediscrete Loglog on the [[cyclic group]] of order <math>p</math> on elliptic curves to Hash-Collision.
 
If <math>r = 2</math>, then we get <math>xP+yQ = uP+vQ</math>. Thus <math>(x-u)P+(y-v)Q = 0</math>.
We claim that <math>x \not = u</math> and <math>y \not = v</math>. Suppose that <math>x = u</math>, then we would have <math>(y-v)Q = 0</math>, but <math>Q</math> is a point of order <math>p</math> (a prime) thus <math>y-u \equiv 0 mod\bmod p</math>. In other words <math>y = v</math> in <math>\mathbb{F}_p</math>. This contradicts the assumption that <math>(x, y)</math> and <math>(u, v)</math> are distinct pairs in <math>\mathbb{F}_2</math>. Thus we have that <math>Q = -(x-u)(y-v)^{-1}P</math>, where the inverse is taken as modulo <math>p</math>.
 
If we have r > 2 then we can do one of two things. Either we can take <math>P_1 = P</math> and <math>P_2 = Q</math> as before and set <math>P_i = O</math> for <math>i</math> > 2 (in this case the proof reduces to the case when <math>r = 2</math>), or we can take <math>P_1 = r_1P</math> and <math>P_i = r_iQ</math> where <math>r_i</math> are chosen at random from <math>\mathbb{F}_p</math>. We get one equation in one unknown (the discrete log of <math>Q</math>). It is quite possible that the equation we get does not involve the unknown. However, this happens with very small probability as we argue next. Suppose the algorithm for Hash-Collision gave us that
 
: <math>ar_1P + \sum_{2 \leq i \leq r}(b_ir_iQ) = 0.</math>.
 
Then as long as <math>\sum_{2 \leq i \leq r} b_ir_i \not\equiv 0 mod\bmod p</math>, we can solve for the discrete log of Q. But the <math>r_i</math>’s are unknown to the oracle for Hash-Collision and so we can interchange the order in which this process occurs. In other words, given <math>b_i</math>, for <math>2 \leq i \leq r</math>, not all zero, what is the probability that the <math>r_i</math>’s we chose satisfies <math>\sum_{2 \leq i \leq r} (b_ir_i) = 0</math>? It is clear that the latter probability is <math>1 \over p</math> . Thus with high probability we can solve for the discrete log of <math>Q</math>.
 
We have shown that producing hash collisions in this scheme is difficult. The other method by which an adversary can foil our system is by forging a signature. This scheme for the signature is essentially the Aggregate Signature version of the Boneh-Lynn-Shacham signature scheme.<ref>http{{cite book |last1=Boneh |first1=Dan |last2=Lynn |first2=Ben |last3=Shacham |first3=Hovav |title=Advances in Cryptology — ASIACRYPT 2001 |chapter=Short Signatures from the Weil Pairing |series=Lecture Notes in Computer Science |date=2001 |volume=2248 |pages=514–532 |doi=10.1007/3-540-45682-1_30 |chapter-url=https://csewebhovav.ucsd.edunet/~hovavucsd/dist/sigs.pdf |access-date=17 November 2022 |language=en|isbn=978-3-540-45682-7}}</ref>. Here it is shown that forging a signature is at least as hard as solving the [[Ellipticelliptic curve Diffie-HellmanDiffie–Hellman]] problem. The only known way to solve this problem on elliptic curves is via computing discrete-logs. Thus forging a signature is at least as hard as solving the computational co-Diffie-HellmanDiffie–Hellman on elliptic curves and probably as hard as computing discrete-logs.
 
==See also==
*[[Network coding]]
*[[Homomorphic Encryptionencryption]]
*[[Elliptic-curve Curve Cryptographycryptography]]
*[[Weil pairing]]
*[[Elliptic-curve Curve Diffie-HellmanDiffie–Hellman]]
*[[Elliptic Curve DSADigital Signature Algorithm]]
*[[Digital Signature Algorithm]]
 
Line 147 ⟶ 158:
#[http://research.microsoft.com/pubs/69452/imc06.pdf Comprehensive View of a Live Network Coding P2P System]
#[http://research.microsoft.com/en-us/um/people/klauter/CISS_06.pdf Signatures for Network Coding(presentation) CISS 2006, Princeton]
#[https://web.archive.org/web/20110606191907/http://www.cse.buffalo.edu/~atri/courses/coding-theory/fall07.html University at Buffalo Lecture Notes on Coding Theory – Dr. Atri Rudra]
 
[[Category:Finite fields]]