Homomorphic signatures for network coding: Difference between revisions

Content deleted Content added
BG19bot (talk | contribs)
m WP:CHECKWIKI error fix for #61. Punctuation goes before References. Do general fixes if a problem exists. - using AWB (8853)
Extensive cleanup in math notation and punctuation.
Line 2:
 
==Network coding==
Let <math>G = (V, E)</math> be a [[directed graph]] where <math>V</math> is a set, whose elements are called vertices or [[Node (networking)|node]]s, and <math>E</math> is a set of [[ordered pair]]s of vertices, called arcs, directed edges, or arrows. A source <math>s \in V</math> wants to transmit a file <math>D</math> to a set <math>T \subseteq V</math> of the vertices. One chooses a [[vector space]] <math>W/\mathbb{F}_p</math>(say of dimension <math>d</math>), where <math>p</math> is a prime, and views the data to be transmitted as a bunch of vectors <math>w_1 . . .,\ldots , w_k \in W</math>. The source then creates the augmented vectors <math>v_1, . . .\ldots , v_k</math> by setting <math> v_i = (0, . . .\ldots , 0, 1, . . .\ldots , 0, w_{i_1}, . . .\ldots , w{i_d})</math> where <math>w_{i_j}</math> is the <math>j</math>-th coordinate of the vector <math>w_i</math>. There are <math>(i-1)</math> zeros before the first '1' appears in <math>v_i</math>. One can assume without loss of generality that the vectors <math>v_i</math> are [[Linear independence|linearly independent]]. We denote the [[linear subspace]] (of <math>\mathbb{F}_p^{k+d}</math> ) spanned by these vectors by <math>V</math> . Each outgoing edge <math>e \in E</math> computes a linear combination, <math>y(e)</math>, of the vectors entering the vertex <math>v = in(e)</math> where the edge originates, that is to say
 
<math>y(e) = \sum_{f:out(f)=v}(m_e(f)y(f))</math>
 
where <math>m_e(f) \in \mathbb{F}_p</math>. We consider the source as having <math>k</math> input edges carrying the <math>k</math> vectors <math>w_i</math>. By [[Mathematical induction|induction]], one has that the vector <math>y(e)</math> on any edge is a linear combination <math>y(e) = \sum_{1 \le i \le k}(g_i(e)v_i)</math> and is a vector in <math>V</math> . The k-dimensional vector <math>g(e) = (g_1(e), . . .\ldots , g_k(e))</math> is simply the first ''k-'' coordinates of the vector <math>y(e)</math>. We call the [[Matrix (mathematics)|matrix]] whose rows are the vectors <math>g(e_1), . . .\ldots , g(e_k)</math>, where <math>e_i</math> are the incoming edges for a vertex <math>t \in T</math>, the global encoding matrix for <math>t</math> and denote it as <math>G_t</math>. In practice the encoding vectors are chosen at random so the matrix <math>G_t</math> is invertible with high probability. Thus any receiver, on receiving <math>y_1, . . .\ldots , y_k</math> can find <math>w_1, . . .\ldots ,w_k</math> by solving
 
: <math>\begin{bmatrix} y'\\ y_2' \\. \\. \\.vdots \\ y_k' \end{bmatrix} = G_t \begin{bmatrix} w_1\\ w_2 \\. \\. \\.vdots \\ w_k \end{bmatrix}</math>
 
where the <math>y_i'</math> are the vectors formed by removing the first <math>k</math> coordinates of the vector <math>y_i</math>.
 
==Decoding at the receiver==
Each [[Receiver (information theory)|receiver]], <math>t \in T</math>, gets <math>k</math> vectors <math>y_1, . . .\ldots , y_k</math> which are random linear combinations of the <math>v_i</math>’s.
In fact, if
 
: <math>y_i = (\alpha_{i_1}, . . .\ldots , \alpha_{i_k}, a_{i_1}, . . .\ldots , a_{i_d})</math>
 
then
 
: <math>y_i = \sum_{1 \le j \le k}(\alpha_{ij}v_j).</math>.
 
Thus we can invert the linear transformation to find the <math>v_i</math>’s with high [[probability]].
Line 32:
Then server can securely distribute <math>H(v_i)</math> to each receiver, and to check if
 
: <math>y = \sum_{1 \leq i\leq k} (\alpha_iv_i)</math>
 
we can check ifwhether
 
: <math>H(y) = \sum_{1 \leq i\leq k} (\alpha_iH(v_i))</math>
 
The problem with this method is that the server needs to transfer secure information to each of the receivers. The hash functions <math>H</math> needs to be transmitted to all the nodes in the network through a separate secure channel.<math>H</math> is expensive to compute and secure transmission of <math>H</math> is not economical either.
Line 52 ⟶ 53:
 
Let <math>\mathbb{F}_q</math> be a finite field such that <math>q</math> is not a power of 2 or 3. Then an elliptic curve <math>E</math> over <math>\mathbb{F}_q</math> is a curve given by an equation of the form
: <math> y^2 = x^3 + ax + b, \, </math>,
 
where <math>a, b \in \mathbb{F}_q</math> such that <math>4a^3 + 27b^2 \not= 0</math>
 
Let <math>K \supseteq \mathbb{F}_q</math>, then,
 
: <math>E(K) = \{(x, y) | y^2 = x^3 + ax + b\} \bigcup \{O\}</math>

forms an [[abelian group]] with O as identity. The [[Group (mathematics)|group operations]] can be performed efficiently.
 
===Weil pairing===
Line 63 ⟶ 67:
<math>E[m] = {P \in E(\mathbb{\bar {F}}_q) : mP = O}</math>.
 
If <math>E/\mathbb{F}_q</math> is an elliptic curve and <math>\gcd(m, q) = 1</math> then
 
: <math>E[m] \cong (\mathbb{Z}/m\mathbb{Z}) * (\mathbb{Z}/m\mathbb{Z})</math>
 
There is a map <math>e_m : E[m] * E[m] \rightarrow \mu_m(\mathbb{F}_q)</math> such that:
 
#(Bilinear) <math>e_m(P + R,Q) = e_m(P,Q)e_m(R,Q)</math>\text{ &and <math>}e_m(P,Q + R) = e_m(P,Q)e_m(P, R)</math>.
#(Non-degenerate) <math>e_m(P,Q) = 1</math> for all ''P'' implies that <math>Q = O</math>.
#(Alternating) <math>e_m(P, P) = 1</math>.
 
Line 77 ⟶ 81:
===Homomorphic signatures===
 
Let <math>p</math> be a prime and <math>q</math> a prime power. Let <math>V/\mathbb{F}_p</math> be a vector space of dimension <math>D</math> and <math>E/\mathbb{F}_q</math> be an elliptic curve such that <math>P_1, . . .\ldots , P_D \in E[p]</math>.
Define <math>h : V \longrightarrow E[p]</math> as follows:
<math>h(u_1, . . .\ldots , u_D) = \sum_{1 \leq i\leq D} (u_iP_i)</math>.
The function <math>h</math> is an arbitrary homomorphism from <math>V</math> to <math>E[p]</math>.
 
The server chooses <math>s_1, . . .\ldots , s_D</math> secretly in <math>\mathbb{F}_p</math> and publishes a point <math>Q</math> of p-torsion such that <math>e_p(P_i,Q) \not= 1</math> and also publishes <math>(P_i, s_iQ)</math> for <math>1 \leq i \leq D</math>.
The signature of the vector <math>v = u_1, . . .\ldots , u_D</math> is
<math>\sigma(v) = \sum_{1 \leq i\leq D} (u_is_iP_i)</math>
Note: This signature is homomorphic since the computation of h is a homomorphism.
Line 89 ⟶ 93:
===Signature verification===
 
Given <math>v = u_1, . . .\ldots , u_D</math> and its signature <math>\sigma</math>, verify that
 
: <math>
<math>e_p(\sigma,Q) = e_p \sum_{1 \leq i \leq D} (u_is_iP_i,Q)</math>
\begin{align}
 
<math>e_p(\sigma,Q) & = e_p \sum_{1 \leq i \leq D} (u_is_iP_i,Q)</math> \\
<math> & = \prod_i e_p(u_i s_i P_i,Q)</math> \\
 
<math> & = \prod_i e_p(u_i P_i, s_iQ)</math>
\end{align}
</math>
 
The verification crucially uses the bilinearity of the Weil-pairing.
Line 107 ⟶ 113:
on the elliptic curve <math>E</math>.
 
The signature is a point on the elliptic curve with coordinates in <math>\mathbb{F}_q</math>. Thus the size of the signature is <math>2 \log q</math> bits (which is some constant times <math>log(p)</math> bits, depending on the relative size of <math>p</math> and <math>q</math>), and this is the transmission overhead. The computation of the signature <math>h(e)</math> at each vertex requires <math>O(d_{in} \log p \log^{1+\epsilon} q)</math> bit operations, where <math>d_{in}</math> is the in-degree of the vertex <math>in(e)</math>. The verification of a signature requires <math>O((d + k) \log^{2+\epsilon} q)</math> bit operations.
 
==Proof of security==
Line 113 ⟶ 119:
Attacker can produce a collision under the hash function.
 
If given <math>(P_1, . . .\ldots , P_r)</math> points in <math>E[p]</math> find
<math>a = (a_1, . . .\ldots , a_r) \in \mathbb{F}_p^r</math> and <math>b = (b_1, . . .\ldots , b_r) \in \mathbb{F}_p^r</math>
 
such that <math>a \not= b</math> and
<math>\sum_{1 \leq i \leq r} (a_iP_i) = \sum_{1 \leq j \leq r} (b_jP_j)</math>.
 
: <math>\sum_{1 \leq i \leq r} (a_iP_i) = \sum_{1 \leq j \leq r} (b_jP_j).</math>.
Proposition: There is a polynomial time reduction from Discrete Log on the cyclic group of order <math>p</math> on elliptic curves to Hash-Collision.
 
Proposition: There is a polynomial time reduction from Discretediscrete Loglog on the [[cyclic group]] of order <math>p</math> on elliptic curves to Hash-Collision.
 
If <math>r = 2</math>, then we get <math>xP+yQ = uP+vQ</math>. Thus <math>(x-u)P+(y-v)Q = 0</math>.
We claim that <math>x \not = u</math> and <math>y \not = v</math>. Suppose that <math>x = u</math>, then we would have <math>(y-v)Q = 0</math>, but <math>Q</math> is a point of order <math>p</math> (a prime) thus <math>y-u \equiv 0 mod\bmod p</math>. In other words <math>y = v</math> in <math>\mathbb{F}_p</math>. This contradicts the assumption that <math>(x, y)</math> and <math>(u, v)</math> are distinct pairs in <math>\mathbb{F}_2</math>. Thus we have that <math>Q = -(x-u)(y-v)^{-1}P</math>, where the inverse is taken as modulo <math>p</math>.
 
If we have r > 2 then we can do one of two things. Either we can take <math>P_1 = P</math> and <math>P_2 = Q</math> as before and set <math>P_i = O</math> for <math>i</math> > 2 (in this case the proof reduces to the case when <math>r = 2</math>), or we can take <math>P_1 = r_1P</math> and <math>P_i = r_iQ</math> where <math>r_i</math> are chosen at random from <math>\mathbb{F}_p</math>. We get one equation in one unknown (the discrete log of <math>Q</math>). It is quite possible that the equation we get does not involve the unknown. However, this happens with very small probability as we argue next. Suppose the algorithm for Hash-Collision gave us that
 
: <math>ar_1P + \sum_{2 \leq i \leq r}(b_ir_iQ) = 0.</math>.
 
Then as long as <math>\sum_{2 \leq i \leq r} b_ir_i \not\equiv 0 mod\bmod p</math>, we can solve for the discrete log of Q. But the <math>r_i</math>’s are unknown to the oracle for Hash-Collision and so we can interchange the order in which this process occurs. In other words, given <math>b_i</math>, for <math>2 \leq i \leq r</math>, not all zero, what is the probability that the <math>r_i</math>’s we chose satisfies <math>\sum_{2 \leq i \leq r} (b_ir_i) = 0</math>? It is clear that the latter probability is <math>1 \over p</math> . Thus with high probability we can solve for the discrete log of <math>Q</math>.
 
We have shown that producing hash collisions in this scheme is difficult. The other method by which an adversary can foil our system is by forging a signature. This scheme for the signature is essentially the Aggregate Signature version of the Boneh-Lynn-Shacham signature scheme.<ref>http://cseweb.ucsd.edu/~hovav/dist/sigs.pdf</ref> Here it is shown that forging a signature is at least as hard as solving the [[Elliptic curve Diffie-HellmanDiffie–Hellman]] problem. The only known way to solve this problem on elliptic curves is via computing discrete-logs. Thus forging a signature is at least as hard as solving the computational co-Diffie-HellmanDiffie–Hellman on elliptic curves and probably as hard as computing discrete-logs.
 
==See also==
*[[Network coding]]
*[[Homomorphic Encryptionencryption]]
*[[Elliptic Curvecurve Cryptographycryptography]]
*[[Weil pairing]]
*[[Elliptic Curvecurve Diffie-HellmanDiffie–Hellman]]
*[[Elliptic Curvecurve DSA]]
*[[Digital Signature Algorithm]]