Lenstra elliptic-curve factorization: Difference between revisions

Content deleted Content added
No edit summary
m Fixed grammar
Tags: canned edit summary Mobile edit Mobile app edit iOS app edit App select source
 
(37 intermediate revisions by 25 users not shown)
Line 1:
{{Short description|Algorithm for integer factorization}}
{{technical|date=December 2020}}
 
The '''Lenstra elliptic-curve factorization''' or the '''elliptic-curve factorization method''' ('''ECM''') is a fast, sub-[[exponential running time]], algorithm for [[integer factorization]], which employs [[elliptic curve]]s. For [[general-purpose computer|general-purpose]] factoring, ECM is the third-fastest known factoring method. The second-fastest is the [[quadratic sieve|multiple polynomial quadratic sieve]], and the fastest is the [[general number field sieve]]. The Lenstra elliptic-curve factorization is named after [[Hendrik Lenstra]].
 
Line 5 ⟶ 8:
==Algorithm==
The Lenstra elliptic-curve factorization method to find a factor of a given natural number <math>n</math> works as follows:
# Pick a random [[elliptic curve]] over <math>\mathbb{Z}/n\mathbb{Z}</math> (the integers modulo <math>n</math>), with equation of the form <math>y^2 = x^3 + ax + b \pmod n</math> together with a non-trivial [[Point (geometry)|point]] <math>P(x_0,y_0)</math> on it.
#:This can be done by first picking random <math>x_0,y_0,a \in \mathbb{Z}/n\mathbb{Z}</math>, and then setting <math>b = y_0^2 - x_0^3 - ax_0\pmod n</math> to assureensure the point is on the curve.
# One can define ''Addition'' of two points on the curve, to define a [[Group (Mathematicsmathematics)|Groupgroup]]. The addition laws are given in the [[elliptic curve#The group law|article on elliptic curves]].
#:We can form repeated multiples of a point <math>P</math>: <math>[k]P = P + \ldots + P \text{ (k times)}</math>. The addition formulae involve taking the modular slope of a chord joining <math>P</math> and <math>Q</math>, and thus division between residue classes modulo <math>n</math>, performed using the [[extended Euclidean algorithm]]. In particular, division by some <math>v \bmod n</math> includes calculation of the <math>\gcd(v,n)</math>.
#: Assuming we calculate a slope of the form <math>u/v</math> with <math>\gcd(u,v)=1</math>, then if <math>v = 0 \bmod n</math>, the result of the point addition will be <math>\infty</math>, the point "at infinity" corresponding to the intersection of the "vertical" line joining <math>P(x,y), P'(x,-y)</math> and the curve. However, if <math>\gcd(v,n) \neq 1, n</math>, then the point addition will not produce a meaningful point on the curve; but, more importantly, <math>\gcd(v,n)</math> is a non-trivial factor of <math>n</math>.
# Compute <math>[k]P</math> on the elliptic curve (<math>\bmod n</math>), where <math>k</math> is a product of many small numbers: say, a product of small primes raised to small powers, as in the [[Pollard's p &minus; 1 algorithm|p-1 algorithm]], or the [[factorial]] <math>B!</math> for some not too large <math>B</math>. This can be done efficiently, one small factor at a time. Say, to get <math>[B!]P</math>, first compute <math>[2]P</math>, then <math>[3]([2]P)</math>, then <math>[4]([3!]P)</math>, and so on. <math>B</math> is picked to be small enough so that <math>B</math>-wise point addition can be performed in reasonable time.
#*If we finish all the calculations above without encountering non-invertible elements (<math>\bmod n</math>), it means that the elliptic curves' (modulo primes) order is not [[Smooth number|smooth]] enough, so we need to try again with a different curve and starting point.
#*If we encounter a <math>\gcd(v,n) \neq 1,mn</math> we are done: it is a non-trivial factor of <math>n</math>.
 
The time complexity depends on the size of the number's smallest prime factor and can be represented by {{math|1=exp[({{sqrt|2}}&nbsp;+&nbsp;[[Big O notation#Little-o notation|o]](1)) {{sqrt|ln&nbsp;''p''&nbsp;ln&nbsp;ln&nbsp;''p''}}]}}, where ''p'' is the smallest factor of ''n'', or <math>L_p\left[\frac{1}{2},\sqrt{2}\right]</math>, in [[L-notation]].
 
==Explanation==
==Why does the algorithm work?==
 
If ''p'' and ''q'' are two prime divisors of ''n'', then {{math|1=''y''<sup>2</sup>&nbsp;=&nbsp;''x''<sup>3</sup>&nbsp;+}} {{math|1=''ax''&nbsp;+&nbsp;''b''&nbsp;(mod&nbsp;''n'')}} implies the same equation also {{math|1=modulo&nbsp;''p''}} and {{math|1=modulo&nbsp;''q''.}} These two smaller elliptic curves with the <math>\boxplus</math>-addition are now genuine [[group (mathematics)|groups]]. If these groups have ''N''<sub>''p''</sub> and ''N<sub>q</sub>'' elements, respectively, then for any point ''P'' on the original curve, by [[Lagrange's theorem (group theory)|Lagrange's theorem]], {{math|1=''k''&nbsp;>&nbsp;0}} is minimal such that <math>kP=\infty</math> on the curve modulo ''p'' implies that ''k'' divides ''N''<sub>''p''</sub>; moreover, <math>N_p P=\infty</math>. The analogous statement holds for the curve modulo ''q''. When the elliptic curve is chosen randomly, then ''N''<sub>''p''</sub> and ''N''<sub>''q''</sub> are random numbers close to {{math|1=''p''&nbsp;+&nbsp;1}} and {{math|1=''q''&nbsp;+&nbsp;1,}} respectively (see below). Hence it is unlikely that most of the prime factors of ''N''<sub>''p''</sub> and ''N''<sub>''q''</sub> are the same, and it is quite likely that while computing ''eP'', we will encounter some ''kP'' that is &infin; {{math|1=modulo&nbsp;''p''}} but not {{math|1=modulo&nbsp;''q'',}} or vice versa. When this is the case, ''kP'' does not exist on the original curve, and in the computations we found some ''v'' with either {{math|1=gcd(''v'',''p'')&nbsp;=&nbsp;''p''}} or {{math|1=gcd(''v'',&nbsp;''q'')&nbsp;=&nbsp;''q'',}} but not both. That is, {{math|1=gcd(''v'',&nbsp;''n'')}} gave a non-trivial factor {{math|1=of&nbsp;''n''.}}
 
ECM is at its core an improvement of the older [[Pollard's p &minus; 1 algorithm|{{math|1=''p''&nbsp;&minus;&nbsp;1}} algorithm]]. The {{math|1=''p''&nbsp;&minus;&nbsp;1}} algorithm finds prime factors ''p'' such that {{math|1=''p''&nbsp;&minus;&nbsp;1}} is [[smooth number|b-powersmooth]] for small values of ''b''. For any ''e'', a multiple of {{math|1=''p''&nbsp;&minus;&nbsp;1,}} and any ''a'' [[relatively prime]] to ''p'', by [[Fermat's little theorem]] we have {{math|1=''a''<sup>''e''</sup>&nbsp;&equiv;&nbsp;1 ([[modular arithmetic|mod]] ''p'')}}. Then {{math|1=[[greatest common divisor|gcd]](''a''<sup>''e''</sup>&nbsp;&minus;&nbsp;1,&nbsp;''n'')}} is likely to produce a factor of ''n''. However, the algorithm fails when {{math|1=''p''&nbsp;-&nbsp;1}} has large prime factors, as is the case for numbers containing [[strong prime]]s, for example.
 
ECM gets around this obstacle by considering the [[group (mathematics)|group]] of a random [[elliptic curve]] over the [[finite field]] '''Z'''<sub>''p''</sub>, rather than considering the [[multiplicative group]] of '''Z'''<sub>''p''</sub> which always has order&nbsp;{{math|1=''p''&nbsp;&minus;&nbsp;1.}}
Line 26 ⟶ 29:
The order of the group of an elliptic curve over '''Z'''<sub>''p''</sub> varies (quite randomly) between {{math|1=''p''&nbsp;+&nbsp;1&nbsp;&minus;&nbsp;2{{sqrt|''p''}}}} and {{math|1=''p''&nbsp;+&nbsp;1&nbsp;+&nbsp;2{{sqrt|''p''}}}} by [[Hasse's theorem on elliptic curves|Hasse's theorem]], and is likely to be smooth for some elliptic curves. Although there is no proof that a smooth group order will be found in the Hasse-interval, by using [[heuristic]] probabilistic methods, the [[Canfield–Erdős–Pomerance theorem]] with suitably optimized parameter choices, and the [[L-notation]], we can expect to try {{math|1='''[[L-notation|L]]'''[{{sqrt|2}}/2, {{sqrt|2}}]}} curves before getting a smooth group order. This heuristic estimate is very reliable in practice.
 
==AnExample exampleusage==
 
The following example is from {{harvtxt|Trappe|Washington|2006}}, with some details added.
 
We want to factor {{<math|1=''>n'' = 455839</math>.}} Let's choose the elliptic curve {{<math|1=''>y''<sup>^2</sup> = ''x''<sup>^3</sup> + 5''x''5x - 5</math>,}} with the point {{<math|1=''>P'' = (1, 1)}}</math> on it, and let's try to compute {{the point <math|1=>(10!)''P''</math>.}}
 
The slope of the tangent line at some point ''<math>A''=(''x'', ''y'')</math> ison {{math|1=''s''the =curve is (3''x''<supmath>\lambda=\frac{3x^2</sup> + 5)/(2''y'')}{2y}\ (\mathrm{mod}\ n)}}</math>. Using ''s''<math>\lambda</math>, we can compute 2''A''point <math>2A</math>. If the value of ''s''<math>\lambda</math> isdoes ofnot theexist, formas ''a/b'' whereresult ''b''of <math>y</math> 1not andhaving gcd(''a'',''b'') = 1, we have to find the [[modular inverse]], of ''b''. If it does not exist,then <math>\gcd(''n'',''b''y)</math> is a non-trivial factor of ''<math>n''</math>.
 
First, we compute <math>2!P</math>. Using [[Elliptic_curve_point_multiplication#Point_doubling|point doubling]], we have <math>\lambda(P)=\lambda(1,1)=4</math>, so the coordinates of point <math>2P=(x',y')</math> are
First we compute 2''P''. We have {{math|1=''s''(''P'') = ''s''(1,1) = 4,}} so the coordinates of {{math|1=2''P'' = (''x′'', ''y′'')}} are {{math|1=''x′'' = ''s''<sup>2</sup> – 2''x'' = 14}} and {{math|1=''y′'' = ''s''(''x'' – ''x′'') – ''y''}} {{math|1== 4(1 – 14) – 1 = –53,}} all numbers {{math|1=understood (mod ''n'').}} Just to check that this 2''P'' is indeed on the curve: {{nowrap|1=(–53)<sup>2</sup> = 2809 = 14<sup>3</sup> + 5·14 – 5.}}
:<math>x'=4^2-2(1)=14</math>
:<math>y'=4(1-14)-1=-53</math>
yielding the point <math>2P=(14,-53)</math>.
 
Next, we compute <math>3!P</math>. We have <math>\lambda(2P)=\lambda(14,-53)=-593/106\ (\mathrm{mod}\ n)</math>. Since <math>\gcd(106,455839)=1</math>, the modular inverse of 106 exists. Using the [[extended Euclidean algorithm]], we can obtain that <math>\lambda=-593/106=322522\ (\mathrm{mod}\ 455839)</math>.
Then we compute 3(2''P''). We have {{math|1=''s''(2''P'') = ''s''(14,-53) = –593/106 (mod ''n'').}} Using the [[Euclidean algorithm]]: {{nowrap|1=455839 = 4300·106 + 39,}} then {{nowrap |1=106 = 2·39 + 28,}} then {{nowrap|1=39 = 28 + 11,}} then {{nowrap |1=28 = 2·11 + 6,}} then {{nowrap|1=11 = 6 + 5,}} then {{nowrap |1=6 = 5 + 1.}} Hence {{nowrap|1=gcd(455839, 106) = 1,}} and working backwards (a version of the [[extended Euclidean algorithm]]): {{nowrap |1=1 = 6 – 5 = 2·6 – 11 = 2·28 – 5·11}} {{nowrap |1== 7·28 – 5·39 = 7·106 – 19·39 = 81707·106 – 19·455839.}} Hence {{nowrap |1=106<sup>−1</sup> = 81707 (mod 455839),}} and {{nowrap |1=–593/106 = –133317 (mod 455839).}} Given this ''s'', we can compute the coordinates of 2(2''P''), just as we did above: {{math|1=4''P'' = (259851, 116255).}} Just to check that this is indeed a point on the curve: {{math|1=''y''<sup>2</sup> = 54514 = ''x''<sup>3</sup> + 5''x'' – 5 (mod 455839).}} After this, we can compute <math>3(2P) = 4P \boxplus 2P</math>.
 
Given this, we can compute the coordinates of <math>2(2P)</math>, just as we did above. The coordinates of point <math>4P=(x',y')</math> are
We can similarly compute 4!''P'', and so on, but 8!''P'' requires inverting {{nowrap|1=599 (mod 455839).}} The Euclidean algorithm gives that 455839 is divisible by 599, and we have found a {{nowrap|1=factorization 455839 = 599·761.}}
:<math>x'=322522^2-2(14)=259851\pmod{455839}</math>
:<math>y'=322522(14-259851)-(-53)=116255\pmod{455839}</math>
This yields <math>4P=(259851,116255)</math>.
 
After this, we can compute <math>3(2P) = 4P + 2P</math> using [[Elliptic_curve_point_multiplication#Point_addition|point addition]]. The line joining <math>4P</math> and <math>2P</math> has slope <math>\lambda=116308/259837=206097\ (\mathrm{mod}\ n)</math>, so the coordinates of <math>6P=(x',y')</math> are
The reason that this worked is that the curve {{nowrap|1=(mod 599)}} has {{nowrap|1=640 = 2<sup>7</sup>·5}} points, while {{nowrap|1=(mod 761)}} it has {{nowrap|1=777 = 3·7·37}} points. Moreover, 640 and 777 are the smallest positive integers ''k'' such that {{math|1=''kP'' = &infin;}} on the curve {{nowrap|1=(mod 599)}} and {{math|1=(mod 761),}} respectively. Since {{nowrap|8!}} is a multiple of 640 but not a multiple of 777, we have {{math|1=8!''P'' = &infin;}} on the curve {{nowrap|1=(mod 599),}} but not on the curve {{nowrap|1=(mod 761),}} hence the repeated addition broke down here, yielding the factorization.
:<math>x'=206097^2-14-259851=179685\pmod{455839}</math>
:<math>y'=206097(14-179685)-(-53)=427131\pmod{455839}</math>
yielding the point <math>6P=(179685,427131)</math>
 
We can similarly compute points <math>4!''P''</math>, <math>5!P</math>, and so on, but computing <math>8!''P''</math> requires inverting {{nowrap|1=599 (mod 455839).}}, Thewhich Euclideanis algorithmnot givespossible thatbecause <math>\gcd(599,455839)\ne1</math>. is divisible by 599Thus, and we have found a factorization {{nowrap|1=factorization 455839 = 599·761.}}.
 
The reason that this workedworks is that the curve {{nowrap|1=(mod 599)}} has {{nowrap|1=640 = 2<sup>7</sup>·5}} points, while {{nowrap|1=(mod 761)}} it has {{nowrap|1=777 = 3·7·37}} points. Moreover, 640 and 777 are the smallest positive integers ''k'' such that {{math|1=''kP'' = &infin;}} on the curve {{nowrap|1=(mod 599)}} and {{mathnowrap|1=(mod 761),}} respectively. Since {{nowrap|8!}} is a multiple of 640 but not a multiple of 777, we have {{math|1=8!''P'' = &infin;}} on the curve {{nowrap|1=(mod 599),}} but not on the curve {{nowrap|1=(mod 761),}} hence the repeated addition broke down here, yielding the factorization.
 
==The algorithm with projective coordinates==
 
Before considering the projective plane over <math>(\Z/n\Z)/\sim,</math> first consider a 'normal' [[projective space]] over <math>\mathbb{R}</math>: Instead of points, lines through the origin are studied. A line may be represented as a non-zero point <math>(x,y,z)</math>, under an equivalence relation ~ given by: <math>(x,y,z)\sim(x',y',z')</math> ⇔ ∃ '''''c''''' ≠ 0 such that ''x' = '''c'''x'', ''y' = '''c'''y'' and ''z' = '''c'''z''. Under this equivalence relation, the space is called '''the projective plane''' <math>\mathbb{P}^2</math>; points, denoted by <math>(x:y:z)</math>, correspond to lines in a three-dimensional space that pass through the origin. Note that the point <math>(0:0:0)</math> does not exist in this space since to draw a line in any possible direction requires at least one of x',y' or z' ≠ 0. Now observe that almost all lines go through any given reference plane - such as the (''X'',''Y'',1)-plane, whilst the lines precisely parallel to this plane, having coordinates (''X,Y'',0), specify directions uniquely, as 'points at infinity' that are used in the affine (''X,Y'')-plane it lies above.
 
In the algorithm, only the group structure of an elliptic curve over the field <math>\mathbb{R}</math> is used. Since we do not necessarily need the field <math>\mathbb{R}</math>, a finite field will also provide a group structure on an elliptic curve. However, considering the same curve and operation over <math>(\Z/n\Z)/\sim</math> with {{mvar|n}} not a prime does not give a group. The Elliptic Curve Method makes use of the failure cases of the addition law.
 
We now state the algorithm in projective coordinates. The neutral element is then given by the point at infinity <math>(0:1:0)</math>. Let {{mvar|n}} be a (positive) integer and consider the elliptic curve (a set of points with some structure on it) <math>E(\Z/n\Z)=\{(x:y:z) \in \mathbb{P}^2\ |\ y^2z=x^3+axz^2+bz^3\}</math>.
Line 114 ⟶ 130:
==GMP-ECM and EECM-MPFQ==
 
The use of Twisted Edwards elliptic curves, as well as other techniques were used by Bernstein et al<ref name=Bernstein2008 /> to provide an optimized implementation of ECM. Its only drawback is that it works on smaller composite numbers than the more general purpose implementation, GMP-ECM of ZimmermanZimmermann.
 
==Hyperelliptic-curve method (HECM)==
Line 121 ⟶ 137:
 
==Quantum version (GEECM)==
[[Daniel J. Bernstein|Bernstein]], [[Nadia Heninger|Heninger]], Lou, &and Valenta suggest GEECM, a quantum version of ECM with Edwards curves.<ref>Bernstein D.J., [[Nadia Heninger|Heninger N.]], Lou P., Valenta L. (2017) [https://eprint.iacr.org/2017/351 Post-quantum RSA]. In: Lange T., Takagi T. (eds), ''Post-Quantum Cryptography''. PQCrypto 2017. Lecture Notes in Computer Science, vol 10346. Springer, Cham</ref> suggest GEECM, a quantum version of ECM with Edwards curves. It uses [[Grover's algorithm]] to roughly double the sizelength of the primes found compared to standard EECM, assuming a quantum computer with sufficiently many qubits and of comparable speed to the classical computer running EECM.
 
==See also==
*[[UBASIC]] for practical program (ECMX).
 
==References==
{{reflist}}
*{{cite paperjournal |lastlast1=Bernstein |firstfirst1=Daniel J. |last2=Birkner |first2=Peter |last3=Lange |first3=Tanja |last4=Peters |first4=Christiane |title=ECM using Edwards curves |journal=Mathematics of Computation |year=2013 | volume=82 | number=282 |doi=10.1090/S0025-5718-2012-02633-0 | mr=3008853 | pages=1139–1179 |doi-access=free }}
*{{cite book |lastlast1=Bosma |firstfirst1=W. |last2=Hulst |first2=M. P. M. van der |title=Primality proving with cyclotomy |publisher=Ph.D. Thesis, Universiteit van Amsterdam |year=1990 |isbn= |oclc=256778332 }}
*{{cite journal |last=Brent |first=Richard P. |title=Factorization of the tenth Fermat number |journal=Mathematics of Computation |volume=68 |issue=225 |year=1999 |pages=429–451 |doi=10.1090/S0025-5718-99-00992-8 |bibcode=1999MaCom..68..429B | mr=1489968|doi-access=free }}
*{{cite book |last=Cohen |first=Henri |title=A Course in Computational Algebraic Number Theory |volume=138 |publisher=Springer-Verlag |___location=Berlin |year=1993 |isbn=978-0-387-55640-6 | mr=1228206 | doi=10.1007/978-3-662-02945-9|series=Graduate Texts in Mathematics |s2cid=118037646 }}
*{{cite journal |last=Cosset |first=R. |title=Factorization with genus 2 curves |journal=Mathematics of Computation |volume=79 |issue=270 |year=2010 |pages=1191–1208 |doi=10.1090/S0025-5718-09-02295-9 | mr=2600562|arxiv=0905.2325 |bibcode=2010MaCom..79.1191C |s2cid=914296 }}
*{{cite book |editor1-link=Arjen Lenstra |editor1-last=Lenstra |editor1-first=A. K. |editor2-last=Lenstra Jr. |editor2-first=H. W. |title=The development of the number field sieve |series=Lecture Notes in Mathematics |volume=1554 |publisher=Springer-Verlag |___location=Berlin |year=1993 |pages=11–42 |mr=1321216 | doi=10.1007/BFb0091534|isbn=978-3-540-57013-4 |url=http://infoscience.epfl.ch/record/164684 }}
*{{cite journal |last=Lenstra Jr. |first=H. W. |title=Factoring integers with elliptic curves |journal=[[Annals of Mathematics]] |volume=126 |year=1987 |issue=3 |pages=649–673 |mr=0916721 |doi=10.2307/1971363 |url=https://openaccess.leidenuniv.nl/bitstream/handle/1887/2140/346_079.pdf?sequence=1 |jstor=1971363 |hdl=1887/2140 |hdl-access=free }}
*{{cite book
| last1= Pomerance
| first1 = Carl
| author1linkauthor1-link = Carl Pomerance
| first2 = Richard
| last2 = Crandall
Line 158 ⟶ 171:
|year=1996
|issue=12
|url=httphttps://www.ams.org/notices/199612/pomerance.pdf
|format = [[PDF]]
|mr=1416721
}}
*{{cite journal |last=Silverman |first=Robert D. |title=The Multiple Polynomial Quadratic Sieve |journal=[[Mathematics of Computation]] |volume=48 |issue=177 |year=1987 |pages=329–339 |doi=10.1090/S0025-5718-1987-0866119-8 | mr=0866119|doi-access=free }}
*{{cite book | last1=Trappe | first1= W. | last2=Washington| first2=L. C. | title=Introduction to Cryptography with Coding Theory |edition=Second | publisher=Pearson Prentice Hall | year=2006| isbn=978-0-13-186239-5| mr=2372272| ___location=Saddle River, NJ}}
*{{cite book | author=Samuel S. Wagstaff, Jr. | title=The Joy of Factoring | publisher=American Mathematical Society | ___location=Providence, RI | year=2013 | isbn=978-1-4704-1048-3 |url=httphttps://www.ams.org/bookpages/stml-68 |author-link=Samuel S. Wagstaff, Jr. |pages=173–190 }}
*{{cite book |last=Watras |first=Marcin |title=Cryptography, Number Analysis, and Very Large Numbers |publisher=Wojciechowski-Steinhagen |___location=Bydgoszcz |year=2008 |id=PL:5324564 }}
 
==External links==
* [httphttps://www.alpertron.com.ar/ECM.HTM Factorization using the Elliptic Curve Method], a JavaWebAssembly appletapplication which uses ECM and switches to the [[Quadratic sieve|Self-Initializing Quadratic Sieve]] when it is faster.
* [http://ecm.gforge.inria.fr/ GMP-ECM] {{Webarchive|url=https://web.archive.org/web/20090912000929/http://ecm.gforge.inria.fr/ |date=2009-09-12 }}, an efficient implementation of ECM.
* [http://www.loria.fr/~zimmerma/records/ecmnet.html ECMNet], an easy client-server implementation that works with several factorization projects.
* [httphttps://www.sourceforge.net/projects/pyecm pyecm], a python implementation of ECM. Much faster with psyco and/or gmpy.
* [http://www.rechenkraft.net/yoyo/ Distributed computing project yoyo@Home] Subproject ECM is a program for Elliptic Curve Factorization which is used by a couple of projects to find factors for different kindkinds of numbers.
* [https://web.archive.org/web/20130811025532/http://ardoino.com/2008/03/large-integers-factorization/ Lenstra Elliptic Curve Factorization algorithm source code] Simple C and GMP Elliptic Curve Factorization Algorithm source code.
* [https://eecm.cr.yp.to/mpfq.html EECM-MPFQ] An implementation of ECM using Edwards curves written with the MPFQ finite field library.