Talk:Binary GCD algorithm: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Add topic

Revision as of 01:14, 31 July 2013 edit Glrx (talk \| contribs) Extended confirmed users, Pending changes reviewers, Rollbackers 29,708 edits revert last 2; unintelligible ← Previous edit		Latest revision as of 19:40, 10 July 2024 edit undo Qwerfjkl (bot) (talk \| contribs) Bots, Mass message senders 4,092,804 edits m Removed deprecated parameters in {{Talk header}} that are now handled automatically (Task 30) Tag: paws [2.2]
(59 intermediate revisions by 19 users not shown)
Line 1: {{User:MiszaBot/config ~~== gcd(0,0) ==~~ \| algo = old(365d) \| archive = Talk:Binary GCD algorithm/Archive %(counter)d \| counter = 1 \| minthreadsleft = 0 }} {{Talk header}} {{WikiProject banner shell\|class=Start\| {{WikiProject Computing\|importance=Mid}} {{WikiProject Computer science\|importance=Mid}} {{WikiProject Mathematics\|importance=Mid}} }} == The Implementation section was wrong == ~~gcd(x,y) is the integer, that is divided by all common divisors of x and y.~~ Thus gcd(0,0)=0, because the common divisors of 0 and 0 are all natural numbers, and these are also divisors of 0. [[User:HenningThielemann\|HenningThielemann]] ([[User talk:HenningThielemann\|talk]]) 20:22, 2 October 2009 (UTC) <tt>gcd(i32::MAX, i32::MIN + 1)</tt> gave the result 1. Since the two inputs are negatives of each other and both are far away from one, the result should be one or the other. <tt>gcd(i32::MIN, i32::MAX)</tt> even crashes the program because of integer overflow. You '''cannot''' use unsigned integer algorithms on signed integers without knowing what's going on. The code didn't even compile without syntax warnings. :A quality source would be needed to confirm this. Cursory examination of the literature shows that, indeed, gcd(0,0) is typically ''not'' defined. (See, for instance, Niven, Montgomery, Zuckerman, "Introduction to the theory of numbers".) [[Special:Contributions/71.182.236.76\|71.182.236.76]] ([[User talk:71.182.236.76\|talk]]) 20:55, 31 October 2009 (UTC) Because people will use algorithms found on Wikipedia, I have removed that algorithm. I will attempt to rewrite it correctly. — [[User:Chai T. Rex\|Chai T. Rex]] ([[User talk:Chai T. Rex\|talk]]) 11:54, 12 July 2023 (UTC) ~~== GCD in hardware ==~~ :I have rewritten the algorithm to be correct and hidden the explanatory comment on how it avoids some branches, as it no longer applies. Improvements to the algorithm can be tested with [https://play.rust-lang.org/?edition=2021&gist=d571ddc17eddc9f504ef9f1732924d5c the linked program]. — [[User:Chai T. Rex\|Chai T. Rex]] ([[User talk:Chai T. Rex\|talk]]) 15:55, 12 July 2023 (UTC) The comparison of the Binary GCD with GCD implemented by hardware division lead me to the question, how fast GCD could be when implemented as hardware instruction. I assume it must be as fast or slow as one division. [[User:HenningThielemann\|HenningThielemann]] ([[User talk:HenningThielemann\|talk]]) 20:33, 2 October 2009 (UTC) :Yes, the implementation I originally wrote (then adapted for greater legibility) specifically took in ''u64'' (unsigned) integers. That was changed in [[Special:Diff/1154862632\|rev. 1154862632]], which introduced less-legible and incorrect code as you found out. :The revision's summary was “the previous code doesn't work, return 0 all the time” which is incorrect (see tests in [https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=056b42ed4b2292e1a4aadd79a8cecc78 playground]). ~~== unsigned integers ==~~ :Given that, I'm going to revert the changes to the implementation section: correctly handling signed integers introduces complexity which IMO takes away from the didactic & illustrative purpose of an example implementation. :I'll add a note explaining this, and how to express signed-integers GCD in terms of unsigned GCD. [[User:Nicoonoclaste\|nicoo]] ([[User talk:Nicoonoclaste\|talk]]) 12:16, 4 February 2024 (UTC) If don't see, why the algorithm is restricted to unsigned integers. As far as I can see it works equally on negative numbers. Even more the algorithm would be simplified because no v>=u comparison is necessary. [[User:134.102.210.237\|134.102.210.237]] 11:54, 31 March 2006 (UTC) ::PS: Sorry for the huuuge delay in reaction: I apparently missed all the watched-pages notifications due to the UI changes. [[User:Nicoonoclaste\|nicoo]] ([[User talk:Nicoonoclaste\|talk]]) 12:19, 4 February 2024 (UTC) ~~== ARM implementation ==~~ ~~The ARM implementation lacks the initial u = 0 test. The code also does not appear to make good use of the conditional instruction set, so I have some optimizations to suggest:~~ ~~== remove_twos_loop optimization ==~~ ~~'''gcd:'''~~ ~~@ Arguments arrive in registers r0 and r1~~ ~~mov r3, #0 @ Initialize r3, the power of 2 in the result~~ ~~orr r12, r0, r1 @ r12 = (r0 \| r1)~~ ~~tst r12, #1 @ Test least significant bit of (r0 \| r1)~~ ~~bne gcd_loop @ If nonzero, either r0 or r1 are odd, jump into middle of next loop~~ ~~'''remove_twos_loop:'''~~ ~~mov r0, r0, lsr #1 @ Shift r0 right, dividing it by 2~~ ~~mov r1, r1, lsr #1 @ Shift r1 right, dividing it by 2~~ ~~add r3, r3, #1 @ Increment r3~~ ~~tst r0, #1 @ Check least significant bit of r0~~ ~~bne gcd_loop @ If nonzero, r0 is odd, jump into middle of next loop~~ ~~tst r1, #1 @ Check least significant bit of r1~~ ~~beq remove_twos_loop @ If zero, r0 and r1 still even, keep looping~~ ~~=== optimization 1 ===~~ ~~'''gcd:'''~~ ~~@ Arguments arrive in registers r0 and r1~~ ~~mov r3, #0 @ Initialize r3, the power of 2 in the result~~ ~~orr r12, r0, r1 @ r12 = (r0 \| r1)~~ ~~'''remove_twos_loop:'''~~ ~~tst r12, #1 @ Test least significant bit of (r0 \| r1)~~ ~~moveq r0, r0, lsr #1 @ Shift r0 right, dividing it by 2~~ ~~moveq r1, r1, lsr #1 @ Shift r1 right, dividing it by 2~~ ~~moveq r12, r12, lsr #1 @ Shift r12 right, dividing it by 2~~ ~~beq remove_twos_loop @ If zero, r0 and r1 still even, keep looping~~ ~~=== optimization 2 ===~~ ~~'''gcd:'''~~ ~~@ Arguments arrive in registers r0 and r1~~ ~~mov r3, #0 @ Initialize r3, the power of 2 in the result~~ ~~'''remove_twos_loop:'''~~ ~~tst r0, #1 @ Test least significant bit of r0~~ ~~tsteq r1, #1 @ Test least significant bit of r1~~ ~~moveq r0, r0, lsr #1 @ Shift r0 right, dividing it by 2~~ ~~moveq r1, r1, lsr #1 @ Shift r1 right, dividing it by 2~~ ~~beq remove_twos_loop @ If zero, r0 and r1 still even, keep looping~~ ~~== gcd_loop optimization ==~~ ~~'''gcd_loop:''' @ Loop finding gcd of r0, r1 not both even~~ ~~tst r0, #1 @ Check least significant bit of r0~~ ~~moveq r0, r0, lsr #1 @ If r0 is even, shift r0 right, dividing it by 2, and...~~ ~~beq gcd_loop_continue @ ... continue to next iteration of loop.~~ ~~tst r1, #1 @ Check least significant bit of r1~~ ~~moveq r1, r1, lsr #1 @ If r1 is even, shift it right by 1 and...~~ ~~beq gcd_loop_continue @ ... continue to next iteration of loop.~~ ~~cmp r0, r1 @ Compare r0 to r1~~ ~~subcs r2, r0, r1 @ If r0 >= r1, take r0 minus r1, and...~~ ~~movcs r0, r2, lsr #1 @ ... shift it right and put it in r0.~~ ~~subcc r2, r1, r0 @ If r0 < r1, take r1 minus r0, and...~~ ~~movcc r1, r2, lsr #1 @ ... shift it right and put it in r1.~~ ~~'''gcd_loop_continue:'''~~ ~~cmp r0, #0 @ Has r0 dropped to zero?~~ ~~bne gcd_loop @ If not, iterate~~ ~~mov r0, r1, asl r3 @ Put r1 << r3 in the result register~~ ~~bx lr @ Return to caller~~ ~~=== optimization ===~~ ~~'''gcd_loop:''' @ Loop finding gcd of r0, r1 not both even~~ ~~tst r0, #1 @ Check least significant bit of r0~~ ~~moveq r0, r0, lsr #1 @ If r0 is even, shift r0 right, dividing it by 2, and...~~ ~~beq gcd_loop @ ... continue to next iteration of loop.~~ ~~'''gcd_loop_2:''' @ Loop finding gcd of r0, r1~~ ~~tst r1, #1 @ Check least significant bit of r1~~ ~~moveq r1, r1, lsr #1 @ If r1 is even, shift it right by 1 and...~~ ~~beq gcd_loop_2 @ ... continue to next iteration of loop.~~ ~~cmp r0, r1 @ Compare r0 to r1~~ ~~subcc r2, r1, r0 @ If r0 < r1, take r1 minus r0, and...~~ ~~movcc r1, r2, lsr #1 @ ... shift it right and put it in r1.~~ ~~bcc gcd_loop_2 @ ... continue to next iteration of loop (r0 is still odd).~~ ~~sub r2, r0, r1 @ If r0 >= r1, take r0 minus r1, and...~~ ~~movs r0, r2, lsr #1 @ ... shift it right and put it in r0.~~ ~~bne gcd_loop @ Has r0 dropped to zero? If not, iterate~~ ~~mov r0, r1, asl r3 @ Put r1 << r3 in the result register~~ ~~bx lr @ Return to caller~~ :Sounds good to me. The point of this code sample is just to illustrate how efficient binary GCD is on a real machine. Feel free to plug in the most optimized version you can imagine. [[User:Deco\|Deco]] 19:50, 31 October 2005 (UTC) ~~== Implementations ==~~ I've removed the ML implementation, because it teaches more about ML than about the algorithm. I have my doubts about the value of the assembly version, but I can't really assess it as I can't read it. IMHO, the C implementation ''is'' important because it exemplifies the use of bitwise operations for efficiency. [[User:Qwertyus\|Qwertyus]] 22:31, 13 April 2006 (UTC) :I think you did the right thing. I just restored the C and ARM implementations after [[User:Arvindn]] blanked them, for these reasons: IMHO, C isn't important for showing bitwise operations; it should be obvious that you'd use bitwise ops where appropriate. But the C implementation is easier to read than the English algorithm at the moment, so it needs to stay at least until there's an appropriate substitute ([[pseudocode]]? Knuth-style algorithm description?). IMHO, the ARM implementation is really enlightening, because it shows the real size and speed benefit of the algorithm in a way that no higher-level language's compiler even approaches. (In particular, it totally stomps the C implementation.) Without ''some'' hand-optimized assembly implementation, the advantage of binary GCD over Euclid is not obvious, IMHO. --[[User:Quuxplusone\|Quuxplusone]] 21:18, 1 January 2007 (UTC) ~~The following pseudocode describes the algoritm probably best:~~ ~~<source lang="text">~~ ~~Input: Positive integers u and v~~ ~~Output: gcd(u,v)~~ ~~Note: division and multiplication by 2 can in many languages most efficiently be done by bitshifting.~~ ~~START~~ ~~s:=1~~ ~~while even(u) and even(v) do~~ ~~u:=u/2;v:=v/2;s:=s2;~~ ~~while even(u) do u:=u/2~~ ~~while even(v) do v:=v/2~~ ~~while u<>v do~~ ~~begin~~ ~~if u>v then~~ ~~begin~~ ~~u:=u-v~~ ~~while even(u) do u:=u/2~~ ~~end~~ ~~else~~ ~~begin~~ ~~v:=v-u~~ ~~while even(v) do v:=v/2~~ ~~end~~ ~~end~~ ~~return(us)~~ ~~END~~ ~~</source>~~ ~~[[User:Hkleinnl\|Hkleinnl]] ([[User talk:Hkleinnl\|talk]]) 20:07, 12 August 2010 (UTC)~~ ~~== Implementation in assembly ==~~ Thanks [[User:Balabiot\|Balabiot]] for pointing out the incorrect result (always zero) when u=0 or v=0. I have fixed the code to return the sum of the inputs when either or both are zero. The code now matches the output of the Euclidean algorithm. ~~I have reverted movne/movne to movs/movnes to ensure that r0 and r1 are both nonzero on entry to check_two_r0. (The previous code hung e.g. with u=0, v=3.)~~ ~~I didn't understand why the case where u and v are powers of two needed fixing, could you please elaborate? -- [[User:Regregex\|Regregex]] ([[User talk:Regregex\|talk]]) 17:02, 10 October 2008 (UTC)~~ ~~:I don't remember, probably I just screwed up. :-) [[User:Balabiot\|Balabiot]] ([[User talk:Balabiot\|talk]]) 17:03, 23 October 2008 (UTC)~~ ~~== Is that page obsolete ? ==~~ ~~I ran 10 000 000 random tests on Mac OS X.6.4 with C Code and got the following results:~~ Euclide : 6180 ms Stein : 9038 ms ~~and compiling with option -O3~~ Euclide : 4728 ms Stein : 5302 ms ~~<source lang="text">~~ ~~#include <stdio.h>~~ ~~#include <sys/time.h>~~ ~~#include <stdlib.h>~~ ~~#define N 10000000~~ ~~#define MAX_UINTplus1 0x100000000LLU~~ ~~#define uint64 unsigned long long~~ ~~#define INIT_RANDOM srandom(3)~~ ~~uint64 pgcdEuclide(uint64,uint64);~~ ~~uint64 pgcdStein(uint64,uint64);~~ ~~int main(void) {~~ ~~int i;~~ ~~uint64 a,b,d,t0,t1;~~ ~~struct timeval tv;~~ ~~// Version Euclide~~ ~~INIT_RANDOM;~~ ~~gettimeofday(&tv,0);~~ ~~t0 = tv.tv_sec* 1000 1000+tv.tv_usec;~~ ~~for(i=0;i<N;i++){~~ ~~a=random()MAX_UINTplus1+random();~~ ~~b=random()MAX_UINTplus1+random();~~ ~~d=pgcdEuclide(a,b);~~ } ~~gettimeofday(&tv,0);~~ ~~t1 = tv.tv_sec10001000+tv.tv_usec;~~ ~~printf("%llu,Euclide : %llu ms\n",d,(t1-t0)/1000);~~ ~~// Version Stein~~ ~~INIT_RANDOM;~~ ~~gettimeofday(&tv,0);~~ ~~t0 = tv.tv_sec 1000 1000+tv.tv_usec;~~ ~~for(i=0;i<N;i++){~~ ~~a=(uint64)random()MAX_UINTplus1+random();~~ ~~b=(uint64)random()MAX_UINTplus1+random();~~ ~~d=pgcdStein(a,b);~~ } ~~gettimeofday(&tv,0);~~ ~~t1 = tv.tv_sec10001000+tv.tv_usec;~~ ~~printf("%llu,Stein : %llu ms\n",d,(t1-t0)/1000);~~ ~~return 0;~~ } ~~uint64 pgcdEuclide(uint64 a,uint64 b){~~ ~~if(b==0) return a;~~ ~~return pgcdEuclide(b,a%b);~~ } ~~uint64 pgcdStein(uint64 u,uint64 v){~~ ~~int shift;~~ ~~/ GCD(0,x) := x /~~ ~~if (u == 0 \|\| v == 0)~~ ~~return u \| v;~~ ~~/ Let shift := lg K, where K is the greatest power of 2~~ ~~dividing both u and v. /~~ ~~for (shift = 0; ((u \| v) & 1) == 0; ++shift) {~~ ~~u >>= 1;~~ ~~v >>= 1;~~ } ~~while ((u & 1) == 0)~~ ~~u >>= 1;~~ ~~/ From here on, u is always odd. /~~ ~~do {~~ ~~while ((v & 1) == 0) / Loop X /~~ ~~v >>= 1;~~ ~~/ Now u and v are both odd, so diff(u, v) is even.~~ ~~Let u = min(u, v), v = diff(u, v)/2. /~~ ~~if (u < v) {~~ ~~v -= u;~~ ~~} else {~~ ~~uint64 diff = u - v;~~ ~~u = v;~~ ~~v = diff;~~ } ~~v >>= 1;~~ ~~} while (v != 0);~~ ~~return u << shift;~~ } ~~</source>~~ ~~This is terrrible because mostly the random number generator is being timed. [[User:McKay\|McKay]] ([[User talk:McKay\|talk]]) 07:56, 11 July 2012 (UTC)~~ RNGs are super slow. Also note that you only went up to numbers sized 2 * 64, meaning that arithmetic operations will take constant time if you're on a 64 bit hardware/os/etc. To do an emperical comparison you should use a bigint library (eg, ints in Python) and test numbers which are >> your CPU's native ALU (or implement your own arithmetic routines so the processor is not used directly) <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures\|unsigned]] comment added by [[Special:Contributions/131.215.176.115\|131.215.176.115]] ([[User talk:131.215.176.115\|talk]]) 18:06, 16 March 2013 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot--> ~~== Why (u \| v) & 1 == 0 is the correct test ==~~ Regarding edits by [[Special:Contributions/94.43.147.234\|94.43.147.234]] ([[User talk:94.43.147.234\|talk]]): Step 2 states:{{quote\|text=If u and v are both even, then gcd(u, v) = 2·gcd(u/2, v/2), because 2 is a common divisor.}} Therefore, before ''u'' and ''v'' may both be halved and ''shift'' incremented, the [[Least significant bit\|LSB]] of ''u'' must be zero, '''and''' the LSB of ''v'' must be zero. The test (in line 12 of the C code fragment) would look like this (avoiding the short-circuit operators for simplicity): ~~for (shift = 0; (u & 1) == 0 & (v & 1) == 0; ++shift) {~~ This is an expression of the form <math>\overline {A} \cdot \overline {B}</math>, which by [[de Morgan's laws]] can be rewritten as <math>\overline{A + B}</math>. In other words, ''Halve ''u'' and ''v'' and increment ''shift'' if the following is false: that ''u'' is odd and/or ''v'' is odd.'' Notice the Boolean negation <tt>!</tt> inserted at the front: ~~for (shift = 0; !((u & 1) == 1 \| (v & 1) == 1); ++shift) {~~ ~~As the only non-zero result of ''x'' <tt>& 1</tt> is 1, the equalities disappear:~~ ~~for (shift = 0; !((u & 1) \| (v & 1)); ++shift) {~~ ~~<tt>&</tt> is [[Distributive property\|distributive]] over <tt>\|</tt> so the two <tt>& 1</tt> operations can be combined:~~ ~~for (shift = 0; !(((u) \| (v)) & 1); ++shift) {~~ ~~Comparing with zero is equivalent to the Boolean negation. After removing redundant parentheses we obtain the test as originally written:~~ ~~for (shift = 0; ((u \| v) & 1) == 0; ++shift) {~~ As a further test, the code fragment as it stood (comprising <tt>((u & v) & 1) == 0</tt>) was compiled, and <tt>gcd(1,2)</tt> was called. As ''u'' and ''v'' have no two bits set in the same column, the test never became false and the code hung in a loop between lines 12–15. ''u'' <tt>\|</tt> ''v'' also corresponds to the ARM code (<tt>ORRS</tt>) which has been tested. – [[User:Regregex\|Regregex]] ([[User talk:Regregex\|talk]]) 16:28, 11 July 2011 (UTC) ~~== bug in wikipedia underlying framework? ==~~ ~~It appears that there are two pages:~~ ~~a) http://en.wikipedia.org/wiki/Binary_gcd~~ ~~(last modified: "4 January 2012 at 17:09.")~~ ~~trying to edit that page will take you to:~~ ~~http://en.wikipedia.org/w/index.php?title=Binary_GCD_algorithm&action=edit~~ ~~b) http://en.wikipedia.org/wiki/Binary_GCD_algorithm~~ ~~(last modified: "12 January 2012 at 15:34.")~~ ~~trying to edit that page will take you to:~~ ~~http://en.wikipedia.org/w/index.php?title=Binary_GCD_algorithm&action=edit~~ ~~However, a) and b) appear different as a) does not reflects changes to b).~~ ~~By manually modifying the "edit" URL from a) to:~~ ~~http://en.wikipedia.org/w/index.php?title=Binary_gcd&action=edit~~ ~~It appears that a) is just "#REDIRECT[[Binary GCD algorithm]]"~~ ~~However, it does not seems to work.~~ I hope this helps (I could not find a "report a bug" thing so I'm posting here) <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures\|unsigned]] comment added by [[Special:Contributions/78.31.46.6\|78.31.46.6]] ([[User talk:78.31.46.6\|talk]]) 16:16, 12 January 2012 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot--> : [[Binary gcd]] is a redirect to [[Binary GCD algorithm]]. That link is intentional; anybody looking for the first is redirected to the article at the second. There is no need to edit [[Binary gcd]]. [[User:Glrx\|Glrx]] ([[User talk:Glrx\|talk]]) 18:58, 13 January 2012 (UTC)