Talk:Square root algorithms: Difference between revisions

{{User:MiszaBot/config
| algo = old(360d)
| archive = Talk:Square root algorithms/Archive %(counter)d
| counter = 1
| maxarchivesize = 150K
| archiveheader = {{Automatic archive navigator}}
| minthreadstoarchive = 2
| minthreadsleft = 5
}}
{{WikiProject banner shell|class=C|
{{WikiProject Mathematics|importance= low}}
}}
{{archive box|auto=long}}
 
That cryptic constant is actually a composite of three bitfields, and twiddling it requires some understanding of what those fields are. It would be clearer, but a few more operations, to do that line as a pair of bitfield extract/inserts. But we're saving divides in the subsequent iterations, so the extra 1-cycle operations are a wash.
 
== Rough estimate ==
 
The existing '''Rough Estimate''' section for square root is untenable. It's uncited,
and I can't find anything like it in my polynomial approximation, arithmetic, linear
algebra or computer algorithm texts. It returns 2 as the square root of 1, an
implausible result with an error of 100%. It also returns 2 as the square root of
2. If it were a black box (and it might be, if it were implemented as a computer
routine), and the black box returned 2 as the square root estimate for 2, I'd
presume the box was broken, because it didn't do anything. And it's not usably
close to the right answer either.
 
It's so very easy to estimate the square root of a number to two or even three
significant digits, after factoring out powers of 10 until a number less than
100 is left. There can only be 9 possible first digits (it can't be 0)... surely
that's a trivial interpolation. Just a few examples of the mental approach
(brackets are the result from my calculator):
 
3 is between 1 (1^2) and 4 (2^2), so the square root of 3 must be between 1 and 2,
a bit more than half way, because 3 is a bit more than half way from 1 to 4, so I'd guess
the root is 1.7, maybe 1.75 [the square root of 3 is 1.73 to 3 significant digits, an
error of only 2 parts in 1000].
 
The square root of 37 is the square root of 36 (6) plus 1/13, or about .08, because 37 is
1/13 of the way from 36 to 7^2, which is 49. So I'd guess the root is 6.08 [the square
root of 37 is 6.08 to 3 significant digits].
 
The square root of 75 is 8 point something, because 8*8 is 64, but 9*9 is 81, too big.
But 75 is a little less than two-thirds of the way from 64 to 81, so I'd guess the root is
8 plus a little less than .67, maybe .66, so 8.66 [the square root of 75 is indeed 8.66
to 3 significant digits].
 
The square root of 47 is a little less than 7, because 7^2 is 49; 47 is 2/13 of the way
back from 49 to 36, which is 6^2, and 2/13 is about .15, so 7 - .15 = 6.85 is the guess.
[The square root of 47 to 4 significant digits is 6.856, an error of only 6 parts in 10000.]
 
The square root of 91 is 9 something, because 9^2 is 81, but 10^2 is 100, too big. But
it's just over half way to 100, so guess .55, and the estimate is 9.55. [The square root
of 91 is 9.54 to 3 significant digits, an error of 1 part in 1000].
 
It took me only a few seconds to make each of those estimates, thinking out loud, and
I completed them far faster than I can speak or write. I didn't know any of those square
roots; I had to go punch them into my calculator. The point is how very accurate they
are, just by interpolative guessing. With a slight modification to the process,
I think I could get 4 digits of precision in most cases. It could probably be
quantified into an algorithm, but it's most suited to mental reckoning; computers don't
guess, they calculate. The estimates aren't 'rough' at all - they're very good for
the effort spent. That's what people actually do, and what should be presented here.
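The interpolative process described above can indeed be quantified. Here is a minimal C sketch (the function name is hypothetical; it just scales the argument into [1,100) and interpolates linearly between consecutive squares, as in the worked examples):

```c
#include <assert.h>
#include <math.h>

/* Sketch of the mental method above: factor out powers of 100 until
   1 <= s < 100, find the digit d with d^2 <= s < (d+1)^2, and
   interpolate linearly between the two squares. */
double mental_sqrt_estimate(double s)
{
    double scale = 1.0;
    while (s >= 100.0) { s /= 100.0; scale *= 10.0; }
    while (s <   1.0)  { s *= 100.0; scale /= 10.0; }

    int d = 1;
    while ((double)((d + 1) * (d + 1)) <= s)
        d++;                                      /* d^2 <= s < (d+1)^2 */

    double frac = (s - (double)(d * d)) / (double)(2 * d + 1);
    return scale * ((double)d + frac);            /* e.g. 37 -> 6 + 1/13 */
}
```

For 37 this gives 6 + 1/13 ≈ 6.077, matching the worked example; across the range the interpolated guess stays within several percent of the true root, consistent with the claims above.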
 
For paper-and-pencil doodling, I think the usual method is a linear regression line,
because it's more predictable and reliable with respect to error bounds. It's also very easy
to formulate... the secant line, or a tangent line somewhere along the arc over [1,100] for
y=x^2, is a good approximation. Push it a little bit to the right or left, and voilà!
you have the regression line without needing to do any integration. I can also
do that mentally, but writing it out helps to visualize it.
 
[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 21:11, 20 November 2019 (UTC)
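To make the secant/tangent idea above concrete (this is standard calculus, not a claim about what the article's section says): the tangent line to <math>y=\sqrt{x}</math> at <math>x=a</math> is
:<math> \sqrt{x} \approx \sqrt{a} + \frac{x-a}{2\sqrt{a}} = \frac{1}{2}\left(\sqrt{a} + \frac{x}{\sqrt{a}}\right) </math>
which is exactly one Babylonian iteration starting from the guess <math>\sqrt{a}</math>, and the secant line through <math>(1,1)</math> and <math>(100,10)</math> is
:<math> \sqrt{x} \approx 1 + \frac{x-1}{11}. </math>
Shifting either line slightly gives the regression-like line being described.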
 
:It is only a ''rough'' estimate. That is, it is limited to one decimal place and no calculation other than looking at the order of magnitude of the square. If you put any more effort into getting it, then you would be better off expending the additional effort on doing additional iterations of the Babylonian method.
:It was chosen as the one-digit value of ''x''<sub>0</sub> which minimizes the maximum possible value of <math>\epsilon_1</math> over the required (factor of ten) interval. It does that.
:This would have been apparent if you had not deleted the section showing the maximum possible values of <math>\epsilon_n</math>.
:Compare your calculations with the result of doing one or two iterations of the Babylonian method after applying the rough estimate. [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 02:01, 21 November 2019 (UTC)
::Not all iterations are as simple as Newton-Raphson, and Newton-Raphson has division complexity. On most current processors, division is microprogrammed, hence very expensive. For example, on Intel's Coffee Lake, it's 29 cycles for single precision. That means that one would rather spend up to 29 bit/arithmetic instructions to save an iteration. A linear approximation won't need divide, and is only 2 operations, multiply+add. A 2-piece piece-wise linear approximation would reduce maximum error by a factor of about 10, and cost 5-6 operations. Linear approximation is the standard approach to polynomial approximation. It should certainly be presented here.[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 17:33, 22 November 2019 (UTC)
 
:::The rough estimate given may be appropriate for hand-calculation. But for machine calculation, you have a point. It would probably be better to combine the rough estimate with the first iteration of the Babylonian method and then precalculate and optimize the constants. If ''S'' is presented in decimal, this would give
::::<math> 1 \leq S < 10 \to x_1 = \frac{\sqrt[4]{10}}{2} + \frac{1}{2 \sqrt[4]{10}} \cdot S = 0.889140 + 0.281171 \cdot S </math>
::::<math> 10 \leq S < 100 \to x_1 = \frac{\sqrt[4]{1000}}{2} + \frac{1}{2 \sqrt[4]{1000}} \cdot S = 2.81171 + 0.0889140 \cdot S </math>.
:::But more likely, it would be in binary, and that subsection would apply.
::::<math> 0.5 \leq S < 2 \to x_1 = 0.5 + 0.5 \cdot S </math>.
:::OK? [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 07:21, 24 November 2019 (UTC)
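The combined "rough estimate plus first Babylonian iteration" above is easy to check numerically. A hedged C sketch (function names are mine; the constants are the ones derived in the comment, <math>\sqrt[4]{10}/2</math> etc., for decimal S scaled into [1, 100)):

```c
#include <assert.h>
#include <math.h>

/* Rough estimate combined with the first Babylonian iteration,
   precomputed as a linear function of S, per the comment above. */
double x1_decimal(double S)               /* assumes 1 <= S < 100 */
{
    return (S < 10.0) ? 0.889140 + 0.281171  * S   /* (10^{1/4})/2   + S/(2*10^{1/4})   */
                      : 2.81171  + 0.0889140 * S;  /* (1000^{1/4})/2 + S/(2*1000^{1/4}) */
}

/* One further Babylonian iteration, to show how fast this converges. */
double x2_decimal(double S)
{
    double x1 = x1_decimal(S);
    return 0.5 * (x1 + S / x1);
}
```

One more iteration after the precomputed linear step already gives three to four correct digits, e.g. x2_decimal(2.0) ≈ 1.4147 versus √2 ≈ 1.41421.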
::::Hmmmm... I don't immediately follow your logic, but give me a little time, and I'll try to work it in.[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 17:13, 26 November 2019 (UTC)
::::This section (now called "[[Methods of computing square roots#Initial estimate]]") is becoming bloated. Please cut it back to just the best method for each application! [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 01:02, 25 November 2019 (UTC)
:::::Agreed; it's too tutorial; people just want to grab a formula and run. [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 17:13, 26 November 2019 (UTC)
{{ping|Sbalfour}} The last linear estimate you give has relative errors greater than that of the formula in my second previous comment (which is not more than 17%). [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 09:23, 26 November 2019 (UTC)
:::::Did I make a computation error? I'll take another look. However, least-squares methods that minimize ''average'' differences are based on the integral (area between the curves). That often leaves an anomalously high relative error at one end of the range or other, for a small portion of the range. Arrrrrgh! We can't simultaneously minimize maximum and average error, or relative and absolute error. Maybe we need two formulas.[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 17:13, 26 November 2019 (UTC)
 
I was not sufficiently clear. I meant to say that your formula
:<math> \sqrt{a \cdot 10^{2n}} \approx (0.24a + 1.04) \cdot 10^n \text{ if } 1 \leq a < 10 </math>
is inferior to my formula
:<math> \sqrt{a \cdot 10^{2n}} \approx (0.889140 + 0.281171 \cdot a) \cdot 10^n \text{ if } 1 \leq a < 10 </math>
because yours has an error of 28% at ''a''=1 and mine has an error of under 17.1% for any ''a'' in the interval. [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 03:35, 27 November 2019 (UTC)
:You are correct. I located the tangent lines at the arithmetic means of the intervals instead of the harmonic means, and later used the wrong bounds on the integral for the area between the curves. The formulas for the piece-wise linear estimates of square roots need to be adjusted. I vaguely thought that 28% was too high for the relative error, but didn't catch my mistake. Thanks for bringing this to my attention.[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 19:08, 27 November 2019 (UTC)
===Binary estimates===
This section essentially says to guess the square root of any number between 1/2 and 2 as 1. That's actually a bad guess... a good guess can't be had by ignoring the mantissa?!

A good guess is readily at hand: the square root of any number between 0.5 and 1 is between 0.707 and 1.0, so intuitively split the difference and guess 0.854; similarly, guess halfway between 1 and 1.414, or 1.21, for the square root of any number between 1 and 2. Again, those are scalar estimates, and scalar estimates are not very good. A linear estimate in both cases is (1+a)/2. Actually, what's happened is that Newton's (Babylonian, Heron's) method has collapsed to its own best estimate, because the underlying mathematics is the same: the method is only very effective when the guess is already close to the root, and both 0.5 and 1.&lt;something&gt; are close to 1.

But we can do better than that: guessing the mean is always higher than the root, by a maximum of 0.086 for linear guesses between 1 and 2, and 0.043 for guesses between 0.5 and 1. So again split the difference and guess (1+a-.086)/2 for numbers between 1 and 2, and (1+a-.043)/2 for numbers between 0.5 and 1. This gives an estimate with maximum absolute error of 0.043 for roots between 1 and 2, and 0.022 for roots between 0.5 and 1.

But we're not done yet; the 'correction factor' above was a scalar, and that's usually not very good. If instead we use a fractional part of the mantissa, the final estimate is:
:<math>\sqrt{S} \approx (1 + a - a/8)/2</math> for a < 1
:<math>\sqrt{S} \approx (1 + a - a/16)/2</math> for a >= 1
So for {{radic|0.5}}, the estimate is 0.719 while {{radic|0.5}} is .707, an error of 0.012; and for {{radic|2}}, the estimate is 1.437 while {{radic|2}} is 1.414, an error of .023. Since we're working in binary, subtracting a/16 or a/8 is a right shift plus a subtract, not a divide. The relative error is 1.63% (0.0163) in both cases, larger than 1/64 = 0.0156, so they're not quite good to 6 bits.

These estimates are still too big, and there's a corollary worth noting. The linear correction factors are close to 3a/16 and 3a/32, to another bit of precision. That's again another shift and subtract. But this time, the relative error drops to 0.0077 (0.77%), less than 1/128 = 0.0078. That means the estimates are good to 7 bits. To 7-bit precision, the normalized representation of {{radic|2}} is 0b011011 (rounded, high bit of 1 not represented); the estimate is 0b011010 exactly. An IEEE single has 24 bits of precision (23 represented), so 2 Newton-Raphson iterations starting from 7 bits yield 28 bits, done. An IEEE double has 53 bits (52 represented), so 3 Newton-Raphson iterations yield 56 bits, done. Without this last dibble-dabble, we'd need an additional iteration in both cases. It's unlikely that it would pay off to try to extend the precision of the estimate to the required 12/13+ bits to save another iteration. It's a moot point anyway - most modern processors, including mine, have a sqrt instruction, and we're not going to beat ''that''.

[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 20:54, 26 November 2019 (UTC)

== Undefined behaviour ==

The examples using unions are invalid C, as they invoke undefined behaviour. An easy solution that is probably even clearer for the purpose of example code would be to use memcpy, e.g.

 float f;
 uint32_t u;
 memcpy (&u, &f, sizeof (u));

[[Special:Contributions/37.49.68.13|37.49.68.13]] ([[User talk:37.49.68.13|talk]]) 13:08, 1 April 2022 (UTC)

:Type punning with unions is undefined in C++, but not in C. This is a topic of much confusion.
:The following is pulled from a footnote to section 6.5.2.3 of the C17 standard:
:"If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”). This might be a trap representation."
:This basically says, 'you may use a union to reinterpret the bits of one type into another, but we're not going to promise that the new interpretation will be valid.'
:I will say that the C code in this article is rather clunky and might benefit from a bitfield to separate the different sections of the float representation so it is easier to read and understand, but I will have to flatly disagree with you that <code>memcpy()</code> is more appropriate than a union in this code snippet. [[User:WillisHershey|WillisHershey]] ([[User talk:WillisHershey|talk]]) 17:24, 25 September 2023 (UTC)

== Lucas sequence method - original research? ==
 
I could not find any relevant research papers on the use of Lucas sequences for computing real square roots. The closest I found is

:G. Adj and F. Rodríguez-Henríquez, "Square Root Computation over Even Extension Fields," in IEEE Transactions on Computers, vol. 63, no. 11, pp. 2829-2841, Nov. 2014, doi: 10.1109/TC.2013.145.

which is concerned with square roots in finite fields and uses a different algorithm. Should this paragraph be removed as original research? Or it could simply be made much shorter, by avoiding repeating the properties of Lucas sequences. [[User:BlueRavel|BlueRavel]] ([[User talk:BlueRavel|talk]]) 23:27, 3 December 2023 (UTC)

: {{ping|BlueRavel}} I have searched, and I too failed to find any relevant source for this. It was posted into the article in 2009 without any explanation, by an editor who has never made any other substantial contribution, just one other very small edit. It looks as though it may well be original research, but whether it is or not, it is unsourced, so I have removed it. [[User:JBW|JBW]] ([[User talk:JBW|talk]]) 21:54, 5 December 2023 (UTC)

== Continued fraction expansion ==

This section is the longest in the article, and the most laden with awkward symbolism. It might, in large part, be the reason for the "Too Technical" maintenance tag. I do a lot of my work on a cell phone or tablet, and it's gobbledegook. I first question whether the section belongs in the presentation at all - any number, even irrational and transcendental ones (including the square root of any number), can be represented as a [[Continued fraction]], and there's a lead article on that topic, as well as [[Generalized continued fraction]] and [[Periodic continued fraction]]. There's nothing unique about the procedure with respect to the square root of a number. It's sometimes convenient, for mathematical foundations as well as estimating, to have a continued fraction representation of frequently used numbers like {{sqrt|2}}, {{sqrt|10}}, <math>e</math>, <math>\pi</math>, <math>\phi</math>. It's tedious to work out one of these; unless there's some motive besides obtaining a final numerical approximation, we do it by looking up the expansion in a table, and it's usually for some mathematics-related number like the ones enumerated above. The worked example has a better place in one of the aforementioned articles. If we're going to give an example here, it's more useful to use a number like {{sqrt|2}}. I'd like to see the section merged into one of the other articles, and replaced with a much shorter useful presentation, rather than a derivation-based one. [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 18:27, 30 November 2019 (UTC)

I'm moving all the text associated with {{sqrt|114}} to [[Periodic continued fraction]], for a number of reasons: 1) it's an unwieldy large example; 2) it's not actually very useful, not as useful as, say, a similar example for {{sqrt|3}}; 3) it's more about how-to (i.e. how to expand the fraction and find the repetend) than about using a fraction to compute a value; 4) such extreme precision as 7 denominators will almost never be used in practice; 5) all those radicals are intimidating to much of an amateur, non-mathematical audience. I note that this example was once before deleted from the article, and for related reasons. Expanding a continued fraction is mostly the province of mathematicians; using one, i.e. to compute a value, is actually rather straightforward. But neither reducing a continued fraction to a rational fraction, nor computing an actual value from it, is covered in that text. Finally, by my doing so, we're not actually losing any information from the encyclopedia - it's a matter of locating the example in the most generally applicable article. [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 18:03, 13 December 2019 (UTC)

There are four different types of notation in this section:
:<math> \sqrt{S} \Rightarrow 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + ...}}} \equiv 1 + \frac{1 |}{| 2} +\frac{1 |}{| 2} + \frac{1 |}{| 2} + \cdots \equiv [1;2,2,2...] \equiv [1;\overline{2}]</math>

But then we make the incredible leap <math>[1;2,2,2] = 17/12</math>. I gave the problem to a neighbor's son who's a math whiz in high school, and he was unable to demonstrate to me how to compute 17/12 from the given expression. He didn't recognize any of the 4 expressions for continued fractions, and was unable to correctly compute anything. The mathematical brevity here is inscrutable. I didn't get continued fractions until a second course in calculus, as a college math major. A non-math major will never see any of this notation. Only applied mathematicians use that kind of notation (and they don't source Wikipedia for it), and only for computing reference continued fractions. The rest of us look up the fraction and use it to compute with. And how do you do that? Hmmmm.... [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 19:08, 13 December 2019 (UTC)

I'm going to move all the dense mathematical formalisms into a mathematical sidebar or footnote; that'll shrink the section substantially. Then I'll add at least one sequence showing the progressive reduction of the continued fraction to a rational (numerical) fraction, and finally, computation of the value of the root from it. That should leave the section accessible at the high school level. [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 19:32, 13 December 2019 (UTC)

== Pell's equation? ==

Pell's equation is a Diophantine equation, meaning it has integer coefficients and requires an integer solution. So <math>S</math> in this case must be an integer (not zero and not a perfect square). Most of the time, we're not computing square roots of integers, because we can look them up. But let's just suppose we want the square root of 1361. First we come up against: ''Find positive integers p1 and q1... This is the hard part; It can be done either by guessing, or by using fairly sophisticated techniques.'' Guess?? This isn't the same kind of guessing we do in the Initial estimate section, because a bad guess there just means more work. No, a bad guess here is utterly worthless. There's no guessable solution to p<sup>2</sup> - 1361{{dot}}q<sup>2</sup> = 1. (Yes, 1361 is prime, and has a few other properties that make it a particularly irascible number, but in the end, it's just an arbitrary number.) The standard method of solving Diophantine equations is by continued fractions (one must find the period of the repetend); the problem is that the repetend may be long, extremely long, and it can be that long for seemingly innocuous choices of <math>S</math>. We don't elaborate on that in the section.

Pell's equation is a recursive relationship that allows one to generate infinitely many solutions after finding one. It's finding one that's the hump, as it is with any Diophantine equation. And in the usual case, while there may be infinitely many solutions, none of them are small - small enough to make trial and error a usable approach. I don't think this section ought to be in the article, unless we greatly expand it to demonstrate how actually to solve the equation. Solving a Diophantine equation is a LOT more work than an arithmetic computation of square root. [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 22:09, 14 December 2019 (UTC)

:I agree. [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 09:29, 15 December 2019 (UTC)

== Merge "Approximations that depend on the floating point representation" into "Initial estimate" ==

I believe the section "Approximations that depend on the floating point representation" should be merged into "Initial estimate", since it is a special case of "Binary estimates". Merging would clear up the fact that the floating point trick gives an initial rough approximation, which is then typically iteratively improved.

I also believe the "Initial estimate" section should appear after the section on Heron's method, as the reader is likely more interested in the general idea of iterative refinement than in the details of how to obtain a good initial estimate in all possible ways.

Additionally, in my opinion the entirety of the article could benefit from some trimming/rewriting, as many sections contain redundant information, unnecessary details, and awkward formulations. [[User:BlueRavel|BlueRavel]] ([[User talk:BlueRavel|talk]]) 14:54, 4 December 2023 (UTC)

:: Your proposition makes sense to me, and I don't necessarily disagree. That said, though, as a pure mathematician, I am uninclined to blur the lines between programmatical issues and mathematical problems. I think maintaining a distinction is appropriate. An analysis of the pure mathematical problem of initial estimation in these abstract reiterative processes is a decidedly distinct discussion from considerations in this programming language, or that programming language, or this architecture, or that architecture. The former is future-proofed, the latter is not. [[User:CogitoErgoCogitoSum|CogitoErgoCogitoSum]] ([[User talk:CogitoErgoCogitoSum|talk]]) 21:09, 11 February 2024 (UTC)

== Useful addition?? ==

Not sure if it's useful, but I have found that, in general, <math>\sqrt{x+2} \approx \frac{x+1}{\sqrt{x}}</math>, and if {{math|''x''{{=}}''n''{{sup|''2''}}}} we get <math>\sqrt{n^2+2} \approx n + \frac{1}{n}</math>.

Similarly <math>\sqrt{x+4} \approx \frac{x+2}{\sqrt{x}}</math>.

I sometimes use this for quick pencil-and-paper calculations, if I'm close enough to a convenient value.

Not sure if this is a known or established property (proven, bounded), or if it's already in the article in some alternative capacity, or if it's even appropriate for this article. I do know the Taylor series approximation with two terms connects these expressions.
[[User:CogitoErgoCogitoSum|CogitoErgoCogitoSum]] ([[User talk:CogitoErgoCogitoSum|talk]]) 21:05, 11 February 2024 (UTC)

: There is nothing special about 2 and 4: <math>\sqrt{x+2c} \approx \frac{x+c}{\sqrt{x}}</math> provided that c is small compared to x. This is, in fact, just the first two terms of the series given in the article under the section heading "Taylor series". [[User:JBW|JBW]] ([[User talk:JBW|talk]]) 01:45, 13 February 2024 (UTC)

: I don't think they are useful. In the first, you have replaced a square root and an addition with a square root, an addition, and a division to get an approximate answer. [[User:Bubba73|Bubba73]] <sup>[[User talk:Bubba73|You talkin' to me?]]</sup> 08:02, 13 February 2024 (UTC)

== What's the programming language used here? ==

Which languages have been used to describe the algorithms?
[[Special:Contributions/2003:E0:5F1B:749A:DC42:9B8D:E639:24FC|2003:E0:5F1B:749A:DC42:9B8D:E639:24FC]] ([[User talk:2003:E0:5F1B:749A:DC42:9B8D:E639:24FC|talk]]) 22:09, 7 January 2020 (UTC)

:In general, at Wikipedia we try to avoid using any specific programming language in the articles. If an example is necessary, it is generally done in [[pseudocode]].
:Obviously, articles about a particular language are an exception.
:In this article, it appears that the three examples are in [[C (programming language)]]. But we do ''not'' guarantee that they will work as written. [[User:JRSpriggs|JRSpriggs]] ([[User talk:JRSpriggs|talk]]) 00:38, 8 January 2020 (UTC)
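As an aside, if C examples are kept, the float bit-level estimate can be written without union type punning at all. A hypothetical sketch (not the article's code; memcpy-based punning is well defined in both C and C++, and the magic constant is the standard exponent-halving one, (1&lt;&lt;29) - (1&lt;&lt;22) = 127&lt;&lt;22):

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Crude sqrt estimate for a positive, finite float: copy the bits
   out with memcpy, halve the biased exponent (shifting the mantissa
   along with it), re-bias, and copy the bits back. */
float approx_sqrt_float(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);          /* defined behaviour, unlike a union in C++ */
    u = (u >> 1) + (127u << 22);       /* halve exponent, rough bias correction */
    float y;
    memcpy(&y, &u, sizeof u);
    return y;                          /* within about 7% of sqrt(x) */
}
```

The result is exact for even powers of two (e.g. 4 gives exactly 2) and is close enough elsewhere to seed one or two Newton-Raphson iterations.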