Revision as of 14:02, 23 August 2013 by 129.132.157.229 (→Examples where precision is no indicator of accuracy)
Revision as of 20:50, 11 September 2013 by John of Reading (m Typo/general fixing, replaced: the the → the using AWB)
Line 23:
{{cite book \|title=Advanced Excel for scientific data analysis \|publisher=Oxford University Press \|author=Robert de Levie \|year=2004 \|isbn=0-19-515275-1 \|page=44 \|chapter=Algorithmic accuracy \|url=http://www.amazon.com/Advanced-Excel-Scientific-Data-Analysis/dp/0195152751/ref=sr_1_1?ie=UTF8&s=books&qid=1270770876&sr=1-1#reader_0195152751}} </ref> To illustrate, the lower figure tabulates the simple addition {{nowrap\|1 + ''x'' − 1}} for several values of ''x''. All the values of ''x'' begin at the 15th decimal place, so Excel must take them into account. Before calculating the sum 1 + ''x'', Excel first approximates ''x'' as a binary number. If this binary version of ''x'' is a simple power of 2, the 15-digit decimal approximation to ''x'' is stored in the sum, and the top two examples of the figure indicate recovery of ''x'' without error. In the third example, ''x'' is a more complicated binary number, ''x'' = 1.110111⋯111 × 2<sup>−49</sup> (15 bits altogether). Here ''x'' is approximated by the 4-bit binary 1.111 × 2<sup>−49</sup> (some insight into this approximation can be found using [[geometric progression]]: ''x'' = 1.11 × 2<sup>−49</sup> + 2<sup>−52</sup> × (1 − 2<sup>−11</sup>) ≈ 1.11 × 2<sup>−49</sup> + 2<sup>−52</sup> = 1.111 × 2<sup>−49</sup>) and the decimal equivalent of this crude 4-bit approximation is used. In the fourth example, ''x'' is a ''decimal'' number not equivalent to a simple binary (although it agrees with the binary of the third example to the precision displayed). The decimal input is approximated by a binary and then ''that'' decimal is used. These two middle examples in the figure show that some error is introduced. The last two examples illustrate what happens if ''x'' is a rather small number. In the second-from-last example, ''x'' = 1.110111⋯111 × 2<sup>−50</sup> (15 bits altogether). The binary is replaced very crudely by a single power of 2 (in this example, 2<sup>−49</sup>) and its decimal equivalent is used. 
In the bottom example, a decimal identical with the binary above to the precision shown is nonetheless approximated differently from the binary and is eliminated by truncation to 15 significant figures, making no contribution to {{nowrap\|1 + ''x'' − 1}}, leading to ''x'' = 0.<ref name=decimal_input>
Line 57:
==Examples where precision is no indicator of accuracy==
{{Expand section\|date=April 2010}}
===Statistical functions===
[[File:Excel Std Dev Error.PNG\|thumb\|450px\|Error in Excel 2007 calculation of standard deviation. All four columns have the same deviation of 0.5]]
Line 62 ⟶ 63:
Accuracy in Excel-provided functions can be an issue. [[Micah Altman]] ''et al.'' provide this example:<ref name=Altman> {{cite book \|title=Numerical issues in statistical computing for the social scientist \|author= Micah Altman, Jeff Gill, Michael McDonald \|year=2004 \|publisher=Wiley-IEEE \|isbn=0-471-23633-0 \|url=http://books.google.com/books?id=j_KevqVO3zAC&pg=PA12 \|chapter=§2.1.1 Revealing example: Computing the coefficient standard deviation \|page=12}} </ref> The population standard deviation is given by:
Line 75 ⟶ 76:
{{cite book \|title=Advanced Excel for scientific data analysis \|author=Robert de Levie \|publisher=Oxford University Press \|year=2004 \|isbn=0-19-515275-1 \|url=http://books.google.com/books?id=IAnO-2qVazsC&printsec=frontcove\|pages=45–46}} </ref>
Line 83:
===Subtraction of Subtraction Results===
Doing simple subtractions may lead to errors.<br /> As an example we build a difference of ~~the~~ the cells <br />
:<math>A1: 28.552</math>
:<math>A2: 27.399</math>
Line 105:
:<math>x= \frac{-b \pm \sqrt{b^2-4ac} }{2a}. </math>
When one of these roots is very large compared to the other, that is, when the square root is close to the value ''b'', the evaluation of the root corresponding to subtraction of the two terms becomes very inaccurate due to round-off. 
It is possible to determine the round-off error by using the [[Taylor series]] formula for the square root:<ref name=Ryzhik>
Line 124:
:<math>b - \sqrt{b^2-4ac} \approx b - b + \varepsilon. </math>
Under these circumstances, all the significant figures go into expressing ''b''. For example, if the precision is 15 figures, and these two numbers, ''b'' and the square root, are the same to 15 figures, the difference will be zero instead of the difference ε. Better accuracy can be obtained from a different approach, outlined below.<ref name=Step_response>
Line 142:
These results are not subject to round-off error, but they are not accurate unless ''b''<sup>2</sup> is large compared to ''ac''. [[File:Excel quadratic error.PNG\|thumb \|350px\| Excel graph of the difference between two evaluations of the smallest root of a quadratic: direct evaluation using the quadratic formula (accurate at smaller ''b'') and an approximation for widely spaced roots (accurate for larger ''b''). The difference reaches a minimum at the large dots, and round-off causes squiggles in the curves beyond this minimum.]] The bottom line is that in doing this calculation using Excel, as the roots become farther apart in value, the method of calculation will have to switch from direct evaluation of the quadratic formula to some other method so as to limit round-off error. The point to switch methods varies according to the size of coefficients ''a'' and ''b''. In the figure, Excel is used to find the smallest root of the quadratic equation ''x''<sup>2</sup> + ''bx'' + ''c'' = 0 for ''c'' = 4 and ''c'' = 4 × 10<sup>5</sup>. The difference between direct evaluation using the quadratic formula and the approximation described above for widely spaced roots is plotted ''vs.'' ''b''. Initially the difference between the methods declines because the widely spaced root method becomes more accurate at larger ''b''-values. 
However, beyond some ''b''-value the difference increases because the quadratic formula (good for smaller ''b''-values) becomes worse due to round-off, while the widely spaced root method (good for large ''b''-values) continues to improve. The point to switch methods is indicated by large dots, and is larger for larger ''c''-values. At large ''b''-values, the upward sloping curve is Excel's round-off error in the quadratic formula, whose erratic behavior causes the curves to squiggle.
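The cancellation described above can be reproduced outside Excel in any IEEE 754 double-precision environment. The following is a minimal Python sketch, not Excel's own arithmetic (Excel applies additional 15-digit rounding that is not modelled here). It evaluates the smallest root of ''x''<sup>2</sup> + ''bx'' + ''c'' = 0 for ''b'' > 0 in two ways: directly from the quadratic formula, and through the product of the roots, which avoids subtracting nearly equal numbers. The function names are hypothetical, and the product-of-roots rearrangement is a standard alternative swapped in for illustration; it is not necessarily the approximation plotted in the figure.

```python
import math


def smallest_root_direct(b: float, c: float) -> float:
    """Smallest-magnitude root of x^2 + b*x + c = 0 via the quadratic formula.

    Assumes b > 0 and b*b > 4*c. When b*b is much larger than 4*c, the
    square root is nearly equal to b and the subtraction loses digits.
    """
    return (-b + math.sqrt(b * b - 4.0 * c)) / 2.0


def smallest_root_stable(b: float, c: float) -> float:
    """Same root, computed without subtracting nearly equal numbers.

    Assumes b > 0 and b*b > 4*c. The large-magnitude root is formed by
    adding two terms of the same sign; the small root then follows from
    the product of the roots, which equals c when a = 1.
    """
    large_root = -(b + math.sqrt(b * b - 4.0 * c)) / 2.0
    return c / large_root


if __name__ == "__main__":
    c = 4.0
    for b in (1e2, 1e4, 1e6, 1e8):
        direct = smallest_root_direct(b, c)
        stable = smallest_root_stable(b, c)
        # The relative gap between the two evaluations grows with b.
        gap = abs(direct - stable) / abs(stable)
        print(f"b={b:.0e}  direct={direct:.17g}  stable={stable:.17g}  gap={gap:.3g}")
```

For ''c'' = 4 the two evaluations agree closely at small ''b'' and drift apart as ''b'' grows, which is the same qualitative behavior as the upward sloping part of the curves in the figure.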

Numeric precision in Microsoft Excel: Difference between revisions