Talk:Algorithms for calculating variance: Difference between revisions

Parallel algorithm: - is there an error?
m 2 revisions imported: import old edits from "Algorithms for calculating variance/Talk" in the August 2001 database dump
 
{{WikiProject banner shell| class = Start|
{{WikiProject Statistics| importance = mid }}
{{WikiProject Mathematics| importance = mid }}
}}
 
== Online algorithm in testing yields horrible results when mean=~0 ==
: Btw, I don't understand your comment at the beginning since the problematic division is outside the loop. [[User:McKay|McKay]] ([[User talk:McKay|talk]]) 01:02, 10 April 2009 (UTC)
::: It should not say "else variance = 0". That implies the sample variance is 0 when ''n'' = 1. That is incorrect. The sample variance is undefined in that case. [[User:Michael Hardy|Michael Hardy]] ([[User talk:Michael Hardy|talk]]) 01:18, 10 April 2009 (UTC)
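For comparison, Python's standard library appears to take the same view as the comment above: <code>statistics.variance</code> raises <code>StatisticsError</code> when given fewer than two data points, while <code>statistics.pvariance</code> accepts a single point. A minimal sketch follows; the wrapper name <code>sample_variance_or_none</code> is hypothetical and only for illustration.

```python
import statistics


def sample_variance_or_none(data):
    """Return the sample variance, or None when it is undefined (n < 2).

    Hypothetical helper: it wraps statistics.variance, which raises
    StatisticsError for fewer than two data points.
    """
    try:
        return statistics.variance(data)
    except statistics.StatisticsError:
        return None


single = sample_variance_or_none([5.0])                # undefined, so None
several = sample_variance_or_none([4.0, 7.0, 13.0, 16.0])
population_single = statistics.pvariance([5.0])        # defined for n = 1
```

Returning <code>None</code> (or NaN) rather than 0 keeps "undefined" distinguishable from a genuine zero variance.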
 
{{reflist-talk}}
 
== Easiest online algorithm yet. ==
<math>\begin{aligned}s_n^2 &= \frac{M_{2,n}}{n-1}\\[4pt]\sigma_n^2 &= \frac{M_{2,n}}{n}\end{aligned}</math>
[[User:ProfRB|ProfRB]] ([[User talk:ProfRB|talk]]) 18:51, 22 March 2019 (UTC)
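The two finalization formulas above can be sketched in Python as follows. This is only an illustration, assuming the usual Welford-style update for <math>M_{2,n}</math>; the function names <code>update</code> and <code>finalize</code> and the tuple layout <code>(n, mean, m2)</code> are assumptions, not taken from the discussion.

```python
import math


def update(state, x):
    """Fold one observation x into the running (n, mean, M2) state."""
    n, mean, m2 = state
    n += 1
    delta = x - mean            # deviation from the old mean
    mean += delta / n
    m2 += delta * (x - mean)    # uses the old and the new mean
    return n, mean, m2


def finalize(state):
    """Return (mean, sample variance, population variance).

    The sample variance M2 / (n - 1) is reported as NaN when n < 2,
    and the population variance M2 / n as NaN when n == 0.
    """
    n, mean, m2 = state
    sample = m2 / (n - 1) if n > 1 else math.nan
    population = m2 / n if n > 0 else math.nan
    return mean, sample, population


state = (0, 0.0, 0.0)
for x in [4.0, 7.0, 13.0, 16.0]:
    state = update(state, x)
mean, sample_var, population_var = finalize(state)
```

Keeping <math>M_{2,n}</math> as the running quantity and dividing only at the end means both variances come from the same state.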
 
:2. I think the comment "These formulas suffer from numerical instability, as they repeatedly subtract a small number from a big number which scales with n" is actually wrong. Maybe there is an '''accuracy''' issue, but I don't see why there should be an instability here. A reference would be most welcome. [[User:Natchouf|Natchouf]] ([[User talk:Natchouf|talk]]) 14:08, 20 March 2023 (UTC)
 
== Typo in "Computing shifted data" section ==
 
The second formula shown in this section does not compute the population variance <math>\sigma^2</math>; it computes the sample variance <math>s^2</math>. This is easy to see by comparing the (wrong) divisor <math>n-1</math> in that formula with the divisor <math>n</math> used in the section "Naive algorithm" just above (where capital <math>N</math> is used instead of <math>n</math>).
 
The comments in the corresponding code snippet below make the situation a bit clearer. Maybe one could write the code explicitly as
 
...
# for sample variance use
variance = (Ex2 - Ex**2 / n) / (n - 1)
# for population variance use
# variance = (Ex2 - Ex**2 / n) / n
...
 
[[Special:Contributions/141.249.133.134|141.249.133.134]] ([[User talk:141.249.133.134|talk]]) 06:17, 10 April 2024 (UTC)
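A self-contained version of the snippet suggested above might look like the sketch below. The names <code>Ex</code> and <code>Ex2</code> follow the comment; the function name <code>shifted_variance</code>, the <code>sample</code> flag, and the default shift (the first data point) are assumptions made for illustration.

```python
def shifted_variance(data, shift=None, sample=True):
    """Variance computed from data shifted by a constant.

    sample=True divides by n - 1 (sample variance, needs n >= 2);
    sample=False divides by n (population variance).
    shift defaults to the first data point (an assumption of this sketch);
    shift=0.0 reduces to the unshifted sum-of-squares formula.
    """
    if shift is None:
        shift = data[0]
    n = 0
    Ex = 0.0    # running sum of (x - shift)
    Ex2 = 0.0   # running sum of (x - shift)**2
    for x in data:
        d = x - shift
        n += 1
        Ex += d
        Ex2 += d * d
    divisor = n - 1 if sample else n
    return (Ex2 - Ex * Ex / n) / divisor


small = [4.0, 7.0, 13.0, 16.0]
large = [1e9 + x for x in small]   # same spread, large mean

s2 = shifted_variance(small)                        # sample variance
sigma2 = shifted_variance(small, sample=False)      # population variance
s2_large = shifted_variance(large)                  # shifted by the first point
s2_large_unshifted = shifted_variance(large, shift=0.0)
```

With the shift near the data, the sums stay small and the large-mean data gives the same result as the small data; with <code>shift=0.0</code> the subtraction of two nearly equal large sums can lose most of the significant digits.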