Talk:Self-balancing binary search tree: Difference between revisions

{{WikiProject banner shell|class=Start|1=
{{WikiProject Computing|importance=mid}}
{{WikiProject Computer science|importance=mid}}
}}
{{Broken anchors|links=
* <nowiki>[[Geometric series#Closed-form formula|2<sup>0</sup>+2<sup>1</sup>+···+2<sup>''h''</sup>&nbsp;=&nbsp;2<sup>''h''+1</sup>−1]]</nowiki> The anchor (#Closed-form formula) has been [[Special:Diff/1129004581|deleted by other users]] before. <!-- {"title":"Closed-form formula","appear":{"revid":997328259,"parentid":997216384,"timestamp":"2020-12-31T00:41:26Z","removed_section_titles":["Formula"],"added_section_titles":["Closed-form formula"]},"disappear":{"revid":1129004581,"parentid":1129004434,"timestamp":"2022-12-23T03:31:31Z","removed_section_titles":["Closed-form formula"],"added_section_titles":[]}} -->
}}
 
== Missing citation on overview ==
 
The overview claims that "Maintaining the height always at its minimum value <math>\lfloor\log_2 n\rfloor</math> is not always viable; it can be proven that any insertion algorithm which did so would have an excessive overhead." with a missing citation. I believe this statement is incorrect. I sketched a data structure, backed by a sorted list, which produces a perfectly height-balanced binary search tree with <math>O(\log_2 n)</math> worst-case insertion time. This matches the insertion time of an AVL tree, so "excessive overhead" cannot mean a slower asymptotic runtime.
 
I followed one of the provided citations to The Art of Computer Programming Volume 3 (Donald Knuth), and on page 476 there is the following quote: "However, it appears that the bookkeeping required for maintaining the weight balance takes more time than Algorithm A [which is similar to an AVL tree]". I think the intended meaning is NOT that the asymptotic runtime is greater than O(<math>\log_2 n</math>), but rather that the constant factor of maintaining a perfectly balanced BST is possibly higher, and the code complexity increases. Based on the data structure I sketched, there is additional overhead per node in maintaining the weight balance (left and right weights) plus the neighbor nodes (two pointers; this can be simplified to a single RANK field plus a lookup, as in Knuth's book). This is in contrast to the AVL tree's two bits of left/right balance information per node.
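The minimum-height construction under discussion can be checked with a short sketch (illustrative Python; the nested-tuple tree representation is mine, not part of the structure described above). Building a BST by recursively splitting a sorted list at its midpoint yields height exactly <math>\lfloor\log_2 n\rfloor</math>:

```python
import math

def build_balanced(sorted_keys):
    """Build a perfectly height-balanced BST (as nested (left, key, right) tuples)
    from a sorted list, by rooting the tree at the middle element."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return (build_balanced(sorted_keys[:mid]),
            sorted_keys[mid],
            build_balanced(sorted_keys[mid + 1:]))

def height(node):
    """Height counted in edges; the empty tree has height -1."""
    if node is None:
        return -1
    left, _, right = node
    return 1 + max(height(left), height(right))

for n in (1, 2, 7, 8, 100, 1000):
    tree = build_balanced(list(range(n)))
    assert height(tree) == math.floor(math.log2(n))
```

This only demonstrates that the minimum height is achievable statically; the dynamic-insertion cost is exactly the bookkeeping question debated above.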
 
I propose that the line in question be changed so that it does not state that it is impossible to maintain such a structure without sacrificing performance; instead, it should say something like
"While it is possible to maintain a BST of minimum height with the expected O(log n) operations (insertion/deletion/search), in practice the additional space requirements needed to maintain such a structure outweigh the decrease in search time. An AVL tree is guaranteed to be within a factor of 1.44 of the optimal height [cite Knuth here] while requiring only two additional bits of storage in a naive implementation." An additional note can be made that AVL trees tend to come even closer to the optimal height on average: Knuth, on page 468 of the same book, reports empirical results showing a multiplicative factor as low as 1.01 for large N.
 
[[User:Rushingseas8|Rushingseas8]] ([[User talk:Rushingseas8|talk]]) 17:36, 8 March 2019 (UTC)
 
== tree ==
That section should mention the differences between the various algorithms, which ones are good, which ones aren't, and which perform better in certain situations. [[User:Gerbrant|Shinobu]] ([[User talk:Gerbrant|talk]]) 03:14, 7 December 2007 (UTC)
:Good idea, this is an appropriate article for comparison. [[User:Dcoetzee|Dcoetzee]] 03:32, 7 December 2007 (UTC)
 
A 2-3 tree is of course a ''self-balancing search tree'', but it is '''not''' "binary".
[[User:Midenok|Aleksey Midenkov]] ([[User talk:Midenok|talk]]) 06:09, 18 August 2016 (UTC)
 
== Proof of min size ==
Thus, search is O(log(n)) in both cases.
[[Special:Contributions/2A01:E35:2F45:9A0:BA76:3FFF:FEF7:E4D5|2A01:E35:2F45:9A0:BA76:3FFF:FEF7:E4D5]] ([[User talk:2A01:E35:2F45:9A0:BA76:3FFF:FEF7:E4D5|talk]]) 22:18, 20 November 2014 (UTC)
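The height lower bound underlying the minimum-size proof follows from the geometric series 2<sup>0</sup>+2<sup>1</sup>+···+2<sup>''h''</sup> = 2<sup>''h''+1</sup>−1 linked at the top of this page: a binary tree of height ''h'' holds at most 2<sup>''h''+1</sup>−1 nodes, so ''n'' nodes force height at least ⌈log<sub>2</sub>(''n''+1)⌉−1. A quick check (illustrative Python):

```python
import math

def max_nodes(h):
    """A binary tree of height h (in edges) has at most
    2^0 + 2^1 + ... + 2^h = 2^(h+1) - 1 nodes."""
    return sum(2**level for level in range(h + 1))

# Closed form matches the level-by-level sum.
for h in range(10):
    assert max_nodes(h) == 2**(h + 1) - 1

# Hence any binary tree on n nodes has height >= ceil(log2(n+1)) - 1.
for n in (1, 10, 1000):
    h_min = math.ceil(math.log2(n + 1)) - 1
    assert max_nodes(h_min) >= n               # this height suffices
    if h_min > 0:
        assert max_nodes(h_min - 1) < n        # one level fewer does not
```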
 
:Er, what?
:> ''the assumption that the cost of calculating a hash and performing a comparison is comparable asymptotically, which is not the case when n goes to infinity''
:The complexity of calculating the hash and comparing values is assumed to be constant, just like greater/less comparisons in a tree lookup aren't factored into the O() complexity. That's a fair assumption given constant-length keys.
:> ''Indeed, if you have a set of n elements, you need at least log(n) bits to represent them''
:This applies just as well to a binary tree structure that needs to store left/right pointers: pointer size needs to be at least log(n) bits. So by your argument its complexity should be O(log(n)&nbsp;×&nbsp;log(n))?
:-- [[user:intgr|intgr]]&nbsp;<small>[[user talk:intgr|[talk]]]</small> 23:21, 20 November 2014 (UTC)