Nested set model: Difference between revisions

Content deleted Content added
Drawbacks: Added the primary problem with the model. Removed the need for the citation -- res ipsa loquitur -- for anyone with DB experience.
Line 89:
==Drawbacks==
 
The use case for a dynamic endless database tree hierarchy is rare. The Nested Set model is appropriate where the tree element and one or two attributes are the only data, but is a poor choice when more complex relational data exists for the elements in the tree. Given an arbitrary starting depth for a category of 'Vehicles' and a child of 'Cars' with a child of 'Mercedes', a foreign key table relationship must be established unless the tree table is naively non-normalized. Attributes of a newly created tree item may not share all attributes with a parent, child or even a sibling. If a foreign key table is established for a table of 'Plants' attributes, no integrity is given to the child attribute data of 'Trees' and its child 'Oak'. Therefore, in each case of an item inserted into the the tree, a foreign key table of the items attributes must be created during runtime for all but the most trivial of use cases. If the tree isn't expected to change often, a properly normalized hierarchy of attribute tables can be created in the initial design of a system, leading to simpler, more portable SQL statements; specifically ones that don't require an arbitrary number of runtime, programmatically created or deleted tables for changes to the tree. For more complex systems, hierarchy can be developed through relational models rather than an implicit numeric tree structure. Depth of an item is simply another attribute rather than the basis for an entire DB architecture.
Nested sets are very slow for inserts because it requires updating left and right ___domain values for all records in the table after the insert. This can cause a lot of database thrash{{Citation needed|date=August 2012}} as many rows are rewritten and indexes rebuilt. However, if it is possible to store a forest of small trees in table instead of a single big tree, the overhead may be significantly reduced, since only one small tree must be updated.
 
The model doesn't allow for multiple parent categories. For example, an 'Oak' could be a child of 'Tree-Type', but also 'Wood-Type'. An additional tagging or taxonomy has to be established to accommodate this, again leading to a design more complex than a straightforward fixed model.
The [[Nested intervals|nested interval model]] does not suffer from this problem, but is more complex to implement, and is not as well known. The nested interval model stores the position of the nodes as rational numbers expressed as quotients (n/d). [http://www.sigmod.org/publications/sigmod-record/0506/p47-article-tropashko.pdf]
 
Nested sets are very slow for inserts because it requires updating left and right ___domain values for all records in the table after the insert. This can cause a lot of database thrash{{Citation needed|date=August 2012}}stress as many rows are rewritten and indexes rebuilt. However, if it is possible to store a forest of small trees in table instead of a single big tree, the overhead may be significantly reduced, since only one small tree must be updated.
 
The [[Nested intervals|nested interval model]] does not suffer from this problem, but is more complex to implement, and is not as well known. It still suffers from the relational foreign-key table problem. The nested interval model stores the position of the nodes as rational numbers expressed as quotients (n/d). [http://www.sigmod.org/publications/sigmod-record/0506/p47-article-tropashko.pdf]
 
==Variations==