Element distinctness problem: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 10:52, 20 February 2017 edit HaeB (talk \| contribs) Extended confirmed users, Mass message senders, Pending changes reviewers, Rollbackers 55,054 edits m Reverted edits by 93.206.41.241 (talk) to last version by David Eppstein ← Previous edit		Latest revision as of 17:33, 22 December 2024 edit undo David Eppstein (talk \| contribs) Autopatrolled, Administrators 235,788 edits Undid revision 1264526708 by 37.244.251.1 (talk) uncorrect. More than one time is more than n/n times, not more than n/2 times. Tag: Undo
(18 intermediate revisions by 11 users not shown)
Line 9: \| pages = 244–253 \| title = [[Symposium on Theory of Computing\|Proc. 22nd ACM Symposium on Theory of Computing]] \| year = 1990~~}}.</ref>~~\| s2cid = 11943779 \| doi-access = free }}.</ref> Several lower bounds in computational complexity are proved by reducing the element distinctness problem to the problem in question, i.e., by demonstrating that the solution of the element uniqueness problem may be quickly found after solving the problem in question. ==Decision tree complexity== ItThe isnumber ~~known~~of ~~that,~~comparisons ~~for~~needed ~~lists~~to solve the problem of ~~numbers~~size <math>n</math>, ~~the~~in ~~problem's~~a comparison-based model of computation such as a [[~~time~~decision ~~complexity~~tree]] isor [[~~Big~~algebraic Odecision ~~notation\|Θ~~tree]], is <math>\Theta(''n'' \log ''n'')</math>. Here, ~~i.e.~~<math>\Theta</math> invokes [[Big O notation\|big theta notation]], ~~both~~meaning that the ~~upper~~problem ~~and~~can ~~lower~~be ~~bounds~~solved onin ~~its~~a ~~time~~number ~~complexity~~of ~~are~~comparisons ofproportional ~~order~~to of<math>n\log ~~the~~n</math> (a [[linearithmic function]]) inand ~~the~~that ~~[[algebraic~~all ~~decision~~solutions ~~tree]]~~require ~~[[model~~this ofmany ~~computation]],~~comparisons.<ref>{{citation\|first=Michael\|last=Ben-Or\|contribution=Lower bounds for algebraic computation trees\|title=[[Symposium on Theory of Computing\|Proc. 15th ACM Symposium on Theory of Computing]]\|year=1983\|pages=80–86\|doi=10.1145/800061.808735\|doi-access=free}}.</ref> aIn ~~model~~these models of computation ~~in which~~, the ~~elements~~input numbers may not be used to index the computer's memory (as in the hash table solution) but may only be accessed by computing and comparing simple algebraic functions of their values. InFor ~~other~~these ~~words~~models, an algorithm based on [[~~asymptotically~~comparison ~~optimal~~sort]] ~~algorithm~~solves the ofproblem ~~linearithmic~~within ~~time~~a ~~complexity~~constant isfactor ~~known~~of ~~for~~the ~~this~~best ~~model~~possible number of comparisons. The ~~algebraic~~same ~~computation~~lower ~~tree~~bound ~~model~~applies ~~basically~~as ~~means~~well ~~that~~to the ~~allowable~~[[expected ~~algorithms~~value\|expected ~~are~~number]] ~~only~~of comparisons in the ~~ones~~[[randomized ~~that~~complexity\|randomized]] ~~can~~[[algebraic ~~perform~~decision ~~polynomial~~tree]] ~~operations~~model.<ref>{{citation\|doi=10.1007/BF01270387\|title=A oflower ~~bounded~~bound ~~degree~~for onrandomized ~~the~~algebraic ~~input~~decision ~~data~~trees\|year=1996\|last1=Grigoriev\|first1=Dima\|authorlink=Dima ~~and~~Grigoriev\|journal=Computational ~~comparisons~~Complexity\|volume=6\|pages=357\|last2=Karpinski\|first2=Marek\|author2-link=Marek ofKarpinski\|last3=Heide\|first3=Friedhelm ~~the~~Meyer\|last4=Smolensky\|first4=Roman\|issue=4\|s2cid=1462184 ~~results~~}}.</ref><ref>{{citation\|doi=10.1007/s000370050002\|title=Complexity oflower ~~these~~bounds ~~computations~~for randomized computation trees over zero characteristic fields\|year=1999\|last=Grigoriev\|first=Dima\|authorlink=Dima Grigoriev\|journal=Computational Complexity\|volume=8\|pages=316–329\|issue=4\|s2cid=10641238 }}.</ref> ==Real RAM Complexity== The same lower bound was later proved for the [[randomized complexity\|randomized]] [[algebraic decision tree]] model.<ref>{{citation\|doi=10.1007/BF01270387\|title=A lower bound for randomized algebraic decision trees\|year=1996\|last=Grigoriev\|first=Dima\|authorlink=Dima Grigoriev\|journal=Computational Complexity\|volume=6\|pages=357\|last2=Karpinski\|first2=Marek\|author2-link=Marek Karpinski\|last3=Heide\|first3=Friedhelm Meyer\|last4=Smolensky\|first4=Roman\|issue=4}}.</ref><ref>{{citation\|doi=10.1007/s000370050002\|title=Complexity lower bounds for randomized computation trees over zero characteristic fields\|year=1999\|last=Grigoriev\|first=Dima\|authorlink=Dima Grigoriev\|journal=Computational Complexity\|volume=8\|pages=316\|issue=4}}.</ref> If the elements in the problem are [[real number\|real numbers]], the decision-tree lower bound extends to the [[real RAM\|real random-access machine]] model with an instruction set that includes addition, subtraction and multiplication of real numbers, as well as comparison and either division or remaindering ("floor").<ref>{{citation\|doi=10.1137/S0097539797329397\|title=Topological Lower Bounds on Algebraic Random Access Machines\|year=2001\|last1=Ben-Amram\|first1=Amir M.\|journal=SIAM Journal on Computing\|volume=31\|issue=3\|pages=722–761\|last2=Galil\|first2=Zvi\|author2-link=Zvi Galil}}.</ref> It follows that the problem's complexity in this model is also <math>\Theta(n\log n)</math>. This RAM model covers more algorithms than the algebraic decision-tree model, as it encompasses algorithms that use indexing into tables. However, in this model all program steps are counted, not just decisions. ==Turing Machine complexity== A single-tape deterministic [[Turing machine]] can solve the problem, for ''n'' elements of {{math\|''m'' &geq; log ''n''}} bits each, in time {{math\|''O''(''n''<sup>2</sup>''m''(''m''+2–log ''n''))}}, while on a nondeterministic machine the time complexity is {{math\|''O''(''nm''(''n'' + log ''m''))}}.<ref>{{citation\|doi=10.1007/s00236-003-0125-8\|title=Element distinctness on one-tape Turing machines: a complete solution.\|year=2003\|last1=Ben-Amram\|first1=Amir M.\|journal=Acta Informatica\|volume=40\|issue=2\|pages=81–94\|last2=Berkman\|first2=Omer\|last3=Petersen\|first3=Holger\|s2cid=24821585 }}</ref> ==Quantum complexity== ~~It is also known that~~ [[~~quantum~~Quantum algorithm]]s can solve this problem faster, in <math display=inline>\Theta~~\left~~(n^{2/3}~~\right~~)</math> queries. The optimal algorithm is by [[Andris Ambainis]].<ref>{{citation \| last1=Ambainis \| first1=Andris \| author1-link=Andris Ambainis \| title=Quantum walk algorithm for element distinctness \| doi = 10.1137/S0097539705447311 \| journal = [[SIAM Journal on Computing]] \| volume = 37 \| issue = 1 \| year = 2007 \| pages=210–239 \| arxiv=quant-ph/0311001 }}</ref> [[Yaoyun Shi]] first proved a tight lower bound when the size of the range is sufficiently large.<ref> {{cite conference \| last1=Shi \| first1=Y. Line 38 ⟶ 45: \|bibcode= \|doi=10.4086/toc.2005.v001a003 \|doi-access=free }}</ref> and Kutin<ref> {{cite journal \|last1=Kutin \|first1=S. Line 48 ⟶ 56: \|bibcode= \|doi=10.4086/toc.2005.v001a002 \|doi-access=free }}</ref> independently (and via different proofs) extended his work to obtain the lower bound for all functions. ==Generalization: Finding repeated elements== {{main\|Misra–Gries heavy hitters algorithm}} Elements that occur more than ''<math>n''/''k''</math> times in a multiset of size <math>n</math> may be found by a comparison-based algorithm, the [[Misra–Gries heavy hitters algorithm]], in time <math>O(''n'' \log ''k'')</math>. The element distinctness problem is a special case of ''this problem where <math>k''=''n''</math>. This ~~algorithm~~time is optimal under the [[decision tree model]] of computation.<ref>{{citation\|first1=J.\|last1=Misra\|first2=D.\|last2=Gries\|author2-link=David Gries\|title=Finding repeated elements\|journal=Science of Computer Programming\|volume=2\|year=1982\|issue=2\|pages=143–152\|doi=10.1016/0167-6423(82)90012-0\|hdl=1813/6345\|hdl-access=free}}.</ref> The algorithm is a generalization of the one for a special case of ''k''=2 (the [[Boyer–Moore majority vote algorithm]]), which had a rather convoluted history of publication.<ref name=bm>{{citation\|first1=R. S.\|last1=Boyer\|author1-link=Robert S. Boyer\|first2=J S.\|last2=Moore\|author2-link=J Strother Moore\|contribution=MJRTY - A Fast Majority Vote Algorithm\|editor-first=R. S.\|editor-last=Boyer\|title=Automated Reasoning: Essays in Honor of Woody Bledsoe\|series=Automated Reasoning Series\|publisher=Kluwer Academic Publishers\|___location=Dordrecht, The Netherlands\|year=1991\|pages=105–117}}.</ref> The above algorithms rely only on the test of identity of the elements. If sorting is allowed, previously known [[order statistics]] finding algorithms may be exploited. For example, for ''k''=2, a [[median]] may be found first in [[linear time]], and then it may be easily tested whether there are more than ''n''/2 median elements. However the above algorithms require fewer comparisons than the order statistics algorithms.<ref name=bm/> ==See also== Line 63 ⟶ 69: {{reflist}} ~~[[Category:Articles with inconsistent citation formats]]~~ [[Category:Polynomial-time problems]]