Average-case complexity: Difference between revisions

Content deleted Content added
Further reading: more citations
Citation bot (talk | contribs)
Removed URL that duplicated identifier. | Use this bot. Report bugs. | Suggested by Abductive | Category:Analysis of algorithms | #UCB_Category 22/47
 
(11 intermediate revisions by 7 users not shown)
Line 2:
In [[computational complexity theory]], the '''average-case complexity''' of an [[algorithm]] is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with [[worst-case complexity]] which considers the maximal complexity of the algorithm over all possible inputs.
 
There are three primary motivations for studying average-case complexity.<ref name="gol07">{{Cite journal |lastlast1=Goldreich |firstfirst1=Oded |last2=Vadhan |first2=Salil |date=December 2007-12 |title=Special Issue On Worst-case Versus Average-case Complexity Editors’Editors' Foreword |url=https://link.springer.com/10.1007/s00037-007-0232-y |journal=computationalComputational complexityComplexity |language=en |volume=16 |issue=4 |pages=325–330 |doi=10.1007/s00037-007-0232-y |issn=1016-3328|doi-access=free }}</ref> First, although some problems may be intractable in the worst-case, the inputs which elicit this behavior may rarely occur in practice, so the average-case complexity may be a more accurate measure of an algorithm's performance. Second, average-case complexity analysis provides tools and techniques to generate hard instances of problems which can be utilized in areas such as [[cryptography]] and [[derandomization]]. Third, average-case complexity allows discriminating the most efficient algorithm in practice among algorithms of equivalent best case complexity (for instance [[Quicksort#Formal analysis|Quicksort]]).
 
Average-case analysis requires a notion of an "average" input to an algorithm, which leads to the problem of devising a [[probability distribution]] over inputs. Alternatively, a [[randomized algorithm]] can be used. The analysis of such algorithms leads to the related notion of an '''expected complexity'''.<ref name="clrs"/>{{rp|28}}
Line 8:
==History and background==
 
The average-case performance of algorithms has been studied since modern notions of computational efficiency were developed in the 1950s. Much of this initial work focused on problems for which worst-case polynomial time algorithms were already known.<ref name="bog06">A.{{Cite journal |last1=Bogdanov and L.|first1=Andrej |last2=Trevisan, "|first2=Luca |date=2006 |title=Average-Case Complexity," |url=http://www.nowpublishers.com/article/Details/TCS-004 |journal=Foundations and Trends in Theoretical Computer Science, Vol.|language=en |volume=2, No |issue=1 (2006) |pages=1–106 |doi=10.1561/0400000004 |issn=1551-305X|url-access=subscription }}</ref> In 1973, [[Donald Knuth]]<ref name="knu73">D.{{cite book
| last = Knuth, | first = Donald | title = [[The Art of Computer Programming]]. Vol.| volume = 3, | publisher = Addison-Wesley, | date = 1973.
}}</ref> published Volume 3 of the [[Art of Computer Programming]] which extensively surveys average-case performance of algorithms for problems solvable in worst-case polynomial time, such as sorting and median-finding.
 
An efficient algorithm for [[NP-complete|{{math|'''NP'''}}-complete]] problems is generally characterized as one which runs in polynomial time for all inputs; this is equivalent to requiring efficient worst-case complexity. However, an algorithm which is inefficient on a "small" number of inputs may still be efficient for "most" inputs that occur in practice. Thus, it is desirable to study the properties of these algorithms where the average-case complexity may differ from the worst-case complexity and find methods to relate the two.
 
The fundamental notions of average-case complexity were developed by [[Leonid Levin]] in 1986 when he published a one-page paper<ref name="levin86">L.{{Cite journal |last=Levin, "|first=Leonid A. |date=February 1986 |title=Average caseCase completeComplete problems,"Problems |url=http://epubs.siam.org/doi/10.1137/0215020 |journal=SIAM Journal on Computing, vol.|language=en |volume=15, no. |issue=1, pp. |pages=285–286, 1986|doi=10.1137/0215020 |issn=0097-5397|url-access=subscription }}</ref> defining average-case complexity and completeness while giving an example of a complete problem for {{math|'''distNP'''}}, the average-case analogue of [[NP (complexity)|{{math|'''NP'''}}]].
 
==Definitions==
Line 26 ⟶ 28:
</math>
 
for every {{math|''n'', ''t'' &gt; 0}} and polynomial {{mvar|p}}, where {{math|''t''<sub>''A''</sub>(''x'')}} denotes the running time of algorithm {{mvar|A}} on input {{mvar|x}}, and {{mvar|ε}} is a positive constant value.<ref name="wangsurvey">J.{{cite book |last1=Wang, "Average|first1=Jie |url=https://www.cs.uml.edu/~wang/acc-case computational complexity theory," forum/avgcomp.pdf |title=Complexity Theory: Retrospective II, |date=1997 |publisher=Springer Science & Business Media |editor-last1=Hemaspaandra |editor-first1=Lane ppA. |volume=2 |pages=295–328, 1997|chapter=Average-case computational complexity theory |editor-last2=Selman |editor-first2=Alan L.}}</ref> Alternatively, this can be written as
 
:<math>
Line 32 ⟶ 34:
</math>
 
for some constants {{mvar|C}} and {{mvar|ε}}, where {{math|''n'' {{=}} {{abs|''x''}}}}.<ref name="ab09">S.{{cite book |last1=Arora and B. Barak,|first1=Sanjeev |title=Computational Complexity: A Modern Approach, |last2=Barak |first2=Boaz |date=2009 |publisher=Cambridge University Press, |___location=Cambridge; New York, NY, 2009|chapter=18. Average case complexity: Levin’s theory}}</ref> In other words, an algorithm {{mvar|A}} has good average-case complexity if, after running for {{math|''t''<sub>''A''</sub>(''n'')}} steps, {{mvar|A}} can solve all but a {{math|{{sfrac|''n''<sup>''c''</sup>|(''t''<sub>''A''</sub>(''n''))<sup>''ε''</sup>}}}} fraction of inputs of length {{mvar|n}}, for some {{math|''ε'', ''c'' &gt; 0}}.<ref name="bog06"/>
 
===Distributional problem===
Line 66 ⟶ 68:
In his original paper, Levin showed an example of a distributional tiling problem that is average-case {{math|'''NP'''}}-complete.<ref name="levin86"/> A survey of known {{math|'''distNP'''}}-complete problems is available online.<ref name="wangsurvey"/>
 
One area of active research involves finding new {{math|'''distNP'''}}-complete problems. However, finding such problems can be complicated due to a result of Gurevich which shows that any distributional problem with a flat distribution cannot be {{math|'''distNP'''}}-complete unless [[EXP|{{math|'''EXP'''}}]] = [[NEXP|{{math|'''NEXP'''}}]].<ref name="gur87">Y.{{Cite book |last=Gurevich, "|first=Yuri |chapter=Complete and incomplete randomized NP problems", Proc.|date=October 1987 |title=28th Annual Symp.Symposium on Found.Foundations of Computer Science, IEEE(SFCS (1987), pp. |pages=111–117 |doi=10.1109/SFCS.1987.14|isbn=0-8186-0807-2 }}</ref> (A flat distribution {{mvar|μ}} is one for which there exists an {{math|''ε'' &gt; 0}} such that for any {{mvar|x}}, {{math|''μ''(''x'') ≤ 2<sup>−{{abs|''x''}}<sup>''ε''</sup></sup>}}.) A result by Livne shows that all natural {{math|'''NP'''}}-complete problems have {{math|'''DistNP'''}}-complete versions.<ref name="livne06">N.{{Cite journal |last=Livne, "|first=Noam |date=December 2010 |title=All Natural NP-Complete Problems Have Average-Case Complete Versions," |url=http://link.springer.com/10.1007/s00037-010-0298-9 |journal=Computational Complexity (2010)|language=en |volume=19:477. https://|issue=4 |pages=477–499 |doi.org/=10.1007/s00037-010-0298-9 |issn=1016-3328|url-access=subscription }}</ref> However, the goal of finding a natural distributional problem that is {{math|'''DistNP'''}}-complete has not yet been achieved.<ref name="gol97">{{Citation |last=Goldreich |first=Oded |title=Notes on Levin’sLevin's Theory of Average-Case Complexity |date=2011 |work=Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation |series=Lecture Notes in Computer Science |volume=6650 |pages=233–247 |editor-last=Goldreich |editor-first=Oded |url=http://link.springer.com/10.1007/978-3-642-22670-0_21 |access-date=2025-05-21 |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-22670-0_21 |isbn=978-3-642-22669-4|url-access=subscription }}</ref>
 
==Applications==
 
===Sorting algorithms===
As mentioned above, much early work relating to average-case complexity focused on problems for which polynomial-time algorithms already existed, such as sorting. For example, many sorting algorithms which utilize randomness, such as [[Quicksort]], have a worst-case running time of {{math|O(''n''<sup>2</sup>)}}, but an average-case running time of {{math|O(''n'' log(''n''))}}, where {{mvar|n}} is the length of the input to be sorted.<ref name="clrs">{{cite book | last1 = Cormen, | first1 = Thomas H.; | last2 = Leiserson, | first2 = Charles E., | last3 = Rivest, | first3 = Ronald L., | last4 = Stein, | first4 = Clifford (2009)| [1990].title = Introduction to Algorithms (| edition = 3rd ed.).| date = 2009 | orig-year = 1990 | publisher = MIT Press and McGraw-Hill. {{ISBN| isbn=978-0-262-03384-48 |url=https://www.worldcat.org/title/311310321 |oclc=311310321}}.</ref>
 
===Cryptography===
For most problems, average-case complexity analysis is undertaken to find efficient algorithms for a problem that is considered difficult in the worst-case. In cryptographic applications, however, the opposite is true: the worst-case complexity is irrelevant; we instead want a guarantee that the average-case complexity of every algorithm which "breaks" the cryptographic scheme is inefficient.<ref name="katz07">J.{{Cite book |last1=Katz and|first1=Jonathan Y.|title=Introduction to modern cryptography |last2=Lindell, Introduction|first2=Yehuda to|date=2021 Modern|publisher=CRC CryptographyPress (|isbn=978-1-351-13303-6 |edition=3 |series=Chapman and& Hall/CrcCRC Cryptographycryptography and Networknetwork Securitysecurity Series),series Chapman|___location=Boca and Hall/CRCRaton, 2007.FL}}</ref>{{pn|date=May 2025}}
 
Thus, all secure cryptographic schemes rely on the existence of [[one-way functions]].<ref name="bog06"/> Although the existence of one-way functions is still an open problem, many candidate one-way functions are based on hard problems such as [[integer factorization]] or computing the [[discrete log]]. Note that it is not desirable for the candidate function to be {{math|'''NP'''}}-complete since this would only guarantee that there is likely no efficient algorithm for solving the problem in the worst case; what we actually want is a guarantee that no efficient algorithm can solve the problem over random inputs (i.e. the average case). In fact, both the integer factorization and discrete log problems are in {{math|'''NP''' ∩ }}[[coNP|{{math|'''coNP'''}}]], and are therefore not believed to be {{math|'''NP'''}}-complete.<ref name="ab09"/> The fact that all of cryptography is predicated on the existence of average-case intractable problems in {{math|'''NP'''}} is one of the primary motivations for studying average-case complexity.
Line 89 ⟶ 91:
In 1990, Impagliazzo and Levin showed that if there is an efficient average-case algorithm for a {{math|'''distNP'''}}-complete problem under the uniform distribution, then there is an average-case algorithm for every problem in {{math|'''NP'''}} under any polynomial-time samplable distribution.<ref name="imp90">R. Impagliazzo and L. Levin, "No Better Ways to Generate Hard NP Instances than Picking Uniformly at Random," in Proceedings of the 31st IEEE Sympo- sium on Foundations of Computer Science, pp. 812–821, 1990.</ref> Applying this theory to natural distributional problems remains an outstanding open question.<ref name="bog06"/>
 
In 1992, Ben-David et al. showed that if all languages in {{math|'''distNP'''}} have good-on-average decision algorithms, they also have good-on-average search algorithms. Further, they show that this conclusion holds under a weaker assumption: if every language in {{math|'''NP'''}} is easy on average for decision algorithms with respect to the uniform distribution, then it is also easy on average for search algorithms with respect to the uniform distribution.<ref name="bd92">S.{{Cite book |last1=Ben-David, [[Benny Chor|Bfirst1=S. |last2=Chor]], O|first2=B. |last3=Goldreich, and M|first3=O. Luby, "|chapter=On the theory of average case complexity," Journal|date=1989 |title=Proceedings of Computerthe andtwenty-first Systemannual Sciences,ACM vol.symposium 44,on no.Theory 2,of ppcomputing - STOC '89 |chapter-url=http://portal.acm.org/citation.cfm?doid=73007.73027 193–219,|language=en 1992|publisher=ACM Press |pages=204–216 |doi=10.1145/73007.73027 |isbn=978-0-89791-307-2}}</ref> Thus, cryptographic one-way functions can exist only if there are {{math|'''distNP'''}} problems over the uniform distribution that are hard on average for decision algorithms.
 
In 1993, Feigenbaum and Fortnow showed that it is not possible to prove, under non-adaptive random reductions, that the existence of a good-on-average algorithm for a {{math|'''distNP'''}}-complete problem under the uniform distribution implies the existence of worst-case efficient algorithms for all problems in {{math|'''NP'''}}.<ref name="ff93">J.{{Cite journal |last1=Feigenbaum and L.|first1=Joan |last2=Fortnow, "|first2=Lance |date=October 1993 |title=Random-selfSelf-reducibilityReducibility of completeComplete sets,"Sets |url=http://epubs.siam.org/doi/10.1137/0222061 |journal=SIAM Journal on Computing, vol.|language=en |volume=22, pp.|issue=5 |pages=994–1005, 1993|doi=10.1137/0222061 |issn=0097-5397|url-access=subscription }}</ref> In 2003, Bogdanov and Trevisan generalized this result to arbitrary non-adaptive reductions.<ref name="bog03">A.{{Cite journal |last1=Bogdanov and L.|first1=Andrej |last2=Trevisan, "|first2=Luca |date=January 2006 |title=On worstWorst-caseCase to averageAverage-caseCase reductionsReductions for NP problems,"Problems in|url=https://epubs.siam.org/doi/10.1137/S0097539705446974 Proceedings|journal=SIAM of the 44th IEEE SymposiumJournal on FoundationsComputing of|language=en Computer|volume=36 Science,|issue=4 pp|pages=1119–1159 |doi=10.1137/S0097539705446974 308–317,|issn=0097-5397|url-access=subscription 2003.}}</ref> These results show that it is unlikely that any association can be made between average-case complexity and worst-case complexity via reductions.<ref name="bog06"/>
 
==See also==
Line 106 ⟶ 108:
Pedagogical presentations:
 
* {{Cite book |last=Impagliazzo |first=R. |chapter=A personal view of average-case complexity |date=1995 |title=Proceedings of Structure in Complexity Theory. Tenth Annual IEEE Conference |publisher=IEEE Comput. Soc. Press |pages=134–147 |doi=10.1109/SCT.1995.514853 |isbn=978-0-8186-7052-7}}
* {{cite book |last1=Arora |first1=Sanjeev |title=Computational Complexity: A Modern Approach |last2=Barak |first2=Boaz |date=2009 |publisher=Cambridge University Press |___location=Cambridge ; New York |chapter=18. Average case complexity: Levin’s theory}}
* {{cite book |last1=Wang |first1=Jie |url=https://www.cs.uml.edu/~wang/acc-forum/avgcomp.pdf |title=Complexity Theory: Retrospective II |date=1997 |publisher=Springer Science & Business Media |editor-last1=Hemaspaandra |editor-first1=Lane A. |volume=2 |pages=295–328 |chapter=Average-case computational complexity theory |editor-last2=Selman |editor-first2=Alan L.}}
* {{Citation |last=Goldreich |first=Oded |title=Average Case Complexity, Revisited |date=2011 |work=Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation |series=Lecture Notes in Computer Science |volume=6650 |pages=422–450 |editor-last=Goldreich |editor-first=Oded |url=https://www.wisdom.weizmann.ac.il/~oded/COL/aver.pdf |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-22670-0_29 |isbn=978-3-642-22669-4}}
* {{cite book |last1=Arora |first1=Sanjeev |title=Computational Complexity: A Modern Approach |last2=Barak |first2=Boaz |date=2009 |publisher=Cambridge University Press |___location=Cambridge ; New York |chapter=18. Average case complexity: Levin’s theory}}
 
The literature of average case complexity includes the following work:
Line 208 ⟶ 211:
| url = http://www.cs.nyu.edu/csweb/Research/TechReports/TR1995-711/TR1995-711.pdf
| year = 1995}}.
*{{citation
| last = Impagliazzo | first = Russell | author-link = Russell Impagliazzo
| date = April 17, 1995
| publisher = [[University of California, San Diego]]
| title = A personal view of average-case complexity
| url = http://www-cse.ucsd.edu/~russell/average.ps}}.
*
*Christos Papadimitriou (1994). Computational Complexity. Addison-Wesley.
 
[[Category:Randomized algorithms]]