Revision as of 20:17, 25 January 2023 edit Dabed (talk \| contribs) Extended confirmed users 4,308 edits +Category:Analysis of algorithms Tag: Visual edit ← Previous edit		Revision as of 13:08, 16 March 2023 edit undo Highheath (talk \| contribs) 45 edits math formatting Next edit →
Line 10: The average-case performance of algorithms has been studied since modern notions of computational efficiency were developed in the 1950s. Much of this initial work focused on problems for which worst-case polynomial time algorithms were already known.<ref name="bog06">A. Bogdanov and L. Trevisan, "Average-Case Complexity," Foundations and Trends in Theoretical Computer Science, Vol. 2, No 1 (2006) 1–106.</ref> In 1973, [[Donald Knuth]]<ref name="knu73">D. Knuth, [[The Art of Computer Programming]]. Vol. 3, Addison-Wesley, 1973.</ref> published Volume 3 of the [[Art of Computer Programming]] which extensively surveys average-case performance of algorithms for problems solvable in worst-case polynomial time, such as sorting and median-finding. An efficient algorithm for [[NP-complete\|{{math\|'''NP'''}}-complete]] problems is generally characterized as one which runs in polynomial time for all inputs; this is equivalent to requiring efficient worst-case complexity. However, an algorithm which is inefficient on a "small" number of inputs may still be efficient for "most" inputs that occur in practice. Thus, it is desirable to study the properties of these algorithms where the average-case complexity may differ from the worst-case complexity and find methods to relate the two. The fundamental notions of average-case complexity were developed by [[Leonid Levin]] in 1986 when he published a one-page paper<ref name="levin86">L. Levin, "Average case complete problems," SIAM Journal on Computing, vol. 15, no. 1, pp. 285–286, 1986.</ref> defining average-case complexity and completeness while giving an example of a complete problem for {{math\|'''distNP'''}}, the average-case analogue of [[NP (complexity)\|{{math\|'''NP'''}}]]. ==Definitions== Line 18: ===Efficient average-case complexity=== The first task is to precisely define what is meant by an algorithm which is efficient "on average". An initial attempt might define an efficient average-case algorithm as one which runs in expected polynomial time over all possible inputs. Such a definition has various shortcomings; in particular, it is not robust to changes in the computational model. For example, suppose algorithm {{mvar\|A}} runs in time {{math\|''t''<sub>''A''</sub>(''x'')}} on input {{mvar\|x}} and algorithm {{mvar\|B}} runs in time {{math\|''t''<sub>''A''</sub>(''x'')<sup>2</sup>}} on input {{mvar\|x}}; that is, {{mvar\|B}} is quadratically slower than {{mvar\|A}}. Intuitively, any definition of average-case efficiency should capture the idea that {{mvar\|A}} is efficient-on-average if and only if {{mvar\|B}} is efficient on-average. Suppose, however, that the inputs are drawn randomly from the uniform distribution of strings with length ~~<math>~~{{mvar\|n~~</math>~~}}, and that {{mvar\|A}} runs in time {{math\|''n''<sup>2</sup>}} on all inputs except the string {{math\|1<sup>''n''</sup>}} for which {{mvar\|A}} takes time {{math\|2<sup>''n''</sup>}}. Then it can be easily checked that the expected running time of {{mvar\|A}} is polynomial but the expected running time of {{mvar\|B}} is exponential.<ref name="bog06" /> To create a more robust definition of average-case efficiency, it makes sense to allow an algorithm {{mvar\|A}} to run longer than polynomial time on some inputs but the fraction of inputs on which {{mvar\|A}} requires larger and larger running time becomes smaller and smaller. This intuition is captured in the following formula for average polynomial running time, which balances the polynomial trade-off between running time and fraction of inputs: :<math> Line 26: </math> for every {{math\|''n'', ''t'' >> 0}} and polynomial {{mvar\|p}}, where {{math\|''t''<sub>''A''</sub>(''x'')}} denotes the running time of algorithm {{mvar\|A}} on input {{mvar\|x}}, and {{mvar\|ε}} is a positive constant value.<ref name="wangsurvey">J. Wang, "Average-case computational complexity theory," Complexity Theory Retrospective II, pp. 295–328, 1997.</ref> Alternatively, this can be written as :<math> Line 32: </math> for some constants {{mvar\|C}} and {{mvar\|ε}}, where {{math\|''n'' {{=}} {{abs\|''x\|''}}}}.<ref name="ab09">S. Arora and B. Barak, Computational Complexity: A Modern Approach, Cambridge University Press, New York, NY, 2009.</ref> In other words, an algorithm {{mvar\|A}} has good average-case complexity if, after running for {{math\|''t''<sub>''A''</sub>(''n'')}} steps, {{mvar\|A}} can solve all but a <{{math~~>\frac~~\|{{sfrac\|''n^''<sup>''c}{''</sup>\|(~~t_A~~''t''<sub>''A''</sub>(''n''))~~^{\epsilon}}~~<sup>''ε''</~~math~~sup>}}}} fraction of inputs of length {{mvar\|n}}, for some {{math\|''ε'', ''c'' >> 0}}.<ref name="bog06"/> ===Distributional problem=== The next step is to define the "average" input to a particular problem. This is achieved by associating the inputs of each problem with a particular probability distribution. That is, an "average-case" problem consists of a language {{mvar\|L}} and an associated probability distribution {{mvar\|D}} which forms the pair {{math\|(''L'', ''D'')}}.<ref name="ab09"/> The two most common classes of distributions which are allowed are: #Polynomial-time computable distributions ({{math\|'''P'''}}-computable): these are distributions for which it is possible to compute the cumulative density of any given input {{mvar\|x}}. More formally, given a probability distribution {{mvar\|μ}} and a string {{math\|''x'' ∈∈ {{mset\|0, 1}}<sup>''n''</sup>}} it is possible to compute the value <math>\mu(x) = \sum\limits_{y \in \{0, 1\}^n : y \leq x} \Pr[y]</math> in polynomial time. This implies that {{math\|Pr[''x'']}} is also computable in polynomial time. #Polynomial-time samplable distributions ({{math\|'''P'''}}-samplable): these are distributions from which it is possible to draw random samples in polynomial time. These two formulations, while similar, are not equivalent. If a distribution is {{math\|'''P'''}}-computable it is also {{math\|'''P'''}}-samplable, but the converse is not true if [[P (complexity)\|{{math\|'''P'''}}]] ≠ {{math\|'''P'''<sup>'''#P'''</sup>}}.<ref name="ab09"/> ===AvgP and distNP=== A distributional problem {{math\|(''L'', ''D'')}} is in the complexity class {{math\|'''AvgP'''}} if there is an efficient average-case algorithm for {{mvar\|L}}, as defined above. The class {{math\|'''AvgP'''}} is occasionally called {{math\|'''distP'''}} in the literature.<ref name="ab09"/> A distributional problem {{math\|(''L'', ''D'')}} is in the complexity class {{math\|'''distNP'''}} if {{mvar\|L}} is in {{math\|'''NP'''}} and {{mvar\|D}} is {{math\|'''P'''}}-computable. When {{mvar\|L}} is in {{math\|'''NP'''}} and {{mvar\|D}} is {{math\|'''P'''}}-samplable, {{math\|(''L'', ''D'')}} belongs to {{math\|'''sampNP'''}}.<ref name="ab09"/> Together, {{math\|'''AvgP'''}} and {{math\|'''distNP'''}} define the average-case analogues of {{math\|'''P'''}} and {{math\|'''NP'''}}, respectively.<ref name="ab09"/> ==Reductions between distributional problems== Let {{math\|(''L'',''D'')}} and {{math\|(''L′'', ''D′'')}} be two distributional problems. {{math\|(''L'', ''D'')}} average-case reduces to {{math\|(''L′'', ''D′'')}} (written {{math\|(''L'', ''D'') ≤<sub>'''AvgP'''</sub> (''L′'', ''D′'')}}) if there is a function {{mvar\|f}} that for every {{mvar\|n}}, on input {{mvar\|x}} can be computed in time polynomial in {{mvar\|n}} and #(Correctness) {{math\|''x'' ∈ ''L''}} if and only if {{math\|''f''(''x'') ∈ ''L′''}} #(Domination) There are polynomials {{mvar\|p}} and {{mvar\|m}} such that, for every {{mvar\|n}} and {{mvar\|y}}, <math>\sum\limits_{x: f(x) = y} D_n(x) \leq p(n)D'_{m(n)}(y)</math> The domination condition enforces the notion that if problem {{math\|(''L'', ''D'')}} is hard on average, then {{math\|(''L′'', ''D′'')}} is also hard on average. Intuitively, a reduction should provide a way to solve an instance {{mvar\|x}} of problem {{mvar\|L}} by computing {{math\|''f''(''x'')}} and feeding the output to the algorithm which solves {{mvar\|L'}}. Without the domination condition, this may not be possible since the algorithm which solves {{mvar\|L}} in polynomial time on average may take super-polynomial time on a small number of inputs but {{mvar\|f}} may map these inputs into a much larger set of {{mvar\|D'}} so that algorithm {{mvar\|A'}} no longer runs in polynomial time on average. The domination condition only allows such strings to occur polynomially as often in {{mvar\|D'}}.<ref name="wangsurvey"/> ===DistNP-complete problems=== The average-case analogue to {{math\|'''NP'''}}-completeness is {{math\|'''distNP'''}}-completeness. A distributional problem {{math\|(''L′'', ''D′'')}} is {{math\|'''distNP'''}}-complete if {{math\|(''L′'', ''D′'')}} is in {{math\|'''distNP'''}} and for every {{math\|(''L'', ''D'')}} in {{math\|'''distNP'''}}, {{math\|(''L'', ''D'')}} is average-case reducible to {{math\|(''L′'', ''D′'')}}.<ref name="ab09" /> An example of a {{math\|'''distNP'''}}-complete problem is the Bounded Halting Problem, {{mvar\|BH}}, defined as follows: <math>BH = \{(M, x, 1~~<sup>~~^t~~</sup>~~) : M \text{ is a [[non-deterministic Turing machine]] that accepts } x \text{ in} ≤\leq t \text{ steps.}\}</math><ref name="ab09"/> In his original paper, Levin showed an example of a distributional tiling problem that is average-case {{math\|'''NP'''}}-complete.<ref name="levin86"/> A survey of known {{math\|'''distNP'''}}-complete problems is available online.<ref name="wangsurvey"/> One area of active research involves finding new {{math\|'''distNP'''}}-complete problems. However, finding such problems can be complicated due to a result of Gurevich which shows that any distributional problem with a flat distribution cannot be {{math\|'''distNP'''}}-complete unless [[EXP\|{{math\|'''EXP'''}}]] = [[NEXP\|{{math\|'''NEXP'''}}]].<ref name="gur87">Y. Gurevich, "Complete and incomplete randomized NP problems", Proc. 28th Annual Symp. on Found. of Computer Science, IEEE (1987), pp. 111–117.</ref> (A flat distribution {{mvar\|μ}} is one for which there exists an {{math\|''ε'' >> 0}} such that for any {{mvar\|x}}, {{math\|''μ''(''x'') ≤ 2<sup>−{{abs\|''x\|''}}<sup>''ε''</sup></sup>}}.) A result by Livne shows that all natural {{math\|'''NP'''}}-complete problems have {{math\|'''DistNP'''}}-complete versions.<ref name="livne06">N. Livne, "All Natural NP-Complete Problems Have Average-Case Complete Versions," Computational Complexity (2010) 19:477. https://doi.org/10.1007/s00037-010-0298-9</ref> However, the goal of finding a natural distributional problem that is {{math\|'''DistNP'''}}-complete has not yet been achieved.<ref name="gol97">O. Goldreich, "Notes on Levin's theory of average-case complexity," Technical Report TR97-058, Electronic Colloquium on Computational Complexity, 1997.</ref> ==Applications== ===Sorting algorithms=== As mentioned above, much early work relating to average-case complexity focused on problems for which polynomial-time algorithms already existed, such as sorting. For example, many sorting algorithms which utilize randomness, such as [[Quicksort]], have a worst-case running time of {{math\|O(''n''<sup>2</sup>)}}, but an average-case running time of {{math\|O(~~nlog~~''n'' log(''n''))}}, where {{mvar\|n}} is the length of the input to be sorted.<ref name="clrs">Cormen, Thomas H.; Leiserson, Charles E., Rivest, Ronald L., Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. {{ISBN\|0-262-03384-4}}.</ref> ===Cryptography=== For most problems, average-case complexity analysis is undertaken to find efficient algorithms for a problem that is considered difficult in the worst-case. In cryptographic applications, however, the opposite is true: the worst-case complexity is irrelevant; we instead want a guarantee that the average-case complexity of every algorithm which "breaks" the cryptographic scheme is inefficient.<ref name="katz07">J. Katz and Y. Lindell, Introduction to Modern Cryptography (Chapman and Hall/Crc Cryptography and Network Security Series), Chapman and Hall/CRC, 2007.</ref> Thus, all secure cryptographic schemes rely on the existence of [[one-way functions]].<ref name="bog06"/> Although the existence of one-way functions is still an open problem, many candidate one-way functions are based on hard problems such as [[integer factorization]] or computing the [[discrete log]]. Note that it is not desirable for the candidate function to be {{math\|'''NP'''}}-complete since this would only guarantee that there is likely no efficient algorithm for solving the problem in the worst case; what we actually want is a guarantee that no efficient algorithm can solve the problem over random inputs (i.e. the average case). In fact, both the integer factorization and discrete log problems are in {{math\|'''NP''' ∩ }}[[coNP\|{{math\|'''coNP'''}}]], and are therefore not believed to be {{math\|'''NP'''}}-complete.<ref name="ab09"/> The fact that all of cryptography is predicated on the existence of average-case intractable problems in {{math\|'''NP'''}} is one of the primary motivations for studying average-case complexity. ==Other results== In 1990, Impagliazzo and Levin showed that if there is an efficient average-case algorithm for a {{math\|'''distNP'''}}-complete problem under the uniform distribution, then there is an average-case algorithm for every problem in {{math\|'''NP'''}} under any polynomial-time samplable distribution.<ref name="imp90">R. Impagliazzo and L. Levin, "No Better Ways to Generate Hard NP Instances than Picking Uniformly at Random," in Proceedings of the 31st IEEE Sympo- sium on Foundations of Computer Science, pp. 812–821, 1990.</ref> Applying this theory to natural distributional problems remains an outstanding open question.<ref name="bog06"/> In 1992, Ben-David et al. showed that if all languages in {{math\|'''distNP'''}} have good-on-average decision algorithms, they also have good-on-average search algorithms. Further, they show that this conclusion holds under a weaker assumption: if every language in {{math\|'''NP'''}} is easy on average for decision algorithms with respect to the uniform distribution, then it is also easy on average for search algorithms with respect to the uniform distribution.<ref name="bd92">S. Ben-David, B. Chor, O. Goldreich, and M. Luby, "On the theory of average case complexity," Journal of Computer and System Sciences, vol. 44, no. 2, pp. 193–219, 1992.</ref> Thus, cryptographic one-way functions can exist only if there are {{math\|'''distNP'''}} problems over the uniform distribution that are hard on average for decision algorithms. In 1993, Feigenbaum and Fortnow showed that it is not possible to prove, under non-adaptive random reductions, that the existence of a good-on-average algorithm for a {{math\|'''distNP'''}}-complete problem under the uniform distribution implies the existence of worst-case efficient algorithms for all problems in {{math\|'''NP'''}}.<ref name="ff93">J. Feigenbaum and L. Fortnow, "Random-self-reducibility of complete sets," SIAM Journal on Computing, vol. 22, pp. 994–1005, 1993.</ref> In 2003, Bogdanov and Trevisan generalized this result to arbitrary non-adaptive reductions.<ref name="bog03">A. Bogdanov and L. Trevisan, "On worst-case to average-case reductions for NP problems," in Proceedings of the 44th IEEE Symposium on Foundations of Computer Science, pp. 308–317, 2003.</ref> These results show that it is unlikely that any association can be made between average-case complexity and worst-case complexity via reductions.<ref name="bog06"/> ==See also==

Average-case complexity: Difference between revisions