Content deleted Content added
nbsp not necessary: {{math}} already prevents breaking; other minor formatting fixes |
m Open access bot: url-access=subscription updated in citation with #oabot. |
||
(61 intermediate revisions by 30 users not shown) | |||
Line 1:
{{short description|Algorithm that employs a degree of randomness as part of its logic or procedure}}
{{Probabilistic}}
{{distinguish-redirect|Randomized algorithms|Algorithmic randomness}}
A '''randomized algorithm''' is an [[algorithm]] that employs a degree of [[randomness]] as part of its logic or procedure. The algorithm typically uses [[Uniform distribution (discrete)|uniformly random]] bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random determined by the random bits; thus either the running time, or the output (or both) are random variables.
In common practice, randomized algorithms are approximated using a [[pseudorandom number generator]] in place of a true source of random bits; such an implementation may deviate from the expected theoretical behavior and mathematical guarantees which may depend on the existence of an ideal true random number generator.
==Motivation==
Line 32 ⟶ 33:
:<math> \lim_{n \to \infty} \sum_{i = 1}^{n} \frac{i}{2^i} = 2</math>
Since it is constant, the expected run time over many calls is <math>\Theta(1)</math>. (See [[Big
Monte Carlo algorithm:
Line 38 ⟶ 39:
findingA_MC(array A, n, k)
begin
i := 0
repeat
Randomly select one element out of n elements.
i := i + 1
until i = k or 'a' is found
end
</syntaxhighlight>
Line 63 ⟶ 64:
The class of problems for which both YES and NO-instances are allowed to be identified with some error is called [[Bounded-error probabilistic polynomial|BPP]]. This class acts as the randomized equivalent of [[P (complexity)|P]], i.e. BPP represents the class of efficient randomized algorithms.
==
=== Sorting ===
[[Quicksort]] was discovered by [[Tony Hoare]] in 1959, and subsequently published in 1961.<ref>{{Cite journal |last=Hoare |first=C. A. R. |date=July 1961 |title=Algorithm 64: Quicksort |url=https://dl.acm.org/doi/10.1145/366622.366644 |journal=Communications of the ACM |language=en |volume=4 |issue=7 |pages=321 |doi=10.1145/366622.366644 |issn=0001-0782|url-access=subscription }}</ref> In the same year, Hoare published the [[Quickselect|quickselect algorithm]],<ref>{{Cite journal |last=Hoare |first=C. A. R. |date=July 1961 |title=Algorithm 65: find |url=https://dl.acm.org/doi/10.1145/366622.366647 |journal=Communications of the ACM |language=en |volume=4 |issue=7 |pages=321–322 |doi=10.1145/366622.366647 |issn=0001-0782|url-access=subscription }}</ref> which finds the median element of a list in linear expected time. It remained open until 1973 whether a deterministic linear-time algorithm existed.<ref>{{Cite journal |last1=Blum |first1=Manuel |last2=Floyd |first2=Robert W. |last3=Pratt |first3=Vaughan |last4=Rivest |first4=Ronald L. |last5=Tarjan |first5=Robert E. |date=August 1973 |title=Time bounds for selection |url=https://linkinghub.elsevier.com/retrieve/pii/S0022000073800339 |journal=Journal of Computer and System Sciences |language=en |volume=7 |issue=4 |pages=448–461 |doi=10.1016/S0022-0000(73)80033-9|url-access=subscription }}</ref>
=== Number theory ===
In 1917, [[Henry Cabourn Pocklington]] introduced a randomized algorithm known as [[Pocklington's algorithm]] for efficiently finding [[square root]]s modulo prime numbers.<ref>{{citation |last1=Williams |first1=H. C. |title=Mathematics of Computation 1943–1993: a half-century of computational mathematics; Papers from the Symposium on Numerical Analysis and the Minisymposium on Computational Number Theory held in Vancouver, British Columbia, August 9–13, 1993 |volume=48 |pages=481–531 |year=1994 |editor-last=Gautschi |editor-first=Walter |series=Proceedings of Symposia in Applied Mathematics |contribution=Factoring integers before computers |publisher=Amer. Math. Soc., Providence, RI |doi=10.1090/psapm/048/1314885 |mr=1314885 |last2=Shallit |first2=J. O. |isbn=978-0-8218-0291-5 |author1-link=Hugh C. Williams |author2-link=Jeffrey Shallit}}; see p. 504, "Perhaps Pocklington also deserves credit as the inventor of the randomized algorithm".</ref>
In 1970, [[Elwyn Berlekamp]] introduced a randomized algorithm for efficiently computing the roots of a polynomial over a finite field.<ref>{{Cite book |last=Berlekamp |first=E. R. |title=Proceedings of the second ACM symposium on Symbolic and algebraic manipulation - SYMSAC '71 |chapter=Factoring polynomials over large finite fields |date=1971 |chapter-url=http://portal.acm.org/citation.cfm?doid=800204.806290 |language=en |___location=Los Angeles, California, United States |publisher=ACM Press |pages=223 |doi=10.1145/800204.806290|isbn=9781450377867 |s2cid=6464612 }}</ref> In 1977, [[Robert M. Solovay]] and [[Volker Strassen]] discovered a polynomial-time [[Solovay–Strassen primality test|randomized primality test]] (i.e., determining the [[primality test|primality]] of a number). Soon afterwards [[Michael O. Rabin]] demonstrated that the 1976 [[Miller–Rabin primality test|Miller's primality test]] could also be turned into a polynomial-time randomized algorithm. At that time, no provably polynomial-time [[deterministic algorithm|deterministic algorithms]] for primality testing were known.
=== Data structures ===
One of the earliest randomized data structures is the [[hash table]], which was introduced in 1953 by [[Hans Peter Luhn]] at [[IBM]].<ref name=":0">{{Cite book |last=Knuth |first=Donald E. |url=https://dl.acm.org/doi/10.5555/280635 |title=The art of computer programming, volume 3: (2nd ed.) sorting and searching |date=1998 |publisher=Addison Wesley Longman Publishing Co., Inc. |isbn=978-0-201-89685-5 |___location=USA |pages=536–549 }}</ref> Luhn's hash table used chaining to resolve collisions and was also one of the first applications of [[Linked list|linked lists]].<ref name=":0" /> Subsequently, in 1954, [[Gene Amdahl]], [[Elaine M. McGraw]], [[Nathaniel Rochester (computer scientist)|Nathaniel Rochester]], and [[Arthur Samuel (computer scientist)|Arthur Samuel]] of [[IBM Research]] introduced [[linear probing]],<ref name=":0" /> although [[Andrey Ershov]] independently had the same idea in 1957.<ref name=":0" /> In 1962, [[Donald Knuth]] performed the first correct analysis of linear probing,<ref name=":0" /> although the memorandum containing his analysis was not published until much later.<ref>[[Donald Knuth|Knuth, Donald]] (1963), ''[https://web.archive.org/web/20160303225949/http://algo.inria.fr/AofA/Research/11-97.html Notes on "Open" Addressing]'', archived from the original on 2016-03-03</ref> The first published analysis was due to Konheim and Weiss in 1966.<ref>{{Cite journal |last1=Konheim |first1=Alan G. |last2=Weiss |first2=Benjamin |date=November 1966 |title=An Occupancy Discipline and Applications |url=http://dx.doi.org/10.1137/0114101 |journal=SIAM Journal on Applied Mathematics |volume=14 |issue=6 |pages=1266–1274 |doi=10.1137/0114101 |issn=0036-1399|url-access=subscription }}</ref>
Early works on hash tables either assumed access to a fully random hash function or assumed that the keys themselves were random.<ref name=":0" /> In 1979, Carter and Wegman introduced [[Universal hashing|universal hash functions]],<ref>{{Cite journal |last1=Carter |first1=J. Lawrence |last2=Wegman |first2=Mark N. |date=1979-04-01 |title=Universal classes of hash functions |journal=Journal of Computer and System Sciences |language=en |volume=18 |issue=2 |pages=143–154 |doi=10.1016/0022-0000(79)90044-8 |issn=0022-0000|doi-access=free }}</ref> which they showed could be used to implement chained hash tables with constant expected time per operation.
Early work on randomized data structures also extended beyond hash tables. In 1970, Burton Howard Bloom introduced an approximate-membership data structure known as the [[Bloom filter]].<ref>{{Cite journal |last=Bloom |first=Burton H. |date=July 1970 |title=Space/time trade-offs in hash coding with allowable errors |journal=Communications of the ACM |volume=13 |issue=7 |pages=422–426 |doi=10.1145/362686.362692 |s2cid=7931252 |issn=0001-0782|doi-access=free }}</ref> In 1989, [[Raimund Seidel]] and [[Cecilia R. Aragon]] introduced a randomized balanced search tree known as the [[treap]].<ref>{{Cite book |last1=Aragon |first1=C.R. |last2=Seidel |first2=R.G. |title=30th Annual Symposium on Foundations of Computer Science |chapter=Randomized search trees |date=October 1989 |pages=540–545 |doi=10.1109/SFCS.1989.63531|isbn=0-8186-1982-1 }}</ref> In the same year, [[William Pugh (computer scientist)|William Pugh]] introduced another randomized search tree known as the [[skip list]].<ref>[[William Pugh (computer scientist)|Pugh, William]] (April 1989). ''[http://drum.lib.umd.edu/handle/1903/542 Concurrent Maintenance of Skip Lists]'' (PS, PDF) (Technical report). Dept. of Computer Science, U. Maryland. CS-TR-2222.</ref>
=== Implicit uses in combinatorics ===
Prior to the popularization of randomized algorithms in computer science, [[Paul Erdős]] popularized the use of randomized constructions as a mathematical technique for establishing the existence of mathematical objects. This technique has become known as the [[probabilistic method]].<ref name=":1">{{Cite book |last1=Alon |first1=Noga |title=The probabilistic method |date=2016 |first2=Joel H. |last2=Spencer |isbn=978-1-119-06195-3 |edition=Fourth |publisher=Wiley |___location=Hoboken, New Jersey |oclc=910535517}}</ref> [[Paul Erdős|Erdős]] gave his first application of the probabilistic method in 1947, when he used a simple randomized construction to establish the existence of Ramsey graphs.<ref>P. Erdős: Some remarks on the theory of graphs, Bull. Amer. Math. Soc. '''53''' (1947), 292--294 '''MR'''8,479d; '''Zentralblatt''' 32,192.</ref> He famously used a more sophisticated randomized algorithm in 1959 to establish the existence of graphs with high girth and chromatic number.<ref>{{Cite journal |last=Erdös |first=P. |date=1959 |title=Graph Theory and Probability |journal=Canadian Journal of Mathematics |language=en |volume=11 |pages=34–38 |doi=10.4153/CJM-1959-003-9 |s2cid=122784453 |issn=0008-414X|doi-access=free }}</ref><ref name=":1" />
==Examples==
Line 96 ⟶ 102:
[[File:Single run of Karger’s Mincut algorithm.svg|thumb|340px|Figure 2: Successful run of Karger's algorithm on a 10-vertex graph. The minimum cut has size 3 and is indicated by the vertex colours.]]
[[File:Contraction vertices.jpg|300px|thumb|center|Figure 1: Contraction of vertex A and B]]
Karger's<ref>{{cite journal
| last1=Karger | first1=David R.
| citeseerx=10.1.1.215.794
| title=Random Sampling in Cut, Flow, and Network Design Problems
| journal=[[Mathematics of Operations Research]]
| volume=24
| issue=2
| pages=383–413
| date=1999
| doi=10.1287/moor.24.2.383}}</ref> basic algorithm:
'''begin'''
i = 1
Line 113 ⟶ 128:
example of one execution of the algorithm. After execution, we get a cut of size 3.
{{math proof|1=
Because the min cut is ''k'', every vertex ''v'' must satisfy degree(''v'') ≥ ''k''. Therefore, the sum of the degree is at least ''pk''. But it is well known that the sum of vertex degrees equals 2{{abs|''E''
}}
The probability that the algorithm succeeds is 1 − the probability that all attempts fail. By independence, the probability that all attempts fail is
Line 142 ⟶ 157:
==Derandomization==
{{More citations needed|section|date=August 2025}}
Randomness can be viewed as a resource, like space and time. Derandomization is then the process of ''removing'' randomness (or using as little of it as possible).<ref>{{Cite web |title=6.046J Lecture 22: Derandomization {{!}} Design and Analysis of Algorithms {{!}} Electrical Engineering and Computer Science |url=https://ocw.mit.edu/courses/6-046j-design-and-analysis-of-algorithms-spring-2012/resources/mit6_046js12_lec22/ |access-date=2024-12-27 |website=MIT OpenCourseWare |language=en}}</ref><ref>{{Cite report |url=https://dl.acm.org/doi/10.5555/894682 |title=Pairwise Independence and Derandomization |last1=Luby |first1=Michael |last2=Wigderson |first2=Avi |date=July 1995 |publisher=University of California at Berkeley |___location=USA}}</ref> It is not currently known{{As of?|date=September 2023}} if all algorithms can be derandomized without significantly increasing their running time.<ref name=":2">{{Cite web |title=Lecture Notes, Chapter 3. Basic Derandomization Techniques |url=https://people.seas.harvard.edu/~salil/cs225/spring09/lecnotes/list.htm |access-date=2024-12-27 |website=people.seas.harvard.edu}}</ref> For instance, in [[analysis of algorithms|computational complexity]], it is unknown whether [[P (complexity)|P]] = [[Bounded-error probabilistic polynomial|BPP]],<ref name=":2" /> i.e., we do not know whether we can take an arbitrary randomized algorithm that runs in polynomial time with a small error probability and derandomize it to run in polynomial time without using randomness.
There are specific methods that can be employed to derandomize particular randomized algorithms:
* the [[method of conditional probabilities]], and its generalization, [[pessimistic estimator]]s
* [[discrepancy theory]] (which is used to derandomize geometric algorithms)
* the exploitation of limited independence in the random variables used by the algorithm, such as the [[pairwise independence]] used in [[universal hashing]]<ref>{{Cite journal |last1=Chazelle |first1=B. |last2=Friedman |first2=J. |date=1990-09-01 |title=A deterministic view of random sampling and its use in geometry |url=https://doi.org/10.1007/BF02122778 |journal=Combinatorica |language=en |volume=10 |issue=3 |pages=229–249 |doi=10.1007/BF02122778 |issn=1439-6912|url-access=subscription }}</ref>
* the use of [[expander graph]]s (or [[disperser]]s in general) to ''amplify'' a limited amount of initial randomness (this last approach is also referred to as generating [[pseudorandom]] bits from a random source, and leads to the related topic of pseudorandomness)
* changing the randomized algorithm to use a [[hash function]] as a source of randomness for the algorithm's tasks, and then derandomizing the algorithm by [[brute-force search|brute-forcing]] all possible parameters (seeds) of the hash function. This technique is usually used to exhaustively search a sample space and making the algorithm deterministic (e.g. randomized graph algorithms)
==Where randomness helps==
Line 155 ⟶ 172:
* The natural way of carrying out a numerical computation in [[embedded systems]] or [[cyber-physical system]]s is to provide a result that approximates the correct one with high probability (or Probably Approximately Correct Computation (PACC)). The hard problem associated with the evaluation of the discrepancy loss between the approximated and the correct computation can be effectively addressed by resorting to randomization<ref>{{citation|title=Intelligence for Embedded Systems|first1=Cesare|last1=Alippi|publisher=Springer|year=2014|isbn=978-3-319-05278-6}}.</ref>
* In [[communication complexity]], the equality of two strings can be verified to some reliability using <math>\log n</math> bits of communication with a randomized protocol. Any deterministic protocol requires <math>\Theta(n)</math> bits if defending against a strong opponent.<ref>{{citation|title=Communication Complexity|first1=Eyal|last1=Kushilevitz|first2=Noam|last2=Nisan|publisher=Cambridge University Press|year=2006|isbn=9780521029834}}. For the deterministic lower bound see p. 11; for the logarithmic randomized upper bound see pp. 31–32.</ref>
* The volume of a convex body can be estimated by a randomized algorithm to arbitrary precision in polynomial time.<ref>{{citation |last1=Dyer |first1=M. |last2=Frieze |first2=A. |last3=Kannan |first3=R. |title=A random polynomial-time algorithm for approximating the volume of convex bodies |journal=[[Journal of the ACM]] |volume=38 |issue=1 |year=1991 |pages=1–17 |doi=10.1145/102782.102783 |s2cid=13268711 |url=http://www.math.cmu.edu/~af1p/Texfiles/oldvolume.pdf}}</ref> [[Imre Bárány|Bárány]] and [[Zoltán Füredi|Füredi]] showed that no deterministic algorithm can do the same.<ref>{{citation |last1=Füredi |first1=Z. |author1-link=Zoltán Füredi |last2=Bárány |first2=I. |year=1986 |contribution=Computing the volume is difficult |title=Proc. 18th ACM Symposium on Theory of Computing (Berkeley, California, May 28–30, 1986) |publisher=ACM |___location=New York, NY |pages=442–447 |doi=10.1145/12130.12176 |citeseerx=10.1.1.726.9448 |isbn=0-89791-193-8 |s2cid=17867291 |url=https://ecommons.cornell.edu/bitstream/1813/8572/1/TR000688.pdf}}</ref> This is true unconditionally, i.e. without relying on any complexity-theoretic assumptions, assuming the convex body can be queried only as a black box.
* A more complexity-theoretic example of a place where randomness appears to help is the class [[IP (complexity)|IP]]. IP consists of all languages that can be accepted (with high probability) by a polynomially long interaction between an all-powerful prover and a verifier that implements a BPP algorithm. IP = [[PSPACE]].<ref>{{citation |last=Shamir |first=A. |author-link=Adi Shamir |title=IP = PSPACE |journal=Journal of the ACM |volume=39 |issue=4 |year=1992 |pages=869–877 |doi=10.1145/146585.146609 |s2cid=315182 |doi-access=free}}</ref> However, if it is required that the verifier be deterministic, then IP = [[NP (complexity)|NP]].
* In a [[chemical reaction network]] (a finite set of reactions like A+B → 2C + D operating on a finite number of molecules), the ability to ever reach a given target state from an initial state is decidable, while even approximating the probability of ever reaching a given target state (using the standard concentration-based probability for which reaction will occur next) is undecidable.
| last1 = Cook | first1 = Matthew | author1-link = Matthew Cook
| last2 = Soloveichik | first2 = David
Line 173 ⟶ 190:
| series = Natural Computing Series
| title = Algorithmic Bioprocesses
| year = 2009| isbn = 978-3-540-88868-0 | url = https://authors.library.caltech.edu/26121/1/etr090.pdf }}.</ref>
==See also==
*[[Probabilistic analysis of algorithms]]▼
*[[Atlantic City algorithm]]
*[[Monte Carlo algorithm]]▼
*[[Las Vegas algorithm]]▼
*[[Bogosort]]
*[[HyperLogLog]]▼
*[[Karger's algorithm]]
▲*[[Las Vegas algorithm]]
▲*[[Monte Carlo algorithm]]
*[[Principle of deferred decision]]
▲*[[Probabilistic analysis of algorithms]]
*[[Randomized algorithms as zero-sum games]]▼
*[[Probabilistic roadmap]]
▲*[[Randomized algorithms as zero-sum games]]
▲*[[HyperLogLog]]
▲*[[count–min sketch]]
▲*[[approximate counting algorithm]]
==Notes==
Line 202 ⟶ 220:
* {{cite journal|doi=10.1016/0022-314X(80)90084-0|title=Probabilistic algorithm for testing primality|journal=Journal of Number Theory|volume=12|pages=128–138|year=1980|last1=Rabin|first1=Michael O.|doi-access=free}}
* A. A. Tsay, W. S. Lovejoy, David R. Karger, ''Random Sampling in Cut, Flow, and Network Design Problems'', Mathematics of Operations Research, 24(2):383–413, 1999.
* [https://www.osti.gov/biblio/1807223 "Randomized Algorithms for Scientific Computing" (RASC), OSTI.GOV (July 10th, 2021).]
{{DEFAULTSORT:Randomized Algorithm}}
{{Algorithms and data structures}}
[[Category:Randomized algorithms| ]]
[[Category:Analysis of algorithms]]
|