Alias method: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 19:55, 21 June 2024 edit 97.102.205.224 (talk) Add URL for "Fast Generation of Discrete Random Variables", other editorial/wording adjustments. Bold square histogram method; will ask for a redirect. ← Previous edit		Latest revision as of 01:14, 31 December 2024 edit undo Tamfang (talk \| contribs) Extended confirmed users, Pending changes reviewers, Rollbackers 59,063 edits Undid gf revision 1266160024 by 2A01:599:607:58C4:5C65:F467:5055:AECE (talk)— not an error Tag: Undo
(5 intermediate revisions by 4 users not shown)
Line 4: ==Operation== Internally, the algorithm consults two tables, a ''[[probability]] [[Table (information)\|table]]'' {{mvar\|U<sub>i</sub>}} and an ''alias table'' {{mvar\|K<sub>i</sub>}} (for {{math\|1 ≤ ''i'' ≤ ''n''}}). To generate a random outcome, a fair [[dice\|die]] is rolled to determine an index {{mvar\|i}} into the two tables. ~~Based on the probability stored at that index, a~~A [[biased coin]] is then flipped, ~~and~~choosing ~~the~~a ~~outcome~~result of ~~the~~{{mvar\|i}} ~~flip~~with ~~is used to choose between a result of~~probability {{mvar\|U<sub>i</sub>}}, ~~and~~or {{mvar\|K<sub>i</sub>}} otherwise (probability {{math\|1 − ''U<sub>i</sub>''}}).<ref>{{cite web \|url=http://www.keithschwarz.com/darts-dice-coins/ \|title=Darts, Dice, and Coins: Sampling from a Discrete Distribution \|date=29 December 2011 \|website=KeithSchwarz.com \|access-date=2011-12-27}}</ref> More concretely, the algorithm operates as follows: Line 13: # Otherwise, return {{mvar\|K<sub>i</sub>}}. An alternative formulation of the probability table, proposed by Marsaglia et al.<ref name=marsaglia>{{Citation \|first1=George \|last1=Marsaglia \|author-link1=George Marsaglia \|first2=Wai Wan \|last2=Tsang \|first3=Jingbo \|last3=Wang \|title=Fast Generation of Discrete Random Variables \|journal=Journal of Statistical Software \|date=2004-07-12 \|volume=11 \|issue=3 \|pages=1–11 \|doi=10.18637/jss.v011.i03 \|doi-access=free \|url=https://www.researchgate.net/publication/~~5142858_Fast_Generation_of_Discrete_Random_Variables/fulltext/0e5fcf01f0c404bcbfaaa888/Fast-Generation-of-Discrete-Random-Variables.pdf~~5142858}}</ref> as the '''square histogram method''', avoids the computation of {{mvar\|y}} by instead checking the condition {{math\|1=''x'' <  ''V<sub>i</sub>''}} ~~in the third step (where {{math\|1~~=~~''V<sub>i</sub>'' =~~  (''U<sub>i</sub>''  +  ''i''  −  1)/''n''}}) in the third step. ==Table generation== Line 23: * The "exactly full" group, where {{math\|1=''U<sub>i</sub>'' = 1}} or {{mvar\|K<sub>i</sub>}} ''has'' been initialized. If {{math\|1=''U<sub>i</sub>'' = 1}}, the corresponding value {{mvar\|K<sub>i</sub>}} will never be consulted and is unimportant, but a value of {{math\|1=''K<sub>i</sub>'' = ''i''}} is sensible. This also avoids problems if the probabilities are represented as [[fixed-point number]]s which cannot represent {{math\|1=''U<sub>i</sub>'' = 1}} exactly. As long as not all table entries are exactly full, repeat the following steps: # Arbitrarily choose an overfull entry {{math\|''U<sub>i</sub>'' > 1}} and an underfull entry {{math\|''U<sub>j</sub>'' < 1}}. (If one of these exists, the other must, as well.) # Allocate the unused space in entry {{mvar\|j}} to outcome {{mvar\|i}}, by setting {{math\|1=''K<sub>j</sub>'' =← ''i''}}. # Remove the allocated space from entry {{mvar\|i}} by changing {{math\|1=''U<sub>i</sub>'' =← ''U<sub>i</sub>'' − (1 − ''U<sub>j</sub>'') = ''U<sub>i</sub>'' + ''U<sub>j</sub>'' − 1}}. # Entry {{mvar\|j}} is now exactly full. # Assign entry {{mvar\|i}} to the appropriate category based on the new value of {{mvar\|U<sub>i</sub>}}. Each iteration moves at least one entry to the "exactly full" category (and the last moves two), so the procedure is guaranteed to terminate after at most {{math\|''n'' −1}} iterations. Each iteration can be done in {{math\|''O''(1)}} time, so the table can be set up in {{math\|''O''(''n'')}} time. Vose<ref name=Vose/>{{Rp\|974}} points out that floating-point rounding errors may cause the guarantee referred to in step 1 to be violated. If one category empties before the other, the remaining entries may have {{mvar\|U<sub>i</sub>}} set to 1 with negligible error. The solution accounting for floating point is sometimes called the '''Walker-Vose method''' or the '''Vose alias method'''. Line 43: Although the alias method is very efficient if generating a uniform deviate is itself fast, there are cases where it is far from optimal in terms of random bit usage. This is because it uses a full-precision random variate {{mvar\|x}} each time, even when only a few random bits are needed. One case arises when the probabilities are particularly well balanced, so many {{math\|1=''U<sub>i</sub>'' = 1}}. ~~In for~~For these values of {{mvar\|i}}, {{mvar\|K<sub>i</sub>}} is not needed and generating {{mvar\|y}} is a waste of time. For example if {{math\|1=''p''<sub>1</sub> = ''p''<sub>2</sub> = {{frac\|1\|2}}}}, then a 32-bit random variate {{mvar\|x}} could be used to generate 32 outputs, but the alias method will only generate one. Another case arises when the probabilities are strongly unbalanced, so many {{math\|''U<sub>i</sub>'' ≈ 0}}. For example if {{math\|1=''p''<sub>1</sub> = 0.999}} and {{math\|1=''p''<sub>2</sub> = 0.001}}, then the great majority of the time, only a few random bits are required to determine that case 1 applies.