Latin hypercube sampling

The [[statistics|statistical]] method of '''Latin hypercube sampling''' ('''LHS''') was developed by [[Ronald L. Iman]], J. C. Helton, [[James Edward Campbell]], and others to generate a distribution of plausible collections of parameter values from a [[multidimensional distribution]]. The [[Sampling (statistics)|sampling method]] is often applied in [[uncertainty]] analysis.
 
The technique was first described by McKay<ref>{{cite journal |last=McKay |first=M.D. |coauthors=Conover, W.J.; and Beckman, R.J. |title=A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code |year=1979 |journal=Technometrics |volume=21 |pages=239–245}}</ref> in [[1979]]. It was further elaborated by [[Ronald L. Iman]] and others<ref>{{cite journal |last=Iman |first=R.L. |coauthors=Helton, J.C.; and [[James Edward Campbell|Campbell, J.E.]] |title=An approach to sensitivity analysis of computer models, Part 1. Introduction, input variable selection and preliminary variable assessment |journal=Journal of Quality Technology |volume=13 |issue=3 |pages=174–183 |year=1981 }}</ref> in [[1981]]. Detailed computer codes and manuals were later published.<ref>{{cite book |last=Iman |first=R.L. |coauthors=Davenport, J.M. ; Zeigler, D.K. |title=Latin hypercube sampling (program user's guide) |year=1980 |id={{OSTI|5571631}}}}</ref>
 
In the context of statistical sampling, a square grid containing sample positions is a [[Latin square]] if (and only if) there is only one sample in each row and each column. A '''Latin hypercube''' is the generalisation of this concept to an arbitrary number of dimensions, whereby each sample is the only one in each axis-aligned hyperplane containing it.
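The Latin-square condition on a grid of sample positions can be checked directly. The following sketch (not part of the original article; the function name and the cell representation as (row, column) pairs are illustrative choices) verifies that a set of two-dimensional sample cells contains exactly one sample per row and per column:

```python
def is_latin_sample(cells):
    """Check the Latin-square condition for 2-D sample positions.

    `cells` is a list of (row, column) index pairs on an n-by-n grid,
    where n = len(cells). The condition holds when every row index and
    every column index appears exactly once.
    """
    n = len(cells)
    rows = sorted(r for r, _ in cells)
    cols = sorted(c for _, c in cells)
    return rows == list(range(n)) and cols == list(range(n))

# One sample in each row and each column: a valid Latin square layout.
assert is_latin_sample([(0, 2), (1, 0), (2, 1)])
# Two samples share column 0, so the condition fails.
assert not is_latin_sample([(0, 0), (1, 0), (2, 1)])
```

The generalisation to a Latin hypercube simply applies the same uniqueness check to the index along every axis.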
Line 8 ⟶ 7:
When sampling a function of <math>N</math> variables, the range of each variable is divided into <math>M</math> equally probable intervals. <math>M</math> sample points are then placed to satisfy the Latin hypercube requirements; note that this forces the number of divisions, <math>M</math>, to be equal for each variable. Also note that this sampling scheme does not require more samples for more dimensions (variables); this independence is one of the main advantages of this sampling scheme. Another advantage is that random samples can be taken one at a time, remembering which samples were taken so far.
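The scheme described above can be sketched in a few lines. This is an illustrative implementation (not from the original article; the function name and the choice of the unit hypercube as the sample space are assumptions): each axis is divided into <math>M</math> equally probable strata, one point is drawn uniformly inside each stratum, and the strata are shuffled independently per dimension.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng=None):
    """Draw a Latin hypercube sample on the unit hypercube [0, 1)^n_dims.

    Each axis is split into n_samples equally probable intervals, and
    each of the n_samples points occupies a distinct interval along
    every axis.
    """
    rng = np.random.default_rng(rng)
    # One uniform draw inside each of the n_samples strata, per dimension:
    # row i lies in stratum [i/n, (i+1)/n) on every axis before shuffling.
    strata = np.arange(n_samples)[:, None]
    samples = (rng.random((n_samples, n_dims)) + strata) / n_samples
    # Shuffle the strata independently along each axis.
    for d in range(n_dims):
        rng.shuffle(samples[:, d])
    return samples
```

Note that adding a dimension only adds a column, not more rows, which mirrors the independence of sample count and dimensionality mentioned above.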
 
'''Orthogonal sampling''' adds the requirement that the entire sample space must be sampled evenly. Although more efficient, the orthogonal sampling strategy is more difficult to implement, since all sample points must be generated simultaneously.
 
[[Image:LHSsampling.png|100px|right]]
 
In two dimensions the difference between random sampling, Latin hypercube sampling and orthogonal sampling can be explained as follows:
#In '''random sampling''' new sample points are generated without taking into account the previously generated sample points. One does thus not necessarily need to know beforehand how many sample points are needed.
#In '''Latin hypercube sampling''' one must first decide how many sample points to use, and for each sample point remember in which row and column it was taken.
#In '''orthogonal sampling''', the sample space is divided into equally probable subspaces (the figure above shows four subspaces). All sample points are then chosen simultaneously, making sure that the total ensemble of sample points is a Latin hypercube sample and that each subspace is sampled with the same density.
 
Thus, orthogonal sampling ensures that the ensemble of random numbers is a very good representative of the real variability; LHS ensures that the ensemble is a good representative of the real variability; traditional random sampling (sometimes called brute force) is just an ensemble of random numbers without any such guarantee.
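The stratification guarantee can be made concrete by counting how many of the equally probable intervals along each axis actually receive a sample. This is an illustrative comparison (not from the original article; the helper name, seed, and grid size are arbitrary choices): with plain random sampling some intervals are typically missed, while an LHS by construction hits every interval on every axis.

```python
import numpy as np

def stratum_coverage(points, m):
    """Fraction of the m equally probable intervals hit along each axis."""
    return [len(set((points[:, d] * m).astype(int))) / m
            for d in range(points.shape[1])]

rng = np.random.default_rng(0)
m, dims = 20, 2

# Plain random sampling: no stratification guarantee.
random_pts = rng.random((m, dims))

# Latin hypercube sampling: one point per stratum along each axis,
# with the strata shuffled independently per dimension.
lhs_pts = (rng.random((m, dims)) + np.arange(m)[:, None]) / m
for d in range(dims):
    rng.shuffle(lhs_pts[:, d])

print(stratum_coverage(random_pts, m))  # typically below 1.0 on some axis
print(stratum_coverage(lhs_pts, m))     # 1.0 on every axis, by construction
```

Orthogonal sampling would additionally guarantee even coverage of the joint subspaces, not only of each axis separately.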
 
==References==
<references/>
 
 
[[Category:Sampling techniques]]