{{Short description|Experiment used to study a computer simulation}}
A '''computer experiment''' or '''simulation experiment''' is an experiment used to study a computer simulation, also referred to as an [[in silico]] system. This area includes [[computational physics]], [[computational chemistry]], [[computational biology]] and other similar disciplines.
 
The design of computer experiments differs considerably from the [[design of experiments]] for parametric models. Since a Gaussian process prior has an infinite-dimensional representation, the A- and D-optimality criteria (see [[Optimal design]]), which focus on reducing the error in the parameters, cannot be used. Replication would also be wasteful in cases where the computer simulation has no error. Criteria used to judge an experimental design include the integrated mean squared prediction error [https://web.archive.org/web/20170918022130/https://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.ss%2F1177012413] and distance-based criteria [http://www.sciencedirect.com/science/article/pii/037837589090122B].
 
Popular strategies for design include [[latin hypercube sampling]] and [[low discrepancy sequences]].
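A distance-based criterion such as the maximin distance can be evaluated in a few lines of NumPy. The following is an illustrative sketch (the criterion, grid design, and function name are chosen for the example and are not any particular reference implementation):

```python
import numpy as np

def maximin_distance(design):
    """Distance-based design criterion: the smallest pairwise
    distance between design points (larger = better space-filling)."""
    n = len(design)
    return min(np.linalg.norm(design[i] - design[j])
               for i in range(n) for j in range(i + 1, n))

# Compare a random design with a crude grid design in [0, 1]^2.
rng = np.random.default_rng(0)
random_design = rng.random((16, 2))
grid = np.array([[i / 3, j / 3] for i in range(4) for j in range(4)])
print(maximin_distance(random_design))  # typically small: random points can clump
print(maximin_distance(grid))           # 1/3: nearest grid neighbours are 1/3 apart
```

Maximizing this criterion over candidate designs favours point sets whose closest pair is as far apart as possible, one formalization of "space-filling".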
===Sampling strategies===
 
In [[Computer simulation|computational modeling]] and [[computer experiment]]s, '''sampling''' is a process of data generation, which is commonly used for [[Monte Carlo method|Monte Carlo simulation]] and [[uncertainty analysis]], global [[sensitivity analysis]] and other sensitivity and [[design of experiments]] studies, for example with [[heat maps]] and [[Response surface methodology|response surfaces]].
 
Commonly used sampling strategies include the following.
# [[Simple random sampling]] is the most common method used for Monte Carlo simulation.<ref>Olken, F., and Rotem, D. (1986). "Simple random sampling from relational databases."</ref> Simple random sampling can be implemented in any programming language and spreadsheet environment with a straightforward {{smallcaps|rand}} function or its variations.
# [[Latin hypercube sampling]] is a simple approach that attempts to improve coverage of the multidimensional space compared to simple random sampling but does not always outperform it.<ref>{{cite book | last1 = Fang | first1 = K. T. | last2 = Li
| first2 = R. | last3 = Sudjianto
| first3 = A. | title = Design and Modeling for Computer Experiments | year = 2006 | series = Computer Science and Data Analysis Series}}</ref>
# [[Quasi-Monte Carlo method|Quasi-random sampling]] attempts to fill the multidimensional space more uniformly by utilizing [[low-discrepancy sequences]], among which [[Sobol sequence|Sobol' sequences]] are often found to be the best performing.<ref name="Owen2023">{{cite book |last=Owen |first=A. B. |year=2023 |title=Practical Quasi-Monte Carlo |url=https://artowen.su.domains/mc/practicalqmc.pdf }}</ref> Quasi-random sampling functionality is available in Python,<ref>{{cite web|title=Quasi-monte carlo submodule (scipy.stats.qmc)|url=https://docs.scipy.org/doc/scipy/reference/stats.qmc.html|website=SciPy|year=2023}}</ref> R,<ref>{{cite web
|title=Toolbox for Pseudo and Quasi Random Number Generation and Random Generator Tests |url=https://cran.r-project.org/web/packages/randtoolbox/randtoolbox.pdf |author1=Chalabi, Y.
|author2=Dutang, C. |author3=Savicky, P.
|author4=Wuertz, D. |author5=Knuth, D.
|author6=Matsumoto, M. |author7=Saito, M. |year=2023 }}</ref> Julia,<ref>{{cite web
|title=The QMC module for Julia
|url=https://github.com/PieterjanRobbe/QMC.jl |author=Robbe, P. |year=2018 }}</ref> and Matlab.<ref>{{cite web |title=Generating quasi-random numbers |url=https://se.mathworks.com/help/stats/generating-quasi-random-numbers.html |author=MathWorks |year=2013 }}</ref>
# [[Factorial experiment|Full factorial design]] involves choosing several points for each input variable and then evaluating the model for all their combinations. The computational complexity of this method grows exponentially with the number of input variables involved. Various [[fractional factorial design]] methods have been introduced to improve the computational efficiency of the approach.<ref>{{cite book
|last=Antony |first=J. |year=2023 |title=Design of Experiments for Engineers and Scientists |publisher=Elsevier }}</ref>
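The first three strategies above can be compared side by side with the SciPy <code>qmc</code> submodule cited above. A minimal sketch, assuming a reasonably recent SciPy (≥ 1.7); the discrepancy measure here is one way to quantify uniformity of coverage, with lower values indicating more uniform samples:

```python
import numpy as np
from scipy.stats import qmc

n, d = 128, 2  # 128 points in the unit square

# 1. Simple random sampling: independent uniform draws.
rng = np.random.default_rng(42)
srs = rng.random((n, d))

# 2. Latin hypercube sampling: one point per stratum in each dimension.
lhs = qmc.LatinHypercube(d=d, seed=42).random(n)

# 3. Quasi-random sampling: a scrambled Sobol' low-discrepancy sequence.
sobol = qmc.Sobol(d=d, seed=42).random(n)

# Lower discrepancy = more uniform coverage of [0, 1]^d.
for name, sample in [("random", srs), ("LHS", lhs), ("Sobol'", sobol)]:
    print(name, qmc.discrepancy(sample))
```

With a sample size that is a power of two (as here), the Sobol' sequence retains its balance properties and typically yields a markedly lower discrepancy than plain random sampling.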
 
===Problems with massive sample sizes===
Unlike physical experiments, it is common for computer experiments to have thousands of different input combinations. Because standard Gaussian process inference requires the [[Invertible matrix|inversion]] of a square matrix whose dimension equals the number of samples (<math>n</math>), the cost grows as <math> \mathcal{O} (n^3) </math>. Inverting large, dense matrices can also introduce numerical inaccuracies. This problem can be addressed with greedy decision tree techniques, which allow efficient computation for large dimensionality and sample sizes [https://patents.google.com/patent/WO2013055257A1/en patent WO2013055257A1], or avoided altogether by using approximation methods, e.g. [https://wayback.archive-it.org/all/20120130182750/http://www.stat.wisc.edu/~zhiguang/Multistep_AOS.pdf].
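The cubic cost can be seen in a minimal NumPy sketch of Gaussian process prediction: the expensive step is solving an <math>n \times n</math> linear system in the kernel matrix. The squared-exponential kernel, length scale, and jitter value below are illustrative assumptions, not part of any method described above:

```python
import numpy as np

def gp_predict(X, y, X_new, length_scale=0.2):
    """Gaussian process posterior mean at X_new given noise-free
    observations y at X. The n x n linear solve dominates: O(n^3)."""
    def kernel(A, B):
        # Squared-exponential kernel (an illustrative choice).
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * sq / length_scale ** 2)

    K = kernel(X, X) + 1e-6 * np.eye(len(X))  # jitter for numerical stability
    alpha = np.linalg.solve(K, y)             # the O(n^3) step
    return kernel(X_new, X) @ alpha

rng = np.random.default_rng(0)
X = rng.random((50, 1))          # 50 sample points in [0, 1]
y = np.sin(6 * X[:, 0])          # a smooth test function
print(gp_predict(X, y, np.array([[0.5]])))  # close to sin(3)
```

Doubling the sample size roughly multiplies the cost of the solve by eight, which is why approximation methods become attractive for massive designs.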
 
==See also==
*[[Grey box completion and validation]]
*[[Artificial financial market]]
 
 
==Further reading==
 
* {{cite journal | last1 = Fehr | first1 = Jörg | last2 = Heiland | first2 = Jan | last3 = Himpe | first3 = Christian | last4 = Saak | first4 = Jens | title = Best practices for replicability, reproducibility and reusability of computer-based experiments exemplified by model reduction software | journal = AIMS Mathematics | volume = 1 | issue = 3 | pages = 261–281 | date = 2016 | doi = 10.3934/Math.2016.3.261 | arxiv = 1607.01191 | s2cid = 14715031 }}
 
==References==
{{reflist}}
 
[[Category:Computational science]]