{{Short description|Screening method}}
==Overview==
 
Published in 1991 by Max Morris,<ref>https://www.stat.iastate.edu/people/max-morris Home Page of Max D. Morris at [[Iowa State University]]</ref> the '''elementary effects (EE) method'''<ref name="Morris"/> is one of the most used<ref>Borgonovo, Emanuele, and Elmar Plischke. 2016. “Sensitivity Analysis: A Review of Recent Advances.” European Journal of Operational Research 248 (3): 869–87. https://doi.org/10.1016/J.EJOR.2015.06.032.</ref><ref>Iooss, Bertrand, and Paul Lemaître. 2015. “A Review on Global Sensitivity Analysis Methods.” In Uncertainty Management in Simulation-Optimization of Complex Systems, edited by G. Dellino and C. Meloni, 101–22. Boston, MA: Springer. https://doi.org/10.1007/978-1-4899-7547-8_5.</ref><ref>Norton, J.P. 2015. “An Introduction to Sensitivity Assessment of Simulation Models.” Environmental Modelling & Software 69 (C): 166–74. https://doi.org/10.1016/j.envsoft.2015.03.020.</ref><ref>Wei, Pengfei, Zhenzhou Lu, and Jingwen Song. 2015. “Variable Importance Analysis: A Comprehensive Review.” Reliability Engineering & System Safety 142: 399–432. https://doi.org/10.1016/j.ress.2015.05.018.</ref> screening methods in [[sensitivity analysis]].
 
EE is applied to identify non-influential inputs for a computationally costly [[mathematical model]] or for a model with a large number of inputs, where the cost of estimating other sensitivity analysis measures, such as the [[variance]]-based measures, is not affordable. Like all screening methods, the EE method provides qualitative sensitivity analysis measures, i.e. measures which allow non-influential inputs to be identified and the input factors to be ranked in order of importance, but which do not quantify exactly the relative importance of the inputs.
 
==Methodology==
 
To exemplify the EE method, consider a mathematical model with <math> k </math> input factors. Let <math> Y </math> be the output of interest (a scalar for simplicity):<br />
 
: <math> Y = f(X_1, X_2, ... X_k).</math>
 
The original EE method of Morris <ref name="Morris">Morris, M. D. (1991). Factorial sampling plans for preliminary computational experiments. ''Technometrics'', '''33''', 161–174.</ref> provides two sensitivity measures for each input factor:
 
* the measure <math> \mu </math>, assessing the overall importance of an input factor on the model output;
* the measure <math> \sigma </math>, describing [[Nonlinear system|non-linear]] effects and interactions.
 
These two measures are obtained through a design based on the construction of a series of [[Trajectory|trajectories]] in the space of the inputs, where inputs are randomly moved One-At-a-Time (OAT).
In this design, each model input is assumed to vary across <math>p</math> selected levels in the space of the input factors. The region of experimentation <math>\Omega</math> is thus a <math>k</math>-dimensional <math>p</math>-level grid.
 
Each trajectory is composed of <math>(k+1)</math> points since input factors move one at a time by a step <math> \Delta </math> in <math>\{1/(p-1), 2/(p-1), \ldots , 1-1/(p-1)\}</math> while all the others remain fixed. Figure 1 shows an example of a trajectory in three dimensions.
[[File:EE_trajectory.jpg|thumb | right | 300px | Figure 1: Example of a trajectory in 3 dimensions.]]
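The trajectory construction described above can be sketched in a few lines of code. The function below is an illustrative simplification (it always moves inputs by <math>+\Delta</math>, whereas the original scheme of Morris also allows steps of <math>-\Delta</math> and uses a compact matrix formulation); the function name and parameters are chosen for this example only:

```python
import numpy as np

def morris_trajectory(k, p=4, seed=None):
    """Build one random OAT trajectory of k+1 points on a k-dimensional
    p-level grid in [0, 1]^k (illustrative sketch, steps of +delta only)."""
    rng = np.random.default_rng(seed)
    delta = p / (2 * (p - 1))              # recommended step for even p
    # grid levels low enough that adding delta keeps the point in [0, 1]
    levels = np.arange(0, 1 - delta + 1e-12, 1 / (p - 1))
    x = rng.choice(levels, size=k)         # random base point on the grid
    order = rng.permutation(k)             # random order in which inputs move
    traj = [x.copy()]
    for i in order:
        x = x.copy()
        x[i] += delta                      # move one input at a time
        traj.append(x)
    return np.array(traj), order

traj, order = morris_trajectory(k=3, p=4, seed=0)
# traj has k+1 = 4 rows; consecutive rows differ in exactly one coordinate
```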
 
Along each trajectory the so-called ''elementary effect'' for each input factor is defined as:<br />

: <math> d_i(\mathbf{X}) = \frac{Y(X_1, \ldots ,X_{i-1}, X_i + \Delta, X_{i+1}, \ldots, X_k ) - Y( \mathbf{X})}{\Delta} </math>,
 
where <math> \mathbf{X} = (X_1, X_2, ... X_k)</math> is any selected value in <math> \Omega </math> such that the transformed point is still in <math> \Omega </math> for each index <math> i=1,\ldots, k. </math>
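The elementary effect is simply a one-sided finite difference of step <math> \Delta </math>. As a minimal sketch, with a hypothetical toy model chosen for this example:

```python
import numpy as np

def elementary_effect(f, x, i, delta):
    """d_i(X): finite-difference effect of input i at base point x."""
    x_moved = np.array(x, dtype=float)
    x_moved[i] += delta
    return (f(x_moved) - f(np.asarray(x, dtype=float))) / delta

# toy model: linear in X1, quadratic in X2, X3 inactive
f = lambda x: 2.0 * x[0] + x[1] ** 2

d1 = elementary_effect(f, [0.0, 0.0, 0.0], 0, 2 / 3)  # = 2.0 (linear term)
d2 = elementary_effect(f, [0.0, 0.0, 0.0], 1, 2 / 3)  # depends on base point
d3 = elementary_effect(f, [0.0, 0.0, 0.0], 2, 2 / 3)  # = 0.0 (inactive input)
```

Note that for the non-linear input the value of the elementary effect changes with the base point, which is why several effects per input are averaged below.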
 
<math> r </math> elementary effects are estimated for each input, <math> d_i\left(X^{(1)} \right), d_i\left( X^{(2)} \right), \ldots, d_i\left( X^{(r)} \right) </math>, by [[Random sampling|randomly sampling]] <math> r </math> points <math> X^{(1)}, X^{(2)}, \ldots , X^{(r)}</math>.

Usually <math> r </math> is in the range 4&ndash;10, depending on the number of input factors, on the [[computational cost]] of the model and on the choice of the number of levels <math> p </math>, since a high number of levels to be explored needs to be balanced by a high number of trajectories, in order to obtain an exploratory sample. It has been demonstrated that a convenient choice for the [[parameter]]s <math> p </math> and <math> \Delta </math> is <math> p </math> even and <math> \Delta </math> equal to <math>
p/[2(p-1)]</math>, as this ensures equal probability of sampling in the input space.
 
In case input factors are not uniformly distributed, the best practice is to sample in the space of the quantiles and to obtain the input values using inverse [[cumulative distribution function]]s. Note that in this case <math> \Delta </math> equals the step taken by the inputs in the space of the quantiles.<br />
[[File:Example_EEMethod.jpg|thumb | right | 400px | Figure 2: Example of identification of non-influential input factors by the use of the measures <math> \mu </math> and <math> \sigma </math>. The model is the function proposed by Morris to exemplify the method in his original work.]]
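The quantile-space sampling can be sketched as follows, here assuming for illustration a single normally distributed input <math>X_1 \sim N(10, 2)</math> and <math>p=6</math> (the distribution and values are hypothetical, chosen only for this example):

```python
from statistics import NormalDist

# The grid/trajectory lives in [0, 1] quantile space; each coordinate is
# pushed through the inverse CDF of its input distribution.
dist = NormalDist(mu=10.0, sigma=2.0)   # hypothetical input X1 ~ N(10, 2)
delta = 0.6                             # p/[2(p-1)] with p = 6
q_base = 0.2                            # a grid level in quantile space

x_base  = dist.inv_cdf(q_base)          # actual input value at the base point
x_moved = dist.inv_cdf(q_base + delta)  # value after the OAT step
# note: x_moved - x_base is NOT delta; the step of size delta is taken
# in quantile space, not in the space of the input values
```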
 
The two measures <math> \mu </math> and <math> \sigma </math> are defined as the mean and the [[standard deviation]] of the distribution of the elementary effects of each input:<br />
: <math> \mu_i = \frac{1}{r} \sum_{j=1}^r d_i \left( X^{(j)} \right) </math>,<br />
: <math> \sigma_i = \sqrt{ \frac{1}{(r-1)} \sum_{j=1}^r \left( d_i \left( X^{(j)} \right) - \mu_i \right)^2} </math>.<br />
 
These two measures need to be read together (e.g. on a two-dimensional graph, see Figure 2) in order to rank input factors in order of importance and identify those inputs which do not influence the output variability. Low values of both <math> \mu </math> and <math> \sigma </math> correspond to a non-influential input.<br />
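Computing the two measures amounts to a column-wise mean and sample standard deviation over the <math> r </math> elementary effects. A minimal sketch with illustrative (made-up) effect values for three inputs:

```python
import numpy as np

# Rows: r = 3 trajectories; columns: k = 3 inputs. Values are illustrative.
d = np.array([[2.0,  0.1, 0.0],
              [2.0,  0.9, 0.0],
              [2.0, -0.8, 0.0]])

mu    = d.mean(axis=0)
sigma = d.std(axis=0, ddof=1)   # ddof=1 matches the 1/(r-1) factor

# input 1: high mu, zero sigma -> strong linear effect
# input 2: low mu, high sigma  -> non-linear and/or interaction effects
# input 3: both low            -> non-influential
```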
 
 
[[File:Mu_mustar_sigma.jpg|thumb | left | 350px | Figure 3: Advantage of using the revised measure <math> \mu^* </math> instead of the couple of measures {<math> \mu </math>, <math> \sigma </math>}: the measure <math> \mu^* </math> alone provides a ranking of the inputs and allows the identification of non-influential input factors. The test is exemplified on the ''g''-function, which is a standard function used to test sensitivity analysis methods.]]
 
An improvement of this method was developed by Campolongo et al.,<ref>Campolongo, F., J. Cariboni, and A. Saltelli (2007). An effective screening design for sensitivity analysis of large models. ''Environmental Modelling and Software'', '''22''', 1509&ndash;1518.</ref> who proposed a revised measure <math> \mu^* </math>, which on its own is sufficient to provide a reliable ranking of the input factors. The revised measure is the mean of the [[Distribution (mathematics)|distribution]] of the absolute values of the elementary effects of the input factors:<br />
: <math> \mu_i^* = \frac{1}{r} \sum_{j=1}^r \left| d_i \left( X^{(j)} \right) \right| </math>.<br />
The use of <math> \mu^* </math> solves the problem of the effects of opposite signs which occurs when the model is non-[[Monotonic function|monotonic]] and which can cancel each other out, thus resulting in a low value for <math> \mu </math> (see Figure 3 for an example).<br />
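The cancellation problem that <math> \mu^* </math> avoids is easy to see numerically. With illustrative effects of alternating sign for a single non-monotonic input:

```python
import numpy as np

# Elementary effects of opposite sign cancel in mu but not in mu*,
# which averages absolute values. Values below are illustrative.
d_i = np.array([1.0, -1.0, 1.0, -1.0])

mu      = d_i.mean()          # 0.0 -> input wrongly looks non-influential
mu_star = np.abs(d_i).mean()  # 1.0 -> input correctly flagged as influential
```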
 
An efficient technical scheme to construct the trajectories used in the EE method is presented in the original paper by Morris, while an improvement strategy aimed at better exploring the input space is proposed by Campolongo et al.
 
==References==
{{reflist}}
 
[[Category:Mathematical modeling]]
[[Category:Sensitivity analysis]]