Stochastic programming

{{For|the context of control theory|Stochastic control}}
 
In the field of [[mathematical optimization]], '''stochastic programming''' is a framework for [[Mathematical model|modeling]] [[Optimization (mathematics)|optimization]] problems that involve [[uncertainty]]. A '''stochastic program''' is an optimization problem in which some or all problem parameters are uncertain, but follow known [[probability distribution]]s.<ref>{{cite book|last1=Shapiro|first1=Alexander|author2-link=Darinka Dentcheva|last2=Dentcheva|first2=Darinka|author3-link=Andrzej Piotr Ruszczyński|last3=Ruszczyński|first3=Andrzej|title=Lectures on stochastic programming: Modeling and theory|series=MPS/SIAM Series on Optimization|volume=9|publisher=Society for Industrial and Applied Mathematics (SIAM) and the Mathematical Programming Society|___location=Philadelphia, PA|year=2009|pages=xvi+436|isbn=978-0-89871-687-0|url=http://www2.isye.gatech.edu/people/faculty/Alex_Shapiro/SPbook.pdf|mr=2562798|access-date=2010-09-22|archive-date=2020-03-24|archive-url=https://web.archive.org/web/20200324131907/https://www2.isye.gatech.edu/people/faculty/Alex_Shapiro/SPbook.pdf|url-status=dead}}</ref><ref>{{Cite book|last1=Birge|first1=John R.|last2=Louveaux|first2=François|date=2011|title=Introduction to Stochastic Programming|url=https://doi.org/10.1007/978-1-4614-0237-4|series=Springer Series in Operations Research and Financial Engineering|language=en-gb|doi=10.1007/978-1-4614-0237-4|isbn=978-1-4614-0236-7|issn=1431-8598}}</ref> This framework contrasts with deterministic optimization, in which all problem parameters are assumed to be known exactly.
The goal of stochastic programming is to find a decision which both optimizes some criteria chosen by the decision maker and appropriately accounts for the uncertainty of the problem parameters. Because many real-world decisions involve uncertainty, stochastic programming has found applications in areas ranging from [[finance]] to [[transportation]] to energy optimization.<ref>
Stein W. Wallace and William T. Ziemba (eds.). ''[https://books.google.com/books?id=KAI0jsuyDPsC&printsec=frontcover&dqq=%22Applications+of+Stochastic+Programming%22&hl=en&sa=X&ved=0ahUKEwivt-nn2OfiAhURXa0KHYJMC9UQ6AEIKjAA#v=onepage&q=%22Applications%20of%20Stochastic%20Programming%22&f=false Applications of Stochastic Programming]''. MPS-SIAM Book Series on Optimization 5, 2005.
</ref><ref>
Applications of stochastic programming are described at the following website, [http://stoprog.org Stochastic Programming Community].
</ref>
 
==Methods==
 
Several stochastic programming methods have been developed:
* Scenario-based methods including sample average approximation
* Stochastic integer programming for problems in which some variables must be integers
* [[Chance constrained programming]] for dealing with constraints that must be satisfied with a given probability
* [[Stochastic dynamic programming]]
* [[Markov decision process]]
* [[Benders decomposition]]
 
== Two-stage problem definition==
The basic idea of two-stage stochastic programming is that (optimal) decisions should be based on data available at the time the decisions are made and cannot depend on future observations. The two-stage formulation is widely used in stochastic programming. The general formulation of a two-stage stochastic programming problem is given by:
<math display="block">
\min_{x\in X}\{ g(x) = f(x) + E_{\xi}[Q(x,\xi)] \},
</math>
where <math>Q(x,\xi)</math> is the optimal value of the second-stage problem
<math display="block">
\min_{y}\{ q(y,\xi) \mid T(\xi)x + W(\xi)y = h(\xi)\}.
</math>
Here <math>x\in\mathbb{R}^n</math> is the first-stage decision variable vector, <math>y\in\mathbb{R}^m</math> is the second-stage decision variable vector, and <math>\xi(q,T,W,h)</math> contains the data of the second-stage problem, some or all elements of which can be random. The expectation is taken with respect to the probability distribution of <math>\xi</math>, which is assumed to be known.
 
=== Distributional assumption ===
The formulation of the above two-stage problem assumes that the second-stage data <math>\xi</math> can be modeled as a random vector with a '''''known''''' probability distribution. This assumption is justified in many situations. For example, the distribution of <math>\xi</math> could be inferred from historical data if one assumes that the distribution does not significantly change over the considered period of time. Also, the empirical distribution of a sample could be used as an approximation to the distribution of the future values of <math>\xi</math>. If one has a prior model for <math>\xi</math>, one could obtain a posterior distribution through a Bayesian update.
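As a minimal sketch of the empirical-distribution approach (all data below is synthetic, chosen purely for illustration), each historical observation of <math>\xi</math> can be treated as one equally weighted scenario:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in for historical observations of a scalar data element xi.
historical = rng.normal(loc=5.0, scale=2.0, size=500)

# The empirical distribution: each observation becomes one equally
# weighted scenario of the stochastic program.
scenarios = historical
probs = np.full(len(scenarios), 1.0 / len(scenarios))

# Any expectation over xi is then approximated by a scenario-weighted sum,
# e.g. the estimate of E[xi]:
print(float(probs @ scenarios))
```

This is exactly the approximation that the sample average approximation method below makes systematic.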
 
==Scenario-based approach==
 
=== Discretization ===
These questions are not independent. For example, the number of scenarios constructed will affect both the tractability of the deterministic equivalent and the quality of the obtained solutions.
 
=== Stochastic linear programming===
A stochastic [[linear program]] is a specific instance of the classical two-stage stochastic program. A stochastic LP is built from a collection of multi-period linear programs (LPs), each having the same structure but somewhat different data. The <math>k^{th}</math> two-period LP, representing the <math>k^{th}</math> scenario, may be regarded as having the following form:
 
<math>
\begin{array}{lccccccc}
\text{Minimize} & f^T x & + & g^T y & + & h_k^Tz_k & & \\
\text{subject to} & Tx & + & Uy & & & = & r \\
& & & V_k y & + & W_kz_k & = & s_k \\
& x & , & y & , & z_k & \geq & 0
\end{array}
</math>
Note that solving the <math>k^{th}</math> two-period LP is equivalent to assuming the <math>k^{th}</math> scenario in the second period with no uncertainty. In order to incorporate uncertainties in the second stage, one should assign probabilities to different scenarios and solve the corresponding deterministic equivalent.
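For concreteness, the <math>k</math>-th scenario LP above can be assembled and solved with SciPy's <code>linprog</code>; all matrices and vectors below are small made-up data, not taken from any particular application:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data for a single scenario k: two first-period variables x,
# one linking variable y, and two second-period variables z_k.
f = np.array([1.0, 2.0])    # costs of x
g = np.array([1.5])         # costs of y
h_k = np.array([3.0, 1.0])  # scenario-k costs of z_k

T = np.array([[1.0, 1.0]])  # first-period constraint  T x + U y = r
U = np.array([[1.0]])
r = np.array([10.0])

V_k = np.array([[2.0]])     # second-period constraint V_k y + W_k z_k = s_k
W_k = np.array([[1.0, 1.0]])
s_k = np.array([8.0])

# Stack the two constraint blocks into one LP over (x, y, z_k) >= 0.
c = np.concatenate([f, g, h_k])
A_eq = np.block([
    [T, U, np.zeros((1, 2))],
    [np.zeros((1, 2)), V_k, W_k],
])
res = linprog(c, A_eq=A_eq, b_eq=np.concatenate([r, s_k]), bounds=(0, None))
print(res.fun)  # optimal cost if scenario k occurs with certainty
```

Solving this LP amounts to planning as if scenario <math>k</math> were certain; the deterministic equivalent couples several such blocks through the shared variables <math>x</math> and <math>y</math>.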
 
==== Deterministic equivalent of a stochastic problem====
With a finite number of scenarios, two-stage stochastic linear programs can be modelled as large linear programming problems. This formulation is often called the deterministic equivalent linear program, or abbreviated to deterministic equivalent. (Strictly speaking, a deterministic equivalent is any mathematical program that can be used to compute the optimal first-stage decision, so such programs exist for continuous probability distributions as well, provided the second-stage cost can be represented in closed form.)
For example, to form the deterministic equivalent to the above stochastic linear program, we assign a probability <math>p_k</math> to each scenario <math>k=1,\dots,K</math>. Then we can minimize the expected value of the objective, subject to the constraints from all scenarios:
<math>
\begin{array}{lccccccc}
\text{Minimize} & f^T x & + & g^T y & + & \sum_{k=1}^{K} p_k h_k^T z_k & & \\
\text{subject to} & Tx & + & Uy & & & = & r \\
& & & V_k y & + & W_k z_k & = & s_k, \quad k=1,\dots,K \\
& x & , & y & , & z_k & \geq & 0
\end{array}
</math>
We have a different vector <math>z_k</math> of later-period variables for each scenario <math>k</math>. The first-period variables <math>x</math> and <math>y</math> are the same in every scenario, however, because we must make a decision for the first period before we know which scenario will be realized. As a result, the constraints involving just <math>x</math> and <math>y</math> need only be specified once, while the remaining constraints must be given separately for each scenario.
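Continuing with the same kind of made-up data, the deterministic equivalent for <math>K=2</math> equally likely scenarios can be written out explicitly; note how <math>x</math> and <math>y</math> appear once while each scenario gets its own <math>z_k</math>:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical problem data: 2 first-stage variables x, 1 linking variable y,
# and K = 2 equally likely scenarios with 2 recourse variables z_k each.
f, g = np.array([1.0, 2.0]), np.array([1.5])
T, U, r = np.array([[1.0, 1.0]]), np.array([[1.0]]), np.array([10.0])
p = [0.5, 0.5]                                    # scenario probabilities
h = [np.array([3.0, 1.0]), np.array([2.0, 2.0])]  # scenario recourse costs
V = [np.array([[2.0]]), np.array([[1.0]])]
W = [np.array([[1.0, 1.0]]), np.array([[1.0, 1.0]])]
s = [np.array([8.0]), np.array([6.0])]

K, nz = len(p), 2
# Decision vector (x, y, z_1, ..., z_K); recourse costs weighted by p_k.
c = np.concatenate([f, g] + [p[k] * h[k] for k in range(K)])
rows = [np.hstack([T, U, np.zeros((1, K * nz))])]   # T x + U y = r (once)
for k in range(K):                                  # V_k y + W_k z_k = s_k
    row = np.zeros((1, 3 + K * nz))
    row[0, 2] = V[k][0, 0]
    row[:, 3 + k * nz:3 + (k + 1) * nz] = W[k]
    rows.append(row)
res = linprog(c, A_eq=np.vstack(rows),
              b_eq=np.concatenate([r] + s), bounds=(0, None))
print(res.fun)  # expected cost of the optimal first-stage decision
```

The constraint matrix already shows the block-angular structure (one row block per scenario) that decomposition algorithms exploit.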
 
=== Scenario construction ===
In practice it might be possible to construct scenarios by eliciting experts' opinions on the future. The number of constructed scenarios should be relatively modest so that the obtained deterministic equivalent can be solved with reasonable computational effort. It is often claimed that a solution that is optimal for only a few scenarios still provides more adaptable plans than one that assumes a single scenario only. In some cases such a claim can be verified by simulation. In theory, there are measures that guarantee that a solution obtained this way solves the original problem with reasonable accuracy. Typically in applications only the ''first stage'' optimal solution <math>x^*</math> has practical value, since a "true" realization of the random data will almost always differ from the set of constructed (generated) scenarios.
 
Suppose <math>\xi</math> contains <math>d</math> independent random components, each of which has three possible realizations (for example, future realizations of each random parameter are classified as low, medium and high); then the total number of scenarios is <math>K=3^d</math>. Such ''exponential growth'' of the number of scenarios makes model development using expert opinion very difficult even for moderate <math>d</math>. The situation becomes even worse if some random components of <math>\xi</math> have continuous distributions.
 
====Monte Carlo sampling and the sample average approximation (SAA) method====
 
A common approach to reduce the scenario set to a manageable size is Monte Carlo simulation. Suppose the total number of scenarios is very large or even infinite. Suppose further that we can generate a sample <math>\xi^1,\xi^2,\dots,\xi^N</math> of <math>N</math> realizations of the random vector <math>\xi</math>. Usually the sample is assumed to be [[independent and identically distributed]] (i.i.d.). Given a sample, the expectation function <math>q(x)=E[Q(x,\xi)]</math> is approximated by the sample average
 
<math>
\hat{q}_N(x) = \frac{1}{N}\sum_{j=1}^{N} Q(x,\xi^j)
</math>
 
This formulation is known as the ''sample average approximation'' (SAA) method. The SAA problem is a function of the considered sample and in that sense is random. For a given sample <math>\xi^1,\xi^2,\dots,\xi^N</math> the SAA problem is of the same form as a two-stage stochastic linear programming problem with the scenarios <math>\xi^j</math>, <math>j=1,\dots,N</math>, each taken with the same probability <math>p_j=\frac{1}{N}</math>.
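The SAA idea can be sketched on a toy one-dimensional problem (the cost structure below is an assumption chosen so that the true optimum is known in closed form): take <math>f(x)=cx</math> and the simple recourse cost <math>Q(x,\xi)=q\max(\xi-x,0)</math>, for which the true minimizer is the <math>(1-c/q)</math>-quantile of <math>\xi</math>:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy costs (hypothetical): first-stage unit cost c, shortage penalty q.
c, q = 1.0, 4.0

def saa_objective(x, sample):
    """SAA of c*x + E[q*max(xi - x, 0)] for a given sample of xi."""
    return c * x + q * np.mean(np.maximum(sample - x, 0.0))

# Draw N i.i.d. realizations of xi ~ Exp(mean 10) and minimize on a grid.
N = 10_000
sample = rng.exponential(scale=10.0, size=N)
grid = np.linspace(0.0, 60.0, 2001)
x_saa = grid[int(np.argmin([saa_objective(x, sample) for x in grid]))]

# The true minimizer is the (1 - c/q)-quantile: -10*ln(c/q) = 13.86...,
# and x_saa should be close to it for large N.
print(x_saa)
```

Re-running with a fresh sample gives a slightly different <code>x_saa</code>, which is exactly the sense in which the SAA problem itself is random.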
 
 
=== Statistical inference ===
 
Consider the following stochastic programming problem
<div><math>
\min\limits_{x\in X}\{ g(x)=f(x)+E[Q(x,\xi)] \}.
</math></div>

Given an i.i.d. sample <math>\xi^1,\dots,\xi^N</math>, the corresponding SAA problem is

<div><math>
\min\limits_{x\in X}\Big\{ \hat{g}_N(x)=f(x)+\frac{1}{N}\sum_{j=1}^{N}Q(x,\xi^j) \Big\}.
</math></div>
 
By the [[law of large numbers]] we have that, under some regularity conditions, <math>\frac{1}{N} \sum_{j=1}^N Q(x,\xi^j)</math> converges pointwise with probability 1 to <math>E[Q(x,\xi)]</math> as <math>N \rightarrow \infty</math>. Moreover, under mild additional conditions the convergence is uniform. We also have <math>E[\hat{g}_N(x)]=g(x)</math>, i.e., <math>\hat{g}_N(x)</math> is an ''unbiased'' estimator of <math>g(x)</math>. Therefore, it is natural to expect that the optimal value and optimal solutions of the SAA problem converge to their counterparts of the true problem as <math>N \rightarrow \infty</math>.
 
====Consistency of SAA estimators====
 
Suppose the feasible set <math>X</math> of the SAA problem is fixed, i.e., it is independent of the sample. Let <math>\vartheta^*</math> and <math>S^*</math> be the optimal value and the set of optimal solutions, respectively, of the true problem and let <math>\hat{\vartheta}_N</math> and <math>\hat{S}_N</math> be the optimal value and the set of optimal solutions, respectively, of the SAA problem.
Under suitable regularity conditions (for example, if <math>X</math> is compact and <math>\hat{g}_N(x)</math> converges to <math>g(x)</math> uniformly on <math>X</math> with probability 1), <math>\hat{\vartheta}_N \rightarrow \vartheta^*</math> and <math>\mathbb{D}(S^*,\hat{S}_N)\rightarrow 0 </math> with probability 1 as <math>N\rightarrow \infty </math>.
 
==== Asymptotics of the SAA optimal value ====
 
Suppose the sample <math>\xi^1,\dots,\xi^N</math> is i.i.d. and fix a point <math>x \in X</math>. Then the sample average estimator <math>\hat{g}_N(x)</math> of <math>g(x)</math> is unbiased and has variance <math>\frac{1}{N}\sigma^2(x)</math>, where <math>\sigma^2(x):=Var[Q(x,\xi)]</math> is assumed to be finite. Moreover, by the [[central limit theorem]] we have that
:<math>
\sqrt{N}\left[\hat{g}_N(x)-g(x)\right] \xrightarrow{\mathcal{D}} N(0,\sigma^2(x)),
</math>
where <math>\xrightarrow{\mathcal{D}}</math> denotes convergence in distribution. This suggests the following (approximate) <math>100(1-\alpha)\%</math> confidence interval for <math>g(x)</math>:
:<math>
\left[\hat{g}_N(x)-z_{\alpha/2}\frac{\hat{\sigma}(x)}{\sqrt{N}},\ \hat{g}_N(x)+z_{\alpha/2}\frac{\hat{\sigma}(x)}{\sqrt{N}}\right],
</math>
where
:<math>
\hat{\sigma}^2(x):=\frac{1}{N-1}\sum_{j=1}^{N}\left[Q(x,\xi^j)-\frac{1}{N}\sum_{i=1}^{N}Q(x,\xi^i)\right]^2
</math>
is the sample variance estimate of <math>\sigma^2(x)</math>. That is, the error of estimation of <math>g(x)</math> is (stochastically) of order <math>O(1/\sqrt{N})</math>.
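For a fixed <math>x</math> this estimate and its confidence interval are straightforward to compute; the recourse function below reuses the toy form <math>Q(x,\xi)=q\max(\xi-x,0)</math>, an assumption made for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy recourse Q(x, xi) = q*max(xi - x, 0) evaluated at a fixed x.
q, x = 4.0, 10.0
N = 50_000
sample = rng.exponential(scale=10.0, size=N)  # i.i.d. realizations of xi
Q_vals = q * np.maximum(sample - x, 0.0)

g_hat = Q_vals.mean()                  # unbiased estimate of g(x) = E[Q(x, xi)]
sigma_hat = Q_vals.std(ddof=1)         # sample estimate of sigma(x)
half_width = 1.96 * sigma_hat / np.sqrt(N)  # 95% CI half-width, O(1/sqrt(N))
print(f"{g_hat:.3f} +/- {half_width:.3f}")
# For xi ~ Exp(mean 10) the exact value is g(10) = 40/e = 14.715...
```

Quadrupling <math>N</math> halves the half-width, which is the practical meaning of the <math>O(1/\sqrt{N})</math> rate.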
 
== Applications and examples==
=== Biological applications ===
[[Stochastic dynamic programming]] is frequently used to model [[ethology|animal behaviour]] in such fields as [[behavioural ecology]].<ref>Mangel, M. & Clark, C. W. 1988. ''Dynamic modeling in behavioral ecology.'' Princeton University Press {{ISBN|0-691-08506-4}}</ref><ref>Houston, A. I & McNamara, J. M. 1999. ''Models of adaptive behaviour: an approach based on state''. Cambridge University Press {{ISBN|0-521-65539-0}}</ref> Empirical tests of models of [[Optimal foraging theory|optimal foraging]], [[Biological life cycle|life-history]] transitions such as [[Fledge|fledging in birds]] and egg laying in [[parasitoid]] wasps have shown the value of this modelling technique in explaining the evolution of behavioural decision making. These models are typically many-staged, rather than two-staged.
 
===Economic applications===
Suppose that at time <math>t=0</math> we have initial capital <math>W_0</math> to invest in <math>n</math> assets. Suppose further that we are allowed to rebalance our portfolio at times <math>t=1,\dots,T-1</math> but without injecting additional cash into it. At each period <math>t</math> we make a decision about redistributing the current wealth <math>W_t</math> among the <math>n</math> assets. Let <math>x_0=(x_{10},\dots,x_{n0})</math> be the initial amounts invested in the n assets. We require that each <math>x_{i0}</math> is nonnegative and that the balance equation <math>\sum_{i=1}^{n}x_{i0}=W_0</math> should hold.
 
Consider the total returns <math>\xi_t=(\xi_{1t},\dots,\xi_{nt})</math> for each period <math>t=1,\dots,T</math>. This forms a vector-valued random process <math>\xi_1,\dots,\xi_T</math>. At time period <math>t=1</math>, we can rebalance the portfolio by specifying the amounts <math>x_1=(x_{11},\dots,x_{n1})</math> invested in the respective assets. At that time the returns in the first period have been realized, so it is reasonable to use this information in the rebalancing decision. Thus, the second-stage decisions, at time <math>t=1</math>, are actually functions of the realization of the random vector <math>\xi_1</math>, i.e., <math>x_1=x_1(\xi_1)</math>. Similarly, at time <math>t</math> the decision <math>x_t=(x_{1t},\dots,x_{nt})</math> is a function <math>x_t=x_t(\xi_{[t]})</math> of the available information given by <math>\xi_{[t]}=(\xi_{1},\dots,\xi_{t})</math>, the history of the random process up to time <math>t</math>. A sequence of functions <math>x_t=x_t(\xi_{[t]})</math>, <math>t=0,\dots,T-1</math>, with <math>x_0</math> being constant, defines an ''implementable policy'' of the decision process. It is said that such a policy is ''feasible'' if it satisfies the model constraints with probability 1, i.e., the nonnegativity constraints <math>x_{it}(\xi_{[t]})\geq 0</math>, <math>i=1,\dots,n</math>, <math>t=0,\dots,T-1</math>, and the balance of wealth constraints,
 
:<math>
\sum_{i=1}^{n}x_{it}(\xi_{[t]}) = W_t,
</math>
where
:<math>
W_t = \sum_{i=1}^{n}\xi_{it}x_{i,t-1}(\xi_{[t-1]})
</math>
is the wealth at period <math>t</math>. The objective is to maximize the expected [[utility]] of the wealth at the last period, i.e., to maximize <math>E[U(W_T)]</math>.
This is a multistage stochastic programming problem, where stages are numbered from <math>t=0</math> to <math>t=T-1</math>. Optimization is performed over all implementable and feasible policies. To complete the problem description one also needs to define the probability distribution of the random process <math>\xi_1,\dots,\xi_T</math>. This can be done in various ways. For example, one can construct a particular scenario tree defining time evolution of the process. If at every stage the random return of each asset is allowed to have two continuations, independent of other assets, then the total number of scenarios is <math>2^{nT}.</math>
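Although finding an optimal policy is the hard part, simulating a given implementable policy is easy. The sketch below rolls one scenario forward under a fixed-mix policy (rebalance to constant proportions each period); the return distribution and all numbers are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# n assets, T periods, initial capital W0 (all values hypothetical).
n, T, W0 = 3, 4, 100.0
weights = np.array([0.5, 0.3, 0.2])   # fixed-mix policy: x_t = weights * W_t

W = W0
for t in range(1, T + 1):
    x = weights * W                   # rebalancing decision made at t-1
    xi = rng.lognormal(mean=0.0, sigma=0.1, size=n)  # total returns, period t
    W = float(xi @ x)                 # wealth balance: W_t = sum_i xi_it x_i,t-1
print(W)  # terminal wealth W_T on this one sample path
```

Averaging <math>U(W_T)</math> over many such paths estimates the objective of a candidate policy; the scenario-tree count <math>2^{nT}</math> shows why enumerating all paths exactly quickly becomes impractical.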
 
In order to write [[dynamic programming]] equations, consider the above multistage problem backward in time. At the last stage <math>t=T-1</math>, a realization <math>\xi_{[T-1]}=(\xi_{1},\dots,\xi_{T-1})</math> of the random process is known and <math>x_{T-2}</math> has been chosen. Therefore, one needs to solve the following problem
 
:<math>
\begin{array}{lrcl}
\max\limits_{x_{T-1}\geq 0} & E[U(W_T)\mid \xi_{[T-1]}] & & \\
\text{subject to} & W_T &=& \sum_{i=1}^{n}\xi_{iT}x_{i,T-1} \\
& \sum_{i=1}^{n}x_{i,T-1} &=& W_{T-1}
\end{array}
</math>

==Software tools==

===Modelling languages===
An instance of an SP problem generated by a general modelling language tends to grow quite large (linearly in the number of scenarios), and its matrix loses the structure that is intrinsic to this class of problems, which could otherwise be exploited at solution time by specific decomposition algorithms.
Extensions to modelling languages specifically designed for SP are starting to appear, see:
*[[AIMMS]] - supports the definition of SP problems
*[[Extended Mathematical Programming (EMP)#EMP for Stochastic Programming|EMP SP]] (Extended Mathematical Programming for Stochastic Programming) - a module of [[General Algebraic Modeling System|GAMS]] created to facilitate stochastic programming (includes keywords for parametric distributions, chance constraints and risk measures such as [[Value at risk]] and [[Expected shortfall]]).
*[[SAMPL]] - a set of extensions to [[AMPL]] specifically designed to express stochastic programs (includes syntax for chance constraints, integrated chance constraints and [[robust optimization]] problems)
Both can generate SMPS instance-level format, which conveys the structure of the problem to the solver in a non-redundant form.
 
==See also==
* [[Chance-constrained portfolio selection]]
* [[Correlation gap]]
* [[Extended Mathematical Programming (EMP)#EMP for Stochastic Programming|EMP for Stochastic Programming]]
* [[Entropic value at risk]]
* [[FortSP]]
* [[SAMPL|SAMPL algebraic modeling language]]
* [[Scenario optimization]]
* [[Stochastic optimization]]
 
==References==
{{Reflist}}
 
==Further reading==
* John R. Birge and François V. Louveaux. ''[https://books.google.com/books?id=Vp0Bp8kjPxUC&pg=PR1 Introduction to Stochastic Programming]''. Springer Verlag, New York, 1997.
* {{cite book | last1=Kall|first1=Peter |last2=Wallace|first2=Stein W.| title=Stochastic programming | series=Wiley-Interscience Series in Systems and Optimization| publisher=John Wiley & Sons, Ltd.| ___location=Chichester|year=1994|pages=xii+307|isbn=0-471-95158-7| url=http://stoprog.org/index.html?introductions.html |mr=1315300 }}
* G. Ch. Pflug: ''[https://books.google.com/books?id=XJXhBwAAQBAJ&pg=PR13 Optimization of Stochastic Models. The Interface between Simulation and Optimization]''. Kluwer, Dordrecht, 1996.
* [[András Prékopa]]. Stochastic Programming. Kluwer Academic Publishers, Dordrecht, 1995.
* [[Andrzej Piotr Ruszczyński|Andrzej Ruszczynski]] and Alexander Shapiro (eds.) (2003) ''Stochastic Programming''. Handbooks in Operations Research and Management Science, Vol. 10, Elsevier.
* {{cite book| last1=Shapiro|first1=Alexander|author2-link=Darinka Dentcheva|last2=Dentcheva|first2=Darinka|author3-link=Andrzej Piotr Ruszczyński|last3=Ruszczyński|first3=Andrzej|title=Lectures on stochastic programming: Modeling and theory| series=MPS/SIAM Series on Optimization| volume=9| publisher=Society for Industrial and Applied Mathematics (SIAM) and the Mathematical Programming Society|___location=Philadelphia, PA|year=2009|pages=xvi+436|isbn=978-0-89871-687-0| url=http://www2.isye.gatech.edu/people/faculty/Alex_Shapiro/SPbook.pdf | mr=2562798 |access-date=2010-09-22|archive-date=2020-03-24|archive-url=https://web.archive.org/web/20200324131907/https://www2.isye.gatech.edu/people/faculty/Alex_Shapiro/SPbook.pdf|url-status=dead}}
* Stein W. Wallace and William T. Ziemba (eds.) (2005) ''Applications of Stochastic Programming''. MPS-SIAM Book Series on Optimization 5
* {{cite book | last1=King|first1=Alan J.|last2=Wallace|first2=Stein W.| title=Modeling with Stochastic Programming | series=Springer Series in Operations Research and Financial Engineering| publisher=Springer| ___location=New York|year=2012|isbn=978-0-387-87816-4| url=https://www.springer.com/mathematics/probability/book/978-0-387-87816-4 }}
==External links==
* [http://stoprog.org Stochastic Programming Community Home Page]
 
{{Major subfields of optimization}}
 
{{DEFAULTSORT:Stochastic Programming}}