Content deleted Content added
Line 56:
==== Dynamic programming ====
[[Dynamic programming]] deals with situations where decisions are made in
One dynamic basic model has two features:
1)
2) The cost function is additive over time.
For discrete
:<math>x_{k+1} = f_k(x_{k},u_{k},w_{k}) , k=0,1,...,N-1</math>
:<math>k</math> represents the index of discrete time.
:<math>x_k</math> is the state of the time k, it contains the past information and prepare it for the future optimization.
:<math>u_k</math> is the control variable.
:<math>w_k</math> is the random parameter.
For the cost function, it has the form:
:<math>g_N(X_N) + \sum_{k=0}^{N-1} gk(x_k,u_k,W_k)</math>
<math>g_N(X_N)</math> is the cost at the end of the process.
Line 84:
As the cost cannot be optimized meaningfully, it can be used the expect value:
:<math>E\{g_N(X_N) + \sum_{k=0}^{N-1} g_k(x_k,u_k,W_k) \}</math>
==== Neuro-dynamic programming ====
|