Dynamic programming: Difference between revisions

m Fixed reference links that were broken
Citation bot (talk | contribs)
Add: s2cid, pages, volume, journal, year, title, doi, author pars. 1-1. Removed URL that duplicated unique identifier. Converted bare reference to cite template. Removed parameters. Formatted dashes. Some additions/deletions were actually parameter name changes.
Line 18:
The solution to this problem is an optimal control law or policy <math>\mathbf{u}^{\ast} = h(\mathbf{x}(t), t)</math>, which produces an optimal trajectory <math>\mathbf{x}^{\ast}</math> and a [[cost-to-go function]] <math>J^{\ast}</math>. The latter obeys the fundamental equation of dynamic programming:
:<math>- J_{t}^{\ast} = \min_{\mathbf{u}} \left\{ f \left( \mathbf{x}(t), \mathbf{u}(t), t \right) + J_{x}^{\ast \mathsf{T}} \mathbf{g} \left( \mathbf{x}(t), \mathbf{u}(t), t \right) \right\}</math>
a [[partial differential equation]] known as the [[Hamilton–Jacobi–Bellman equation]], in which <math>J_{x}^{\ast} = \frac{\partial J^{\ast}}{\partial \mathbf{x}} = \left[ \frac{\partial J^{\ast}}{\partial x_{1}} ~~~~ \frac{\partial J^{\ast}}{\partial x_{2}} ~~~~ \dots ~~~~ \frac{\partial J^{\ast}}{\partial x_{n}} \right]^{\mathsf{T}}</math> and <math>J_{t}^{\ast} = \frac{\partial J^{\ast}}{\partial t}</math>. One finds the minimizing <math>\mathbf{u}</math> in terms of <math>t</math>, <math>\mathbf{x}</math>, and the unknown function <math>J_{x}^{\ast}</math> and then substitutes the result into the Hamilton–Jacobi–Bellman equation to get the partial differential equation to be solved with boundary condition <math>J \left( t_{1} \right) = b \left( \mathbf{x}(t_{1}), t_{1} \right)</math>.<ref>{{cite book |first1=M. I. |last1=Kamien |authorlink=Morton Kamien |first2=N. L. |last2=Schwartz |authorlink2=Nancy Schwartz |title=Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management |___location=New York |publisher=Elsevier |edition=Second |year=1991 |isbn=978-0-444-01609-6 |url=https://books.google.com/books?id=0IoGUn8wjDQC&pg=PA261 |page=261 }}</ref> In practice, this generally requires [[Numerical partial differential equations|numerical techniques]] for some discrete approximation to the exact optimization relationship.
 
Alternatively, the continuous process can be approximated by a discrete system, which leads to the following recurrence relation, analogous to the Hamilton–Jacobi–Bellman equation:
:<math>J_{k}^{\ast}\left(\mathbf{x}\right) = \min_{\mathbf{u}} \left\{ \hat{f}\left(\mathbf{x}, \mathbf{u}\right) + J_{k-1}^{\ast}\left(\hat{\mathbf{g}}\left(\mathbf{x}, \mathbf{u}\right)\right) \right\}</math>
at the <math>k</math>-th stage of <math>n</math> equally spaced discrete time intervals, where <math>\hat{f}</math> and <math>\hat{\mathbf{g}}</math> denote discrete approximations to <math>f</math> and <math>\mathbf{g}</math>.
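This backward recursion can be evaluated numerically on a discretized state grid. The following is a minimal sketch of such a computation; the quadratic running cost, Euler-discretized scalar dynamics, grids, and function names are illustrative assumptions, not part of the formulation above:

<syntaxhighlight lang="python">
# Backward induction on a discretized state grid: a sketch of the
# recurrence J_k(x) = min_u { f(x, u) + J_{k-1}(g(x, u)) }.
# The cost, dynamics, and grids below are assumed for illustration only.
x_grid = [i * 0.1 for i in range(-20, 21)]   # discretized states
u_grid = [i * 0.5 for i in range(-4, 5)]     # discretized controls
dt, n_stages = 0.1, 30                       # time step and horizon

f = lambda x, u: (x * x + u * u) * dt        # running cost (assumed quadratic)
g = lambda x, u: x + u * dt                  # Euler-discretized dynamics
b = lambda x: x * x                          # terminal cost

def nearest(x):
    """Index of the grid point closest to state x (crude interpolation)."""
    return min(range(len(x_grid)), key=lambda i: abs(x_grid[i] - x))

# J[i] holds the cost-to-go at x_grid[i]; start from the terminal condition.
J = [b(x) for x in x_grid]
for _ in range(n_stages):
    # One Bellman backup per stage, sweeping the whole state grid.
    J = [min(f(x, u) + J[nearest(g(x, u))] for u in u_grid) for x in x_grid]
</syntaxhighlight>

In practice finer grids and proper interpolation would be used, but the structure, a terminal condition followed by repeated minimizations over the control, mirrors the recurrence above.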
Line 368:
For ''n'' = 1 the problem is trivial, namely ''S''(1,''h'',''t'') = "move a disk from rod ''h'' to rod ''t''" (there is only one disk left).
 
The number of moves required by this solution is 2<sup>''n''</sup>&nbsp;&minus;&nbsp;1. If the objective is to '''maximize''' the number of moves (without cycling) then the dynamic programming [[Bellman equation|functional equation]] is slightly more complicated and 3<sup>''n''</sup>&nbsp;&minus;&nbsp;1 moves are required.<ref>{{Citation |author=Moshe Sniedovich |title=OR/MS Games: 2. The Towers of Hanoi Problem |journal=INFORMS Transactions on Education |volume=3 |issue=1 |year=2002 |pages=34–51 |doi=10.1287/ited.3.1.45 |postscript=.}}</ref>
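The minimal-move recursion ''S''(''n'',''h'',''t'') translates directly into code. The following is a minimal sketch; the rod labelling 1, 2, 3 and the function name are illustrative choices, not part of the original formulation:

<syntaxhighlight lang="python">
def solve(n, h, t):
    """Moves that transfer n disks from rod h to rod t (rods labeled 1, 2, 3).

    Implements S(n, h, t): clear the top n-1 disks onto the spare rod,
    move the largest disk, then restack the n-1 disks on top of it.
    """
    if n == 1:
        return [(h, t)]               # S(1, h, t): move a disk from h to t
    r = 6 - h - t                     # the remaining (spare) rod
    return solve(n - 1, h, r) + [(h, t)] + solve(n - 1, r, t)

moves = solve(3, 1, 3)
assert len(moves) == 2 ** 3 - 1       # matches the 2^n - 1 move count
</syntaxhighlight>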
 
=== Egg dropping puzzle ===
Line 394:
: ''W''(''n'',''k'') = minimum number of trials required to identify the value of the critical floor under the worst-case scenario given that the process is in state ''s'' = (''n'',''k'').
 
Then it can be shown that<ref name="sniedovich_03">{{Cite journal |doi=10.1287/ited.4.1.48 |title=OR/MS Games: 4. The Joy of Egg-Dropping in Braunschweig and Hong Kong |year=2003 |last1=Sniedovich |first1=Moshe |journal=INFORMS Transactions on Education |volume=4 |issue=1 |pages=48–64}}</ref>
 
: ''W''(''n'',''k'') = 1 + min{max(''W''(''n'' &minus; 1, ''x'' &minus; 1), ''W''(''n'',''k'' &minus; ''x'')): ''x'' = 1, 2, ..., ''k'' }
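
This functional equation can be computed directly with memoization. The following is a minimal sketch; the base cases assumed here are the standard ones (zero floors need no trials, and with one egg the floors must be tested one by one from the bottom), and the function name is illustrative:

<syntaxhighlight lang="python">
from functools import lru_cache

@lru_cache(maxsize=None)
def W(n, k):
    """Minimum worst-case number of trials in state s = (n, k)."""
    if k == 0:
        return 0          # no floors to test
    if n == 1:
        return k          # one egg: test floors 1..k from the bottom up
    # Dropping from floor x either breaks the egg (state (n-1, x-1))
    # or not (state (n, k-x)); guard against the worst case of the two,
    # then choose the best floor x.
    return 1 + min(max(W(n - 1, x - 1), W(n, k - x)) for x in range(1, k + 1))

print(W(2, 100))          # the classic 2-egg, 100-floor instance: prints 14
</syntaxhighlight>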
Line 540:
}}
 
The word ''dynamic'' was chosen by Bellman to capture the time-varying aspect of the problems, and because it sounded impressive.<ref name="Eddy">{{cite journal |last=Eddy |first=S. R. |authorlink=Sean Eddy |title=What is Dynamic Programming? |journal=Nature Biotechnology |volume=22 |issue=7 |pages=909–910 |year=2004 |doi=10.1038/nbt0704-909 |pmid=15229554 |s2cid=5352062 }}</ref> The word ''programming'' referred to the use of the method to find an optimal ''program'', in the sense of a military schedule for training or logistics. This usage is the same as that in the phrases ''[[linear programming]]'' and ''mathematical programming'', a synonym for [[mathematical optimization]].<ref>{{cite book |last1=Nocedal |first1=J. |last2=Wright |first2=S. J. |title=Numerical Optimization |url=https://archive.org/details/numericaloptimiz00noce_639 |url-access=limited |page=[https://archive.org/details/numericaloptimiz00noce_639/page/n21 9] |publisher=Springer |year=2006 }}</ref>
 
The above explanation of the origin of the term may not be entirely accurate. As Russell and Norvig write in their book, referring to the above story: "This cannot be strictly true, because his first paper using the term (Bellman, 1952) appeared before Wilson became Secretary of Defense in 1953."<ref>{{cite book |last1=Russell |first1=S. |last2=Norvig |first2=P. |title=Artificial Intelligence: A Modern Approach |edition=3rd |publisher=Prentice Hall |year=2009 |isbn=978-0-13-207148-2 }}</ref> There is also a comment in a speech by [http://a2c2.org/awards/richard-e-bellman-control-heritage-award/2004-00-00t000000/harold-j-kushner Harold J. Kushner], in which he remembers Bellman: "On the other hand, when I asked him the same question, he replied that he was trying to upstage Dantzig's linear programming by adding dynamic. Perhaps both motivations were true."
 
== Algorithms that use dynamic programming ==