Differential dynamic programming: Difference between revisions

Cydebot (talk | contribs)
m Robot - Removing category Articles created via the Article Wizard per CFD at Wikipedia:Categories for discussion/Log/2019 January 6.
Citation bot (talk | contribs)
m Alter: isbn, template type, first. Add: hdl. Removed URL that duplicated unique identifier. | You can use this bot yourself. Report bugs here. | User-activated.
Line 8:
| year = 1966
| doi = 10.1080/00207176608921369
}}</ref> and subsequently analysed in Jacobson and Mayne's eponymous book.<ref>{{cite book|last1=Jacobson|first1=David H.|last2=Mayne|first2=David Q.|title=Differential dynamic programming|year=1970|publisher=American Elsevier Pub. Co.|location=New York|isbn=978-0-444-00070-5|url=https://books.google.com/books?id=tA-oAAAAIAAJ}}</ref> The algorithm uses locally quadratic models of the dynamics and cost functions, and displays [[Rate of convergence|quadratic convergence]]. It is closely related to Pantoja's step-wise Newton's method.<ref>{{Cite journal
| doi = 10.1080/00207178808906114
| issn = 0020-7179
Line 19:
| journal = International Journal of Control
| year = 1988
}}</ref><ref>{{Cite document
| last = Liao
| first = L. Z.
Line 26:
| publisher = Cornell University, Ithaca, NY
| year = 1992
| hdl = 1813/5474
}}</ref>
 
Line 170:
 
== Monte Carlo version ==
Sampled differential dynamic programming (SaDDP) is a Monte Carlo variant of differential dynamic programming.<ref>{{Cite web|url=https://ieeexplore.ieee.org/document/7759229|title=Sampled differential dynamic programming - IEEE Conference Publication|website=ieeexplore.ieee.org|language=en-US|access-date=2018-10-19}}</ref><ref>{{Cite web|url=https://ieeexplore.ieee.org/document/8430799|title=Regularizing Sampled Differential Dynamic Programming - IEEE Conference Publication|website=ieeexplore.ieee.org|language=en-US|access-date=2018-10-19}}</ref><ref>{{Cite journal|last=Rajamäki|first=Joose|date=2018|title=Random Search Algorithms for Optimal Control|url=http://urn.fi/URN:ISBN:978-952-60-8156-4|language=en|issn=1799-4942}}</ref> It treats the quadratic cost of differential dynamic programming as the energy of a [[Boltzmann distribution]], so that the quantities of DDP can be matched to the statistics of a [[Multivariate normal distribution|multidimensional normal distribution]]. These statistics can then be recomputed from sampled trajectories without differentiation.
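The matching between a quadratic cost and normal-distribution statistics can be illustrated with a toy one-dimensional sketch. This is not the SaDDP algorithm from the cited papers. It only shows, under simplifying assumptions, that if a cost <math>Q(u) = \tfrac{1}{2} h (u - u^*)^2</math> is treated as an energy, then the density proportional to <math>e^{-Q(u)}</math> is normal with mean <math>u^*</math> and variance <math>1/h</math>, so both can be estimated from weighted samples without derivatives. The function names, the uniform sampling range, and the values <math>h = 4</math> and <math>u^* = 1.5</math> are hypothetical choices made for the illustration.

```python
import math
import random


def quadratic_cost(u, hessian=4.0, minimizer=1.5):
    """Hypothetical 1-D quadratic cost Q(u) = 0.5 * h * (u - u*)^2."""
    return 0.5 * hessian * (u - minimizer) ** 2


def fit_gaussian_from_samples(cost, low=-5.0, high=5.0, n_samples=50_000, seed=0):
    """Estimate the mean and variance of p(u) proportional to exp(-cost(u)).

    Candidates are drawn uniformly on [low, high] (an assumed proposal) and
    weighted by their Boltzmann factor exp(-cost). Only cost evaluations are
    used; no derivative of the cost is ever computed.
    """
    rng = random.Random(seed)
    samples = [rng.uniform(low, high) for _ in range(n_samples)]
    weights = [math.exp(-cost(u)) for u in samples]
    total = sum(weights)
    # Weighted first and second moments of the sampled candidates.
    mean = sum(w * u for w, u in zip(weights, samples)) / total
    variance = sum(w * (u - mean) ** 2 for w, u in zip(weights, samples)) / total
    return mean, variance


mean, variance = fit_gaussian_from_samples(quadratic_cost)
estimated_minimizer = mean            # the Gaussian mean matches the cost minimizer
estimated_hessian = 1.0 / variance    # the inverse variance matches the curvature
print(f"minimizer ~ {estimated_minimizer:.2f}, curvature ~ {estimated_hessian:.2f}")
```

The estimates are subject to sampling noise, and the uniform proposal is only adequate here because the chosen range comfortably covers the region where the Boltzmann weights are non-negligible.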
 
== See also ==