Deep backward stochastic differential equation method

The goal is to find adapted processes <math> Y_t </math> and <math> Z_t </math> that satisfy this equation. Traditional numerical methods struggle with BSDEs due to the curse of dimensionality, which makes computations in high-dimensional spaces extremely challenging<ref name="Han2018">{{cite journal | last1=Han | first1=J. | last2=Jentzen | first2=A. | last3=E | first3=W. | title=Solving high-dimensional partial differential equations using deep learning | journal=Proceedings of the National Academy of Sciences | volume=115 | issue=34 | pages=8505-8510 | year=2018 }}</ref>.
 
===Methodology overview<ref name="Han2018" />===
====1. Semilinear parabolic PDEs====
We consider a general class of PDEs represented by
<math>
\frac{\partial u}{\partial t}(t,x) + \frac{1}{2} \operatorname{Tr}\left( \sigma \sigma^{T}(t,x) \, \operatorname{Hess}_x u(t,x) \right) + \nabla u(t,x) \cdot \mu(t,x) + f\left(t, x, u(t,x), \sigma^{T}(t,x) \nabla u(t,x)\right) = 0
</math>
with terminal condition <math> u(T, x) = g(x) </math>, for <math> t \in [0, T] </math> and <math> x \in \mathbb{R}^d </math>.

====2. Stochastic process====
Let <math> X_t </math> be the stochastic process satisfying
<math>
X_t = \xi + \int_0^t \mu(s, X_s) \, ds + \int_0^t \sigma(s, X_s) \, dW_s
</math>
====3. Backward stochastic differential equation (BSDE)====
Then the solution of the PDE satisfies the following BSDE:
 
<math>
u(t, X_t) - u(0, X_0) = - \int_0^t f\left(s, X_s, u(s, X_s), \sigma^{T}(s, X_s) \nabla u(s, X_s)\right) ds + \int_0^t \left[ \nabla u(s, X_s) \right]^{T} \sigma(s, X_s) \, dW_s
</math>
 
====4. Temporal discretization====
 
Discretize the time interval <math> [0, T] </math> into steps <math> 0 = t_0 < t_1 < \cdots < t_N = T </math>:
<math>
X_{t_{n+1}} \approx X_{t_n} + \mu(t_n, X_{t_n}) \, \Delta t_n + \sigma(t_n, X_{t_n}) \, \Delta W_n
</math>
and
<math>
u(t_{n+1}, X_{t_{n+1}}) \approx u(t_n, X_{t_n}) - f\left(t_n, X_{t_n}, u(t_n, X_{t_n}), \sigma^{T}(t_n, X_{t_n}) \nabla u(t_n, X_{t_n})\right) \Delta t_n + \left[ \nabla u(t_n, X_{t_n}) \right]^{T} \sigma(t_n, X_{t_n}) \, \Delta W_n
</math>
where <math> \Delta t_n = t_{n+1} - t_n </math> and <math> \Delta W_n = W_{t_{n+1}} - W_{t_n} </math>.
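As an illustration of this time-stepping scheme, the following is a minimal sketch of an Euler–Maruyama simulation of the forward process <math> X_t </math> in the scalar case. The function name <code>euler_maruyama</code> and the geometric-Brownian-motion coefficients are illustrative assumptions, not part of the method itself.

```python
import math
import random

def euler_maruyama(mu, sigma, x0, T, N, rng):
    # Simulate one path of dX_t = mu(t, X_t) dt + sigma(t, X_t) dW_t
    # on [0, T] with N equal steps; each increment Delta W_n ~ N(0, Delta t_n).
    dt = T / N
    t, x = 0.0, x0
    path = [x]
    for _ in range(N):
        dW = rng.gauss(0.0, math.sqrt(dt))
        x = x + mu(t, x) * dt + sigma(t, x) * dW
        t += dt
        path.append(x)
    return path

# Illustrative choice: geometric Brownian motion, drift 0.05, volatility 0.2
rng = random.Random(0)
path = euler_maruyama(lambda t, x: 0.05 * x, lambda t, x: 0.2 * x,
                      x0=1.0, T=1.0, N=100, rng=rng)
```

The returned list holds <math> X_{t_0}, \ldots, X_{t_N} </math>; in the deep BSDE method many such paths are sampled and used as training data.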
 
====5. Neural network approximation====
 
Use a multilayer feedforward neural network to approximate:
<math>
\sigma^{T}(t_n, X_{t_n}) \nabla u(t_n, X_{t_n}) \approx (\sigma^{T} \nabla u)(t_n, X_{t_n} \mid \theta_n)
</math>
for <math> n = 1, \ldots, N </math>, where <math> \theta_n </math> are parameters of the neural network approximating <math> x \mapsto \sigma^T(t, x) \nabla u(t, x) </math> at <math> t = t_n </math>.
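A deliberately small stand-in for one such sub-network is sketched below: a single hidden tanh layer mapping <math> x \in \mathbb{R}^d </math> to an approximation of <math> \sigma^T(t_n, x) \nabla u(t_n, x) </math>. The helper names <code>init_subnet</code> and <code>subnet_apply</code> are assumptions for illustration; the published method uses deeper networks with batch normalization.

```python
import math
import random

def init_subnet(d_in, d_hidden, d_out, rng):
    # Parameters theta_n of one sub-network: one hidden tanh layer.
    s = 1.0 / math.sqrt(d_in)
    W1 = [[rng.uniform(-s, s) for _ in range(d_in)] for _ in range(d_hidden)]
    b1 = [0.0] * d_hidden
    W2 = [[rng.uniform(-s, s) for _ in range(d_hidden)] for _ in range(d_out)]
    b2 = [0.0] * d_out
    return (W1, b1, W2, b2)

def subnet_apply(theta, x):
    # Evaluate the sub-network at state x: approximates sigma^T grad u at t_n.
    W1, b1, W2, b2 = theta
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(W2, b2)]

# One parameter set theta_n per discretization step (here N = 5, d = 2)
rng = random.Random(0)
thetas = [init_subnet(2, 8, 2, rng) for _ in range(5)]
z = subnet_apply(thetas[0], [0.3, -0.1])
```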
 
====6. Training the neural network====
 
Stack all sub-networks in the approximation step to form a deep neural network. Train the network using paths <math> \{X_{t_n}\}_{0 \leq n \leq N} </math> and <math> \{W_{t_n}\}_{0 \leq n \leq N} </math> as input data, minimizing the loss function:
<math>
l(\theta) = \mathbb{E}\left[ \left| g(X_{t_N}) - \hat{u}\left( \{X_{t_n}\}_{0 \leq n \leq N}, \{W_{t_n}\}_{0 \leq n \leq N} \right) \right|^2 \right]
</math>
where <math> \hat{u} </math> is the approximation of <math> u(t, X_t) </math>.
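The structure of this loss can be sketched as follows in the scalar case: roll the discretized BSDE forward along each sampled path and measure the mismatch with the terminal condition <math> g(X_{t_N}) </math>. Here <code>u0</code> (the trainable initial value) and <code>z_fn</code> (standing in for the sub-networks) are illustrative placeholders, not names from the original work.

```python
def bsde_loss(paths, dWs, u0, z_fn, f, g, dt):
    # Monte Carlo estimate of E[ |g(X_{t_N}) - u_hat_{t_N}|^2 ], where u_hat
    # follows the discretized BSDE recursion
    #   u_{n+1} = u_n - f(t_n, x_n, u_n, z_n) dt + z_n dW_n   (scalar case).
    total = 0.0
    for path, dW in zip(paths, dWs):
        u = u0
        for n in range(len(dW)):
            t, x = n * dt, path[n]
            z = z_fn(n, x)                 # sub-network output at step n
            u = u - f(t, x, u, z) * dt + z * dW[n]
        total += (g(path[-1]) - u) ** 2
    return total / len(paths)

# Toy check with f = 0 and z = 0: the loss reduces to the mean squared
# deviation of g(X_T) from u0.
paths = [[1.0, 1.2], [1.0, 0.8]]
dWs = [[0.1], [-0.1]]
loss = bsde_loss(paths, dWs, u0=1.0, z_fn=lambda n, x: 0.0,
                 f=lambda t, x, u, z: 0.0, g=lambda x: x, dt=0.5)
```

Minimizing this loss over <math> u_0 </math> and the sub-network parameters <math> \theta_n </math> (e.g. with Adam, below) yields the approximation of <math> u(0, X_0) </math>.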
 
===Neural network architecture<ref name="Han2018" />===
[[Deep learning]] encompasses a class of machine learning techniques that have transformed numerous fields by enabling the modeling and interpretation of intricate data structures. These methods are distinguished by their hierarchical architecture, comprising multiple layers of interconnected nodes, or neurons. This architecture allows deep neural networks to learn abstract representations of data autonomously, making them particularly effective in tasks such as [[image recognition]], [[natural language processing]], and [[financial modeling]]. The core of the deep BSDE method lies in designing an appropriate neural network structure (such as [[fully connected network|fully connected networks]] or [[recurrent neural networks]]) and selecting an effective optimization algorithm<ref>{{cite journal | last1=LeCun | first1=Y. | last2=Bengio | first2=Y. | last3=Hinton | first3=G. | title=Deep learning | journal=Nature | volume=521 | issue=7553 | pages=436-444 | year=2015 }}</ref>.
 
==Algorithms==
* First, we present the pseudocode for the ADAM algorithm as follows:
===Adam<ref name="Adam2014">{{cite arXiv |first1=Diederik |last1=Kingma |first2=Jimmy |last2=Ba |eprint=1412.6980 |title=Adam: A Method for Stochastic Optimization |year=2014 |class=cs.LG }}</ref> (short for Adaptive Moment Estimation) algorithm===
'''Function:''' ADAM(<math>\alpha</math>, <math>\beta_1</math>, <math>\beta_2</math>, <math>\epsilon</math>, <math>\mathcal{G}(\theta)</math>, <math>\theta_0</math>) '''is'''
: <math>m_0 \leftarrow 0</math>, <math>v_0 \leftarrow 0</math>, <math>t \leftarrow 0</math>
: '''while''' <math>\theta_t</math> has not converged '''do'''
:: <math>t \leftarrow t + 1</math>
:: <math>g_t \leftarrow \nabla_\theta \mathcal{G}(\theta_{t-1})</math>
:: <math>m_t \leftarrow \beta_1 m_{t-1} + (1 - \beta_1) g_t</math>
:: <math>v_t \leftarrow \beta_2 v_{t-1} + (1 - \beta_2) g_t^2</math>
:: <math>\hat{m}_t \leftarrow m_t / (1 - \beta_1^t)</math>
:: <math>\hat{v}_t \leftarrow v_t / (1 - \beta_2^t)</math>
:: <math>\theta_t \leftarrow \theta_{t-1} - \alpha \, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)</math>
: '''end while'''
: '''return''' <math>\theta_t</math>
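The Adam update rules can be sketched for a single scalar parameter as follows; the function name <code>adam</code>, the quadratic test objective, and the hyperparameter values are illustrative assumptions.

```python
import math

def adam(grad, theta0, alpha=0.05, beta1=0.9, beta2=0.999,
         eps=1e-8, steps=5000):
    # Adam on one scalar parameter: exponentially decaying first- and
    # second-moment estimates of the gradient, with bias correction.
    theta, m, v = theta0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g          # first moment
        v = beta2 * v + (1 - beta2) * g * g      # second moment
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        theta -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Illustrative objective: minimize (theta - 3)^2, whose gradient is 2(theta - 3)
theta = adam(lambda th: 2.0 * (th - 3.0), theta0=0.0)
```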
* With the ADAM algorithm described above, we now present the pseudocode for training a multilayer feedforward neural network by backpropagation:
 
===Backpropagation algorithm<ref name="DLhistory">{{cite arXiv |eprint=2212.11279 |class=cs.NE |first=Juergen |last=Schmidhuber |author-link=Juergen Schmidhuber |title=Annotated History of Modern AI and Deep Learning |date=2022}}</ref> for multilayer feedforward neural networks===
'''Function:''' BackPropagation(''set'' <math>D=\left\{(\mathbf{x}_k,\mathbf{y}_k)\right\}_{k=1}^{m}</math>) '''is'''
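A compact executable sketch of backpropagation on a training set <math> D = \{(\mathbf{x}_k, \mathbf{y}_k)\}_{k=1}^{m} </math> is given below for a one-hidden-layer sigmoid network, the smallest case that still exhibits the forward pass, backward pass, and gradient update. The names <code>backprop_train</code> and <code>predict</code>, the AND-function dataset, and the hyperparameters are illustrative assumptions.

```python
import math
import random

def backprop_train(D, d_hidden=4, lr=0.5, epochs=2000, seed=0):
    # Train a 1-hidden-layer sigmoid network with stochastic gradient
    # descent on the squared error 0.5 * (o - y)^2, via backpropagation.
    rng = random.Random(seed)
    d_in = len(D[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_hidden)]
    b1 = [0.0] * d_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(d_hidden)]
    b2 = 0.0

    def sig(s):
        return 1.0 / (1.0 + math.exp(-s))

    for _ in range(epochs):
        for x, y in D:
            # forward pass
            h = [sig(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(W1, b1)]
            o = sig(sum(w * hi for w, hi in zip(W2, h)) + b2)
            # backward pass: error signals at output and hidden layer
            go = (o - y) * o * (1 - o)
            gh = [go * W2[j] * h[j] * (1 - h[j]) for j in range(d_hidden)]
            # gradient-descent update of all weights and biases
            for j in range(d_hidden):
                W2[j] -= lr * go * h[j]
                for i in range(d_in):
                    W1[j][i] -= lr * gh[j] * x[i]
                b1[j] -= lr * gh[j]
            b2 -= lr * go

    def predict(x):
        h = [sig(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return sig(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return predict

# Illustrative dataset: the logical AND function
D = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
predict = backprop_train(D)
```

In the deep BSDE method the same gradient computation is applied to the stacked sub-networks, with the terminal-mismatch loss above in place of the per-sample squared error.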