Submitting using AfC-submit-wizard |
Juliusberner (talk | contribs) Make notation consistent, add details, and fix typos. |
'''Neural operators''' are a class of [[Deep learning|deep learning]] architectures designed to learn maps between function spaces.
The primary application of neural operators is in learning surrogate maps for the solution operators of [[Partial differential equation|partial differential equations]] (PDEs),<ref name="NO journal" /> which are critical tools in modeling the natural environment.<ref name="Evans">{{cite book |author-link=Lawrence C. Evans |first=L. C. |last=Evans |title=Partial Differential Equations |publisher=American Mathematical Society |location=Providence |year=1998 |isbn=0-8218-0772-2 }}</ref> Standard PDE solvers can be time-consuming and computationally intensive, especially for complex systems. Neural operators have demonstrated improved performance in solving PDEs compared to existing machine learning methodologies while being significantly faster than numerical solvers.<ref name="FNO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Fourier neural operator for parametric partial differential equations |journal=arXiv preprint arXiv:2010.08895 |date=2020 |url=https://arxiv.org/pdf/2010.08895.pdf}}</ref> The operator learning paradigm allows learning maps between function spaces. It differs from parallel ideas of learning maps from finite-dimensional spaces to function spaces,<ref name="meshfreeflownet">{{cite journal | vauthors=((Esmaeilzadeh, S., Azizzadenesheli, K., Kashinath, K., Mustafa, M., Tchelepi, H. A., Marcus, P., Prabhat, M., Anandkumar, A., others)) | title=Meshfreeflownet: A physics-constrained deep continuous space-time super-resolution framework | pages=1--15 | publisher=IEEE | date=19 October 2020}}</ref><ref name="deeponet">{{cite journal | vauthors=((Lu, L., Jin, P., Pang, G., Zhang, Z., Karniadakis, G. E.)) | title=Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators | volume=3 | issue=3 | pages=218--229 | publisher=Nature Publishing Group UK London | date=19 October 2021}}</ref> and subsumes these settings when limited to a fixed input resolution.
== Definition and formulation ==
Architecturally, neural operators are similar to feed-forward neural networks in the sense that they are composed of alternating [[Linear map|linear maps]] and non-linearities. Since neural operators act on and output functions, they are instead formulated as a sequence of alternating linear [[Integral operators|integral operators]] on function spaces and point-wise non-linearities.<ref name="NO journal" /> Because this architecture is analogous to that of finite-dimensional neural networks, similar [[Universal approximation theorem|universal approximation theorems]] have been proven for neural operators. In particular, it has been shown that neural operators can approximate any continuous operator on a [[Compact space|compact]] set.<ref name="NO journal">{{cite journal |last1=Kovachki |first1=Nikola |last2=Li |first2=Zongyi |last3=Liu |first3=Burigede |last4=Azizzadenesheli |first4=Kamyar |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Learning maps between function spaces |journal=Journal of Machine Learning Research |volume=24 |pages=1–97 |url=https://www.jmlr.org/papers/volume24/21-1524/21-1524.pdf}}</ref>
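Neural operators seek to approximate some operator <math>\mathcal{G} : \mathcal{A} \to \mathcal{U}</math> between function spaces <math>\mathcal{A}</math> and <math>\mathcal{U}</math> by a parametric map <math>\mathcal{G}_\theta : \mathcal{A} \to \mathcal{U}</math>. With <math>T</math> layers, each combining a pointwise linear operator <math>W_t</math>, a kernel integral operator <math>\mathcal{K}_t</math>, and a bias function <math>b_t</math> with a pointwise non-linearity <math>\sigma</math>, such a map can be written as
<math>\mathcal{G}_\theta := \mathcal{Q} \circ \sigma(W_T + \mathcal{K}_T + b_T) \circ \cdots \circ \sigma(W_1 + \mathcal{K}_1 + b_1) \circ \mathcal{P},</math>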
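where <math>\mathcal{P}</math> and <math>\mathcal{Q}</math> are the lifting and projection operators, respectively: <math>\mathcal{P}</math> lifts the codomain of the input function to a higher-dimensional space, and <math>\mathcal{Q}</math> projects the codomain of the intermediate function to the output dimension. These operators act pointwise on functions and are typically parametrized as multilayer perceptrons. Given an intermediate function <math>v_t</math> on a domain <math>D</math>, the kernel integral operator <math>\mathcal{K}_\phi</math> is defined as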
<math>(\mathcal{K}_\phi v_t)(x) := \int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy, </math>
where the kernel <math>\kappa_\phi</math> is a learnable implicit neural network, parametrized by <math>\phi</math>.
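In practice, one is often given the input function to the neural operator at a specific resolution. For instance, when <math>v_t</math> is known only at <math>n</math> points <math>y_1, \dots, y_n</math>, the integral can be approximated by a quadrature sum: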
<math>\int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)\,dy \approx \sum_{j=1}^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j}, </math>
where <math>\Delta_{y_j}</math> is the sub-area volume or quadrature weight associated with the point <math>y_j</math>. Thus, a simplified layer can be computed as
<math>v_{t+1}(x) \approx \sigma\left(\sum_{j=1}^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j} + W_t(v_t(x)) + b_t(x)\right).</math>
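The above approximation, along with parametrizing <math>\kappa_\phi</math> as an implicit neural network, results in the graph neural operator (GNO).<ref name="Graph NO" />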
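The discretized layer above can be illustrated with a minimal NumPy sketch. It is not taken from any reference implementation: the function name <code>kernel_layer</code>, the Gaussian stand-in for the learnable kernel <math>\kappa_\phi</math>, and the choice to apply the pointwise term <math>W_t</math> to <math>v_t</math> at the evaluation point are all assumptions made for illustration.

```python
import numpy as np

def kernel_layer(v, y, weights, kappa, W, b, sigma=np.tanh):
    """One discretized layer, evaluated at the sample points themselves.

    v:       (n, d) array of values v_t(y_j)
    y:       (n,)   array of sample points y_j
    weights: (n,)   quadrature weights Delta_{y_j}
    kappa:   callable (x, y_j, v(x), v(y_j)) -> (d, d) matrix (hypothetical stand-in)
    W:       (d, d) pointwise linear map (assumed to act on v_t at x)
    b:       (n, d) bias function sampled at the points
    """
    n, d = v.shape
    integral = np.zeros((n, d))
    for i in range(n):              # evaluation point x = y_i
        for j in range(n):          # quadrature sum over y_j
            integral[i] += weights[j] * (kappa(y[i], y[j], v[i], v[j]) @ v[j])
    return sigma(integral + v @ W.T + b)

# Illustrative usage with a fixed Gaussian kernel instead of a trained network.
n, d = 16, 2
y = np.linspace(0.0, 1.0, n)
weights = np.full(n, 1.0 / n)       # uniform Riemann-sum weights
v = np.stack([np.sin(2 * np.pi * y), np.cos(2 * np.pi * y)], axis=1)
gaussian = lambda x, yj, vx, vyj: np.exp(-(x - yj) ** 2) * np.eye(d)
v_next = kernel_layer(v, y, weights, gaussian, W=np.eye(d), b=np.zeros((n, d)))
```

The double loop costs <math>O(n^2)</math> kernel evaluations per layer, which is one motivation for restricting the sum to neighbouring points or for the Fourier-space parametrization.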
There have been various parameterizations of neural operators for different applications.<ref name="FNO" /><ref name="Graph NO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Graph kernel network for partial differential equations |journal=arXiv preprint arXiv:2003.03485 |date=2020 |url=https://arxiv.org/pdf/2003.03485.pdf}}</ref> These typically differ in their parameterization of <math>\kappa</math>. The most popular instantiation is the Fourier neural operator (FNO). FNO takes <math>\kappa_\phi(x, y, v_t(x), v_t(y)) := \kappa_\phi(x - y)</math> and, by applying the [[convolution theorem]], arrives at the following parameterization of the kernel integral operator:
<math>(\mathcal{K}_\phi v_t)(x) = \mathcal{F}^{-1}\left(R_\phi \cdot (\mathcal{F} v_t)\right)(x),</math>
where <math>\mathcal{F}</math> represents the Fourier transform and <math>R_\phi</math> represents the Fourier transform of some periodic function <math>\kappa_\phi</math>.
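On a uniform periodic grid, a Fourier-space parametrization of the kernel integral operator can be sketched with the fast Fourier transform. The following is a minimal illustration under stated assumptions, not the reference FNO implementation: the function name <code>fourier_layer</code>, the truncation to the lowest <code>modes</code> frequencies, and the complex weight array <code>R</code> are choices made for this example.

```python
import numpy as np

def fourier_layer(v, R, modes):
    """Apply learnable weights to the lowest Fourier modes of v.

    v:     (n, d) real values of v_t on a uniform periodic grid
    R:     (modes, d, d) complex weights, one matrix per retained frequency
    modes: number of low frequencies kept (must not exceed n // 2 + 1)
    """
    v_hat = np.fft.rfft(v, axis=0)                    # (n // 2 + 1, d), complex
    out_hat = np.zeros_like(v_hat)                    # higher modes are set to zero
    out_hat[:modes] = np.einsum("kij,kj->ki", R, v_hat[:modes])
    return np.fft.irfft(out_hat, n=v.shape[0], axis=0)

# Illustrative usage: the same weights R act on two different resolutions.
rng = np.random.default_rng(0)
d, modes = 2, 4
R = rng.normal(size=(modes, d, d)) + 1j * rng.normal(size=(modes, d, d))
for n in (32, 64):
    x = np.arange(n) / n
    v = np.stack([np.sin(2 * np.pi * x), np.cos(4 * np.pi * x)], axis=1)
    out = fourier_layer(v, R, modes)                  # shape (n, d)
```

Because the weights are indexed by frequency rather than by grid point, the same <code>R</code> can be applied to inputs sampled at different resolutions, as the loop above does.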
== Training ==
Training neural operators is similar to the training process for a traditional neural network. Neural operators are typically trained in some [[Lp norm]] or [[Sobolev norm]]. In particular, for a dataset <math>\{(a_i, u_i)\}_{i=1}^N</math> of size <math>N</math>, neural operators are trained by minimizing (a discretization of)
<math>\mathcal{L}_\mathcal{U}(\{(a_i, u_i)\}_{i=1}^N) := \sum_{i=1}^N \|u_i - \mathcal{G}_\theta (a_i) \|_\mathcal{U}^2,</math>
where <math>\|\cdot\|_\mathcal{U}</math> is a norm on the output function space <math>\mathcal{U}</math>.
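As a hedged illustration of what "a discretization of" this objective can look like, the sketch below approximates the squared <math>L^2</math> norm by a quadrature sum over grid points. The names <code>empirical_l2_loss</code> and <code>model</code> are hypothetical, and the choice of the <math>L^2</math> norm for <math>\|\cdot\|_\mathcal{U}</math> is an assumption of this example.

```python
import numpy as np

def empirical_l2_loss(model, a_batch, u_batch, weights):
    """Sum over samples of the quadrature-approximated squared L2 error.

    model:   callable mapping sampled input (n,) to sampled output (n,)
    a_batch: (N, n) input functions a_i sampled on n grid points
    u_batch: (N, n) target functions u_i sampled on the same grid
    weights: (n,)   quadrature weights of the grid points
    """
    total = 0.0
    for a, u in zip(a_batch, u_batch):
        residual = u - model(a)
        total += np.sum(weights * residual ** 2)   # approximates ||u_i - G(a_i)||^2
    return total

# Illustrative usage with a placeholder "model" that doubles its input.
n = 50
grid_weights = np.full(n, 1.0 / n)
a_batch = np.random.default_rng(0).normal(size=(3, n))
u_batch = 2.0 * a_batch
loss = empirical_l2_loss(lambda a: 2.0 * a, a_batch, u_batch, grid_weights)
```

In an actual training loop this quantity would be differentiated with respect to the model parameters by an automatic-differentiation framework; plain NumPy is used here only to make the discretization explicit.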
Another training paradigm is associated with physics-informed machine learning. In particular, [[Physics-informed neural networks|physics-informed neural networks]] (PINNs) use complete physics laws to fit neural networks to solutions of PDEs.
== References ==