Neural operators

'''Neural operators''' are a class of [[Deep learning|deep learning]] architectures designed to learn maps between infinite-dimensional [[Function space|function spaces]]. Neural operators represent an extension of traditional [[Artificial neural network|artificial neural networks]], marking a departure from the typical focus on learning mappings between finite-dimensional Euclidean spaces or finite sets. Neural operators directly learn [[Operator (mathematics)|operators]] in function spaces; they can receive input functions, and the output function can be evaluated at any discretization.<ref name="NO journal">{{cite journal |last1=Kovachki |first1=Nikola |last2=Li |first2=Zongyi |last3=Liu |first3=Burigede |last4=Azizzadenesheli |first4=Kamyar |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Learning maps between function spaces |journal=Journal of Machine Learning Research |volume=24 |pages=1–97 |url=https://www.jmlr.org/papers/volume24/21-1524/21-1524.pdf}}</ref>
 
The primary application of neural operators is in learning surrogate maps for the solution operators of [[Partial differential equation|partial differential equations]] (PDEs),<ref name="NO journal" /> which are critical tools in modeling the natural environment.<ref name="Evans">{{cite book |author-link=Lawrence C. Evans |first=L. C. |last=Evans |title=Partial Differential Equations |publisher=American Mathematical Society |___location=Providence |year=1998 |isbn=0-8218-0772-2}}</ref> Standard PDE solvers can be time-consuming and computationally intensive, especially for complex systems. Neural operators have demonstrated improved performance in solving PDEs compared to existing machine learning methodologies, while being significantly faster than numerical solvers.<ref name="FNO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Fourier neural operator for parametric partial differential equations |journal=arXiv preprint arXiv:2010.08895 |date=2020 |url=https://arxiv.org/pdf/2010.08895.pdf}}</ref>
 
== Operator learning ==
where the kernel <math>\kappa_\phi</math> is a learnable implicit neural network, parametrized by <math>\phi</math>.
 
In practice, one is often given the input function to the neural operator at a specific resolution for each data point. For instance, for the <math>i</math>'th sample, consider the setting where one is given the evaluation of <math>v_t</math> at <math>n</math> points <math>\{y_j\}_j^n</math>. Borrowing from [[Nyström method|Nyström integral approximation methods]] such as [[Riemann sum|Riemann sum integration]] and [[Gaussian quadrature|Gaussian quadrature]], the above integral operation can be computed as follows:
 
<math>\int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy\approx \sum_j^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j}, </math>
where <math>\Delta_{y_j}</math> is the sub-area volume or quadrature weight associated with the point <math>y_j</math> (the discretization introduces an approximation error). Thus, a simplified layer can be computed as follows:
 
<math>v_{t+1}(x) \approx \sigma\left(\sum_j^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j} + W_t(v_t(x)) + b_t(x)\right).</math>
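
The discretized layer above can be implemented directly. The following is a minimal PyTorch sketch (class, argument, and helper names are illustrative and not the API of the neural operator library): it evaluates a small MLP kernel <math>\kappa_\phi</math> on all pairs of grid points, performs the quadrature sum with uniform weights, and adds the pointwise linear and bias terms, with GELU standing in for the activation <math>\sigma</math>.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
import torch.nn.functional as F

class KernelIntegralLayer(nn.Module):
    """Illustrative neural operator layer: a Riemann-sum approximation of the
    kernel integral, plus a pointwise linear term W_t and a bias b_t(x).
    Names and shapes are hypothetical, not those of any particular library."""

    def __init__(self, dim, channels, hidden=64):
        super().__init__()
        # kappa_phi: a small MLP mapping (x, y, v_t(x), v_t(y)) to a
        # channels-by-channels matrix (flattened).
        self.kappa = nn.Sequential(
            nn.Linear(2 * dim + 2 * channels, hidden),
            nn.GELU(),
            nn.Linear(hidden, channels * channels),
        )
        self.W = nn.Linear(channels, channels)  # pointwise linear term W_t
        self.b = nn.Linear(dim, channels)       # bias function b_t(x)

    def forward(self, y, v):
        # y: (n, dim) quadrature points, v: (n, channels) values of v_t at y.
        n, c = v.shape
        dy = 1.0 / n  # uniform quadrature weights Delta_{y_j} on a unit ___domain
        xi = y.unsqueeze(1).expand(n, n, -1)  # query point x_i of each pair
        yj = y.unsqueeze(0).expand(n, n, -1)  # integration point y_j of each pair
        vi = v.unsqueeze(1).expand(n, n, -1)  # v_t(x_i)
        vj = v.unsqueeze(0).expand(n, n, -1)  # v_t(y_j)
        k = self.kappa(torch.cat([xi, yj, vi, vj], dim=-1)).view(n, n, c, c)
        # Riemann-sum approximation of the kernel integral at every x_i.
        integral = torch.einsum("ijkl,jl->ik", k, v) * dy
        return F.gelu(integral + self.W(v) + self.b(y))
</syntaxhighlight>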
 
Many variants of the architecture have been developed in prior work, and some of them are supported in the [https://neuraloperator.github.io/neuraloperator/dev/index.html neural operator library]. The above approximation, along with the deployment of an implicit neural network for <math>\kappa_\phi</math>, results in the graph neural operator (GNO).<ref name="Graph NO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Graph kernel network for partial differential equations |journal=arXiv preprint arXiv:2003.03485 |date=2020 |url=https://arxiv.org/pdf/2003.03485.pdf}}</ref>
 
There have been various parameterizations of neural operators for different applications.<ref name="FNO" /><ref name="Graph NO" /> These typically differ in their parameterization of <math>\kappa</math>. The most popular instantiation is the Fourier neural operator (FNO). FNO takes <math>\kappa_\phi(x, y, a(x), a(y)) = \kappa_\phi(x-y)</math> and, by applying the [[Convolution theorem|convolution theorem]], arrives at the following parameterization of the kernel integration:
 
<math>(\mathcal{K}_\phi(a)v_t)(x) = \mathcal{F}^{-1} (R_\phi \cdot (\mathcal{F}v_t))(x), </math>
 
where <math>\mathcal{F}</math> represents the Fourier transform and <math>R_\phi</math> represents the Fourier transform of some periodic function <math>\kappa</math>. That is, FNO parameterizes the kernel integration directly in Fourier space, using a small number of Fourier modes. When the grid on which the input function is presented is uniform, the Fourier transform can be approximated by a summation, yielding a [[Discrete Fourier transform|discrete Fourier transform (DFT)]] truncated at some specified maximum frequency. The DFT can be computed with a [[Fast Fourier transform|fast Fourier transform (FFT)]] implementation, making the FNO architecture among the fastest and most sample-efficient neural operator architectures.
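
For intuition, the core Fourier layer can be sketched in a few lines of PyTorch. This is a simplified illustration under the assumptions of a one-dimensional uniform grid and hypothetical names (it is not the implementation in the neural operator library): the input is transformed with a real FFT, only the lowest <code>modes</code> frequencies are kept and multiplied by a learnable weight tensor <math>R_\phi</math>, and the result is transformed back to physical space.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Illustrative 1-D Fourier layer: FFT, truncate to the lowest `modes`
    frequencies, multiply by learned complex weights R_phi, inverse FFT."""

    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes  # must not exceed n // 2 + 1 for a grid of size n
        scale = 1.0 / (channels * channels)
        self.R = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat)
        )

    def forward(self, v):
        # v: (batch, channels, n) values of v_t on a uniform grid.
        v_hat = torch.fft.rfft(v)                       # DFT along the grid axis
        out_hat = torch.zeros_like(v_hat)
        # Channel-mixing multiplication by R_phi on the retained modes only.
        out_hat[..., : self.modes] = torch.einsum(
            "bim,iom->bom", v_hat[..., : self.modes], self.R
        )
        return torch.fft.irfft(out_hat, n=v.shape[-1])  # back to physical space
</syntaxhighlight>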
 
== Training ==
in some norm <math>\|\cdot \|_\mathcal{U}.</math> Neural operators can be trained directly using [[Backpropagation|backpropagation]] and [[Gradient descent|gradient descent]]-based methods.
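
As a concrete illustration, a data-driven training loop might look as follows in PyTorch. This is a minimal sketch with hypothetical names: <code>model</code> is any neural operator mapping a discretized input function to a discretized output function, <code>loader</code> yields input–output pairs on a grid, and the relative <math>L^2</math> error is used as the norm <math>\|\cdot\|_\mathcal{U}</math>.

<syntaxhighlight lang="python">
import torch

def train(model, loader, epochs=10, lr=1e-3):
    """Minimal data-driven operator training loop (illustrative only)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for a, u in loader:   # a: input function, u: target solution (on a grid)
            pred = model(a)
            # Relative L2 error as the training loss.
            loss = torch.norm(pred - u) / torch.norm(u)
            opt.zero_grad()
            loss.backward()   # backpropagation
            opt.step()        # gradient-descent-based update
    return model
</syntaxhighlight>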
 
Another training paradigm is associated with physics-informed machine learning. In particular, [[Physics-informed neural networks|physics-informed neural networks]] (PINNs) use complete physics laws to fit neural networks to solutions of PDEs. The general extension of this paradigm to operator learning is broadly called physics-informed neural operators (PINO),<ref name="PINO">{{cite journal |last1=Li |first1=Zongyi |last2=Zheng |first2=Hongkai |last3=Kovachki |first3=Nikola |last4=Jin |first4=David |last5=Chen |first5=Haoxuan |last6=Liu |first6=Burigede |last7=Azizzadenesheli |first7=Kamyar |last8=Anandkumar |first8=Anima |title=Physics-Informed Neural Operator for Learning Partial Differential Equations |journal=arXiv preprint arXiv:2111.03794 |date=2021 |url=https://arxiv.org/abs/2111.03794}}</ref> where loss functions can include full physics equations or partial physical laws. As opposed to standard PINNs, the PINO paradigm incorporates a data loss in addition to the physics loss <math>\mathcal{L}_{PDE}(a, \mathcal{G}_\theta (a))</math>. The physics loss <math>\mathcal{L}_{PDE}(a, \mathcal{G}_\theta (a))</math> quantifies how much the predicted solution <math>\mathcal{G}_\theta (a)</math> violates the PDE for the input <math>a</math>.
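
As a sketch of this idea, the PINO objective can be written as a weighted sum of the data loss and a discretized physics residual. The example below is hypothetical and assumes, for concreteness, the one-dimensional Poisson equation <math>u'' = f</math> on a uniform grid, with the physics loss approximated by a finite-difference residual of the predicted solution.

<syntaxhighlight lang="python">
import torch

def pde_residual(u, f, dx):
    """Finite-difference residual of u'' = f at interior grid points."""
    u_xx = (u[..., 2:] - 2 * u[..., 1:-1] + u[..., :-2]) / dx ** 2
    return u_xx - f[..., 1:-1]

def pino_loss(model, a, u_true, f, dx, w_data=1.0, w_pde=1.0):
    """Illustrative PINO-style objective: data loss plus physics loss."""
    u_pred = model(a)
    data_loss = torch.norm(u_pred - u_true) / torch.norm(u_true)
    physics_loss = torch.mean(pde_residual(u_pred, f, dx) ** 2)
    return w_data * data_loss + w_pde * physics_loss
</syntaxhighlight>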
 
== References ==