Neural operators

'''Neural operators''' are a class of [[Deep learning|deep learning]] architectures designed to learn maps between infinite-dimensional [[Function space|function spaces]]. Neural operators represent an extension of traditional [[Artificial neural network|artificial neural networks]], marking a departure from the typical focus on learning mappings between finite-dimensional Euclidean spaces or finite sets. Neural operators directly learn [[Operator (mathematics)|operators]] in function spaces; they can receive input functions, and the output function can be evaluated at any discretization.<ref name="NO journal">{{cite journal |last1=Kovachki |first1=Nikola |last2=Li |first2=Zongyi |last3=Liu |first3=Burigede |last4=Azizzadenesheli |first4=Kamyar |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Learning maps between function spaces |journal=Journal of Machine Learning Research |volume=24 |pages=1–97 |url=https://www.jmlr.org/papers/volume24/21-1524/21-1524.pdf}}</ref>
 
The primary application of neural operators is in learning surrogate maps for the solution operators of [[Partial differential equation|partial differential equations]] (PDEs),<ref name="NO journal" /> which are critical tools in modeling the natural environment.<ref name="Evans">{{cite book |author-link=Lawrence C. Evans |first=L. C. |last=Evans |title=Partial Differential Equations |publisher=American Mathematical Society |___location=Providence |year=1998 |isbn=0-8218-0772-2}}</ref> Standard PDE solvers can be time-consuming and computationally intensive, especially for complex systems. Neural operators have demonstrated improved performance in solving PDEs compared to existing machine learning methodologies, while being significantly faster than numerical solvers.<ref name="FNO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Fourier neural operator for parametric partial differential equations |journal=arXiv preprint arXiv:2010.08895 |date=2020 |url=https://arxiv.org/pdf/2010.08895.pdf}}</ref>
 
== Operator learning ==
where the kernel <math>\kappa_\phi</math> is a learnable implicit neural network, parametrized by <math>\phi</math>.
 
In practice, one is often given the input function to the neural operator at a specific resolution for each data point. For instance, for the <math>i</math>'th sample, consider the setting where one is given the evaluation of <math>v_t</math> at <math>n</math> points <math>\{y_j\}_j^n</math>. Borrowing from [[Nyström method|Nyström integral approximation methods]] such as [[Riemann sum|Riemann sum integration]] and [[Gaussian quadrature|Gaussian quadrature]], the above integral operation can be computed as follows:
 
<math>\int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy\approx \sum_j^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j}, </math>
where <math>\Delta_{y_j}</math> is the sub-area volume or quadrature weight associated with the point <math>y_j</math> (the discretization introduces an approximation error). Thus, a simplified layer can be computed as follows:
 
<math>v_{t+1}(x) \approx \sigma\left(\sum_j^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j} + W_t(v_t(x)) + b_t(x)\right).</math>
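
The discretized layer above can be implemented directly. The following is a minimal PyTorch sketch (class, argument, and helper names are illustrative and not the API of the neural operator library): it evaluates a small MLP kernel <math>\kappa_\phi</math> on all pairs of grid points, performs the quadrature sum with uniform weights, and adds the pointwise linear and bias terms, with GELU standing in for the activation <math>\sigma</math>.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
import torch.nn.functional as F

class KernelIntegralLayer(nn.Module):
    """Illustrative neural operator layer: a Riemann-sum approximation of the
    kernel integral, plus a pointwise linear term W_t and a bias b_t(x).
    Names and shapes are hypothetical, not those of any particular library."""

    def __init__(self, dim, channels, hidden=64):
        super().__init__()
        # kappa_phi: a small MLP mapping (x, y, v_t(x), v_t(y)) to a
        # channels-by-channels matrix (flattened).
        self.kappa = nn.Sequential(
            nn.Linear(2 * dim + 2 * channels, hidden),
            nn.GELU(),
            nn.Linear(hidden, channels * channels),
        )
        self.W = nn.Linear(channels, channels)  # pointwise linear term W_t
        self.b = nn.Linear(dim, channels)       # bias function b_t(x)

    def forward(self, y, v):
        # y: (n, dim) quadrature points, v: (n, channels) values of v_t at y.
        n, c = v.shape
        dy = 1.0 / n  # uniform quadrature weights Delta_{y_j} on a unit ___domain
        xi = y.unsqueeze(1).expand(n, n, -1)  # query point x_i of each pair
        yj = y.unsqueeze(0).expand(n, n, -1)  # integration point y_j of each pair
        vi = v.unsqueeze(1).expand(n, n, -1)  # v_t(x_i)
        vj = v.unsqueeze(0).expand(n, n, -1)  # v_t(y_j)
        k = self.kappa(torch.cat([xi, yj, vi, vj], dim=-1)).view(n, n, c, c)
        # Riemann-sum approximation of the kernel integral at every x_i.
        integral = torch.einsum("ijkl,jl->ik", k, v) * dy
        return F.gelu(integral + self.W(v) + self.b(y))
</syntaxhighlight>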
 
Many variants of the architecture have been developed in prior work, and some of them are supported in the [https://neuraloperator.github.io/neuraloperator/dev/index.html neural operator library]. The above approximation, along with the deployment of an implicit neural network for <math>\kappa_\phi</math>, results in the graph neural operator (GNO).<ref name="Graph NO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Graph kernel network for partial differential equations |journal=arXiv preprint arXiv:2003.03485 |date=2020 |url=https://arxiv.org/pdf/2003.03485.pdf}}</ref>
 
There have been various parameterizations of neural operators for different applications.<ref name="FNO" /><ref name="Graph NO" /> These typically differ in their parameterization of <math>\kappa</math>. The most popular instantiation is the Fourier neural operator (FNO). FNO takes <math>\kappa_\phi(x, y, a(x), a(y)) = \kappa_\phi(x-y)</math> and, by applying the [[Convolution theorem|convolution theorem]], arrives at the following parameterization of the kernel integration:
 
<math>(\mathcal{K}_\phi(a)v_t)(x) = \mathcal{F}^{-1} (R_\phi \cdot (\mathcal{F}v_t))(x), </math>
 
where <math>\mathcal{F}</math> represents the Fourier transform and <math>R_\phi</math> represents the Fourier transform of some periodic function <math>\kappa</math>. That is, FNO parameterizes the kernel integration directly in Fourier space, using a small number of Fourier modes. When the grid on which the input function is presented is uniform, the Fourier transform can be approximated by a summation, yielding a [[Discrete Fourier transform|discrete Fourier transform (DFT)]] truncated at some specified maximum frequency. The DFT can be computed with a [[Fast Fourier transform|fast Fourier transform (FFT)]] implementation, making the FNO architecture among the fastest and most sample-efficient neural operator architectures.
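
For intuition, the core Fourier layer can be sketched in a few lines of PyTorch. This is a simplified illustration under the assumptions of a one-dimensional uniform grid and hypothetical names (it is not the implementation in the neural operator library): the input is transformed with a real FFT, only the lowest <code>modes</code> frequencies are kept and multiplied by a learnable weight tensor <math>R_\phi</math>, and the result is transformed back to physical space.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Illustrative 1-D Fourier layer: FFT, truncate to the lowest `modes`
    frequencies, multiply by learned complex weights R_phi, inverse FFT."""

    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes  # must not exceed n // 2 + 1 for a grid of size n
        scale = 1.0 / (channels * channels)
        self.R = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat)
        )

    def forward(self, v):
        # v: (batch, channels, n) values of v_t on a uniform grid.
        v_hat = torch.fft.rfft(v)                       # DFT along the grid axis
        out_hat = torch.zeros_like(v_hat)
        # Channel-mixing multiplication by R_phi on the retained modes only.
        out_hat[..., : self.modes] = torch.einsum(
            "bim,iom->bom", v_hat[..., : self.modes], self.R
        )
        return torch.fft.irfft(out_hat, n=v.shape[-1])  # back to physical space
</syntaxhighlight>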
 
== Training ==
in some norm <math>\|\cdot \|_\mathcal{U}.</math> Neural operators can be trained directly using [[Backpropagation|backpropagation]] and [[Gradient descent|gradient descent]]-based methods.
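
As a concrete illustration, a data-driven training loop might look as follows in PyTorch. This is a minimal sketch with hypothetical names: <code>model</code> is any neural operator mapping a discretized input function to a discretized output function, <code>loader</code> yields input–output pairs on a grid, and the relative <math>L^2</math> error is used as the norm <math>\|\cdot\|_\mathcal{U}</math>.

<syntaxhighlight lang="python">
import torch

def train(model, loader, epochs=10, lr=1e-3):
    """Minimal data-driven operator training loop (illustrative only)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for a, u in loader:   # a: input function, u: target solution (on a grid)
            pred = model(a)
            # Relative L2 error as the training loss.
            loss = torch.norm(pred - u) / torch.norm(u)
            opt.zero_grad()
            loss.backward()   # backpropagation
            opt.step()        # gradient-descent-based update
    return model
</syntaxhighlight>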
 
Another training paradigm is associated with physics-informed machine learning. In particular, [[Physics-informed neural networks|physics-informed neural networks]] (PINNs) use complete physics laws to fit neural networks to solutions of PDEs. The general extension of this paradigm to operator learning is broadly called physics-informed neural operators (PINO),<ref name="PINO">{{cite journal |last1=Li |first1=Zongyi |last2=Zheng |first2=Hongkai |last3=Kovachki |first3=Nikola |last4=Jin |first4=David |last5=Chen |first5=Haoxuan |last6=Liu |first6=Burigede |last7=Azizzadenesheli |first7=Kamyar |last8=Anandkumar |first8=Anima |title=Physics-Informed Neural Operator for Learning Partial Differential Equations |journal=arXiv preprint arXiv:2111.03794 |date=2021 |url=https://arxiv.org/abs/2111.03794}}</ref> where loss functions can include full physics equations or partial physical laws. As opposed to standard PINNs, the PINO paradigm incorporates a data loss in addition to the physics loss <math>\mathcal{L}_{PDE}(a, \mathcal{G}_\theta (a))</math>. The physics loss <math>\mathcal{L}_{PDE}(a, \mathcal{G}_\theta (a))</math> quantifies how much the predicted solution <math>\mathcal{G}_\theta (a)</math> violates the PDE for the input <math>a</math>.
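
As a sketch of this idea, the PINO objective can be written as a weighted sum of the data loss and a discretized physics residual. The example below is hypothetical and assumes, for concreteness, the one-dimensional Poisson equation <math>u'' = f</math> on a uniform grid, with the physics loss approximated by a finite-difference residual of the predicted solution.

<syntaxhighlight lang="python">
import torch

def pde_residual(u, f, dx):
    """Finite-difference residual of u'' = f at interior grid points."""
    u_xx = (u[..., 2:] - 2 * u[..., 1:-1] + u[..., :-2]) / dx ** 2
    return u_xx - f[..., 1:-1]

def pino_loss(model, a, u_true, f, dx, w_data=1.0, w_pde=1.0):
    """Illustrative PINO-style objective: data loss plus physics loss."""
    u_pred = model(a)
    data_loss = torch.norm(u_pred - u_true) / torch.norm(u_true)
    physics_loss = torch.mean(pde_residual(u_pred, f, dx) ** 2)
    return w_data * data_loss + w_pde * physics_loss
</syntaxhighlight>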
 
== References ==