<math>\mathcal{G}_\theta := \mathcal{Q} \circ \sigma(W_T + \mathcal{K}_T + b_T) \circ \cdots \circ \sigma(W_1 + \mathcal{K}_1 + b_1) \circ \mathcal{P},</math>
where <math>\mathcal{P}, \mathcal{Q}</math> are the lifting operator (lifting the codomain of the input function to a higher-dimensional space) and the projection operator (projecting the codomain of the intermediate function to the output dimension), respectively. These operators act pointwise on functions and are typically parametrized as a [[Multilayer perceptron|multilayer perceptron]]. <math>\sigma</math> is a pointwise nonlinearity, such as a [[Rectifier (neural networks)|rectified linear unit]]. Each <math>\mathcal{K}_t</math> is a kernel integral operator of the form
<math>(\mathcal{K}_\phi(v_t))(x) = \int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy,</math>
where the kernel <math>\kappa_\phi</math> is a learnable function parametrized by <math>\phi</math>.
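As a concrete illustration, the compositional structure <math>\mathcal{Q} \circ \sigma(\cdot) \circ \cdots \circ \sigma(\cdot) \circ \mathcal{P}</math> can be sketched in NumPy. This is a minimal sketch under simplifying assumptions: the input function is represented by its values at <math>n</math> sample points, the kernel integral term is omitted so each layer reduces to the pointwise map <math>\sigma(Wv + b)</math>, and all names, widths, and weights are illustrative rather than taken from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

def lift(v, W_p):
    """Pointwise lifting P: maps the input codomain to a higher-dimensional space."""
    return v @ W_p  # acts independently at every sample point

def project(v, W_q):
    """Pointwise projection Q: maps the hidden codomain to the output dimension."""
    return v @ W_q

def layer(v, W, b):
    """One simplified layer sigma(W v + b); the kernel integral term is omitted here."""
    return np.maximum(0.0, v @ W + b)  # ReLU as the pointwise nonlinearity sigma

def neural_operator(v, params):
    """Compose Q after T simplified layers after P, acting on sampled function values."""
    W_p, layers, W_q = params
    h = lift(v, W_p)
    for W, b in layers:
        h = layer(h, W, b)
    return project(h, W_q)

# Toy shapes: n = 16 sample points, scalar input, hidden width 8, scalar output.
n, d_in, width, d_out = 16, 1, 8, 1
params = (
    rng.normal(size=(d_in, width)),
    [(rng.normal(size=(width, width)), np.zeros(width)) for _ in range(2)],
    rng.normal(size=(width, d_out)),
)
v = rng.normal(size=(n, d_in))   # input function sampled at n points
out = neural_operator(v, params) # output function sampled at the same n points
```

Because every operator here acts pointwise, the same parameters can be applied to the input sampled at any number of points, which is what makes the construction resolution-independent.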
In practice, the input function to the neural operator is often given at a fixed resolution for each data point. For the <math>i</math>th data point, consider the setting where evaluations of <math>v_t</math> are available at <math>n</math> points <math>\{y_j\}_{j=1}^n</math>. Borrowing from [[Nyström method|Nyström integral approximation methods]] such as [[Riemann sum|Riemann sum integration]] and [[Gaussian quadrature|Gaussian quadrature]], the above integral operation can be computed as follows:
<math>\int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy\approx \sum_{j=1}^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j}, </math>
where <math>\Delta_{y_j}</math> is the sub-area volume or quadrature weight associated with the point <math>y_j</math>. A simplified layer can therefore be computed as follows:
<math>v_{t+1}(x) \approx \sigma\left(\sum_{j=1}^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j} + W_t v_t(x) + b_t(x)\right)</math>
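A minimal numerical sketch of this quadrature approximation, assuming a uniform grid on <math>[0,1]</math> (so the Riemann weights are just the cell width) and a fixed Gaussian kernel standing in for the learnable <math>\kappa_\phi</math>:

```python
import numpy as np

def kernel_integral(kappa, ys, v_vals, weights):
    """Approximate (K v)(x) = ∫ kappa(x, y, v(x), v(y)) v(y) dy at each x = y_i
    by the weighted sum over sample points y_j."""
    n = len(ys)
    out = np.empty(n)
    for i in range(n):
        s = 0.0
        for j in range(n):
            s += kappa(ys[i], ys[j], v_vals[i], v_vals[j]) * v_vals[j] * weights[j]
        out[i] = s
    return out

# Uniform grid on [0, 1]; each quadrature weight is the sub-area volume 1/n.
n = 100
ys = np.linspace(0.0, 1.0, n)
w = np.full(n, 1.0 / n)
v = np.sin(2 * np.pi * ys)  # input function sampled on the grid

# Fixed Gaussian kernel for illustration; a trained kappa_phi would be a neural network.
gaussian = lambda x, y, vx, vy: np.exp(-(x - y) ** 2)
u = kernel_integral(gaussian, ys, v, w)
```

The double loop mirrors the sum in the equation above for clarity; in practice this <math>O(n^2)</math> computation is vectorized or, as in the graph neural operator, restricted to local neighborhoods.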
Many variants of the architecture have been developed in prior work, and some of them are supported in the [https://neuraloperator.github.io/neuraloperator/dev/index.html neural operator library]. The above approximation, along with the use of an implicit neural network for <math>\kappa_\phi</math>, results in the graph neural operator (GNO)<ref name="Graph NO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Graph kernel network for partial differential equations |journal=arXiv preprint arXiv:2003.03485 |date=2020 |url=https://arxiv.org/pdf/2003.03485.pdf}}</ref>.
The various parameterizations of neural operators typically differ in their choice of <math>\kappa</math>.
There have been various parameterizations of neural operators for different applications<ref name="FNO" /><ref name="Graph NO" />. The most popular instantiation is the Fourier neural operator (FNO). FNO takes <math>\kappa_\phi(x, y, a(x), a(y)) = \kappa_\phi(x-y)</math> and, by applying the [[Convolution theorem|convolution theorem]], arrives at the following parameterization of the kernel integration:
<math>(\mathcal{K}_\phi(a)v_t)(x) = \mathcal{F}^{-1} (R_\phi \cdot (\mathcal{F}v_t))(x), </math>
where <math>\mathcal{F}</math> represents the Fourier transform and <math>R_\phi</math> represents the Fourier transform of some periodic function <math>\kappa</math>. That is, FNO parameterizes the kernel integration directly in Fourier space.
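The Fourier-space parameterization can be sketched in one dimension using the real FFT, with a random complex weight per retained mode standing in for the learnable <math>R_\phi</math>; the number of retained low-frequency modes is an illustrative choice:

```python
import numpy as np

def spectral_conv_1d(v, R, modes):
    """Compute F^{-1}(R · F(v)): multiply the lowest `modes` Fourier
    coefficients of v by R and zero out the remaining high frequencies."""
    v_hat = np.fft.rfft(v)                  # F(v)
    out_hat = np.zeros_like(v_hat)
    out_hat[:modes] = R * v_hat[:modes]     # pointwise multiply = convolution
    return np.fft.irfft(out_hat, n=len(v))  # F^{-1}

rng = np.random.default_rng(0)
n, modes = 64, 8
R = rng.normal(size=modes) + 1j * rng.normal(size=modes)  # stand-in for R_phi
x = np.linspace(0.0, 1.0, n, endpoint=False)              # periodic grid
v = np.sin(2 * np.pi * x)
u = spectral_conv_1d(v, R, modes)
```

Because the multiplication happens on Fourier coefficients rather than grid values, the same weights <math>R_\phi</math> can be applied to inputs sampled at any resolution, which is the source of FNO's discretization invariance.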
== Training ==
in some norm <math>\|\cdot \|_\mathcal{U}.</math> Neural operators can be trained directly using [[Backpropagation|backpropagation]] and [[Gradient descent|gradient descent]]-based methods.
When modeling natural phenomena, physics equations, mostly in the form of PDEs, describe the physical world around us.<ref name="Evans">{{cite book |author-link=Lawrence C. Evans |first=L. C. |last=Evans |title=Partial Differential Equations |publisher=American Mathematical Society |___location=Providence |year=1998 |isbn=0-8218-0772-2}}</ref> Based on this idea, [[Physics-informed neural networks|physics-informed neural networks]] utilize known physics laws to fit neural networks to solutions of PDEs. The general extension to operator learning is the physics-informed neural operator (PINO) paradigm,<ref name="PINO">{{cite journal |last1=Li |first1=Zongyi |last2=Zheng |first2=Hongkai |last3=Kovachki |first3=Nikola |last4=Jin |first4=David |last5=Chen |first5=Haoxuan |last6=Liu |first6=Burigede |last7=Azizzadenesheli |first7=Kamyar |last8=Anandkumar |first8=Anima |title=Physics-Informed Neural Operator for Learning Partial Differential Equations |journal=arXiv preprint arXiv:2111.03794 |date=2021 |url=https://arxiv.org/abs/2111.03794}}</ref> where the supervision can also be channeled through physics equations, enabling learning from partially available physics. PINO is mainly a supervised learning setting suitable for cases where only partial data or partial physics is available. In short, in PINO, in addition to the data loss mentioned above, a physics loss <math>\mathcal{L}_{PDE}(a, \mathcal{G}_\theta (a))</math> is used for further training. The physics loss quantifies how much the predicted solution <math>\mathcal{G}_\theta (a)</math> violates the governing PDE for the input <math>a</math>.
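As an illustrative sketch of this combined objective, the following toy example uses the one-dimensional equation <math>u'(x) = f(x)</math> as the "physics" and a finite-difference residual as the PDE loss; all function names and the weighting are hypothetical choices for this sketch, not taken from the cited work:

```python
import numpy as np

def data_loss(u_pred, u_true):
    """Mean squared error against available solution data."""
    return float(np.mean((u_pred - u_true) ** 2))

def pde_loss(u_pred, f_vals, dx):
    """Residual of the toy PDE u'(x) = f(x), using finite differences."""
    du = np.gradient(u_pred, dx)
    return float(np.mean((du - f_vals) ** 2))

def pino_loss(u_pred, u_true, f_vals, dx, lam=1.0):
    """Combined objective: data term plus weighted physics term."""
    return data_loss(u_pred, u_true) + lam * pde_loss(u_pred, f_vals, dx)

n = 200
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
f = np.cos(x)       # given forcing term: u' = cos(x)
u_true = np.sin(x)  # exact solution with u(0) = 0
loss_exact = pino_loss(u_true, u_true, f, dx)  # near zero for the true solution
```

When solution data is unavailable for some inputs, the data term can be dropped there and the physics term alone still provides a training signal, which is what "learning through partially available physics" refers to.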
== References ==