{{Short description|Machine learning framework}}
'''Neural operators''' are a class of [[deep learning]] architectures designed to learn maps between infinite-dimensional [[function space]]s. They extend traditional [[artificial neural network]]s, which learn mappings between finite-dimensional [[Euclidean space]]s.
The primary application of neural operators is in learning surrogate maps for the solution operators of [[partial differential equation]]s (PDEs).
== Operator learning ==
Understanding and mapping relationships between function spaces has many applications in engineering and the sciences. In particular, [[Abstract differential equation|one can cast the problem]] of solving partial differential equations as identifying a map between function spaces, such as from an initial condition to a time-evolved state. For other PDEs, the map takes a coefficient function as input and outputs a corresponding solution function. Operator learning is a [[machine learning]] paradigm for learning such maps between function spaces from data.
Using traditional machine learning methods, addressing this problem would involve discretizing the infinite-dimensional input and output function spaces into finite-dimensional grids and applying standard learning models, such as neural networks. This approach reduces operator learning to finite-dimensional function learning and has limitations, such as the inability to generalize to discretizations other than the grid used in training.
The primary properties of neural operators that differentiate them from traditional neural networks are discretization invariance and discretization convergence.<ref name="NO journal" />
== Definition and formulation ==
Architecturally, neural operators are similar to feed-forward neural networks in the sense that they are composed of alternating linear maps and non-linearities. Because their inputs and outputs are functions, the linear maps are taken to be [[integral operator]]s acting on function spaces, composed with point-wise non-linearities.
Neural operators seek to approximate some operator <math>\mathcal{G} : \mathcal{A} \to \mathcal{U}</math> between function spaces <math>\mathcal{A}</math> and <math>\mathcal{U}</math> by building a parametric map <math>\mathcal{G}_\phi : \mathcal{A} \to \mathcal{U}</math>. Such parametric maps <math>\mathcal{G}_\phi</math> can generally be defined in the form
<math>\mathcal{G}_\phi := \mathcal{Q} \circ \sigma(W_T + \mathcal{K}_T + b_T) \circ \cdots \circ \sigma(W_1 + \mathcal{K}_1 + b_1) \circ \mathcal{P},</math>
where <math>\mathcal{P}, \mathcal{Q}</math> are the lifting (lifting the codomain of the input function to a higher-dimensional space) and projection (projecting the codomain of the intermediate function to the output codomain) operators, respectively. These operators act point-wise on the values of a function and are typically parametrized as [[multilayer perceptron]]s, <math>\sigma</math> is a point-wise nonlinearity such as a [[Rectifier (neural networks)|ReLU]], and each layer <math>t = 1, \dots, T</math> consists of a local linear operator <math>W_t</math>, a bias function <math>b_t</math>, and a kernel integral operator <math>\mathcal{K}_t</math>. A common choice is the kernel integral operator parametrized by <math>\phi</math>,
<math>(\mathcal{K}_\phi v_t)(x) := \int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy, </math>
where the kernel <math>\kappa_\phi</math> is a learnable implicit neural network, parametrized by <math>\phi</math>.
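The following is a minimal illustrative sketch of this layer structure using [[NumPy]]; the names (<code>lift</code>, <code>project</code>, <code>layer</code>, <code>neural_operator</code>) and the generic <code>kernel_integral</code> callable are hypothetical placeholders rather than part of any published implementation.

<syntaxhighlight lang="python">
import numpy as np

# Sketch of G_phi = Q ∘ sigma(W_T + K_T + b_T) ∘ ... ∘ sigma(W_1 + K_1 + b_1) ∘ P.
# The input function is represented by its values v at n grid points, v.shape == (n, d_in).

def lift(v, P):
    """Point-wise lifting: maps each value v(x) in R^{d_in} to R^{d_hidden}."""
    return v @ P                                   # P has shape (d_in, d_hidden)

def project(v, Q):
    """Point-wise projection back to the output codomain R^{d_out}."""
    return v @ Q                                   # Q has shape (d_hidden, d_out)

def layer(v, W, b, kernel_integral):
    """One layer: sigma(W v + K v + b) with a ReLU point-wise nonlinearity."""
    return np.maximum(0.0, v @ W + kernel_integral(v) + b)

def neural_operator(v, P, Q, layers, kernel_integral):
    """Compose lifting, the hidden layers, and projection.

    In practice each layer would carry its own kernel integral operator; a single
    shared callable is used here only to keep the sketch short."""
    v = lift(v, P)
    for W, b in layers:                            # one (W_t, b_t) pair per layer
        v = layer(v, W, b, kernel_integral)
    return project(v, Q)
</syntaxhighlight>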
In practice, one is often given the input function to the neural operator at a specific resolution. For instance, consider the setting where one is given the evaluation of <math>v_t</math> at <math>n</math> points <math>\{y_j\}_{j=1}^n</math>. Borrowing from [[Nyström method|Nyström integral approximation methods]] such as [[Riemann sum|Riemann sum integration]] and [[Gaussian quadrature]], one can approximate the above integral operator as
<math>\int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy\approx \sum_{j=1}^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j}, </math>
where <math>\Delta_{y_j}</math> is the quadrature weight associated with the point <math>y_j</math>. The update of one layer can then be computed as
<math>v_{t+1}(x) \approx \sigma\left(\sum_{j=1}^n \kappa_\phi(x, y_j, v_t(x), v_t(y_j))v_t(y_j)\Delta_{y_j} + W_t(v_t(x)) + b_t(x)\right).</math>
The above approximation, along with parametrizing <math>\kappa_\phi</math> as an implicit neural network, results in the graph neural operator (GNO).<ref name="Graph NO">{{cite arXiv |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Graph kernel network for partial differential equations |date=2020 |class=cs.LG |eprint=2003.03485 }}</ref>
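As an illustration, the Riemann-sum approximation of the kernel integral operator can be written in a few lines of [[NumPy]]; the names below (<code>kernel_integral</code>, <code>kappa</code>, <code>weights</code>) are hypothetical, and an actual graph neural operator would additionally restrict the sum to a neighborhood graph and parametrize <code>kappa</code> as a neural network.

<syntaxhighlight lang="python">
import numpy as np

def kernel_integral(v, y, kappa, weights):
    """
    Riemann-sum approximation of (K v)(x) = ∫ kappa(x, y, v(x), v(y)) v(y) dy,
    evaluated at the same n grid points at which v is given.

    v:       (n, d) array of function values v(y_j)
    y:       (n, p) array of grid points y_j in the domain D
    kappa:   callable returning a (d, d) matrix for a pair of points and values
    weights: (n,) array of quadrature weights Delta_{y_j}
    """
    n, d = v.shape
    out = np.zeros((n, d))
    for i in range(n):            # output point x = y_i
        for j in range(n):        # quadrature point y_j
            out[i] += weights[j] * (kappa(y[i], y[j], v[i], v[j]) @ v[j])
    return out
</syntaxhighlight>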
There have been various parameterizations of neural operators for different applications.<ref name="FNO" /><ref name="Graph NO" /> The most popular of these is the Fourier neural operator (FNO), in which the kernel integral operator is a convolution, computed in Fourier space as
<math>(\mathcal{K}_\phi v_t)(x) = \mathcal{F}^{-1} (R_\phi \cdot (\mathcal{F}v_t))(x), </math>
where <math>\mathcal{F}</math> represents the Fourier transform and <math>R_\phi</math> represents the Fourier transform of some periodic function <math>\kappa_\phi</math>. That is, FNO parameterizes the kernel integration directly in Fourier space, using a prescribed number of Fourier modes. When the grid at which the input function is presented is uniform, the Fourier transform can be approximated using the [[Discrete Fourier transform|discrete Fourier transform (DFT)]] with frequencies below some specified threshold. The discrete Fourier transform can be computed using a [[Fast Fourier transform|fast Fourier transform (FFT)]] implementation.
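A minimal sketch of this Fourier-space kernel integration on a uniform one-dimensional grid, using the real FFT in [[NumPy]], is shown below; the names <code>fourier_layer</code>, <code>R</code>, and <code>n_modes</code> are illustrative and do not correspond to a specific library.

<syntaxhighlight lang="python">
import numpy as np

def fourier_layer(v, R, n_modes):
    """
    Fourier-space kernel integration (K v)(x) = F^{-1}(R · F v)(x) on a uniform 1-D grid,
    keeping only the lowest n_modes frequencies.

    v: (n, d) real array of function values on the grid
    R: (n_modes, d, d) complex array of learnable Fourier-space weights
    """
    v_hat = np.fft.rfft(v, axis=0)                 # (n//2 + 1, d) Fourier coefficients
    out_hat = np.zeros_like(v_hat)
    # Multiply the retained low-frequency modes by the learned weights R_k.
    out_hat[:n_modes] = np.einsum("kij,kj->ki", R, v_hat[:n_modes])
    return np.fft.irfft(out_hat, n=v.shape[0], axis=0)   # back to physical space
</syntaxhighlight>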
== Training ==
Training neural operators is similar to the training process for a traditional neural network. Neural operators are typically trained by minimizing a loss defined in terms of an [[Lp norm]] or a [[Sobolev norm]]. In particular, for a dataset <math>\{(a_i, u_i)\}_{i=1}^N</math> of size <math>N</math>, neural operators minimize (a discretization of)
<math>\mathcal{L}_\mathcal{U}(\{(a_i, u_i)\}_{i=1}^N) := \sum_{i=1}^N \|u_i - \mathcal{G}_\phi (a_i) \|_\mathcal{U}^2</math>,
where <math>\|\cdot \|_\mathcal{U}</math> is a norm on the output function space <math>\mathcal{U}</math>. Neural operators can be trained directly using [[backpropagation]] and [[gradient descent]]-based methods.
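For illustration, a discretized version of this loss and a plain gradient-descent update can be sketched as follows; <code>model</code> and <code>params</code> are hypothetical placeholders for a neural operator and its parameters, and in practice the gradients would be obtained through automatic differentiation.

<syntaxhighlight lang="python">
import numpy as np

def empirical_loss(model, params, a_batch, u_batch):
    """Discretized squared L2 loss  sum_i ||u_i - G_phi(a_i)||^2  over a batch."""
    total = 0.0
    for a_i, u_i in zip(a_batch, u_batch):
        residual = model(a_i, params) - u_i      # point-wise error on the output grid
        total += np.mean(np.sum(residual ** 2, axis=-1))
    return total

def gradient_descent_step(params, grads, lr=1e-3):
    """One plain gradient-descent update of all parameter arrays."""
    return [p - lr * g for p, g in zip(params, grads)]
</syntaxhighlight>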
Another training paradigm is associated with physics-informed machine learning. In particular, [[physics-informed neural networks]] (PINNs) use the governing physical equations in the loss function to learn the solution of a PDE. Extensions of this paradigm to operator learning are broadly called physics-informed neural operators (PINO), where loss functions can include the full governing equations or partial physical laws.
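Building on the <code>empirical_loss</code> sketch above, a physics-informed loss can be illustrated as a weighted sum of the data loss and a penalty on the residual of the governing equations; <code>pde_residual</code> is a hypothetical callable evaluating that residual on the grid and is not part of any specific library.

<syntaxhighlight lang="python">
import numpy as np

def physics_informed_loss(model, params, a_batch, u_batch, pde_residual, weight=1.0):
    """Data loss plus a penalty on the PDE residual of the predicted solutions."""
    data_term = empirical_loss(model, params, a_batch, u_batch)   # data loss from the sketch above
    physics_term = 0.0
    for a_i in a_batch:
        u_pred = model(a_i, params)
        physics_term += np.mean(pde_residual(a_i, u_pred) ** 2)   # residual of the governing equations
    return data_term + weight * physics_term
</syntaxhighlight>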
== See also ==
* [[Neural network (machine learning)|Neural network]]
* [[Physics-informed neural networks]]
* [[Neural field]]
== References ==
{{reflist}}
== External links ==
*[https://github.com/neuraloperator/neuraloperator/ neuralop] – Python library of various neural operator architectures
[[Category:Deep learning]]