'''Neural operators''' are a class of [[Deep learning|deep learning]] architectures designed to learn maps between infinite-dimensional [[Function space|function spaces]]. Neural operators represent an extension of traditional [[Artificial neural network|artificial neural networks]], which learn maps between finite-dimensional [[Euclidean space|Euclidean spaces]].
The primary application of neural operators is in learning surrogate maps for the solution operators of [[Partial differential equation|partial differential equations]] (PDEs).<ref name="NO journal" /> Standard PDE solvers can be time-consuming and computationally intensive, especially for complex systems. Neural operators have demonstrated improved performance in solving PDEs compared to existing machine learning methodologies, while being significantly faster than numerical solvers.<ref name="FNO">{{cite journal |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Fourier neural operator for parametric partial differential equations |journal=arXiv preprint arXiv:2010.08895 |date=2020 |url=https://arxiv.org/pdf/2010.08895.pdf}}</ref>
== Operator learning ==
Understanding and mapping relationships between function spaces has many applications in engineering and the sciences. In particular, solving a PDE can be cast as the problem of learning the operator that maps the problem data (such as initial conditions, boundary conditions, or coefficient functions) to the corresponding solution function.
Using traditional machine learning methods, addressing this problem would involve discretizing the infinite-dimensional input and output function spaces onto finite-dimensional grids and applying standard learning models, such as neural networks, to the resulting vectors. This approach reduces operator learning to finite-dimensional function learning, but it ties the learned model to the discretization used during training: the model cannot, for instance, be evaluated on grids of a different resolution, as the sketch below illustrates.
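For illustration, the following is a minimal sketch of this grid-based approach on a one-dimensional ___domain. The grid size, layer widths, and variable names are illustrative assumptions, not taken from any particular implementation.

<syntaxhighlight lang="python">
# Minimal sketch (illustrative, not a standard implementation): the input
# function a(x) and output function u(x) are each sampled on a fixed grid of
# n points, and an ordinary fully connected network maps one vector of samples
# to the other.  The layer sizes are tied to n, so the trained model is bound
# to this particular discretization.
import torch

n = 64                                # number of grid points (assumed)
model = torch.nn.Sequential(          # ordinary finite-dimensional network
    torch.nn.Linear(n, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, n),
)

a = torch.rand(1, n)                  # input function sampled on the grid
u = model(a)                          # predicted output on the same grid
# Evaluating on a different resolution, e.g. an input of shape (1, 128),
# fails, which is the discretization dependence described in the text.
</syntaxhighlight>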
<math>\mathcal{G}_\theta := \mathcal{Q} \circ \sigma(W_T + \mathcal{K}_T + b_T) \circ \cdots \circ \sigma(W_1 + \mathcal{K}_1 + b_1) \circ \mathcal{P},</math>
where <math>\mathcal{P}, \mathcal{Q}</math> are the lifting (lifting the codomain of the input function to a higher-dimensional space) and projection (projecting the codomain of the intermediate function to the output codomain) operators, respectively. These operators act pointwise on functions and are typically parametrized as a [[Multilayer perceptron|multilayer perceptron]]. <math>\sigma</math> is a pointwise nonlinearity, such as a [[Rectifier (neural networks)|rectified linear unit (ReLU)]] or a [[Rectifier (neural networks)#Other_non-linear_variants|Gaussian error linear unit (GeLU)]]. Each layer <math>i=1, \dots, T</math> has a respective local operator <math>W_i</math> (usually parameterized by a pointwise neural network), a kernel integral operator <math>\mathcal{K}_i</math>, and a bias function <math>b_i</math>. Given some intermediate functional representation <math>v_t</math> with ___domain <math>D</math> in a hidden layer, a kernel integral operator <math>\mathcal{K}_\phi</math> is defined as
<math>(\mathcal{K}_\phi v_t)(x) = \int_D \kappa_\phi(x, y, v_t(x), v_t(y))\, v_t(y)\, dy, </math> where <math>\kappa_\phi</math> is a learnable kernel function parameterized by <math>\phi</math>.
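The following is a minimal, self-contained sketch of one such layer and of the overall composition <math>\mathcal{Q} \circ \sigma(W + \mathcal{K} + b) \circ \cdots \circ \mathcal{P}</math>. It assumes a one-dimensional ___domain <math>D=[0,1]</math> discretized at <math>n</math> quadrature points with uniform weights, a simplified kernel <math>\kappa_\phi</math> that depends only on <math>(x,y)</math>, and a GeLU nonlinearity; the class names <code>KernelIntegralLayer</code> and <code>NeuralOperator</code> are illustrative and not a standard API.

<syntaxhighlight lang="python">
# Minimal sketch of a neural operator on a 1-D ___domain (assumptions as above).
import torch

class KernelIntegralLayer(torch.nn.Module):
    def __init__(self, d, width=32):
        super().__init__()
        # kappa_phi: (x, y) -> d x d matrix, parameterized by a small MLP
        # (the general definition also allows dependence on v_t(x), v_t(y))
        self.kappa = torch.nn.Sequential(
            torch.nn.Linear(2, width), torch.nn.GELU(),
            torch.nn.Linear(width, d * d),
        )
        self.W = torch.nn.Linear(d, d)      # local operator W_i with bias b_i
        self.d = d

    def forward(self, v, x):
        # v: (n, d) function values at grid points, x: (n, 1) grid coordinates
        n = x.shape[0]
        xy = torch.cat([x.repeat_interleave(n, 0),       # all pairs (x_i, y_j)
                        x.repeat(n, 1)], dim=-1)
        K = self.kappa(xy).view(n, n, self.d, self.d)    # kappa_phi(x_i, y_j)
        # quadrature approximation of the integral over D with weights 1/n
        Kv = torch.einsum('ijab,jb->ia', K, v) / n
        return torch.nn.functional.gelu(self.W(v) + Kv)  # sigma(W v + K v + b)

class NeuralOperator(torch.nn.Module):
    def __init__(self, d=16, layers=3):
        super().__init__()
        self.P = torch.nn.Linear(1, d)                   # lifting operator
        self.layers = torch.nn.ModuleList(
            [KernelIntegralLayer(d) for _ in range(layers)])
        self.Q = torch.nn.Linear(d, 1)                   # projection operator

    def forward(self, a, x):
        v = self.P(a)                                    # lift input function
        for layer in self.layers:
            v = layer(v, x)
        return self.Q(v)                                 # project to output

x = torch.linspace(0, 1, 100).unsqueeze(-1)              # (n, 1) grid points
a = torch.sin(2 * torch.pi * x)                          # input function samples
u = NeuralOperator()(a, x)                               # (n, 1) output samples
</syntaxhighlight>

Because the integral is approximated by quadrature on whatever grid is supplied, the same parameters can be applied to discretizations of different resolutions, in contrast to the fixed-size network in the earlier sketch.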