Neural operators: Difference between revisions

added scientific [2] and [9] papers to second paragraph.
Definition and formulation: codimension is wrong here.
 
(5 intermediate revisions by 4 users not shown)
Line 1:
{{Short description|Machine learning framework}}
{{Orphan|date=January 2024}}
 
'''Neural operators''' are a class of [[deep learning]] architectures designed to learn maps between infinite-dimensional [[function space]]s. Neural operators represent an extension of traditional [[artificial neural network]]s, marking a departure from the typical focus on learning mappings between finite-dimensional Euclidean spaces or finite sets. Neural operators directly learn [[Operator (mathematics)|operators]] between function spaces; they can receive input functions, and the output function can be evaluated at any discretization.<ref name="NO journal">{{cite journal |last1=Kovachki |first1=Nikola |last2=Li |first2=Zongyi |last3=Liu |first3=Burigede |last4=Azizzadenesheli |first4=Kamyar |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Neural operator: Learning maps between function spaces |journal=Journal of Machine Learning Research |date=2021 |volume=24 |pages=1–97 |arxiv=2108.08481 |url=https://www.jmlr.org/papers/volume24/21-1524/21-1524.pdf}}</ref><ref name="NO Nature">{{cite journal |last1=Azizzadenesheli |first1=Kamyar |last2=Kovachki |first2=Nikola |last3=Li |first3=Zongyi |last4=Liu-Schiaffini |first4=Miguel |last5=Kossaifi |first5=Jean |last6=Anandkumar |first6=Anima |title=Neural operators for accelerating scientific simulations and design |journal=Nature Reviews Physics |date=2024 |volume=6 |pages=320–328 |arxiv=2309.15325 |url=https://www.nature.com/articles/s42254-024-00712-5}}</ref>
 
The primary application of neural operators is in learning surrogate maps for the solution operators of [[partial differential equation]]s (PDEs),<ref name="NO journal" /><ref>{{Cite journal |last=Lu |first=Lu |last2=Jin |first2=Pengzhan |last3=Pang |first3=Guofei |last4=Zhang |first4=Zhongqiang |last5=Karniadakis |first5=George Em |date=2021-03 |title=Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators |url=https://www.nature.com/articles/s42256-021-00302-5 |journal=Nature Machine Intelligence |language=en |volume=3 |issue=3 |pages=218–229 |doi=10.1038/s42256-021-00302-5 |issn=2522-5839}}</ref> which are critical tools in modeling the natural environment.<ref name="Evans">{{cite book |author-link=Lawrence C. Evans |first=L. C. |last=Evans |title=Partial Differential Equations |publisher=American Mathematical Society |location=Providence |year=1998 |isbn=0-8218-0772-2 }}</ref><ref>{{cite press release |title=How AI models are transforming weather forecasting: A showcase of data-driven systems |url=https://phys.org/news/2023-09-ai-weather-showcase-data-driven.html |work=phys.org |publisher=European Centre for Medium-Range Weather Forecasts |date=6 September 2023 }}</ref> Standard PDE solvers can be time-consuming and computationally intensive, especially for complex systems.
Neural operators have demonstrated improved performance in solving PDEs<ref>{{cite news |last1=Russ |first1=Dan |last2=Abinader |first2=Sacha |title=Microsoft and Accenture partner to tackle methane emissions with AI technology |url=https://azure.microsoft.com/en-us/blog/microsoft-and-accenture-partner-to-tackle-methane-emissions-with-ai-technology/ |work=Microsoft Azure Blog |date=23 August 2023 }}</ref><ref>{{Citation |last1=Li |first1=Zijie |title=Transformer for Partial Differential Equations' Operator Learning |date=2023-04-27 |url=http://arxiv.org/abs/2205.13671 |access-date=2025-06-23 |publisher=arXiv |doi=10.48550/arXiv.2205.13671 |id=arXiv:2205.13671 |last2=Meidani |first2=Kazem |last3=Farimani |first3=Amir Barati}}</ref> compared to existing machine learning methodologies while being significantly faster than numerical solvers.<ref name="FNO">{{cite arXiv |last1=Li |first1=Zongyi |last2=Kovachki |first2=Nikola |last3=Azizzadenesheli |first3=Kamyar |last4=Liu |first4=Burigede |last5=Bhattacharya |first5=Kaushik |last6=Stuart |first6=Andrew |last7=Anandkumar |first7=Anima |title=Fourier neural operator for parametric partial differential equations |date=2020 |class=cs.LG |eprint=2010.08895 }}</ref><ref>{{cite news |last1=Hao |first1=Karen |title=AI has cracked a key mathematical puzzle for understanding our world |url=https://www.technologyreview.com/2020/10/30/1011435/ai-fourier-neural-network-cracks-navier-stokes-and-partial-differential-equations/ |work=MIT Technology Review |date=30 October 2020 }}</ref><ref>{{cite news |last1=Ananthaswamy |first1=Anil |title=Latest Neural Nets Solve World's Hardest Equations Faster Than Ever Before |url=https://www.quantamagazine.org/latest-neural-nets-solve-worlds-hardest-equations-faster-than-ever-before-20210419/ |work=Quanta Magazine |date=19 April 2021 }}</ref> Neural operators have also been applied to various scientific and engineering disciplines such as turbulent flow modeling, computational 
mechanics, graph-structured data,<ref>{{cite journal |last1=Sharma |first1=Anuj |last2=Singh |first2=Sukhdeep |last3=Ratna |first3=S. |title=Graph Neural Network Operators: a Review |journal=Multimedia Tools and Applications |date=15 August 2023 |volume=83 |issue=8 |pages=23413–23436 |doi=10.1007/s11042-023-16440-4 }}</ref> and the geosciences.<ref>{{cite journal |last1=Wen |first1=Gege |last2=Li |first2=Zongyi |last3=Azizzadenesheli |first3=Kamyar |last4=Anandkumar |first4=Anima |last5=Benson |first5=Sally M. |title=U-FNO—An enhanced Fourier neural operator-based deep-learning model for multiphase flow |journal=Advances in Water Resources |date=May 2022 |volume=163 |pages=104180 |doi=10.1016/j.advwatres.2022.104180 |arxiv=2109.03697 |bibcode=2022AdWR..16304180W }}</ref> In particular, they have been applied to learning stress–strain fields in materials, classifying complex data such as spatial transcriptomics, predicting multiphase flow in porous media,<ref>{{cite journal |last1=Choubineh |first1=Abouzar |last2=Chen |first2=Jie |last3=Wood |first3=David A. |last4=Coenen |first4=Frans |last5=Ma |first5=Fei |title=Fourier Neural Operator for Fluid Flow in Small-Shape 2D Simulated Porous Media Dataset |journal=Algorithms |date=2023 |volume=16 |issue=1 |pages=24 |doi=10.3390/a16010024 |doi-access=free }}</ref> and simulating carbon dioxide migration.
Finally, the operator learning paradigm learns maps between function spaces. It differs from parallel approaches that learn maps from finite-dimensional spaces to function spaces,<ref name="meshfreeflownet">{{cite book |doi=10.1109/SC41405.2020.00013 |chapter=MESHFREEFLOWNET: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework |title=SC20: International Conference for High Performance Computing, Networking, Storage and Analysis |date=2020 |last1=Jiang |first1=Chiyu "Max" |last2=Esmaeilzadeh |first2=Soheil |last3=Azizzadenesheli |first3=Kamyar |last4=Kashinath |first4=Karthik |last5=Mustafa |first5=Mustafa |last6=Tchelepi |first6=Hamdi A. |last7=Marcus |first7=Philip |last8=Prabhat |first8=Mr |last9=Anandkumar |first9=Anima |pages=1–15 |isbn=978-1-7281-9998-6 |url=https://resolver.caltech.edu/CaltechAUTHORS:20200526-153937049 }}</ref><ref name="deeponet">{{cite journal |last1=Lu |first1=Lu |last2=Jin |first2=Pengzhan |last3=Pang |first3=Guofei |last4=Zhang |first4=Zhongqiang |last5=Karniadakis |first5=George Em |title=Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators |journal=Nature Machine Intelligence |date=18 March 2021 |volume=3 |issue=3 |pages=218–229 |doi=10.1038/s42256-021-00302-5 |arxiv=1910.03193 }}</ref> and subsumes these settings as special cases when restricted to a fixed input resolution.
 
== Operator learning ==
Line 20 ⟶ 19:
<math>\mathcal{G}_\phi := \mathcal{Q} \circ \sigma(W_T + \mathcal{K}_T + b_T) \circ \cdots \circ \sigma(W_1 + \mathcal{K}_1 + b_1) \circ \mathcal{P},</math>
 
where <math>\mathcal{P}, \mathcal{Q}</math> are the lifting (lifting the codomain of the input function to a higher dimensional space) and projection (projecting the codomain of the intermediate function to the output dimension) operators, respectively. These operators act pointwise on functions and are typically parametrized as [[multilayer perceptron]]s. <math>\sigma</math> is a pointwise nonlinearity, such as a [[Rectifier (neural networks)|rectified linear unit (ReLU)]], or a [[Rectifier (neural networks)#Other non-linear variants|Gaussian error linear unit (GELU)]]. Each layer <math>t=1, \dots, T</math> has a respective local operator <math>W_t</math> (usually parameterized by a pointwise neural network), a kernel integral operator <math>\mathcal{K}_t</math>, and a bias function <math>b_t</math>. Given some intermediate functional representation <math>v_t</math> with domain <math>D</math> in the <math>t</math>-th hidden layer, a kernel integral operator <math>\mathcal{K}_\phi</math> is defined as
 
<math>(\mathcal{K}_\phi v_t)(x) := \int_D \kappa_\phi(x, y, v_t(x), v_t(y))v_t(y)dy, </math>
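
The layer structure above can be illustrated with a minimal sketch, assuming a scalar-valued function on <math>D = [0, 1]</math> and a midpoint quadrature rule for the integral. The kernel <code>kappa</code>, the scalar weight <code>w</code>, and the constant bias <code>b</code> are hypothetical stand-ins for learned parameters, not taken from any particular implementation. The point of the sketch is that the same parameters can be evaluated on grids of different resolution.

```python
import math

def kappa(x, y, vx, vy):
    # Hypothetical fixed kernel standing in for a learned kappa_phi.
    # It takes (x, y, v(x), v(y)) to mirror the formula above.
    return math.exp(-(x - y) ** 2) * (1.0 + 0.1 * math.tanh(vx * vy))

def kernel_integral(v, n):
    """Approximate (K v)(x), the integral over [0, 1] of
    kappa(x, y, v(x), v(y)) * v(y) dy, with an n-point midpoint rule."""
    h = 1.0 / n
    grid = [(j + 0.5) * h for j in range(n)]
    vals = [v(y) for y in grid]

    def Kv(x):
        vx = v(x)
        return h * sum(kappa(x, y, vx, vy) * vy for y, vy in zip(grid, vals))

    return Kv

def layer(v, n, w=0.5, b=0.1):
    """One hidden layer sigma(W v + K v + b) with a scalar pointwise W,
    a constant bias function b, and ReLU as the nonlinearity sigma."""
    Kv = kernel_integral(v, n)
    return lambda x: max(0.0, w * v(x) + Kv(x) + b)

# The same (hypothetical) parameters applied at two discretizations.
v0 = lambda x: math.sin(2 * math.pi * x)
coarse = layer(v0, 32)
fine = layer(v0, 256)
gap = abs(coarse(0.3) - fine(0.3))  # shrinks as the grids are refined
```

Because the parameters live in the kernel and pointwise maps rather than in a grid-sized weight matrix, refining the quadrature grid changes only the accuracy of the integral, not the number of parameters.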
Line 50 ⟶ 49:
 
Another training paradigm is associated with physics-informed machine learning. In particular, [[physics-informed neural networks]] (PINNs) use complete physical laws to fit neural networks to solutions of PDEs. Extensions of this paradigm to operator learning are broadly called physics-informed neural operators (PINO),<ref name="PINO">{{cite arXiv |last1=Li |first1=Zongyi |last2=Zheng |first2=Hongkai |last3=Kovachki |first3=Nikola |last4=Jin |first4=David |last5=Chen |first5=Haoxuan |last6=Liu |first6=Burigede |last7=Azizzadenesheli |first7=Kamyar |last8=Anandkumar |first8=Anima |title=Physics-Informed Neural Operator for Learning Partial Differential Equations |date=2021 |class=cs.LG |eprint=2111.03794 }}</ref> where loss functions can include full physics equations or partial physical laws. As opposed to standard PINNs, the PINO paradigm incorporates a data loss (as defined above) in addition to the physics loss <math>\mathcal{L}_{PDE}(a, \mathcal{G}_\theta (a))</math>. The physics loss quantifies how much the predicted solution <math>\mathcal{G}_\theta (a)</math> violates the PDE for the input <math>a</math>.
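
As a rough illustration of combining the two terms, the following sketch assumes a toy one-dimensional problem <math>u'(x) = a(x)</math> with <math>u(0) = 0</math> on a uniform grid. The mean-squared forms of both losses, the central-difference residual, and the weighting factor <code>weight</code> are assumptions made for this example, and the "predictions" are hand-built stand-ins rather than outputs of a trained operator.

```python
import math

def data_loss(u_pred, u_true):
    # Mean squared error between predicted and reference solution values.
    return sum((p - t) ** 2 for p, t in zip(u_pred, u_true)) / len(u_pred)

def physics_loss(a_vals, u_pred, h):
    # Mean squared residual of u'(x) - a(x) = 0 at interior grid points,
    # with u' approximated by central finite differences.
    res = [(u_pred[i + 1] - u_pred[i - 1]) / (2 * h) - a_vals[i]
           for i in range(1, len(u_pred) - 1)]
    return sum(r * r for r in res) / len(res)

def pino_loss(a_vals, u_pred, u_true, h, weight=1.0):
    # Data loss plus weighted physics loss (weight is a hypothetical knob).
    return data_loss(u_pred, u_true) + weight * physics_loss(a_vals, u_pred, h)

n = 101
h = 1.0 / (n - 1)
xs = [i * h for i in range(n)]
a_vals = [math.cos(x) for x in xs]   # input function a
u_true = [math.sin(x) for x in xs]   # exact solution of u' = a, u(0) = 0

u_good = list(u_true)                                # accurate stand-in prediction
u_bad = [u + 0.2 * x for u, x in zip(u_true, xs)]    # violates u' = a by 0.2

loss_good = pino_loss(a_vals, u_good, u_true, h)
loss_bad = pino_loss(a_vals, u_bad, u_true, h)
```

In this toy setting the physics term alone already penalizes <code>u_bad</code>, since its derivative differs from <code>a</code> by a constant, which is why a physics loss can provide a training signal even where reference solutions are unavailable.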
 
== See also ==
 
* [[Neural network (machine learning)|Neural network]]
* [[Physics-informed neural networks]]
* [[Neural field]]
 
== References ==