{{Short description|Programming paradigm}}
{{Machine learning}}
'''Differentiable programming''' is a [[programming paradigm]] in which a numeric computer program can be [[Differentiation (mathematics)|differentiated]] throughout via [[automatic differentiation]].<ref name="izzo2016_dCGP">{{cite book |doi=10.1007/978-3-319-55696-3_3 |chapter=Differentiable Genetic Programming |title=Genetic Programming |series=Lecture Notes in Computer Science |date=2017 |last1=Izzo |first1=Dario |last2=Biscani |first2=Francesco |last3=Mereta |first3=Alessio |volume=10196 |pages=35–51 |arxiv=1611.04766 |isbn=978-3-319-55695-6 |s2cid=17786263 }}</ref><ref name="baydin2018automatic">{{cite journal |last1=Baydin |first1=Atilim Gunes |last2=Pearlmutter |first2=Barak A. |last3=Radul |first3=Alexey Andreyevich |last4=Siskind |first4=Jeffrey Mark |title=Automatic Differentiation in Machine Learning: a Survey |journal=Journal of Machine Learning Research |date=2018 |volume=18 |issue=153 |pages=1–43 |url=https://jmlr.org/papers/v18/17-468.html }}</ref><ref>{{cite book |last1=Wang |first1=Fei |chapter=Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming |date=2018 |chapter-url=http://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming.pdf |title=NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems |pages=10201–10212 |last2=Decker |first2=James |last3=Wu |first3=Xilun |last4=Essertel |first4=Gregory |last5=Rompf |first5=Tiark |editor-last=Bengio |editor-first=S. |editor2-last=Wallach |editor2-first=H. |editor3-last=Larochelle |editor3-first=H.
|editor4-last=Grauman |editor4-first=K |publisher=Curran Associates |ref={{harvid|NIPS'18}} }}</ref><ref name="innes">{{Cite journal|last=Innes|first=Mike|date=2018|title=On Machine Learning and Programming Languages|url=http://www.sysml.cc/doc/2018/37.pdf|journal=SysML Conference 2018|access-date=2019-07-04|archive-date=2019-07-17|archive-url=https://web.archive.org/web/20190717211700/http://www.sysml.cc/doc/2018/37.pdf|url-status=dead}}</ref><ref name="diffprog-zygote">{{cite
==Approaches==
Most differentiable programming frameworks work by constructing a graph containing the control flow and [[data structures]] in the program.<ref name="flux">{{cite
* '''Static, [[compiled]] graph'''-based approaches such as [[TensorFlow]],<ref group=note>TensorFlow 1 uses the static graph approach, whereas TensorFlow 2 uses the dynamic graph approach by default.</ref> [[Theano (software)|Theano]], and [[MXNet]]. They tend to allow for good [[compiler optimization]] and easier scaling to large systems, but their static nature limits interactivity and the types of programs that can be created easily (e.g. those involving [[loop (computing)|loops]] or [[recursion]]), and makes it harder for users to reason effectively about their programs.<ref name="flux" /> A proof-of-concept compiler toolchain called Myia uses a subset of Python as a front end and supports higher-order functions, recursion, and higher-order derivatives.<ref>{{cite book |last1=Merriënboer |first1=Bart van |last2=Breuleux |first2=Olivier |last3=Bergeron |first3=Arnaud |last4=Lamblin |first4=Pascal |chapter=Automatic differentiation in ML: where we are and where we should be going |title={{harvnb|NIPS'18}} |date=3 December 2018 |volume=31 |pages=8771–81 |chapter-url = https://papers.nips.cc/paper/2018/hash/770f8e448d07586afbf77bb59f698587-Abstract.html}}</ref><ref name="myia1">{{Cite web |last1=Breuleux |first1=O. |last2=van Merriënboer |first2=B. |date=2017 |url=https://www.sysml.cc/doc/2018/39.pdf |title=Automatic Differentiation in Myia |access-date=2019-06-24 |archive-date=2019-06-24 |archive-url=https://web.archive.org/web/20190624180156/https://www.sysml.cc/doc/2018/39.pdf |url-status=dead }}</ref><ref name="pytorchtut">{{Cite web|url=https://pytorch.org/tutorials/beginner/examples_autograd/tf_two_layer_net.html |title=TensorFlow: Static Graphs |work=Tutorials: Learning PyTorch |publisher=PyTorch.org |access-date=2019-03-04}}</ref>
* '''[[Operator overloading]], dynamic graph'''-based approaches such as [[PyTorch]], [[NumPy]]'s [[autograd]] package, and [https://darioizzo.github.io/audi/ Pyaudi]. Their dynamic and interactive nature lets most programs be written and reasoned about more easily. However, they lead to [[interpreter (computing)|interpreter]] overhead (particularly when composing many small operations), poorer scalability, and reduced benefit from compiler optimization.<ref name="myia1" /><ref name="pytorchtut" />
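The operator-overloading, dynamic-graph idea can be illustrated with a minimal sketch in plain Python. It is not the implementation of any framework named above: the class <code>Var</code> and the functions <code>backward</code> and <code>power</code> are hypothetical names chosen for this example, and only addition and multiplication of scalars are supported. Each overloaded operator records its inputs and local derivatives while the program runs, so the graph is rebuilt on every call and ordinary Python loops need no special treatment.

```python
class Var:
    """A scalar that records every operation applied to it (hypothetical helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def backward(output):
    """Reverse-mode sweep: accumulate d(output)/d(node) into node.grad."""
    order, seen = [], set()

    def visit(node):
        # Depth-first post-order: parents are appended before their children.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk children before parents so each grad is complete before it is used.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


def power(x, n):
    # Ordinary Python control flow; the graph is recorded as the loop runs.
    y = Var(1.0)
    for _ in range(n):
        y = y * x
    return y


x = Var(2.0)
y = power(x, 3)   # x**3 evaluated at x = 2
backward(y)
print(y.value, x.grad)  # 8.0 12.0
```

Because every small operation allocates a Python object and is dispatched by the interpreter, the sketch also shows where the overhead mentioned above comes from.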
[[Just-in-time compilation]] has emerged as a possible way to overcome some of the bottlenecks of interpreted languages. The C++ library [https://bluescarni.github.io/heyoka/index.html heyoka] and the Python package [https://bluescarni.github.io/heyoka.py/index.html heyoka.py] make extensive use of this technique to offer advanced differentiable programming capabilities (including at high orders). A package for the [[Julia (programming language)|Julia]] programming language{{snd}} [https://github.com/FluxML/Zygote.jl Zygote]{{snd}} works directly on Julia's [[intermediate representation]].<ref name="flux" /><ref>{{cite preprint |arxiv=1810.07951 |last1=Innes |first1=Michael |title=Don't Unroll Adjoint: Differentiating SSA-Form Programs |date=2018 }}</ref><ref name="diffprog-zygote" />
A limitation of earlier approaches is that they can only differentiate code written in a manner suited to the framework, which limits their interoperability with other programs. Newer approaches resolve this issue by constructing the graph from the language's syntax or intermediate representation (IR), allowing arbitrary code to be differentiated.<ref name="flux" /><ref name="myia1" />
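As a rough illustration of building the derivative from a language's syntax instead of from overloaded operators, the following sketch uses Python's standard <code>ast</code> module to parse an expression, apply the sum and product rules to its syntax tree, and compile the result. It is a toy under stated assumptions and not how Zygote or Myia work internally: the helper names <code>d</code> and <code>derivative</code> are hypothetical, and only constants, variables, <code>+</code>, <code>-</code> and <code>*</code> are handled.

```python
import ast


def d(node, wrt):
    """Return a syntax tree for the derivative of `node` with respect to `wrt`."""
    if isinstance(node, ast.Constant):
        return ast.Constant(value=0.0)
    if isinstance(node, ast.Name):
        return ast.Constant(value=1.0 if node.id == wrt else 0.0)
    if isinstance(node, ast.BinOp):
        left, right = node.left, node.right
        d_left, d_right = d(left, wrt), d(right, wrt)
        if isinstance(node.op, (ast.Add, ast.Sub)):
            # Sum rule: (u ± v)' = u' ± v'
            return ast.BinOp(left=d_left, op=node.op, right=d_right)
        if isinstance(node.op, ast.Mult):
            # Product rule: (u * v)' = u' * v + u * v'
            return ast.BinOp(
                left=ast.BinOp(left=d_left, op=ast.Mult(), right=right),
                op=ast.Add(),
                right=ast.BinOp(left=left, op=ast.Mult(), right=d_right),
            )
    raise NotImplementedError(ast.dump(node))


def derivative(source, wrt):
    """Compile a one-expression `source` string into a function for its derivative."""
    tree = ast.parse(source, mode="eval")
    new_tree = ast.Expression(body=d(tree.body, wrt))
    ast.fix_missing_locations(new_tree)
    code = compile(new_tree, "<derivative>", "eval")
    # Variables are supplied as keyword arguments when the result is called.
    return lambda **env: eval(code, {"__builtins__": {}}, env)


df_dx = derivative("x * x * y + 3 * x", "x")
print(df_dx(x=2.0, y=5.0))  # 23.0, i.e. 2*x*y + 3
```

The input here is unmodified expression syntax with no special number type, which is the property that lets syntax-based and IR-based tools differentiate code that was not written with a particular framework in mind.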
==Applications==
Differentiable programming has been applied in areas such as combining [[deep learning]] with [[physics engines]] in [[robotics]],<ref>{{cite
==Multidisciplinary application==
Differentiable programming has also been applied in fields beyond its traditional applications. In healthcare and life sciences, for example, it is used for deep learning in biophysics-based modelling of molecular mechanisms
==See also==