'''Differentiable programming''' is a [[programming paradigm]] in which a numeric computer program can be [[Differentiation (mathematics)|differentiated]] throughout via [[automatic differentiation]].<ref name="izzo2016_dCGP">{{cite journal|last1=Izzo|first1=Dario|last2=Biscani|first2=Francesco|last3=Mereta|first3=Alessio|title=Differentiable genetic programming|journal=European Conference on Genetic Programming (EuroGP)|series=Lecture Notes in Computer Science |year=2017|volume=18|pages=35–51|doi=10.1007/978-3-319-55696-3_3 |arxiv=1611.04766 |isbn=978-3-319-55695-6 |s2cid=17786263 |url=https://link.springer.com/chapter/10.1007/978-3-319-55696-3_3}}</ref>
<ref name="baydin2018automatic">{{cite journal|last1=Baydin|first1=Atilim Gunes|last2=Pearlmutter|first2=Barak|last3=Radul|first3=Alexey Andreyevich|last4=Siskind|first4=Jeffrey|title=Automatic differentiation in machine learning: a survey|journal=Journal of Machine Learning Research |year=2018 |volume=18|pages=1–43|url=http://jmlr.org/papers/v18/17-468.html}}</ref>
==Approaches==
Most differentiable programming frameworks work by constructing a graph containing the control flow and [[data structures]] in the program.<ref name="flux">{{cite arXiv|last1=Innes|first1=Michael|last2=Saba|first2=Elliot|last3=Fischer|first3=Keno|last4=Gandhi|first4=Dhairya|last5=Rudilosso|first5=Marco Concetto|last6=Joy|first6=Neethu Mariya|last7=Karmali|first7=Tejan|last8=Pal|first8=Avik|last9=Shah|first9=Viral|date=2018-10-31|title=Fashionable Modelling with Flux|eprint=1811.01457|class=cs.PL}}</ref> Attempts generally fall into two groups:
* '''Static, [[compiled]] graph'''-based approaches such as [[TensorFlow]],<ref group=note>TensorFlow 1 uses the static graph approach, whereas TensorFlow 2 uses the dynamic graph approach by default.</ref> [[Theano (software)|Theano]], and [[MXNet]]. They tend to allow for good [[compiler optimization]] and easier scaling to large systems, but their static nature limits interactivity and the types of programs that can be created easily (e.g. those involving [[loop (computing)|loops]] or [[recursion]]), as well as making it harder for users to reason effectively about their programs.<ref name="flux" /> A proof-of-concept compiler toolchain called Myia uses a subset of Python as a front end and supports higher-order functions, recursion, and higher-order derivatives.
* '''[[Operator overloading]], dynamic graph'''-based approaches such as [[PyTorch]] and [[AutoGrad (NumPy)|AutoGrad]]. Their dynamic and interactive nature lets most programs be written and reasoned about more easily. However, they lead to [[interpreter (computing)|interpreter]] overhead (particularly when composing many small operations), poorer scalability, and reduced benefit from compiler optimization.<ref name="myia1" /><ref name="pytorchtut" /> A package for the [[Julia (programming language)|Julia]] programming language{{snd}} [https://github.com/FluxML/Zygote.jl Zygote]{{snd}} works directly on Julia's [[intermediate representation]], allowing it to still be [[compiler optimization|optimized]] by Julia's just-in-time compiler.<ref name="flux" /><ref>{{cite arXiv|last=Innes|first=Michael|date=2018-10-18|title=Don't Unroll Adjoint: Differentiating SSA-Form Programs|eprint=1810.07951|class=cs.PL}}</ref><ref name="diffprog-zygote" />
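The operator-overloading, dynamic-graph approach can be illustrated with a minimal self-contained sketch (not the real PyTorch or AutoGrad API). Overloaded arithmetic records a computation graph as ordinary code runs, and a reverse pass then accumulates gradients through it:

```python
class Var:
    """Node in a dynamically built computation graph (reverse-mode AD)."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self, seed=1.0):
        # Reverse pass: propagate the incoming gradient to each parent,
        # scaled by the local derivative recorded during the forward pass.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(2.0)
z = x * y + x      # the graph is recorded as this line executes
z.backward()
print(x.grad)      # dz/dx = y + 1 = 3.0
print(y.grad)      # dz/dy = x = 3.0
```

Because the graph is rebuilt on every execution, Python control flow (loops, branches, recursion) works transparently; the cost is that each operation passes through the interpreter, which is the overhead the static-graph approaches avoid.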
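For contrast, the static-graph approach can be sketched in the same spirit (again a hypothetical toy, not TensorFlow's or Theano's actual API): the program is first described as a symbolic graph with no data attached, a derivative graph is derived from it by graph transformation, and only then is either graph evaluated. This build-then-run separation is what enables whole-graph compiler optimization but limits interactivity:

```python
class Const:
    def __init__(self, value): self.value = value
    def eval(self, env): return self.value
    def grad(self, wrt): return Const(0.0)

class Input:
    def __init__(self, name): self.name = name
    def eval(self, env): return env[self.name]   # values supplied at run time
    def grad(self, wrt): return Const(1.0 if self is wrt else 0.0)

class Add:
    def __init__(self, a, b): self.a, self.b = a, b
    def eval(self, env): return self.a.eval(env) + self.b.eval(env)
    def grad(self, wrt): return Add(self.a.grad(wrt), self.b.grad(wrt))

class Mul:
    def __init__(self, a, b): self.a, self.b = a, b
    def eval(self, env): return self.a.eval(env) * self.b.eval(env)
    def grad(self, wrt):
        # Product rule applied as a graph rewrite: d(ab) = a'b + ab'
        return Add(Mul(self.a.grad(wrt), self.b), Mul(self.a, self.b.grad(wrt)))

x = Input("x")
graph = Add(Mul(x, x), x)       # f(x) = x*x + x, built before any data exists
dgraph = graph.grad(x)          # derivative graph for f'(x) = 2x + 1
print(graph.eval({"x": 3.0}))   # 12.0
print(dgraph.eval({"x": 3.0}))  # 7.0
```

Because `graph` and `dgraph` are plain data structures known before execution, a framework can simplify, fuse, and compile them ahead of time, which is harder when the graph only exists while the program runs.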