Differentiable programming: Difference between revisions

Content deleted Content added
m top: Various citation & identifier cleanup, plus AWB genfixes (arxiv version pointless when published)
Citation bot (talk | contribs)
m Add: class, eprint. Removed parameters. | You can use this bot yourself. Report bugs here. | Headbomb
Line 1:
{{Programming paradigms}}
 
'''Differentiable programming''' is a [[programming paradigm]] in which the programs can be [[Differentiation (mathematics)|differentiated]] throughout, usually via [[automatic differentiation]].<ref>{{Citation|last=Wang|first=Fei|title=Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming|date=2018|url=http://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming.pdf|work=Advances in Neural Information Processing Systems 31|pages=10201–10212|editor-last=Bengio|editor-first=S.|publisher=Curran Associates, Inc.|access-date=2019-02-13|last2=Decker|first2=James|last3=Wu|first3=Xilun|last4=Essertel|first4=Gregory|last5=Rompf|first5=Tiark|editor2-last=Wallach|editor2-first=H.|editor3-last=Larochelle|editor3-first=H.|editor4-last=Grauman|editor4-first=K.}}</ref><ref name="innes">{{Cite journal|last=Innes|first=Mike|date=2018|title=On Machine Learning and Programming Languages|url=http://www.sysml.cc/doc/37.pdf|journal=SysML Conference 2018|volume=|pages=|via=}}</ref> This allows for [[Gradient method|gradient based optimization]] of parameters in the program, often via [[gradient descent]]. Differentiable programming has found use in areas such as combining [[deep learning]] with [[physics engines]] in [[robotics]], differentiable [[Ray tracing (graphics)|ray tracing]], and [[image processing]].<ref>{{cite arxiv|last=Degrave|first=Jonas|last2=Hermans|first2=Michiel|last3=Dambre|first3=Joni|last4=wyffels|first4=Francis|date=2016-11-05|title=A Differentiable Physics Engine for Deep Learning in Robotics|arxiveprint=1611.01652|class=cs.NE}}</ref><ref>{{Cite web|url=https://people.csail.mit.edu/tzumao/diffrt/|title=Differentiable Monte Carlo Ray Tracing through Edge Sampling|website=people.csail.mit.edu|access-date=2019-02-13}}</ref><ref>{{Cite web|url=https://people.csail.mit.edu/tzumao/gradient_halide/|title=Differentiable Programming for Image Processing and Deep Learning in Halide|website=people.csail.mit.edu|access-date=2019-02-13}}</ref>
 
Most differentiable programming frameworks work by constructing a graph containing the control flow and [[data structures]] in the program.<ref name="flux">{{cite arxiv|last=Innes|first=Michael|last2=Saba|first2=Elliot|last3=Fischer|first3=Keno|last4=Gandhi|first4=Dhairya|last5=Rudilosso|first5=Marco Concetto|last6=Joy|first6=Neethu Mariya|last7=Karmali|first7=Tejan|last8=Pal|first8=Avik|last9=Shah|first9=Viral|date=2018-10-31|title=Fashionable Modelling with Flux|arxiveprint=1811.01457|class=cs.PL}}</ref> Earlier attempts generally featured a tradeoff between a "dynamic" [[Interpreted language|interpreted]] graph — chosen by frameworks such as [[PyTorch]] and [[AutoGrad (NumPy)|AutoGrad]] — which leads to interpreter overhead and poorer scalability, and a "static" [[compiled]] graph — chosen by frameworks such as [[TensorFlow]] — which limits interactivity and the types of programs that can be created easily, as well as making it harder for users to reason effectively about their programs.<ref name="flux" /> These earlier attempts are also generally only able to differentiate code written in a suitable manner for the framework, limiting their interoperability with other programs. A more recent framework in the [[Julia (programming language)|Julia]] programming language — called Zygote — resolves these problems by treating the language's syntax as the graph; the [[intermediate representation]] for arbitrary Julia code can then be differentiated directly, [[compiler optimization|optimized]], and compiled.<ref name="flux" /><ref>{{cite arxiv|last=Innes|first=Michael|date=2018-10-18|title=Don't Unroll Adjoint: Differentiating SSA-Form Programs|arxiveprint=1810.07951|class=cs.PL}}</ref>
 
==References==