Differentiable programming

Differentiable programming is a programming paradigm in which the programs can be differentiated throughout, usually via automatic differentiation.[1][2] This allows for gradient-based optimization of parameters in the program, often via gradient descent. Differentiable programming has found use in areas such as combining deep learning with physics engines in robotics, differentiable ray tracing, and image processing.[3][4][5]
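
For example, in a framework such as PyTorch (discussed below), a value computed by an ordinary program can be differentiated with respect to a parameter and then minimized by repeated gradient-descent updates. The following minimal sketch is illustrative only; the function being minimized and the learning rate are arbitrary choices, not taken from any referenced work.

    import torch

    # Parameter to be optimized; requires_grad asks the framework to track
    # gradients through every operation applied to it.
    w = torch.tensor(1.0, requires_grad=True)

    for _ in range(100):
        loss = (3.0 * w - 6.0) ** 2   # any differentiable program of w
        loss.backward()               # automatic differentiation: d(loss)/dw
        with torch.no_grad():         # plain update, not itself differentiated
            w -= 0.05 * w.grad        # gradient-descent step
            w.grad.zero_()            # clear the accumulated gradient

    print(w.item())                   # approaches 2.0, the minimizer of the loss

Here the derivative 18w - 36 is never written by hand; it is produced by automatic differentiation of the program itself.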

Most differentiable programming frameworks work by constructing a graph containing the control flow and data structures in the program.[6] Earlier attempts generally featured a tradeoff between two designs. A "dynamic", interpreted graph, as built by frameworks such as PyTorch and AutoGrad, incurs interpreter overhead and scales less well. A "static", compiled graph, as used by TensorFlow, limits interactivity and the types of programs that can be created easily, and makes it harder for users to reason effectively about their programs.[6] These earlier attempts are also generally only able to differentiate code written in a form suited to the framework, limiting their interoperability with other programs. A more recent framework in the Julia programming language, called Zygote, resolves these problems by treating the language's own syntax as the graph: the compiler's intermediate representation of arbitrary Julia code can be differentiated directly and then compiled.[6][7]
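
As an illustration of the "dynamic" interpreted-graph style, the sketch below (again using PyTorch; the function f and its inputs are made-up examples) differentiates through data-dependent host-language control flow, which the graph records anew on every execution.

    import torch

    def f(x):
        # Ordinary Python control flow: the number of loop iterations
        # depends on the data, and the graph records whatever actually ran.
        while x.norm() < 10.0:
            x = 2.0 * x
        return x.sum()

    x = torch.tensor([0.5, 0.5], requires_grad=True)
    f(x).backward()
    print(x.grad)   # tensor([16., 16.]): the chain rule through 4 doublings

A static-graph framework must instead express such control flow through dedicated graph operations, which is part of the expressiveness and interactivity tradeoff described above.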

References

  1. ^ Wang, Fei; Decker, James; Wu, Xilun; Essertel, Gregory; Rompf, Tiark (2018), Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K. (eds.), "Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming" (PDF), Advances in Neural Information Processing Systems 31, Curran Associates, Inc., pp. 10201–10212. Retrieved 2019-02-13.
  2. ^ Innes, Mike (2018). "On Machine Learning and Programming Languages" (PDF). SysML Conference 2018.
  3. ^ Degrave, Jonas; Hermans, Michiel; Dambre, Joni; wyffels, Francis (2016-11-05). "A Differentiable Physics Engine for Deep Learning in Robotics". arXiv:1611.01652 [cs].
  4. ^ "Differentiable Monte Carlo Ray Tracing through Edge Sampling". people.csail.mit.edu. Retrieved 2019-02-13.
  5. ^ "Differentiable Programming for Image Processing and Deep Learning in Halide". people.csail.mit.edu. Retrieved 2019-02-13.
  6. ^ a b c Innes, Michael; Saba, Elliot; Fischer, Keno; Gandhi, Dhairya; Rudilosso, Marco Concetto; Joy, Neethu Mariya; Karmali, Tejan; Pal, Avik; Shah, Viral (2018-10-31). "Fashionable Modelling with Flux". arXiv:1811.01457 [cs].
  7. ^ Innes, Michael (2018-10-18). "Don't Unroll Adjoint: Differentiating SSA-Form Programs". arXiv:1810.07951 [cs].