Differentiable programming

This is an old revision of this page, as edited by 81.154.10.60 (talk) at 18:58, 23 June 2019 (Approaches). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Differentiable programming is a programming paradigm in which programs can be differentiated throughout, usually via automatic differentiation.[1][2] This allows for gradient-based optimization of a program's parameters, often via gradient descent. Differentiable programming has found use in areas such as combining deep learning with physics engines in robotics, differentiable ray tracing, and image processing.[3][4][5]
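The idea can be illustrated with a minimal sketch of automatic differentiation in its simplest forward-mode form, using dual numbers in plain Python. The `Dual` class and the function `f` below are illustrative examples only, not part of any framework:

```python
class Dual:
    """A number carrying its value and its derivative together."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def f(x):
    # An ordinary program: arithmetic and control flow mix freely.
    y = x * x + 3 * x + 1
    for _ in range(2):        # loops are differentiated transparently
        y = y * x
    return y


# Seed the input with derivative 1 to obtain df/dx at x = 2.
out = f(Dual(2.0, 1.0))
print(out.value)  # f(2)  = (4 + 6 + 1) * 2 * 2 = 44.0
print(out.deriv)  # f'(2) = 72.0
```

Because differentiation rides along with ordinary evaluation, any program built from differentiable primitives is itself differentiable, which is what makes gradient descent on program parameters possible.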

Approaches

Most differentiable programming frameworks work by constructing a graph containing the control flow and data structures in the program.[6] Earlier attempts generally fall into two groups:

  • Static, compiled graph-based approaches such as TensorFlow, Theano, and MXNet. They tend to allow for good compiler optimization and easier scaling to large systems, but their static nature limits interactivity and the types of programs that can be created easily (for example, those involving loops or recursion), and makes it harder for users to reason effectively about their programs.[6][7][8]
  • Operator-overloading, dynamic graph-based approaches such as PyTorch and Autograd. Their dynamic and interactive nature lets most programs be written and reasoned about more easily. However, they introduce interpreter overhead (particularly when composing many small operations), scale less well, and cannot benefit from compiler optimization.[7][8]
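The operator-overloading style can be sketched with a toy reverse-mode example in Python: each arithmetic operation records its inputs and local gradients as the program runs, and a backward pass then propagates gradients through the recorded graph. The `Var` class here is an illustrative simplification, not the actual API of PyTorch or Autograd:

```python
class Var:
    """A value that records the operations applied to it (a dynamic graph)."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local gradient)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    __rmul__ = __mul__

    def backward(self, seed=1.0):
        # Reverse pass: accumulate gradients back through the recorded graph.
        # (This simple recursion revisits shared nodes once per path; real
        # frameworks traverse the graph once in topological order.)
        self.grad += seed
        for parent, local_grad in self.parents:
            parent.backward(seed * local_grad)


x = Var(3.0)
y = x * x + 2 * x     # the graph is built as this line executes
y.backward()
print(y.value)        # 15.0
print(x.grad)         # dy/dx = 2x + 2 = 8.0
```

Because the graph is rebuilt on every run, ordinary Python conditionals and loops work unchanged, which is the interactivity advantage of this group; the per-operation bookkeeping is also where the interpreter overhead comes from.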

Both of these early approaches can differentiate only code written in a manner suited to the framework, limiting their interoperability with other programs.
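By contrast, the static, define-then-run style of the first group can be sketched as follows: the program first builds a symbolic graph without computing anything, and differentiation produces another graph that can be optimized and evaluated separately. All names here (`Node`, `placeholder`, `evaluate`, `grad`) are illustrative, not taken from TensorFlow or Theano:

```python
class Node:
    """A symbolic graph node; nothing is computed while the graph is built."""
    def __init__(self, op, inputs=(), name=None, value=None):
        self.op, self.inputs, self.name, self.value = op, inputs, name, value

    def __add__(self, other): return Node("add", (self, _const(other)))
    __radd__ = __add__
    def __mul__(self, other): return Node("mul", (self, _const(other)))
    __rmul__ = __mul__


def _const(x):
    return x if isinstance(x, Node) else Node("const", value=x)


def placeholder(name):
    return Node("input", name=name)


def evaluate(node, env):
    """The 'session run': values flow through the graph only now."""
    if node.op == "input":
        return env[node.name]
    if node.op == "const":
        return node.value
    a, b = (evaluate(i, env) for i in node.inputs)
    return a + b if node.op == "add" else a * b


def grad(node, wrt):
    """Symbolically build a new graph computing d(node)/d(wrt)."""
    if node.op == "input":
        return _const(1.0 if node is wrt else 0.0)
    if node.op == "const":
        return _const(0.0)
    a, b = node.inputs
    if node.op == "add":
        return grad(a, wrt) + grad(b, wrt)
    # Product rule, expressed as more graph nodes
    return grad(a, wrt) * b + a * grad(b, wrt)


x = placeholder("x")
y = x * x + 3 * x                 # graph construction: no numbers involved yet
dy = grad(y, x)                   # the derivative is itself a graph
print(evaluate(y, {"x": 2.0}))    # 10.0
print(evaluate(dy, {"x": 2.0}))   # 2*2 + 3 = 7.0
```

Having the whole graph available before any value flows through it is what lets this group of frameworks optimize and compile aggressively; the cost is that control flow must be expressed inside the graph rather than in the host language.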

A more recent package in the Julia programming language, Zygote, resolves the issues that earlier attempts faced by treating the language's syntax itself as the graph: the design of Julia allows the intermediate representation of arbitrary Julia code to be differentiated directly, optimized, and compiled.[6][9] An in-development differentiable programming language called Myia takes a similar approach.[7]

APIs

Many programming languages whose design is less amenable to differentiable programming can gain differentiability features by calling on a differentiable programming framework via an API. For example, TensorFlow can be called from Python, JavaScript, C++, Java, Go, and Swift.[10]

Differentiable programming in Swift for TensorFlow extends the type system so that differentiable functions are first-class values, and is implemented as a compiler transformation on the Swift Intermediate Language (SIL). It leverages protocol-oriented programming (akin to type classes) to allow custom differentiable data structures. The authors hope that it will become a fully integrated part of the Swift language in the future.[11]

References

  1. Wang, Fei; Decker, James; Wu, Xilun; Essertel, Gregory; Rompf, Tiark (2018), Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K. (eds.), "Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming" (PDF), Advances in Neural Information Processing Systems 31, Curran Associates, Inc., pp. 10201–10212, retrieved 2019-02-13
  2. Innes, Mike (2018). "On Machine Learning and Programming Languages" (PDF). SysML Conference 2018.
  3. Degrave, Jonas; Hermans, Michiel; Dambre, Joni; wyffels, Francis (2016-11-05). "A Differentiable Physics Engine for Deep Learning in Robotics". arXiv:1611.01652 [cs.NE].
  4. "Differentiable Monte Carlo Ray Tracing through Edge Sampling". people.csail.mit.edu. Retrieved 2019-02-13.
  5. "Differentiable Programming for Image Processing and Deep Learning in Halide". people.csail.mit.edu. Retrieved 2019-02-13.
  6. Innes, Michael; Saba, Elliot; Fischer, Keno; Gandhi, Dhairya; Rudilosso, Marco Concetto; Joy, Neethu Mariya; Karmali, Tejan; Pal, Avik; Shah, Viral (2018-10-31). "Fashionable Modelling with Flux". arXiv:1811.01457 [cs.PL].
  7. "Automatic Differentiation in Myia" (PDF). Retrieved 2019-03-05.
  8. "TensorFlow: Static Graphs". Retrieved 2019-03-04.
  9. Innes, Michael (2018-10-18). "Don't Unroll Adjoint: Differentiating SSA-Form Programs". arXiv:1810.07951 [cs.PL].
  10. "API Documentation | TensorFlow Core r1.14". TensorFlow. Retrieved 2019-06-20.
  11. "Pre-pre-pitch: Swift Differentiable Programming Design Overview". Swift Forums. 2019-06-17. Retrieved 2019-06-18.