Differentiable programming: Difference between revisions

m Fix "on machine learning..." reference url
Clarify Tensorflow 1 vs 2
Line 5:
Most differentiable programming frameworks work by constructing a graph containing the control flow and [[data structures]] in the program.<ref name="flux">{{cite arxiv|last=Innes|first=Michael|last2=Saba|first2=Elliot|last3=Fischer|first3=Keno|last4=Gandhi|first4=Dhairya|last5=Rudilosso|first5=Marco Concetto|last6=Joy|first6=Neethu Mariya|last7=Karmali|first7=Tejan|last8=Pal|first8=Avik|last9=Shah|first9=Viral|date=2018-10-31|title=Fashionable Modelling with Flux|eprint=1811.01457|class=cs.PL}}</ref> Earlier attempts generally fall into two groups:
 
* '''Static, [[compiled]] graph'''-based approaches, such as [[TensorFlow]],<ref group=note>TensorFlow 1 uses the static graph approach, whereas TensorFlow 2 uses the dynamic graph approach by default.</ref> [[Theano (software)|Theano]], and [[MXNet]]. They tend to allow for good [[compiler optimization]] and easier scaling to large systems, but their static nature limits interactivity and the types of programs that can be created easily (e.g. those involving [[loop (computing)|loops]] or [[recursion]]), as well as making it harder for users to reason effectively about their programs (illustrated in the first sketch below).<ref name="flux" /><ref name="myia1">{{Cite web|url=https://www.sysml.cc/doc/2018/39.pdf|title=Automatic Differentiation in Myia|access-date=2019-06-24}}</ref><ref name="pytorchtut">{{Cite web|url=https://pytorch.org/tutorials/beginner/examples_autograd/tf_two_layer_net.html|title=TensorFlow: Static Graphs|access-date=2019-03-04}}</ref>
 
* '''[[Operator overloading]], dynamic graph'''-based approaches, such as [[PyTorch]] and [[AutoGrad (NumPy)|AutoGrad]]. Their dynamic and interactive nature lets most programs be written and reasoned about more easily. However, they introduce [[interpreter (computing)|interpreter]] overhead (particularly when composing many small operations), scale less readily, and cannot benefit from compiler optimization (illustrated in the second sketch below).<ref name="myia1" /><ref name="pytorchtut" />
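
A minimal sketch of the static, compiled-graph style, assuming TensorFlow 1.x (the toy computation and variable names are chosen purely for illustration): the entire graph, including the gradient node, is declared symbolically before any concrete values flow through it.

<syntaxhighlight lang="python">
import tensorflow as tf  # assumes TensorFlow 1.x, where the static graph API is the default

# Declare the whole graph symbolically before running anything.
x = tf.placeholder(tf.float32, shape=())   # symbolic input node
y = x * x + 3.0 * x                        # toy computation: y = x^2 + 3x
dy_dx = tf.gradients(y, [x])[0]            # gradient node added to the same static graph

# Only now is the graph executed, with concrete values fed in.
with tf.Session() as sess:
    print(sess.run(dy_dx, feed_dict={x: 2.0}))  # dy/dx at x = 2: 2*2 + 3 = 7.0
</syntaxhighlight>

Because the graph's structure is fixed before execution, native Python control flow cannot be recorded into it; looping constructs such as <code>tf.while_loop</code> must be used instead.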
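A minimal sketch of the operator-overloading, dynamic-graph style, assuming PyTorch (again with a toy computation chosen purely for illustration): the graph is recorded as ordinary Python code executes, so native loops and conditionals are differentiated through directly, at the cost of interpreter and dispatch overhead on every operation.

<syntaxhighlight lang="python">
import torch

x = torch.tensor(2.0, requires_grad=True)

# The graph is built on the fly as these overloaded operators execute.
y = x * x + 3.0 * x          # toy computation: y = x^2 + 3x
for _ in range(2):           # ordinary Python control flow is recorded as it runs
    y = y + torch.sin(y)

y.backward()                 # reverse-mode differentiation through the recorded graph
print(x.grad)                # dy/dx evaluated at x = 2.0
</syntaxhighlight>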
Line 11:
Both of these early approaches can differentiate only code written in a manner suited to the framework, which limits their interoperability with other programs.
 
A more recent package in the [[Julia (programming language)|Julia]] programming language, [https://github.com/FluxML/Zygote.jl Zygote], resolves the issues that earlier attempts faced by treating the language's syntax as the graph; the design of the Julia language makes it easy for the [[intermediate representation]] of arbitrary Julia code to be differentiated directly, [[compiler optimization|optimized]], and compiled.<ref name="flux" /><ref>{{cite arxiv|last=Innes|first=Michael|date=2018-10-18|title=Don't Unroll Adjoint: Differentiating SSA-Form Programs|eprint=1810.07951|class=cs.PL}}</ref> An in-development differentiable programming language called [[Myia (programming language)|Myia]] uses a similar approach,<ref name="myia1" /> as does an in-development project for [[Swift (programming language)|Swift]], implemented via a compiler transformation on the Swift intermediate language ([https://github.com/apple/swift/blob/tensorflow/docs/SIL.rst SIL]).<ref>{{Cite web|url=https://forums.swift.org/t/pre-pre-pitch-swift-differentiable-programming-design-overview/25992|title=Pre-pre-pitch: Swift Differentiable Programming Design Overview|date=2019-06-17|website=Swift Forums|language=en-US|access-date=2019-06-18}}</ref>
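
Zygote itself is invoked from Julia, roughly as <code>gradient(f, x)</code> for an ordinary Julia function <code>f</code>. As a rough, user-facing analogue only, the sketch below uses the JAX library for Python, which likewise differentiates an ordinary function containing native control flow; note, however, that JAX works by tracing with operator overloading rather than by Zygote's source-to-source transformation of the compiler's intermediate representation, and the function shown is invented purely for illustration.

<syntaxhighlight lang="python">
import jax
import jax.numpy as jnp

def f(x):
    # an ordinary function with native control flow, invented for illustration
    y = x
    for _ in range(3):
        y = jnp.sin(y) + x * y
    return y

df = jax.grad(f)   # differentiate the whole function with respect to x
print(df(1.5))
</syntaxhighlight>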
 
==See also==
* [[Machine learning]]
 
==Notes==
{{reflist|group=note}}
 
==References==