Most differentiable programming frameworks work by constructing a graph containing the control flow and [[data structures]] in the program.<ref name="flux">{{cite arxiv|last=Innes|first=Michael|last2=Saba|first2=Elliot|last3=Fischer|first3=Keno|last4=Gandhi|first4=Dhairya|last5=Rudilosso|first5=Marco Concetto|last6=Joy|first6=Neethu Mariya|last7=Karmali|first7=Tejan|last8=Pal|first8=Avik|last9=Shah|first9=Viral|date=2018-10-31|title=Fashionable Modelling with Flux|eprint=1811.01457|class=cs.PL}}</ref> Earlier attempts generally fall into two groups:
* '''Static, [[compiled]] graph-based''' approaches, such as [[TensorFlow]], [[Theano]], and [[MXNet]]. They tend to allow for good compiler optimization and easier scaling to large systems, but their static nature limits interactivity and the types of programs that can be created easily (e.g. those involving loops or recursion), and makes it harder for users to reason effectively about their programs.<ref name="flux" /><ref name="myia1">{{Cite web|url=https://github.com/mila-iqia/myia/blob/master/README.rst|title=Myia|access-date=2019-03-04}}</ref>
* '''[[Operator overloading]], dynamic graph-based''' approaches, such as [[PyTorch]] and [[AutoGrad (NumPy)|AutoGrad]]. Their dynamic and interactive nature lets most programs be written and reasoned about more easily. However, they incur interpreter overhead (particularly when composing many small operations), scale less well, and do not benefit from compiler optimization.<ref name="myia1" /> The two styles are contrasted in the example below.
Both of these earlier attempts are also generally only able to differentiate code written in a manner suited to the framework, limiting their interoperability with other programs.
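The contrast can be illustrated with a minimal sketch, assuming the TensorFlow 1.x graph-building API for the static style and PyTorch for the dynamic style; both compute the derivative of ''y'' = ''x''<sup>2</sup> + 2''x'' at ''x'' = 3 (which is 8), and the calls shown are only one illustrative way to use each framework:

<syntaxhighlight lang="python">
# Static, compiled graph style (TensorFlow 1.x API): the whole computation,
# including its derivative, is described symbolically before anything runs.
import tensorflow as tf

x = tf.placeholder(tf.float32)       # symbolic input; it has no value yet
y = x * x + 2.0 * x                  # adds nodes to the graph, computes nothing
dy_dx = tf.gradients(y, x)[0]        # adds graph nodes for the derivative

with tf.Session() as sess:           # the finished graph is only executed here
    print(sess.run(dy_dx, feed_dict={x: 3.0}))   # 8.0

# Operator-overloading, dynamic graph style (PyTorch): overloaded tensor
# operators record the computation while ordinary Python code executes.
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x * x + 2.0 * x                  # runs immediately and records the operations
y.backward()                         # reverse-mode differentiation of the record
print(x.grad)                        # tensor(8.)
</syntaxhighlight>

In the first half the program only manipulates symbolic graph nodes, which a compiler can optimize but which are harder to inspect interactively; in the second half differentiation is applied to an ordinary eager computation, trading compiler optimization for interactivity, as described above.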