Closure (computer programming): Difference between revisions

AnomieBOT (talk | contribs)
Bender the Bot (talk | contribs)
m External links: HTTP to HTTPS for Blogspot
 
(24 intermediate revisions by 18 users not shown)
Line 3:
{{Distinguish|text=the programming language [[Clojure]]}}
{{Use dmy dates|date=August 2020}}
In [[programming language]]s, a '''closure''', also '''lexical closure''' or '''function closure''', is a technique for implementing [[lexically scoped]] [[name binding]] in a language with [[first-class function]]s. [[Operational semantics|Operationally]], a closure is a [[Record (computer science)|record]] storing a [[Function (computer science)|function]]{{efn|The function may be stored as a [[Reference (computer science)|reference]] to a function, such as a [[function pointer]].}} together with an environment.<ref>Sussman and Steele. "Scheme: An interpreter for extended lambda calculus". "... a data structure containing a lambda expression, and an environment to be used when that lambda expression is applied to arguments." ([[s:Page:Scheme - An interpreter for extended lambda calculus.djvu/22|Wikisource]])</ref> The environment is a mapping associating each [[free variable]] of the function (variables that are used locally, but defined in an enclosing scope) with the [[Value (computer science)|value]] or [[Reference (computer science)|reference]] to which the name was bound when the closure was created.{{efn|These names usually refer to values, mutable variables, or functions, but can also be other entities such as constants, types, classes, or labels.}} Unlike a plain function, a closure allows the function to access those ''captured variables'' through the closure's copies of their values or references, even when the function is invoked outside their scope.
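
As a minimal sketch of this "function plus environment" pairing, the Python snippet below builds a closure and then inspects what it captured. The names <code>make_scaler</code>, <code>scale</code> and <code>double</code> are hypothetical and chosen only for illustration; the attributes <code>__closure__</code>, <code>cell_contents</code> and <code>co_freevars</code> are how CPython happens to expose the environment, and other languages represent it differently.

<syntaxhighlight lang="python">
def make_scaler(factor):
    # 'factor' is local to make_scaler and a free variable of scale.
    def scale(value):
        return value * factor
    return scale


double = make_scaler(2)  # make_scaler has returned; 'factor' lives on

assert double(21) == 42
# The function part records the names of its free variables ...
assert double.__code__.co_freevars == ("factor",)
# ... and the environment part stores what each name was bound to.
assert double.__closure__[0].cell_contents == 2
</syntaxhighlight>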
 
== History and etymology ==
The concept of closures was developed in the 1960s for the mechanical evaluation of expressions in the [[λ-calculus]] and was first fully implemented in 1970 as a language feature in the [[PAL (programming language)|PAL programming language]] to support lexically scoped [[first-class function]]s.<ref name=dat2012>{{cite conference |author-link=David A. Turner |first=David A. |last=Turner |year=2012 |url=http://www.cs.kent.ac.uk/people/staff/dat/tfp12/tfp12.pdf |title=Some History of Functional Programming Languages |book-title=International Symposium on Trends in Functional Programming |pages=1–20 See 12 §2, note 8 for the claim about M-expressions. |publisher=Springer |doi=10.1007/978-3-642-40447-4_1 |isbn=978-3-642-40447-4 |series=Lecture Notes in Computer Science |volume=7829}}</ref>
 
[[Peter Landin]] defined the term ''closure'' in 1964 as having an ''environment part'' and a ''control part'' as used by his [[SECD machine]] for evaluating expressions.<ref name=landin>
{{cite journal |last=Landin |first=P.J. |author-link=Peter Landin |title=The mechanical evaluation of expressions |journal=The Computer Journal |volume=6 |issue=4 |date=January 1964 |pages=308–320 |doi=10.1093/comjnl/6.4.308 |url=https://academic.oup.com/comjnl/article-pdf/6/4/308/1067901/6-4-308.pdf }}</ref> [[Joel Moses]] credits Landin with introducing the term ''closure'' to refer to a [[Anonymous function|lambda expression]] with open bindings (free variables) that have been closed by (or bound in) the lexical environment, resulting in a ''closed expression'', or closure.<ref>
{{cite journal |last=Moses |first=Joel |author-link=Joel Moses |date=June 1970 |title=The Function of FUNCTION in LISP, or Why the FUNARG Problem Should Be Called the Environment Problem |journal=ACM SIGSAM Bulletin |issue=15 |pages=13–27 |doi=10.1145/1093410.1093411 |id=[[AI Memo]] 199 |quote=A useful metaphor for the difference between FUNCTION and QUOTE in LISP is to think of QUOTE as a porous or an open covering of the function since free variables escape to the current environment. FUNCTION acts as a closed or nonporous covering (hence the term "closure" used by Landin). Thus we talk of "open" Lambda expressions (functions in LISP are usually Lambda expressions) and "closed" Lambda expressions. [...] My interest in the environment problem began while Landin, who had a deep understanding of the problem, visited MIT during 1966–67. I then realized the correspondence between the FUNARG lists which are the results of the evaluation of "closed" Lambda expressions in [[LISP 1.5|LISP]] and [[ISWIM]]'s Lambda Closures.|hdl=1721.1/5854|s2cid=17514262 |hdl-access=free }}</ref><ref>{{cite book |last=Wikström |first=Åke |year=1987 |title=Functional Programming using Standard ML |publisher=Prentice Hall |isbn=0-13-331968-7 |quote=The reason it is called a "closure" is that an expression containing free variables is called an "open" expression, and by associating to it the bindings of its free variables, you close it.}}</ref> This use was subsequently adopted by [[Gerald Jay Sussman|Sussman]] and [[Guy L. Steele Jr.|Steele]] when they defined [[Scheme (programming language)|Scheme]] in 1975,<ref>{{cite report |last1=Sussman |first1=Gerald Jay |author1-link=Gerald Jay Sussman |last2=Steele |first2=Guy L. Jr. |author2-link=Guy L. Steele Jr. |date=December 1975 |title=Scheme: An Interpreter for the Extended Lambda Calculus |id=[[AI Memo]] 349}}</ref> a lexically scoped variant of [[Lisp (programming language)|Lisp]], and became widespread.
 
Sussman and [[Harold Abelson|Abelson]] also use the term ''closure'' in the 1980s with a second, unrelated meaning: the property of an operator that adds data to a [[data structure]] to also be able to add nested data structures. This use of the term comes from [[Closure (mathematics)|mathematics use]], rather than the prior use in computer science. The authors consider this overlap in terminology to be "unfortunate."<ref>{{cite book |last1=Abelson |first1=Harold |author1-link=Harold Abelson |last2=Sussman |first2=Gerald Jay |author2-link=Gerald Jay Sussman |last3=Sussman |first3=Julie |author3-link=Julie Sussman |date=1996 |title=Structure and Interpretation of Computer Programs |url=https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book.html |publisher=MIT Press |pages=98–99 |isbn=0-262-51087-1}}</ref>
Line 40:
assert h(1)(5) == 6 # h(1) is the closure.
</syntaxhighlight>
the values of <code>a</code> and <code>b</code> are closures, in both cases produced by returning a [[nested function]] with a free variable from the enclosing function, so that the free variable binds to the value of parameter <code>x</code> of the enclosing function. The closures in <code>a</code> and <code>b</code> are functionally identical. The only difference in implementation is that in the first case we used a nested function with a name, <code>g</code>, while in the second case we used an anonymous nested function (using the Python keyword <code>lambda</code> for creating an anonymous function). The original name, if any, used in defining them is irrelevant.
 
A closure is a value like any other value. It does not need to be assigned to a variable and can instead be used directly, as shown in the last two lines of the example. This usage may be deemed an "anonymous closure".
Line 46:
The nested function definitions are not themselves closures: they have a free variable which is not yet bound. Only once the enclosing function is evaluated with a value for the parameter is the free variable of the nested function bound, creating a closure, which is then returned from the enclosing function.
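
The point that a closure comes into being only when the enclosing function is called can be sketched as follows. This is an illustrative example with hypothetical names (<code>outer</code>, <code>inner</code>, <code>first</code>, <code>second</code>), not code from the article: each call binds the free variable afresh and yields a distinct closure, although both share the same compiled code.

<syntaxhighlight lang="python">
def outer(x):
    def inner(y):
        return x + y  # x is free in inner until outer is called
    return inner


first = outer(1)    # binds x to 1 and creates one closure
second = outer(10)  # binds x to 10 and creates another

assert first is not second                # two distinct closure values
assert first.__code__ is second.__code__  # one shared piece of code
assert first(5) == 6
assert second(5) == 15
</syntaxhighlight>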
 
Lastly, a closure is only distinct from a function with free variables when outside of the scope of the non-local variables; otherwise the defining environment and the execution environment coincide and there is nothing to distinguish these (static and dynamic binding cannot be distinguished because the names resolve to the same values). For example, in the program below, functions with a free variable <code>x</code> (bound to the non-local variable <code>x</code> with global scope) are executed in the same environment where <code>x</code> is defined, so it is immaterial whether these are actually closures:
<syntaxhighlight lang="python">
x = 1
Line 59:
This is most often achieved by a function return, since the function must be defined within the scope of the non-local variables, in which case typically its own scope will be smaller.
 
This can also be achieved by [[variable shadowing]] (which reduces the scope of the [[non-local variable]]), though this is less common in practice, as it is less useful and shadowing is discouraged. In this example <code>f</code> can be seen to be a closure because <code>x</code> in the body of <code>f</code> is bound to the <code>x</code> in the global namespace, not the <code>x</code> local to <code>g</code>:
<syntaxhighlight lang="python">
x = 0
Line 74:
 
== Applications ==
The use of closures is associated with languages where functions are [[first-class object]]s, in which functions can be returned as results from [[higher-order function]]s, or passed as arguments to other function calls; if functions with free variables are first-class, then returning one creates a closure. This includes [[functional programming]] languages such as [[Lisp (programming language)|Lisp]] and [[ML (programming language)|ML]], and many modern, multi-paradigm languages, such as [[Julia (programming language)|Julia]], [[Python (programming language)|Python]], and [[Rust (programming language)|Rust]]. Closures are also often used with [[Callback (computer programming)|callbacks]], particularly for [[event handler]]s, such as in [[JavaScript]], where they are used for interactions with a [[dynamic web page]].
 
Closures can also be used in a [[continuation-passing style]] to [[Information hiding|hide state]]. Constructs such as [[Object (computer science)|object]]s and [[control structure]]s can thus be implemented with closures. In some languages, a closure may occur when a function is defined within another function, and the inner function refers to local variables of the outer function. At [[Run time (program lifecycle phase)|run-time]], when the outer function executes, a closure is formed, consisting of the inner function's code and references (the upvalues) to any variables of the outer function required by the closure.
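
One way to picture an object built from closures is the sketch below, in which two inner functions share a hidden <code>balance</code> variable. The account example and all of its names (<code>make_account</code>, <code>deposit</code>, <code>withdraw</code>) are assumptions made for illustration; the snippet shows only the state-hiding idea, not continuation-passing style.

<syntaxhighlight lang="python">
def make_account(balance):
    # 'balance' is reachable only through the two closures below.
    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance

    def withdraw(amount):
        nonlocal balance
        if amount > balance:
            raise ValueError("insufficient funds")
        balance -= amount
        return balance

    # The "object" is just a table of closures over the same variable.
    return {"deposit": deposit, "withdraw": withdraw}


account = make_account(100)
assert account["deposit"](50) == 150
assert account["withdraw"](30) == 120
</syntaxhighlight>

Both closures refer to the same variable rather than to private copies, so a deposit made through one is visible to a later withdrawal through the other.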
 
=== First-class functions ===
Line 96 ⟶ 92:
In this example, the [[Lambda (programming)|lambda expression]] <code>(lambda (book) (>= (book-sales book) threshold))</code> appears within the function <code>best-selling-books</code>. When the lambda expression is evaluated, Scheme creates a closure consisting of the code for the lambda expression and a reference to the <code>threshold</code> variable, which is a [[free variable]] inside the lambda expression.
 
The closure is then passed to the <code>filter</code> function, which calls it repeatedly to determine which books are to be added to the result list and which are to be discarded. Because the closure itself has a reference to <code>threshold</code>, it can use that variable each time <code>filter</code> calls it. The function <code>filter</code> itself might be defined in a completely separate file.
 
Here is the same example rewritten in [[JavaScript]], another popular language with support for closures:
Line 106 ⟶ 102:
</syntaxhighlight>
 
The arrow operator <code>=></code> is used to define an [https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions arrow function expression], and an <code>Array.filter</code> method<ref>{{cite web |url=https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/Array/filter |title=array.filter |work=Mozilla Developer Center |date=10 January 2010 |access-date=2010-02-09}}</ref> is used instead of a global <code>filter</code> function, but otherwise the structure and the effect of the code are the same.
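
For comparison, the same selection could be sketched in Python along the following lines. The <code>Book</code> record and its <code>sales</code> field are assumptions standing in for the <code>book-sales</code> accessor of the Scheme version; the lambda passed to the built-in <code>filter</code> closes over <code>threshold</code> in the same way.

<syntaxhighlight lang="python">
from collections import namedtuple

# Hypothetical record type standing in for the article's book data.
Book = namedtuple("Book", ["title", "sales"])


def best_selling_books(book_list, threshold):
    # The lambda is a closure: 'threshold' is free inside it and is
    # looked up in the environment of this call each time filter runs it.
    return list(filter(lambda book: book.sales >= threshold, book_list))


books = [Book("A", 500), Book("B", 1500), Book("C", 1000)]
assert best_selling_books(books, 1000) == [Book("B", 1500), Book("C", 1000)]
</syntaxhighlight>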
 
A function may create a closure and return it, as in this example:
Line 149 ⟶ 145:
 
== Implementation and theory ==
Closures are typically implemented with a special [[data structure]] that contains a [[function pointer|pointer to the function code]], plus a representation of the function's lexical environment (i.e., the set of available variables) at the time when the closure was created. The referencing environment [[name binding|binds]] the non-local names to the corresponding variables in the lexical environment at the time the closure is created, additionally extending their lifetime to at least as long as the lifetime of the closure itself. When the closure is entered at a later time, possibly with a different lexical environment, the function is executed with its non-local variables referring to the ones captured by the closure, not the current environment.
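
The data structure described above can be imitated by hand. The following is a simplified sketch, not how any particular implementation lays out its closures: the class <code>Closure</code>, the function <code>add_code</code> and the dictionary-based environment are all hypothetical names introduced here to make the two parts (code pointer and captured environment) explicit.

<syntaxhighlight lang="python">
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Closure:
    code: Callable[..., Any]  # the "pointer to the function code"
    env: Dict[str, Any]       # bindings captured at creation time

    def __call__(self, *args):
        # Entering the closure: run the code against the captured
        # environment, not against the caller's environment.
        return self.code(self.env, *args)


def add_code(env, y):
    # Closed code: its former free variable x is now read from env.
    return env["x"] + y


def make_adder(x):
    return Closure(code=add_code, env={"x": x})


add1 = make_adder(1)
x = 100  # a different binding of x at the call site is ignored
assert add1(5) == 6
</syntaxhighlight>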
 
A language implementation cannot easily support full closures if its run-time memory model allocates all [[automatic variable]]s on a linear [[Stack-based memory allocation|stack]]. In such languages, a function's automatic local variables are deallocated when the function returns. However, a closure requires that the free variables it references survive the enclosing function's execution. Therefore, those variables must be allocated so that they persist until no longer needed, typically via [[heap allocation]], rather than on the stack, and their lifetime must be managed so they survive until all closures referencing them are no longer in use.
 
This explains why, typically, languages that natively support closures also use [[Garbage collection (computer science)|garbage collection]]. The alternatives are manual memory management of non-local variables (explicitly allocating on the heap and freeing when done), or, if using stack allocation, for the language to accept that certain use cases will lead to [[undefined behaviour]], due to [[dangling pointer]]s to freed automatic variables, as in lambda expressions in C++11<ref>''[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2550.pdf Lambda Expressions and Closures]'' C++ Standards Committee. 29 February 2008.</ref> or nested functions in GNU C.<ref>{{cite web |work=GCC Manual |url=https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html |title=6.4 Nested Functions |quote=If you try to call the nested function through its address after the containing function exits, all hell breaks loose. If you try to call it after a containing scope level exits, and if it refers to some of the variables that are no longer in scope, you may be lucky, but it's not wise to take the risk. If, however, the nested function does not refer to anything that has gone out of scope, you should be safe.}}</ref> The [[funarg problem]] (or "functional argument" problem) describes the difficulty of implementing functions as first-class objects in a stack-based programming language such as C or C++. Similarly, in [[D (programming language)|D]] version 1, it is assumed that the programmer knows what to do with [[delegation (programming)|delegates]] and automatic local variables, as references to them become invalid after return from their definition scope (automatic local variables are on the stack); this still permits many useful functional patterns, but complex cases need explicit [[heap allocation]] for variables. D version 2 solved this by detecting which variables must be stored on the heap and allocating them automatically. Because D uses garbage collection, in both versions, there is no need to track usage of variables as they are passed.
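
The lifetime extension discussed above can be observed in a garbage-collected language. The sketch below uses Python's <code>weakref</code> module as a probe; <code>Resource</code>, <code>make_reader</code> and <code>probe</code> are hypothetical names, and the prompt reclamation shown is how CPython behaves (the explicit <code>gc.collect()</code> call is there for implementations that reclaim memory later).

<syntaxhighlight lang="python">
import gc
import weakref


class Resource:
    """Stand-in for a local value whose lifetime is being observed."""


def make_reader():
    res = Resource()          # local variable of make_reader
    probe = weakref.ref(res)  # weak reference: does not keep res alive

    def reader():
        return res            # res is a free variable of reader

    return reader, probe


reader, probe = make_reader()  # make_reader has returned ...
assert probe() is reader()     # ... yet res survives inside the closure

del reader                     # drop the last reference to the closure
gc.collect()
assert probe() is None         # now the captured variable is reclaimed
</syntaxhighlight>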
 
In strict functional languages with immutable data (''e.g.'' [[Erlang (programming language)|Erlang]]), it is very easy to implement automatic memory management (garbage collection), as there are no possible cycles in variables' references. For example, in Erlang, all arguments and variables are allocated on the heap, but references to them are additionally stored on the stack. After a function returns, references are still valid. Heap cleaning is done by an incremental garbage collector.
Line 207 ⟶ 203:
 
===Example 2: Accidental reference to a bound variable===
For this example the expected behaviour would be that each link should emit its id when clicked; but because the variable <code>e</code> is bound to the enclosing scope and evaluated lazily on click, what actually happens is that each click event emits the id of the last element in <code>elements</code>, the one bound at the end of the [[for loop]].<ref>{{cite web |title=Closures |url=https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures#Creating_closures_in_loops_A_common_mistake |website=MDN Web Docs |access-date=20 November 2018}}</ref>
<syntaxhighlight lang="javascript">
var elements = document.getElementsByTagName('a');
Line 238 ⟶ 233:
</syntaxhighlight>
 
The binding of <code>r</code> captured by the closure defined within function <code>foo</code> is to the computation <code>(x / y)</code>, which in this case results in division by zero. However, since it is the computation that is captured, and not the value, the error only manifests itself when the closure is invoked and then attempts to use the captured binding.
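
The effect can be imitated in a strictly evaluated language by capturing an explicit thunk. The Python sketch below is an analogy under that assumption rather than a translation of the example above: it reuses the names <code>foo</code> and <code>r</code>, wraps the division in a zero-argument lambda so that it is deferred, and shows that the error appears only when the returned closure is called.

<syntaxhighlight lang="python">
def foo(x, y):
    r = lambda: x / y          # deferred: nothing is divided yet
    return lambda z: z + r()   # captures the computation, not a number


f = foo(1, 0)                  # creating the closure raises no error

try:
    f(123)                     # the division finally runs here
except ZeroDivisionError:
    failed_on_call = True
else:
    failed_on_call = False

assert failed_on_call
</syntaxhighlight>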
 
=== Closure leaving ===
Line 331 ⟶ 326:
 
== Closure-like constructs ==
Some languages have features which simulate the behavior of closures. In languages such as [[C++]], [[C Sharp (programming language)|C#]], [[D (programming language)|D]], [[Java (programming language)|Java]], [[Objective-C]], and [[Visual Basic (.NET)]] (VB.NET), these features are the result of the language's object-oriented paradigm.
 
=== Callbacks (C) ===
Some [[C (programming language)|C]] libraries support [[Callback (computer programming)|callbacks]]. This is sometimes implemented by providing two values when registering the callback with the library: a function pointer and a separate <code>void*</code> pointer to arbitrary data of the user's choice. When the library executes the callback function, it passes along the data pointer. This enables the callback to maintain state and to refer to information captured at the time it was registered with the library. The idiom is similar to closures in functionality, but not in syntax. The <code>void*</code> pointer is not [[Type safety|type safe]] so this C idiom differs from type-safe closures in C#, Haskell or ML.
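
The shape of this idiom, a function paired with a separately registered data value, can be sketched in Python even though the language has real closures. Everything here is hypothetical: <code>Button</code>, <code>register</code> and <code>count_click</code> do not belong to any actual toolkit, and a dictionary plays the role of the untyped <code>void*</code> payload.

<syntaxhighlight lang="python">
class Button:
    """Hypothetical widget storing (function, user_data) pairs, C style."""

    def __init__(self):
        self._handlers = []

    def register(self, func, user_data):
        # Two separate values, as in the C idiom: code plus opaque data.
        self._handlers.append((func, user_data))

    def click(self):
        # The library hands the stored data back on every invocation.
        for func, user_data in self._handlers:
            func(user_data)


def count_click(state):
    # A plain function with no free variables; its state arrives as an
    # argument instead of being captured.
    state["clicks"] += 1


state = {"clicks": 0}
button = Button()
button.register(count_click, state)
button.click()
button.click()
assert state["clicks"] == 2
</syntaxhighlight>

With a closure, the <code>user_data</code> parameter becomes unnecessary, because the handler can refer to <code>state</code> directly as a captured variable.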
 
Callbacks are used extensively in [[graphical user interface]] (GUI) [[widget toolkit]]s to implement [[event-driven programming]] by associating general functions of graphical widgets (menus, buttons, check boxes, sliders, spinners, etc.) with application-specific functions implementing the specific desired behavior for the application.
 
====Nested function and function pointer (C)====
With a [[GNU Compiler Collection]] (GCC) extension, a nested function<ref>{{cite web
|url = https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html
|title = Nested functions}}</ref> can be used and a function pointer can emulate closures, provided the function does not exit the containing scope. The next example is invalid because <code>adder</code> is a top-level definition (depending on compiler version, it could produce a correct result if compiled without optimization, i.e., at <code>-O0</code>):
 
<syntaxhighlight lang="c">
Line 422 ⟶ 401:
The capturing of <code>final</code> variables enables capturing variables by value. Even if the variable to capture is non-<code>final</code>, it can always be copied to a temporary <code>final</code> variable just before the class.
 
Capturing of variables by reference can be emulated by using a <code>final</code> reference to a mutable container, for example, a one-element array. The local class will not be able to change the value of the container reference itself, but it will be able to change the contents of the container.
 
With the advent of Java 8's lambda expressions,<ref>{{cite web |url=http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html |title=Lambda Expressions |work=The Java Tutorials}}</ref> the above code can be written with a closure as:
 
<syntaxhighlight lang="java">
Line 441 ⟶ 420:
 
Local classes are one of the types of [[inner class]] that are declared within the body of a method. Java also supports inner classes that are declared as ''non-static members'' of an enclosing class.<ref>
{{cite web |url=https://blogs.oracle.com/darcy/entry/nested_inner_member_and_top
|title=Nested, Inner, Member, and Top-Level Classes |work=Joseph D. Darcy's Oracle Weblog |date=July 2007|archive-url=https://web.archive.org/web/20160831172734/https://blogs.oracle.com/darcy/entry/nested_inner_member_and_top |archive-date=31 August 2016 }}</ref> They are normally referred to just as "inner classes".<ref>
{{cite web |url=https://java.sun.com/docs/books/tutorial/java/javaOO/innerclasses.html
|title=Inner Class Example |work=The Java Tutorials: Learning the Java Language: Classes and Objects
}}</ref> These are defined in the body of the enclosing class and have full access to instance variables of the enclosing class. Due to their binding to these instance variables, an inner class may only be instantiated with an explicit binding to an instance of the enclosing class using a special syntax.<ref>
{{cite web |url=https://java.sun.com/docs/books/tutorial/java/javaOO/nested.html
|title=Nested Classes |work=The Java Tutorials: Learning the Java Language: Classes and Objects
}}</ref>
 
Line 562 ⟶ 541:
 
=== Function objects (C++) ===
[[C++]] enables defining [[function object]]s by overloading <code>operator()</code>. These objects behave somewhat like functions in a functional programming language. They may be created at runtime and may contain state, but they do not implicitly capture local variables as closures do. As of [[C++11|the 2011 revision]], the C++ language also supports closures, which are a type of function object constructed automatically from a special [[language construct]] called a ''lambda-expression''. A C++ closure may capture its context either by storing copies of the accessed variables as members of the closure object or by reference. In the latter case, if the closure object escapes the scope of a referenced object, invoking its <code>operator()</code> causes undefined behavior since C++ closures do not extend the lifetime of their context.{{main|Examples of anonymous functions#C++ (since C++11)}}
 
<syntaxhighlight lang="cpp">
Line 592 ⟶ 571:
certain button, so that whenever an instance of the event type occurs on that button – because a user has clicked the button – the procedure will be executed with the mouse coordinates being passed as arguments for <code>x</code> and <code>y</code>.
 
The main limitation of Eiffel agents, which distinguishes them from closures in other languages, is that they cannot reference local variables from the enclosing scope. This design decision avoids ambiguity about the value of a local variable in a closure: should it be the latest value of the variable or the value captured when the agent is created? Only <code>Current</code> (a reference to the current object, analogous to <code>this</code> in Java), its features, and arguments of the agent itself can be accessed from within the agent body. The values of the outer local variables can be passed by providing additional closed operands to the agent.
 
=== C++Builder __closure reserved word ===
Embarcadero C++Builder provides the reserved word <code>__closure</code> to declare a pointer to a method with a syntax similar to that of a function pointer.<ref>Full documentation can be found at http://docwiki.embarcadero.com/RADStudio/Rio/en/Closure</ref>
 
Standard C allows writing a {{mono|[[typedef]]}} for a pointer to a [[function type]] using the following syntax:<syntaxhighlight lang="c++">
Line 604 ⟶ 583:
 
== See also ==
{{div col}}
* [[Anonymous function]]
* [[Blocks (C language extension)]]
* [[Command pattern]]
* [[Continuation]]
* [[Currying]]
* [[Funarg problem]]
* [[Lambda calculus]]
* [[Lazy evaluation]]
* [[Partial application]]
* [[Spaghetti stack]]
* [[Syntactic closure]]
* [[Value-level programming]]
{{div col end}}
 
== Notes ==
Line 629 ⟶ 604:
|date=2007-01-28
|title=A Definition of Closures
|url=https://gafter.blogspot.com/2007/01/definition-of-closures.html
}}
* {{cite web
Line 643 ⟶ 618:
[[Category:Implementation of functional programming languages]]
[[Category:Subroutines]]
[[Category:Articles with example C++ code]]
[[Category:Articles with example C Sharp code]]
[[Category:Articles with example D code]]
[[Category:Articles with example Eiffel code]]
[[Category:Articles with example Haskell code]]
[[Category:Articles with example Java code]]
[[Category:Articles with example JavaScript code]]
[[Category:Articles with example Objective-C code]]
[[Category:Articles with example Python (programming language) code]]
[[Category:Articles with example Ruby code]]
[[Category:Articles with example Scheme (programming language) code]]
[[Category:Articles with example Smalltalk code]]