Content deleted Content added
direct links |
CortexFiend (talk | contribs) Link suggestions feature: 3 links added. |
||
(10 intermediate revisions by 9 users not shown) | |||
Line 1:
{{Short description|Programming language implementation problem}}
In [[computer science]], the '''funarg problem''' ''(function argument problem)'' refers to the difficulty in implementing [[first-class function]]s ([[function (programming)|function]]s as [[first-class object]]s) in programming language implementations so as to use [[stack-based memory allocation]] of the functions.
The difficulty only arises if the body of a [[nested function]] refers directly (i.e., not by argument passing) to identifiers defined in the environment in which the function is defined, but not in the environment of the function call.<ref>[https://web.archive.org/web/20170706125408/ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-199.pdf ''The function of FUNCTION in LISP or why the FUNARG problem should be called the environment problem''], by Joel Moses, MIT Project MAC memo AI-199, MAC-M-428, June 1970 (15 pp.).</ref> A standard resolution is either to forbid such references or to create [[closure (computer science)|closures]].<ref>[http://portal.acm.org/citation.cfm?id=1093420.1093422 ''A proposed solution to the FUNARG problem''], by Erik Sandewall, in: ACM SIGSAM Bulletin 17 (Jan. 1971), pp. 29–42.</ref>
There are two subtly different versions of the funarg problem. The '''upwards funarg problem''' arises from returning (or otherwise transmitting "upwards") a function from a function call. The '''downwards funarg problem''' arises from passing a function as a parameter to another function call.
==Upwards funarg problem==
When one function calls another during a typical program's execution, the local state of the caller (including [[parameter (computer science)|parameters]] and [[local variable]]s) must be preserved in order for execution to proceed after the callee returns. In most compiled programs, this local state is stored on the [[call stack]] in a [[data structure]] called a ''[[Call stack#Structure|stack frame]]'' or ''activation record''. This stack frame is pushed, or allocated, as prelude to calling another function, and is popped, or deallocated, when the other function returns to the function that did the call. The upwards funarg problem arises when the calling function refers to the called/exited function's state after that function has returned. Therefore, the stack frame containing the called function's state variables must not be deallocated when the function returns, violating the [[stack-based memory allocation|stack-based function call paradigm]].
One solution to the upwards funarg problem is to simply allocate all activation records from the [[Heap (data structure)|heap]] instead of the stack and rely on some form of [[Garbage collection (computer science)|garbage collection]] or [[reference counting]] to deallocate them when they are no longer needed. Managing activation records on the heap has historically been perceived to be less efficient than on the stack (although this is partially contradicted<ref>[[Andrew Appel|Andrew W. Appel]], Zhong Shao. [https://www.cambridge.org/core/services/aop-cambridge-core/content/view/30303C7D7A9ACCC12AAA130855B7E6CF/S095679680000157Xa.pdf/empirical_and_analytic_study_of_stack_versus_heap_cost_for_languages_with_closures.pdf An Empirical and Analytic Study of Stack vs. Heap Cost for Languages with Closures]. [
Some efficiency-minded compilers employ a hybrid approach in which the activation records for a function are allocated from the stack if the [[compiler]] is able to deduce, through [[static program analysis]], that the function creates no upwards funargs. Otherwise, the activation records are allocated from the heap.
Another solution is to simply copy the value of the variables into the closure at the time the closure is created. This will cause a different behavior in the case of mutable variables, because the state will no longer be shared between closures. But if it is known that the variables are constant, then this approach will be equivalent. The [[ML (programming language)|ML]] languages take this approach, since variables in those languages are bound to values—i.e. variables cannot be changed. [[Java_(programming_language)|Java]] also takes this approach with respect to anonymous classes (and lambdas since Java 8), in that it only allows one to refer to variables in the enclosing scope that are effectively <code>final</code> (i.e. constant).
Some languages allow the programmer to explicitly choose between the two behaviors. [[PHP]] 5.3's anonymous functions require one to specify which variables to include in the closure using the <code>use ()</code> clause; if the variable is listed by reference, it includes a reference to the original variable; otherwise, it passes the value. In Apple's [[Blocks (C language extension)|Blocks]] anonymous functions, captured local variables are by default captured by value; if one wants to share the state between closures or between the closure and the outside scope, the variable must be declared with the <code>__block</code> modifier, in which case that variable is allocated on the heap.
===Example===
The following [[Haskell (programming language)|Haskell]]-
<code>[[Lambda calculus|λ]]</code> is the operator for constructing a new function, which in this case has one argument, <code>x</code>, and returns the result of first applying <code>g</code> to <code>x</code>, then applying <code>f</code> to that. This λ function carries the functions <code>f</code> and <code>g</code> (or pointers to them) as internal state.
Line 26 ⟶ 27:
A downwards funarg may also refer to a function's state when that function is not actually executing. However, because, by definition, the existence of a downwards funarg is contained in the execution of the function that creates it, the stack frame for the function can usually still be stored on the stack. Nonetheless, the existence of downwards funargs implies a tree structure of closures and stack frames that can complicate human and machine reasoning about the program state.
The downwards funarg problem complicates the efficient compilation of [[tail call]]s and code written in [[continuation-passing style]]. In these special cases, the intent of the programmer is (usually) that the function run in limited stack space, so the "faster" behavior may actually be undesirable.{{clarify|date=October 2023|reason=What does running in limited stack space entail in this case? What is the faster and the slower behaviour and why would either be undesirable?}}
==Practical implications==
Historically, the upwards funarg problem has proven to be
In [[functional language]]s, functions are first-class values that can be passed anywhere. Thus, implementations of [[Scheme (programming language)|Scheme]] or [[Standard ML]] must address both the upwards and downwards funarg problems. This is usually accomplished by representing function values as [[Dynamic memory allocation|heap-allocated]] closures, as previously described. The [[OCaml]] compiler employs a hybrid technique (based on [[
==See also==
|