Function (computer programming): Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 20:52, 19 May 2024 edit Thomas Meng (talk \| contribs) Extended confirmed users 2,031 edits →Side effects ← Previous edit		Latest revision as of 21:48, 5 August 2025 edit undo Ingenuity (talk \| contribs) Edit filter managers, Autopatrolled, Administrators 85,793 edits m Reverted edit by 192.145.175.229 (talk) to last version by Chatul Tag: Rollback
(42 intermediate revisions by 27 users not shown)
Line 2: {{Other uses\|Function (disambiguation){{!}}Function}} {{Use dmy dates\|date=April 2022}} ~~{{More citations needed\|date=February 2013}}~~ {{anchor\|SUBROUTINE_DEFINITION}} In [[computer programming]], a '''function''', (also '''~~subprogram~~procedure''', '''~~procedure~~method''', '''~~method~~subroutine''', '''routine''', or '''~~subroutine~~subprogram''') is a '''callable unit'''<ref>{{cite web \|title=Terminology Glossary \|url=https://pages.nist.gov/ElectionGlossary/ Line 11 ⟶ 10: \|publisher=NIST \|quote=Callable unit: (Of a software program or logical design) Function, method, operation, subroutine, procedure, or analogous structural unit that appears within a module. \|access-date=9 February 2024}}</ref> of [[software logic]] that has a well-defined [[Interface (computing)\|interface]] and [[behavior]] and can be invoked bymultiple ~~other software units to exhibit that behavior~~times. Callable units provide a powerful programming tool.<ref name="knuth1">{{cite book \|title= The Art of Computer Programming, Volume I: Fundamental Algorithms \|author= Donald E. Knuth \|year= 1997 \|author-link= Donald Knuth \|publisher= Addison-Wesley \|isbn=0-201-89683-4}}</ref> The primary purpose is to allow for the decomposition of a large and/or complicated problem into chunks that have relatively low [[cognitive load]] and to assign the chunks meaningful names (unless they are anonymous). Judicious application can reduce the cost of developing and maintaining software, while increasing its quality and reliability.<ref name="structprog">{{cite book \|author= O.-J. Dahl \|author2=E. W. Dijkstra \|author3=C. A. R. Hoare \|title= Structured Programming \|publisher= Academic Press \|year= 1972 \|isbn= 0-12-200550-3}}</ref> Callable units are present at multiple levels of [[abstraction]] in the programming environment. For example, a [[programmer]] may write a function in [[source code]] that is compiled to [[machine code]] that implements similar [[semantics]]. There is a callable unit in the source code and an associated one in the machine code, but they are different kinds of callable units {{endash}} with different implications and features. ==Terminology== Some [[programming language]]s, such as [[COBOL]] and [[BASIC]], make a distinction between functions that return a value (typically called "functions") and those that do not (typically called "subprogram", "subroutine", or "procedure"); some, such as [[C (programming language)\|C]], [[C++]], and [[Rust (programming language)\|Rust]], only use the term "function" irrespective of whether they return a value or not; others, such as [[ALGOL 60]] and [[PL/I]], only use the word ''procedure''. Some [[Object-oriented programming\|object-oriented]] languages, such as [[Java (programming language)\|Java]] and [[C Sharp (programming language)\|C#]], refer to functions inside [[Class (computer programming)\|classes]] as "[[Method (computer programming)\|method]]s". ==History== The idea of a callable unit was initially conceived by [[John Mauchly]] and [[Kathleen Antonelli]] during their work on [[ENIAC]] and recorded in a January 1947 Harvard symposium on "Preparation of Problems for [[EDVAC]]-type Machines"."<ref name=mauchly0>{{cite book \|first=J.W. \|last=Mauchly \|chapter=Preparation of Problems for EDVAC-Type Machines \|author-link=John Mauchly \|editor-first=Brian \|editor-last=Randell \|title=The Origins of Digital Computers \|doi=10.1007/978-3-642-61812-3_31 \|publisher=Springer\|date=1982\|pages=393–397 \|isbn=978-3-642-61814-7 \|chapter-url=https://archive.org/details/originsofdigital0000rand/page/365}}</ref> [[Maurice Wilkes]], [[David Wheeler (British computer scientist)\|David Wheeler]], and [[Stanley Gill]] are generally credited with the formal invention of this concept, which they termed a ''closed sub-routine'',<ref>{{Cite conference \|last1 = Wheeler \|first1 = D. J. \|author-link1 = David Wheeler (computer scientist) \|chapter = The use of sub-routines in programmes \|doi = 10.1145/609784.609816 \|title = Proceedings of the 1952 ACM national meeting (Pittsburgh) on - ACM '52 \|pages = 235 \|year = 1952 \|chapter-url = http://www.laputan.org/pub/papers/wheeler.pdf\|doi-access = free }}</ref><ref>{{cite book \|last1= Wilkes \|first1= M. V. \|last2= Wheeler \|first2= D. J. \|last3= Gill \|first3=S. \|title= Preparation of Programs for an Electronic Digital Computer \|publisher= Addison-Wesley \|year= 1951}}</ref> contrasted with an ''open subroutine'' or [[Macro (computer science)\|macro]].<ref>{{cite encyclopedia \|last=Dainith \|first=John \|title="open subroutine." A Dictionary of Computing \| date=2004 \|url=http://www.encyclopedia.com/doc/1O11-opensubroutine.html \|encyclopedia=Encyclopedia.com \|access-date=January 14, 2013}}</ref> However, [[Alan Turing]] had discussed subroutines in a paper of 1945 on design proposals for the NPL [[Automatic Computing Engine\|ACE]], going so far as to invent the concept of a [[Call stack\|return address stack]].<ref>{{Citation \| last = Turing \| first =Alan M. \| author-link = Alan Turing \| title = Report by Dr. A.M. Turing on proposals for the development of an Automatic Computing Engine (ACE): Submitted to the Executive Committee of the NPL in February 1946 \| year = 1945 }} reprinted in {{Cite book \| editor-last = Copeland \| editor-first = B. J. \| editor-link = Jack Copeland \| title = Alan Turing's Automatic Computing Engine \| ___location = Oxford \| publisher = Oxford University Press \| publication-date = 2005 \| isbn = 0-19-856593-3 \| year = 2005 \| url-access = registration \| url = https://archive.org/details/alanturingsautom0000unse \|page=383}}</ref> The idea of a subroutine was worked out after computing machines had already existed for some time. The arithmetic and conditional jump instructions were planned ahead of time and have changed relatively little, but the special instructions used for procedure calls have changed greatly over the years. The earliest computers ~~and microprocessors~~, such as the [[Manchester Baby]], and some early microprocessors, such as the [[RCA 1802]], did not have a single subroutine call instruction. Subroutines could be implemented, but they required programmers to use the call sequence—a series of instructions—at each [[call site]]. Subroutines were implemented in [[Konrad Zuse]]'s [[Z4 (computer)\|Z4]] in 1945. In 1945, [[Alan M. Turing]] used the terms "bury" and "unbury" as a means of calling and returning from subroutines.<ref name="Turing_1945">{{citation \|author-first=Alan Mathison \|author-last=Turing \|author-link=Alan Mathison Turing \|title=Proposals for Development in the Mathematics Division of an Automatic Computing Engine (ACE) \|date=1946-03-19 \|orig-year=1945}} (NB. Presented on 1946-03-19 before the Executive Committee of the National Physical Laboratory (Great Britain).)</ref><ref name="Carpenter_1977">{{cite journal \|title=The other Turing machine \|author-last1=Carpenter \|author-first1=Brian Edward \|author-link1=Brian Carpenter (Internet engineer) \|author-last2=Doran \|author-first2=Robert William \|author-link2=Robert William Doran \|date=1977-01-01 \|orig-year=October 1975 \|doi=10.1093/comjnl/20.3.269 \|volume=20 \|issue=3 \|journal=[[The Computer Journal]] \|pages=269–279\|doi-access=free }} (11 pages)</ref> In January 1947 John Mauchly presented general notes at 'A Symposium of Large Scale Digital Calculating Machinery' Line 137 ⟶ 140: ====Delayed stacking {{anchor\|Leaf procedure}}{{anchor\|Leaf function}}==== One disadvantage of the call stack mechanism is the increased cost of a procedure call and its matching return.{{ clarify \| date = November 2015 \| reason = increased relative to what?}} The extra cost includes incrementing and decrementing the stack pointer (and, in some architectures, checking for [[stack overflow]]), and accessing the local variables and parameters by frame-relative addresses, instead of absolute addresses. The cost may be realized in increased execution time, or increased processor complexity, or both. This overhead is most obvious and objectionable in ''leaf procedures'' or ''[[leaf ~~functions~~function]]s'', which return without making any procedure calls themselves.<ref>{{cite web\|url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka13785.html \|title=ARM Information Center \|publisher=Infocenter.arm.com \|access-date=2013-09-29}}</ref><ref>{{cite web \|title=x64 stack usage \|url=https://docs.microsoft.com/en-us/cpp/build/stack-usage \|website=Microsoft Docs \|publisher=Microsoft \|access-date=5 August 2019}}</ref><ref>{{cite web\|url=http://msdn.microsoft.com/en-us/library/67fa79wz%28v=VS.90%29.aspx \|title=Function Types \|publisher=Msdn.microsoft.com \|access-date=2013-09-29}}</ref> To reduce that overhead, many modern compilers try to delay the use of a call stack until it is really needed.{{Citation needed\|date=June 2011}} For example, the call of a procedure ''P'' may store the return address and parameters of the called procedure in certain processor registers, and transfer control to the procedure's body by a simple jump. If the procedure ''P'' returns without making any other call, the call stack is not used at all. If ''P'' needs to call another procedure ''Q'', it will then use the call stack to save the contents of any registers (such as the return address) that will be needed after ''Q'' returns.<!-- Mention caller-saves vs. callee-saves?--> <!---If the procedure or function itself uses stack handling commands, outside of the prologue and epilogue, e.g. to store intermediate calculation values, the programmer needs to keep track of the number of 'push' and 'pop' instructions so as not to corrupt the original return address.--> <!--Major cleanup needed from this point on--> == Features == {{uncited section\|date=August 2025}} In general, a callable unit is a list of instructions that, starting at the first instruction, executes sequentially except as directed via its internal logic. It can be invoked (called) many times during the [[Execution (computing)\|execution]] of a program. Execution continues at the next instruction after the call instruction when it [[Return statement\|returns]] control. Line 162 ⟶ 167: * Specify a return value in the function body * Call a function * Provide [[actual parameters]] that correspond to a called function's ~~actual~~formal parameters * [[return statement\|Return]] control to the caller at the point of call * Consume the return value in the caller Line 168 ⟶ 173: * Provide a private [[scope (computer science)\|naming scope]] for variables * Identify variables outside the function that are accessible within it * Propagate aan [[exception handling\|exceptional condition]] out of a function and to handle it in the calling context * Package functions into a container such as [[modular programming\|module]], [[library (computing)\|library]], [[object (computer science)\|object]], or [[class (computer programming)\|class]] Line 192 ⟶ 197: ! Convention !! Description !! Used in \|- \| [[Call by value\|by value]] \|\| A copy of the ~~argment~~argument is passed \|\| Default in most Algol-like languages after [[Algol 60]], such as Pascal, Delphi, Simula, CPL, PL/M, Modula, Oberon, Ada, and many others including C, C++ and Java \|- \| [[Call by reference\|by reference]] \|\| A reference to the argument is passed; typically its address \|\| Selectable in most Algol-like languages after [[Algol 60]], such as Algol 68, Pascal, Delphi, Simula, CPL, PL/M, Modula, Oberon, Ada, and many others including C++, Fortran, PL/I Line 216 ⟶ 221: In many contexts, a callable may have [[side-effect (computer science)\|side effect]] behavior such as modifying passed or global data, reading from or writing to a [[peripheral device]], accessing a [[computer file\|file]], halting the program or the machine, or temporarily pausing program execution. Side effects are considered ~~undesireble~~undesirable by [[Robert C. Martin]], who is known for promoting design principles. Martin argues that side effects can result in [[sequential coupling\|temporal coupling]] or order dependencies.<ref name="clean code">{{cite book \|last1=Martin \|first1=Robert C. \|title=Clean Code: A Handbook of Agile Software Craftsmanship \|date=Aug 1, 2008 \|publisher=[[Pearson PLC\|Pearson]] \|isbn=9780132350884 \|edition=1 \|url=https://www.oreilly.com/library/view/clean-code-a/9780136083238/ \|access-date=19 May 2024}}</ref> In strictly [[functional programming]] languages such as [[Haskell (programming language)\|Haskell]], a function can have no [[Side effect (computer science)\|side effects]], which means it cannot change the state of the program. Functions always return the same result for the same input. Such languages typically only support functions that return a value, since there is no value in a function that has neither return value nor side effect. Line 228 ⟶ 233: === Nested call {{endash}} recursion === If supported by the language, a callable may call itself, causing its execution to suspend while another ''nested'' execution of the same callable executes. [[Recursion (computer science)\|Recursion]] is a useful means to simplify some complex algorithms and break down complex problems. Recursive languages provide a new copy of local variables on each call. If the programmer desires the recursive callable to use the same variables instead of using locals, they typically declare them in a shared context such static or global. Languages going back to [[ALGOL]], [[PL/I]] and [[C (programming language)\|C]] and modern languages, almost invariably use a call stack, usually supported by the instruction sets to provide an activation record for each call. That way, a nested call can modify its local variables without ~~effecting~~affecting any of the suspended calls variables. Recursion allows direct implementation of functionality defined by [[mathematical induction]] and recursive [[divide and conquer algorithm]]s. Here is an example of a recursive function in C/C++ to find [[Fibonacci]] sequence\|Fibonacci numbers]]: <syntaxhighlight lang="c"> Line 250 ⟶ 255: Some languages, e.g., [[Ada (programming language)\|Ada]], [[Pascal (programming language)\|Pascal]], [[PL/I]], [[Python (programming language)\|Python]], support declaring and defining a function inside, e.g., a function body, such that the name of the inner is only visible within the body of the outer. ~~<!-- Would an example be helpful here? -->~~ A simple example in Pascal:<syntaxhighlight lang="pascal">function E(x: real): real; function F(y: real): real; begin F := x + y end; begin E := F(3) + F(4) end;</syntaxhighlight>The function <code>F</code> is nested within <code>E</code>. Note that <code>E</code>'s parameter <code>x</code> is also visible in <code>F</code> (as <code>F</code> is a part of <code>E</code>) while both <code>x</code> and <code>y</code> are invisible outside <code>E</code> and <code>F</code> respectively. === Reentrancy === Line 328 ⟶ 341: ==== Compiler optimization ==== Some optimizations for minimizing call overhead may seem straight forward, but cannot be used if the callable has side effects. For example, in the expression <code>(f(x)-1)/(f(x)+1)</code>, the function <code>f</code> cannot be called only once with its value used two times since the two calls may return different results. Moreover, in the few languages which define the order of evaluation of the division operator's operands, the value of <code>x</code> must be fetched again before the second call, since the first call may have changed it. Determining whether a callable has a side effect is difficult {{endash}} indeed, [[undecidable problem\|undecidable]] by virtue of [[Rice's theorem]]. So, while this optimization is safe in a purely functional programming language, a compiler for ana language not limited to functional typically assumes the worst case, that every callable may have side effects. ==== Inlining ==== Line 581 ⟶ 594: * [[Functional programming]] * [[Fused operation]] * [[~~Method~~Generator (computer programming)]]▼ * [[Intrinsic function]] * [[Lambda function (computer programming)]], a function that is not bound to an identifier * [[Logic programming]] ▲* [[Method (computer programming)]] * [[Modular programming]] * [[Operator overloading]]