Function (computer programming): Difference between revisions

Content deleted Content added
History: Removed citation b/c the page in the book (155) doesn't support the assertion. In fact, page 155 is the references page for chapter 10.
m Reverted edit by 192.145.175.229 (talk) to last version by Chatul
 
(47 intermediate revisions by 29 users not shown)
Line 2:
{{Other uses|Function (disambiguation){{!}}Function}}
{{Use dmy dates|date=April 2022}}
{{More citations needed|date=February 2013}}
{{anchor|SUBROUTINE_DEFINITION}}
 
In [[computer programming]], a '''function''', (also '''subprogramprocedure''', '''proceduremethod''', '''methodsubroutine''', '''routine''', or '''subroutinesubprogram''') is a '''callable unit'''<ref>{{cite web
|title=Terminology Glossary
|url=https://pages.nist.gov/ElectionGlossary/
Line 11 ⟶ 10:
|publisher=NIST
|quote=Callable unit: (Of a software program or logical design) Function, method, operation, subroutine, procedure, or analogous structural unit that appears within a module.
|access-date=9 February 2024}}</ref> of [[software logic]] that has a well-defined [[Interface (computing)|interface]] and [[behavior]] and can be invoked bymultiple other software units to exhibit that behaviortimes.
 
Callable units provide a powerful programming tool.<ref name="knuth1">{{cite book |title= The Art of Computer Programming, Volume I: Fundamental Algorithms |author= Donald E. Knuth |year= 1997 |author-link= Donald Knuth |publisher= Addison-Wesley |isbn=0-201-89683-4}}</ref> The primary purpose is to allow for the decomposition of a large and/or complicated problem into chunks that have relatively low [[cognitive load]] and to assign the chunks meaningful names (unless they are anonymous). Judicious application can reduce the cost of developing and maintaining software, while increasing its quality and reliability.<ref name="structprog">{{cite book |author= O.-J. Dahl |author2=E. W. Dijkstra |author3=C. A. R. Hoare |title= Structured Programming |publisher= Academic Press |year= 1972 |isbn= 0-12-200550-3}}</ref>
 
Callable units are present at multiple levels of [[abstraction]] in the programming environment. For example, a [[programmer]] may write a function in [[source code]] that is compiled to [[machine code]] that implements similar [[semantics]]. There is a callable unit in the source code and an associated one in the machine code, but they are different kinds of callable units {{endash}} with different implications and features.
 
==Terminology==
 
Some [[programming language]]s, such as [[COBOL]] and [[BASIC]], make a distinction between functions that return a value (typically called "functions") and those that do not (typically called "subprogram", "subroutine", or "procedure"); some, such as [[C (programming language)|C]], [[C++]], and [[Rust (programming language)|Rust]], only use the term "function" irrespective of whether they return a value or not; others, such as [[ALGOL 60]] and [[PL/I]], only use the word ''procedure''. Some [[Object-oriented programming|object-oriented]] languages, such as [[Java (programming language)|Java]] and [[C Sharp (programming language)|C#]], refer to functions inside [[Class (computer programming)|classes]] as "[[Method (computer programming)|method]]s".
 
==History==
 
The idea of a callable unit was initially conceived by [[John Mauchly]] and [[Kathleen Antonelli]] during their work on [[ENIAC]] and recorded in a January 1947 Harvard symposium on "Preparation of Problems for [[EDVAC]]-type Machines"."<ref name=mauchly0>{{cite book |first=J.W. |last=Mauchly |chapter=Preparation of Problems for EDVAC-Type Machines |author-link=John Mauchly |editor-first=Brian |editor-last=Randell |title=The Origins of Digital Computers |doi=10.1007/978-3-642-61812-3_31 |publisher=Springer|date=1982|pages=393–397 |isbn=978-3-642-61814-7 |chapter-url=https://archive.org/details/originsofdigital0000rand/page/365}}</ref> [[Maurice Wilkes]], [[David Wheeler (British computer scientist)|David Wheeler]], and [[Stanley Gill]] are generally credited with the formal invention of this concept, which they termed a ''closed sub-routine'',<ref>{{Cite conference |last1 = Wheeler |first1 = D. J. |author-link1 = David Wheeler (computer scientist) |chapter = The use of sub-routines in programmes |doi = 10.1145/609784.609816 |title = Proceedings of the 1952 ACM national meeting (Pittsburgh) on - ACM '52 |pages = 235 |year = 1952 |chapter-url = http://www.laputan.org/pub/papers/wheeler.pdf|doi-access = free }}</ref><ref>{{cite book |last1= Wilkes |first1= M. V. |last2= Wheeler |first2= D. J. |last3= Gill |first3=S. |title= Preparation of Programs for an Electronic Digital Computer |publisher= Addison-Wesley |year= 1951}}</ref> contrasted with an ''open subroutine'' or [[Macro (computer science)|macro]].<ref>{{cite encyclopedia |last=Dainith |first=John |title="open subroutine." A Dictionary of Computing | date=2004 |url=http://www.encyclopedia.com/doc/1O11-opensubroutine.html |encyclopedia=Encyclopedia.com |access-date=January 14, 2013}}</ref> However, [[Alan Turing]] had discussed subroutines in a paper of 1945 on design proposals for the NPL [[Automatic Computing Engine|ACE]], going so far as to invent the concept of a [[Call stack|return address stack]].<ref>{{Citation | last = Turing | first =Alan M. | author-link = Alan Turing | title = Report by Dr. A.M. Turing on proposals for the development of an Automatic Computing Engine (ACE): Submitted to the Executive Committee of the NPL in February 1946 | year = 1945 }} reprinted in {{Cite book | editor-last = Copeland | editor-first = B. J. | editor-link = Jack Copeland | title = Alan Turing's Automatic Computing Engine | ___location = Oxford | publisher = Oxford University Press | publication-date = 2005 | isbn = 0-19-856593-3 | year = 2005 | url-access = registration | url = https://archive.org/details/alanturingsautom0000unse |page=383}}</ref>
 
The idea of a subroutine was worked out after computing machines had already existed for some time. The arithmetic and conditional jump instructions were planned ahead of time and have changed relatively little, but the special instructions used for procedure calls have changed greatly over the years. The earliest computers and microprocessors, such as the [[Manchester Baby]], and some early microprocessors, such as the [[RCA 1802]], did not have a single subroutine call instruction. Subroutines could be implemented, but they required programmers to use the call sequence—a series of instructions—at each [[call site]].
 
Subroutines were implemented in [[Konrad Zuse]]'s [[Z4 (computer)|Z4]] in 1945.
 
In 1945, [[Alan M. Turing]] used the terms "bury" and "unbury" as a means of calling and returning from subroutines.<ref name="Turing_1945">{{citation |author-first=Alan Mathison |author-last=Turing |author-link=Alan Mathison Turing |title=Proposals for Development in the Mathematics Division of an Automatic Computing Engine (ACE) |date=1946-03-19 |orig-year=1945}} (NB. Presented on 1946-03-19 before the Executive Committee of the National Physical Laboratory (Great Britain).)</ref><ref name="Carpenter_1977">{{cite journal |title=The other Turing machine |author-last1=Carpenter |author-first1=Brian Edward |author-link1=Brian Carpenter (Internet engineer) |author-last2=Doran |author-first2=Robert William |author-link2=Robert William Doran |date=1977-01-01 |orig-year=October 1975 |doi=10.1093/comjnl/20.3.269 |volume=20 |issue=3 |journal=[[The Computer Journal]] |pages=269–279|doi-access=free }} (11 pages)</ref>
 
In January 1947 John Mauchly presented general notes at 'A Symposium of Large Scale Digital Calculating Machinery'
under the joint sponsorship of Harvard University and the Bureau of Ordnance, United States Navy. Here he discusses serial and parallel operation suggesting
{{block quoteblockquote|...the structure of the machine need not be complicated one bit. It is possible, since all the logical characteristics essential to this procedure are available, to evolve a coding instruction for placing the subroutines in the memory at places known to the machine, and in such a way that they may easily be called into use.{{pb}}In other words, one can designate subroutine A as division and subroutine B as complex multiplication and subroutine C as the evaluation of a standard error of a sequence of numbers, and so on through the list of subroutines needed for a particular problem. ... All these subroutines will then be stored in the machine, and all one needs to do is make a brief reference to them by number, as they are indicated in the coding.<ref name=mauchly0/>}}
 
[[Kathleen Antonelli|Kay McNulty]] had worked closely with John Mauchly on the [[ENIAC]] team and developed an idea for subroutines for the [[ENIAC]] computer she was programming during World War II.<ref name=":0">{{Cite web|url=http://fortune.com/2014/09/18/walter-isaacson-the-women-of-eniac/|title=Walter Isaacson on the Women of ENIAC|last=Isaacson|first=Walter|date=18 September 2014|website=Fortune|language=en|archive-url=https://web.archive.org/web/20181212003245/http://fortune.com/2014/09/18/walter-isaacson-the-women-of-eniac/|archive-date=12 December 2018|access-date=2018-12-14}}</ref> She and the other ENIAC programmers used the subroutines to help calculate missile trajectories.<ref name=":0" />
Line 44 ⟶ 47:
| access-date = February 8, 2024
}}
</ref> (1961) is one of the first computers to store subroutine return data on a stack.
 
The DEC [[PDP-6]]<ref>{{cite book
Line 137 ⟶ 140:
 
====Delayed stacking {{anchor|Leaf procedure}}{{anchor|Leaf function}}====
 
One disadvantage of the call stack mechanism is the increased cost of a procedure call and its matching return.{{ clarify | date = November 2015 | reason = increased relative to what?}} The extra cost includes incrementing and decrementing the stack pointer (and, in some architectures, checking for [[stack overflow]]), and accessing the local variables and parameters by frame-relative addresses, instead of absolute addresses. The cost may be realized in increased execution time, or increased processor complexity, or both.
 
This overhead is most obvious and objectionable in ''leaf procedures'' or ''[[leaf functionsfunction]]s'', which return without making any procedure calls themselves.<ref>{{cite web|url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka13785.html |title=ARM Information Center |publisher=Infocenter.arm.com |access-date=2013-09-29}}</ref><ref>{{cite web |title=x64 stack usage |url=https://docs.microsoft.com/en-us/cpp/build/stack-usage |website=Microsoft Docs |publisher=Microsoft |access-date=5 August 2019}}</ref><ref>{{cite web|url=http://msdn.microsoft.com/en-us/library/67fa79wz%28v=VS.90%29.aspx |title=Function Types |publisher=Msdn.microsoft.com |access-date=2013-09-29}}</ref> To reduce that overhead, many modern compilers try to delay the use of a call stack until it is really needed.{{Citation needed|date=June 2011}} For example, the call of a procedure ''P'' may store the return address and parameters of the called procedure in certain processor registers, and transfer control to the procedure's body by a simple jump. If the procedure ''P'' returns without making any other call, the call stack is not used at all. If ''P'' needs to call another procedure ''Q'', it will then use the call stack to save the contents of any registers (such as the return address) that will be needed after ''Q'' returns.<!-- Mention caller-saves vs. callee-saves?-->
<!---If the procedure or function itself uses stack handling commands, outside of the prologue and epilogue, e.g. to store intermediate calculation values, the programmer needs to keep track of the number of 'push' and 'pop' instructions so as not to corrupt the original return address.-->
<!--Major cleanup needed from this point on-->
 
== Features ==
{{uncited section|date=August 2025}}
 
In general, a callable unit is a list of instructions that, starting at the first instruction, executes sequentially except as directed via its internal logic. It can be invoked (called) many times during the [[Execution (computing)|execution]] of a program. Execution continues at the next instruction after the call instruction when it [[Return statement|returns]] control.
Line 151 ⟶ 156:
The features of [[implementation]]s of callable units evolved over time and varies by context.
This section describes features of the various common implementations.
 
=== General characteristics ===
 
Line 161 ⟶ 167:
* Specify a return value in the function body
* Call a function
* Provide [[actual parameters]] that correspond to a called function's actualformal parameters
* [[return statement|Return]] control to the caller at the point of call
* Consume the return value in the caller
Line 167 ⟶ 173:
* Provide a private [[scope (computer science)|naming scope]] for variables
* Identify variables outside the function that are accessible within it
* Propagate aan [[exception handling|exceptional condition]] out of a function and to handle it in the calling context
* Package functions into a container such as [[modular programming|module]], [[library (computing)|library]], [[object (computer science)|object]], or [[class (computer programming)|class]]
 
Line 177 ⟶ 183:
=== Call syntax ===
 
If declared to return a value, a call can be embedded in an [[expression (programming)|expression]] in order to consume the return value. For example, a square root callable unit might be called like <code>y = sqrt(x)</code>.
 
A callable unit that does not return a value is called as a stand-alone [[statement (computer science)|statement]] like <code>print("hello")</code>. This syntax can also be used for a callable unit that returns a value, but the return value will be ignored.
Line 191 ⟶ 197:
! Convention !! Description !! Used in
|-
| [[Call by value|by value]] || A copy of the argmentargument is passed || Default in most Algol-like languages after [[Algol 60]], such as Pascal, Delphi, Simula, CPL, PL/M, Modula, Oberon, Ada, and many others including C, C++ and Java
|-
| [[Call by reference|by reference]] || A reference to the argument is passed; typically its address || Selectable in most Algol-like languages after [[Algol 60]], such as Algol 68, Pascal, Delphi, Simula, CPL, PL/M, Modula, Oberon, Ada, and many others including C++, Fortran, PL/I
Line 208 ⟶ 214:
In some languages, such as BASIC, a callable has different syntax (i.e. keyword) for a callable that returns a value vs. one that does not.
In other languages, the syntax is the same regardless.
In some of these languages an extra keyword is used to declare no return value; for exmapleexample <code>void</code> in C, C++ and C#.
In some languages, such as Python, the difference is whether the body contains a return statement with a value, and a particular callable may return with or without a value based on control flow.
 
Line 214 ⟶ 220:
 
In many contexts, a callable may have [[side-effect (computer science)|side effect]] behavior such as modifying passed or global data, reading from or writing to a [[peripheral device]], accessing a [[computer file|file]], halting the program or the machine, or temporarily pausing program execution.
 
Side effects are considered undesirable by [[Robert C. Martin]], who is known for promoting design principles. Martin argues that side effects can result in [[sequential coupling|temporal coupling]] or order dependencies.<ref name="clean code">{{cite book |last1=Martin |first1=Robert C. |title=Clean Code: A Handbook of Agile Software Craftsmanship |date=Aug 1, 2008 |publisher=[[Pearson PLC|Pearson]] |isbn=9780132350884 |edition=1 |url=https://www.oreilly.com/library/view/clean-code-a/9780136083238/ |access-date=19 May 2024}}</ref>
 
In strictly [[functional programming]] languages such as [[Haskell (programming language)|Haskell]], a function can have no [[Side effect (computer science)|side effects]], which means it cannot change the state of the program. Functions always return the same result for the same input. Such languages typically only support functions that return a value, since there is no value in a function that has neither return value nor side effect.
 
{{Anchor|LVRR}} <!--what is this?-->
 
=== Local variables ===
 
Line 224 ⟶ 233:
=== Nested call {{endash}} recursion ===
 
If supported by the language, a callable may call itself, causing its execution to suspend while another ''nested'' execution of the same callable executes. [[Recursion (computer science)|Recursion]] is a useful means to simplify some complex algorithms and break down complex problems. Recursive languages provide a new copy of local variables on each call. If the programmer desires the recursive callable to use the same variables instead of using locals, they typically declare them in a shared context such static or global.
 
Languages going back to [[ALGOL]], [[PL/I]] and [[C (programming language)|C]] and modern languages, almost invariably use a call stack, usually supported by the instruction sets to provide an activation record for each call. That way, a nested call can modify its local variables without effectingaffecting any of the suspended calls variables.
 
Recursion allows direct implementation of functionality defined by [[mathematical induction]] and recursive [[divide and conquer algorithm]]s. Here is an example of a recursive function in C/C++ to find [[Fibonacci]] sequence|Fibonacci numbers]]:
 
<syntaxhighlight lang="c">
Line 245 ⟶ 254:
{{main|Nested function}}
 
Some languages, suche.g., as[[Ada (programming language)|Ada]], [[Pascal (programming language)|Pascal]], [[PL/I]], and [[AdaPython (programming language)|AdaPython]], support declaring, a.k.a.and nesting,defining a function inside, e.g., a function body {{endash}}, such that the name of the inner is only visible within the body of the outer.
 
A simple example in Pascal:<syntaxhighlight lang="pascal">function E(x: real): real;
function F(y: real): real;
begin
F := x + y
end;
begin
E := F(3) + F(4)
end;</syntaxhighlight>The function <code>F</code> is nested within <code>E</code>. Note that <code>E</code>'s parameter <code>x</code> is also visible in <code>F</code> (as <code>F</code> is a part of <code>E</code>) while both <code>x</code> and <code>y</code> are invisible outside <code>E</code> and <code>F</code> respectively.
 
=== Reentrancy ===
Line 323 ⟶ 341:
==== Compiler optimization ====
 
Some optimizations for minimizing call overhead may seem straight forward, but cannot be used if the callable has side effects. For example, in the expression <code>(f(x)-1)/(f(x)+1)</code>, the function <code>f</code> cannot be called only once with its value used two times since the two calls may return different results. Moreover, in the few languages which define the order of evaluation of the division operator's operands, the value of <code>x</code> must be fetched again before the second call, since the first call may have changed it. Determining whether a callable has a side effect is difficult {{endash}} indeed, [[undecidable problem|undecidable]] by virtue of [[Rice's theorem]]. So, while this optimization is safe in a purely functional programming language, a compiler for ana language not limited to functional typically assumes the worst case, that every callable may have side effects.
 
==== Inlining ====
Line 341 ⟶ 359:
{{Anchor|Built-in function}}
{{main|Intrinsic function}}
A ''built-in function'', or ''builtin function'', or ''intrinsic function'', is a function for which the compiler generates code at [[compile time]] or provides in a way other than for other functions.<ref>{{cite web |title=Built-in functions |url=https://www.ibm.com/docs/en/zos/2.2.0?topic=lf-built-in-functions |website=ibm.com | date=9 March 2017 |access-date=December 25, 2023}}</ref> A built-in function does not need to be defined like other functions since it is ''built in'' to the programming language.<ref>{{cite book |title=Study Material Python |date=April 2023 |page=87 |url=https://books.google.com/books?id=d0nhEAAAQBAJ&pg=PA87 |access-date=December 25, 2023}}</ref>
 
== Programming ==
Line 352 ⟶ 370:
 
* [[Decomposition (computer science)|Decomposing]] a complex programming task into simpler steps: this is one of the two main tools of [[structured programming]], along with [[data structure]]s
 
* Reducing [[duplicate code]] within a program
 
* Enabling [[Code reuse|reuse of code]] across multiple programs
 
* Dividing a large programming task among various programmers or various stages of a project
 
* [[Information hiding|Hiding implementation details]] from users of the function
 
* Improving readability of code by replacing a block of code with a function call where a [https://books.google.com/books?id=_i6bDeoCQzsC&pg=PA39&dq=descriptive+function+name descriptive] function name serves to describe the block of code. This makes the calling code concise and readable even if the function is not meant to be reused.
 
* Improving [[Traceability#Software|traceability]] (i.e. most languages offer ways to obtain the call trace which includes the names of the involved functions and perhaps even more information such as file names and line numbers); by not decomposing the code into functions, debugging would be severely impaired
 
Line 373 ⟶ 385:
=== Conventions ===
 
Many programming conventions have been developed regarding callables.
 
With respect to naming, many developers name a callable with a phrase starting with a [[verb]] when it does a certain task, with an [[adjective]] when it makes an inquiry, and with a [[noun]] when it is used to substitute variables.
Line 379 ⟶ 391:
Some programmers suggest that a callable should perform exactly one task, and if it performs more than one task, it should be split up into multiple callables. They argue that callables are key components in [[software maintenance]], and their roles in the program must remain distinct.
 
Proponents of [[modular programming]] advocate that each callable should have minimal dependency on the rest of the [[codebase]]. For example, the use of [[global variable]]s is generally deemed unwise, because it adds coupling between all callables that use the global variables. If such coupling is not necessary, they advise to [[code refactoring|refactor]] callables to accept passed [[parameter (computer programming)|parameters]] instead.
 
== Examples ==
Line 401 ⟶ 413:
===Small Basic===
 
In [[Microsoft Small Basic]], targeted to the student first leaninglearning how to program in a text-based language, a callable unit is called a ''subroutine''.
The <code>Sub</code> keyword denotes the start of a subroutine and is followed by a name identifier. Subsequent lines are the body which ends with the <code>EndSub</code> keyword.
<ref>{{cite web |title=Small Basic |url=https://smallbasic-publicwebsite.azurewebsites.net/ |website=Small Basic |access-date=8 February 2024}}</ref>
Line 412 ⟶ 424:
 
This can be called as <code>SayHello()</code>.
<ref>{{cite web |title=Small Basic Getting Started Guide: Chapter 9: Subroutines |date=17 January 2024 |url=https://social.technet.microsoft.com/wiki/contents/articles/16077.small-basic-getting-started-guide-chapter-9-subroutines.aspx |publisher=Microsoft }}</ref>
 
===Visual Basic===
 
In later versions of [[Visual Basic]] (VB), including the [[Visual Basic (.NET)|latest product line]] and [[Visual Basic (classic)|VB6]], the term ''procedure'' is used for the callable unit concept. The keyword <code>Sub</code> is used to return no value and <code>Function</code> to return a value. When used in the context of a class, a procedure is a method.
<ref>{{cite web |title=Procedures in Visual Basic |url=https://learn.microsoft.com/en-us/dotnet/visual-basic/programming-guide/language-features/procedures/ |website=Microsoft Learn |date=15 September 2021 |access-date=8 February 2024}}</ref>
 
Each parameter has a [[data type]] that can be specified, but if not, defaults to <code>Object</code> for later versions based on [[.NET]] and [[variant type|variant]] for [[Visual Basic 6|VB6]].<ref>{{cite web |title=Dim statement (Visual Basic) |url=https://learn.microsoft.com/en-us/dotnet/visual-basic/language-reference/statements/dim-statement |website=Microsoft Learn |date=15 September 2021 |access-date=8 February 2024}}</ref>
 
VB supports parameter passing conventions [[call by value|by value]] and [[call by reference|by reference]] via the keywords <code>ByVal</code> and <code>ByRef</code>, respectively.
Line 582 ⟶ 594:
* [[Functional programming]]
* [[Fused operation]]
* [[MethodGenerator (computer programming)]]
* [[Intrinsic function]]
* [[Lambda function (computer programming)]], a function that is not bound to an identifier
* [[Logic programming]]
* [[Method (computer programming)]]
* [[Modular programming]]
* [[Operator overloading]]