High-level programming language

A high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate (or even hide entirely) significant areas of computing systems (e.g. memory management), making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.^[1]

High-level refers to a level of abstraction from the hardware details of a processor inherent in machine and assembly code. Rather than dealing with registers, memory addresses, and call stacks, high-level languages deal with variables, arrays, objects, arithmetic and Boolean expressions, functions, loops, threads, locks, and other computer science abstractions, intended to facilitate correctness and maintainability. Unlike low-level assembly languages, high-level languages have few, if any, language elements that translate directly to a machine's native opcodes. Other features, such as string handling, object-oriented programming features, and file input/output, may also be provided. A high-level language allows for source code that is detached and separated from the machine details. That is, unlike low-level languages such as assembly and machine code, high-level language code may result in data movements without the programmer's knowledge. Some control over which instructions are executed is handed to the compiler.
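As a minimal sketch of this kind of abstraction, the following Python fragment computes a mean using only variables, a list, and a function. The names average and readings are hypothetical and chosen for illustration; where the values are stored and which machine instructions run are left to the language implementation.

```python
def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # No registers, addresses, or stack frames appear in the source text.
    return sum(values) / len(values)

readings = [3.0, 4.5, 6.0]
print(average(readings))  # → 4.5
```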

History

In the 1960s, a high-level programming language using a compiler was commonly called an autocode.^[2] Examples of autocodes are COBOL and Fortran.^[3]

The first high-level programming language designed for computers was Plankalkül, created by Konrad Zuse.^[4] However, it was not implemented in his time, and his original contributions were largely isolated from other developments due to World War II, aside from the language's influence on the "Superplan" language by Heinz Rutishauser and, to some degree, on ALGOL. The first significantly widespread high-level language was Fortran, a machine-independent development of IBM's earlier Autocode systems. The ALGOL family, with ALGOL 58 defined in 1958 and ALGOL 60 defined in 1960 by committees of European and American computer scientists, introduced recursion as well as nested functions under lexical scope. ALGOL 60 was also the first language with a clear distinction between value and name-parameters and their corresponding semantics.^[5] ALGOL also introduced several structured programming concepts, such as the while-do and if-then-else constructs, and its syntax was the first to be described in a formal notation, Backus–Naur form (BNF). During roughly the same period, COBOL introduced records (also called structs) and Lisp introduced a fully general lambda abstraction in a programming language for the first time.
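The difference between value and name parameters can be sketched in a modern language. Python has no name-parameters, so the example below simulates them with zero-argument functions (thunks); this is an illustrative assumption rather than ALGOL 60 syntax, and all names are hypothetical.

```python
import itertools

def make_source():
    """Return a function that yields 1, 2, 3, ... on successive calls."""
    counter = itertools.count(1)
    return lambda: next(counter)

def double_by_value(x):
    # x was evaluated once, before the call.
    return x + x

def double_by_name(x):
    # x is a thunk: the argument expression is re-evaluated at each use.
    return x() + x()

source = make_source()
print(double_by_value(source()))  # → 2

source = make_source()
print(double_by_name(source))     # → 3
```

With a value parameter the argument expression is evaluated once (1 + 1), whereas the simulated name parameter evaluates it at each use (1 + 2).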

Abstraction penalty

A high-level language provides features that standardize common tasks, permit rich debugging, and maintain architectural agnosticism. On the other hand, a low-level language requires the coder to work at a lower level of abstraction, which is generally more challenging but allows for optimizations that are not possible with a high-level language. This abstraction penalty for using a high-level language instead of a low-level language is real, but in practice, low-level optimizations rarely improve performance at the user experience level.^[6]^[7]^[8] Nonetheless, code that needs to run quickly and efficiently may require the use of a lower-level language, even if a higher-level language would make the code easier to write and maintain. In many cases, critical portions of a program written mostly in a high-level language are coded in assembly in order to meet tight timing or memory constraints. A well-designed compiler for a high-level language can produce code comparable in efficiency to what could be coded by hand in assembly, and the higher-level abstractions sometimes allow for optimizations that beat the performance of hand-coded assembly.^[9] Since a high-level language is designed independently of a specific computing system architecture, a program written in such a language can run in any computing context with a compatible compiler or interpreter.
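One way to observe an abstraction penalty within a single language is to compare direct access to numbers with access through small wrapper objects. The sketch below does this in Python; the Meters class and both functions are hypothetical, and the measured times depend on the machine and interpreter, so no figures are claimed.

```python
import timeit

class Meters:
    """Hypothetical small wrapper object around a single number."""
    def __init__(self, value):
        self._value = value

    def value(self):
        return self._value

def total_raw(numbers):
    """Sum plain numbers with an explicit loop."""
    total = 0.0
    for n in numbers:
        total += n
    return total

def total_wrapped(lengths):
    """Sum the same numbers, reached through an accessor on each wrapper."""
    total = 0.0
    for length in lengths:
        total += length.value()
    return total

raw = [float(i) for i in range(10_000)]
wrapped = [Meters(x) for x in raw]

# Both versions compute the same result; only the cost of getting there differs.
assert total_raw(raw) == total_wrapped(wrapped)

print("raw    :", timeit.timeit(lambda: total_raw(raw), number=100))
print("wrapped:", timeit.timeit(lambda: total_wrapped(wrapped), number=100))
```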

Unlike a low-level language that is inherently tied to processor hardware, a high-level language can be improved, and new high-level languages can evolve from others with the goal of aggregating the most popular constructs with improved features. For example, Scala maintains backward compatibility with Java: code written in Java continues to be usable even if a developer switches to Scala. This makes the transition easier and extends the lifespan of a codebase. In contrast, low-level programs rarely survive beyond the system architecture for which they were written.

Relative meaning

The terms high-level and low-level are inherently relative, and languages can be compared as higher or lower level relative to each other. The C language, for instance, is considered either high-level or low-level depending on one's perspective. Regardless, most agree that C is higher level than assembly and lower level than most other languages.

C supports constructs such as expression evaluation, parameterized and recursive functions, and data types and structures, which are generally not supported in assembly or directly by a processor. However, C also provides lower-level features such as auto-increment and pointer arithmetic, and it lacks many higher-level abstractions common in other languages, such as garbage collection and a built-in string type. In the introduction of The C Programming Language (second edition) by Brian Kernighan and Dennis Ritchie, C is described as "not a very high level" language.^[10]

Assembly language is higher-level than machine code, but still closely tied to the processor hardware. However, assembly may provide some higher-level features such as macros, relatively limited expressions, constants, variables, procedures, and data structures.

Machine code is at a slightly higher level of abstraction than the microcode or micro-operations used internally in many processors.^[11]

Execution modes

There are three general modes of execution for modern high-level languages:

Interpreted

When code written in a language is interpreted, its syntax is read and then executed directly, with no compilation stage. A program called an interpreter reads each program statement, following the program flow, then decides what to do, and does it. A hybrid of an interpreter and a compiler will compile the statement into machine code and execute that; the machine code is then discarded, to be interpreted anew if the line is executed again. Interpreters are commonly the simplest implementations of the behavior of a language, compared to the other two variants listed here.
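The read-decide-execute cycle described above can be sketched with a toy interpreter. The three-statement language (set, add, print) and the function name interpret are invented for illustration and do not correspond to any real language.

```python
def interpret(program):
    """Execute a toy program one statement at a time; return the printed lines."""
    variables = {}
    output = []
    for line in program.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op, args = parts[0], parts[1:]
        # Decide what the statement means, then carry it out immediately.
        if op == "set":        # set NAME NUMBER
            variables[args[0]] = int(args[1])
        elif op == "add":      # add NAME NUMBER
            variables[args[0]] += int(args[1])
        elif op == "print":    # print NAME
            output.append(str(variables[args[0]]))
        else:
            raise ValueError(f"unknown statement: {line!r}")
    return output

program = """
set x 5
add x 3
print x
"""
print(interpret(program))  # → ['8']
```

Nothing is translated ahead of time: each statement is parsed again every time it is reached.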

Compiled

When code written in a language is compiled, its syntax is transformed into an executable form before running. There are two types of compilation:

Machine code generation: Some compilers compile source code directly into machine code. This is the original mode of compilation, and languages that are directly and completely transformed to machine-native code in this way may be called truly compiled languages. See assembly language.
Intermediate representations: When code written in a language is compiled to an intermediate representation, that representation can be optimized or saved for later execution without the need to re-read the source file. When the intermediate representation is saved, it may be in a form such as bytecode. The intermediate representation must then be interpreted or further compiled to execute it. Virtual machines that execute bytecode directly or transform it further into machine code have blurred the once clear distinction between intermediate representations and truly compiled languages.
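Compilation to an intermediate representation can be sketched as follows, assuming a toy stack-machine instruction set (PUSH, ADD, SUB, MUL) invented for this example. Python's standard ast module is used only to parse the expression; compile_expr and run are hypothetical names.

```python
import ast

OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source):
    """Translate an arithmetic expression into a list of stack-machine instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in OPS:
            emit(node.left)
            emit(node.right)
            code.append((OPS[type(node.op)], None))
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code

def run(code):
    """Execute the instruction list on a small stack-based virtual machine."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
            continue
        right = stack.pop()
        left = stack.pop()
        if op == "ADD":
            stack.append(left + right)
        elif op == "SUB":
            stack.append(left - right)
        elif op == "MUL":
            stack.append(left * right)
    return stack.pop()

bytecode = compile_expr("2 + 3 * 4")
print(bytecode)
print(run(bytecode))  # → 14
```

The source text is read once; the resulting instruction list could be saved and executed repeatedly without parsing the source again.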

Source-to-source translated or transcompiled: Code written in a language may be translated into terms of a lower-level language for which native code compilers are already common. JavaScript and the language C are common targets for such translators. See CoffeeScript, Chicken Scheme, and Eiffel as examples. Specifically, the generated C and C++ code can be seen (as generated from the Eiffel language when using the EiffelStudio IDE) in the EIFGENs directory of any compiled Eiffel project. In Eiffel, the translation process is referred to as transcompiling or transcompiled, and the Eiffel compiler as a transcompiler or source-to-source compiler.
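A source-to-source translator can be sketched in a few lines. The example below translates a toy prefix notation into Python source text and then hands the result to the existing Python implementation; the notation and the function names are invented for illustration and are unrelated to the tools named above.

```python
def tokenize(text):
    """Split prefix-notation text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Build a nested list from a token list (consumes tokens from the front)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard ")"
        return items
    return token

def translate(tree):
    """Emit Python source text for a parsed prefix expression."""
    if isinstance(tree, list):
        op, *operands = tree
        if op not in {"+", "-", "*"} or len(operands) < 2:
            raise ValueError(f"unsupported form: {tree!r}")
        return "(" + f" {op} ".join(translate(t) for t in operands) + ")"
    int(tree)  # only integer literals are accepted; raises ValueError otherwise
    return tree

python_source = translate(parse(tokenize("(+ 1 (* 2 3))")))
print(python_source)        # → (1 + (2 * 3))
print(eval(python_source))  # → 7
```

The translator itself never evaluates anything; execution is delegated to the implementation of the target language.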

Note that languages are not strictly interpreted languages or compiled languages. Rather, implementations of language behavior use interpreting or compiling. For example, ALGOL 60 and Fortran have both been interpreted (even though they were more typically compiled). Similarly, Java shows the difficulty of trying to apply these labels to languages, rather than to implementations; Java is compiled to bytecode which is then executed by either interpreting (in a Java virtual machine (JVM)) or compiling (typically with a just-in-time compiler such as HotSpot, again in a JVM). Moreover, compiling, transcompiling, and interpreting are not strictly limited to only a description of the compiler artifact (binary executable or IL assembly).
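The same point can be seen in Python, which is often described as interpreted. In the CPython implementation, the built-in compile function first turns source text into a bytecode object, which is then executed. The sketch below assumes CPython; the source string and variable names are hypothetical.

```python
source = "result = 6 * 7"

# Step 1: compile the source text into a code object holding bytecode.
code = compile(source, "<example>", "exec")
print(type(code.co_code))   # the raw bytecode is a bytes object

# Step 2: execute the compiled code object in a fresh namespace.
namespace = {}
exec(code, namespace)
print(namespace["result"])  # → 42
```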

High-level language computer architecture

A high-level language computer architecture refers to processor hardware designed to directly interpret high-level language source code. The Burroughs large systems were target machines for ALGOL 60, for example.^[12]

References

[1] "HThreads - RD Glossary". Archived from the original on 26 August 2007.
[2] London, Keith (1968). "4, Programming". Introduction to Computers. 24 Russell Square London WC1: Faber and Faber Limited. p. 184. ISBN 0571085938. The 'high' level programming languages are often called autocodes and the processor program, a compiler.
[3] London, Keith (1968). "4, Programming". Introduction to Computers. 24 Russell Square London WC1: Faber and Faber Limited. p. 186. ISBN 0571085938. Two high level programming languages which can be used here as examples to illustrate the structure and purpose of autocodes are COBOL (Common Business Oriented Language) and FORTRAN (Formular Translation).
[4] Giloi, Wolfgang K. (1997). "Konrad Zuse's Plankalkül: The First High-Level "non von Neumann" Programming Language". IEEE Annals of the History of Computing, vol. 19, no. 2, pp. 17–24, April–June, 1997.
[5] Although it lacked a notion of reference-parameters, which could be a problem in some situations. Several successors, including ALGOL W, ALGOL 68, Simula, Pascal, Modula and Ada thus included reference-parameters (The related C-language family instead allowed addresses as value-parameters).
[6] Surana P (2006). "Meta-Compilation of Language Abstractions" (PDF). Archived (PDF) from the original on 17 February 2015. Retrieved 17 March 2008.
[7] Kuketayev, Argyn. "The Data Abstraction Penalty (DAP) Benchmark for Small Objects in Java". Application Development Trends. Archived from the original on 11 January 2009. Retrieved 17 March 2008.
[8] Chatzigeorgiou; Stephanides (2002). "Evaluating Performance and Power Of Object-Oriented Vs. Procedural Programming Languages". In Blieberger; Strohmeier (eds.). Proceedings - 7th International Conference on Reliable Software Technologies - Ada-Europe'2002. Springer. p. 367.
[9] Manuel Carro; José F. Morales; Henk L. Muller; G. Puebla; M. Hermenegildo (2006). "High-level languages for small devices: a case study" (PDF). Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems. ACM.
[10] Kernighan, Brian W.; Ritchie, Dennis M. (1988). The C Programming Language: 2nd Edition. Prentice Hall. ISBN 9780131103627. Archived from the original on 25 October 2022. Retrieved 25 October 2022.
[11] Hyde, Randall. (2010). The art of assembly language (2nd ed.). San Francisco: No Starch Press. ISBN 9781593273019. OCLC 635507601.
[12] Chu, Yaohan (1975), "Concepts of High-Level Language Computer Architecture", High-Level Language Computer Architecture, Elsevier, pp. 1–14, doi:10.1016/b978-0-12-174150-1.50007-0, ISBN 9780121741501

External links

http://c2.com/cgi/wiki?HighLevelLanguage - The WikiWikiWeb's article on high-level programming languages
