Low-level programming language: Difference between revisions

m Reverted edits by 89.197.109.74 (talk) (HG) (3.4.10)
m Links
Line 6:
}}
 
A '''low-level programming language''' is a [[programming language]] that provides little or no [[Abstraction (computer science)|abstraction]] from a computer's [[instruction set architecture]]; commands or functions in the language are structurally similar to the processor's instructions. Generally, this refers to either [[machine code]] or [[assembly language]]. Because there is so little abstraction between the language and machine language (hence the word "low"), low-level languages are sometimes described as being "close to the hardware". Programs written in low-level languages tend to be relatively [[Software portability|non-portable]], because they are tied to a specific type of system architecture.
 
Low-level languages can convert to machine code without a [[compiler]] or [[Interpreter (computing)|interpreter]]; [[second-generation programming language]]s use a simpler processor called an [[Assembly language#Assemble|assembler]], and the resulting code runs directly on the processor. A program written in a low-level language can be made to run very quickly, with a small [[memory footprint]]. An equivalent program in a [[high-level language]] can be less efficient and use more memory. Low-level languages are simple, but are considered difficult to use because of the numerous technical details that the programmer must remember. By comparison, a [[high-level programming language]] isolates the execution semantics of a computer architecture from the specification of the program, which simplifies development.
 
== Machine code ==
[[File:Digital pdp8-e2.jpg|thumb|Front panel of a PDP-8/E minicomputer. The row of switches at the bottom can be used to toggle in a machine language program.]]
[[Machine code]] is the only language a computer can process directly without a previous transformation. Currently, programmers almost never write programs directly in machine code, because it requires attention to numerous details that a high-level language handles automatically. Furthermore, it requires memorizing or looking up numerical codes for every instruction, and is extremely difficult to modify.
 
True ''machine code'' is a stream of raw, usually [[Binary code|binary]], data. A programmer coding in "machine code" normally codes instructions and data in a more readable form such as [[decimal]], [[octal]], or [[hexadecimal]], which is then translated to the internal format by a program called a [[Loader (computing)|loader]] or toggled into the computer's memory from a [[front panel]].
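The translation from a readable hexadecimal listing to raw bytes can be sketched in C. This is only an illustration of the idea, not how any particular loader works; the function names <code>hex_digit_value</code> and <code>hex_to_bytes</code> are hypothetical, and the sketch assumes the listing contains only hexadecimal digits separated by optional whitespace.

```c
#include <ctype.h>
#include <stddef.h>

/* Convert one hexadecimal digit to its value, or -1 if it is not a hex digit. */
static int hex_digit_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Hypothetical helper: decode a hexadecimal listing into raw bytes.
 * Whitespace between digits is skipped. Returns the number of bytes
 * written, or -1 if the listing is malformed or `out` is too small.
 */
long hex_to_bytes(const char *text, unsigned char *out, size_t out_size)
{
    size_t count = 0;
    int high = -1; /* pending high nibble, or -1 if none is pending */

    for (; *text != '\0'; ++text) {
        unsigned char ch = (unsigned char)*text;
        int value;

        if (isspace(ch))
            continue;
        value = hex_digit_value(ch);
        if (value < 0)
            return -1; /* not a hexadecimal digit */
        if (high < 0) {
            high = value; /* first digit of a byte */
        } else {
            if (count >= out_size)
                return -1; /* output buffer is full */
            out[count++] = (unsigned char)((high << 4) | value);
            high = -1;
        }
    }
    return (high < 0) ? (long)count : -1; /* an odd digit count is an error */
}
```

Given the text <code>5B C3</code>, for instance, the sketch writes the two bytes 0x5B and 0xC3 into the output buffer and returns 2.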
 
Although few programs are written in machine code, programmers often become adept at reading it through working with [[core dump]]s or debugging from the front panel.
 
Example: A function in hexadecimal representation of 32-bit [[x86]] machine code to calculate the ''n''th [[Fibonacci number]]:
Line 24:
C14AEBF1 5BC3
 
== Assembly language ==
Second-generation languages provide one abstraction level on top of the machine code. In the early days of coding on computers like [[TX-0]] and [[PDP-1]], the first thing [[MIT]] [[Hacker culture|hackers]] did was to write assemblers.<ref>{{cite book|last=Levy|first=Steven|year=1994|title=Hackers: Heroes of the Computer Revolution|title-link=Hackers: Heroes of the Computer Revolution|publisher=Penguin Books|page=32|isbn=0-14-100051-1}}</ref>
[[Assembly language]] has little [[Semantics (computer science)|semantics]] or formal specification, being only a mapping of human-readable symbols, including symbolic addresses, to [[opcode]]s, [[memory address|addresses]], numeric constants, [[string (computer science)|strings]] and so on. Typically, one [[machine instruction (computing)|machine instruction]] is represented as one line of assembly code. Assemblers produce [[object file]]s that can [[linker (computing)|link]] with other object files or be [[loader (computing)|loaded]] on their own.
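
The core of this mapping can be illustrated with a toy lookup table in C. This is a minimal sketch, not a real assembler: it handles only a few operand-less x86 instructions with single-byte opcodes, and the names <code>opcode_entry</code>, <code>OPCODES</code> and <code>assemble_one</code> are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One table entry: a mnemonic and the single-byte x86 opcode it stands for. */
struct opcode_entry {
    const char *mnemonic;
    unsigned char opcode;
};

/* A tiny subset of operand-less x86 instructions, for illustration only. */
static const struct opcode_entry OPCODES[] = {
    { "nop",  0x90 },
    { "ret",  0xC3 },
    { "hlt",  0xF4 },
    { "int3", 0xCC },
};

/*
 * Hypothetical helper: translate one mnemonic into its opcode byte.
 * Returns 1 and stores the byte in *out on success, 0 if the
 * mnemonic is not in the table.
 */
int assemble_one(const char *mnemonic, unsigned char *out)
{
    size_t i;

    for (i = 0; i < sizeof OPCODES / sizeof OPCODES[0]; ++i) {
        if (strcmp(mnemonic, OPCODES[i].mnemonic) == 0) {
            *out = OPCODES[i].opcode;
            return 1;
        }
    }
    return 0;
}
```

A real assembler additionally encodes operands, resolves symbolic addresses and emits an object file; the table lookup shown here is only the one-to-one step from symbol to opcode.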
 
Most assemblers provide [[macro (computer science)|macros]] to generate common sequences of instructions.
Line 47:
</syntaxhighlight>
 
In this code example, hardware features of the x86-64 processor (its [[Processor register|registers]]) are named and manipulated directly. The function loads its input from ''%edi'' in accordance with the [[x86 calling conventions#System V AMD64 ABI|System V ABI]] and performs its calculation by manipulating values in the '''EAX''', '''EBX''', and '''ECX''' registers until it has finished and returns. Note that in this assembly language, there is no concept of returning a value. Once the result has been stored in the '''EAX''' register, the '''RET''' instruction simply transfers control to the code location stored on the stack (usually the instruction immediately after the one that called this function), and it is up to the author of the calling code to know that this function stores its result in '''EAX''' and to retrieve it from there. x86-64 assembly language imposes no standard for returning values from a function (and in fact has no concept of a function); it is up to the calling code to examine the state after the procedure returns if it needs to extract a value.
 
Compare this with the same function in [[C (programming language)|C]], a [[high-level language]]:
Line 70:
This code is very similar in structure to the assembly language example but there are significant differences in terms of abstraction:
 
* The input (parameter '''n''') is an abstraction that does not specify any storage location on the hardware. In practice, the C compiler follows one of many possible [[calling convention]]s to determine a storage location for the input.
* The assembly language version loads the input parameter from the stack into a register and in each iteration of the loop decrements the value in the register, never altering the value in the memory location on the stack. The C compiler could load the parameter into a register and do the same or could update the value wherever it is stored. Which one it chooses is an implementation decision completely hidden from the code author (and one with no [[Side effect (computer science)|side effects]], thanks to C language standards).
* The local variables a, b and c are abstractions that do not specify any specific storage location on the hardware. The C compiler decides how to actually store them for the target architecture.
* The return statement specifies the value to return, but does not dictate ''how'' it is returned. The C compiler for any specific architecture implements a '''standard''' mechanism for returning the value. Compilers for the x86 architecture typically (but not always) use the EAX register to return a value, as in the assembly language example (the author of the assembly language example has ''chosen'' to copy the C convention, but assembly language does not require this).
 
These abstractions make the C code compilable without modification on any architecture for which a C compiler has been written. The x86 assembly language code is specific to the x86 architecture.
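
As a hedged illustration of the points above, a C function of this general shape might look like the following sketch. The names '''n''', a, b and c follow the list above; the loop structure and the indexing (fib(0) = 0, fib(1) = 1) are assumptions made for this sketch.

```c
/* Sketch: nth Fibonacci number, assuming fib(0) = 0 and fib(1) = 1. */
unsigned int fib(unsigned int n)
{
    unsigned int a = 0; /* current term; no register or address is named */
    unsigned int b = 1; /* next term */
    unsigned int c;     /* scratch value for the term after next */

    while (n > 0) {
        c = a + b;
        a = b;
        b = c;
        n--; /* where n is stored is the compiler's decision */
    }
    return a; /* how the value reaches the caller is left to the compiler */
}
```

Nothing in the sketch mentions a register, a stack slot or a return mechanism, which is why the same source can be compiled for different architectures.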
 
== Low-level programming in high-level languages ==
During the late 1960s and 1970s, [[High-level programming language|high-level languages]] such as [[IBM PL/S|PL/S]], [[BLISS]], [[BCPL]], extended [[ALGOL]] (for [[Burroughs large systems]]) and [[C (programming language)|C]] included some degree of access to low-level programming functions. One method for this is [[inline assembly]], in which assembly code is embedded in a high-level language that supports this feature. Some of these languages also allow architecture-dependent [[Optimizing compiler|compiler optimization directives]] to adjust the way a compiler uses the target processor architecture.
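
Inline assembly can be sketched in C as follows. The example assumes the GCC or Clang extended <code>asm</code> syntax (a compiler extension, not part of standard C) with the default AT&T operand order on an x86 target; on any other compiler or architecture the sketch falls back to plain C. The function name <code>add_u32</code> is hypothetical.

```c
/*
 * Hypothetical helper: add two unsigned integers, using an x86 ADD
 * instruction through inline assembly where that is available.
 */
unsigned int add_u32(unsigned int a, unsigned int b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /*
     * AT&T syntax: "addl source, destination".
     * %0 is `a` (read and written, kept in a register),
     * %1 is `b` (read only); "cc" tells the compiler the flags change.
     */
    __asm__ ("addl %1, %0" : "+r" (a) : "r" (b) : "cc");
    return a;
#else
    /* Portable fallback for other compilers and architectures. */
    return a + b;
#endif
}
```

The compiler still chooses which registers hold <code>a</code> and <code>b</code>; only the single instruction is written by hand, which is the usual trade-off of this approach.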
 
== References ==
{{reflist}}
 
Line 86:
{{X86 assembly topics}}
 
{{DEFAULTSORT:Low-Level Programming Language}}
[[Category:Low-level programming languages| ]]
[[Category:Programming language classification]]