Talk:Bytecode: Difference between revisions

 
The focus should probably be on 'byte-oriented', in the sense of simplifying instruction decoding. The op-code is only one of several fields -- it is not a great benefit if the op-code is easy to extract, while other fields are complex. I've always thought the instruction encoding used for the EM-1 'machine' was a good example: the opcode is one byte, an escape sequence is one byte, and each address field is one or two bytes. There are a few exceptions where the instruction and its arguments were encoded into one byte, but this was to speed execution of very common instructions. (See Informatica Report IR-81 (from 1983) by Andrew S Tanenbaum et al.: Description of a machine architecture for use with block structured languages.) Although the term 'bytecode' is not used by the authors, it has been used in descriptions of the Amsterdam Compiler Kit, of which EM-1 was a central concept. [[User:Athulin|Athulin]] ([[User talk:Athulin|talk]]) 09:46, 10 January 2011 (UTC)
:
:Yes. [[VAX]] was well designed for byte-by-byte interpreters, specifically a microcoded processor. But that was its main failure: it does not lend itself to parallel or OoO execution. Each operand has a one-byte operand descriptor, followed by the appropriate number of operand bytes. To find the next instruction, you have to process all the operand descriptor bytes, one by one. Even though DEC hoped for a long life for VAX, after not so many years they went to the RISC architecture [[Alpha AXP|Alpha]]. VAX instructions can be 1 to 56 bytes long, which I suspect is a wider range than JVM. [[User:Gah4|Gah4]] ([[User talk:Gah4|talk]]) 20:29, 14 August 2025 (UTC)
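::To illustrate the point about having to walk the operand descriptors one by one, here is a minimal Python sketch. The opcode values, addressing modes and byte counts are invented for illustration and are ''not'' the real VAX (or EM-1) encoding; the only thing it is meant to show is that the length of an instruction is not known until every descriptor byte has been examined in order.

```python
# Hypothetical byte-oriented encoding (assumption, NOT the real VAX format):
#   one opcode byte, which fixes the number of operands;
#   each operand = one descriptor byte whose high nibble selects a mode,
#   followed by 0, 1, 2 or 4 extra bytes depending on that mode.
OPERAND_COUNT = {0x01: 0, 0x02: 1, 0x03: 2, 0x04: 3}  # made-up opcodes
EXTRA_BYTES = {0x0: 0, 0x1: 1, 0x2: 2, 0x3: 4}         # made-up modes


def instruction_length(code: bytes, pc: int) -> int:
    """Return the length in bytes of the instruction starting at pc."""
    start = pc
    opcode = code[pc]
    pc += 1
    # Each descriptor must be decoded before the next one can be located.
    for _ in range(OPERAND_COUNT[opcode]):
        mode = code[pc] >> 4
        pc += 1 + EXTRA_BYTES[mode]
    return pc - start


def instruction_starts(code: bytes) -> list[int]:
    """Return the offset of every instruction, found strictly sequentially."""
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        pc += instruction_length(code, pc)
    return starts
```

::In this sketch, finding where instruction ''n'' begins requires decoding instructions 0 to ''n''-1 first, which is the serial dependency that makes decoding several instructions in parallel awkward.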
 
==Layman's terms==
 
Bytecode is often _not_ interpreted! Some (many/most?) runtimes convert bytecode to native code at runtime ... via JIT compilation. [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 13:27, 11 August 2025 (UTC)
 
:In general, I believe that it is reasonable to describe Pascal P-code and Java byte code as machine languages for abstract machines. Do you not consider [[MIX (abstract machine)|MIX]] to be an instruction set?
:What is the encoding of the P-code for the [[Pascal (programming language)#The Pascal-P system|Pascal-P system]]?
::
::Yes, it is strange. Bytecode should apply to any byte-oriented instruction encoding, back to (as far as I know) IBM System/360. An important idea behind S/360 was the low-end microcoded machines. That is, ones that interpret the instructions in software. Before S/360, the IBM scientific machines used 36-bit words. Not so many years later, we have VAX, again byte-oriented and designed for microcoded processors that interpret the byte codes. VAX followed DEC 36-bit machines, such as the PDP-10. (Seems to be a pattern here.) Once byte-addressable machines became popular, byte-oriented intermediate code became popular for many different cases. Even more, early in the Java years, Sun had designed and built hardware for running JVM! [[User:Gah4|Gah4]] ([[User talk:Gah4|talk]]) 18:42, 11 August 2025 (UTC)
::
::But okay, MIX. MIX is designed to be either binary or decimal, such that properly written programs run in either case. Assuming ''byte'' means eight-bit unit, I think that disqualifies MIX. But the idea isn't so far off. Back to the years close to the beginning of MIX, decimal machines were not so rare for commercial processors. I suspect, though, that Knuth was trying to get people to think more generally. [[User:Gah4|Gah4]] ([[User talk:Gah4|talk]]) 18:47, 11 August 2025 (UTC)
:::The [[IBM 7030]] used 64-bit words. Other vendors had 48-bit and 60-bit scientific machines. The weirdest was the [[UNIVAC LARC]], which had a 12-digit word but allowed \ (ignore), ^ (space), - (minus), . (period) and + (plus) as digits. -- [[User:Chatul|Shmuel (Seymour J.) Metz Username:Chatul]] ([[User talk:Chatul|talk]]) 10:52, 12 August 2025 (UTC)
:::{{tq| Bytecode should apply to any byte oriented instruction encoding, back to (as far as I know) IBM System/360.}} S/360 is a byte-addressable processor, but its ''instruction set'' isn't byte-aligned; it's halfword-aligned.
:::{{tq|An important idea behind S/360 was the low-end microcoded machines. That is, ones that interpret the instructions in software.}} Microcoding goes anywhere from "it's kinda like [[SIMH]]" vertical microcode, where it looks like a software interpreter, to "it's a state machine defined by words in a control memory with bitfields that either go directly to CPU circuitry or control which control memory word is fetched next" horizontal microcode, with various types in between. Often instruction fetch and decode is helped by specialized hardware controlled by the microcode, or is done by a separate hardwired or microcoded engine, with the results fed to a separate execution unit. [[User:Guy Harris|Guy Harris]] ([[User talk:Guy Harris|talk]]) 23:30, 14 August 2025 (UTC)
:{{tq|An instruction set is what processors have.}} OK, is [https://bitsavers.org/pdf/ibm/system38/GA21-9331-1_System_38_Functional_Reference_Manual_Feb81.pdf the never-directly-executed instructions generated by all compilers for System/38 ''not'' used for internal development at IBM, and one of the compilers used for internal development at IBM], or is [https://bitsavers.org/pdf/ibm/system38/SC21-9037-3_IBM_System_38_Internal_Microprogramming_Instructions_Formats_and_Functions_Reference_4th_ed_198508.pdf the executed-by-the-microcoded-processor instructions generated by another compiler used for internal development at IBM, possibly an assembler used for internal development at IBM, and the low-level system code that translates the first instruction set into this instruction set in order to run that code], the instruction set for [[IBM System/38]]? The [[AS/400]] originally continued with those instruction sets, but switched to an extended form of [[PowerPC]]/[[Power ISA]] for the second instruction set ''while continuing to run the same software without recompilation'' (unless you screwed up and "removed observability", meaning "discarding the first-instruction-set code and leaving behind only the second-instruction-set code generated by the binary-to-binary translator").
:{{tq|Bytecode need not be portable.}} If, for a given bytecode, you can write bytecode interpreters that run on more than one type of CPU (whether by having different interpreters for different machines, or by writing a portable interpreter), it's at least in principle portable, and if you actually ''do'' that, it ''is'' portable. (The OS portability issue is probably a bigger issue than the CPU portability issue.)
:{{tq|I've heard that there is a CPU for Java bytecode so in that context the bytecode is an instruction set.}} [[picoJava]] did that.
:Some other machines that could perhaps be considered bytecode machines are the [[Pascal MicroEngine]] (using the same [[MCP-1600]] that the [[LSI-11]] used), possibly the [[Lilith (computer)|Lilith]] running "M-code" for [[Modula-2]], and the Xerox D-machines, microcoded to interpret, among other things, the stack-machine code generated by the [[Mesa (programming language)|Mesa]] compiler. [[User:Guy Harris|Guy Harris]] ([[User talk:Guy Harris|talk]]) 23:54, 14 August 2025 (UTC)
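::On the portability point above, a minimal sketch of what "a portable interpreter" can mean in practice. The opcodes below are invented for illustration and do not correspond to JVM bytecode, P-code or any other real instruction set. The same byte string gives the same result on any CPU for which the host language is available, because only the interpreter, not the bytecode, depends on the host.

```python
# Made-up one-byte opcodes for a tiny stack machine (assumption, not a real ISA).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(code: bytes) -> int:
    """Interpret the bytecode and return the value on top of the stack at HALT."""
    stack = []
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            # PUSH is followed by a one-byte immediate operand.
            stack.append(code[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {pc - 1}")
```

::For example, <code>run(bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]))</code> evaluates (2 + 3) * 4. A runtime that JIT-compiles would replace the dispatch loop with generated native code, but the bytecode itself would be unchanged.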