Threaded code: Difference between revisions

Content deleted Content added
Pseudocode is unnecessarily confusing with multiple ++/-- operations on the same variable in an expression.
Line 226:
 
===Token threading===
Token-threaded code usesimplements liststhe thread as a list of indices into a table of operations; the index width is naturally chosen to be as small as possible for density and efficiency. 1 byte / 8-bits is the natural choice for ease of programming, but smaller sizes like 4-bits, or larger like 12-bit {{cn|date=Julyor 2020}}16 indexesbits, tocan abe tableused depending on the number of pointersoperations supported. ItAs long as the index width is notablychosen compactto be narrower than a machine pointer, it will naturally be more compact than the other threading types without much special effort by athe programmer. It is usually half to three-fourths the size of other threadings, which are themselves a quarter to an eighth the size of non-threaded code. The table's pointers can either be indirect or direct. Some Forth compilers produce token-threaded code. Some programmers consider the "[[p-code machine|p-code]]" generated by some [[Pascal (programming language)|Pascal]] compilers, as well as the [[bytecode]]s used by [[.NET Framework|.NET]], [[Java (programming language)|Java]], BASIC and some [[C (programming language)|C]] compilers, to be token-threading.
 
A common approach, historically, is [[bytecode]], which typically uses 8-bit opcodes and, often,with a stack-based virtual machine. AThe typicalarchetypal bytecode [[Interpreter (computing)|interpreter]] is known as a "[[decode and dispatch interpreter]]", and follows the form:
 
<syntaxhighlight lang="c">
start:
vpc = &thread
dispatch:
top:
iaddr = decode(&vpc++) /*/ mayConvert bethe implementednext simplybytecode as:operation to returna *vpcpointer */to machine code that implements it
// Any inter-instruction operations are performed here (e.g. updating global state, event processing, etc)
addr = table[i]
jump *addr
CODE_PTR decode(BYTE_CODE **p) {
// In a more complex encoding, there may be multiple tables to choose between or control/mode flags
addr =return table[i*(*p)++];
}
thread: /* Contains bytecode, not machine addresses. Hence it is more compact. */
1 /*pushA*/
Line 247 ⟶ 251:
pushA:
*sp++ = A
jump topdispatch
pushB:
*sp++ = B
jump topdispatch
add:
addend1 = *--sp
addend2 = *--sp
*sp++ = addend1 + addend2
jump topdispatch
</syntaxhighlight>
 
Line 262 ⟶ 266:
For instructions where the individual operations are simple, such as "push" and "add", the [[Computational overhead|overhead]] involved in deciding what to execute is larger than the cost of actually executing it, so such interpreters are often much slower than machine code. However, for more complex ("compound") instructions, the overhead percentage is proportionally less significant.
 
Counter-intuitively, there are times when token-threaded code can sometimes run faster than the equivalent machine code when that machine code ends up being too large to fit in the physical CPU's L1 instruction cache. The higher [[code density]] of threaded code, especially token-threaded code, can allow it to fit entirely in the L1 cache when it otherwise would not have thereby avoiding cache thrashing. However, it should be noted that threaded code consumes both instruction cache (for the implementation of each operation) as well as data cache (for the bytecode and tables) unlike machine code which only consumes instruction cache; this means threaded code will eat into the budget for the amount of data that can be held for processing by the CPU at any given time. In any case, if the problem being computed involves applying a large number of operations to a small amount of data then using threaded code may be an ideal optimization.
Counter-intuitively, token-threaded code can sometimes run faster than the equivalent machine code --
when the machine code is too large to fit in cache, but the higher [[code density]] of threaded code, especially token-threaded code, allows it to fit entirely in high-speed cache.
<ref name="heller"/>