Revision as of 10:02, 22 December 2020 edit 61.69.133.51 (talk) Pseudocode is unnecessarily confusing with multiple ++/-- operations on the same variable in an expression. ← Previous edit		Revision as of 12:22, 22 December 2020 edit undo 61.69.133.51 (talk) →Token threading Next edit →
Line 226: ===Token threading=== Token-threaded code ~~uses~~implements ~~lists~~the thread as a list of indices into a table of operations; the index width is naturally chosen to be as small as possible for density and efficiency. 1 byte / 8-bits is the natural choice for ease of programming, but smaller sizes like 4-bits, or larger like 12~~-bit~~ ~~{{cn\|date=July~~or ~~2020}}~~16 ~~indexes~~bits, tocan abe ~~table~~used depending on the number of ~~pointers~~operations supported. ItAs long as the index width is ~~notably~~chosen ~~compact~~to be narrower than a machine pointer, it will naturally be more compact than the other threading types without much special effort by athe programmer. It is usually half to three-fourths the size of other threadings, which are themselves a quarter to an eighth the size of non-threaded code. The table's pointers can either be indirect or direct. Some Forth compilers produce token-threaded code. Some programmers consider the "[[p-code machine\|p-code]]" generated by some [[Pascal (programming language)\|Pascal]] compilers, as well as the [[bytecode]]s used by [[.NET Framework\|.NET]], [[Java (programming language)\|Java]], BASIC and some [[C (programming language)\|C]] compilers, to be token-threading. A common approach, historically, is [[bytecode]], which typically uses 8-bit opcodes ~~and, often,~~with a stack-based virtual machine. AThe ~~typical~~archetypal bytecode [[Interpreter (computing)\|interpreter]] is known as a "[[decode and dispatch interpreter]]", and follows the form: <syntaxhighlight lang="c"> start: vpc = &thread dispatch: ~~top:~~ iaddr = decode(&vpc++) // ~~may~~Convert bethe ~~implemented~~next ~~simply~~bytecode ~~as:~~operation to ~~return~~a vpcpointer /to machine code that implements it // Any inter-instruction operations are performed here (e.g. updating global state, event processing, etc) addr = table[i]▼ jump addr CODE_PTR decode(BYTE_CODE *p) { // In a more complex encoding, there may be multiple tables to choose between or control/mode flags ▲ ~~addr =~~return table[i(p)++]; } thread: / Contains bytecode, not machine addresses. Hence it is more compact. / 1 /pushA/ Line 247 ⟶ 251: pushA: sp++ = A jump ~~top~~dispatch pushB: sp++ = B jump ~~top~~dispatch add: addend1 = --sp addend2 = --sp sp++ = addend1 + addend2 jump ~~top~~dispatch </syntaxhighlight> Line 262 ⟶ 266: For instructions where the individual operations are simple, such as "push" and "add", the [[Computational overhead\|overhead]] involved in deciding what to execute is larger than the cost of actually executing it, so such interpreters are often much slower than machine code. However, for more complex ("compound") instructions, the overhead percentage is proportionally less significant. Counter-intuitively, there are times when token-threaded code can sometimes run faster than the equivalent machine code when that machine code ends up being too large to fit in the physical CPU's L1 instruction cache. The higher [[code density]] of threaded code, especially token-threaded code, can allow it to fit entirely in the L1 cache when it otherwise would not have thereby avoiding cache thrashing. However, it should be noted that threaded code consumes both instruction cache (for the implementation of each operation) as well as data cache (for the bytecode and tables) unlike machine code which only consumes instruction cache; this means threaded code will eat into the budget for the amount of data that can be held for processing by the CPU at any given time. In any case, if the problem being computed involves applying a large number of operations to a small amount of data then using threaded code may be an ideal optimization. ~~Counter-intuitively, token-threaded code can sometimes run faster than the equivalent machine code --~~ ~~when the machine code is too large to fit in cache, but the higher [[code density]] of threaded code, especially token-threaded code, allows it to fit entirely in high-speed cache.~~ <ref name="heller"/>

Threaded code: Difference between revisions