Content deleted Content added
Stevebroshar (talk | contribs) →One-dimensional tables: Get to the point; accuracy |
Stevebroshar (talk | contribs) →top: remove uncited and arguable claim Tags: Mobile edit Mobile app edit Android app edit App section source |
||
(46 intermediate revisions by 2 users not shown) | |||
Line 1:
{{Short description |Data table used to control program flow}}
{{Refimprove|date=February 2009}}
[[File:Control table.png|thumb|220px|
A '''control table''' is a table [[data structure]] (i.e. [[Array (data structure)|array]] of [[Record (computer science)|record]]s) used to direct the [[control flow]] of a [[computer program]]. [[Software]] that uses a control table is said to be ''table-driven''.<ref>''Programs from decision tables'', Humby, E., 2007,Macdonald, 1973 ... Biggerstaff, Ted J. Englewood Cliffs, NJ : Prentice-Hall {{ISBN|0-444-19569-6}}</ref><ref>{{Cite web |url=http://www.dkl.com/wp-content/uploads/2016/05/DataKinetics-Table-Driven-Design.pdf |title=Archived copy |access-date=17 May 2016 |archive-date=10 June 2016 |archive-url=https://web.archive.org/web/20160610160908/http://www.dkl.com/wp-content/uploads/2016/05/DataKinetics-Table-Driven-Design.pdf |url-status=dead }}</ref>
In general, the mapping of input parameters can be via any data structure. A common data structure is the [[lookup table |lookup]] which provides relatively high performance but at a relatively high memory footprint. An [[associative array]] can minimize memory use at the cost of more lookup time.
How the associated behavior is referenced varies. Some languages provide a direct function reference (i.e. [[pointer (computer programming)|pointer]]) that can be used to invoke a function directly, but some languages do not. Some languages provide for [[goto |jumping]] to a ___location (i.e.[[label (programming language)|label]]). As a fallback, any language allows for mapping input to an index that can then be used to branch to a particular part of the code.
A control table is often used as part of a higher-level algorithm. It can control the [[main loop]] of an [[event-driven programming |event-driven program]]. A relatively advanced use is instructions for a [[virtual machine]] {{endash}} similar to [[bytecode]] but usually with operations implied by the table structure itself insteaad of encoded in the table data.
==Data structure==
A table can be structured in a variety of ways. It may have one or multiple dimensions and be of fixed or [[variable length code |variable length]]. The structure of the table may be similar to a [[multimap (data structure)|multimap]] [[associative array]], where a data value (or combination of data values) may be mapped to one or more functions to be performed. Often, the structure allows tabular data is [[software portability |portable]] between [[computer platform]]s as long as a compatible interpreter exists on each platform.
===Trivial hash function===
A relatively simple implementation consists of a [[sparse file |sparse]], one-dimensional array of values. The index space is fully covered by the array such that lookup involves indexing into the array by the input value, and a value is found even for an index that is not intended to be used; preventing an error that might otherwise occur for an unused index value. The lookup is achieved in [[constant time]]; without searching. In most [[computer architecture |architecture]]s, this can be accomplished in two or three [[machine instruction]]s. The technique is known as a "[[trivial hash function]]" or, when used specifically for branch tables, "[[double dispatch]]".
To be feasible, the range of index values should be relatively small. In the example below, the control table is indexed by ASCII value so it has 256 entries; an entry for each ASCII value; omitted entries are shown as '...'. Only the values for the letter A, D, M, and S are important. All other values are uninteresting and set to 0. A two-byte index would require a minimum of 65,536 entries to handle all input possibilities {{endash}} which might consume more memory than is considered worth the value it provides.
{| class="wikitable" style="text-align:center; "
|+
|- style="vertical-align:bottom;"
! index
|-
| 0 ||style="background:lightblue;"| 0
|-
| ... ||style="background:lightblue;"| 0
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|}
In automata-based programming and [[pseudoconversational transaction]] processing, if the number of distinct program states is small, a "dense sequence" control variable can be used to efficiently dictate the entire flow of the main program loop.
===Branch table===
A [[branch table]] is a one-dimensional array of [[machine code]] [[branch (computer science)|branch/jump]] instructions to effect a [[multiway branch]] to a program label when branched into by an immediately preceding, and indexed branch. It is sometimes generated by an [[optimizing compiler]] to execute a [[switch statement]] – provided that the input range is small and dense, with few gaps.<ref>http://www.netrino.com/node/137</ref> (as created by the previous array example)
Although quite compact – compared to the multiple equivalent <code>If</code> statements – the branch instructions still carry some redundancy, since the branch [[opcode]] and condition code mask are repeated alongside the branch offsets. Control tables containing only the offsets to the program labels can be constructed to overcome this redundancy (at least in assembly languages) and yet requiring only minor execution time [[computational overhead |overhead]] compared to a conventional branch table.
===Decision table===
Often, a control table implements a [[decision table]] which like a control table contains (often implied) [[propositional formula |propositions]], and one or more associated actions.
A control table can act like a [[switch statement]] or more generally as a nested [[if-then-else]] construct that includes [[logical predicate]]s (using [[Boolean algebra (logic)|boolean]] style [[logical conjunction |AND]]/[[logical disjunction |OR]] conditions) for each case. Such as control table provides for a language-independent implementation of what otherwise is a language-dependent construct. The table embodies the [[essence]] of a program; stripped of programming language syntax and platform dependent aspects; condensed to data and implied logic. The meaning of the table includes implied operations instead of being explicit as in a more typical a [[programming paradigm]].
Typically, a two-dimensional control table contains value/action pairs and may additionally contain operators and [[type system |type]] information such as the ___location, size and format of input or output data, whether [[data conversion]] is required. The table may contain [[array index |indexes]] or relative or absolute [[pointer (computer programming)|pointer]]s to generic or customized primitives or functions to be executed depending upon other values in the row.
The type of values used to in a control table depends on the [[computer language]] used for the interpreter. [[Assembly language]] provides the widest scope for [[data types]] including [[machine code]] for lookup values. Typically, a control table contains values for each possible matching class of input together with a corresponding pointer to an action function. For a language without [[pointer (computer programming)|pointer]] support, the table can encode an index which can imply an offset value that can be used to accomplish the same as a pointer.
==Storage==
A control table can be stored a variety of ways. It can be stored as part of a program; for example as a [[static variable]]. It can stored in the [[file system]]; for example as [[flat file]] or in a [[database]]. It can be built dynamically by the program. For efficiency, the table should be resident in program memory when the interpreter uses it.
==Interpreter==
A control table interpreter executes the operations as selected by input parameters. The table and resulting runtime behavior can be modified without modifying the interpreter. An interpreter can require [[software maintenance |maintenance]] to change its behavior, but hopefully the interpreter is designed to support future functionality via table changes instead.
Handler functions often are coded in the same language as the interpreter, but could be in another language provided a suitable inter-language call-linkage mechanisms exists. The choice of language for the interpreter and handler functions usually depends on how portable it needs to be across various [[Platform (computing)|platform]]s. There may be several versions of the interpreter to support [[Porting |portability]]. A subordinate control table pointer may optionally substitute for a function pointer in the action column if the interpreter supports this construct; representing a conditional drop to a lower logical level, mimicking [[structured programming]].
==Performance considerations==
The examples below have been chosen partly to illustrate potential performance gains that may not only ''compensate'' significantly for the additional tier of abstraction, but also ''improve'' upon – what otherwise might have been – less efficient, less maintainable and lengthier code. Although the examples given are for a 'low level' assembly language and for the [[C (language)|C language]], it can be seen, in both cases, that very few lines of code are required to implement the control table approach and yet can achieve very significant [[constant time]] performance improvements, reduce repetitive source coding and aid clarity, as compared with [[verbose]] conventional program language constructs.
==Examples==
===General===
CT1 is a control table that is a simple [[lookup table]]. The first column represents the input value to be tested (by an implied 'IF input = x') and, if so, the action in the corresponding 2nd column is invoked. It is like a [[multiway branch]] with return; a form of [[dynamic dispatch]].
:{| class="wikitable"
|+ CT1
! input !! action
|-
|
|-
|
|-
|
|-
|
|}
The next example illustrates how a similar effect can be achieved in a language that does not support pointer definitions in data structures but does support indexed branching to a function – contained within a ([[zero-based numbering|0-based]]) array of function pointers. The table (CT2) is used to extract the index (from 2nd column) to the pointer array (CT2P). If pointer arrays are not supported, a SWITCH statement or equivalent can be used to alter the control flow to one of a sequence of program labels (e.g.: case0, case1, case2, case3, case4) which then either process the input directly, or else perform a call (with return) to the appropriate function.
:{| class="wikitable"
|+ CT2
! input
|-
|
|-
|
|-
|
|-
|
|}
As in above examples, it is possible to
:{| class="wikitable"
|+ CT2P
! index || array
|-
|
|-
|
|-
|
|-
|
|}
:{| class="wikitable"
|+ CT3
! input 1!!alternate!!
|-
|
|-
|
|-
|
|-
|
|}
The control table entries are then much more similar to conditional statements in [[procedural language]]s but, crucially, without the actual (language dependent) conditional statements (i.e. instructions) being present (the generic code is ''physically'' in the interpreter that processes the table entries, not in the table itself – which simply embodies the program logic via its structure and values).
In tables such as these, where a series of similar table entries defines the entire logic, a table entry number or pointer may effectively take the place of a [[program counter]] in more conventional programs and may be reset in an 'action', also specified in the table entry. The example below (CT4) shows how extending the earlier table, to include a 'next' entry (and/or including an 'alter flow' ([[branch (computer science)|jump]])
[[Structured programming]] or [[structured programming|"Goto-less" code]], (incorporating the equivalent of '[[do while loop|DO WHILE]]' or '[[for loop]]' constructs), can also be accommodated with suitably designed and 'indented' control table structures.
Line 313 ⟶ 145:
|
{| class="wikitable"
|+ CT4
! input 1!!alternate!! subr # !! count !! jump
|-
| {{sdash}} || {{sdash}} ||
|-
|
|-
|
|-
|
|-
|
|-
|
|-
| {{
|}
|
{| class="wikitable"
|+ CT4P {{nobold|pointer array}}
! index || pointer
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|-
|
|}
|}
=== C language ===
The following example in [[C (programming language)|C]], uses two arrays that together form a table data structure. The {{code|ops}} array is a simple [[linear search]], one-dimensional array {{endash}} for obtaining the index of the item that matches the input, {{code |op}}. The {{code|handler_labels}} array contains addresses of labels to jump to. The loop tests each {{code |ops}} item against {{code |op}} and if they match, invokes {{code |goto}} for the label at the same index as the found {{code |ops}} item.
<syntaxhighlight lang="c" line style="font-size: 90%;">
static const char ops[] = { "A", "S", "M", "D" };
static const void *handler_labels[] = { &&Add, &&Subtract, &&Multiply, &&Divide };
for (int i = 0; i < sizeof(ops); i++) {
if (op == ops[i]) goto *handler_labels[i];
}
</syntaxhighlight>
This would execute faster if a 256 byte table is used to translate the ASCII input ({{code|op}}) to an index without searching. It would execute in [[constant time]] for all values of {{code |op}}. If a {{code |handler_labels}} item contained the name of a function instead of a label, the jump could be replaced with a dynamic function call, eliminating the switch-like {{code|goto}} but decreasing runtime performance.
<syntaxhighlight lang="c" line style="font-size: 80%;">
static const void *handler_labels[] = {&&Default, &&Add, &&Subtract, &&Multiply, &&Divide};
static const char handler_label_index_by_op[] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0,
0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
size_t i = handler_label_index_by_op(x);
goto *handler_labels[i];
</syntaxhighlight>
===Table-driven rating===
Line 362 ⟶ 228:
===Spreadsheets===
A [[spreadsheet]]
==Programming paradigm==
If the control tables technique could be said to belong to any particular [[programming paradigm]], the closest analogy might be automata-based programming or [[reflection (computer science)|"reflective"]] (a form of [[metaprogramming]] – since the table entries could be said to 'modify' the behaviour of the interpreter). The interpreter itself however, and the
==
A
==Instruction fetch==
When a
==Monitoring control table execution==
Line 377 ⟶ 243:
==Advantages==
; Portability: can be designed to be language and platform independent {{endash}} except for the interpreter
; Flexibility: ability to execute either [[language primitive |primitives]] or functions transparently and be custom designed to suit the problem
; Compactness: table usually shows condition/action pairing side-by-side (without the usual platform/language implementation dependencies), often also resulting in reduced binary file size due to less duplication of instructions, reduced source code size due to eliminating conditional statements and reduced program load (or download) speeds
; Maintainability: tables often reduce the number of source lines needed to be maintained v. multiple compares
; Locality of reference: compact tables structures result in tables remaining in [[cache (computing)|cache]]
; Code re-use: the interpreter is usually reusable. Frequently it can be adapted to new programming tasks using the same technique and can grow organically, becoming, in effect, a [[standard library]] of tried and tested functions, controlled by the table definitions.
; Efficiency: system wide optimization possible. Any performance improvement to the interpreter usually improves ''all'' applications using it (see examples in 'CT1' above).
; Extensible: new instructions can be added by extending the interpreter
Optionally:
* Interpreter can be [[introspection |introspective]] and "self [[optimization (computer science)|optimize]]" using runtime [[software metric |metrics]] collected within the table itself (see CT3 and CT4 – with entries that could be periodically sorted by descending count). The interpreter can also optionally choose the most efficient lookup technique dynamically from metrics gathered at run-time (e.g. size of array, range of values, sorted or unsorted)
* [[Dynamic dispatch]]: common functions can be pre-loaded and less common functions fetched only on first encounter to reduce [[memory]] usage. In-table [[memoization]] can be employed to achieve this.
* The interpreter can have debugging, trace and monitor features built-in – that can then be switched on or off at will according to test or 'live' mode
* Control tables can be built 'on-the-fly' (according to some user input or from parameters) and then executed by the interpreter (without building code literally).
==Disadvantages==
; Complexity: Complex [[expression (
==Quotations==
Line 417 ⟶ 292:
==See also==
*
*
*
*
* {{Annotated link |Truth table}}
==Notes==
|