A control table is a table data structure (i.e. array of records) used to direct the control flow of a computer program. Software that uses a control table is said to be table-driven.[1][2] A control table encodes both the parameters to a conditional expression and a function reference. An interpreter processes a table by evaluating the conditional expression for input data and invoking the selected function. Using a control table can reduce the need for repetitive code that implements the same logic.

Control table that directs program flow according to the value of a single input value. Each entry holds a possible input value and a reference to an associated function.

In general, the mapping of input parameters can be via any data structure. A common data structure is the lookup which provides relatively high performance but at a relatively high memory footprint. An associative array can minimize memory use at the cost of more lookup time.

How the associated behavior is referenced varies. Some languages provide a direct function reference (i.e. pointer) that can be used to invoke a function directly, but some languages do not. Some languages provide for jumping to a ___location (i.e.label). As a fallback, any language allows for mapping input to an index that can then be used to branch to a particular part of the code.

A control table is often used as part of a higher-level algorithm. It can control the main loop of an event-driven program. A relatively advanced use is instructions for a virtual machine – similar to bytecode but usually with operations implied by the table structure itself insteaad of encoded in the table data.

Data structure

edit

A table can be structured in a variety of ways. It may have one or multiple dimensions and be of fixed or variable length. The structure of the table may be similar to a multimap associative array, where a data value (or combination of data values) may be mapped to one or more functions to be performed. Often, the structure allows tabular data is portable between computer platforms as long as a compatible interpreter exists on each platform.

Trivial hash function

edit

A relatively simple implementation consists of a sparse, one-dimensional array of values. The index space is fully covered by the array such that lookup involves indexing into the array by the input value, and a value is found even for an index that is not intended to be used; preventing an error that might otherwise occur for an unused index value. The lookup is achieved in constant time; without searching. In most architectures, this can be accomplished in two or three machine instructions. The technique is known as a "trivial hash function" or, when used specifically for branch tables, "double dispatch".

To be feasible, the range of index values should be relatively small. In the example below, the control table is indexed by ASCII value so it has 256 entries; an entry for each ASCII value; omitted entries are shown as '...'. Only the values for the letter A, D, M, and S are important. All other values are uninteresting and set to 0. A two-byte index would require a minimum of 65,536 entries to handle all input possibilities – which might consume more memory than is considered worth the value it provides.

index
(ASCII)
Array
0 0
... 0
65 (A) 1
... 0
68 (D) 4
... 0
77 (M) 3
... 0
83 (S) 2
... 0
255 0

In automata-based programming and pseudoconversational transaction processing, if the number of distinct program states is small, a "dense sequence" control variable can be used to efficiently dictate the entire flow of the main program loop.

Branch table

edit

A branch table is a one-dimensional array of machine code branch/jump instructions to effect a multiway branch to a program label when branched into by an immediately preceding, and indexed branch. It is sometimes generated by an optimizing compiler to execute a switch statement – provided that the input range is small and dense, with few gaps.[3] (as created by the previous array example)

Although quite compact – compared to the multiple equivalent If statements – the branch instructions still carry some redundancy, since the branch opcode and condition code mask are repeated alongside the branch offsets. Control tables containing only the offsets to the program labels can be constructed to overcome this redundancy (at least in assembly languages) and yet requiring only minor execution time overhead compared to a conventional branch table.

Decision table

edit

Often, a control table implements a decision table which like a control table contains (often implied) propositions, and one or more associated actions.

A control table can act like a switch statement or more generally as a nested if-then-else construct that includes logical predicates (using boolean style AND/OR conditions) for each case. Such as control table provides for a language-independent implementation of what otherwise is a language-dependent construct. The table embodies the essence of a program; stripped of programming language syntax and platform dependent aspects; condensed to data and implied logic. The meaning of the table includes implied operations instead of being explicit as in a more typical a programming paradigm.

Typically, a two-dimensional control table contains value/action pairs and may additionally contain operators and type information such as the ___location, size and format of input or output data, whether data conversion is required. The table may contain indexes or relative or absolute pointers to generic or customized primitives or functions to be executed depending upon other values in the row.

The type of values used to in a control table depends on the computer language used for the interpreter. Assembly language provides the widest scope for data types including machine code for lookup values. Typically, a control table contains values for each possible matching class of input together with a corresponding pointer to an action function. For a language without pointer support, the table can encode an index which can imply an offset value that can be used to accomplish the same as a pointer.

Storage

edit

A control table can be stored a variety of ways. It can be stored as part of a program; for example as a static variable. It can stored in the file system; for example as flat file or in a database. It can be built dynamically by the program. For efficiency, the table should be resident in program memory when the interpreter uses it.

Interpreter

edit

A control table interpreter executes the operations as selected by input parameters. The table and resulting runtime behavior can be modified without modifying the interpreter. An interpreter can require maintenance to change its behavior, but hopefully the interpreter is designed to support future functionality via table changes instead.

Handler functions often are coded in the same language as the interpreter, but could be in another language provided a suitable inter-language call-linkage mechanisms exists. The choice of language for the interpreter and handler functions usually depends on how portable it needs to be across various platforms. There may be several versions of the interpreter to support portability. A subordinate control table pointer may optionally substitute for a function pointer in the action column if the interpreter supports this construct; representing a conditional drop to a lower logical level, mimicking structured programming.

Performance considerations

edit

Even though using a control table entails the development cost and deployment size of an interpreter, by separating (encapsulating) the executable coding from the logic, as expressed in the table, it can be more readily targeted to perform its function. This may be experienced in a spreadsheet application where the underlying spreadsheet software transparently converts complex logical formulae in the most efficient manner it is able, in order to display its results.

The examples below have been chosen partly to illustrate potential performance gains that may not only compensate significantly for the additional tier of abstraction, but also improve upon – what otherwise might have been – less efficient, less maintainable and lengthier code. Although the examples given are for a 'low level' assembly language and for the C language, it can be seen, in both cases, that very few lines of code are required to implement the control table approach and yet can achieve very significant constant time performance improvements, reduce repetitive source coding and aid clarity, as compared with verbose conventional program language constructs.

Examples

edit

General

edit

CT1 is a control table that is a simple lookup table. The first column represents the input value to be tested (by an implied 'IF input = x') and, if so, the action in the corresponding 2nd column is invoked. It is like a multiway branch with return; a form of dynamic dispatch.

CT1
input action
A Add
S Subtract
M Multiply
D Divide

The next example illustrates how a similar effect can be achieved in a language that does not support pointer definitions in data structures but does support indexed branching to a function – contained within a (0-based) array of function pointers. The table (CT2) is used to extract the index (from 2nd column) to the pointer array (CT2P). If pointer arrays are not supported, a SWITCH statement or equivalent can be used to alter the control flow to one of a sequence of program labels (e.g.: case0, case1, case2, case3, case4) which then either process the input directly, or else perform a call (with return) to the appropriate function.

CT2
input subr #
A 1
S 2
M 3
D 4

As in above examples, it is possible to efficiently translate the potential ASCII input values (A, S, M, D) into a pointer array index without actually using a table lookup, but is shown here as a table for consistency with the first example.

CT2P
index array
1 Add
2 Subtract
3 Multiply
4 Divide

A two-dimensional control table could be used to support testing multiple conditions or performing more than one action. An action can include a pointer to another subordinate control table. The simple example below has had an implicit 'OR' condition incorporated as an extra column (to handle lower case input, however in this instance, this could equally have been handled by having an extra entry for each of the lower case characters specifying the same function identifier as the upper case characters). An extra column to count the actual run-time events for each input as they occur is also included.

CT3
input 1 alternate subr # count
A a 1 0
S s 2 0
M m 3 0
D d 4 0

The control table entries are then much more similar to conditional statements in procedural languages but, crucially, without the actual (language dependent) conditional statements (i.e. instructions) being present (the generic code is physically in the interpreter that processes the table entries, not in the table itself – which simply embodies the program logic via its structure and values).

In tables such as these, where a series of similar table entries defines the entire logic, a table entry number or pointer may effectively take the place of a program counter in more conventional programs and may be reset in an 'action', also specified in the table entry. The example below (CT4) shows how extending the earlier table, to include a 'next' entry (and/or including an 'alter flow' (jump) function) can create a loop (This example is actually not the most efficient way to construct such a control table but, by demonstrating a gradual 'evolution' from the first examples above, shows how additional columns can be used to modify behaviour.) The fifth column demonstrates that more than one action can be initiated with a single table entry – in this case an action to be performed after the normal processing of each entry ('-' values mean 'no conditions' or 'no action').

Structured programming or "Goto-less" code, (incorporating the equivalent of 'DO WHILE' or 'for loop' constructs), can also be accommodated with suitably designed and 'indented' control table structures.

CT4
input 1 alternate subr # count jump
5 0
E e 7 0
A a 1 0
S s 2 0
M m 3 0
D d 4 0
6 0 1
CT4P pointer array
index pointer
0 Default
1 Add
2 Subtract
3 Multiply
4 Divide
5 Read Input1
6 Alter flow
7 End

C language

edit

The following example in C, uses two arrays that together form a table data structure. The ops array is a simple linear search, one-dimensional array – for obtaining the index of the item that matches the input, op. The handler_labels array contains addresses of labels to jump to. The loop tests each ops item against op and if they match, invokes goto for the label at the same index as the found ops item.

static const char ops[] = { "A", "S", "M", "D" };
static const void *handler_labels[] = { &&Add, &&Subtract, &&Multiply, &&Divide };
for (int i = 0; i < sizeof(ops); i++) {
    if (op == ops[i]) goto *handler_labels[i];
}

This would execute faster if a 256 byte table is used to translate the ASCII input (op) to an index without searching. It would execute in constant time for all values of op. If a handler_labels item contained the name of a function instead of a label, the jump could be replaced with a dynamic function call, eliminating the switch-like goto but decreasing runtime performance.

static const void *handler_labels[] = {&&Default, &&Add, &&Subtract, &&Multiply, &&Divide};
static const char handler_label_index_by_op[] = {
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 1, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0,
             0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
size_t i = handler_label_index_by_op(x);
goto *handler_labels[i];

Table-driven rating

edit

In the specialist field of telecommunications rating (concerned with the determining the cost of a particular call), table-driven rating techniques illustrate the use of control tables in applications where the rules may change frequently because of market forces. The tables that determine the charges may be changed at short notice by non-programmers in many cases.[4][5]

If the algorithms are not pre-built into the interpreter (and therefore require additional runtime interpretation of an expression held in the table), it is known as "Rule-based Rating" rather than table-driven rating (and consequently consumes significantly more overhead).

Spreadsheets

edit

A spreadsheet could used as a two-dimensional control table, with the non-empty cells representing data to the underlying spreadsheet program (the interpreter). The cells containing formula are usually prefixed with an equals sign and simply designate a special type of data input that dictates the processing of other referenced cells – by altering the control flow within the interpreter. It is the externalization of formulae from the underlying interpreter that clearly identifies both spreadsheets, and the above cited "rule based rating" example as readily identifiable instances of the use of control tables by non-programmers.

Programming paradigm

edit

If the control tables technique could be said to belong to any particular programming paradigm, the closest analogy might be automata-based programming or "reflective" (a form of metaprogramming – since the table entries could be said to 'modify' the behaviour of the interpreter). The interpreter itself however, and the functions, can be programmed using any one of the available paradigms or even a mixture. The table itself can be essentially a collection of "raw data" values that do not even need to be compiled and could be read in from an external source (except in specific, platform dependent, implementations using memory pointers directly for greater efficiency).

Virtual machine

edit

A two-dimensional control table is similar to intermediate executable code (i.e. bytecode) running on a virtual machine, in that a platform-dependent interpreter is required to perform execution that is largely determined by table content. There are also similarities to Common Intermediate Language (CIL) in the aim of creating a common intermediate instruction set that is independent of platform but unlike CIL, no pretensions to be used as a common resource for other languages. P-code can also be considered a similar but earlier implementation with origins as far back as 1966.

Instruction fetch

edit

When a two-dimensional control table is used to determine program flow, the normal "hardware" program counter function is effectively simulated with either a pointer to the first (or next) table entry or else an index to it. "Fetching" the instruction involves decoding the data in that table entry – without necessarily copying all or some of the data within the entry first. Programming languages that are able to use pointers have the dual advantage that less overhead is involved, both in accessing the contents and also advancing the counter to point to the next table entry after execution. Calculating the next 'instruction' address (i.e. table entry) can even be performed as an optional additional action of every individual table entry allowing loops and or jump instructions at any stage.

Monitoring control table execution

edit

The interpreter program can optionally save the program counter (and other relevant details depending upon instruction type) at each stage to record a full or partial trace of the actual program flow for debugging purposes, hot spot detection, code coverage analysis and performance analysis (see examples CT3 & CT4 above).

Advantages

edit
Clarity
information tables are ubiquitous and mostly inherently understood even by the general public (especially fault diagnostic tables in product guides)
Portability
can be designed to be language and platform independent – except for the interpreter
Flexibility
ability to execute either primitives or functions transparently and be custom designed to suit the problem
Compactness
table usually shows condition/action pairing side-by-side (without the usual platform/language implementation dependencies), often also resulting in reduced binary file size due to less duplication of instructions, reduced source code size due to eliminating conditional statements and reduced program load (or download) speeds
Maintainability
tables often reduce the number of source lines needed to be maintained v. multiple compares
Locality of reference
compact tables structures result in tables remaining in cache
Code re-use
the interpreter is usually reusable. Frequently it can be adapted to new programming tasks using the same technique and can grow organically, becoming, in effect, a standard library of tried and tested functions, controlled by the table definitions.
Efficiency
system wide optimization possible. Any performance improvement to the interpreter usually improves all applications using it (see examples in 'CT1' above).
Extensible
new instructions can be added by extending the interpreter

Optionally:

  • Interpreter can be introspective and "self optimize" using runtime metrics collected within the table itself (see CT3 and CT4 – with entries that could be periodically sorted by descending count). The interpreter can also optionally choose the most efficient lookup technique dynamically from metrics gathered at run-time (e.g. size of array, range of values, sorted or unsorted)
  • Dynamic dispatch: common functions can be pre-loaded and less common functions fetched only on first encounter to reduce memory usage. In-table memoization can be employed to achieve this.
  • The interpreter can have debugging, trace and monitor features built-in – that can then be switched on or off at will according to test or 'live' mode
  • Control tables can be built 'on-the-fly' (according to some user input or from parameters) and then executed by the interpreter (without building code literally).

Disadvantages

edit
Training
application programmers are not usually trained to produce generic solutions
Computational overhead
some increase because of extra level of indirection caused by virtual instructions having to be 'interpreted' (this however can usually be more than offset by a well designed generic interpreter taking full advantage of efficient direct translate, search and conditional testing techniques that may not otherwise have been utilized)
Complexity
Complex expressions cannot always be used directly in data table entries for comparison purposes. These intermediate values can however be calculated beforehand instead within a function and their values referred to in the conditional table entries. Alternatively, a function can perform the complete complex conditional test (as an unconditional 'action') and, by setting a truth flag as its result, it can then be tested in the next table entry. See Structured program theorem

Quotations

edit

Multiway branching is an important programming technique which is all too often replaced by an inefficient sequence of if tests. Peter Naur recently wrote me that he considers the use of tables to control program flow as a basic idea of computer science that has been nearly forgotten; but he expects it will be ripe for rediscovery any day now. It is the key to efficiency in all the best compilers I have studied.

— Donald Knuth, Structured Programming with go to Statements

There is another way to look at a program written in interpretative language. It may be regarded as a series of subroutine calls, one after another. Such a program may in fact be expanded into a long sequence of calls on subroutines, and, conversely, such a sequence can usually be packed into a coded form that is readily interpreted. The advantage of interpretive techniques are the compactness of representation, the machine independence, and the increased diagnostic capability. An interpreter can often be written so that the amount of time spent in interpretation of the code itself and branching to the appropriate routine is negligible

— Donald Knuth, The Art of Computer Programming Volume 1, 1997, page 202

The space required to represent a program can often be decreased by the use of interpreters in which common sequences of operations are represented compactly. A typical example is the use of a finite-state machine to encode a complex protocol or lexical format into a small table

— Jon Bentley, Writing Efficient Programs

Jump tables can be especially efficient if the range tests can be omitted. For example, if the control value is an enumerated type (or a character) then it can only contain a small fixed range of values and a range test is redundant provided the jump table is large enough to handle all possible values

— David.A. SPULER, Compiler Code Generation for Multiway Branch Statements as a Static Search Problem

Programs must be written for people to read, and only incidentally for machines to execute.

— "Structure and Interpretation of Computer Programs", preface to the first edition, Abelson & Sussman

Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious.

— "The Mythical Man-Month: Essays on Software Engineering", Fred Brooks

See also

edit

Notes

edit
  1. ^ Programs from decision tables, Humby, E., 2007,Macdonald, 1973 ... Biggerstaff, Ted J. Englewood Cliffs, NJ : Prentice-Hall ISBN 0-444-19569-6
  2. ^ "Archived copy" (PDF). Archived from the original (PDF) on 10 June 2016. Retrieved 17 May 2016.{{cite web}}: CS1 maint: archived copy as title (link)
  3. ^ http://www.netrino.com/node/137
  4. ^ Carl Wright, Service Level Corpo. (2002) Program Code Based vs. Table-driven vs. Rule-Based Rating, Rating Matters issue n. 12, 13 November 2002 ISSN 1532-1886
  5. ^ Brian E. Clauser, Melissa J. Margolis, Stephen G. Clyman, Linette P. Ross (1997) Development of Automated Scoring Algorithms for Complex Performance Assessments: A Comparison of Two Approaches Journal of Educational Measurement, Vol. 34, No. 2 (Summer, 1997), pp. 141–161

References

edit
edit