Static program analysis

Static code analysis is a set of methods for analysing software source code or object code, without executing it, in order to understand what the software does. Applications include identifying areas of code to review or rewrite.

Schematically, there exist several types of static analysis (which may be used in combination, even inside the same programming tool):

tools such as lint essentially look for constructs that "look dangerous" from an informal point of view;
formal methods consider the mathematical definition of the behavior of programs, known as semantics.
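
As an illustration of the first, informal kind of tool, the following is a minimal sketch of a lint-style checker written with Python's standard ast module. The class name SuspiciousConstructs, the function lint and the two rules it applies are assumptions chosen for this example; they are not taken from lint or any other real tool.

```python
import ast


class SuspiciousConstructs(ast.NodeVisitor):
    """Collects (line, message) pairs for constructs that merely look risky."""

    def __init__(self):
        self.warnings = []

    def visit_ExceptHandler(self, node):
        # A bare "except:" catches every exception, including KeyboardInterrupt.
        if node.type is None:
            self.warnings.append((node.lineno, "bare except"))
        self.generic_visit(node)

    def visit_Compare(self, node):
        # "x == None" is usually meant to be "x is None".
        for op, right in zip(node.ops, node.comparators):
            is_equality = isinstance(op, (ast.Eq, ast.NotEq))
            is_none = isinstance(right, ast.Constant) and right.value is None
            if is_equality and is_none:
                self.warnings.append((node.lineno, "comparison to None with == or !="))
        self.generic_visit(node)


def lint(source):
    """Parse the source text and return the sorted list of warnings."""
    checker = SuspiciousConstructs()
    checker.visit(ast.parse(source))
    return sorted(checker.warnings)


# Hypothetical input used only to exercise the two rules above.
SAMPLE = """\
def load(path):
    try:
        data = open(path).read()
    except:
        data = None
    if data == None:
        return ""
    return data
"""

for line, message in lint(SAMPLE):
    print(f"line {line}: {message}")
```

The checker never runs the sample and has no model of what it computes; it only matches syntactic patterns, which is why such tools can both miss real errors and flag harmless code.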

Formal methods

Static analysis is a family of formal methods for automatically deriving information about the behavior of computer software (and also hardware). One possible application of static analysis is as an automated debugging aid, especially for finding run-time errors: roughly speaking, events that cause program crashes.

Briefly, program analysis, including finding possible run-time errors, is undecidable: there is no mechanical method that can always answer truthfully whether a given program may or may not exhibit run-time errors. This is a mathematically founded result dating from the works of Church, Gödel and Turing in the 1930s (see halting problem and Rice's theorem).

There exist two main families of formal static analysis:

model checking considers systems that have finite state or may be reduced to finite state by abstraction;
static analysis by abstract interpretation approximates the behavior of the system, either from above (considering more behaviors than can happen in reality), or from below (considering only a subset of the behaviors that can happen).
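
Approximation from above can be sketched with a toy interval domain, in which each variable is represented by a range of possible values rather than a single value. The Interval class and its methods below are assumptions made for this example and are far simpler than the domains used by real abstract interpreters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every integer between lo and hi, inclusive."""
    lo: int
    hi: int

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # The smallest result is lo - other.hi, the largest is hi - other.lo.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def join(self, other):
        """Smallest interval covering both; used where two branches merge."""
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def may_be_zero(self):
        return self.lo <= 0 <= self.hi


x = Interval(0, 10)        # x stands for any input between 0 and 10
one = Interval(1, 1)

# Concretely, (x - x) + 1 is always 1. The interval domain forgets that both
# operands are the same variable, so it reports a much wider range.
d = (x - x) + one
print(d, "may be zero:", d.may_be_zero())

# Merging two branches: y is in [1, 3] on one path and in [7, 9] on the other.
y = Interval(1, 3).join(Interval(7, 9))
print(y)
```

Here the analysis would warn about a possible division by zero in an expression such as 100 / d even though none can occur. The result still covers every real behavior, at the price of a false alarm, which is the usual trade-off when approximating from above.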

Interest in the development of static analysis tools, especially for use in safety-critical computer systems, was renewed after the high-profile failure of Ariane 5 Flight 501 in 1996, when the rocket was destroyed shortly after launch because of a software bug, often cited as one of the most expensive software bugs in history.

Software metrics

This family of methods aims at identifying possible problems in software using numerical measurements over the source code. Many different metrics (measures) can be applied. One of the crudest, for example, is the size of the code, usually expressed in kLOC (thousands of lines of code). How to define a line of code consistently has long been a matter of debate.
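
The difficulty of defining a line of code can be seen in a small sketch that counts the same source text three ways. The function count_lines and its three definitions are assumptions made for illustration; the comment rule only recognises whole-line comments starting with #.

```python
def count_lines(source):
    """Return three different 'lines of code' counts for the same text."""
    lines = source.splitlines()
    stripped = [line.strip() for line in lines]
    return {
        # Every line in the file, blank or not.
        "physical": len(lines),
        # Lines that contain at least one non-whitespace character.
        "non_blank": sum(1 for s in stripped if s),
        # Non-blank lines that are not whole-line comments.
        "code_only": sum(1 for s in stripped if s and not s.startswith("#")),
    }


# Hypothetical sample used only to compare the three definitions.
SAMPLE = """\
# Compute a factorial.

def fact(n):
    # Base case.
    if n <= 1:
        return 1

    return n * fact(n - 1)
"""

print(count_lines(SAMPLE))
```

For this sample the three definitions give 8, 6 and 4 lines respectively, so any size figure is only meaningful together with the counting rule that produced it.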

One common metric in static analysis is McCabe's cyclomatic complexity, which measures the number of choices a function makes. Functions with many conditional statements (if, while, etc.) have a high complexity and can therefore be considered more likely to contain bugs and harder to maintain.
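
A rough approximation of this metric can be computed by counting decision points in the syntax tree and adding one. The sketch below uses Python's standard ast module; the function name cyclomatic_complexity and the choice of which node types count as decisions are simplifying assumptions, and the whole source text is treated as a single function.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical function used only to exercise the counter.
SAMPLE = """\
def classify(n):
    if n < 0:
        return "negative"
    while n > 10 and n % 2 == 0:
        n //= 2
    if n == 0:
        return "zero"
    return "positive"
"""

print(cyclomatic_complexity(SAMPLE))
```

The sample scores 5: one for the function itself, two for the if statements, one for the while loop and one for the and operator. A function with no branches scores 1.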
