In computer programming, a naming convention is a set of rules for choosing the character sequences to be used for identifiers.
Reasons for using a naming convention (as opposed to allowing programmers to choose any character sequence) include the following:
- to provide useful information to a reader, e.g., an identifier's type (see Hungarian notation) or its intended use;
- to enhance clarity, for example by disallowing overly long names or abbreviations.
The choice of naming conventions can be an enormously controversial issue, with partisans of each holding theirs to be the best and all others inferior.
Multiple-word identifiers
A common recommendation is "use meaningful identifiers." A single word may not be as meaningful, or as specific, as multiple words. Since most programming languages do not allow whitespace in identifiers, a method of delimiting each word is needed so that subsequent readers can tell which characters belong to which word. Several such methods are in widespread use, each with a significant following.
One approach is to delimit separate words with a nonalphanumeric character. The two characters commonly used for this purpose are the hyphen ('-') and the underscore ('_'); e.g., the two-word name two words would be represented as two-words or two_words. The hyphen is used by nearly all programmers writing COBOL and Lisp. Many other languages (e.g., those in the C and Pascal families) reserve the hyphen for use as the subtraction operator, so it is not available for use in identifiers.
An alternative approach is to indicate word boundaries using capitalization, thus rendering two words as either twoWords or TwoWords. The term CamelCase is sometimes used to describe this technique.
Information in identifiers
There is significant disagreement over whether it is permissible to use short (i.e., containing few characters) identifiers, the argument being that little information, if any, can be encoded in a short sequence of characters. Whether programmers prefer short identifiers because they are too lazy to type, or think up, longer identifiers, or because in many situations a longer identifier simply clutters the visible code and provides no worthwhile additional benefit over a shorter one, is an open research issue.
There are several well-known systems for encoding specific technical aspects of an identifier in its name. Perhaps the best known is Hungarian notation, which encodes the type of a variable in its name. Several more minor conventions are also widespread; one example is the C and C++ convention of excluding lowercase letters from identifiers that represent macros.
External links
- A 100-page PDF that uses linguistics and psychology to attempt a cost/benefit analysis of identifier naming issues