Revision as of 22:39, 20 February 2006 by Derek farn (edit summary: "Partial rewrote (more to do)"); previous revision as of 18:54, 31 January 2006 by Circeus (no edit summary).
In [[computer programming]], a '''naming convention''' is a set of rules for choosing the character sequence to be used for an [[identifier]]. Reasons for using a naming convention (as opposed to allowing people, e.g., [[programmer]]s, to choose any character sequence) include the following:

* to provide useful information to a reader, e.g., an identifier's type (see: [[Hungarian notation]]) or its intended use;
* to enhance clarity (for example by disallowing overly long names or abbreviations);
* to work around restrictions imposed by the language on what an identifier may look like (see: [[CamelCase|CamelCase notation]]).

The choice of naming conventions can be an enormously controversial issue, with partisans of each holding theirs to be the best and others to be inferior.

== Multiple-word identifiers ==

A common recommendation is "Use meaningful identifiers." A single [[word]] may not be as meaningful, or as specific, as multiple words.
As most [[programming language]]s do not allow [[whitespace_(computer science)|whitespace]] in identifiers, a method of delimiting each word is needed (to make it easier for subsequent readers to interpret the character sequence). There are several in widespread use; each has a significant following.

One approach is to delimit separate words with a [[alphanumeric|nonalphanumeric]] character. The two characters commonly used for this purpose are the hyphen ('-') and the underscore ('_'), so the two-word name ''two words'' would be represented as ''two-words'' or ''two_words''. The hyphen is used by nearly all programmers of [[Lisp programming language|Lisp]], [[Scheme programming language|Scheme]], and other languages that permit hyphens in identifiers. However, many other languages reserve the hyphen for use as the [[subtraction]] operator, and so do not permit it in identifiers.
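The effect of reserving the hyphen can be sketched in Python, chosen here only for illustration as one language in which the hyphen is the subtraction operator. The helper name <code>join_words</code> is hypothetical; <code>str.isidentifier</code> and the <code>ast</code> module are part of the Python standard library.

```python
import ast


def join_words(words, delimiter):
    """Join the words of a multi-word name with a single delimiter character."""
    return delimiter.join(words)


words = ["two", "words"]

underscored = join_words(words, "_")  # "two_words"
hyphenated = join_words(words, "-")   # "two-words"

# str.isidentifier() reports whether a string is a syntactically valid name.
print(underscored, underscored.isidentifier())  # two_words True
print(hyphenated, hyphenated.isidentifier())    # two-words False

# The hyphenated form is parsed as the expression `two - words`,
# i.e. a binary operation whose operator is subtraction.
tree = ast.parse(hyphenated, mode="eval")
print(type(tree.body).__name__, type(tree.body.op).__name__)  # BinOp Sub
```

The underscored form is accepted as a single name, while the hyphenated form is read as two names joined by an operator, which is why it cannot serve as an identifier in a language of this kind.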
In such languages the underscore is often used instead; this solution is fairly widespread among programmers of [[C programming language|C]], [[Perl]], and many [[scripting language]]s.

An alternate approach is to indicate word boundaries using capitalization, thus rendering ''two words'' as either ''twoWords'' or ''TwoWords''. This is called [[CamelCase]], among other names.

== Information in identifiers ==

There is significant disagreement over whether it is permissible to use short identifiers (i.e., ones containing few characters), the argument being that it is not possible to encode much information, if any, in a short sequence of characters. Whether programmers prefer short identifiers because they are too lazy to type, or think up, longer identifiers, or because in many situations a longer identifier simply clutters the visible code and provides no worthwhile additional benefit (over a shorter identifier), is an open research issue.

The use of short identifiers was driven initially by technical reasons, as some early programming languages only allowed identifiers up to a certain length. Thus in the standard C library (C was initially one of those languages), one finds ''atoi'' as the name of a function that converts [[ASCII]] strings to [[integer]]s.
In Lisp, one would be more likely to encounter the same function named ''ascii-to-integer'' or similar. However, the use of shorter identifiers has outlived those technical restrictions, partly as heritage (it continues more commonly in those languages that once had the restrictions), and partly out of ease of use: it is simply easier to type shorter identifiers, especially when the identifier is used frequently. Those who prefer longer identifiers argue that the difficulty of typing them is outweighed by the ease of reading code that is more descriptive rather than peppered with impenetrable acronyms and abbreviations.

There are several well-known systems for codifying specific technical aspects of a particular identifier in the name. Perhaps the best known is [[Hungarian notation]], which encodes the [[datatype|type]] of a variable in its name. Several more minor conventions are widespread; one example is the convention of excluding [[lowercase]] letters from identifiers representing macros in C and [[C++]].

==External links==
*[http://www.coding-guidelines.com/cbook/sent787.pdf A detailed analysis of identifier naming issues]
{{compu-lang-stub}}

Naming convention (programming): Difference between revisions