String (computer science): Difference between revisions

Line 2:
[[File:String example.png|alt=Diagram of String data in computing. Shows the word "example" with each letter in a separate box. The word "String" is above, referring to the entire sentence. The label "Character" is below and points to an individual box.|thumb|Strings are typically made up of [[character (computing)|characters]], and are often used to store human-readable data, such as words or sentences.]]
 
In [[computer programming]], a '''string''' is traditionally a [[sequence]] of [[character (computing)|characters]], either as a [[literal (computer programming)|literal constant]] or as some kind of [[Variable (computer science)|variable]]. The latter may allow its elements to be [[Immutable object|mutated]] and the length changed, or it may be fixed (after creation). A string is generally considered a [[data type]] and is often implemented as an [[array data structure]] of [[byte]]s (or [[word (computer architecture)|word]]s) that stores a sequence of elements, typically characters, using some [[character encoding]]. More generally, ''string'' may also denote a sequence (or [[List (abstract data type)|list]]) of data other than just characters.
 
Depending on the programming language and precise data type used, a [[variable (programming)|variable]] declared to be a string may either cause storage in memory to be statically allocated for a predetermined maximum length or employ [[dynamic allocation]] to allow it to hold a variable number of elements.
Line 26:
==History==
 
Use of the word "string" to mean any items arranged in a line, series or succession dates back centuries.<ref>{{cite encyclopedia |encyclopedia=The Oxford English Dictionary |volume=X |publisher=Oxford at the Clarendon Press |year=1933 |title=string }}</ref><ref>{{cite web |title=string (n.) |url=https://www.etymonline.com/search?q=string |website=Online Etymology Dictionary }}</ref> In 19th-century typesetting, [[Compositor (typesetting)|compositors]] used the term "string" to denote a length of type printed on paper; the string would be measured to determine the compositor's pay.<ref>{{cite encyclopedia |encyclopedia=The Century Dictionary |author-link1=William Dwight Whitney |author-link2=Benjamin Eli Smith |first1=William Dwight |last1=Whitney |first2=Benjamin E. |last2=Smith |publisher=The Century Company |location=New York |page=5994 |title=string }}</ref><ref name=Burchfield1986 /><ref>{{cite news |newspaper=[[Milwaukee Journal Sentinel|Milwaukee Sentinel]] |date=January 11, 1898 |title=Old Union's Demise |page=3 }}</ref>
 
Use of the word "string" to mean "a sequence of symbols or linguistic elements in a definite order" emerged from mathematics, [[symbolic logic]], and [[linguistic theory]] to speak about the [[formal system|formal]] behavior of symbolic systems, setting aside the symbols' meaning.<ref name=Burchfield1986>{{cite encyclopedia |title=string |encyclopedia=A Supplement to the Oxford English Dictionary |year=1986 |last=Burchfield |first=R.W. |publisher=Oxford at the Clarendon Press |author-link=Robert Burchfield }}</ref>
Line 43:
 
=== String length ===
Although formal strings can have an arbitrary finite length, the length of strings in real languages is often constrained to an artificial maximum. In general, there are two types of string datatypes: ''fixed-length strings'', which have a fixed maximum length to be determined at [[compile time]] and which use the same amount of memory whether this maximum is needed or not, and ''variable-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]). Most strings in modern [[programming languages]] are variable-length strings. Of course, even variable-length strings are limited in length by the amount of available memory. The string length can be stored as a separate integer (which may put another artificial limit on the length) or implicitly through a termination character, usually a character value with all bits zero, such as in the C programming language. See also "[[#Null-terminated|Null-terminated]]" below.
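
The two ways of recording a string's length can be illustrated with a minimal Python sketch. The function names and the 16-bit little-endian length field are assumptions chosen for illustration, not a layout used by any particular language. The sketch shows how a separate length integer imposes its own maximum, while a terminator forbids the terminating value inside the data.

```python
import struct

def encode_length_prefixed(text: str) -> bytes:
    """Store the byte length in a 2-byte unsigned integer ahead of the data."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFF:
        # The width of the length field is itself an artificial limit.
        raise ValueError("too long for a 16-bit length field")
    return struct.pack("<H", len(data)) + data

def decode_length_prefixed(buf: bytes) -> str:
    """Read the length field, then take exactly that many bytes."""
    (length,) = struct.unpack_from("<H", buf, 0)
    return buf[2:2 + length].decode("utf-8")

def encode_null_terminated(text: str) -> bytes:
    """Mark the end of the data with a single all-bits-zero byte."""
    data = text.encode("utf-8")
    if b"\x00" in data:
        # An embedded zero byte would be mistaken for the terminator.
        raise ValueError("embedded zero byte would end the string early")
    return data + b"\x00"

def decode_null_terminated(buf: bytes) -> str:
    """Scan for the first zero byte; everything before it is the string."""
    end = buf.index(b"\x00")
    return buf[:end].decode("utf-8")
```

Decoding the length-prefixed form reads the length directly, whereas decoding the terminated form has to scan the data to find its end.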
 
=== Character encoding ===
Line 54:
=== Implementations ===
{{anchor|String Buffers}}
Some languages, such as [[C++]], [[Perl]] and [[Ruby (programming language)|Ruby]], normally allow the contents of a string to be changed after it has been created; these are termed ''mutable'' strings. In other languages, such as [[Java (programming language)|Java]], [[JavaScript]], [[Lua (programming language)|Lua]], [[Python (programming language)|Python]], and [[Go (programming language)|Go]], the value is fixed and a new string must be created if any alteration is to be made; these are termed ''immutable'' strings. Some of these languages with immutable strings also provide another type that is mutable, such as Java and [[.NET Framework|.NET]]'s {{Javadoc:SE|java/lang|StringBuilder}}, the thread-safe Java {{Javadoc:SE|java/lang|StringBuffer}}, and the [[Cocoa (API)|Cocoa]] <code>NSMutableString</code>. Immutability brings advantages and disadvantages: while immutable strings may require inefficiently creating many copies, they are simpler and fully [[Thread safety|thread-safe]].
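
As an illustration of immutable strings, the following sketch uses Python, one of the languages named above. The helper names are hypothetical. <code>io.StringIO</code> from the standard library stands in here for a mutable buffer, loosely comparable in role to the builder types mentioned above.

```python
import io

def try_in_place_change(text: str) -> bool:
    """Return True if a character could be overwritten in place."""
    try:
        text[0] = "X"  # Python strings do not support item assignment
    except TypeError:
        return False
    return True

def build_by_concatenation(parts) -> str:
    """Each step creates a new string object rather than changing the old one."""
    result = ""
    for part in parts:
        result = result + part
    return result

def build_with_buffer(parts) -> str:
    """Accumulate pieces in a mutable text buffer, then extract one string."""
    buffer = io.StringIO()
    for part in parts:
        buffer.write(part)
    return buffer.getvalue()

original = "string"
alias = original      # both names refer to the same immutable object
original += "s"       # rebinds `original` to a new string; `alias` is untouched
```

Because the old value is never modified, other references to it (such as <code>alias</code>) keep seeing the same contents, which is what makes sharing immutable strings between threads straightforward.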
 
Strings are typically implemented as [[array data type|arrays]] of bytes, characters, or code units, in order to allow fast access to individual units or substrings, including characters when they have a fixed length. A few languages such as [[Haskell (programming language)|Haskell]] implement them as [[linked list]]s instead.
 
Many high-level languages provide strings as a primitive data type, such as [[JavaScript]] and [[PHP]], while most others provide them as a composite data type, some with special language support in writing literals, for example, [[Java (programming language)|Java]] and [[C Sharp (programming language)|C#]].
 
Some languages, such as [[C (programming language)|C]], [[Prolog]] and [[Erlang (programming language)|Erlang]], avoid implementing a dedicated string datatype at all, instead adopting the convention of representing strings as lists of character codes. Even in programming languages having a dedicated string type, a string can usually be iterated as a sequence of character codes, like lists of integers or other values.
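
A short Python sketch shows a string being treated as a sequence of character codes. The function names are illustrative assumptions. In Python the codes obtained with <code>ord</code> are Unicode code points, while encoding the string first yields a sequence of byte values instead.

```python
def code_points(text: str) -> list:
    """Iterate over the string and collect each character's code point."""
    return [ord(ch) for ch in text]

def from_code_points(codes) -> str:
    """Rebuild a string from a list of integer code points."""
    return "".join(chr(code) for code in codes)

def utf8_bytes(text: str) -> list:
    """View the same string as a list of byte values under UTF-8."""
    return list(text.encode("utf-8"))
```

For example, <code>code_points("Hi!")</code> gives <code>[72, 105, 33]</code>, and <code>from_code_points([72, 105, 33])</code> gives back <code>"Hi!"</code>. For a non-ASCII character the two views differ: <code>code_points("é")</code> is <code>[233]</code>, while <code>utf8_bytes("é")</code> is <code>[195, 169]</code>.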