String (computer science): Difference between revisions

In [[computing]], a '''string''' (or '''string of characters''') is an aggregate data type used in most [[programming language]]s to represent text, and is the focus of this article.
 
The term '''string''' is also used in a broader sense to refer to any sequence of entities; for example, tokens in a language grammar, a sequence of states in an automaton, or a representation of [[DNA]]. See also the theory of [[computation]].
 
=== Representation ===
 
A common representation is an [[array]] of characters. The length can be stored implicitly, by using a special terminating character (often [[NUL]], ASCII code 0), which is the convention used by the [[C programming language]]; or explicitly, for example by prefixing the string with an integer length value, a convention used in [[Pascal programming language|Pascal]].
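The two conventions can be contrasted with a minimal sketch in Python. The function names are hypothetical, the text is assumed to be ASCII, and the length prefix is assumed to be a single byte (which limits such strings to 255 characters); real implementations differ in these details.

```python
def encode_nul_terminated(text: str) -> bytes:
    """Store the characters followed by a single NUL (0) byte."""
    data = text.encode("ascii")
    if 0 in data:
        raise ValueError("a NUL-terminated string cannot contain NUL")
    return data + b"\x00"


def decode_nul_terminated(buf: bytes) -> str:
    """Read characters up to, but not including, the first NUL byte."""
    end = buf.index(0)  # raises ValueError if no terminator is present
    return buf[:end].decode("ascii")


def encode_length_prefixed(text: str) -> bytes:
    """Store a one-byte length followed by the characters (an assumption)."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a one-byte prefix limits the length to 255")
    return bytes([len(data)]) + data


def decode_length_prefixed(buf: bytes) -> str:
    """Read the length from the first byte, then that many characters."""
    length = buf[0]
    return buf[1:1 + length].decode("ascii")
```

With the implicit convention the length must be found by scanning for the terminator, and the terminator value itself cannot appear inside the string. With the explicit convention the length is available immediately, but the size of the prefix bounds the maximum length.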
 
NUL is the name for the character in [[ASCII]] which has the numeric value of zero. In the C language, NUL is written '\0'.

Here is an example of a NUL-terminated string stored in a 10-[[Integral data types|byte]] buffer, along with its ASCII representation in hexadecimal:
 
<table cellspacing="0" cellpadding="2" border="1">
<tr><td>F</td> <td>R</td> <td>A</td> <td>N</td> <td>K</td> <td>&nbsp;</td> <td>k</td> <td>f</td> <td>f</td> <td>w</td> </tr>
<tr><td>46</td> <td>52</td> <td>41</td> <td>4E</td> <td>4B</td> <td>00</td> <td>6B</td> <td>66</td> <td>66</td> <td>77</td> </tr>
</table>
The table above shows how "FRANK" is stored as a NUL-terminated string in a 10-byte buffer. The length of the string is 5 characters, but note that it occupies 6 bytes, because the terminator takes up one byte. Characters after the terminator do not form part of the representation; they may be either part of another string or just garbage.
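The same buffer can be examined with a short Python sketch. The helper <code>nul_terminated_length</code> is a hypothetical name; it mimics, under simplifying assumptions, the scan that a NUL-terminated convention requires in order to find the length.

```python
# The 10-byte buffer from the table above, written as hexadecimal byte values.
buffer = bytes.fromhex("46 52 41 4E 4B 00 6B 66 66 77")


def nul_terminated_length(buf: bytes) -> int:
    """Count the bytes that precede the first NUL, scanning from the start."""
    length = 0
    while buf[length] != 0:  # raises IndexError if no terminator exists
        length += 1
    return length


length = nul_terminated_length(buffer)   # number of characters in the string
text = buffer[:length].decode("ascii")   # the characters before the terminator
bytes_occupied = length + 1              # the characters plus the terminator
leftover = buffer[bytes_occupied:]       # not part of the string
```

Here <code>length</code> is 5, <code>bytes_occupied</code> is 6, and <code>leftover</code> holds the four trailing bytes that the string does not own.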
Other representations are possible. Using [[tree]]s or [[list]]s makes certain string operations, such as character insertions or deletions, more efficient.
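As an illustration of a list-based representation, the following sketch stores one character per node of a singly linked list. The class and function names are hypothetical, and the design is deliberately minimal rather than a practical string type.

```python
class Node:
    """One character of a string stored as a singly linked list."""

    def __init__(self, char, next_node=None):
        self.char = char
        self.next = next_node


def from_text(text):
    """Build a linked list from a Python string; return the head (or None)."""
    head = None
    for char in reversed(text):
        head = Node(char, head)
    return head


def to_text(head):
    """Collect the characters of the list back into a Python string."""
    chars = []
    node = head
    while node is not None:
        chars.append(node.char)
        node = node.next
    return "".join(chars)


def insert_after(node, char):
    """Insert a character after `node` without moving any other character."""
    node.next = Node(char, node.next)


def delete_after(node):
    """Remove the character that follows `node`, if there is one."""
    if node.next is not None:
        node.next = node.next.next


head = from_text("FRNK")
insert_after(head.next, "A")  # insert after the "R"; the list now spells FRANK
```

Once the position is known, an insertion or deletion changes only one or two links, whereas an array of characters would have to shift everything after that position. Reaching a given position in the list still requires walking to it from the head.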
 
 
== String manipulation ==
 
The two most common operations on strings are [[string search algorithm|searching]] and sorting. Because strings are of such great practical value, many algorithms of varying efficiency have been developed for these operations.
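Both operations can be sketched briefly in Python. The function <code>find_naive</code> is a hypothetical name for the simplest possible search, which tries every starting position in turn; it is shown only as a baseline and is not one of the more efficient algorithms. The sorting lines use Python's built-in <code>sorted</code>.

```python
def find_naive(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        # Compare the pattern against the window that begins at `start`.
        if text[start:start + m] == pattern:
            return start
    return -1


words = ["sed", "awk", "SNOBOL", "Icon"]
by_code_point = sorted(words)                 # uppercase letters sort first
ignoring_case = sorted(words, key=str.lower)  # case-insensitive order
```

Sorting strings depends on the chosen ordering: comparing by character code places all uppercase ASCII letters before lowercase ones, which is why a case-insensitive key gives a different result.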
 
Advanced string algorithms often employ complex mechanisms and data structures, among them [[suffix tree]]s and [[finite state machine]]s.
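To give a flavour of the finite state machine approach, here is a small hand-built machine that reports whether the pattern "aba" occurs in a text. The pattern, the transition table and the function name are all illustrative assumptions; practical algorithms construct such tables automatically from the pattern.

```python
# Hypothetical transition table for the pattern "aba".
# State k means: k is the length of the longest prefix of the pattern
# that ends at the current position in the text.
TRANSITIONS = {
    0: {"a": 1},
    1: {"a": 1, "b": 2},
    2: {"a": 3},
}
ACCEPTING = 3


def contains_aba(text: str) -> bool:
    """Run the machine over text; True if "aba" occurs anywhere in it."""
    state = 0
    for char in text:
        if state == ACCEPTING:
            break  # the pattern has been found; the rest cannot change that
        # Any character without a listed transition sends the machine to 0.
        state = TRANSITIONS[state].get(char, 0)
    return state == ACCEPTING
```

The machine reads each character of the text exactly once and never backs up, which is the main attraction of this style of matching over retrying the pattern at every position.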
 
 
 
=== String utilities ===
 
Strings are such a useful datatype that several languages have been designed in order to make string processing applications easy to write. Examples include:
 
* [[awk]]
* [[Icon programming language|Icon]]
* [[sed]]
* [[SNOBOL]]
 
 
Many [[UNIX]] utilities perform simple string manipulations and can be used to easily program some powerful string processing algorithms. Files and finite streams may be viewed as strings.