== Representation in programming languages ==
A common representation is an [[array]] of [[character code]]s, each occupying one [[byte]] (e.g. in [[ASCII]]) or two bytes (e.g. in some [[Unicode]] encodings). The length can be stored implicitly, by using a special terminating character (often [[NUL]], ASCII code 0), which is the convention used by the [[C programming language]] (see [[C string]]); or explicitly, for example by prefixing the string with an [[integer]] length value, the convention used in [[Pascal programming language|Pascal]].
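The two length conventions can be compared in a minimal sketch that lays out the bytes by hand. The function names are hypothetical, and the single-byte length prefix (which limits a string to 255 characters) is an assumption made for illustration; actual implementations differ in the width of the prefix.

```python
def encode_nul_terminated(text: str) -> bytes:
    """C-style layout: the character codes followed by a terminating NUL."""
    data = text.encode("ascii")
    if b"\x00" in data:
        # The terminator cannot also appear inside the string.
        raise ValueError("a NUL-terminated string cannot contain NUL")
    return data + b"\x00"


def decode_nul_terminated(buf: bytes) -> str:
    """Find the length implicitly by scanning for the first NUL."""
    end = buf.index(b"\x00")
    return buf[:end].decode("ascii")


def encode_length_prefixed(text: str) -> bytes:
    """Pascal-style layout: a length value, then the character codes.

    Assumption: the length is stored in a single byte.
    """
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("length does not fit in a one-byte prefix")
    return bytes([len(data)]) + data


def decode_length_prefixed(buf: bytes) -> str:
    """Read the length explicitly from the prefix; no scan is needed."""
    length = buf[0]
    return buf[1:1 + length].decode("ascii")
```

In this sketch the terminated form must be scanned to find its length, while the prefixed form reads the length directly but caps it at what the prefix can hold.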
Other representations are possible. Using [[tree]]s and [[list]]s makes certain string operations, such as character insertions or deletions, more efficient than with an array, where all following characters must be shifted.
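As an illustration of a list-based representation, the following sketch stores one character per node of a singly linked list. The class and function names are hypothetical. Once the position is known, insertion and deletion only relink neighbouring nodes and move no other characters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    char: str
    next: Optional["Node"] = None


def from_text(text: str) -> Optional[Node]:
    """Build a linked list holding one character per node."""
    head = None
    for ch in reversed(text):
        head = Node(ch, head)
    return head


def to_text(head: Optional[Node]) -> str:
    """Walk the list and collect the characters back into a str."""
    parts = []
    node = head
    while node is not None:
        parts.append(node.char)
        node = node.next
    return "".join(parts)


def insert_after(node: Node, ch: str) -> None:
    """Insert ch right after node by relinking; nothing is shifted."""
    node.next = Node(ch, node.next)


def delete_after(node: Node) -> None:
    """Remove the character following node, if there is one."""
    if node.next is not None:
        node.next = node.next.next
```

The trade-off in this sketch is that reaching the n-th character requires walking n nodes, whereas an array allows direct indexing.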
== String utilities ==
Strings are such a useful datatype that several languages have been designed to make string-processing applications easy to write. Examples include:
* [[awk]]
* [[Icon programming language|Icon]]
Recent [[scripting language]]s, including [[Perl]], [[Python programming language|Python]] and [[Ruby programming language|Ruby]], employ [[regular expression]]s to facilitate text operations.
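A short sketch of regular expressions in Python, using the standard <code>re</code> module. The helper names and the sample patterns are chosen here only for illustration.

```python
import re


def extract_numbers(text: str) -> list[str]:
    """Return every maximal run of decimal digits found in text."""
    return re.findall(r"\d+", text)


def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)
```

For example, <code>extract_numbers("order 12  shipped   on day 7")</code> returns <code>['12', '7']</code>, and <code>collapse_whitespace</code> applied to the same input returns <code>'order 12 shipped on day 7'</code>.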
== String manipulation ==
== Algorithms ==
A variety of string-processing [[algorithm]]s exist, including:
* [[String searching algorithm]]s
* [[Regular expression algorithm]]s
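The simplest string searching approach can be sketched as follows. This is the naive method, which tries every alignment of the pattern against the text; it is shown only as a baseline and is not one of the more efficient algorithms. The function name is hypothetical.

```python
def find_naive(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Every possible starting position is tried in turn, so the worst case
    takes time proportional to len(text) * len(pattern).
    """
    m = len(pattern)
    for i in range(len(text) - m + 1):
        if text[i:i + m] == pattern:
            return i
    return -1
```

For example, <code>find_naive("abracadabra", "cad")</code> returns <code>4</code>, and a pattern that does not occur yields <code>-1</code>.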
== Strings in theoretical computer science ==
In theoretical [[computer science]], one starts with a [[empty set|non-empty]] [[finite]] [[set]] called the ''alphabet''; strings are then defined as finite sequences of elements from the alphabet, including the empty sequence. The set of all strings over a given alphabet, together with string concatenation, then forms a [[monoid]], in fact a free monoid. [[Formal language]]s, the central objects of study, are defined as [[subset]]s of this monoid.
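The monoid structure can be checked on a small finite sample. The sketch below enumerates all strings over an alphabet up to a given length and tests that concatenation is associative and that the empty string is an identity element. It checks only a sample, so it illustrates the laws and does not prove them. All names, and the example language of even-length strings, are hypothetical.

```python
from itertools import product


def strings_up_to(alphabet: set[str], max_len: int) -> list[str]:
    """All strings over alphabet of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    return [
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(symbols, repeat=n)
    ]


def satisfies_monoid_laws(sample: list[str]) -> bool:
    """Check identity and associativity of concatenation on the sample."""
    empty = ""
    identity_ok = all(empty + s == s and s + empty == s for s in sample)
    assoc_ok = all(
        (a + b) + c == a + (b + c)
        for a in sample for b in sample for c in sample
    )
    return identity_ok and assoc_ok


# A formal language is any subset of the set of all strings; as an
# example, the even-length strings within the finite sample.
sample = strings_up_to({"a", "b"}, 2)
even_length_language = {s for s in sample if len(s) % 2 == 0}
```

Here <code>sample</code> contains the seven strings of length at most two over the alphabet {a, b}, and <code>even_length_language</code> contains the empty string together with the four strings of length two.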