{{Short description|Data structure}}
{{Redirect|CString||C string (disambiguation)}}
{{see also|String (computer science)#Null-terminated}}
In [[computer programming]], a '''null-terminated string''' is a [[character string]] stored as an [[Array data structure|array]] containing the characters and terminated with a [[null character]] (<code>'\0'</code>, called NUL in [[ASCII]]). Alternative names are '''[[C string]]''', which refers to the [[C (programming language)|C programming language]], and '''ASCIIZ''' (although C can use encodings other than ASCII).
The length of a C string is found by searching for the (first) NUL byte. This can be slow as it takes O(''n'') ([[linear time]]) with respect to the string length. It also means that a string cannot contain a NUL character (there is a NUL in memory, but it is after the last character, not "in" the string).
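The linear search for the terminator can be illustrated with a minimal sketch. The function name <code>scan_length</code> is hypothetical and chosen for illustration only; it performs the same kind of scan as the standard <code>strlen</code>, but is not taken from any particular library implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustrative name): counts the bytes that
   precede the first NUL byte. The loop visits every character once,
   so the cost grows linearly with the string length. */
size_t scan_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Called on the array <code>"ab\0cd"</code>, this sketch returns 2: the scan stops at the first NUL, so the bytes after it are not part of the string.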
== History ==
Null-terminated strings were produced by the <code>.ASCIZ</code> directive of the [[PDP-11]] [[assembly language]]s and the <code>ASCIZ</code> directive of the [[MACRO-10]] macro assembly language for the [[PDP-10]]. These predate the development of the C programming language, but other forms of strings were often used.
At the time C (and the languages that it was derived from) was developed, memory was extremely limited, so using only one byte of overhead to store the length of a string was attractive. The only popular alternative at that time,
This had some influence on CPU [[instruction set]] design. Some CPUs in the 1970s and 1980s, such as the [[Zilog Z80]] and the [[Digital Equipment Corporation|DEC]] [[VAX]], had dedicated instructions for handling length-prefixed strings. However, as the
[[FreeBSD]] developer [[Poul-Henning Kamp]], writing in ''[[ACM Queue]]'',
== Limitations ==
While simple to implement, this representation has been prone to errors and performance problems.
The inability to store a
The speed problems with finding the length can usually be mitigated by combining it with another operation that is O(''n'') anyway, such as in <code>[[strlcpy]]</code>. However, this does not always result in an intuitive [[API]].
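As a rough illustration of combining the length search with another linear operation, the following sketch copies a string into a bounded buffer and reports the source length from the same pass. The function <code>copy_bounded</code> is a hypothetical name; its behaviour is only loosely modelled on <code>strlcpy</code> and it is not that function's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustrative name), loosely modelled on strlcpy:
   copies at most size - 1 bytes of src into dst, NUL-terminates dst
   when size > 0, and returns the full length of src. The length is
   obtained from the same single pass that performs the copy. */
size_t copy_bounded(char *dst, const char *src, size_t size)
{
    assert(src != NULL);
    assert(dst != NULL || size == 0);
    size_t i = 0;
    while (src[i] != '\0') {
        if (i + 1 < size) {
            dst[i] = src[i];   /* still room, keeping one byte for NUL */
        }
        i++;
    }
    if (size > 0) {
        dst[i < size ? i : size - 1] = '\0';
    }
    return i;   /* a result >= size means the copy was truncated */
}
```

The less intuitive part of such an interface is the return value: the caller has to compare it with the buffer size to detect truncation.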
== Character encodings ==
Null-terminated strings require that the encoding does not use a zero byte (0x00) anywhere
[[UTF-16]] uses 2-byte integers and as either byte may be zero (and in fact ''every other'' byte is, when representing ASCII text), cannot be stored in a null-terminated byte string. However, some languages implement a string of 16-bit [[UTF-16]] characters, terminated by a 16-bit NUL
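The effect of zero bytes in UTF-16 text can be shown with a small sketch. Both function names are hypothetical: <code>byte_length</code> scans for a zero byte, as a byte-oriented string routine would, while <code>unit_length</code> scans for a zero 16-bit code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte-oriented scan that stops at the first
   zero byte. */
size_t byte_length(const unsigned char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Hypothetical helper: scan over 16-bit code units that stops at the
   first zero code unit. It counts code units, not characters. */
size_t unit_length(const uint16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

For the text "Hi" stored as little-endian UTF-16 bytes <code>48 00 69 00 00 00</code>, <code>byte_length</code> returns 1 because the second byte is already zero, whereas <code>unit_length</code> applied to the code units <code>0x0048, 0x0069, 0x0000</code> returns 2.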
== Improvements ==
Many attempts to make C string handling less error prone have been made. One strategy is to add safer functions such as <code>[[strdup]]</code> and <code>[[strlcpy]]</code>, whilst [[C standard library#Buffer overflow vulnerabilities
Most modern libraries replace C strings with a structure containing a 32-bit or larger length value (far larger than was ever considered for length-prefixed strings), and often add another pointer, a reference count, and even a NUL to speed up conversion back to a C string. Memory is now far larger, so the addition of 3 (or 16, or more) bytes to each string is a real problem only when the software deals with so many small strings that some other storage method would save even more memory (for instance, there may be so many duplicates that a [[hash table]] would use less memory). Examples include the [[C++]] [[Standard Template Library]] <code>[[String (C++)|std::string]]</code>, the [[Qt (toolkit)|Qt]] <code>QString</code>, the [[Microsoft Foundation Class Library|MFC]] <code>CString</code>, and the C-based implementation <code>CFString</code> from [[Core Foundation]] as well as its [[Objective-C]] sibling <code>NSString</code> from [[Foundation Kit|Foundation]], both by Apple. More complex structures, such as the [[rope (computer science)|rope]], may also be used to store strings.
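A minimal sketch of such a structure is shown below. The type <code>counted_string</code> and its functions are hypothetical names invented for illustration; they do not reproduce the layout of any of the libraries listed above, which typically add further fields such as a capacity or reference count.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: an explicit length plus a pointer to the bytes. */
typedef struct {
    size_t length;  /* number of bytes, excluding the trailing NUL */
    char  *data;    /* heap buffer of length + 1 bytes */
} counted_string;

/* Builds a string from an explicit byte count, so embedded NUL bytes
   are allowed. A trailing NUL is still stored so that data can be
   handed to functions expecting a C string without copying.
   Returns 0 on success and -1 on failure. */
int counted_string_init(counted_string *out, const char *bytes, size_t length)
{
    assert(out != NULL);
    assert(bytes != NULL || length == 0);
    if (length == SIZE_MAX) {
        return -1;              /* length + 1 would overflow */
    }
    char *buf = malloc(length + 1);
    if (buf == NULL) {
        return -1;
    }
    if (length > 0) {
        memcpy(buf, bytes, length);
    }
    buf[length] = '\0';
    out->length = length;
    out->data = buf;
    return 0;
}

/* Releases the buffer and resets the structure. */
void counted_string_free(counted_string *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->length = 0;
}
```

With this layout the length is available in constant time, and a string initialised from the five bytes <code>"ab\0cd"</code> keeps all five, although <code>strlen</code> on its <code>data</code> pointer would still report 2.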
{{CProLang}}
{{Data types}}
{{Use dmy dates|date=January 2011}}
[[Category:String data structures]]