Talk:String (computer science): Difference between revisions

String length: new section
 
{{Talk header}}
{{WikiProject banner shell|class=Start|
{{WikiProject Computer science|class=start|importance=high}}
}}
{{afd-merged-from|String Buffer|String Buffer|04 June 2013}}
{{annual readership}}
 
== Other related topics ==
 
:Presumably it comes from the rather obvious expression "a string of characters" (as in "these go some characters stringing by"), equivalent to "a string of pearls" or "a sequence of characters" or other similar phrases. — [[User:Loadmaster|Loadmaster]] ([[User talk:Loadmaster|talk]]) 16:27, 9 February 2008 (UTC)
 
:I heard that it originated because in the old days of physical type-setting, the type was held together in groups by literal string (rope). I don't have any references for this, though, so I can't back it up. [[User:Showeropera|Showeropera]] ([[User talk:Showeropera|talk]]) 20:48, 14 December 2017 (UTC)
 
== Trying to stop misuse of character encodings ==
I would like a citation on that, what about languages with lazy evaluation like Clojure and Haskell?
 
<syntaxhighlight lang="haskell">
cycle "Is this finite? "
⇒ "Is this finite? Is this finite? Is this finite? Is this finite? ..."

let shouting = 'a' : shouting in putStr shouting
⇒ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa..."
</syntaxhighlight>
 
—[[User:BiT|BiT]] ([[User talk:BiT|talk]]) 03:00, 8 December 2011 (UTC)
 
::: The problem is that you're giving a very specific example but state it in such a form that it&mdash;at least at first reading&mdash;appears to be a very general definition. You're either using too much formal machinery for what is an informal statement or, conversely, making a statement that is not precise enough to be a formal definition. This is how one of my formal language textbooks defines a reverse:
:::<blockquote>The '''reverse''' of a string is obtained by writing the symbols in reverse order; if ''w'' is a string as shown above, then its reverse ''w''<sup>R</sup> is<br><div class="center">''w''<sup>R</sup> = ''a''<sub>''n''</sub>...''a''<sub>2</sub>''a''<sub>1</sub>.</div></blockquote>
::: Where they explained "above" that ''a'', ''b'', ''c'', ... denote elements from the alphabet &Sigma; and ''u'', ''v'', ''w'', ... strings over that alphabet. —''[[User:Ruud Koot|Ruud]]'' 18:45, 13 November 2012 (UTC)
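:::: For illustration only, here is a minimal sketch of the textbook definition quoted above, next to an inductive variant (the empty string reverses to itself, and (''wa'')<sup>R</sup> = ''a'' ''w''<sup>R</sup>). The function names are hypothetical, and the inductive form is an assumption about what "more formal machinery" might look like, not something taken from the textbook.

```python
def reverse_by_position(w: str) -> str:
    """w^R = a_n ... a_2 a_1: write the symbols in reverse order."""
    return w[::-1]


def reverse_inductive(w: str) -> str:
    """Inductive variant: the empty string is its own reverse,
    and (w a)^R = a w^R for a string w and a symbol a."""
    if not w:
        return w
    return w[-1] + reverse_inductive(w[:-1])
```

:::: Both functions agree on every finite string, which is the sense in which the informal and the formal statement describe the same operation.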
 
 
::It does sound like there are 1 or 2 extra bits per character. Are you saying there was no way for a program to read or write these extra bits? Or that the implementation was somehow different from having extra bits per character (perhaps it was a table of locations with the bit "set" and thus you were restricted to how many times it was turned on)? I think it is obvious that instructions designed to use these bits to end strings won't work but that is not an explanation as to why this extra storage was not taken advantage of. It is also surprising that they would in effect reserve 1/4 of their memory for such a limited use, when you consider how incredibly expensive the memory was at that time. [[User:Spitzak|Spitzak]] ([[User talk:Spitzak|talk]]) 17:36, 24 April 2017 (UTC)
 
=== Another length prefixed representation ===
Siemens PLCs use a form of length prefixed string representation with 2 length information bytes (see [https://cache.industry.siemens.com/dl/files/480/22506480/att_105176/v1/s7_scl_string_parameterzuweisung_e.pdf Siemens Docs "Working with Strings in S7-SCL"]). Maximum reserved memory is 256 bytes with maximum 254 bytes of actual text, where one byte denotes the allocated/reserved range for the string (the maximum count of characters allowed to be represented) and the other byte denotes the actual, currently valid length of the string. Maybe this could be added as length prefixed representation variant? --[[User:Ckonnerth|Ckonnerth]] ([[User talk:Ckonnerth|talk]]) 17:18, 15 December 2017 (UTC)
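:A rough sketch of the layout described above may help when deciding how to word it in the article. It assumes that the first header byte holds the reserved (maximum) length and the second the current length, and that the text is plain ASCII. The comment above does not state the byte order or the encoding, so both are assumptions, and the function names are hypothetical.

```python
HEADER_BYTES = 2        # one byte for the reserved length, one for the current length
MAX_TEXT_BYTES = 254    # 256 bytes reserved in total, minus the two header bytes


def s7_encode(text: str, max_len: int = MAX_TEXT_BYTES) -> bytes:
    """Pack text as [max_len][cur_len][payload padded to max_len]."""
    if not 0 <= max_len <= MAX_TEXT_BYTES:
        raise ValueError("reserved length must be between 0 and 254")
    data = text.encode("ascii")  # assumption: single-byte ASCII characters
    if len(data) > max_len:
        raise ValueError("text does not fit in the reserved length")
    # The storage always occupies max_len payload bytes, used or not.
    return bytes([max_len, len(data)]) + data.ljust(max_len, b"\x00")


def s7_decode(buf: bytes) -> str:
    """Read back only the currently valid part of the string."""
    max_len, cur_len = buf[0], buf[1]
    if cur_len > max_len or len(buf) < HEADER_BYTES + cur_len:
        raise ValueError("inconsistent length header")
    return buf[HEADER_BYTES:HEADER_BYTES + cur_len].decode("ascii")
```

:For example, <code>s7_encode("HELLO", 10)</code> occupies 12 bytes: two header bytes (10 and 5) followed by ten payload bytes, of which only the first five are currently valid.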
 
== DNA?? ==
 
Wondering why there's a bio-related image on this article's page, I don't see how it depicts what strings actually are in computer science. <!-- Template:Unsigned IP --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/67.165.80.152|67.165.80.152]] ([[User talk:67.165.80.152#top|talk]]) 05:19, 30 March 2020 (UTC)</small> <!--Autosigned by SineBot-->
:I've changed the image to diagram a string. Though I haven't figured out how to make it the page image yet. [[User:TripleShortOfACycle|TripleShortOfACycle]] ([[User talk:TripleShortOfACycle|talk]] - [[Special:Contributions/TripleShortOfACycle|contribs]]) - (she/her/hers) 14:19, 31 January 2021 (UTC)
::Now the page thumbnail works! It displays a diagram of a string when links to this page are hovered over. [[User:TripleShortOfACycle|TripleShortOfACycle]] ([[User talk:TripleShortOfACycle|talk]] - [[Special:Contributions/TripleShortOfACycle|contribs]]) - (she/her/hers) 14:30, 31 January 2021 (UTC)
 
== Distinct, unambiguous symbols ==
 
As far as I know, it is also required that each string can be uniquely decomposed into its symbols. For example, if the alphabet itself consists of strings (as in [[Free_monoid#Free_generators_and_rank]], or in the lead of [[Alphabet (formal languages)]], with Σ = {"0", "00"}), its symbols are distinct and unambiguous (as are the members of each mathematical set), but nevertheless, a string may be composed in different ways. I guess "unambiguous" is supposed to express the requirement of unique decomposition, but I'm not sure it is precise enough. The decomposition must be unambiguous, rather than just the symbols. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 18:03, 13 May 2024 (UTC)
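:To make the ambiguity concrete, here is a small sketch that counts how many ways a string can be split into symbols of a given alphabet. The function name is hypothetical and the code is only meant to illustrate the point above, not to suggest wording for the article.

```python
def count_decompositions(s: str, alphabet: set[str]) -> int:
    """Count the ways s can be written as a concatenation of alphabet symbols."""
    # ways[i] is the number of decompositions of the prefix s[:i].
    ways = [0] * (len(s) + 1)
    ways[0] = 1  # the empty prefix has exactly one (empty) decomposition
    for end in range(1, len(s) + 1):
        for symbol in alphabet:
            # Skip empty symbols; they would allow infinitely many decompositions.
            if symbol and s.endswith(symbol, 0, end):
                ways[end] += ways[end - len(symbol)]
    return ways[len(s)]
```

:With Σ = {"0", "00"}, the string "000" has three decompositions ("0"+"0"+"0", "0"+"00", "00"+"0"), so <code>count_decompositions("000", {"0", "00"})</code> returns 3, whereas over {"0", "1"} every string has exactly one.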
 
== Traditionally? ==
 
WRT "In computer programming, a string is traditionally a sequence of characters..." What does 'traditionally' imply? What does string mean in a non-traditional sense? How is traditionality relevant? IMO it is a sequence of chars (period). [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 14:03, 20 December 2024 (UTC)
 
== String is not a data type ==
 
WRT "A string is generally considered as a data type"
 
Can't argue that string is a type of data, but string is not a [[data type]]. Maybe that's a subtle difference to some, but there's an important difference. String is a higher level concept than data type as it pertains to programming. Many programming contexts (i.e. languages) have a string data type (or multiple). But there's a significant difference between string data and a type for string data.
 
To illustrate the difference between string data and data type, consider C. It has no string type. The most commonly used data type for string data is char*; pointer to char. That is not a string type, yet it is used for string data. Note that char* can be used for non-string data; a pointer to a single char storage, for example. FWIW, the [[data structure]] is called [[null-terminated string]] or c-string.
 
What is this article about? Is it about the concept of string in general (string data)? Or about particular data types in particular languages and contexts? I assume the intention is both. But, the two should not be conflated. It should say that a string is a sequence of characters and that many languages define a type for string data. It should not say that string ''is'' a data type.
 
TBO this article provides little value and should be deleted, but I'm sure folks don't like that idea. But, if it's going to exist, it shouldn't misrepresent the world. [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 13:19, 10 May 2025 (UTC)
 
:I think you have a point here, and I tried to fix the lead accordingly. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 16:18, 11 May 2025 (UTC)
: There are other languages than C. In some of them, strings are a native data type. You can make a similar argument about arrays. Citing a WP article to show that strings aren't a [[data type]] might carry more weight if that article didn't include them [[data type#String and text types]]. [[User:Andy Dingley|Andy Dingley]] ([[User talk:Andy Dingley|talk]]) 16:45, 11 May 2025 (UTC)
 
== String length ==
 
The description in {{alink||String length}} is overly simplistic and is incorrect for, e.g., [[PL/I]].
I propose changing {{blockquote|In general, there are two types of string datatypes: ''fixed-length strings'', which have a fixed maximum length to be determined at [[compile time]] and which use the same amount of memory whether this maximum is needed or not, and ''variable-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]).}}
to {{blockquote|In general, there are three types of string datatypes: ''fixed-length strings'', which have a fixed length to be determined at [[compile time]] or [[block (programming)|block]] entry, ''variable-length strings'', which have a fixed maximum length to be determined at [[compile time]] or [[block (programming)|block]] entry and which use the same amount of memory whether this maximum is needed or not, and ''dynamic-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]).}}
 
I was tempted to cite languages with each type of string, but that might be [[information overload|TMI]].
 
I suspect that most modern programming languages have dynamic-length strings, so the rest of the paragraph may also need changes. -- [[User:Chatul|Shmuel (Seymour J.) Metz Username:Chatul]] ([[User talk:Chatul|talk]]) 12:51, 11 August 2025 (UTC)
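:A rough sketch of the three categories in the proposed wording, in case it helps the discussion. The class names are hypothetical, and the pad-with-blanks and truncate-on-overflow behaviour is an assumption made for the illustration; it is not a claim about how any particular language handles assignment.

```python
class FixedString:
    """Fixed-length: the value always has exactly `length` characters."""

    def __init__(self, length: int, text: str = "") -> None:
        self.length = length
        self.assign(text)

    def assign(self, text: str) -> None:
        # Assumption: shorter text is padded with blanks, longer text is truncated.
        self.value = text[:self.length].ljust(self.length)


class VaryingString:
    """Variable-length: the current length varies, bounded by a fixed maximum."""

    def __init__(self, max_length: int, text: str = "") -> None:
        self.max_length = max_length  # storage for this many characters is reserved
        self.assign(text)

    def assign(self, text: str) -> None:
        # Assumption: text longer than the maximum is truncated.
        self.value = text[:self.max_length]


# Dynamic-length: no declared bound; storage follows the current value.
# Python's built-in str behaves this way.
dynamic = "ab" * 1000
```

:Here <code>FixedString(5, "ab").value</code> is <code>"ab&nbsp;&nbsp;&nbsp;"</code> (always five characters), while <code>VaryingString(5, "ab").value</code> is <code>"ab"</code> with room reserved for five.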