Talk:String (computer science): Difference between revisions

Ruud Koot (talk | contribs)
String length: new section
 
{{Talk header}}
{{WikiProject banner shell|class=Start|
{{WikiProject Computer science|importance=high}}
}}
{{afd-merged-from|String Buffer|String Buffer|04 June 2013}}
{{annual readership}}
 
== Other related topics ==
 
:Presumably it comes from the rather obvious expression "a string of characters" (as in "these go some characters stringing by"), equivalent to "a string of pearls" or "a sequence of characters" or other similar phrases. — [[User:Loadmaster|Loadmaster]] ([[User talk:Loadmaster|talk]]) 16:27, 9 February 2008 (UTC)
 
:I heard that it originated because in the old days of physical type-setting, the type was held together in groups by literal string (rope). I don't have any references for this, though, so I can't back it up. [[User:Showeropera|Showeropera]] ([[User talk:Showeropera|talk]]) 20:48, 14 December 2017 (UTC)
 
== Trying to stop misuse of character encodings ==
I would like a citation on that. What about languages with lazy evaluation, like Clojure and Haskell?
 
<syntaxhighlight lang="haskell">
cycle "Is this finite? "
-- ⇒ "Is this finite? Is this finite? Is this finite? Is this finite? ..."

let shouting = 'a' : shouting in putStr shouting
-- ⇒ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa..."
</syntaxhighlight>
 
—[[User:BiT|BiT]] ([[User talk:BiT|talk]]) 03:00, 8 December 2011 (UTC)
 
::: The problem is that you're giving a very specific example but stating it in such a form that it&mdash;at least at first reading&mdash;appears to be a very general definition. You're either using too much formal machinery for what is an informal statement or, conversely, making a statement that is not precise enough to be a formal definition. This is how one of my formal language textbooks defines a reverse:
:::<blockquote>The '''reverse''' of a string is obtained by writing the symbols in reverse order; if ''w'' is a string as shown above, then its reverse ''w''<sup>R</sup> is<br><div class="center">''w''<sup>R</sup> = ''a''<sub>''n''</sub>...''a''<sub>2</sub>''a''<sub>1</sub>.</div></blockquote>
::: Where they explained "above" that ''a'', ''b'', ''c'', ... denote elements from the alphabet &Sigma; and ''u'', ''v'', ''w'', ... strings over that alphabet. —''[[User:Ruud Koot|Ruud]]'' 18:45, 13 November 2012 (UTC)
 
:::: I went ahead and added a "Reversal" subsection to the article, with (hopefully) simplified language. — [[User:Loadmaster|Loadmaster]] ([[User talk:Loadmaster|talk]]) 22:25, 15 November 2012 (UTC)
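For illustration only, the textbook definition quoted above (''w''<sup>R</sup> = ''a''<sub>''n''</sub>...''a''<sub>2</sub>''a''<sub>1</sub>) can be sketched in Python. The function name <code>reverse</code> and the sample string are assumptions made for this sketch; they are not taken from the article or the textbook.

```python
def reverse(w: str) -> str:
    """Return w^R: the symbols of w written in reverse order."""
    # w = a_1 a_2 ... a_n  ->  w^R = a_n ... a_2 a_1
    # Walk the indices from the last symbol down to the first.
    return "".join(w[i] for i in range(len(w) - 1, -1, -1))


print(reverse("abc"))  # prints: cba
```

Reversing twice gives back the original string, and the empty string is its own reverse, which matches the informal reading of the definition.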
 
== External links modified ==
 
Hello fellow Wikipedians,
 
I have just modified {{plural:1|one external link|1 external links}} on [[String (computer science)]]. Please take a moment to review [https://en.wikipedia.org/w/index.php?diff=prev&oldid=713479372 my edit]. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit [[User:Cyberpower678/FaQs#InternetArchiveBot|this simple FaQ]] for additional information. I made the following changes:
*Corrected formatting/usage for http://www.wearmouth.demon.co.uk/zx80.htm
 
When you have finished reviewing my changes, please set the ''checked'' parameter below to '''true''' or '''failed''' to let others know (documentation at {{tlx|Sourcecheck}}).
 
{{sourcecheck|checked=false}}
 
Cheers.—[[User:Cyberbot II|<sup style="color:green;font-family:Courier">cyberbot II</sup>]]<small><sub style="margin-left:-14.9ex;color:green;font-family:Comic Sans MS">[[User talk:Cyberbot II|<span style="color:green">Talk to my owner</span>]]:Online</sub></small> 08:39, 4 April 2016 (UTC)
 
== Discussion to move "String" to "String (disambiguation)" ==
 
In order to make way for moving [[Draft:String]] to article space to take the place as the primary topic, I've posted a proposal at '''[[Talk:String#Requested move 16 January 2017]]''' to move the disambiguation page currently at "[[String]]" to "[[String (disambiguation)]]". Your input would be helpful to establish a common consensus on whether or not this move, or something else, should be done. I look forward to your thoughts on the matter. [[User talk:The Transhumanist|<i>The&nbsp;Transhumanist</i>]] 22:50, 16 January 2017 (UTC)
 
== String length ==
 
In section '''String datatypes'''/'''Representations'''/'''Null-terminated''' the IBM 1401 word-mark terminated string is discussed.
:Somewhat similar, "data processing" machines like the [[IBM 1401]] used a special [[Word mark (computer hardware)|word mark]] bit to delimit strings at the left, where the operation would start at the right. This bit had to be clear in all other parts of the string. This meant that, while the IBM 1401 had a seven-bit word, almost no-one ever thought to use this as a feature, and override the assignment of the seventh bit to (for example) handle ASCII codes.
That seventh-bit idea could not have been implemented: the word mark bit is implemented in hardware. The MCW ('''M'''ove '''C'''haracters '''W'''ordmark) instruction, for instance, moved variable-length fields terminating on the word mark. Numeric and alphabetic fields were treated no differently. The Honeywell H200, H1200, H3200 and H4200 all had MCW instructions. Arithmetic operations also used word-mark field demarcation. The Honeywell computers had 8-bit memory, each character holding 6 data bits, a word mark bit and an item mark bit.
[[User:Steamerandy|Steamerandy]] ([[User talk:Steamerandy|talk]]) 17:26, 24 April 2017 (UTC)
 
::It does sound like there are 1 or 2 extra bits per character. Are you saying there was no way for a program to read or write these extra bits? Or that the implementation was somehow different from having extra bits per character (perhaps it was a table of locations with the bit "set" and thus you were restricted to how many times it was turned on)? I think it is obvious that instructions designed to use these bits to end strings won't work, but that is not an explanation as to why this extra storage was not taken advantage of. It is also surprising that they would in effect reserve 1/4 of their memory for such a limited use, when you consider how incredibly expensive the memory was at that time. [[User:Spitzak|Spitzak]] ([[User talk:Spitzak|talk]]) 17:36, 24 April 2017 (UTC)
 
=== Another length prefixed representation ===
Siemens PLCs use a form of length-prefixed string representation with 2 length information bytes (see [https://cache.industry.siemens.com/dl/files/480/22506480/att_105176/v1/s7_scl_string_parameterzuweisung_e.pdf Siemens Docs "Working with Strings in S7-SCL"]). The maximum reserved memory is 256 bytes, with at most 254 bytes of actual text: one byte denotes the allocated/reserved range for the string (the maximum count of characters allowed to be represented) and the other byte denotes the actual, currently valid length of the string. Maybe this could be added as a length-prefixed representation variant? --[[User:Ckonnerth|Ckonnerth]] ([[User talk:Ckonnerth|talk]]) 17:18, 15 December 2017 (UTC)
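A minimal Python sketch of the two-byte-header layout described above may make the proposal easier to evaluate. It is not Siemens code. The function names, the byte order (reserved length first, current length second), the zero padding and the one-byte-per-character ASCII encoding are all assumptions of this sketch.

```python
def encode_prefixed(text: str, max_len: int = 254) -> bytes:
    """Pack text as [max_len][cur_len][characters...][padding].

    Assumed layout: byte 0 = reserved length, byte 1 = current length.
    """
    if not 0 <= max_len <= 254:
        raise ValueError("max_len must be between 0 and 254")
    data = text.encode("ascii")  # assumption: one byte per character
    if len(data) > max_len:
        raise ValueError("text is longer than the reserved length")
    padding = bytes(max_len - len(data))  # unused reserved bytes, zero-filled
    return bytes([max_len, len(data)]) + data + padding


def decode_prefixed(buf: bytes) -> str:
    """Read back only the currently valid characters."""
    max_len, cur_len = buf[0], buf[1]
    if cur_len > max_len:
        raise ValueError("current length exceeds reserved length")
    return buf[2:2 + cur_len].decode("ascii")


buf = encode_prefixed("PUMP", max_len=10)
print(len(buf))              # prints: 12 (2 header bytes + 10 reserved bytes)
print(decode_prefixed(buf))  # prints: PUMP
```

With the default reserved length of 254, the buffer is 256 bytes long, which agrees with the figures given above.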
 
== DNA?? ==
 
I'm wondering why there's a bio-related image on this article's page; I don't see how it depicts what strings actually are in computer science. <!-- Template:Unsigned IP --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/67.165.80.152|67.165.80.152]] ([[User talk:67.165.80.152#top|talk]]) 05:19, 30 March 2020 (UTC)</small> <!--Autosigned by SineBot-->
:I've changed the image to diagram a string. Though I haven't figured out how to make it the page image yet. [[User:TripleShortOfACycle|TripleShortOfACycle]] ([[User talk:TripleShortOfACycle|talk]] - [[Special:Contributions/TripleShortOfACycle|contribs]]) - (she/her/hers) 14:19, 31 January 2021 (UTC)
::Now the page thumbnail works! It displays a diagram of a string when links to this page are hovered over. [[User:TripleShortOfACycle|TripleShortOfACycle]] ([[User talk:TripleShortOfACycle|talk]] - [[Special:Contributions/TripleShortOfACycle|contribs]]) - (she/her/hers) 14:30, 31 January 2021 (UTC)
 
== Distinct, unambiguous symbols ==
 
As far as I know, it is also required that each string can be uniquely decomposed into its symbols. For example, if the alphabet itself consists of strings (as in [[Free_monoid#Free_generators_and_rank]], or in the lead of [[Alphabet (formal languages)]], with Σ = {"0", "00"}), its symbols are distinct and unambiguous (as are the members of each mathematical set), but nevertheless, a string may be composed in different ways. I guess "unambiguous" is supposed to express the requirement of unique decomposition, but I'm not sure it is precise enough. The decomposition must be unambiguous, rather than just the symbols. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 18:03, 13 May 2024 (UTC)
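The ambiguity described above can be checked mechanically. The following Python sketch enumerates every way to split a string into symbols of a given alphabet; the function name <code>decompositions</code> is hypothetical and the code is only meant to illustrate the Σ = {"0", "00"} example, not to propose wording for the article.

```python
def decompositions(s: str, alphabet: set) -> list:
    """Return every way to write s as a sequence of symbols from alphabet."""
    if s == "":
        return [[]]  # the empty string has exactly one (empty) decomposition
    results = []
    for symbol in sorted(alphabet):  # sorted only to make the order repeatable
        # Skip the empty symbol to avoid infinite recursion.
        if symbol and s.startswith(symbol):
            for rest in decompositions(s[len(symbol):], alphabet):
                results.append([symbol] + rest)
    return results


for parts in decompositions("000", {"0", "00"}):
    print(parts)
# prints:
# ['0', '0', '0']
# ['0', '00']
# ['00', '0']
```

With the alphabet {"0", "1"} the same call returns exactly one decomposition for any binary string, which is the unique-decomposition property the comment above asks for.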
 
== Traditionally? ==
 
WRT "In computer programming, a string is traditionally a sequence of characters..." What does 'traditionally' imply? What does string mean in a non-traditional sense? How is traditionality relevant? IMO it is a sequence of chars (period). [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 14:03, 20 December 2024 (UTC)
 
== String is not a data type ==
 
WRT "A string is generally considered as a data type"
 
Can't argue that string is a type of data, but string is not a [[data type]]. Maybe that's a subtle difference to some, but there's an important difference. String is a higher level concept than data type as it pertains to programming. Many programming contexts (i.e. languages) have a string data type (or multiple). But there's a significant difference between string data and a type for string data.
 
To illustrate the difference between string data and data type, consider C. It has no string type. The most commonly used data type for string data is char*; pointer to char. That is not a string type, yet it is used for string data. Note that char* can be used for non-string data; a pointer to a single char storage, for example. FWIW, the [[data structure]] is called [[null-terminated string]] or c-string.
 
What is this article about? Is it about the concept of string in general (string data)? Or about particular data types in particular languages and contexts? I assume the intention is both. But, the two should not be conflated. It should say that a string is a sequence of characters and that many languages define a type for string data. It should not say that string ''is'' a data type.
 
TBH this article provides little value and should be deleted, but I'm sure folks don't like that idea. But, if it's going to exist, it shouldn't misrepresent the world. [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 13:19, 10 May 2025 (UTC)
 
:I think you have a point here, and I tried to fix the lead accordingly. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 16:18, 11 May 2025 (UTC)
: There are other languages than C. In some of them, strings are a native data type. You can make a similar argument about arrays. Citing a WP article to show that strings aren't a [[data type]] might carry more weight if that article didn't include them (see [[data type#String and text types]]). [[User:Andy Dingley|Andy Dingley]] ([[User talk:Andy Dingley|talk]]) 16:45, 11 May 2025 (UTC)
 
== String length ==
 
The description in {{alink||String length}} is overly simplistic and is incorrect for, e.g., [[PL/I]].
I propose changing {{blockquote|In general, there are two types of string datatypes: ''fixed-length strings'', which have a fixed maximum length to be determined at [[compile time]] and which use the same amount of memory whether this maximum is needed or not, and ''variable-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]).}}
to {{blockquote|In general, there are three types of string datatypes: ''fixed-length strings'', which have a fixed length to be determined at [[compile time]] or [[block (programming)|block]] entry, ''variable-length strings'', which have a fixed maximum length to be determined at [[compile time]] or [[block (programming)|block]] entry and which use the same amount of memory whether this maximum is needed or not, and ''dynamic-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]).}}
 
I was tempted to cite languages with each type of string, but that might be [[information overload|TMI]].
 
I suspect that most modern programming languages have dynamic-length strings, so the rest of the paragraph may also need changes. -- [[User:Chatul|Shmuel (Seymour J.) Metz Username:Chatul]] ([[User talk:Chatul|talk]]) 12:51, 11 August 2025 (UTC)
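To make the three proposed categories concrete, here is a rough Python sketch. The class names are hypothetical, and the blank padding and silent truncation on assignment are assumptions of this sketch rather than a description of any particular language.

```python
class FixedString:
    """Fixed-length: always exactly n characters."""

    def __init__(self, n: int, value: str = ""):
        self.n = n
        self.assign(value)

    def assign(self, value: str) -> None:
        # Assumption: truncate on the right, pad with blanks on the right.
        self.chars = value[:self.n].ljust(self.n)

    def __len__(self) -> int:
        return self.n


class VaryingString:
    """Variable-length: storage for max_len characters is always reserved,
    but the current length may be anything from 0 to max_len."""

    def __init__(self, max_len: int, value: str = ""):
        self.max_len = max_len
        self.buffer = [" "] * max_len  # reserved whether needed or not
        self.length = 0
        self.assign(value)

    def assign(self, value: str) -> None:
        value = value[:self.max_len]     # assumption: silent truncation
        self.buffer[:len(value)] = value  # overwrite in place, size unchanged
        self.length = len(value)

    def value(self) -> str:
        return "".join(self.buffer[:self.length])

    def __len__(self) -> int:
        return self.length


fixed = FixedString(5, "ab")
varying = VaryingString(5, "ab")
dynamic = "ab"          # a Python str behaves like a dynamic-length string
dynamic += "cdefgh"     # grows as needed at run time

print(repr(fixed.chars), len(fixed))                          # prints: 'ab   ' 5
print(repr(varying.value()), len(varying), len(varying.buffer))  # prints: 'ab' 2 5
print(dynamic, len(dynamic))                                  # prints: abcdefgh 8
```

The point of the sketch is only that the second category reserves a fixed amount of storage while reporting a varying length, which is what separates it from both of the other two.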