Talk:String (computer science): Difference between revisions

String length: PL/I (F)
 
=== Another length prefixed representation ===
Siemens PLCs use a length-prefixed string representation with two length bytes (see [https://cache.industry.siemens.com/dl/files/480/22506480/att_105176/v1/s7_scl_string_parameterzuweisung_e.pdf Siemens Docs "Working with Strings in S7-SCL"]). The maximum reserved memory is 256 bytes, of which at most 254 bytes hold actual text: one byte gives the reserved length of the string (the maximum number of characters it can hold) and the other byte gives its current valid length. Could this be added as a variant of the length-prefixed representation? --[[User:Ckonnerth|Ckonnerth]] ([[User talk:Ckonnerth|talk]]) 17:18, 15 December 2017 (UTC)
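:A minimal sketch of the layout described above, in Python rather than SCL. It assumes that the reserved length is the first byte and the current length the second, that unused bytes are zero-filled, and that the text is Latin-1; the function names are hypothetical and are not Siemens APIs.
:<syntaxhighlight lang=python>
def encode_s7_string(text: str, max_len: int = 254) -> bytes:
    """Build [max_len][cur_len][text, zero-padded to max_len bytes]."""
    if not 0 <= max_len <= 254:
        raise ValueError("reserved length must fit the 254-byte limit")
    data = text.encode("latin-1")  # assumption: one byte per character
    if len(data) > max_len:
        raise ValueError("text longer than the reserved length")
    # Two length bytes, then the text padded out to the reserved size.
    return bytes([max_len, len(data)]) + data.ljust(max_len, b"\x00")


def decode_s7_string(buf: bytes) -> tuple[int, str]:
    """Return (reserved length, current text) from an encoded buffer."""
    if len(buf) < 2:
        raise ValueError("need at least the two length bytes")
    max_len, cur_len = buf[0], buf[1]
    if cur_len > max_len:
        raise ValueError("current length exceeds reserved length")
    if len(buf) < 2 + cur_len:
        raise ValueError("buffer shorter than the current length")
    # Only the first cur_len bytes after the header are valid text.
    return max_len, buf[2:2 + cur_len].decode("latin-1")
</syntaxhighlight>
:With the default reserved length of 254, the encoded buffer is 256 bytes long, which matches the figures given above.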
 
=== Dope vectors ===
The IBM PL/I (F) compiler{{efn|All subsequent IBM PL/I compilers replaced the SDV with a locator/descriptor.<ref>{{cite book
| title = OS - PL/I Optimizing Compiler - Programmer's Guide - Program Numbers 5734-PL1 - 5734-LM4 - 5734-LM5
| date = September 1971
| edition = first
| section = Types of Arguments and Parameters
| section-url = http://bitsavers.org/pdf/ibm/370/pli/SC33-0006-0_OS_PLI_Optimizing_Compiler_Programmers_Guide_Sep71.pdf#page=160
| page = 160
| quote = <u>Problem Data</u>: Only arithmetic element variables are passed as arguments by passing the addresses of their locations in storage. All other problem data types are passed as arguments by passing the address of a block of storage known as a <u>locator/descriptor</u>. A locator/descriptor contains the address and other relevant information about the data item that it represents. The address of the first byte of the data item is always present in the first fullword of the associated locator/descriptor. Locator/descriptors are employed for string, area, and aggregate data. For a varying-length string, the locator contains the address of a 2-byte field that contains the current length of the string and immediately precedes the data part of the string in storage.
| series = Program Product
| access-date = September 2, 2025
}}
</ref>}} generates a '''string dope vector'''<ref>{{cite book
| title = IBM System/360 Operating System - PL/I (F) - Programmer's Guide - Program Number 360S-NL-511
| id = C28-6594-4
| date = November 1968
| edition = Fifth
| section = String Data
| section-url = http://bitsavers.org/pdf/ibm/360/pli/C28-6594-4_PL1_F_Programmers_Guide_Nov68.pdf#page=136
| page = 136
| quote = Variable-length data has associated control areas known as "dope vectors" which describe the strings. A dope vector contains a record of the maximum length and the current length of the string, together with a pointer to the beginning of the string. Dope vectors need not be adjacent to the data they describe, but will normally occupy storage of the same storage class.
| series = Systems Reference Library
| url = http://bitsavers.org/pdf/ibm/360/pli/C28-6594-4_PL1_F_Programmers_Guide_Nov68.pdf
| access-date = September 2, 2025
}}
</ref> (SDV) for variable-length strings. The SDV contains a current length, a maximum length, and a pointer to the string, and need not be adjacent to the string proper. -- [[User:Chatul|Shmuel (Seymour J.) Metz Username:Chatul]] ([[User talk:Chatul|talk]]) 12:56, 2 September 2025 (UTC)
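:A rough sketch of the idea, in Python. It only models what the quoted manual states (a maximum length, a current length, and a pointer to string storage held elsewhere); the class name, the field order, and truncation on overlong assignment are assumptions of this illustration, not the actual PL/I (F) layout.
:<syntaxhighlight lang=python>
from dataclasses import dataclass


@dataclass
class StringDopeVector:
    storage: bytearray  # stands in for memory that lives apart from the descriptor
    offset: int         # together with storage, plays the role of the pointer
    max_len: int        # reserved size of the string
    cur_len: int = 0    # number of bytes currently valid

    def value(self) -> bytes:
        """Return the currently valid part of the string."""
        return bytes(self.storage[self.offset:self.offset + self.cur_len])

    def assign(self, data: bytes) -> None:
        """Store data in place, truncating to max_len (an assumption here)."""
        data = data[:self.max_len]
        self.storage[self.offset:self.offset + len(data)] = data
        self.cur_len = len(data)


heap = bytearray(16)                                   # string storage
sdv = StringDopeVector(heap, offset=4, max_len=8)      # descriptor held separately
sdv.assign(b"HELLO")
</syntaxhighlight>
:After the assignment, only the current length in the descriptor changes; the storage reserved for the string stays at its maximum size.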
 
<!-- Keep after last sig -->
{{notelist-talk}}
{{reflist-talk}}
 
== DNA?? ==
== Traditionally? ==
 
WRT "In computer programming, a string is traditionally a sequence of characters..." What does 'traditionally' imply? What does string mean in a non-traditional sense? How is traditionality relevant? IMO it is a sequence of chars (period or full-stop as they say across the pond). [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 14:03, 20 December 2024 (UTC)
 
== String is not a data type ==
 
WRT "A string is generally considered as a data type"
 
Can't argue that string is a type of data, but string is not a [[data type]]. That may seem a subtle distinction, but it is an important one. String is a higher-level concept than data type as it pertains to programming. Many programming contexts (i.e. languages) have a string data type (or several). But there's a significant difference between string data and a type for string data.
 
To illustrate the difference between string data and data type, consider C. It has no string type. The most commonly used data type for string data is char*, pointer to char. That is not a string type, yet it is used for string data. Note that char* can also be used for non-string data; a pointer to storage for a single char, for example. FWIW, the [[data structure]] is called a [[null-terminated string]] or C string.
 
What is this article about? Is it about the concept of string in general (string data)? Or about particular data types in particular languages and contexts? I assume the intention is both. But the two should not be conflated. It should say that a string is a sequence of characters and that many languages define a type for string data. It should not say that string ''is'' a data type.
 
TBH this article provides little value and should be deleted, but I'm sure folks don't like that idea. But if it's going to exist, it shouldn't misrepresent the world. [[User:Stevebroshar|Stevebroshar]] ([[User talk:Stevebroshar|talk]]) 13:19, 10 May 2025 (UTC)
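:To make the C point above concrete, here is a small Python sketch that models memory as a byte sequence and a char* as a plain index into it. The helper name <code>c_string_at</code> is hypothetical. The same kind of "pointer" yields string data only when a zero byte follows somewhere after it.
:<syntaxhighlight lang=python>
def c_string_at(memory: bytes, address: int) -> bytes:
    """Read bytes from address up to, but not including, the next zero byte."""
    end = memory.index(0, address)  # raises ValueError if no terminator follows
    return memory[address:end]


# "hello" plus its terminator, followed by one unrelated char with no terminator.
memory = b"hello\x00X"

whole = c_string_at(memory, 0)   # pointer to the start of the string
tail = c_string_at(memory, 2)    # pointer into the middle is also a valid string
single = memory[6:7]             # pointer to a single char: data, but not a string
</syntaxhighlight>
:Calling <code>c_string_at(memory, 6)</code> fails in this model because no terminator follows the single char, which is the sense in which the pointer type alone does not tell you whether it refers to string data.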
 
:I think you have a point here, and I tried to fix the lead accordingly. - [[User:Jochen Burghardt|Jochen Burghardt]] ([[User talk:Jochen Burghardt|talk]]) 16:18, 11 May 2025 (UTC)
: There are other languages than C. In some of them, strings are a native data type. You can make a similar argument about arrays. Citing a WP article to show that strings aren't a [[data type]] might carry more weight if that article didn't include them (see [[data type#String and text types]]). [[User:Andy Dingley|Andy Dingley]] ([[User talk:Andy Dingley|talk]]) 16:45, 11 May 2025 (UTC)
 
:You could as well argue that boolean and character are not data types, because in PL/I they are derived from bit string and character string, each with length 1:
:<syntaxhighlight lang=pli>
DECLARE
ISGREEN BIT(1),
GLYPH CHAR(1);
</syntaxhighlight>
:What types are basic and what derived is very much language-dependent. -- [[User:Chatul|Shmuel (Seymour J.) Metz Username:Chatul]] ([[User talk:Chatul|talk]]) 11:47, 2 September 2025 (UTC)
 
== String length ==
 
The description in {{alink||String length}} is overly simplistic and is incorrect for, e.g., [[PL/I]].
I propose changing {{blockquote|In general, there are two types of string datatypes: ''fixed-length strings'', which have a fixed maximum length to be determined at [[compile time]] and which use the same amount of memory whether this maximum is needed or not, and ''variable-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]).}}
to {{blockquote|In general, there are three types of string datatypes: ''fixed-length strings'', which have a fixed length to be determined at [[compile time]] or [[block (programming)|block]] entry, ''variable-length strings'', which have a fixed maximum length to be determined at [[compile time]] or [[block (programming)|block]] entry and which use the same amount of memory whether this maximum is needed or not, and ''dynamic-length strings'', whose length is not arbitrarily fixed and which can use varying amounts of memory depending on the actual requirements at run time (see [[Memory management]]).}}
 
I was tempted to cite languages with each type of string, but that might be [[information overload|TMI]].
 
I suspect that most modern programming languages have dynamic-length strings, so the rest of the paragraph may also need changes. -- [[User:Chatul|Shmuel (Seymour J.) Metz Username:Chatul]] ([[User talk:Chatul|talk]]) 12:51, 11 August 2025 (UTC)
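:A rough Python sketch of the three categories in the proposed wording. The class names are hypothetical, and the blank padding and silent truncation are assumptions loosely modelled on PL/I, not part of the proposal.
:<syntaxhighlight lang=python>
class FixedString:
    """Fixed length: always exactly `length` bytes, padded with blanks."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.data = b" " * length

    def assign(self, value: bytes) -> None:
        # Truncate or pad so the stored value always has the declared length.
        self.data = value[:self.length].ljust(self.length, b" ")


class VaryingString:
    """Variable length: fixed maximum allocation plus a current length."""

    def __init__(self, max_length: int) -> None:
        self.buf = bytearray(max_length)  # allocated once, never resized
        self.cur = 0

    def assign(self, value: bytes) -> None:
        value = value[:len(self.buf)]
        self.buf[:len(value)] = value
        self.cur = len(value)

    def value(self) -> bytes:
        return bytes(self.buf[:self.cur])


fixed = FixedString(5)
fixed.assign(b"AB")

varying = VaryingString(5)
varying.assign(b"AB")

dynamic = b"AB"          # dynamic length: Python's own bytes stands in here
dynamic += b"CDEFG"      # storage grows with the value, no declared maximum
</syntaxhighlight>
:In this sketch the fixed string always reports five bytes, the varying string reports two bytes of content inside a five-byte allocation, and the dynamic string simply grows to seven bytes.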