Schlemiel the Painter's algorithm: Difference between revisions

Content deleted Content added
Spolsky's example: Hyphenated "high level"
Line 19:
The programming practice that Spolsky used to make his point was repeated concatenation of null-terminated character arrays ("strings").<ref name="basics" />
 
The first step in every implementation of the [[C standard library|standard C library]] function for concatenating strings is determining the length of the first string being appended to by checking each character in the array, starting from the beginning, to see if it is the terminating [[null character]]. InNext, subsequentthe steps, anothersecond string is then copied to the end of the first string, so effectively concatenating the two. At the end of the concatenation, the length of the combined string is discarded upon return to the calling code.
 
In Spolsky's example, the "Schlemiels" occur when multiple strings are being concatenated together:
# <code>[[strcat]]( buffer, "John" ); &nbsp;&nbsp; </code>/* Here, the string "John" is appended to the buffer */
# <code>strcat( buffer, "Paul" ); &nbsp;&nbsp; </code>/* Now the string "Paul" is appended to ''that'' */
# <code>strcat( buffer, "George" );&nbsp; </code>/* ... and the string "George" is appended to ''that'' */
# <code>strcat( buffer, "Ringo" ); &nbsp; </code>/* ... and the string "Ringo" is appended to ''that'' */
After Paul is finished appending to John, the length of "JohnPaul" (or, more precisely, the position of the terminating null character) is known within the [[Scope (programming)|scope]] of <code>strcat()</code> but is discarded upon its return to the point after Paul and before George. Afterwards, when <code>strcat()</code> is told to append George to "JohnPaul", <code>strcat()</code> starts at the very first character of the array (which is 'J') all over again just to find the terminating null character. Each subsequent call to <code>strcat()</code> has to compute the length again before concatenating another name to the <code>buffer</code>.
 
AnalogousAfter to"Paul" Schlemiel'shas notbeen carryingappended to "John", the paint-bucketlength of "JohnPaul" (or, more precisely, the string'sposition lengthof the terminating null character) withis him,known allwithin the subsequent[[Scope (programming)|scope]] of <code>strcat()</code>s havebut tois againdiscarded "walk"upon theits lengthreturn ofto the stringpoint toafter determinePaul whereand thebefore secondGeorge. stringAfterwards, shouldwhen be<code>strcat()</code> copied.is Astold moreto dataappend is added"George" to "JohnPaul", <code>bufferstrcat()</code>, thatstarts terminatingat nullthe very first character alsoof getsthe fartherarray away("J") fromall over again just to find the beginningterminating withnull eachcharacter. Each subsequent call to <code>strcat()</code>, meaning more checks must be takenhas to findcompute thatthe characterlength andagain subsequentbefore callsconcatenating areanother increasingly slower&mdash;just as "Schlemiel's" pathname to histhe bucket keeps getting longer<code>buffer</code>.
 
Analogous to Schlemiel not carrying the paint-bucket (or the string's length) with him, all the subsequent <code>strcat()</code>s have to "walk" the length of the string again to determine where the second string should be copied. As more data is added to <code>buffer</code> with each call to <code>strcat()</code>, that terminating null character also gets farther away from the beginning, meaning subsequent calls are increasingly slow—just as Schlemiel's path to his bucket keeps getting longer.
 
The problems illustrated by Spolsky's example are not noticed by a programmer who is using a high-level language and has little or no knowledge of its underlying principles and functions. "Some of the biggest mistakes people make even at the highest architectural levels come from having a weak or broken understanding of a few simple things at the very lowest levels."<ref name="basics" />