Schlemiel the Painter's algorithm: Difference between revisions

Content deleted Content added
Addbot (talk | contribs)
m Bot: Adding Orphan Tag (Questions) (Report Errors)
m +{{Redirect category shell}} for multiple-{{R}} #Rs using AWB
 
(34 intermediate revisions by 28 users not shown)
Line 1:
#REDIRECT [[Joel Spolsky#Schlemiel the Painter's algorithm]]
{{Orphan|date=February 2009}}
In software development, a '''Schlemiel the Painter'''['s]<!-- Spolsky uses it both with and without the possessive 's -->''' algorithm''' denotes any methodology that is inefficient because the programmer has overlooked some fundamental issues at the very lowest levels of [[software design]]. The term was coined by software engineer and essayist [[Joel Spolsky]].
 
{{Redirect category shell|1=
__TOC__
{{R from Merge}}
==Spolsky's analogy==
{{R to Section}}
Spolsky used a [[Yiddish]] joke to illustrate a certain poor programming practice. In the joke, Schlemiel (also rendered Shlemiel) has a job painting the dotted lines down the middle of a road. Each day, Schlemiel paints less than he painted the day before. When asked how that could possibly be, Schlemiel complains that it is because each day he gets further away from the paint can.<ref name="basics">{{citation|last=Spolsky|first=Joel|title=Back to Basics|date=December 11, 2001|series=Joel on Software|url=http://www.joelonsoftware.com/articles/fog0000000319.html|publisher=joelonsoftware.com}}.</ref>
}}
 
The inefficiency Spolsky was drawing an analogy to was the poor programming practice of repeated [[concatenation]] of [[C (programming language)|C]]-style null terminated character arrays (in general computing parlance known as "[[String (computer science)|strings]]")&mdash; in which the length of the destination string has to be recomputed each time because it is not carried over from a previous concatenation.
 
Spolsky condemned such inefficiencies as typical for programmers who had not been taught basic programming techniques before they began programming using higher level languages: "Generations of graduates are descending on us and creating ''Shlemiel The Painter algorithms'' right and left and they don't even realize it, since they fundamentally have no idea that strings are, at a very deep level, difficult."<ref name="basics" />
 
Coined in 2001, the term has since becoming part of the vernacular to denote inefficient programming techniques.<sup>''cf.'' </sup><ref>{{citation|title=Programming interview questions|last=Cox|first=William|date=November 19, 2005|url=http://discuss.techinterview.org/default.asp?interview.11.246942.7|publisher=techinterview.org}}.</ref>
<ref>{{citation|last=Atwood|first=Jeff|date=September 19, 2007|title=Everything Is Fast For Small n|url=http://www.codinghorror.com/blog/archives/000957.html|publisher=codinghorror.com}}.</ref> Spolsky's essays have been cited as examples of good writing "about their insular world in a way that wins the respect of their colleagues and the attention of outsiders."<ref>{{citation|last=Rosenberg|first=
Scott|title=The Shlemiel way of software|date=December 9, 2004|url=http://dir.salon.com/story/tech/feature/2004/12/09/spolsky/|publisher=salon.com}}.</ref>
 
==Spolsky's example==
The programming practice that Spolsky used to make his point was repeated [[concatenation]] of [[C (programming language)|C]]-style character arrays ("strings").<ref name="basics" />
 
The first step in every implementation of the [[C standard library|standard C library]] function for [[concatenation|concatenating]] strings is a determination of the length of the string being appended to. In subsequent steps, another string is then copied to the end of the first string, so effectively concatenating the two. At the end of the concatenation the combined length will be known, but will be discarded upon return to the calling code.
 
In Spolsky's example the "Schlemiels" occur when multiple strings are being concatenated together:
# <code>[[strcat]]( buffer, "John" ); &nbsp; </code>/* Here, the string "John" is appended to the buffer */
# <code>strcat( buffer, "Paul" ); &nbsp; </code>/* Now the string "Paul" is appended to ''that'' */
# <code>strcat( buffer, "George" ); </code>/* ... and the string "George" is appended to ''that'' */
# <code>strcat( buffer, "Ringo" ); &nbsp;</code>/* ... and the string "Ringo" is appended to ''that'' */
By the end of the first invocation of <code>strcat()</code> the length of <code>buffer</code> will be known but not preserved upon return to the point just after step 1 and just before step 2. Subsequent calls to <code>strcat()</code> have to compute that length again before concatenating another name to the <code>buffer</code>.
 
As such, and analogous to Schmiel's not carrying the paint-bucket with him, all the subsequent <code>strcat()</code>s have to again "walk" the length of the string to determine where the second string should be copied. As more data is added to it, the string in <code>buffer</code> also gets longer with each call to <code>strcat()</code>, and with increasing length, the determination of that length takes longer, which means that subsequent calls are increasingly slower. Just as "Schlemiel's" path to his bucket keeps getting longer.
 
The problems illustrated by Spolsky's example remain hidden to a programmer with little or no knowledge of the underlying principles and functions, which every higher-level language and library will still be using even when the low-level manipulation is not immediately obvious at the higher level. "Some of the biggest mistakes people make even at the highest architectural levels come from having a weak or broken understanding of a few simple things at the very lowest levels."<ref name="basics" />
 
== References ==
{{reflist}}
 
[[Category:Software engineering terminology]]
[[Category:Software development philosophies]]