m →Usage of SLOC measures: citation provided |
CortexFiend (talk | contribs) Link suggestions feature: 2 links added. |
(58 intermediate revisions by 43 users not shown) | |||
Line 1:
{{short description|Software metric used to measure the size of a computer program}}
{{Multiple issues|
{{lead too short|date=April 2012}}
{{Example farm|date=May 2012}}
}}
Line 8 ⟶ 9:
==Measurement methods==
There are two major types of SLOC measures: physical SLOC (LOC) and logical SLOC (LLOC). Specific definitions of these two measures vary, but the most common definition of physical SLOC is a count of lines in the text of the program's source code excluding comment lines.<ref>{{citation |url=http://sunset.usc.edu/csse/TECHRPTS/2007/usc-csse-2007-737/usc-csse-2007-737.pdf |title=A SLOC Counting Standard |author1=Vu Nguyen |author2=Sophia Deeds-Rubin |author3=Thomas Tan |author4=Barry Boehm |publisher=Center for Systems and Software Engineering, University of Southern California |year=2007 }}</ref>
Logical SLOC attempts to measure the number of executable "statements", but its specific definitions are tied to specific computer languages (one simple logical SLOC measure for [[C (programming language)|C]]-like [[programming language]]s is the number of statement-terminating semicolons). It is much easier to create tools that measure physical SLOC, and physical SLOC definitions are easier to explain. However, physical SLOC measures are more sensitive to logically irrelevant formatting and style conventions than logical SLOC measures are.
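As a rough illustration of how the two measures differ, the following Python sketch counts both for a fragment of C-like source. It is a minimal sketch under stated assumptions, not a standard counter: the function names <code>physical_sloc</code> and <code>logical_sloc</code> are hypothetical, comment stripping is naive (it does not recognise comment markers inside string literals), and the logical count uses only the semicolon rule, skipping semicolons inside parentheses such as those in a <code>for</code> header.

```python
import re


def strip_c_comments(source):
    """Remove /* ... */ and // comments.

    Naive assumption: comment markers never appear inside string literals.
    """
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", source)


def physical_sloc(source):
    """Count lines that still contain text once comments are removed."""
    return sum(1 for line in strip_c_comments(source).splitlines() if line.strip())


def logical_sloc(source):
    """Count statement-terminating semicolons.

    Semicolons inside parentheses (for example the two separators in a
    for-loop header) are skipped, since they do not terminate a statement.
    """
    depth = 0
    count = 0
    for ch in strip_c_comments(source):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == ";" and depth == 0:
            count += 1
    return count


# The same loop written on one line and spread over several lines.
one_line = 'for (i = 0; i < 100; i++) printf("hello"); /* comment */\n'
many_lines = (
    "/* comment */\n"
    "for (i = 0; i < 100; i++)\n"
    "{\n"
    '    printf("hello");\n'
    "}\n"
)

print(physical_sloc(one_line), logical_sloc(one_line))      # prints: 1 1
print(physical_sloc(many_lines), logical_sloc(many_lines))  # prints: 4 1
```

Under these assumptions both layouts yield the same logical count, while the physical count rises from 1 to 4 purely because of formatting. Logical definitions that also count control statements such as <code>for</code> would report a different value.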
Consider this snippet of C code as an example of the ambiguity encountered when determining SLOC:
<syntaxhighlight lang="c">
for (i = 0; i < 100; i++) printf("hello"); /* How many lines of code is this? */
</syntaxhighlight>
In this example we have:
Line 24 ⟶ 25:
* 1 comment line.
Depending on the programmer and coding standards, the above "line" of code could be written on several separate lines:
<syntaxhighlight lang="c">
/* Now how many lines of code is this? */
for (i = 0; i < 100; i++)
Line 31 ⟶ 32:
printf("hello");
}
</syntaxhighlight>
In this example we have:
Line 41 ⟶ 42:
==Origins==
At the time
==Usage of SLOC measures==
{{weasel words|date=September 2013}}
SLOC measures are somewhat controversial, particularly in the way that they are sometimes misused. Experiments have repeatedly confirmed that effort is highly correlated with SLOC{{Citation needed|date=July 2009}}; that is, programs with larger SLOC values take more time to develop. Thus, SLOC can be effective in estimating effort. However, functionality is less well correlated with SLOC: skilled developers may be able to develop the same functionality with far less code, so one program with fewer SLOC may exhibit more functionality than another similar program. Counting SLOC as a productivity measure has its caveats, since a developer can write only a few lines and yet be far more productive in terms of functionality than a developer who ends up creating more lines (and generally spending more effort). Good developers may merge multiple code modules into a single module, improving the system yet appearing to have negative productivity because they remove code.
SLOC counting exhibits further accuracy issues when comparing programs written in different languages, unless adjustment factors are applied to normalize languages. Various [[computer language]]s balance brevity and clarity in different ways; as an extreme example, most [[assembly language]]s would require hundreds of lines of code to perform the same task as a few characters in [[APL programming language|APL]]. The following example shows a comparison of a [[Hello world program|"hello world" program]] written in [[BASIC]], [[C (programming language)|C]], and [[COBOL]].
{| class="wikitable" style="margin: 1em auto 1em auto"
|-
! BASIC || C || COBOL
|-
|
<syntaxhighlight lang="basic">
PRINT "hello, world"
</syntaxhighlight>
||
<syntaxhighlight lang="c">
#include <stdio.h>
int main() {
    printf("hello, world\n");
}
</syntaxhighlight>
||
<syntaxhighlight lang="cobol">
identification division.
program-id. hello .
procedure division.
display "hello, world"
goback .
end program hello .
</syntaxhighlight>
|-
|Lines of code: 1<br />(no whitespace)||Lines of code: 4<br />(excluding whitespace) || Lines of code: 6<br />(excluding whitespace)
|}
Another increasingly common problem in comparing SLOC metrics is the difference between auto-generated and hand-written code. Modern software tools often have the capability to auto-generate enormous amounts of code with a few clicks of a mouse. For instance, [[graphical user interface builder]]s automatically generate all the source code for [[Graphical control element (software)|graphical control elements]] simply by dragging an icon onto a workspace. The work involved in creating this code cannot reasonably be compared to the work necessary to write a [[device driver]], for instance. By the same token, a hand-coded custom GUI class could easily be more demanding than a simple device driver; hence the shortcoming of this metric.
There are several cost, schedule, and effort estimation models which use SLOC as an input parameter, including the widely used Constructive Cost Model ([[COCOMO]]) series of models by [[Barry Boehm]] et al., [[PRICE Systems]] [[True S]] and Galorath's [[SEER-SEM]]. While these models have shown good predictive power, they are only as good as the estimates (particularly the SLOC estimates) fed to them. Many<ref>IFPUG [http://www.qpmg.com/pdf/articles/Quantifying_the_Benefits_Using_Function_Points.pdf "Quantifying the Benefits of Using Function Points"]</ref> have advocated the use of [[function point]]s instead of SLOC as a measure of functionality, but since function points are highly correlated with SLOC (and cannot be automatically measured), this is not a universally held view.
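To make the role of SLOC as a model input concrete, here is a minimal Python sketch of the basic COCOMO equations, in which effort in person-months is a · KLOC^b and schedule in months is c · effort^d. The coefficients are the basic-model values published by Boehm in 1981; the function name <code>basic_cocomo</code> and the dictionary layout are assumptions made for illustration, and later models in the COCOMO series add cost drivers that this sketch omits.

```python
# Basic COCOMO coefficients (a, b, c, d) per project mode, from Boehm (1981).
BASIC_COCOMO = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(sloc, mode="organic"):
    """Return (effort in person-months, schedule in months) for a SLOC estimate.

    Hypothetical helper for illustration; it applies only the basic model
    and ignores the cost drivers used by the intermediate and later models.
    """
    a, b, c, d = BASIC_COCOMO[mode]
    kloc = sloc / 1000.0
    effort = a * kloc ** b        # person-months
    schedule = c * effort ** d    # calendar months
    return effort, schedule


effort, schedule = basic_cocomo(10_000)
print(f"effort: {effort:.1f} person-months, schedule: {schedule:.1f} months")
```

Because the exponent b is greater than 1, doubling the SLOC estimate more than doubles the predicted effort, so an error in the SLOC input carries through to the estimate slightly more than proportionally.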
Line 92 ⟶ 103:
|publisher=Knowing.NET
|date=December 6, 2005
|
}}<br />This in turn cites Vincent Maraia's ''The Build Master'' as the source of the information.</ref>
|-
Line 104 ⟶ 115:
| 2001 || Windows XP || 45<ref>{{cite web
|url=https://www.facebook.com/windows/posts/155741344475532
|archive-url=https://ghostarchive.org/iarchive/facebook/30968512668/155741344475532 |archive-date=2022-02-26 |url-access=limited|title=How Many Lines of Code in Windows XP?
|publisher=Microsoft
|date=January 11, 2011
}}{{cbignore}}</ref><ref>{{Cite web|date=2012-09-21|title=A history of Windows - Microsoft Windows|url=http://windows.microsoft.com/en-AU/windows/history#T1=era6|access-date=2021-03-26|archive-url=https://web.archive.org/web/20120921002229/http://windows.microsoft.com/en-AU/windows/history#T1=era6|archive-date=2012-09-21}}</ref>
|-
| 2003 || Windows Server 2003 || 50<ref name = "Knowing.NET"/>
|}
David A. Wheeler studied the [[Red Hat]] distribution of the [[Linux|Linux operating system]], and reported that [[Red Hat Linux]] version 7.1<ref name = "Wheeler-RH7.1">{{cite web |date=2001-06-30 |url=http://www.dwheeler.com/sloc/redhat71-v1/redhat71sloc.html |author=David A. Wheeler |title=More Than a Gigabuck: Estimating GNU/Linux's Size }}</ref> (released April 2001) contained over 30 million physical SLOC. He also extrapolated that, had it been developed by conventional proprietary means, it would have required about 8,000 person-years of development effort and would have cost over $1 billion (in year 2000 U.S. dollars).
A similar study was later made of [[Debian GNU/Linux]] version 2.2 (also known as "Potato"); this operating system was originally released in August 2000. This study found that Debian GNU/Linux 2.2 included over 55 million SLOC, and if developed in a conventional proprietary way would have required 14,005 person-years and cost US$1.9 billion to develop. Later runs of the tools used report that the following release of Debian had 104 million SLOC, and {{As of|2005|alt=as of year 2005}}, the newest release is going to include over 213 million SLOC.
{| class="wikitable" summary="Operating Systems SLOC Sizes"
Line 125 ⟶ 131:
! Year || Operating system || SLOC (million)
|-
| 2000 || Debian 2.2 || 55–59<ref>{{cite web | author = González-Barahona, Jesús M.
|-
| 2002 || Debian 3.0 || 104<ref name="debian-sloc"/>
Line 135 ⟶ 141:
| 2009 || Debian 5.0 || 324<ref name="debian-sloc"/>
|-
| 2012 || Debian 7.0 || 419<ref>Debian 7.0 was released in May 2013. The number is an estimate published on 2012-02-13, using the code base which would become Debian 7.0, using the same software method as for the data published by David A. Wheeler. {{cite web|author=James Bromberger |title=Debian Wheezy: US$19 Billion. Your
|-
| 2009 || [[Opensolaris|OpenSolaris]] || 9.7
Line 141 ⟶ 147:
| || [[FreeBSD]] || 8.8
|-
| 2005 || [[Mac OS X]] 10.4 || 86<ref>{{cite web | last = Jobs | first = Steve | title = Live from WWDC 2006: Steve Jobs Keynote | url = https://www.engadget.com/2006/08/07/live-from-wwdc-2006-steve-jobs-keynote/ |date=August 2006 |
|-
| 1991 || [[
|-
| 2001 ||
|-
| 2003 ||
|-
| 2009 ||
|-
| 2009 ||
|-
| 2010 ||
|-
| 2012 ||
|-
| 2015-06-30 ||
|}
Line 166 ⟶ 172:
# Scope for automation of counting: since line of code is a physical entity, manual counting effort can be easily eliminated by automating the counting process. Small utilities may be developed for counting the LOC in a program. However, a logical code counting utility developed for a specific language cannot be used for other languages due to the syntactical and structural differences among languages. Physical LOC counters, however, have been produced which count dozens of languages.
# An intuitive metric: line of code serves as an intuitive metric for measuring the size of software because it can be seen, and the effect of it can be visualized. [[Function points]] are said to be more of an objective metric which cannot be imagined as being a physical entity; they exist only in the logical space. This way, LOC comes in handy to express the size of software among programmers with low levels of experience.
# Ubiquitous measure: LOC measures have been around since the earliest days of software.
=== Disadvantages ===
Line 172 ⟶ 178:
# Lack of cohesion with functionality: experiments{{By whom|date=April 2010}} have repeatedly confirmed that, while effort is highly correlated with LOC, functionality is less well correlated with LOC. That is, skilled developers may be able to develop the same functionality with far less code, so one program with fewer LOC may exhibit more functionality than another similar program. In particular, LOC is a poor productivity measure of individuals, because a developer who develops only a few lines may still be more productive than a developer creating more lines of code. Moreover, good refactoring such as "extract method", which removes [[redundant code]] and keeps the code clean, will mostly reduce the lines of code.
# Adverse impact on estimation: because of the fact presented under point #1, estimates based on lines of code can go badly wrong.
#
# Difference in languages: consider two applications that provide the same functionality (screens, reports, databases). One of the applications is written in C++ and the other is written in a language like COBOL. The number of function points would be exactly the same, but aspects of the application would be different. The lines of code needed to develop the application would certainly not be the same. As a consequence, the amount of effort required to develop the application would be different (hours per function point). Unlike lines of code, the number of function points will remain constant.
# Advent of [[Graphical user interface|GUI]] tools: with the advent of GUI-based programming languages and tools such as [[Visual Basic (classic)|Visual Basic]], programmers can write relatively little code and achieve high levels of functionality. For example, instead of writing a program to create a window and draw a button, a user with a GUI tool can use drag-and-drop and other mouse operations to place components on a workspace. Code that is automatically generated by a GUI tool is not usually taken into consideration when using LOC methods of measurement. This results in variation between languages; the same task that can be done in a single line of code (or no code at all) in one language may require several lines of code in another.
# Problems with multiple languages: in
# Lack of counting standards: there is no standard definition of what a line of code is. Do comments count? Are data declarations included? What happens if a statement extends over several lines? These are questions that often arise. Though organizations like SEI and IEEE have published some guidelines in an attempt to standardize counting, it is difficult to put these into practice, especially as new languages are introduced every year.
# Psychology: a programmer whose productivity is being measured in lines of code will have an incentive to write unnecessarily verbose code. The more management is focusing on lines of code, the more incentive the programmer has to expand
In the [[PBS]] documentary ''[[Triumph of the Nerds]]'', Microsoft executive [[Steve Ballmer]] criticized the use of counting lines of code:
Line 183 ⟶ 189:
In IBM there's a religion in software that says you have to count K-LOCs, and a K-LOC is a thousand lines of code. How big a project is it? Oh, it's sort of a 10K-LOC project. This is a 20K-LOCer. And this is 50K-LOCs. And IBM wanted to sort of make it the religion about how we got paid. How much money we made off [[OS/2]], how much they did. How many K-LOCs did you do? And we kept trying to convince them – hey, if we have – a developer's got a good idea and he can get something done in 4K-LOCs instead of 20K-LOCs, should we make less money? Because he's made something smaller and faster, less K-LOC. K-LOCs, K-LOCs, that's the methodology. Ugh! Anyway, that always makes my back just crinkle up at the thought of the whole thing.
</blockquote>
According to the [[Computer History Museum]], Apple developer [[Bill Atkinson]] found problems with this practice in 1982:
<blockquote>
When the Lisa team was pushing to finalize their software in 1982, project managers started requiring programmers to submit weekly forms reporting on the number of lines of code they had written. Bill Atkinson thought that was silly. For the week in which he had rewritten QuickDraw’s region calculation routines to be six times faster and 2000 lines shorter, he put “-2000” on the form. After a few more weeks the managers stopped asking him to fill out the form, and he gladly complied.<ref>{{Cite web|date=2010-07-18|title=MacPaint and QuickDraw Source Code|url=https://computerhistory.org/blog/macpaint-and-quickdraw-source-code/|access-date=2021-04-15|website=CHM|language=en}}</ref><ref>{{Cite web|title=Folklore.org: -2000 Lines Of Code|url=https://www.folklore.org/StoryView.py?story=Negative_2000_Lines_Of_Code.txt|access-date=2021-04-15|website=www.folklore.org}}</ref>
</blockquote>
==See also==
* [[Software development effort estimation]]
* [[Estimation (project management)]]
* [[Cost estimation in software engineering]]
==Notes==
Line 214 ⟶ 218:
|date=May 2005}}
* {{cite journal
|url=
| last = McGraw
| first = Gary
Line 224 ⟶ 228:
| pages = 59–66
|doi=10.1109/MSECP.2003.1193213
| url-access = subscription
}}<!-- this is talking about a different type of metric -->
* {{cite journal
| author = Park, Robert E.| title = Software Size Measurement: A Framework for Counting Source Statements
| journal = Technical Report CMU/SEI-92-TR-20
| date = 31 August 1992
| url = http://www.sei.cmu.edu/library/abstracts/reports/92tr020.cfm |display-authors=etal}}
Line 237 ⟶ 243:
| title = SLOCCount
| url = http://www.dwheeler.com/sloccount
|
}}
* {{cite web
Line 244 ⟶ 250:
|date=June 2001
| url = http://www.dwheeler.com/sloc
|
}}
* Tanenbaum, Andrew S. ''Modern Operating Systems'' (2nd ed.). Prentice Hall. {{ISBN|0-13-092641-8}}.
Line 252 ⟶ 258:
|author = Howard Dahdah
|date = 2007-01-24
|
|
|
|
}}
*
* [http://folklore.org/StoryView.py?project=Macintosh&story=Negative_2000_Lines_Of_Code.txt&detail=medium/ Folklore.org: Macintosh Stories: -2000 Lines Of Code]