User talk:Dr pda/generatestats.js: Difference between revisions

creating documentation
 
Example output: use more up-to-date stats as example
 
(2 intermediate revisions by one other user not shown)
Line 1:
The purpose of this script is to generate some statistics about articles which transclude a given template, namely a list of the ten longest and ten shortest articles, the mean and median length, and a histogram of the article lengths. The original motivation was to find out what were the longest and shortest <nowiki>{{featured articles}}</nowiki>, but it could also be used for your favourite stub, infobox or other template.
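
As a rough illustration of the statistics listed above, the following JavaScript sketch computes the mean, the median and the ''n'' longest and shortest entries from a list of page sizes. It is not taken from the script itself: the function name <code>summarise</code> and the <code>{ title, bytes }</code> record shape are assumptions made for this example, and sizes are left in bytes rather than converted to kB.

```javascript
// Hypothetical helper (not part of generatestats.js): summarise a list of
// { title, bytes } records, where bytes is the size of the wiki text.
function summarise(pages, n = 10) {
  // Sort a copy in ascending order of size so the caller's array is untouched.
  const sorted = [...pages].sort((a, b) => a.bytes - b.bytes);
  const count = sorted.length;
  if (count === 0) {
    return { count: 0, mean: NaN, median: NaN, shortest: [], longest: [] };
  }

  const total = sorted.reduce((sum, page) => sum + page.bytes, 0);

  // Median: the middle value, or the average of the two middle values.
  const mid = Math.floor(count / 2);
  const median = count % 2 === 1
    ? sorted[mid].bytes
    : (sorted[mid - 1].bytes + sorted[mid].bytes) / 2;

  return {
    count,
    mean: total / count,
    median,
    shortest: sorted.slice(0, n),          // smallest first
    longest: sorted.slice(-n).reverse(),   // largest first
  };
}
```

Sorting once and slicing both ends keeps the sketch short; the full sorted array is also what a "list" mode would need.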
 
 
Line 9:
==Usage==
Once you have installed the script, go to http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit. A dialog box will pop up, asking you to enter the name of the template without the "Template:" prefix, e.g. ''featured article'' rather than ''Template:featured article''. The script will then retrieve the necessary information, 500 pages at a time, showing its progress in the edit window on that page. You can stop it at any time by navigating away from the page (e.g. by clicking the back button in your browser). Once it is done, the script will copy the output into the edit window and preview the page. If you wish, you can then copy the wiki-text and save it somewhere else.
 
If you want to have the full list of articles sorted by size, as well as just the top and bottom ten, go to http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit&list
 
To get statistics for articles whose talk pages belong to a certain category (e.g. WikiProjects) use the URL http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit&usetalkcategory, or http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit&usetalkcategory&list for the version with a list.
 
To get statistics for articles which transclude any template within a given category (e.g. [[:Category:Television episode infobox templates]]), use the URL http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit&usetemplatecategory, or http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit&usetemplatecategory&list for the version with a list.
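
The URL variants above differ only in the bare flags appended to the query string (<code>list</code>, <code>usetalkcategory</code>, <code>usetemplatecategory</code>). The following JavaScript sketch shows one way such flags could be read; it is an illustration only, and the function name <code>parseOptions</code> and the shape of the returned object are assumptions, not the script's actual code.

```javascript
// Hypothetical helper (not part of generatestats.js): work out which mode
// was requested from the flags present in the page URL.
function parseOptions(url) {
  // A bare "&list" is parsed as a key with an empty value, so has() finds it.
  const params = new URL(url).searchParams;

  let source = 'template';                     // default: a single template
  if (params.has('usetalkcategory')) {
    source = 'talkcategory';                   // articles whose talk pages are in a category
  } else if (params.has('usetemplatecategory')) {
    source = 'templatecategory';               // any template within a category
  }

  return { source, fullList: params.has('list') };
}
```

For example, <code>parseOptions</code> applied to the <code>&usetalkcategory&list</code> URL above would return <code>{ source: 'talkcategory', fullList: true }</code>.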
 
==Example output==
===Ten longest articles===
# [[Intelligent design]] (179 kB)
# [[Ronald Reagan]] (163 kB)
# [[Muhammad al-Durrah incident]] (162 kB)
# [[Elvis Presley]] (161 kB)
# [[General relativity]] (161 kB)
# [[Barack Obama]] (161 kB)
# [[Battle of the Coral Sea]] (156 kB)
# [[Major depressive disorder]] (156 kB)
# [[Michael Jackson]] (153 kB)
# [[2007 USC Trojans football team]] (153 kB)
 
===Ten shortest articles===
# [[Tropical Depression Ten (2005)]] (9 kB)
# [[Nico Ditch]] (9 kB)
# [[2005 Azores subtropical storm]] (10 kB)
# [[MissingNo.]] (10 kB)
# [[Hurricane Irene (2005)]] (10 kB)
# [[Bam Thwok]] (10 kB)
# [[Tropical Storm Erick (2007)]] (10 kB)
# [[North Road (stadium)]] (11 kB)
# [[She Shoulda Said No!]] (12 kB)
# [[Interactions (The Spectacular Spider-Man)]] (12 kB)
 
===Statistics===
*Number of articles: 2815
*Mean: 50.261 kB
*Median: 44.795 kB
 
===Chart===
Line 48 ⟶ 54:
id:steel value:rgb(0.6,0.7,0.8)
 
ImageSize = width:auto height:303 barincrement:25
PlotArea = left:50 bottom:50 top:30 right:30
DateFormat = x.y
Period = from:0 till:1100
TimeAxis = orientation:vertical
AlignBars = early
ScaleMajor = gridcolor:darkgrey increment:100 start:0
ScaleMinor = gridcolor:lightgrey increment:50 start:0
BackgroundColors = canvas:white
 
PlotData=
color:steel width:20 align:left
bar:0 from:0 till:179
bar:20 from:0 till:1014
bar:40 from:0 till:817
bar:60 from:0 till:447
bar:80 from:0 till:215
bar:100 from:0 till:82
bar:120 from:0 till:42
bar:140 from:0 till:13
bar:160 from:0 till:6
bar:60 at:0 text:"Article size in kB" shift:(0,-30)
 
Line 74 ⟶ 80:
 
==Notes==
*The size of the article is that of the wiki text, i.e. what appears in the edit window. It is NOT the readable prose size. (This can be calculated on an article-by-article basis by [[User talk:Dr pda/prosesize.js|this prose size script]].) If it is REALLY necessary to have the readable prose size, this script will now support it at http://en.wikipedia.org/w/index.php?title=User:Dr_pda/generatestats&action=edit&prosesize&list, '''however this requires loading each article, which is resource intensive and will take a long time if there are a large number of articles''' (approx 1 hour for 1500 articles).
*This script only counts pages which are in the article namespace, so it won't work for talk page templates (e.g. wikiproject banners).
*The script chooses bin sizes on the horizontal axis such that there are approximately 15 bins, but they use a sensible scale (1,2,5,10,20,50 etc). Due to the limitations of the code used to generate the chart, the labels are in the middle of each bin, rather than the left hand edge. Thus in the example above, the first bin contains articles between 0 and 20 kB, the second bin between 20 and 40 kB, and so on. Note that the upper edge of the last bin is not marked; here it contains articles between 160 and 180 kB.
*You can see the numbers for the histogram by looking in the edit window.
*Sometimes the chart doesn't show up in the preview. I'm not sure why; sometimes adding or removing a blank line, changing the height, or inserting an error and then correcting it makes it show up. Maybe it just needs to be saved.
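
The bin-size rule described in the notes can be sketched in JavaScript as follows. This is an illustration under stated assumptions, not the script's own code: the rule used here is "the smallest width from the 1, 2, 5, 10, 20, 50, ... series that needs no more than 15 bins", which is only one reading of "approximately 15 bins", although it does reproduce the 20 kB bins of the example for a longest article of 179 kB. The function names <code>chooseBinWidth</code>, <code>histogram</code> and <code>toPlotData</code> are hypothetical.

```javascript
// Hypothetical helpers (not part of generatestats.js).

// Smallest width from the series 1, 2, 5, 10, 20, 50, ... such that the
// range 0..maxValue is covered by at most maxBins bins.
function chooseBinWidth(maxValue, maxBins = 15) {
  if (!Number.isFinite(maxValue) || maxValue < 0) {
    throw new RangeError('maxValue must be a finite, non-negative number');
  }
  for (let scale = 1; ; scale *= 10) {
    for (const step of [1, 2, 5]) {
      const width = step * scale;
      // A value v falls in bin floor(v / width), so the largest value
      // decides how many bins are needed.
      if (Math.floor(maxValue / width) + 1 <= maxBins) return width;
    }
  }
}

// Count how many values fall in each bin [i * width, (i + 1) * width).
function histogram(values, width) {
  const counts = [];
  for (const value of values) {
    const bin = Math.floor(value / width);
    while (counts.length <= bin) counts.push(0);
    counts[bin] += 1;
  }
  return counts;
}

// Format the counts as bar lines, each labelled with the lower edge of its bin.
function toPlotData(counts, width) {
  return counts.map((count, i) => `bar:${i * width} from:0 till:${count}`);
}

console.log(chooseBinWidth(179)); // 20
```

With a width of 20 and a longest article of 179 kB this gives nine bins, labelled 0 to 160, as in the example chart.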