- Old talk at Wikipedia talk:Categorization/Archive 1
- Archived main page discussion on Hierarchicalization at Wikipedia talk:Categorization/Archive 2
- Further discussion at Wikipedia talk:Categorization/Archive 3
- Further discussion at Wikipedia talk:Categorization/Archive 4 (archived on 4 Sep 2004, includes material up to approximately 20 Aug 2004)
- Further discussion at Wikipedia_talk:Categorization/Archive_5 (includes material up to about 8 Sep 2004)
- Further discussion at Wikipedia_talk:Categorization/Archive_6 (includes material up to about 30 Sep 2004)
Too many cats--- 20165 to be exact
There are 20165 categories, from category:.hack to category:ß-lactam antibiotics. -- user:zanimum
- And how many cats do you think there should be? —Mike 03:34, Sep 7, 2004 (UTC)
- 20,000 categories for almost a million articles seems like a fair ratio, especially taking into consideration that there are a number of categories that necessarily supercategorize other categories rather than articles. -Sean Curtin 03:39, Sep 7, 2004 (UTC)
- The number has grown by about 50% in the last month! There is a general movement to categorize taking place. I'm not convinced it's the way to go. Too many people seem to be categorizing without agreeing a structure first. Noisy 09:07, 7 Sep 2004 (UTC)
- The wiki way of determining what structure categories should have is to let many people categorize articles, and seeing what structure eventually develops out of that. Once a consensus develops the "nonstandard" categories can be tidied up to match. For people who just plain don't like categories as a concept, how about requesting a preference setting be added to the software to hide them? Bryan 15:19, 8 Sep 2004 (UTC)
- We should have more. Every article should belong to a category. (Can we get stats on how many articles have categories?) User:Sverdrup 15:05, 12 Sep 2004 (UTC)
- About 170'000 articles are categorized in up to 17 categories (Interstate_95, Mega_Man, Elite_(computer_game)). -- User:Docu
- I forgot to mention: Standard_Occupational_Classification_System with 24 categories (I thought it was in Wikipedia namespace). -- User:Docu
- 26319 categories now - it looks like a large waste of people's effort to me. Some, like the subcategories of "Category:Swiss military trainer aircraft" are a joke. Others like "Category:EU countries" need to be renamed "Category:EU member states" to solve a dispute about what is a country. But people are too busy creating categories, when in the past they would have been improving and cleaning articles and iLinks. --Henrygb 22:38, 18 Oct 2004 (UTC)
- It's about 25300 categories for 270000 pages (all namespaces).
- BTW the aircraft categories appear quite systematic compared to others with a similar degree of detail. It's just that some prefer the same information in one category instead of three different ones. -- User:Docu
- Doesn't seem at all excessive to me. Figuring on average maybe 4 categories per article and 40 articles per category, I'd expect 10% as many categories as articles. It would be interesting to know what the respective averages are at this time. -- Jmabel | Talk 05:46, Oct 19, 2004 (UTC)
- A while back I realized that there were articles on hundreds of ships in the Royal Navy that appeared to be of no particularly special note. Personally, I would have considered it a huge waste of my time to have created all those articles. However, whoever did create those articles clearly had different priorities for the use of their time than I, and I say more power to them; I didn't raise the slightest peep in VfD or anywhere else trying to get rid of all the effort they'd wasted. Instead I just categorized all those articles and moved on to other things. :) Bryan 06:02, 19 Oct 2004 (UTC)
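The back-of-the-envelope estimate in this thread can be checked with a few lines of arithmetic. This is only a sketch: the function name and defaults are illustrative, the averages (4 categories per article, 40 articles per category) are Jmabel's guesses, and the observed figures are the approximate counts quoted above (about 25300 categories for 270000 pages in all namespaces).

```python
def expected_categories(articles, cats_per_article=4, articles_per_cat=40):
    """Estimate the category count implied by the two assumed averages."""
    # Total (article, category) memberships divided by the size of a category.
    return articles * cats_per_article / articles_per_cat

# The guessed averages imply one category per ten articles.
ratio = expected_categories(1)

# Approximate figures quoted in this thread: 25300 categories, 270000 pages.
observed_ratio = 25300 / 270000
```

Under these assumptions the observed ratio comes out slightly below the estimated 10%, though the page count includes non-article namespaces, so the comparison is rough.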
One-page alphabetical listing
- By popular demand... User:Pearle/categories-alpha (~700k!) contains a plaintext list, sorted alphabetically, of all categories that existed in the database or were linked to from an article or subcategory in the latest database dump. -- Beland 02:50, 19 Oct 2004 (UTC)
Mix Images and articles in Categories?
I can't find any specific information about this; the question arose with Category:Vincent van Gogh, which currently contains both the articles related to him and the Images of his paintings. Should we keep Image: pages separate from articles in categories, or is this categorization acceptable? My common sense wants to believe that we should do as usual: keep reader-oriented pages (article namespace) and everything else separate. Thoughts? Links to relevant discussions? ✏ Sverdrup 22:18, 22 Oct 2004 (UTC)
- In Bugzilla, there is " Bug 450: Categories need to be structured by namespace". To some extent the lists are already structured by namespace (their name sorts them together).
- Contrary to the deletion log sometimes included in categories or user pages, images aren't just noise in the category, but informative.
- As it's easy for readers and other users to distinguish them from articles, I'd include them. As another example, one could quote Category:Saint Helena. -- User:Docu
- Personally, I would prefer that image pages were not intermixed with article pages and I'm pretty sure this had been discussed early on somewhere, but I very much doubt that I could track it down now. It may have been on the mailing list. They might be OK as a subcat though. older≠wiser 23:07, Oct 22, 2004 (UTC)
- I agree that images should be separate. One of the reasons is that image titles are not always very descriptive and only serve to add clutter to the category. —Mike 02:24, Oct 26, 2004 (UTC)
- Ok. Let's try not to set sort keys for images, so that they stay grouped under Images:.. rather than getting mixed with the articles, until Eloquence's proposal is implemented. -- User:Docu
Categories with multiple database id's
I've been looking through the database, regarding categories, and I found 19 categories with multiple id numbers. I've listed them below. Please let me know if you know anything about this. I'm just curious.
- Category:1886 has id's: 884675, 884676
- Category:Franklin_County,_Maine has id's: 786710, 786713
- Category:German_patrol_aircraft_1920-1929 has id's: 798147, 798154
- Category:Cambodian_people has id's: 805688, 805689
- Category:Uranus'_moons has id's: 706527, 706528
- Category:Republic_of_Singapore_Air_Force has id's: 839304, 839305
- Category:Italian_military_aircraft_1980-1989 has id's: 799817, 799818
- Category:Emo_albums has id's: 712744, 712745
- Category:Intersexual has id's: 698855, 698856
- Category:Golf has id's: 805695, 805696
- Category:Governors_of_Indiana has id's: 786464, 786465
- Category:Rivers_of_Canada has id's: 891748, 891749
- Category:Spielberg_films has id's: 738186, 738187
- Category:German_patrol_aircraft_1920-1929 has id's: 798136, 798154
- Category:Guam has id's: 901282, 901283
- Category:Dewey_Decimal_106 has id's: 771233, 771234
- Category:Towns_of_Ireland has id's: 797790, 797791
- Category:Lists_of_pieces_by_style has id's: 799314, 799322
- Category:Scientific_organizations has id's: 799318, 799325
JesseW 22:58, 5 Nov 2004 (UTC)
- There is a list of all those at User:Topbanana/Reports/Duplicate_article_title. Yesterday I fixed most of the article titles with duplicate ids (see the talk page). As the category links table uses titles instead of the ids, the categories are less of a problem (for me). -- User:Docu
- The ones in category namespace should be fixed now as well. -- User:Docu
- Great! Thank you! JesseW 11:43, 14 Nov 2004 (UTC)
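A report like the one above can be produced by grouping page ids by title. The following is a minimal sketch, assuming the data is available as plain (id, title) pairs; that layout and the function name are assumptions for illustration, not the actual database schema. The sample rows are taken from the list above, plus one made-up entry with a single id.

```python
from collections import defaultdict

def titles_with_multiple_ids(rows):
    """Return {title: sorted ids} for every title that has more than one id."""
    ids_by_title = defaultdict(set)
    for page_id, title in rows:
        ids_by_title[title].add(page_id)
    return {t: sorted(ids) for t, ids in ids_by_title.items() if len(ids) > 1}

# Sample rows from the list above, plus one hypothetical unique entry.
rows = [
    (884675, "Category:1886"),
    (884676, "Category:1886"),
    (805695, "Category:Golf"),
    (805696, "Category:Golf"),
    (1, "Category:Hypothetical_unique_category"),
]
duplicates = titles_with_multiple_ids(rows)
```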
Sorting categories
If we add "[[Category:A]] [[Category:B]]" to the bottom of an article, it'll display: Categories: A | B.
But if we switch it, this order will also switch.
So, there should be a default way of sorting categories. I suggest we sort them by meaning or importance within the article, and not alphabetically.
For example, let's take Edgar Allan Poe:
Categories: 1809 births | 1849 deaths | People From Baltimore | Edgar Allan Poe | American writers | Science fiction writers | U.S. poets | Virginians
That's messy. Why do his birth, his death and the fact that he's from Baltimore come before the fact that he is a writer? Each category has some connection to the article's subject, and the strongest connections SHOULD come first. In this case, I think it should be something like:
Categories: Edgar Allan Poe | American writers | U.S. poets | Science fiction writers | 1809 births | 1849 deaths | Virginians | People From Baltimore
Why? Well, simply because:
- He has his own category, and therefore, this category is strongly related to the article about him
- He is mostly known as a writer, otherwise he probably wouldn't even have an article here!
- He was an important US poet ("poet" is within "writer")
- He also wrote science fiction (also inside "writer")
- He was born in 1809 (birth comes first)
- He died in 1849 (then death, if it should)
- He is a Virginian
- He is from Baltimore
Does anyone agree with me? Then we should make it as a guideline and add it to the categorization page. Thanks — Kieff | Talk 08:44, Nov 7, 2004 (UTC)
- Sounds good to me. If no one objects for a week or so, I'd say add it to the page. JesseW 10:44, 7 Nov 2004 (UTC)
Other sort orders often used are: alphabetical, chronological (e.g. for office related categories). Pywikipediabot formats them in alphabetical order. Template based categories appear above others (generally useful, e.g. the Template:Europe based Category:European countries on Sweden).
Preferences may vary depending on the skin used (some skins display the categories conveniently in the top right corner, rather than at the bottom in a separate box). Where the categories are placed at the bottom of the pages, they are generally preceded by a footer listing the most important topics.
The ideal order isn't necessarily the same for articles with 1-5, 6-9, 10 and more categories (see stats for biographies, Hank_Aaron has 26, George H. W. Bush 16).
Personally, I find alphabetical ordering convenient for a smaller number of categories (let's say 1-5) and I'm not convinced it's a good idea to re-write the opening paragraph of the article in terms of categories. If an article has a category of its own, most of the other categories should probably go on that category rather than the article. -- User:Docu
- I'm not sure there should be a default. But if we do set a guideline, I lead toward alphabetical order, partly because it's objective. Maurreen 19:08, 7 Nov 2004 (UTC)
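To make the "objective" point concrete: alphabetical order can be computed mechanically, so any editor or bot arrives at the same result, whereas the importance-based order has to be chosen by hand. The sketch below uses the Edgar Allan Poe categories from this thread; the function name is illustrative, and the case-insensitive comparison is an assumption rather than a description of what Pywikipediabot actually does.

```python
poe_categories = [
    "Edgar Allan Poe", "American writers", "U.S. poets",
    "Science fiction writers", "1809 births", "1849 deaths",
    "Virginians", "People From Baltimore",
]

def alphabetical(categories):
    """Order category names alphabetically, ignoring case."""
    return sorted(categories, key=str.casefold)

ordered = alphabetical(poe_categories)
```

With this comparison the year categories sort first, because digits precede letters, which is the ordering Kieff objects to.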
Category for places that are not cities
A series of municipalities (e.g. these), as well as other towns, villages and resorts in Switzerland, have articles, but they won't fit into Cities in Switzerland.
Is there a suggested standard? Should they use Towns in Switzerland? -- User:Docu
- If it were up to me, I'd put all or most of them together with cities as "Communities" or "Municipalities." But if people are dividing municipalities between cities and towns, then it also makes sense to have other groups for any other types of municipalities or unincorporated communities. Maurreen 19:19, 7 Nov 2004 (UTC)
- This distinction will be particularly difficult in the U.S.:
- cities
- townships
- incorporated villages
- unincorporated municipalities
- counties which may contain cities or (as with Kings County, New York) be contained within them
- parishes (in Louisiana)
(I'm sure I'm missing something) -- Jmabel | Talk 20:02, Nov 7, 2004 (UTC)
- Just a pedantic point, but "unincorporated municipalities" is a bit of an oxymoron--the term municipality almost exclusively refers to incorporated, self-governing communities. The term "unincorporated community" is used though. A couple of other U.S.-specific terms: town (which in some places may refer to a city-like municipality or in other places to a township-like municipality), borough, and for New York, hamlet is often used to refer to communities within a town that are not self-governing or separately incorporated. older≠wiser 21:27, Nov 7, 2004 (UTC)
I was using "municipalities" in a (possibly misguided) effort to be general, but, for example, in New York State there are "unincorporated villages". -- Jmabel | Talk 23:36, Nov 7, 2004 (UTC)
- "Boroughs", of course, also exist in some parts of the UK (and probably elsewhere); Greater London, for example, is broken down into small units mostly known as "boroughs" (e.g. Borough of Kensington and Chelsea). -- Jmabel | Talk 23:36, Nov 7, 2004 (UTC)
- Perhaps to answer this question you just need to find someone from Switzerland and ask them, "What do you call these?" —Mike 05:04, Nov 12, 2004 (UTC)
Poll about Category:Whatever|* , |(space), |!
- Halló, at Wikipedia:Categorisation FAQ#How do I sort the article differently on the category page? there are examples about [[Category:Whatever|*]] or [[Category:Whatever| ]] (and also [[Category:Whatever|!]] as used at Category:Wikipedia categorization).
- For the general look and feel it would be useful to have one solution only. "*" is used in books for "notes", "!" indicates "attention!".
- How can we hold a poll about this, reach consensus, adjust the links with a bot and recommend the result to other languages as well? Regards Gangleri 12:06, 2004 Nov 8 (UTC)
- I think we should use "| " for titular articles (i.e. in Category:Swiss Military Airplanes, the article Swiss Military Airplanes should be listed as [[Category:Swiss Military Airplanes| ]]). I'm curious what other uses are in use. Maybe someone could grovel through the sql dump for this? JesseW 20:40, 11 Nov 2004 (UTC)
- "*" is the most common such sort key; see #Using sort keys to sort certain articles at the top? above. "(space)" is the second most common, and is my favourite choice. —AlanBarrett 20:54, 11 Nov 2004 (UTC)
- I'm more curious what uses other than putting titular articles first is the sort key used for. Any ideas? JesseW 21:04, 11 Nov 2004 (UTC)
- To sort people by surname instead of first name (e.g., to sort "Albert Einstein" under "E" instead of "A")
- To sort people by name instead of title (e.g., to sort "King Alfred" under "A" instead of "K")
- To sort articles in a special namespace by article name instead of namespace name (e.g., to sort "Wikipedia:Categorization" under "C" instead of "W")
- —AlanBarrett 21:17, 11 Nov 2004 (UTC)
- I guess I failed to make my question clear. I'll try again. What do people use "| ", "|*", etc. for, other than listing titular(articles whose title is the same as the category title) articles? JesseW 11:38, 14 Nov 2004 (UTC)
- Also used to highlight selected articles (or subcategories) in a category from a jumble of others, where the jumble shares some property not shared by the highlighted ones (sometimes effectively a "see also"). For example, articles named "list of <X>" in category <X>. Category purists would probably object to such usage, although I find it extremely useful. Example categories include category:U.S. states (I can't think of others at the moment, but as I run across more I'll add them here). -- Rick Block 16:41, 21 Nov 2004 (UTC)
- I think "*" would be the best solution. ja: agreed on this too. Gangleri | Th | T 05:16, 2004 Nov 13 (UTC)
- "*" seems best to me. Fredrik | talk 05:44, 13 Nov 2004 (UTC)
- I vote "|*" because it is more commonly used than "| ", I have never seen the latter. — PhilHibbs | talk 17:13, 10 Jan 2005 (UTC)
- Is this settled yet? I often run into both "|*" and "| " (although not usually in the same category). I prefer "| " (and if pywikipediabot doesn't handle this form someone should fix it). -- Rick Block 16:41, 21 Nov 2004 (UTC)
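The effect of the sort keys discussed in this poll can be simulated in a few lines. This is a rough sketch, assuming that a category page orders its members by plain character-code comparison of the sort key and falls back to the page title when no key is given; the software's real collation may differ. The "Whatever" and "Zebra" entries are hypothetical; the other entries follow AlanBarrett's examples.

```python
def effective_key(member):
    """Use the explicit sort key if there is one, otherwise the page title."""
    title, sort_key = member
    return title if sort_key is None else sort_key

def category_listing(members):
    """Return member titles in the order a category page would list them."""
    return [title for title, _ in sorted(members, key=effective_key)]

members = [
    ("Albert Einstein", "Einstein, Albert"),         # sorted by surname
    ("King Alfred", "Alfred"),                       # sorted by name, not title
    ("Wikipedia:Categorization", "Categorization"),  # namespace prefix skipped
    ("Whatever", "*"),                               # titular article first
    ("Zebra", None),                                 # no sort key given
]
listing = category_listing(members)

# Space, "!" and "*" all compare lower than any letter, in that order.
marker_order = sorted(["A", "*", "!", " "])
```

Under this assumption all three candidate markers put an entry ahead of every letter, so the choice between them is mostly one of consistency and appearance.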
Categorisation and Categorization
- Please take a look at
- At least for documentation and categories only one word (probably Categorization) should be used. meta: and other projects could be affected as well.
- Could a bot fix this? Regards Gangleri 12:18, 2004 Nov 8 (UTC)
- User:Docu
- Thanks for the google link. This is not what I was thinking about. I believe that only one word should be used, or else we would have everything twice here and a lot of (wrong) InterWiki links too. Categorization is used more often than Categorisation. Could you take care of starting to use only one term? Thanks for your efforts in advance! Regards Gangleri 15:49, 2004 Nov 10 (UTC)
Creating categories
The "Creating Categories" section says that if you add a link to a category page, the category will automatically be created. I added the text "[[Category:Carnegie Mellon professors|Blum, Manuel]]" to the Manuel Blum article, but the category wasn't created. The link appears at the bottom, under "Categories", but it's red. Am I doing something wrong? Thanks. -- Creidieki 02:01, 10 Nov 2004 (UTC)
- Nope, you got it right. If you click on the red link you can see that it is populated. When a category appears as a red link, I think it means that it is an "orphan" category and needs to have a parent category added. It might also mean that it doesn't have any text in it, but I think that is optional. For example, in Category:Carnegie Mellon professors, you might add Category:Carnegie Mellon University and perhaps some text, something like "Articles about professors at Carnegie Mellon University, which is located in Pittsburgh in the U.S. state of Pennsylvania." Of course, at present Category:Carnegie Mellon University is also an orphan category and would need similar steps. older≠wiser 02:23, Nov 10, 2004 (UTC)
[[category vs {{category?
- Some articles have
[[Category:Plants]]
at the bottom, which comes out as a link as you would expect; some have
{{Category:Plants}}
which comes out as
Categories: Botany | Tree of life
?? What's the deal? Which is more correct? Why are there two different techniques? Shouldn't these two be connected? Gzuckier 22:48, 10 Nov 2004 (UTC)
- Huh. That's interesting. Haven't come across that before. I think it may be a mistaken attempt to use some sort of shortcut by inserting Category:Plants using the template inclusion syntax: {{Category:Plants}}. The resulting "Categories: Botany | Tree of life" appear in the article because they are the parent categories for Category:Plants. If you look at the article Hen and chicks, you can see not only the categories, but also the {{Catmore}} template, which mistakenly appears below the external links section. This should be corrected whenever you come across it and converted to use [[Category:Plants]] (or a more specific subcategory). older≠wiser 23:07, Nov 10, 2004 (UTC)
Ah. Many thanks. Gzuckier 15:45, 11 Nov 2004 (UTC)
"An article on a subject should be in a category of the same name."
User:Hyacinth added this line recently. While I think it's a good idea, I don't totally understand it. I assume it means that if there is an article with the same name as a category, it should be included in the category; it does not mean that all articles should be in a category of their own, or that all categories should have an article with their same name. Or does it? If I was sure, I would just change it, but I'm not. Hyacinth, please explain further. JesseW 01:26, 11 Nov 2004 (UTC)
Better sorting for year categories
After some discussion at Category talk:Years (some of which occurred on the Village Pump), I'd like to propose the following way of handling year categories, such as Category:2004:
- Subcategories and articles should be sorted in the category by topic (and, when relevant, chronologically), rather than simply alphabetically.
- This means that an article like 2004 in film would contain the category reference [[Category:2004|Film]], 2004 Canadian budget would contain the reference [[Category:2004|Canadian budget]], and so forth. (Similarly for subcategory pages.)
- A List of... article like List of religious leaders in 2004 would contain [[Category:2004|Religious leaders]].
- The article 2004 itself would contain [[Category:2004|*]], to make it sort first among all articles.
- A month article like October 2004 would contain [[Category:2004|*2004-10]] to make it sort chronologically in the first section on the category page.
See Category:2004, Category:2003 and Category:2002 for the results of these proposals. (Note that these have changed a little since I last worked on them, so they don't follow this scheme exactly. You should still get the idea, though.)
Compare to, say, Category:2001, Category:2000 and Category:1999, which have not been systematically changed to this format.
Any objections to my adding (some of) these suggestions to Wikipedia:Categorization? Should this be officially voted on? I foresee the most controversy coming in the areas of 2004 in... and List of... type pages, but so far I've only heard one objection, from User:Docu (see his objections at User talk:D6#Better sorting for year categories). - dcljr 21:53, 12 Nov 2004 (UTC)
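The proposed scheme is regular enough to be expressed as a small function, which may help when judging the edge cases. What follows is only a sketch of the rules listed above: the function names and the title patterns it recognizes are assumptions for illustration, and titles it does not recognize are left to sort under their own name.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def _capitalize_first(text):
    """Upper-case only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]

def year_sort_key(title, year):
    """Suggest a sort key for `title` within the category for `year`."""
    y = str(year)
    if title == y:
        return "*"  # the year article itself sorts first
    month = re.fullmatch(r"(\w+) " + y, title)
    if month and month.group(1) in MONTHS:
        # Month articles sort chronologically inside the "*" section.
        return "*{}-{:02d}".format(y, MONTHS.index(month.group(1)) + 1)
    listing = re.fullmatch(r"List of (.+) in " + y, title)
    if listing:
        return _capitalize_first(listing.group(1))
    topic = re.fullmatch(y + r"(?: in)? (.+)", title)
    if topic:
        return _capitalize_first(topic.group(1))
    return title  # unrecognized pattern: keep the default
```

For instance, `year_sort_key("2004 in film", 2004)` gives `"Film"` and `year_sort_key("October 2004", 2004)` gives `"*2004-10"`, matching the examples in the proposal.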
The year without a summer
While working on Selected Anniversaries, I came across "The Year Without a Summer" which actually happened in U.S. history (also called the year Eighteen Hundred and Froze to Death). This weather-related event clearly belongs to the category Category:Anomalous phenomenon but this Category is currently a sub-category of Category:Paranormal phenomena. I would like to separate Anomalous phenomenon from this POV category, as it is more of an Unexplained item than something in a fringe category. Ancheta Wis 17:29, 14 Nov 2004 (UTC)
- I wikilinked your mentions of the categories (hope you don't mind). I think this is probably fine. You should check over the items in Anomalous phenomenon to make sure they fit the new definition for the category, but it looks OK to me. Please put a short description of what should be in the category on its page. That will help future categorizers. JesseW 22:23, 14 Nov 2004 (UTC)
- Er. There is no Category:Anomalous phenomenon. Um... JesseW 22:26, 14 Nov 2004 (UTC)
- Created Category:Anomalous phenomena
Removed section of discussion from article
I've removed the following section from the project page because it's mainly hypothetical discussion, which belongs here on the talk page. If someone wants to put it back, please expand the original idea in a way that takes into account, but does not literally contain, the comments. - dcljr 21:26, 21 Nov 2004 (UTC)
Category extraction
An advantage of categorization is that it allows extraction of large portions of Wikipedia. For instance, if years and dates were as below (leftmost items are regular articles, the rest are categories), extracting, say, a timeline for the 21st century would be trivial.
2004 -> Years in the 21st century -> Years -
                                            \
                                             --> Time periods
                                            /
30 March -----> Days in March ----> Days ---
- Please expand this explanation. I see no way from this to "extract a timeline for the 21st century" just a way to create a list of, say, years in the 21st century or days in March. So where is the whoopie in that? - Marshman 17:32, 5 Jun 2004 (UTC)
- Where this becomes slightly more interesting is when you have articles on historical events (e.g. Pearl Harbor, John F. Kennedy's Birth, the Great Northeast Blackout, etc.) put in the appropriate time-related category. But the ability to do completely automated extraction depends on how structured the category relationships are. You'd ideally like to be able to specify that the article is about an "event that occurred during" the category or "is a part of" or "is a member of" (say, for geographical or political relationships). So far we can only specify generic parent-child and "is related to" assignments; any other semantics must be inferred. -- Beland 09:32, 13 Jun 2004 (UTC)
- Sample with Canadian biographies: Wikipedia:People by year/Reports/Canadians. -- User:Docu
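The kind of extraction debated in this section can be sketched as a walk over the parent-child links in the diagram. The data structure and function name below are hypothetical; the sketch only follows generic "is in category" links, which, as Beland notes, is all the category feature currently records, so it yields a list of pages rather than a timeline with any richer semantics.

```python
from collections import defaultdict, deque

# Each page or category mapped to the categories it is placed in,
# following the diagram in this section.
PARENTS = {
    "2004": ["Years in the 21st century"],
    "Years in the 21st century": ["Years"],
    "Years": ["Time periods"],
    "30 March": ["Days in March"],
    "Days in March": ["Days"],
    "Days": ["Time periods"],
}

def pages_under(category, parents):
    """Collect every page or subcategory reachable below `category`."""
    children = defaultdict(set)
    for page, cats in parents.items():
        for cat in cats:
            children[cat].add(page)
    found, queue = set(), deque([category])
    while queue:
        for child in children[queue.popleft()]:
            if child not in found:  # also guards against category loops
                found.add(child)
                queue.append(child)
    return found

century = pages_under("Years in the 21st century", PARENTS)
everything = pages_under("Time periods", PARENTS)
```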
Categorising articles about sources/references
Hi, wondered if any of you had any good ideas about how to organise a schema for articles about sources/references? We've had a look at the existing category schemes and this seems to be rather big gap. Discussion at Wikipedia_talk:Forum_for_Encyclopedic_Standards#Source_category : ChrisG 18:40, 28 Nov 2004 (UTC)
Human/Personal Life
I'm not very familiar with how articles that fall under these categories are categorized but my quick checks indicate that the categorization scheme excludes the possibility of "intelligent" alien life with the same things as humans. Brianjd
Categories vs keywords
I believe many contributors think of categories as keywords rather than hierarchical "is a" relationships and effectively attempt to use categorization as a general indexing mechanism for Wikipedia articles. In the limit, I think this leads to a category for every word or concept expressed in an article as well as requests for features like category intersection and union. As gracefool eloquently points out in What is a category? the current category feature is in reality a mechanism for defining sets where each category (or set) contains articles and other categories (sets). Using explicitly defined set membership as an indexing mechanism seems fairly inefficient. Given the ability of Google to fully index nearly every page on the entire web, I don't see any particular reason Wikipedia should not provide a searchable, full text, index of all its articles. In addition, just as Google provides the ability to restrict searches by DNS domain, Wikipedia could provide the ability to restrict searches by category and by Wikipedia links. I think this would go a long way toward eliminating the desire to create categories and could help avoid some of the arguments about how articles are categorized. For example (categories currently in WP:CFD):
- category:Oil for Food - completely unnecessary, search for "links-to:oil for food" (this category isn't even currently necessary given "what links here")
- categories of the form <some property> <actual category> (e.g. category:Venezuelan Soap Operas) - completely unnecessary, search for "links-to:soap opera" and "links-to:venezuela". If you don't trust that the links exist in the article, search for the words "soap opera" and "venezuela".
The point is that many (perhaps even most) of the existing categories have nothing to do with categorization but rather address some form of indexing. If indexing is explicitly addressed with a generalized search mechanism, the category feature can be used for something else. -- Rick Block 20:26, 5 Dec 2004 (UTC)
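As a rough illustration of the proposal, a "links-to" search that combines several targets amounts to a subset test over each article's outgoing links. Everything in this sketch is hypothetical: the `links-to:` operator is only proposed above, and the article titles and link data are made up for the example.

```python
# Hypothetical index: article title -> set of pages it links to.
LINKS = {
    "Hypothetical telenovela A": {"soap opera", "venezuela"},
    "Hypothetical telenovela B": {"soap opera", "mexico"},
    "Hypothetical oil company": {"venezuela", "petroleum"},
}

def links_to(*targets, links=LINKS):
    """Return the titles of articles that link to every one of `targets`."""
    wanted = set(targets)
    return sorted(title for title, outgoing in links.items()
                  if wanted <= outgoing)

venezuelan_soaps = links_to("soap opera", "venezuela")
```

Here the combined query plays the role that a dedicated category such as category:Venezuelan Soap Operas would otherwise fill.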
List vs. Topic categories?
If you descend in Category:Rock music groups, you can get to lots of non-music-groups. For example, ->Category:The Beatles->John Lennon. I'm pretty sure that this type of structure is deprecated. But Wikipedia:Categorization and its subpages are huge and ill-organized, so I haven't been able to find an unambiguous straight answer. Could someone in the know confirm or deny? Do you agree that Category:Rock music groups needs a big reorganization? --Dbenbenn 03:55, 8 Dec 2004 (UTC)
- IMO, even assuming there were "rules" about this (and I don't think there are any - and I've recently read all the archived talk from this page), they're effectively completely unenforceable. Per the previous topic on this page categories are simply sets, i.e. named collections of arbitrarily related articles and other categories. Rather than some sort of strict "is a" hierarchy I think a much better mental model for categories as they currently exist is a neural network - quite similar to how articles are freely linked to other articles. Bottom line is I don't think it's a problem that from category:Rock music groups you can "descend" and get to John Lennon. In fact if either Category:The Beatles was not in Category:Rock music groups or John Lennon was not in Category:The Beatles I'd say the categorization scheme was broken. -- Rick Block 05:06, 8 Dec 2004 (UTC)
- I agree. I think the categorization is more useful to readers if the category means, "these are things pertaining to rock music groups" (topic) rather than, "these are all rock music groups" (list). --Gary D 07:38, Dec 8, 2004 (UTC)
- Thanks for your responses. "they're effectively completely unenforceable": I don't consider this a problem. Wikipedia has lots of style conventions that aren't "enforced". That's what human editors are for. For example, there's nothing that enforces the fact that Category:Rock music groups means "these are all things pertaining to rock music groups".
- I wasn't clear on how I think it should be organized. 1) Leave the article The Beatles in Category:Rock music groups, and in Category:The Beatles. Make Category:The Beatles a subcategory of Category:Rock music. The point is, Category:Rock music is a "topic category", whereas Category:Rock music groups looks to me like it should be a "list category". That way, you can still link from Category:Rock music groups to The Beatles to Category:The Beatles to anything about The Beatles. --Dbenbenn 17:14, 8 Dec 2004 (UTC)
Inconsistent criteria
The business and economics categories were disorganized, so for the last week I have spent my time trying to sort them out. The way I approached it was to create a variety of categories, subcategories and subsubcategories. (There are about 50 of them now). Then I have been going through the articles listed in the old navigation system and deciding which of the 50 or so categories are applicable to each article. It turned out that the average is 2 or 3 categories per article. So far I have added about 1000 tags. I still have 6 of the old lists to go through. In going through this process I have discovered that different people have different ideas about what criterion to use in appending category tags. In particular, I have been in conflict with two other contributors:
- One felt that there should be only one category per article and deleted all but the single most relevant tag.
- The other felt that an article could not be placed in adjacent categories.
I reject both these criteria. The criterion I use is to put myself in the mind of the user who is using the category system as a navigational device. I ask myself, "If I was browsing in (for example) the Finance category, what articles would I expect to find there, and what articles would I find useful there?" Can we arrive at some sort of policy on the appropriate criteria to use before I get into any more edit wars about something as unimportant as which category tags to use? mydogategodshat 17:39, 12 Dec 2004 (UTC)
- "One category per article" is just plain silly. Biographical articles, for example, are nearly always in two categories just for their birth and death year, plus another for nationality (although, depending what the person did, the latter might embrace occupation as well); an author who wrote significantly in two languages should be categorized as a writer in both languages, similarly one who was significantly connected to an ethnicity other than his/her citizenship. Etc.
- The other I'm not sure I even understand what you are saying: does calling two categories "adjacent" mean they have at least one common parent category? Again, the example of a person in the same profession in two different countries at different points in a career is one where this would almost certainly be correct, so it can't be a general principle. -- Jmabel | Talk 19:40, Dec 13, 2004 (UTC)
- The complaint I was getting is that if an article is listed in, for example, [category:Product management]], it can not also be listed in [category:Marketing]] because product management is included as a subcategory on the marketing page. These people seem to think that no article included in one category should be included in any category that links to that category. mydogategodshat 21:16, 13 Dec 2004 (UTC)
- Let's take an example to clarify things. In deciding what category tags to put on income statement I would ask myself "Browsers on which category pages would likely find an entry for income statement useful?". I would conclude category:Finance, category:Accounting, and category:Business. I would place these three category tags on the article. But there are some people that would revert this, claiming that because these three categories are directly linked, one or two of the tags are redundant. mydogategodshat 21:35, 13 Dec 2004 (UTC)
- I would agree that you should not generally use both a category and its parent category. The only exception is if the article is the "main article" of a category, then it also goes in the parent category. For example, Accounting would go in both category:Business and category:Accounting, but Income statement would go in category:Accounting but not category:Business, which would indeed be redundant. -- Jmabel | Talk 00:05, Dec 14, 2004 (UTC)
- At the heart of the issue is whether a structural criterion should be used or a content- or user-based criterion. When one asks "What articles would a user expect on a category page?" one is starting from the user's expectations and working back to the database architecture. When one starts with a structural criterion like the number of category tags per article or the exclusivity of related categories, one is starting from a conception of an ideal architectural structure and force-fitting user expectations to it. I feel a compelling argument can be made that using a structural criterion is doing things backwards. The database structure is there to serve the user, not the other way round. Where the two criteria are in conflict, structural elegance must yield to user friendliness. The category content should reflect the real world, rather than trying to make reality conform to a categorical structure. I have seen several category wars where one person appends a category to an article and another person changes it to another category, claiming that their link is more relevant. This is the type of nonsense that results from using structural criteria like these. If instead we let reality shape the structure of the database, links would be provided to both categories, irrespective of the relationships between categories. Right now there is an argument on the category votes for deletion page over category:international trade. Some want to delete it because they think there is too much overlap with category:international economics (a structural criterion). Others want to keep it but delete category:international economics. The whole discussion is misguided. There should be considerable overlap between the two categories because browsers on both pages would expect to see some of the same articles. Many of the topics dealt with in international economics (an economics subject) are also dealt with in international trade (a business subject).
Any attempt to force these topics to conform to an ideal database structure free of overlap is dysfunctional. It is an example of "the tail wagging the dog". There are other problems with exclusivity criteria. Exclusivity between categories and subcategories will, in time, result in a structure that highlights the poorest articles while hiding the most important articles. As more and more subcategories are added, the most important articles, the ones that are important enough to rate a subcategory of their own, will get further and further away from the parent category. The parent category will be left with all the "odd" articles, the "left-overs". In time the parent category will highlight the poorest articles, whereas the most important ones will be buried in subcategories and sub-subcategories. mydogategodshat 04:58, 15 Dec 2004 (UTC)
- I guess I'm coming in here a little late. I was one of the users who removed some categories. Generally I agree with the categorization policy, which is that we should use the most specific categories which are applicable. This means we don't add the same articles to Category:Business, Category:Accounting, and Category:Management accounting. The reason is simple: It would be hard to enforce this as a standard. It creates needless work to require or suggest that articles should be added to categories as well as their parents. The likely result of allowing this is spotty, inconsistent categorization. And from a semantic point of view, it is useless to add an article to a category as well as its parent. Rhobite 05:53, Dec 21, 2004 (UTC)
- Maybe I did not explain myself well, but I am not suggesting that "articles should be added to categories as well as their parents". I am suggesting that we abandon any structural criteria (including this one). Whether a category is the parent of another category has no bearing on whether an article should be included in those categories. The amount of overlap between categories is also irrelevant. So is the number of category tags on any one article. The only salient criterion is "Would someone browsing through a given category page expect to see the article listed on that page?" mydogategodshat 04:11, 23 Dec 2004 (UTC)
- Strongly disagree: Otherwise, most articles would be in every ancestor category of any category that pertains. Broad categories would end up with thousands of articles. -- Jmabel | Talk 06:23, Dec 23, 2004 (UTC)
- That sounds like a "straw man argument" to me. I don't see why they would be in "every ancestor category". Based on what I have done so far it looks like the 1600 business and economics articles would be categorized as: about 700 in the business category, between 100 and 300 in each of the major subcategories (for example marketing, finance, etc.), and fewer than 100 in each of the sub-subcategories. I am not claiming that articles should be placed in both subcategories and parent categories. I reject all such structural criteria. Take the example of Income statement. It should be in the parent category Business, as well as the subcategories finance and accounting, because income statements are important in all of business, not just in accounting and finance. In my opinion, of the 1600 business articles about 700 are general enough and broadly applicable enough to go into the business category (in addition to subcategories). mydogategodshat 06:41, 23 Dec 2004 (UTC)
- I'm with Jmabel and Rhobite. From a user point of view a category needs to be usable. Having hundreds or even thousands of articles in a category makes it unreadable; one cannot see the wood for the trees. Navigating the category system should lead you from general to further detail, or from detail to generality. Having articles in ancestor categories defeats the value of this. :ChrisG 18:50, 23 Dec 2004 (UTC)
- I think your fears are unwarranted. When I look at the category:Business page with over 500 articles, I do not see something that is "unreadable". I see a very usable alphabetical list of the 500 articles most relevant to business in general. And your criterion that the navigation system must move from general to specific, even if it is a useful criterion, will not be accomplished with the current system, one in which articles in areas important enough to be placed in subcategories and sub-subcategories become buried while the main categories are populated with the "left-overs". mydogategodshat 21:52, 23 Dec 2004 (UTC)
- Agree with Jmabel, Rhobite and ChrisG. The category system shouldn't become a keyword system. I think the category system can accommodate both "IsA" categories such as Category:American writers (the most common and "clean" use of categories) as well as related topics such as Category:Biology without becoming a free-for-all keyword system if we don't go overboard on categorizing articles and focus on only including the most specific category. --Lexor|Talk 21:23, Dec 23, 2004 (UTC)
- I think the idea of "focusing on only including the most specific category" is unworkable. What is the "most specific category" for managerial economics? Is it category:management or is it category:economics? If we stop trying to make reality fit into the categorization system, rather than the other way round, surely we would conclude the article belongs in both categories, irrespective of the relationships between the categories. mydogategodshat 22:03, 23 Dec 2004 (UTC)
- It would belong in both; neither category:management nor category:economics is a subcategory of the other. -- Jmabel | Talk 23:28, Dec 23, 2004 (UTC)
- Jmabel is right, there can be multiple categories for each article, it's just that it should be in the lowest rung of the hierarchy for that particular kind of category. For example, central dogma of molecular biology is in Category:Molecular biology (but not in Category:Biology) and Category:Molecular genetics (but not in Category:Genetics). --Lexor|Talk 02:25, Dec 24, 2004 (UTC)
- Agreed. Some articles will inevitably be in a lot of categories, because they don't fit into the various categories we create in Wikipedia or typically in the world. It is far more preferable to put an article in multiple specific categories than one general category. :ChrisG 09:58, 24 Dec 2004 (UTC)
- I prefer to put them in both specific categories and general categories if they are relevant to both. mydogategodshat 20:04, 25 Dec 2004 (UTC)
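For readers trying to follow the "most specific category" rule argued over in this thread, here is a minimal sketch of how one might check an article's tags for a category that is also an ancestor of another tag. The hierarchy below is an assumption pieced together from the categories named in the discussion, and the function names are hypothetical; nothing like this exists in the wiki software. It also ignores the "main article" exception mentioned above.

```python
from collections import deque

# Hypothetical parent links, loosely based on the categories named in this thread.
PARENTS = {
    "Management accounting": {"Accounting"},
    "Accounting": {"Business"},
    "Finance": {"Business"},
    "Marketing": {"Business"},
    "Product management": {"Marketing"},
    "Business": set(),
}


def ancestors(category, parents=PARENTS):
    """Return every category reachable by following parent links upward."""
    seen = set()
    queue = deque(parents.get(category, ()))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents.get(current, ()))
    return seen


def redundant_tags(tags, parents=PARENTS):
    """Return the tags that are an ancestor of another tag on the same article."""
    tags = set(tags)
    return {
        tag
        for tag in tags
        if any(tag in ancestors(other, parents) for other in tags if other != tag)
    }


# The "income statement" example: Business is a parent of the other two tags.
print(sorted(redundant_tags({"Business", "Accounting", "Finance"})))
# Sibling categories are not flagged.
print(sorted(redundant_tags({"Accounting", "Finance"})))
```

Under the structural rule, the flagged tags are the ones that would be removed; under the user-driven criterion argued for above, they would simply be left in place.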
Special:categories page
Page Special:categories lists all categories alphabetically, beginning with numbered years, but is useless because I don't have the patience to scroll past the numbered entries! I need to get all the way to the L's to find the name of the category I need. Can a better way be found to help identify what categories are defined? RJFJR 05:43, 17 Dec 2004 (UTC)
- Agreed. What's with the first link - [[:Category:]]? Brianjd 05:45, 2004 Dec 17 (UTC)
- Yes, see User:Pearle/categories-alpha (warning: 600KB+) for an alphabetical list of all categories that existed at the time of the last database dump. -- Beland 03:41, 17 Jan 2005 (UTC)
- Use search, then turn on the "Category namespace" checkbox, and turn off all the other checkboxes. That will search amongst just the categories. (Of course, this doesn't work when internal search is turned off and redirected to Google and Yahoo!)-- Khym Chanur 12:07, Jan 17, 2005 (UTC)
How to categorize defining elements of a category
How does one categorize the defining element of a category, so that, for example, the article "city" is somewhere pointed to in "Category:Cities"? It seems some have done this by putting the article city in a category like: Category:Cities|? or Category:Cities|* and this may be appropriate. The article "city" is not actually a city (as an article like "New York City" would be) so it doesn't belong in a category that is supposed to contain cities (although it would be appropriate in a category like "Category:Urban studies and planning"). Also, it seems that the first paragraph of the defining element article should perhaps be automatically put in the intro text for the category, increasing the automation of categories. - Centrx 22:46, 21 Dec 2004 (UTC)
- As far as I know (and I've done a fair amount of categorization work lately) there aren't any precise rules about this. Perhaps the easiest solution is using the catmore (or catmore1) template in the category page itself which generates some text suggesting the reader might be interested in the indicated link. With catmore (syntax is {{catmore}}) the link is the article with the same name as the category name. With catmore1 (syntax {{catmore1|[[whatever]]}}) the link is provided as an argument to the template. As you've noticed, the article may or may not also be added to the category using a "|*" or "| " sort key. I agree it might be nice to automatically show the first paragraph of an article with the same name as a category as the category text (I've just done this manually for all subcategories of category:Japanese prefectures), but I don't think I'd hold my breath for this. -- Rick Block 05:02, 23 Dec 2004 (UTC)
- One of the problems with catmore is that it doesn't actually provide any information to the reader when he views the category, only a link. Regarding the defining element, what then is the appropriate course of action for articles which define the subject of a category, but are not in themselves a proper part of that category? - Centrx 05:42, 23 Dec 2004 (UTC)
- IMO, simply link to the article(s) defining the subject of the category in the text of the category. See, for example, Category:Canal engineers, or Category:Aichi Prefecture. You can add as much text as you'd like to define the category (which, perhaps in some remotely distant version of the software, could be automatically generated from a like-named article), and then use catmore or catmore1. Is there some specific example you're worried about? -- Rick Block 05:53, 23 Dec 2004 (UTC)
- It seems appropriate that there should be, somewhere in the defining article, a link to the category which it happens to define. Less firmly (in my mind), I think it might be appropriate that this category link might be with the other category links, but I suppose that it could also be linked to in the introduction... - Centrx 01:52, 24 Dec 2004 (UTC)
I'm running into this increasingly often lately. I believe articles ABOUT the subject matter of a category should NOT be category members, but should rather be referred to in the category description and should link to the categories (using See also: or something similar). I'm finding too often that others come along and change the articles I set up that way so that the articles are instead within the category.
I find this less useful, since these articles are different from things that are semantically part OF the category, and should be picked out specially rather than lost in the morass of category members. —Morven 00:13, Dec 28, 2004 (UTC)
- That is the reason they are normally listed at the beginning of the category using the piped sort (ie. '[[Category:Whatever| ]]'). Unfortunately, repeated discussions about categorizing, like this one, point out the inadequacies of the present software. Hopefully, someday they will develop a better implementation of categories that make all this moot. —Mike 02:21, Dec 28, 2004 (UTC)
- It's not just the inadequacies of the software, but also disagreements about what categories should mean. Some have the view that a category is generally an 'IS-A' relationship. Others think a category should contain related subjects, even if they're not strictly in the set. Maybe some software tools to make assumptions more explicit would help (e.g. maybe automatically generated see-alsos for categories, instead of putting them as members, or something.) —Morven
World War II category question
Should I delete Category:Russian World War II people and create something like Category:Soviet Union World War II people ???
Darwin 16:34, 28 Dec 2004 (UTC)
Case-sensitive sorting
The case-sensitive sorting of category entries is rather annoying. Are there any established conventions for dealing with it? For example, I just added the sort-text "Ponie" to the PONIE article to stop it from appearing before Perl 6 and Perl Design Patterns Book in Category:Perl. A more serious example is Category:Free software which has a lot of capitalized entries such as GNOME, GIMP, LAMP, etc. — PhilHibbs | talk 14:49, 10 Jan 2005 (UTC)
- I have logged a bug report for this. — PhilHibbs | talk 10:39, 11 Jan 2005 (UTC)
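To illustrate the sorting problem and the sort-text workaround described above, here is a minimal sketch in Python. It assumes the category listing orders entries by plain code-point comparison of the sort key (which matches the behaviour reported above, but is not taken from the software itself), and `suggest_sort_key` and `category_tag` are hypothetical helpers.

```python
def suggest_sort_key(title):
    """Hypothetical helper: give an all-caps title a sort key such as 'Ponie'."""
    return title.capitalize() if title.isupper() else title


def category_tag(category, title):
    """Build the category link, adding a piped sort key only when one is needed."""
    key = suggest_sort_key(title)
    if key == title:
        return f"[[Category:{category}]]"
    return f"[[Category:{category}|{key}]]"


titles = ["Perl 6", "PONIE", "Perl Design Patterns Book"]

# Case-sensitive order: "PONIE" comes first because "O" sorts before "e".
print(sorted(titles))

# Order once the all-caps title is given a mixed-case sort key.
print(sorted(titles, key=suggest_sort_key))

# The tag that would go on the PONIE article.
print(category_tag("Perl", "PONIE"))
```

The same approach would apply to the all-caps entries in Category:Free software, although choosing a sort key by hand for each article remains a workaround rather than a fix.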
Proposed policy: article namespace categories should not be added to user pages
My understanding is that main namespace categories, especially those for people, should not include Wikipedia users. There is no specific admonition against this in the guidelines, but I think one should be added. The closest to an admonition is included in: Wikipedia:Categorization#Wikipedia namespace, where it says:
- Categories relating to the Wikipedia namespace should be added only to the talk page of articles. For example, tags suggesting the article is needs work, or is listed on VfD would be placed on the talk page as they are relevant to editors, not an aid to browsing in the way ordinary categories are. Please use {{wpcat}} on the Category description page to show that it is a Wikipedia-namespace category.
This arose because on Alkivar's user page, he lists himself in several article namespace people categories such as: Category:1978 births, Category:DJs, Category:People from Maine, Category:People from Maryland and Category:Libertarians, which is not the intention of these article namespace categories. Listing himself in Category:Wikipedian musicians, however, is entirely appropriate as it is a Wikipedia-specific category. I removed the non-Wikipedia specific categories and he reverted the change, claiming no specific admonition against it. It seems clear that it is, or should be, the implicit rule, so I propose to add the following explicit guideline:
- Categories relating to the User namespace should be added only to Wikipedia-specific categories
- Users should not add their user pages to article namespace categories such as Category:People or other subcategories, Category:Biologists etc, which are reserved for pages in the article namespace. However it is entirely appropriate to add a user page to Wikipedia-specific categories such as Category:Wikipedians or other similar subcategories such as Category:Wikipedian musicians.
Any objections, please let me know. --Lexor|Talk 09:57, Jan 16, 2005 (UTC)
- Seems like a no-brainer. Could be even more specific and say that user pages should only be in categories descended from Category:Wikipedians (or possibly other categories useful for pages outside the main namespace if anyone can think of any). grendel|khan 10:22, 2005 Jan 16 (UTC)
For the record, there is some discussion about this at m:Help:Category, which is what I have cited when asking folks to remove categories from their user pages in the past: "Linking from a test page, user page, etc. to a category is considered to pollute the category." -Aranel ("Sarah") 23:37, 20 Jan 2005 (UTC)
Category creation
The section on creating categories says:
How to create categories Creating a category is as simple as adding a soft link to the appropriate article in the Category: namespace; for instance, to add Felis silvestris catus to the "fluffy creatures" category, you would edit the article and enter [[Category:Fluffy creatures]] at the bottom, but before interlanguage links. Although the link will not appear in the article text, a page called Category:Fluffy creatures is automatically created and it will list alphabetically all articles that contain the Category:Fluffy creatures link. The appeal of categories is that unlike lists, they update themselves automatically, and that one can use them to quickly find related articles. However, categories are not a substitute for lists, and you will find that many articles belong to both lists and categories.
This is incorrect. The category is not created automatically. You have to create it yourself by adding something to that page. You can still put articles into a non-existent category, but it isn't of any use to anyone. Can we please update this to reflect the actual situation? A red-linked category, for all practical purposes, doesn't really exist. (Also, a link to the category does appear after the article text.) -Aranel ("Sarah") 17:40, 22 Jan 2005 (UTC)
Category - Images "OF" People should not mean "WITH" people, should it?
Are too many images merely with a person somewhere in the picture being included in the category "images OF people"? Wikityke 22:15, 4 Feb 2005 (UTC) Discussion at Category_talk:Images_of_people
Super Categories
This category/subcategory thing needs some rethinking. As I see it, the general rule that "if something is in a subcategory, it shouldn't also be in the supercategory" often does not make sense. Sometimes the subcategories mark clear distinctions between things, but sometimes the subcategories are just unimportant attributes imposed on the category. I'll give you some examples that make sense:
Category:Musical theatre has two subcategories; Category:Operettas and Category:Musicals, both of which have all the articles about individual works of Musical theatre. This makes sense because:
- There is very little overlap between Operettas and Musicals; they are almost distinct categories
- Most people looking for a list of works would find this distinction helpful.
- The distinction that makes the subcategory is intrinsic to the category, not just a randomly chosen attribute. For instance, instead of using Operettas and Musicals, the subcategories could have been based on the year the works were composed. This would not be very helpful for someone looking for a list of musicals.
Category:Musicals has the subcategory Category:Musical films. This is a trickier situation. Some of the articles in Musicals are in both categories. For some titles there are separate articles for both the movies and the theatre productions. This makes sense because:
- If Wikipedia were complete there would be separate articles for both
- The films almost always come after the theatre productions
An argument could be made for making Musical Films a subcategory of Musical Theatre instead of Musicals but it doesn't really matter.
Some categories do not work so neatly. An example which is really bothersome is Category:Film directors which has the subcategory Category:Film directors by nationality which has 28 subcategories. It does not make sense to have each director only listed in a subcategory by nationality. The nationality of the director is interesting, but not all that important. Some directors start in one country and move to another. I have no problem with there being categories for directors by nationality, but I think ALL of them should also be in the directors category. The reason for this is:
- Having them in both categories makes it easier to find a director if you know his nationality, and MUCH easier if you don't know his nationality.
Which brings me to Category:Bridges in New York City. This category has the subcategory Toll bridges in New York City. Whether a bridge has a toll or not is not all that important, and the attribute does not instruct the reader to notice something important about bridges. If you want to see the articles about the bridges in New York City, why should you have to look in two places?
The notion that articles should not be listed in categories and subcategories strikes me as an artifact left over from libraries. The beauty of hypertext is that things can be linked many ways, not just organized on shelves. Why can't things be in multiple categories? I'd like to see ALL the bridges in New York State listed in Category:Bridges in New York. This makes it easy to see a list of all the bridges in a geographical region, and also the subregion.
I made the change for all the bridges in New York City. But within a day they were all reverted. I'd like to do it for bridges in the rest of the world, film directors and some other categories, but I know I need the consensus of everyone else. I've read most of the relevant discussion above, and I don't see a good argument for keeping things the way they are. The important thing is to make Wikipedia USEFUL. Samuel Wantman
- Yes, I quite agree. I made a similar point when User:SPUI moved Indiana Toll Road and Ohio Turnpike from Category:Transportation in Indiana and Category:Transportation in Ohio into Category:Toll roads in Indiana and Category:Toll roads in Ohio. I objected that both roads are very prominent fixtures of the transportation systems in both states, and are, as far as I know having lived in both states for many years, the only toll roads in either state. While it is quite possible that there may indeed be some other toll roads in those states, those have nowhere near the recognition of the ones mentioned above. It makes no sense to me to place these roads into what are currently (and as far as I can tell, for the foreseeable future) single-item subcategories. I don't really see the point of the toll road subcategories in these states where toll roads are relatively uncommon, if not singular entities. But while I'm willing to leave these subcategories be, it seems ridiculous to me to only categorize such prominent features of the transportation systems of those states in minor subcategories. I might note that many of the other by-state subcategories under Category:Toll roads in the United States are also single-item categories, though since I'm not familiar with those states I can't really say anything about the relative prominence of the roads in those states. older≠wiser 02:31, Jan 31, 2005 (UTC)
- Actually, the Ohio Turnpike is really only relevant if you live in northern Ohio, but I see your point. I think the point of putting it in its own subcategory is to make navigation via Category:Toll roads in the United States more consistent. However, there is no clear policy on whether it makes any sense to have one-member categories for navigational purposes. (Though the current trend is to say that yes, it does. See Category:Albums by artist.) Anyway, with the toll roads, I would just about go as far as saying that there is little point in having individual state subcategories; there aren't that many toll roads.
- For film directors, it is against current policy to put them in both a specific nationality category and the parent Category:Film directors. All members of subcategories are supposed to be considered members of the supercategory. The current wiki software doesn't allow that, but that doesn't make it any more useful to throw all the film directors into one unorganized mess. (Also, huge supercategories make it harder to find the subcategories with the current system. If there are more than 200 members, you start having to go through pages to find the children categories.) Ideally, of course, we could put an article in Category:Film directors and a nationality category and look up the intersection of the two, but at the moment that isn't possible.
- (Note: If you know the name of the director, you're not going to be looking up the article via categories, anyway. Huge categories are not helpful to casual browsers.)
- Category:Musical films is a special case. (There are actually musical films that were not ever stage musicals. See for instance Moulin Rouge!.) Ideally, we would come up with some term for musicals that appear on stage and make that a sibling. -Aranel ("Sarah") 03:07, 31 Jan 2005 (UTC)
I agree with Samuel that the category/subcategory thing is a problem. I've been thinking about it for a while, and here's what I decided: The reason for the difficulty with Category:Film directors vs. Category:Film directors by nationality is that Category:Film directors by nationality is fundamentally flawed. We shouldn't have subcats for "randomly chosen attributes", as Samuel put it. Instead, have
- Category:People by occupation with Category:Film directors as a subcat
- Category:People by nationality with country subcats
and other random things like
- Category:People by height with a subcat for each, say, centimeter
- Category:People by handedness with left and right subcats
etc. Then we need a software feature that allows a page to represent the intersection of some categories. So, for example, if I decide I want to see all left-handed Hungarian film directors, I can just request the intersection of Category:Left-handed people, Category:Film directors, and Category:Hungarian people. I don't have to explicitly put people into Category:Left-handed Hungarian film directors, as now.
Under this scheme, it would be possible to categorize a person with all their attributes (height, handedness, occupation, etc.) without having to worry about explicitly making intersection categories. The category organization would become flatter, with many current subcats replaced by the new software feature.
I think this proposal would neatly solve the problem that Samuel described above. dbenbenn | talk 04:13, 31 Jan 2005 (UTC)
Another example: the proposed intersection feature would allow you to find people who were born in 1850 and died in 1950 as the intersection of Category:1850 births with Category:1950 deaths. As it is now, you'd need a Category:1850 births, 1950 deaths, which would just be silly. dbenbenn | talk 04:29, 31 Jan 2005 (UTC)
- That's quite a good idea. I wonder if it's feasible performance-wise. The other problem I foresee is there's no obvious user interface. – flamuraiTM 04:19, Jan 31, 2005 (UTC)
- Thank you :). I haven't really thought about how the user interface would work yet. You would definitely want to be able to link to the intersection of two categories. One possibility: you could, for example, simply make a page IntersectionCategory:Left-handed Hungarian film directors with the content
- #intersection [[:Category:Left-handed people]] [[:Category:Hungarian people]] [[:Category:Film directors]]
- dbenbenn | talk 04:29, 31 Jan 2005 (UTC)
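As a rough illustration of the intersection idea proposed above, here is a minimal sketch in Python. The `#intersection` line is the proposed syntax from this thread, not an existing software feature; the membership data and the names `MEMBERS`, `parse_intersection` and `intersect` are hypothetical placeholders.

```python
import re

# Hypothetical membership data; the member names are placeholders, not real articles.
MEMBERS = {
    "Left-handed people": {"Person A", "Person B", "Person D"},
    "Hungarian people": {"Person A", "Person C", "Person D"},
    "Film directors": {"Person A", "Person B", "Person C"},
}


def parse_intersection(line):
    """Extract category names from a line in the proposed '#intersection' syntax."""
    return re.findall(r"\[\[:Category:([^\]|]+)\]\]", line)


def intersect(categories, members=MEMBERS):
    """Return the articles that belong to every one of the named categories."""
    sets = [members.get(name, set()) for name in categories]
    if not sets:
        return set()
    return set.intersection(*sets)


page = (
    "#intersection [[:Category:Left-handed people]] "
    "[[:Category:Hungarian people]] [[:Category:Film directors]]"
)
names = parse_intersection(page)
print(names)
print(sorted(intersect(names)))
```

With flat attribute categories like these, no one has to maintain a Category:Left-handed Hungarian film directors by hand; the cost moves to computing the intersection on request, which is the performance question raised above.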
I agree that the whole category/subcategory criterion as it is now formulated is seriously flawed. From previous discussions, I have concluded that the rationale for the "no listing in both a category and also in a sub/super category rule" goes something like this: "We want to impose order in our database system. If we don't, the resulting chaos will make the system unworkable." But this is nothing more than fear mongering based on faulty reasoning and a psychological predisposition for structure. In fact, abandoning the "not in adjacent categories rule" will make the system more user friendly. The criterion we use in deciding what categories articles should be placed in should be user driven. We should ask ourselves in which categories the user would expect to see an article listed. We cannot forget that the purpose of the category system is to make it easy for users to find articles, not to create an elegant database free from duplications or overlaps. By abandoning such rules of structure we will create a database that matches the real world and the expectations of real people. Rather than forcing reality to fit into our preconceived notion of what an ideal database structure looks like by using such rules, we should let reality shape the structure of the database. Some specific examples will illustrate. I started to append category tags to the business articles. Of the 1600 business articles, about 700 would go in the business category and between 100 and 300 in each of the 20 main subcategories. Obviously there was overlap. This is because an article like income statement, for example, while it is primarily an accounting and finance term, is also a general business term. It is used by marketers and other business people as part of their discipline, not in an accounting context. It is a general business term applicable to all of business and as such belongs both in the accounting category and the business category.
Since I started working on the Business category, most of the edits have been reversed by those claiming that an article cannot be in both a category and a subcategory. I am certainly not going to waste any more time working on such a dysfunctional system. The "no listing in both a category and also in a sub/super category rule" is not the only structural criterion that plagues the category system. There is also the "Too much overlap rule". The International trade category is currently listed for deletion and the reason given is that there is too much overlap with the International economics category. Well, news flash . . . in the real world there is considerable overlap between the two subjects. There is also the "categories must be structured only along one dimension rule". I discovered this one in regard to the business law category that someone wanted to delete. The rationale given is that the other law subcategories are structured on theoretical grounds rather than practical ones. On this argument, practical subtopics like business law, maritime law, and real estate law should not have their own categories because it would be out of step with the system of legal subcategories based on legal theory. I say stamp out all of these structural criteria and let there be only one criterion, a user-driven one. (If, for some reason, you want to hear more, the section "Inconsistant criteria" earlier on this page has more of my rantings on this topic) mydogategodshat 17:23, 31 Jan 2005 (UTC)
- I completely agree with user:mydogategodshat. Zocky 17:59, 31 Jan 2005 (UTC)
So can you provide examples of articles that, in your opinion, would belong in a category but not in its supercategory? -- Jmabel | Talk 18:07, Jan 31, 2005 (UTC)
- There are many examples of articles that are specific in nature and therefore only belong in a subcategory. There are also many examples of articles that are general in nature and therefore only belong in a supercategory. But this is irrelevant. The question is whether there are articles that are both general and specific, and therefore belong in both a category and a subcategory. The answer is yes. Income statement mentioned above is an example. There are many more. Take the focus group article. In a business context, a focus group is primarily a marketing term and belongs in the marketing category. However, the focus group technique is used in virtually all of business. It belongs in the business category as well. The real world tells us to put the article in both the marketing and business category. Wikirules tell us to decide which one to put it in because we are not allowed to put it in both. mydogategodshat 18:52, 31 Jan 2005 (UTC)
I too would love to have a properly normalized database structure for categorizing wikipedia, but the current software simply isn't meant for that. For that we would need something like [[category:People|born=6 June 1666|died=1 July 1717]]. But that just raises additional questions (what happens when categories overlap? how to produce a good UI for that? would we need to invent or implement a query language?). But categorization also has another important function, that of providing a table of contents, and that is completely achievable with the current system. The user should find articles easily through categories, without having to go deeper than necessary. Ideally, a category should include all articles that fall into it. If there are too many theoretical members of a category (as is usually true for supercategories), it should list the most notable and representative members. Imagine a two-frame UI where you can browse categories on the left and view articles on the right and you may see what I mean.
Here are several examples (some with non-existent categories):
- Kingston upon Hull should be in category:English cities and category:British cities, but not in category:European cities or category:World cities. London should be in all of them.
- Falklands war should probably be in category:History of the 20th century but not in category:History. World War II should be in both.
- OpenBSD should be in category:Unix, but not in category:Operating systems. Linux should be in both.
Zocky 19:15, 31 Jan 2005 (UTC)
This looks to me like a recipe for diverting massive effort from writing, or even usefully categorizing, articles into endless disputes about how broad is the importance of a particular article (or its subject). I can imagine bunches of people trying to promote (or demote) the importance of particular colleges and universities, moderately sized towns, people from particular countries. -- Jmabel | Talk 19:54, Jan 31, 2005 (UTC)
- I definitely agree. I can see someone going through and putting every minor war into Category:History, then someone having to go back and revert every article and explain things. With a community project like this, it's much easier to have black-and-white rules rather than gray rules. Importance is so subjective, it'd be very difficult for something like this to converge, much more so than a regular article. – flamuraiTM 20:00, Jan 31, 2005 (UTC)
- If you look over this page you can't help but notice that the CURRENT situation is "diverting massive effort from writing, or even usefully categorizing, articles into endless disputes about how broad is the importance of a particular article (or its subject)." With the current rules most of the categories make sense, but some of them don't. I think the important rule should be "Categories should be organized to make it easy to browse through related articles". This rule can be applied to every category. In some categories the "no sub/super category duplication" rule will make the categories more useful. But sometimes it will make sense to break the rule. Take the example of bridges. There is a hierarchy under Category:Toll bridges in the United States. Down that hierarchy it makes sense to keep the entries for Category:Toll bridges in New York City from also being in the supercategories. But there is also a hierarchy under Category:Bridges in the United States. Putting the toll bridge categories as subcategories is really just putting a related category as a subcategory because that subcategory is already part of a different hierarchy. Because of this, it would make sense for ALL the bridges to be listed in Category:Bridges in New York City, and have some entries duplicated in Category:Toll bridges in New York City. These decisions could happen in the talk pages of the categories whenever consistency gets in the way of usability. Samuel Wantman 04:22, 5 Feb 2005 (UTC)
Lots of good reasons to change the rules. I was skeptical about the value of categories when they first appeared. But I have found them to be a great way to BROWSE through wikipedia. If I find an interesting page, I check the categories to find other interesting related pages. This is one reason why I am frustrated with the current system. When I want to browse through the articles on film directors, I don't want to have to look at 28 different categories! Likewise, if I'm looking to see what articles might be missing about film directors, how will I know if something is missing if I have to browse through numerous categories? One comment was that the directors category would be much too long if all the film directors were in it. To this I say that it is easier to browse through 2 or 3 lists broken alphabetically than to combine 28 lists in my head. With the current system, if I want to browse through just French film directors, I can. But, if I want to browse through ALL the directors, I can't. Why not both?
There is also a bigger philosophical issue here. By having the current rules, it forces people into making categories based on a single "world view" of what is important, and how things should be organized. That world view comes from the first people who set up the categories, and then everyone else is forced to accept that world view or have their work reverted. This is what happened to me when I found Category:Bridges in New York City arbitrarily divided between "Toll bridges" and "Toll free bridges". But who is to say that that is important? Perhaps I think that it is NOT important. I can't change it because the world view is set. What we should encourage is MULTIPLE world views. This would make Wikipedia MORE interesting, MORE usable, and (perhaps best of all), get rid of countless discussions about what the PROPER category distinctions are. Samuel Wantman 20:14, 31 Jan 2005 (UTC)
- The current system does allow multiple world views. If instead of dividing New York's bridges based on toll/non-toll you want to divide them based on whether they've got rail roads on them or not, create two new subcategories for Bridges of New York City called "Rail bridges of New York City" and "Rail-free bridges of New York City" and go nuts categorizing the various bridges into them. The toll categories will still exist, and now the rail categories will too. Articles can belong to both of them. Bryan 08:47, 1 Feb 2005 (UTC)
- But look at my concrete example about Category:Bridges in New York City. I think that all the bridges in New York City should be in this category, but they are not. Some of the bridges are in a subcategory, Category:Toll bridges in New York City. I put the toll bridges in BOTH categories, and my changes were reverted. I am forced to the world view of toll bridges and non-toll bridges. What is wrong with having the toll bridges in both places? I don't think we should get rid of category:Toll bridges in New York City, and there is no way under the current rules for me to add those bridges to category:Bridges in New York City. So I'm stuck with a world view I don't like. Samuel Wantman 03:44, 5 Feb 2005 (UTC)
"But this is nothing more than fear mongering based on faulty reasoning and a psychological predisposition for structure." This was said by User:mydogategodshat. It's untrue, and also very offensive, intentionally or otherwise. It is perfectly fair and valid to say that enormous categories are not at all useful to browsers. Take a look at Category:Stub. Just to figure out what subcategories there are you have to search through every page of the category. (Actually, there aren't any subcategories in that particular case.) This is not helpful to the casual browser. When I look for articles about film directors, I don't want to have to flip through 20 different pages of unorganized articles any more than someone else wants to look at 20 subcategories.
There are categories that contain thousands of articles. These categories are not particularly useful to anyone at the moment because they are so enormous that you can't reasonably page through them.
The current software does not permit us to view the intersection of two categories. This has been proposed numerous times and would be a great idea, but we can't always function under the assumption that a feature might exist in the future. We can't pull up "all members of the film directors category who are also in the American people category", so "American film directors" is a useful distinction to make. Get some developers to look at the feasibility of this (and to agree to work on this) before trying to use the current system to do something it can't do. -Aranel ("Sarah") 23:36, 31 Jan 2005 (UTC)
- The fear mongering continues with talk about "enormous categories". If categories get too large subcategories should be created and some of the articles placed in the new subcategory. Please don't try to use the enormous category argument to try to convince us that those articles that belong in both a category and a subcategory must be forced into one or the other. It doesn't wash. mydogategodshat 00:45, 1 Feb 2005 (UTC)
- Probably you didn't intend to do so, but I would appreciate it if you didn't question my motives. (But seriously, why would anyone be afraid of "enormous categories"? I don't imagine anyone has ever had a nightmare about being stuck in a huge category, or being chased by a monster category...I'll let you know if I have one tonight.) I'm trying to think practically here.
- I don't see the difference between what you propose and the current system. Category:Film directors gets too big, so we divide it into subcategories by nationality (or whatever). The musical theater example above is a good example of a case when the sub- and super-categories convey different shades of meaning and therefore both should be used. However, something like Category:French film directors is not as ambiguous. A person who is a French film director is obviously both a French person and a film director. I don't see that such an article would need to be in the parent category as well as the specific category. -Aranel ("Sarah") 04:53, 5 Feb 2005 (UTC)
- The difference is that in the current system, if you put an article in both a category and a subcategory someone will delete it from one of them because we are not allowed to have an article on two adjacent levels. I have had scores of edits reversed for this reason. As for your question "why would anybody be afraid of enormous categories", the answer according to contributors earlier on this page, is that they make the category system difficult to use. The people that use this argument want us to believe that allowing an article in two adjacent levels will somehow make the system unworkable. mydogategodshat 03:27, 7 Feb 2005 (UTC)
As people describe their vision of category 'intersections', I think they really mean they would like a keyword type of system rather than the bucket category system like we have now. I did a lot of categorizing after I started here, but eventually gave it up because it felt like I was in quicksand. The more I worked with the current category system, the more I came to dislike it.
In a keyword system, a list of keys could be assigned to an article, and each category would also be assigned a set of keys. For example, back to the film directors, an article could be assigned the keys 'film director' and 'from Germany' (as well as any number of other keys). When the category German film directors is created, it may be assigned only the keys 'film director' and 'from Germany', and only articles containing both of those keys would be listed in that category. One might then ask, Why bother creating categories in that type of system? Because we may want the capability to describe what is being listed with that particular key combination, and also to link categories together in a structure that would be conducive to browsing. —Mike 04:25, Feb 1, 2005 (UTC)
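The keyword idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of how the wiki software works: the function name `category_members`, the two dictionaries, and the placeholder article titles are all hypothetical. A category lists an article only when the article carries every key assigned to the category.

```python
# Hypothetical sketch of the keyword proposal: nothing here reflects real wiki software.

def category_members(category_keys, article_keys):
    """Return the sorted titles of articles that carry every key of the category."""
    return sorted(
        title
        for title, keys in article_keys.items()
        if category_keys <= keys  # subset test: all category keys are present
    )

# Placeholder articles and the keys assigned to each (invented for illustration).
ARTICLE_KEYS = {
    "Director A": {"film director", "from Germany"},
    "Director B": {"film director", "from France"},
    "Writer C": {"novelist", "from Germany"},
    "Director D": {"film director", "from Germany", "screenwriter"},
}

# Each category is itself defined by a set of keys.
CATEGORY_KEYS = {
    "German film directors": {"film director", "from Germany"},
    "Film directors": {"film director"},
}

german_directors = category_members(CATEGORY_KEYS["German film directors"], ARTICLE_KEYS)
all_directors = category_members(CATEGORY_KEYS["Film directors"], ARTICLE_KEYS)
```

In this sketch the broad category and the narrow one are both computed from the same keys, so an article never has to be chosen for one level or the other.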
Seems to me that a lot of this is less about categorizing than about viewing the categorized information. People could get what they needed if there were an easier way to show a few layers of the category hierarchy at once. -- 07:02, Feb 1, 2005 (UTC)
- I suspect that a keyword search would be just as problematic as the current search, which is almost always disabled. (Although perhaps not quite as bad. There wouldn't be as much text to search, but there would be just as many articles. I don't know how the search is implemented.) -Aranel ("Sarah") 04:53, 5 Feb 2005 (UTC)
Category pages should sort by namespace first
Category pages should sort by namespace first. It would make things a lot easier. For example, a lot of templates have categories attached to them. Those templates shouldn't be under "T", but there's no way to pipe them without screwing up the articles that use the template. Main namespace should be listed as-is, then after that section there should be other alphabetized sections for other namespaces. – flamuraiTM 19:05, Jan 31, 2005 (UTC)
- Actually I think it would be better if we had a method of making the templates disappear from the categories since the template itself really isn't supposed to be grouped with the articles in the category. —Mike 04:30, Feb 1, 2005 (UTC)
Categories and/or Classification
The controversy and disagreement about categories is due to TWO DIFFERENT understandings about what a category is. I'd like to propose that what we are talking about is really CATEGORIES and CLASSIFICATION. The current system and rules for CATEGORIES are trying to set up a system of CLASSIFICATION; a logic that puts every article in a well organized hierarchy. CATEGORIZATION, on the other hand, tries to associate related articles in many different ways.
So perhaps what is needed is BOTH. So I propose:
- Get rid of the constraints on CATEGORIES discussed in sections above.
- ADD A NEW FEATURE, that of CLASSIFICATION, which puts each article in just one location in a hierarchy.
How it might work:
You would type something like Class:Bridges in New York City (in brackets), and just like categories, it would end up in the appropriate classification page. Either the system would only allow one classification or editors would limit the classifications. Perhaps the classification would appear at the Top of an article (in small print under the article title). This might become an option that could be turned off.
This might make everyone happy. Samuel Wantman 20:35, 31 Jan 2005 (UTC)
- I don't think this helps. I could see plenty of room for argument over which single classification is the best to use and whether the classification is too specific or too broad. —Mike 04:38, Feb 1, 2005 (UTC)
- Yes, there would be arguments about classification, which is what we often have now with categories (constrained as they are). The people who WANT to argue about classifications could. The rest of us would not have to bother and just add categories. Samuel Wantman 07:21, 1 Feb 2005 (UTC)
Arithmetics
Is there any category arithmetic that could answer queries such as these?
- Intersection - what belongs to two categories.
- Subtraction - what belongs to one category, but not the other.
Conan 20:43, 3 Feb 2005 (UTC)
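The two operations asked about above correspond to ordinary set operations. The following is a minimal sketch, assuming category membership is available as plain lists of article titles; the function names and the placeholder members are hypothetical and do not describe an existing software feature.

```python
# Hypothetical sketch: category "arithmetic" expressed with Python sets.

def intersection(category_a, category_b):
    """Articles that belong to both categories."""
    return sorted(set(category_a) & set(category_b))

def subtraction(category_a, category_b):
    """Articles that belong to the first category but not the second."""
    return sorted(set(category_a) - set(category_b))

# Placeholder membership lists (invented for illustration).
film_directors = ["Person A", "Person B", "Person C"]
american_people = ["Person B", "Person C", "Person D"]

american_film_directors = intersection(film_directors, american_people)
non_american_film_directors = subtraction(film_directors, american_people)
```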
Related category proposal
I think I've got it...
The problem is that we are doing multiple categorization and then putting subcategories in whatever category is related. This is a good thing, but it creates problems, like the "no super/sub entry duplication" rule. Take the example of bridges. There is a hierarchy under Category:Toll bridges in the United States. Down that hierarchy there is Category:Toll bridges in New York City. But there is also a hierarchy under Category:Bridges in the United States. Down that hierarchy there is Category:Bridges in New York City. Putting the toll bridge categories as subcategories of bridges is really just putting a related category and not really a subcategory because that subcategory is already part of a different hierarchy. Because of this, it would make sense for ALL the bridges to be listed in Category:Bridges in New York City, and have some entries duplicated in Category:Toll bridges in New York City. So we really have two different things going on: Subcategories and Related subcategories.
My proposal is to make Subcategories and Related Categories different things. Subcategories could look and work the same as they do now. But perhaps we could add Category:Bridges in New York City#Related and instead of being listed in the subcategory section, it would get listed in a new section called (you guessed it), "Related subcategories". It seems that any given category should be limited to being a subcategory of just one category, so perhaps, this could work by just putting the category first and any following category entries would become "Related subcategories". This might be the simplest way to implement this. Unfortunately, it would mean that many articles would have to be edited to get the correct category on top. Samuel Wantman 04:55, 5 Feb 2005 (UTC)
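The "first category is the parent, the rest are related" rule suggested above could be sketched as follows. This is an assumption-laden illustration only: the function `split_parent_and_related`, the regular expression, and the simplified sample text are hypothetical, and the real category chain for these bridges may have more levels.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]], capturing only the name.
CATEGORY_TAG = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")

def split_parent_and_related(wikitext):
    """Treat the first category tag as the parent and any later tags as related."""
    names = [name.strip() for name in CATEGORY_TAG.findall(wikitext)]
    if not names:
        return None, []
    return names[0], names[1:]

# Simplified, hypothetical text of the page Category:Toll bridges in New York City.
page_text = (
    "[[Category:Toll bridges in the United States]]\n"
    "[[Category:Bridges in New York City]]\n"
)

parent, related = split_parent_and_related(page_text)
```

Under this reading, the toll bridge category would appear as a true subcategory only under its parent and as a "Related subcategory" everywhere else, which is why the order of tags would matter.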
- I have no problem with the idea of adding "related categories", possibly as an optional thing (not controlled by software) in the same way as zeal.com does it (eg http://www.zeal.com/category/manage.jhtml?cid=560009). But it's not the same as those bridges. The category Category:Toll bridges in New York City should be a subcategory under Category:Bridges in New York City. New York, NY, toll bridge articles will go in the former (ie the bottom level), not in the latter. I DO NOT AGREE that "any given category should be limited to being a subcategory of just one category". Robin Patterson 05:29, 5 Feb 2005 (UTC)
- Well, it might be possible that a category could reasonably fit in as subcategories in two different hierarchies, but no examples come to mind right away. I'll think about that one. But, what is wrong with saying that articles should not be listed more than once in any hierarchy? My problem is the unreasonable intersection of the Bridges and Toll Bridges hierarchies that removes all toll bridges from the Bridges hierarchy. Why? What is wrong with them being in both places? Please give me some concrete reason. I don't understand why this is bothersome to so many people.
- Let me give you a hypothetical example. Let's say there is a category called "People named Bob" (Perhaps there is one!) I know there are clubs for people named Bob, so it seems legitimate that there be a hierarchy of categories for people named Bob. Perhaps it gets organized geographically. If so, there might be a subcategory called "British writers named Bob". Now, if this becomes a subcategory of Category:British writers does that mean that all the British writers named Bob should be removed from Category:British writers? Samuel Wantman 07:30, 5 Feb 2005 (UTC)
- To answer your last question, under the current system the answer is more often than not yes. —Mike 08:04, Feb 5, 2005 (UTC)
Category and Template specificity?
I didn't notice clear guidelines on the project page about how microscopic Cats and Templates should be--anybody know if such guidelines exist or have been discussed? Or am I the only person that finds Category:Unincorporated community in Seminole County, Florida with a state road passing through it stubs and Template:Unincorporated-community-in-Seminole-County-Florida-with-a-state-road-passing-through-it-stub a bit over-the-top? Niteowlneils 22:47, 6 Feb 2005 (UTC)
- I suspect that the category and template in question were created specifically to be over-the-top. People are not generally creating such categories and seriously using them. (But if you find that such categories exist, please do nominate them for deletion. We discuss too-specific categories all the time at Wikipedia:Categories for deletion.)
- Guidelines regarding specificity of categories depend somewhat on the number of articles involved. (A very general subject that we don't write much about obviously does not need specific subcategories. A specific subject that is written about constantly may require more specific subcategories.) I did find this under "Category membership and creation": "A few categories do only merely subdivide their parent category, but unless the parent category has many potential articles under it, or many potential subdivisions, if you can't think of a second parent category, it might be a better idea to fold your smaller category into the parent" (emphasis added). -Aranel ("Sarah") 23:34, 6 Feb 2005 (UTC)
Help requested with Category:Albums by artist
This category is huge, hundreds and hundreds of subcategories. Because of this, only about 3 letters at a time are shown. It takes quite a bit of time to get to the end of the alphabet. I'm wondering how big categories should be handled. The guidelines say to divide things into new subcategories. Since this is a huge category of subcategories, that means adding another layer of hierarchy.
I gave that a try: I created a new subcategory Category:Albums by artist: A-Z, but it would take forever to move everything into new subcategories. Can this process be automated? If so how? Is there a better solution?
Since the issue of large categories has come up in previous discussions, and seems to be controversial, is there any way to make a table of contents for a category? It seems that this would be really useful, and would ease people's dislike of large categories. I tried making a TOC by typing [[:Category:Albums by artist&from=First "A" entry|A]] which resulted in A, etc... but that didn't work. Is there any way to do this?
It seems that any category that is too big to be shown on one page should AUTOMATICALLY have a table of contents. This should be in the software. Has this been discussed?
Thanks, Samuel Wantman 07:36, 7 Feb 2005 (UTC)
- I just did a quick (manual) A,B,C,... index using URLs the software seems to interpret correctly. I agree it would be nicer if there was a wiki reference (perhaps something similar to [:Category:Albums by artist#A]) or if the software did this automatically, but for what you seem to want I think this is an adequate workaround. -- Rick Block 14:55, 7 Feb 2005 (UTC)
- I've played with this a bit to try to get rid of the external link indicator (present with the default skin, but not the "classic" skin which I usually use), and as far as I can tell there is simply no way. Seems like a link of the form [http:/w/index.php?anything] should NOT be presented as an external link (since it is by definition a link served by the same web server). I suspect a more elegant looking solution for the standard skin will require some sort of software change (sigh). -- Rick Block 21:01, 7 Feb 2005 (UTC)
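The manual A, B, C index described in this thread could be generated rather than typed by hand. The sketch below assumes, based only on the links quoted above, that a category listing accepts a `from` parameter on an `index.php` URL; the function name `category_index_urls` and the default `base` path are hypothetical and would need to be checked against the actual site.

```python
import string
from urllib.parse import urlencode

def category_index_urls(category, base="/w/index.php"):
    """Build one (letter, URL) pair per letter, each starting the listing at that letter."""
    title = "Category:" + category.replace(" ", "_")
    urls = []
    for letter in string.ascii_uppercase:
        # safe=":" keeps the namespace colon readable instead of encoding it as %3A.
        query = urlencode({"title": title, "from": letter}, safe=":")
        urls.append((letter, base + "?" + query))
    return urls

index = category_index_urls("Albums by artist")
```

Each pair can then be pasted into the category page as a link, which is the same workaround as the manual index, only automated.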