Wikipedia talk:Authority control integration proposal: Difference between revisions
Carcharoth (talk | contribs) →Generating biographical databases: new section
:Generating such lists by bot will be no problem at all, I think. A database of all "persondata" and "authority control" data can be maintained on the toolserver. We have been doing that for de.wikipedia data for years now (it is the database behind http://toolserver.org/~apper/pd/ ), and the whole thing is being re-programmed to serve even more purposes right now, so I think there is a chance that the same thing can easily be adapted to en.wikipedia needs. Well, it is not actually "we" who do it: [[User:APPER]] did all the work. But I think it shouldn't be any problem to get lists of duplicate entries of authority data from such a database on a regular basis. --[[User:AndreasPraefcke|AndreasPraefcke]] ([[User talk:AndreasPraefcke|talk]]) 21:18, 4 July 2012 (UTC)
== Generating biographical databases ==
Not strictly related to this proposal, but I was wondering if this will help make it possible to generate a database of all biographical articles on Wikipedia? What I have in mind is the fact that there will be some (many) biographical articles on Wikipedia for which there are no VIAF identifiers. These may be obscure living or recent people who are borderline notable, or very obscure historical figures. Is it possible to estimate how many of the nearly 1,000,000 biographical articles on Wikipedia are likely to be matched up with a VIAF identifier and how many may not be, or is that something that we will only know once the data-crunching begins? I presume that once much of the matching up has been done, it will be possible to generate an alphabetical listing of all biography articles with VIAF identifiers? If that is so, can I ask if the following will be possible:
*(1) If VIAF identifiers for other objects get added, will it still be possible to filter out the biography ones and just generate that database (i.e. exclude the non-people objects and retain that level of filtering)?
*(2) The issue of biographical articles on two (or more) people is one that Wikipedia failed (at the start) to get a proper handle on. It would have been best to include something identifying such articles (something in metadata, other than the categories that may apply). Is it too late to include something at this stage identifying such articles as they are found during the roll-out of VIAF (if the proposal goes ahead, as it seems it will)?
*(3) Does VIAF use any gender identifier for people? I've not yet found a system (including Wikipedia) that identifies gender (even a simple male, female, unknown, other choice). So I've never yet been able to answer the question of how many articles on women Wikipedia has, compared to the number of articles on men.
May have some other questions later, but those are probably enough for now. [[User:Carcharoth|Carcharoth]] ([[User talk:Carcharoth|talk]]) 20:49, 15 July 2012 (UTC)