Wikipedia:Authority control integration proposal: Difference between revisions

ce & moving
m cleanup (fix OCLC mostly), typo(s) fixed: etc) → etc.), eg, → e.g., using AWB
Line 7:
 
==Introduction==
This proposed project intends to extend and systematise the use of [[authority control]] identifiers, using the {{tl|Authority control}} template, on English Wikipedia articles. ''Authority control'' is the [[term-of-art]] in librarianship, archival practice and related fields for the use of [[unique identifiers]] to [[Wikipedia:Disambiguation|disambiguate]] objects (people, places, academic subjects, etc.). These fields conceptualise unique identifiers differently from some other fields, because many of the systems in place are backwards-compatible with pre-computerisation systems. This project aims to connect the English Wikipedia to this [[long tail]] of identifiers.
 
The current proposal focuses on biographies, although this may be extended in future to cover other topics. It is built around the use of data from [[VIAF]], a composite system bringing together several major authority files. VIAF algorithmically matches and clusters entries from the individual authority files, and uses data scraped from Wikipedia to aid the process. As a result, a large number of Wikipedia-VIAF matched pairs have already been identified, which provides a very effective springboard to work from.
Line 27:
*'''Returning metadata to the outside world''' - working backwards from this, once we have embedded identifiers, the curators of this metadata will find it a lot easier to incorporate information from Wikipedia, taking advantage of our fairly fast update cycle for things like death dates.
*'''Identifying alternate names''' - particularly for non-standard transliterations, the alternate headings in authority files give us an extensive and curated collection of variants of names. The linkage will help the creation of redirects.
*'''Content creation support''' - the presence of the identifiers allows future work on tools to, e.g., develop scripts to generate authors' bibliographies for articles.
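
As a rough illustration of the kind of tooling the points above have in mind, the following minimal sketch pulls identifiers out of an embedded {{tl|Authority control}} call in an article's wikitext. It assumes the template is written with named parameters such as <code>VIAF=</code> and <code>LCCN=</code> and contains no nested templates; the function name <code>extract_identifiers</code> and the sample identifier values are hypothetical and are used here only for illustration.

```python
import re

# Assumption: the template call has the form {{Authority control|NAME=value|...}}
# and contains no nested templates or braces inside its parameters.
TEMPLATE_RE = re.compile(
    r"\{\{\s*Authority control\s*((?:\|[^{}]*)?)\}\}",
    re.IGNORECASE,
)


def extract_identifiers(wikitext):
    """Return a dict of identifier name -> value from the first template call.

    Returns an empty dict if the template is absent or has no named parameters.
    """
    match = TEMPLATE_RE.search(wikitext)
    if match is None:
        return {}
    identifiers = {}
    # group(1) is everything between the template name and the closing braces,
    # e.g. "|VIAF=12345|LCCN=n/79/1234".
    for part in match.group(1).split("|"):
        if "=" not in part:
            continue  # skip empty or positional parameters
        name, value = part.split("=", 1)
        name, value = name.strip(), value.strip()
        if name and value:
            identifiers[name] = value
    return identifiers


if __name__ == "__main__":
    # Hypothetical sample wikitext; the identifier values are made up.
    sample = "Some biography text.\n{{Authority control|VIAF=12345|LCCN=n/79/1234}}\n"
    print(extract_identifiers(sample))
```

A script along these lines could feed the extracted identifiers into later steps, such as looking up alternate name forms for redirects or assembling a bibliography. A production tool would more likely use a full wikitext parser than a regular expression.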
 
Currently, around 4,000 articles on the English Wikipedia have some form of embedded authority control identifier, and on Commons, around 45,000 articles contain authority control identifiers. On the German Wikipedia, by comparison, [[:de:Wikipedia:Normdaten|around 220,000 articles]] have embedded identifiers.