Wikipedia:Authority control integration proposal: Difference between revisions

#Interwikied articles with identifiers - around 220,000 articles in the German Wikipedia have identifiers. Where an interwiki to the German Wikipedia exists, we can pull the identifier from the linked page, doing some basic metadata checks to ensure the interwiki linkage is accurate.
#:''Around [[:de:Vorlage:NORMDATENCOUNT|145,000 articles]] on the German Wikipedia currently have VIAF identifiers; the rest use other identifiers, but it may be practical to match them to VIAF.''
#VIAF authority file links - as part of the VIAF matching process, Wikipedia is used as a source of information to help bring VIAF "clusters" together. OCLC have provided an extracted list of over 250,000 English Wikipedia articles with corresponding VIAF numbers, though these may need to be checked to ensure that pages have not been moved since the matching was carried out.
#:''(The matching is done with this [http://dl.dropbox.com/u/10997393/wikipedia2auth3.py Python code] written by OCLC Research Scientists Thom Hickey and Jenny Toves. During the algorithmic creation of the VIAF file, a Wikipedia link is included in an entry if it is matched with ~98% accuracy. There are currently 266,202 links from VIAF to Wikipedia, available [http://dl.dropbox.com/u/10997393/wikilinks.out as a tab-delimited text file].)''
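
As a rough illustration of the "check that pages have not been moved" step, the following is a minimal sketch only, not the OCLC code. It assumes each row of the tab-delimited file holds an article title followed by a VIAF number; the actual column layout of the file may differ. The function names <code>parse_links</code> and <code>split_by_current_titles</code> are hypothetical, and the set of current article titles is assumed to come from elsewhere (for example a database dump).

```python
import csv
import io


def parse_links(text):
    """Parse tab-delimited rows into a {article_title: viaf_id} dict.

    Assumption: two columns per row (title, VIAF number). The real
    file may use a different layout.
    """
    links = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        # Treat underscores as spaces so titles compare consistently.
        title = row[0].strip().replace("_", " ")
        viaf_id = row[1].strip()
        if title and viaf_id.isdigit():
            links[title] = viaf_id
    return links


def split_by_current_titles(links, current_titles):
    """Separate links whose title still exists from those needing a manual check."""
    confirmed = {t: v for t, v in links.items() if t in current_titles}
    to_check = {t: v for t, v in links.items() if t not in current_titles}
    return confirmed, to_check
```

Titles that land in the second dictionary may have been moved, deleted or redirected since the matching was carried out, so they would need to be reviewed before an identifier is added to an article.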