User:Clements.UWLib/sandbox: Difference between revisions

== Wikibase and the German National Library (Jens Ohlig and Elena Aleynikova, Wikimedia Deutschland) ==
== Overview of Cohort Institution Projects ==
* [https://docs.google.com/presentation/d/1IC7QxCZuMghnKgwu%20OpQOvv2yYZ9lHPCZMiNDLjtybM/edit?usp=sharing Slides]
=== University of Minnesota (Christine DeZelar-Tiedman) ===
* Jens Ohlig from Wikimedia Deutschland, works in software development. Working on Wikidata, and Wikibase (underlying Wikidata software). Will talk about project with GND.
* No formalized plans yet; will use some of the batch loading tools to contribute Wikidata items based on their university researchers. Their platform (Pure) does a batch export, so they’ll find ways to load that into Wikidata.
* But first an introduction to [http://wikiba.se/ Wikibase]!
* Both researchers and archive related info, papers, etc.
** Software that runs Wikidata, developed for Wikidata but available free & open source.
=== Harry Ransom Center, University of Texas (Paloma Graciani Picardo) ===
** Stores structured data.
* No formal plans quite yet
** Features: data model supports multilingual usage (support for over 300 languages); like Wikidata, properties can have multiple contradictory values (cited to different sources); exports in a number of formats, SPARQL query functionality built in.
* Local authorities. Correspondence in their archives, they keep indexes. Some might comply with authority schema on Wikidata.
** Machine friendly, building blocks for the semantic web
* Want to learn about Wikibase as possible platform for authority data
* Would like more Wikibase installations outside the Wikidata project, because not everything is in scope for Wikidata. Would like to see domain-specific repositories.
=== Northwestern University (Paul Burley) ===
** Examples
* Going to work on metadata for posters from their collection of African Studies. In African languages--Hausa. LCNAF won’t work. They’ll need to create URIs in Wikidata.
*** A library of the world’s greatest jams and marmalades; supports their specific previously developed data model and ontologies.
* YES, would be interested in using QA to look up entities they’ve created (via Sinopia)
*** [https://blog.factgrid.de/ FactGrid]: another example, a database for historians, documents around the Order of the Illuminati.
=== University of Pennsylvania (John Mark Ockerbloom) ===
**** Queries can reveal more about the data than historians already know
* John not on call, maybe next call
**** Can see who wrote to whom
== Harvard Music Projects (Christina Linklater & Christine Fernsebner Eslao) ==
*** [https://rhizome.org/ Rhizome], born digital art. Have been using Wikibase since 2015 to support digital preservation. Better suited than software that was not as fit to their particular purpose. Flexibility of the system was an advantage as they developed data models.
* [https://docs.google.com/presentation/d/1q4SAd8BItLUofuE7SGmk1SiNVIlzbFm9bgLrkkx0ryA/edit?usp=sharing Slides]
*** [https://lingualibre.fr/wiki/LinguaLibre:Main_Page LinguaLibre]: collection of audio snippets of spoken language.
* Guido Adler Collection
* GND project: integrated authority file for the German-speaking world, maintained by the German National Library. Focus on persons, places and events. Long history of collaboration with Wikimedians. Interest in opening the GND. GND4C is a project to open to cultural institutions (may or may not involve Wikibase). Putting all their data into Wikibase, or Wikibases. Seeing Wikibase as something that can support authorities, a natural transition for them. German National Library offers courses to Wikimedians and then they can contribute to parts of the GND
** Materials on musicology, initiated musicology study at University of Vienna (advised first Jewish PhD in musicology, first PhD in musicology obtained by a woman)
* Current project: migrate the GND to Wikibase. Several workshops with engineers.
** Harvard purchased because annotated--interest and engagement with musicologists of his day
* Current state: three Wikibase installations; more open than previously but invite only. Characterized as “semi-open”
** First critical biography of Beethoven--annotated heavily by Adler--foundational text in the field
* [https://wiki.dnb.de/pages/viewpage.action?pageId=147754828 Blog post] on the project, explains starting point and current state.
** Show connection between musicologists driven into exile
* Also a [https://wiki.dnb.de/display/GND/Authority+Control+meets+Wikibase page on the project].
** Digitizing pamphlets, hope to make a product for musicologists, whom did Adler know, where did his contemporaries go after the war
* When can I see something? Evaluation will be done by the end of the summer. Wikibase is suitable, which is good news. Will need to do homework with user rights and roles (not a concept we have in the Wikidata project). Will be presented at [[Wikidata:WikidataCon_2019|WikidataCon]] in the fall. So stay tuned!
** Pretty fully described in MARC records, but could better see the connections by creating Wikidata. Enhance descriptions
=== Questions ===
** Names do have LC NAF
* Diego from METRO in NYC. Have been testing Wikibase. Where can I find a public road map for Wikibase development? Amazon Neptune acquired parts of Wikibase?
* Arthur Freedman Collection
** Jens: Using Blazegraph, some complications with that but it may not be developed in the future. Product Manager -- this is something she worries about. The road map is in the open, he can share the information offline. GraphQL [can someone help fill in?]
** A more Wikidata-focused project. These are well described in finding aids.
** Link: [[Wikidata:Development_plan|Wikidata development plan]]
** 1,000 recordings of punk shows: audio and video. Digitizing.
* Scott MacL: Can you describe the parameters of Wikibase projects (Olaf Simons, Rhizome, etc.) that might then be codified, please? And are all people in Germany in the Wikibase database for persons, events, etc.? And could this potentially be extended to people in each of the ~200 countries?
** Driven to document shows he really loved
** Jens: Wikidata is the largest installation of Wikibase at the moment, nothing else comes close. Would be surprised if anyone runs into limits of Wikibase. If there are limits, please get in touch with Jens. No, the GND doesn’t cover all persons in Germany -- it is for the use of libraries: people who have published, people who have been published about.
** Metadata--cassette liners, but sometimes not information about the venue
* Karen Smith-Yoshimura: why three installations?
** Interest in recordings that have been digitized
** Jens: Currently evaluating different strategies. Will only have one in the end.
** Number of bands are well represented in Wikidata
* Chris: ARDC in Australia runs a research vocabularies service which uses PoolParty underneath it all. https://vocabs.ands.org.au/ Is there an institution that offers something similar for Wikibase?
** In process of reconciling
** Jens: You can use Wikibase for controlled vocabularies: [https://www.conftool.net/or2019/index.php?page=browseSessions&form%20session=361 examples from Japan], material research vocabularies. Seems like Wikibase can support this usage.
** Most venues aren’t described, but a few are
* Steve: Are there directions somewhere for getting QuickStatements to work with the Docker image?
** Found WikiProject Music--infoboxes for Wikipedia
** Jens: Engineers have told me these problems are now resolved so try updating. QuickStatements = “quite a beast!”
** For venues and musical performances--cast a wider net for places and events
* Harvard: Who's the right person or persons to ask if you've installed the Docker image, and everything works nicely, but then your queries stop returning results?
** A number of places don’t exist anymore--reflect different eras of Boston cultural life
** Jens: elaborate in an email please and he will connect you. (Will do!)
** Musical performances
* UMn: Does each installation of Wikibase offer a common set of data relationships to which the installation can make additions? What enables federation across Wikibase installations?
*** Performances of ballets, operas, hip hop concerts, etc
** Jens: loves this question! A new Wikibase is naked, you need to define everything from scratch. What we are looking into right now is federation (which means different things to different people). Looking at the idea of reusing Wikidata items in a local Wikibase installation (so the concept of “human” for example) but then model other parts yourself. Currently in a research phase -- no code has been produced.
*** Only 1 similar event to one recorded in concert
* Dan Michael O. Heggø: Did you yet look into integrating with library systems? Specifically updating linked bibliographic records when concepts are modified or merged in Wikibase?
** One question: what data is appropriate for Wikidata, what would be better elsewhere
** Jens: no, we have not. We probably lack knowledge. Libraries are an interesting field, but we want to look at other fields as well, civic data for example. Libraries close to our hearts, so collaborating and getting insights from the library community is important. No current plans for integrating but maybe we can work together to enable you to do your own integrations.
** Trying to find matches in Wikidata for bands using OpenRefine
* UMn: Could you say more about scope limitations in Wikidata and how Wikibase supports a defined scope?
** Q: Have you found any names you need in MusicBrainz
** Jens: all comes down to [[Wikidata:Notability|notability]], but pretty open. Something like “Arrowhead number three” in your local museum may not be notable enough, for example, and you may need to have your own Wikibase instance.
*** A: Tends to be quite exhaustive--used by fans and record dealers. Many items in Wikidata are barebones, so matching against MusicBrainz is helpful. Hoping to use as outreach to music community--MusicBrainz tends to note things like location and dates active. Barebones local authorities--would like to avoid in Wikidata--don’t want to create an item that no one can disambiguate
* Hilary: what if you have questions about notability, where should those be directed?
** Q: am curious about how to add all these resources into a realistic virtual earth for libraries. Am thinking Google Street View with TIME SLIDER / Maps / Earth / TensorFlow with languages +. For ex., add video of the Rathskeller in the 1970s or 1980s and now, or add punk rock video from the 1970s differentially from the 1980s … for archival research … and then patch them together bibliographically and archivally … in a new conceiving of libraries and Wikidata. How best to do this?
** Jens: it depends. Project chat is a good place to start. Community is still young and can be formed. I can’t give an answer for what the community wants, but they are quite open.
*** A: Would like to be in touch--browsing data in Wikidata--timelines and maps, time slider, visual cues about chronology--potential for mapping time periods and neighborhoods cool
*** Hilary: will be focussing on different communication channels in Wikidata / Wikibase.
** Q: Is anyone incorporating Discogs identifiers as they’re creating Wikidata entries?
* John Riemer: The Program for Cooperative Cataloging is weighing how best to conduct a Wikidata pilot project. One key question revolves around the advantages/disadvantages of working in a separate instance of Wikibase versus the “production” version of Wikidata.
*** A: it’s not linked data, but there is an API. We’re finding it useful
From listening to the several examples provided in your presentation, it seems like specialized controlled vocabularies could be used within the production instance. It also seems like a particular project like Rhizome could be signaled by a data element in the production instance. Do you have comments on the disadvantages of trying to run those projects in the main instance of Wikidata?
** Q: Work on venues is open ended?
** Martin P: One answer is that Wikidata is not for original research: everything in Wikidata should already have been published somewhere. If you want to surface data from an original research process/ closed community, set up your own Wikibase.
*** A: We have venues where recordings were made. How far down rabbit hole to go?
** Jens: data model needs to go through community discussion, and you may not want that for all the properties you need. Depends on the needs of your project.
*** Steven: Cornell had hip hop flyers that described venues and went with schema.org a few years back--could look at how schema.org does it
* Hilary: Will you be making changes to the user interface for GND?
*** Christine: Other digitized collections--complementary collections that we could link to through Wikidata
** Jens: no, they are happy with it as is. Changes to the interface with new features -- we have a UX team that thinks about this on an ongoing basis.
** Q: How best to connect them library-wise and newly for interactivity … so that people in video could become avatar bots … and then we could eventually converse with these new syntheses, libraries-wise.
Thanks.
* Hilary: will you be creating customized template for validation?
*** A: Coolness….
** Jens: we will have to see. Not on the roadmap but it is a FAQ!
** Naun: It might be interesting to see how this might work for other kinds of performances, not necessarily music performances
 
** OpenRefine--Honor’s talk on Saturday
Thanks to Jens for a great presentation!
*** We will devote one of these calls to OpenRefine, hopefully get developer to join us
* Trying to convert MARC authority files into Wikidata entities
== Is there interest in standing up a Wikibase instance for local authorities? ==
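
One possible route from converted authority records to new items is a batch of QuickStatements V1 commands (tab-separated lines, with CREATE starting each new item). The sketch below is illustrative only: the record layout (label, description, statements) and the function name to_quickstatements are hypothetical, the records are assumed to have been extracted from MARC already, and reconciliation against existing items should happen first so duplicates are not created.

```python
def to_quickstatements(records, lang="en"):
    """Render simple authority-like records as QuickStatements V1 commands.

    Each record is a dict with a "label", an optional "description", and an
    optional list of (property, value) pairs under "statements". This layout
    is an assumption for illustration, not a MARC or Wikibase format.
    """
    lines = []
    for rec in records:
        # CREATE starts a new item; LAST refers to the item just created.
        lines.append("CREATE")
        # Labels and descriptions are quoted strings, so replace embedded
        # double quotes (a lossy simplification chosen for this sketch).
        label = rec["label"].replace('"', "'")
        lines.append(f'LAST\tL{lang}\t"{label}"')
        description = rec.get("description")
        if description:
            description = description.replace('"', "'")
            lines.append(f'LAST\tD{lang}\t"{description}"')
        for prop, value in rec.get("statements", []):
            # Item values such as Q5 are written without quotes.
            lines.append(f"LAST\t{prop}\t{value}")
    return "\n".join(lines)


# Hypothetical record: P31 (instance of) and Q5 (human) are Wikidata IDs;
# a local Wikibase would use its own property and item numbers.
sample_records = [
    {
        "label": "Example Person",
        "description": "hypothetical local authority record",
        "statements": [("P31", "Q5")],
    }
]
sample_batch = to_quickstatements(sample_records)
```

The resulting text can be pasted into QuickStatements for review before anything is run.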
* Christine has made instances in AWS, would love for someone with some experience to chime in
* Jens, Wikimedia Deutschland
** Working on integrated authority file
** Can check with contacts there, and present on this in future call
** They are evaluating Wikibase for authority file, making a decision about one/several Wikibase installations
* Interested? Add your name here:
** Paloma, HRC
** Tim Knight: Wikibase is something a few colleagues and I are just starting to explore; nothing groundbreaking to report at this stage
** Rhonda Super: Interested in learning about Wikibase
** We at SI are at the planning stage of creating wiki entities for SI scientists who are not in NACO. We are also exploring whether creating VIAF directly is an option.

** At Vanderbilt we have been playing with Wikibase, but have run into technical difficulties: QuickStatements doesn’t work, and we have encountered bot throttling issues that make bots unusable. So technical advice would be great.
*** Harvard: I believe QuickStatements doesn’t work with the Docker image, for anyone
*** Jens: Yes, it is hard to get QuickStatements to work with the Docker image, but there is a solution.
** Merrilee Profit: I would suggest that Wikibase be part of these discussions, not separate. Wikidata / Wikibase are quite intertwingled....
** Steve Baskauf, Vanderbilt University Libraries. We have experimented using Pywikibot to load data, but the built-in throttling makes it way too slow. Interested in either a way to reduce the throttling or an alternative (preferably Python) to Pywikibot.
** Jackie Shieh (SIL)
** Mairelys Lemus-Rojas (IUPUI)
** Ahava Cohen (National Library of Israel)
** Ryan Mendenhall, Columbia University
** Kristina Spurgin (UNC Chapel Hill) - we are exploring a state-wide name authority project with several other institutions. We are assessing needs and determining next steps for the project, but could be interested in Wikibase as a platform for building a shared North Carolina names authority file.
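
For those asking about alternatives to Pywikibot for loading data, one option that may be worth testing is calling the Wikibase API directly. The sketch below only builds the POST parameters for the wbeditentity action; it sends nothing. The function name build_new_item_params and the token value are hypothetical, and whether this helps with the throttling described above is untested here.

```python
import json


def build_new_item_params(label, description, csrf_token, lang="en", maxlag=5):
    """Build POST parameters for creating one item via action=wbeditentity.

    Nothing is sent: the caller would POST these to the api.php of their own
    Wikibase, using a logged-in session and a CSRF token from that session.
    """
    data = {
        "labels": {lang: {"language": lang, "value": label}},
        "descriptions": {lang: {"language": lang, "value": description}},
    }
    return {
        "action": "wbeditentity",
        "new": "item",
        # The API expects the entity data as a JSON-encoded string.
        "data": json.dumps(data),
        "token": csrf_token,
        # maxlag asks the server to reject the edit when replication lag is
        # high; how aggressively to pace writes remains the operator's choice.
        "maxlag": str(maxlag),
        "format": "json",
    }


# "TOKEN" is a placeholder, not a real token.
example_params = build_new_item_params(
    "Example Person", "hypothetical local authority record", "TOKEN"
)
```

On a self-hosted installation the pacing between requests is a local decision; on Wikidata itself, community bot policies still apply.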
== Introduction to Wikidata WikiProjects (Hilary Thorsen) ==
[[Wikidata:WikiProjects]]
* Helpful entry point to finding out how entries are described in Wikidata
* How other projects have been approached
[[Wikidata:WikiProject_Cultural_venues|Wikidata: WikiProject: Cultural Venues]]
* aims/scope
* history/background
* Ways to contribute
* List of participants
* Can ask questions: how are they modeling data, look at properties they are using
* Model items
* Will talk about this more in next meeting.
** Add yours to list (in agenda above) if want to discuss
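
Beyond reading a WikiProject's model items, one way to look at which properties a group of items actually uses is a SPARQL query against the Wikidata Query Service. The sketch below only builds the query text and request URL, with no network call. The function names are hypothetical, and Q5 (human) is used purely as a placeholder class; a real class this large would likely time out, so a narrower class would be substituted.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def property_usage_query(class_qid, limit=20):
    """Return SPARQL counting the predicates used on instances of a class.

    The count covers every predicate on the items, including labels and both
    the truthy (wdt:) and full-statement (p:) forms of each property.
    """
    return (
        "SELECT ?prop (COUNT(*) AS ?uses) WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} ;\n"
        "        ?prop ?value .\n"
        "}\n"
        "GROUP BY ?prop\n"
        "ORDER BY DESC(?uses)\n"
        f"LIMIT {limit}"
    )


def query_url(sparql):
    """Build a GET URL for the query service that asks for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


# Q5 is only a placeholder class for illustration.
example_query = property_usage_query("Q5", limit=10)
example_url = query_url(example_query)
```

The same query text can be pasted into the query service's web interface instead of being sent as a URL.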
== Suggested future topic ==
* From Scott MacLeod (@WorldUnivAndSch) to Everyone: (09:53 AM)
** Future possible topic? Wikidata for brain science, libraries at the cellular and atomic levels, and with an all languages’ approach … and into brain simulations in a realistic virtual earth for species?