Merrilee Proffitt and Max Klein OCLC Research August 24 2012
Transcript
Page 1: Wikipedia and Libraries: Island Hopping the Data Archipelago

Merrilee Proffitt and Max Klein, OCLC Research, August 24 2012

Page 2: Wikipedia and Libraries: Island Hopping the Data Archipelago

45 years old

Almost 30K libraries contributing from 170 countries

More than 271 M items

1200 employees

21 offices worldwide

Page 3: Wikipedia and Libraries: Island Hopping the Data Archipelago

Since 1978

46 people

3 locations (Dublin, San Mateo, Leiden)

Pure research, not product R&D, not market research

Page 4: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 5: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 6: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 7: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 8: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 9: Wikipedia and Libraries: Island Hopping the Data Archipelago

Wikipedians still complain about the Vector skin

Page 10: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 11: Wikipedia and Libraries: Island Hopping the Data Archipelago

Although content creation is fast, internal policy progress is glacial and conservative

Consensus model over asynchronous and near-anonymous discussion

Page 12: Wikipedia and Libraries: Island Hopping the Data Archipelago

“The free bureaucracy, that anyone can legislate.” ~ San Francisco Wiknic 2012

Page 13: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 14: Wikipedia and Libraries: Island Hopping the Data Archipelago

Community originated. 27,456 instances

2009: “Linkspam” accusations against OCLC. Cause: links to Amazon and B&N on the WorldCat page.

Original accuser was banned for being argumentative.

Page 15: Wikipedia and Libraries: Island Hopping the Data Archipelago

Crux: Should Wikipedia promote any organization? Open question in the community

Page 16: Wikipedia and Libraries: Island Hopping the Data Archipelago

Disambiguation

Collation

Page 17: Wikipedia and Libraries: Island Hopping the Data Archipelago

Authority file matching

Used Wikipedia data during creation

2013: Wikipedia will be promoted to “source” rather than “reference”.

Page 18: Wikipedia and Libraries: Island Hopping the Data Archipelago

English Wikipedia 4,000 instances

German Wikipedia 220,000 instances

Wikimedia Commons 45,000 instances

Added by hand

Rules vary by language

Page 19: Wikipedia and Libraries: Island Hopping the Data Archipelago

Load VIAF data

Check Deutsche Wikipedia

Edit English Wikipedia
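
The three steps above can be sketched in Python without any network access. This is only an illustration under stated assumptions, not VIAFbot's actual code: the sample data, the function names, the rule of skipping a page when German Wikipedia disagrees, and the choice of `{{Authority control}}` and `{{Normdaten}}` as the templates involved are all assumptions made for the example.

```python
import re

# Step 1 (assumed shape): English article title -> VIAF ID, as loaded from VIAF.
# All titles and IDs here are made up.
viaf_links = {"Example Author": "12345", "Other Person": "67890"}

# Made-up German Wikipedia wikitext, keyed by the matching English title.
dewiki_text = {
    "Example Author": "Text ... {{Normdaten|TYP=p|VIAF=12345}}",
    "Other Person": "Text ... {{Normdaten|TYP=p|VIAF=99999}}",
}

# Made-up English Wikipedia wikitext.
enwiki_text = {
    "Example Author": "'''Example Author''' is a placeholder.",
    "Other Person": "'''Other Person''' is a placeholder.",
}

VIAF_PARAM = re.compile(r"\|\s*VIAF\s*=\s*(\d+)")


def dewiki_viaf(title):
    """Step 2: return the VIAF ID recorded on the German page, or None."""
    match = VIAF_PARAM.search(dewiki_text.get(title, ""))
    return match.group(1) if match else None


def add_viaf(title):
    """Step 3: return updated English wikitext, or None to skip the page."""
    viaf_id = viaf_links[title]
    german_id = dewiki_viaf(title)
    if german_id is not None and german_id != viaf_id:
        return None  # sources disagree: leave it for a human (assumed policy)
    text = enwiki_text[title]
    if "{{Authority control" in text:
        return None  # page already carries the template
    return text + "\n{{Authority control|VIAF=" + viaf_id + "}}"


for title in viaf_links:
    outcome = "edited" if add_viaf(title) else "skipped"
    print(title, "->", outcome)
```

A real bot would read and save pages through a framework such as Pywikipediabot rather than in-memory dictionaries; the sketch only shows the order of the checks.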

Page 20: Wikipedia and Libraries: Island Hopping the Data Archipelago

English only, for now

Targets 260,000 pages

1/16th of English Wikipedia

Still won’t be fully synched with Deutsche Wikipedia

Page 21: Wikipedia and Libraries: Island Hopping the Data Archipelago

https://github.com/notconfusing/VIAFbot

Uses Pywikipediabot

In community code review: running within the next month

Page 22: Wikipedia and Libraries: Island Hopping the Data Archipelago

Transclusion & Sugarcoated HTML

Page 23: Wikipedia and Libraries: Island Hopping the Data Archipelago

Transclusion

You can draw in text from other pages (typically templates)

Can send parameters

Templates can perform:

Simple logic operations

Simple text manipulation

Still Wikitext, not fully query-able
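
To make the idea of transclusion with parameters concrete, here is a toy expander in Python. It is a sketch only and not MediaWiki's parser: the `templates` dictionary, the `Greeting` template, and the `expand` function are hypothetical, and it handles just flat `{{Name|key=value}}` calls with `{{{key|default}}}` placeholders.

```python
import re

# Hypothetical template store: template name -> body with placeholders.
templates = {
    "Greeting": "Hello, {{{name|stranger}}}!",
}

# A flat template call such as {{Greeting|name=World}} (no nesting).
CALL = re.compile(r"\{\{([^{}|]+)((?:\|[^{}|]*)*)\}\}")
# A placeholder such as {{{name}}} or {{{name|default}}}.
PARAM = re.compile(r"\{\{\{([^{}|]+)(?:\|([^{}]*))?\}\}\}")


def expand(text):
    """Replace each known template call with its filled-in body."""

    def fill(call):
        body = templates.get(call.group(1).strip())
        if body is None:
            return call.group(0)  # unknown template: leave the call as-is
        args = {}
        for part in call.group(2).split("|")[1:]:
            key, _, value = part.partition("=")
            args[key.strip()] = value.strip()

        def substitute(param):
            name = param.group(1).strip()
            if name in args:
                return args[name]
            if param.group(2) is not None:
                return param.group(2)  # simple logic: fall back to default
            return param.group(0)

        return PARAM.sub(substitute, body)

    return CALL.sub(fill, text)


print(expand("{{Greeting|name=World}}"))  # Hello, World!
print(expand("{{Greeting}}"))             # Hello, stranger!
```

The output is still a string of text, which is the limitation the slide points at: the values passed to a template cannot be queried as data.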

Page 24: Wikipedia and Libraries: Island Hopping the Data Archipelago

“The way you always thought Wikipedia worked.” ~ Merrilee Proffitt

Page 25: Wikipedia and Libraries: Island Hopping the Data Archipelago

Phase 1 Revamping interlanguage links

Phase 2 Data, Templates and Infoboxes

Phase 3 Semantic querying

Page 26: Wikipedia and Libraries: Island Hopping the Data Archipelago

Now: Added by hand or bot

Soon: Wikidata concept page

Page 27: Wikipedia and Libraries: Island Hopping the Data Archipelago

Soon: Properties for a concept

Page 28: Wikipedia and Libraries: Island Hopping the Data Archipelago

Soon: This won’t be a monumental effort.

Page 29: Wikipedia and Libraries: Island Hopping the Data Archipelago

The end of the assumption that wiki pages store Wikitext.

On Wikidata they store JSON.
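
A small Python sketch shows why JSON storage matters: values can be read directly instead of being scraped out of markup. The document below is a hypothetical, simplified shape invented for illustration, not the actual Wikidata schema, and the `label` helper is likewise an assumption.

```python
import json

# Hypothetical, simplified item document (not the real Wikidata format).
raw = """
{
  "id": "Q-example",
  "labels": {"en": "Example Author", "de": "Beispielautor"},
  "sitelinks": {"enwiki": "Example Author", "dewiki": "Beispielautor"},
  "properties": {"VIAF": "12345"}
}
"""

item = json.loads(raw)


def label(item, lang, fallback="en"):
    """Return the label in the requested language, else the fallback one."""
    labels = item["labels"]
    return labels.get(lang, labels[fallback])


print(label(item, "de"))           # Beispielautor
print(label(item, "hu"))           # Example Author (falls back to English)
print(item["properties"]["VIAF"])  # 12345
```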

Page 30: Wikipedia and Libraries: Island Hopping the Data Archipelago

All the work VIAFbot is doing will be accessible across 270 wikis.

Plus language specific lookup…

Page 31: Wikipedia and Libraries: Island Hopping the Data Archipelago

RDF Data

Page 32: Wikipedia and Libraries: Island Hopping the Data Archipelago

Backers: Google, Paul Allen Institute for Artificial Intelligence, Gordon and Betty Moore Foundation.

Release date: January 2013

Caveat: Requires adoption by each individual language wiki, by consensus.

Wikipedias having found consensus so far: …

Page 33: Wikipedia and Libraries: Island Hopping the Data Archipelago

Hungarian Wikipedia

Page 34: Wikipedia and Libraries: Island Hopping the Data Archipelago

Bibliographic data is both:

An element of citation

An article in its own right

Page 35: Wikipedia and Libraries: Island Hopping the Data Archipelago

• 411,274 citations of books

• 244,236 citations of journals

• 57,868 citations of encyclopedias

• 342,470 citations of newspapers

• 1,055,845 total print citations

• 1,169,495 citations of web

http://en.wikipedia.org/wiki/User:Maximilianklein/Citations

Page 37: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 38: Wikipedia and Libraries: Island Hopping the Data Archipelago

Wikipedia features bidirectional linking. We follow links forward all the time; why not backwards?
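
Following links backwards amounts to inverting a table of forward links. The Python sketch below assumes a made-up mapping from articles to the works they cite; the names `cites`, `invert`, and `cited_by` are hypothetical.

```python
from collections import defaultdict

# Made-up forward links: article -> works it cites.
cites = {
    "Article A": ["Book 1", "Book 2"],
    "Article B": ["Book 2"],
}


def invert(forward):
    """Build the backward index: cited work -> articles that cite it."""
    backward = defaultdict(list)
    for source, targets in forward.items():
        for target in targets:
            backward[target].append(source)
    return dict(backward)


cited_by = invert(cites)
print(cited_by["Book 2"])  # ['Article A', 'Article B']
```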

Page 39: Wikipedia and Libraries: Island Hopping the Data Archipelago

Could add “what cites this”

What cites this

Page 40: Wikipedia and Libraries: Island Hopping the Data Archipelago

A Wikipedia article could be a good way of declaring the aboutness of a record.

~Asaf Bartov (User:Ijon)

Page 41: Wikipedia and Libraries: Island Hopping the Data Archipelago

links to

Page 42: Wikipedia and Libraries: Island Hopping the Data Archipelago

Could add “what’s about this”

What’s about this

Page 43: Wikipedia and Libraries: Island Hopping the Data Archipelago

What’s about this

Page 44: Wikipedia and Libraries: Island Hopping the Data Archipelago

Dream: Take your browser history

Page 45: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 46: Wikipedia and Libraries: Island Hopping the Data Archipelago
Page 47: Wikipedia and Libraries: Island Hopping the Data Archipelago

Would still have to create bidirectional links between WorldCat and Wikipedia

Page 48: Wikipedia and Libraries: Island Hopping the Data Archipelago

There is the practical solution.

VIAFbot is the prototype of the link reciprocation solution

Page 49: Wikipedia and Libraries: Island Hopping the Data Archipelago

Have to gain Wikipedia approval to reciprocate links with a bot

Subject to community approval

Requires maintenance

Can become unsynchronized
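
One way to picture the maintenance burden is a periodic check for links that are no longer reciprocated. The sketch below is an assumption about how such a check might look, using made-up link tables and a hypothetical `unreciprocated` helper.

```python
# Made-up link tables for the two sides.
wikipedia_to_viaf = {"Example Author": "12345", "Other Person": "67890"}
viaf_to_wikipedia = {"12345": "Example Author", "67890": "Renamed Person"}


def unreciprocated(forward, backward):
    """Return forward links whose target does not point back at the source."""
    return {
        source: target
        for source, target in forward.items()
        if backward.get(target) != source
    }


print(unreciprocated(wikipedia_to_viaf, viaf_to_wikipedia))
# {'Other Person': '67890'}
```

Here the second link has drifted because the other side now points at a renamed page, which is the kind of drift a bot would need to detect and repair.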

Page 50: Wikipedia and Libraries: Island Hopping the Data Archipelago

Seaplanes: Imitated bidirectional

Islands: Wikipedia, VIAF, WorldCat

Data Archipelago

Page 51: Wikipedia and Libraries: Island Hopping the Data Archipelago

Max Klein and Merrilee Proffitt, @notconfusing and @merrileeiam