W3C Library Linked Data Incubator Group

On Libraries & Linked Data
Antoine Isaac
UB Utrecht, April 6, 2011
Who am I?
• Europeana
• Web & Media Lab, Vrije Universiteit Amsterdam
• W3C Library Linked Data group
• (2006-2009) W3C Semantic Web Deployment group
SKOS
Demo
Following one’s nose to subject heading lists as linked data
• American LCSH
http://id.loc.gov/authorities/sh85145447#concept
• French RAMEAU
http://stitch.cs.vu.nl/vocabularies/rameau/ark:/12148/cb11931913j
• German SWD
http://d-nb.info/gnd/4064689-0
• Agrovoc
http://aims.fao.org/aos/agrovoc/c_8309
• STW
http://zbw.eu/stw/descriptor/14188-0
• Further on to DBPedia
http://dbpedia.org/resource/Water
Demo (fallback option)
Subject heading lists as SKOS linked data
• American LCSH
http://id.loc.gov
• French RAMEAU:
http://stitch.cs.vu.nl/rameau
• German SWD:
http://d-nb.info/gnd/
• mapped using manual links from the MACS project
http://macs.cenl.org
Starting from http://id.loc.gov/authorities/sh85014310#concept
Linked Data?
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information
using standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more
things
Tim Berners-Lee, http://linkeddata.org/
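Principles 2 and 3 can be tried out with a few lines of code. The sketch below only prepares an HTTP request that asks for RDF instead of HTML for the LCSH URI used in the demo. The function name and the choice of `application/rdf+xml` as media type are assumptions made for illustration; a given server may offer other formats.

```python
from urllib.request import Request

# URI taken from the demo slide.
LCSH_HEADING = "http://id.loc.gov/authorities/sh85145447#concept"


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Prepare an HTTP GET that asks for an RDF description of `uri`.

    `media_type` is an assumed default, not something the slides prescribe.
    """
    # The fragment (#concept) is never sent to the server, so drop it here.
    document_uri = uri.split("#", 1)[0]
    return Request(document_uri, headers={"Accept": media_type})


request = build_rdf_request(LCSH_HEADING)
print(request.full_url)
print(request.get_header("Accept"))
# To actually look the URI up: urllib.request.urlopen(request).read()
```

Nothing is sent over the network until `urlopen` is called, so the sketch can be run offline.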
(Linked) Data Representation
• That subject heading data follows a link-intensive
data model
Uniform resource identifiers (URI)
Resource Description Framework (RDF)
(Linked) Data Representation
• Use more-or-less the same standard vocabulary
Simple Knowledge Organization System (SKOS)
http://www.w3.org/2004/02/skos/
For representing thesauri, classifications, etc. on the
Semantic Web
A SKOS graph
animals
cats
UF domestic cats
RT wildcats
BT animals
SN used only for domestic cats
domestic cats
USE cats
wildcats
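The thesaurus entry above can be written down as subject-predicate-object statements. The following is a minimal sketch using plain tuples rather than an RDF library; the `http://example.org/kos/` namespace and the function names are hypothetical, and the mapping of UF, SN, BT and RT to SKOS properties follows the usual convention rather than anything stated on the slide.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/kos/"  # hypothetical namespace, not from the slides

# Thesaurus relation codes and the SKOS properties they are commonly mapped to.
LABEL_PROPS = {"UF": "altLabel", "SN": "scopeNote"}   # object is a literal
CONCEPT_PROPS = {"BT": "broader", "RT": "related"}    # object is another concept


def concept_uri(term):
    """Mint an illustrative URI for a term."""
    return EX + term.replace(" ", "_")


def entry_to_triples(term, relations):
    """Turn one thesaurus entry into a list of (subject, predicate, object)."""
    subject = concept_uri(term)
    triples = [(subject, SKOS + "prefLabel", term)]
    for code, value in relations:
        if code in LABEL_PROPS:
            triples.append((subject, SKOS + LABEL_PROPS[code], value))
        elif code in CONCEPT_PROPS:
            triples.append((subject, SKOS + CONCEPT_PROPS[code], concept_uri(value)))
        else:
            raise ValueError("unknown relation code: " + code)
    return triples


# "domestic cats USE cats" is the inverse of "cats UF domestic cats",
# so it adds no separate concept: it is covered by the altLabel below.
cats = entry_to_triples("cats", [
    ("UF", "domestic cats"),
    ("RT", "wildcats"),
    ("BT", "animals"),
    ("SN", "used only for domestic cats"),
])
for triple in cats:
    print(triple)
```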
SKOS mappings
SKOS provides conceptual links to bridge across different
contexts
KOS 1:
animals
cats
wildcats
KOS 2:
animal
human
object
Links in the data
Growing interest in linked data in the
library community
Linked Library Cloud beginning 2008
[Ross Singer, Code4Lib2010]
http://code4lib.org/conference/2010/singer
Linked Library “sector” in 2010
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Libraries and LD, the perfect match?
• Libraries have been producing (meta)data for ages
• Libraries (often) produce high-quality metadata
Libraries and LD, the perfect match?
• Library metadata was locked in record silos
• But it maintains links to the outside world
• Bibliographic and web references
• Shared vocabularies
• Same books!
Libraries and LD, the perfect match?
LD is about
• Citing objects
• Linking to them
• Re-using data
Think of web-native union catalogues
A vision for the Dutch National Library
Johan Stapel, Koninklijke Bibliotheek (now bibliotheek.nl)
A web of cultural heritage data?
The current portal
Towards semantic search: facets
Building a search engine on top of metadata is difficult
Intrinsic quality problems: correctness, coverage
Especially when data is so heterogeneous
100s of formats
From flat 5-field records to 100-node XML trees
Language issue!
We currently use a simple, flat interoperability format
A quick win that is quickly showing its limits
Semantic ThoughtLab:
experimenting with solutions
We can better use institutions’ original metadata
Accommodate their different practices
Data structures and semantics
Access objects via a semantic layer of vocabularies for
subjects, persons, places…
Towards semantics-enabled search
Building a "semantic layer" to help access content
Towards semantics-enabled search
• Enhance access to Europeana content by semantics
– Query expansion, clustering of results
• Exploiting various types of relations
– "located in", "lived in", "is more specific concept"…
• Semantics are already there, in metadata and
"controlled vocabularies" used in metadata
– Thesauri, classifications…
• Requires making them properly machine-accessible
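Query expansion over an "is more specific concept" relation can be sketched as follows. The only relation taken from the slides is that "Giza" is narrower than "Egypte"; the record identifiers, their subjects, and the function names are invented for illustration.

```python
# broader concept -> list of narrower concepts.
# "Giza narrower than Egypte" comes from the slides; the rest is made up.
NARROWER = {"Egypte": ["Giza"]}

# Hypothetical records: identifier -> set of place concepts in the metadata.
RECORDS = {
    "photo-1": {"Giza"},
    "vase-2": {"Egypte"},
    "print-3": {"Amsterdam"},
}


def expand(concept, narrower):
    """Return the concept followed by all its transitive narrower concepts."""
    seen = []
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(narrower.get(current, []))
    return seen


def search(concept, records, narrower):
    """Find records indexed with the concept or any narrower concept."""
    wanted = set(expand(concept, narrower))
    return sorted(rid for rid, subjects in records.items() if subjects & wanted)


print(search("Egypte", RECORDS, NARROWER))  # ['photo-1', 'vase-2']
```

A plain string search for "Egypte" would miss `photo-1`; following the narrower link is what brings it into the result set.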
Europeana Data Model
Trying to evolve towards RDF and Linked Data
• Representing objects, persons, places, etc. as
resources
• Linking and re-using external sources
• (Re-using) richer data modeling features
SKOS, CIDOC-CRM, OAI-ORE
• Enabling domain-specific data profiles
• Separating original data from enrichments
http://version1.europeana.eu/web/europeana-project/technicaldocuments/
Prototype: Europeana Thought Lab
http://europeana.eu/portal/thought-lab.html
Clustering of results
Baseline: matching concepts' labels
Metadata for the object
Controlled place name from a
vocabulary at the Rijksmuseum
A "more specific Egypte"?
Metadata for the object
A place more specific than the Egypt one
Semantic information on the Giza
place in the Rijksmuseum Vocabulary
Following other relations
Following other relations - creator
Metadata for the object
Controlled person name from a
vocabulary at the Rijksmuseum
Following other relations - match
Information on Gustave Le Gray
from the Rijksmuseum Vocabulary
Matched to a "Gustave Le Gray"
from another Vocabulary
Enabling bits & pieces
Exploiting semantic links in CH vocabularies
Concept “Giza” narrower than concept “Egypte”
Mapping/alignment between CH vocabularies
Louvre’s “Égypte” equivalent to Rijksmuseum’s “Egypte”
Enrichment of existing metadata
The string “Egypt” in a metadata record indicates the concept of
Egypt defined in the Rijksmuseum thesaurus
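The enrichment step, turning a string in a record into a link to a vocabulary concept, might look like the sketch below. The concept identifiers (`rijks:egypte`, `rijks:giza`), the assumption that "Egypt" is recorded as a label of the Rijksmuseum concept, and the function names are all hypothetical. The enrichment is stored in a separate field so that the original value stays untouched.

```python
# Hypothetical vocabulary: concept identifier -> known labels.
RIJKS_LABELS = {
    "rijks:egypte": ["Egypte", "Egypt"],
    "rijks:giza": ["Giza"],
}


def build_label_index(vocabulary):
    """Map each case-folded label to the concept that carries it."""
    index = {}
    for concept_id, labels in vocabulary.items():
        for label in labels:
            index.setdefault(label.strip().casefold(), concept_id)
    return index


def enrich(record, field, index):
    """Return a copy of the record with a concept link added next to `field`.

    The original string is kept; the link goes into a separate key.
    """
    enriched = dict(record)
    value = record.get(field, "")
    concept_id = index.get(value.strip().casefold())
    if concept_id is not None:
        enriched[field + "_concept"] = concept_id
    return enriched


index = build_label_index(RIJKS_LABELS)
record = {"title": "Pyramids", "place": "Egypt"}  # made-up record
print(enrich(record, "place", index))
```

Exact label matching is the simplest possible strategy; it will miss spelling variants and cannot tell homonyms apart.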
Challenge #1: Linking
Manual mapping of large vocabularies is labour-intensive
• LCSH, RAMEAU and SWD mapped in the MACS project
http://macs.cenl.org
• SWD and DDC mapped in the CRISS-CROSS project
http://linux2.fbi.fh-koeln.de/crisscross/
Automatic linking is not perfect but can help
• STW, AGROVOC…
• Some studies (and further pointers) for automatic library
thesaurus alignment in the STITCH project
http://stitch.cs.vu.nl
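As an illustration of what automatic linking can mean at its simplest, the sketch below proposes candidate mappings between two vocabularies by comparing normalised labels. It is not the method of any of the projects cited above: the identifiers, the sample labels, the 0.85 threshold, and the use of `difflib` string similarity are assumptions chosen for the example. The output is a list of candidates for a human to review, not a set of confirmed links.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(label):
    """Case-fold a label and strip accents (Égypte -> egypte)."""
    decomposed = unicodedata.normalize("NFKD", label)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold().strip()


def propose_mappings(source, target, threshold=0.85):
    """Return (source_id, target_id, score) candidates, best first."""
    candidates = []
    for s_id, s_label in source.items():
        for t_id, t_label in target.items():
            score = SequenceMatcher(
                None, normalise(s_label), normalise(t_label)
            ).ratio()
            if score >= threshold:
                candidates.append((s_id, t_id, round(score, 2)))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Hypothetical identifiers; the two Egypt labels echo the slides.
LOUVRE = {"louvre:1": "Égypte"}
RIJKS = {"rijks:7": "Egypte", "rijks:9": "Water"}

for candidate in propose_mappings(LOUVRE, RIJKS):
    print(candidate)
```

Comparing every pair of labels is quadratic in vocabulary size, so a real alignment over vocabularies the size of LCSH would need an index or blocking step first.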
Challenge #1: Linking
• (Semi-)automatic techniques are necessary to
– Connect objects to vocabularies (esp. for legacy data)
– Connect objects themselves together
• Crowdsourcing?
• Making the way librarians create metadata evolve?
Linking strategy for libraries?
• Links to library-originated sources
– VIAF, LCSH, DDC, UDC, Worldcat, PND…
• Links to resources from cultural environment
– Museums, archives
– Scientific communities: bibliographic data & research data
– Publishers
– Europeana and other aggregators
Semantic Annotation
Conclusion?
• Linked Data won’t solve everything right now
• Just a set of techniques and a vision for better
sharing, cross-linking and re-use of data, fitting the web
• Which is not bad!
If we stop here, thanks for your attention!
Any (more) questions?
Some references
W3C Library LD Incubator
http://www.w3.org/2005/Incubator/lld
• 1-year group
• OCLC, LC, VU Amsterdam, DNB, etc.
• help increase global interoperability of library data on the
Web
• bringing together people involved in Linked Data—in the
library community and beyond
• building on existing initiatives and collaboration tracks for the
future
Library LD
Use Cases
• LLD use cases and case
studies (work in progress)
http://www.w3.org/2005/Incubator/lld/wiki/UseCases
• JISC cases for open
bibliographic data
http://obd.jisc.ac.uk
Useful vocabularies to express data
• Dublin Core
dublincore.org/
• SKOS
www.w3.org/2004/02/skos/
• BIBO
bibliontology.com/
• OAI-ORE
www.openarchives.org/ore/
• FOAF
www.foaf-project.org/
• MADS
www.loc.gov/standards/mads/rdf/
In progress
• RDA vocabularies
metadataregistry.org/rdabrowse.htm
• FRBR@IFLA
Cf. Linked Open Vocabularies
labs.mondeca.com/dataset/lov/
Note: vocabularies can be combined and articulated together
Datasets
• Controlled vocabularies (thesauri, etc.)
LCSH, DDC, Agrovoc, VIAF, GND
• Bibliographic data
Nat. Libraries of Hungary, Sweden
• Trying to keep track of some on CKAN
http://ckan.net/group/lld
In the Netherlands
• DEN, Bibliotheek.nl, KB, Vrije Universiteit Amsterdam, Beeld
en Geluid, UvA Library
• Amsterdam Museum as Linked Data
http://semanticweb.cs.vu.nl/lod/am/
• Dutch Culture Link
http://sites.google.com/site/dclod11/
• Dublin Core 2011
http://dcevents.dublincore.org/index.php/IntConf/dc-2011
Pictures
• http://www.europeana.eu/portal/record/03903/8C5C6AEFF6B50DCCEDF6A23A99DD3A2D66AEB2CC.html
• http://www.europeana.eu/portal/record/03912/E9666896A50FDDE5F7F15A17C11219A7FBCBBC50.html
(Europeana links give access to resources on original sites)
First Demo pointers
• American LCSH
http://id.loc.gov
• French RAMEAU
http://stitch.cs.vu.nl/rameau
• German SWD
http://d-nb.info/gnd/
• Agrovoc
http://aims.fao.org/
• STW
http://zbw.eu/stw/
• DBPedia
http://dbpedia.org/