SWIB 2012
Linked Open Library Data in Practice:
Lessons Learned and Opportunities for data.bnf.fr
Romain Wenz
Bibliothèque nationale de France
Curator
Département de l’information bibliographique et numérique
What it looks like
Web pages about
Authors,
Works,
Subjects
Gathering information
Library records (12 million at BnF)
Archive materials
Digital objects (2 million at BnF: Gallica)
Part I
The purpose and difficulties
Build Web pages
About writers, books, subjects
Linking to all resources in the library
Completely automatic
Example
Information about Cicero,
http://data.bnf.fr/11885977/ciceron/
Most studied books, editions of these books
Digitized books,
Activities,
such as translations by Cicero
Grouping by
« Works »
http://data.bnf.fr/11952658/dante_alighieri_la_divine_comedie/
Manuscripts
Editions
Digital books
About a
« theme »
Books about diving
http://data.bnf.fr/12647518/natation/
Several formats
Marc catalogues
XML-EAD archives and manuscripts
Dublin Core digital Library
Authorities:
Persons and Organisations
Works (Uniform titles)
Subject Headings
Several structures
Library records : flat structure
Archival fonds with hierarchical
structure and heritage
Digital Content that can be
processed: tables of contents, OCR
Purpose: info
about concepts
Pages for humans
Structure for machines
Links and
authorities
ARK identifiers from authorities
Materials to make the matchings:
Dates
Preferred and alternative labels
Graph of links : relations, roles
Workflow
Digital documents
Web pages for humans
Archives and Manuscripts
Matchings - Alignments
data for computers
Library catalogue records
Data model
Complex ontology
Part II
Feedback on activities
How?
FRBR principles
Things that work
FRBR principles
Functional Requirements for Bibliographic
Records
Uses
Dates
Labels
Related roles
Which roles:
creation of a work
production of a version: language, type,
material production: publication,
life of an item
Why FRBR?
Linking writers and works with a useful type of
links:
- Writer of a work
- Contributor of an edition: translator, preface,
…
- Producer : physical copy with a printer,
distributor
- Associated with a unique item: owner,
annotator
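The four kinds of links above can be sketched as a small lookup from role to FRBR level. This is only an illustration: the role names, the `Level` enum and the `link` helper are assumptions made for this example, not the data.bnf.fr data model, and the slide's "edition" is read here as the FRBR expression level.

```python
from enum import Enum


class Level(Enum):
    """The four FRBR Group 1 entities."""
    WORK = "work"
    EXPRESSION = "expression"
    MANIFESTATION = "manifestation"
    ITEM = "item"


# Hypothetical role vocabulary, following the four bullet points above.
ROLE_LEVEL = {
    "author": Level.WORK,
    "translator": Level.EXPRESSION,
    "preface writer": Level.EXPRESSION,
    "printer": Level.MANIFESTATION,
    "distributor": Level.MANIFESTATION,
    "owner": Level.ITEM,
    "annotator": Level.ITEM,
}


def link(person, role, target):
    """Build a typed link; the FRBR level is derived from the role."""
    level = ROLE_LEVEL[role]
    return (person, role, level.value, target)


print(link("Dante Alighieri", "author", "La Divine Comédie"))
```

Deriving the level from the role keeps the link typed, so a translator is never attached directly to a work and an owner is never attached to an edition.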
From a
bibliographic record
Make the link towards a work
Common properties
Possible « expressions »
Author
Dates
Name
Role
Type of document
Language
Date
Title
Matching
(« Aligning »)
Using a « prediction function » to:
Predict to which Work a bibliographic
resource is associated:
Words of all titles
Groups of words
Give a threshold
Stopwords and improvements
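A minimal sketch of such a prediction function, under stated assumptions: the slides do not give the actual scoring formula, so Jaccard overlap is swapped in here, and the stopword list, the 0.5 threshold and the work identifiers are all hypothetical.

```python
import re

# Hypothetical stopword list; the slides do not say which words were used.
STOPWORDS = {"the", "a", "of", "la", "le", "les", "de", "du", "des"}


def tokens(title):
    """Lowercase word tokens with stopwords removed."""
    words = re.findall(r"\w+", title.lower())
    return [w for w in words if w not in STOPWORDS]


def features(title):
    """Single words plus groups of two adjacent words."""
    words = tokens(title)
    pairs = [" ".join(p) for p in zip(words, words[1:])]
    return set(words) | set(pairs)


def score(record_title, work_titles):
    """Best Jaccard overlap between a record title and any title of a work."""
    rec = features(record_title)
    best = 0.0
    for title in work_titles:
        work = features(title)
        union = rec | work
        if union:
            best = max(best, len(rec & work) / len(union))
    return best


def predict_work(record_title, works, threshold=0.5):
    """Return the id of the best-scoring work, or None below the threshold."""
    best_id, best_score = None, 0.0
    for work_id, titles in works.items():
        s = score(record_title, titles)
        if s > best_score:
            best_id, best_score = work_id, s
    return best_id if best_score >= threshold else None


# Illustrative data only; the keys are made-up identifiers, not ARKs.
works = {
    "divine_comedy": ["La Divine Comédie", "Divina Commedia"],
    "de_oratore": ["De oratore", "De l'orateur"],
}
print(predict_work("La divine comédie", works))
print(predict_work("Traité de natation", works))
```

Records that fall below the threshold stay unmatched, which is what leaves room for a later clustering step.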
Clustering
From the manifestations that are not
matched
If there are enough common points
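The clustering idea can be sketched as a greedy grouping of unmatched records. This is an assumption-laden illustration: the compared fields, the `min_common` value and the greedy strategy are choices made for this example, not the method used at BnF.

```python
# Hypothetical fields on which two unmatched manifestations are compared.
FIELDS = ("author", "title", "language", "date")


def common_points(a, b):
    """Count the fields on which two records agree, ignoring missing values."""
    return sum(
        1 for key in FIELDS
        if a.get(key) is not None and a.get(key) == b.get(key)
    )


def cluster(records, min_common=2):
    """Greedy clustering: join the first cluster whose seed record is close enough."""
    clusters = []
    for rec in records:
        for group in clusters:
            if common_points(rec, group[0]) >= min_common:
                group.append(rec)
                break
        else:
            # No existing cluster shares enough points: start a new one.
            clusters.append([rec])
    return clusters


# Illustrative records only.
unmatched = [
    {"author": "Dante", "title": "inferno", "language": "ita"},
    {"author": "Dante", "title": "inferno", "language": "fre"},
    {"author": "Cicero", "title": "de oratore", "language": "lat"},
]
for group in cluster(unmatched):
    print(len(group), group[0]["title"])
```

Each resulting cluster is a candidate for a new Work that the authority file did not yet describe.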
What it looks like in theory…
and in practice
The purpose
Gather data
Make them useful on the Web
Upgrade the catalogs
Part III
« Linked Open Library »
Open:
Technical
Legal
With the “Open data” initiatives led
by the French government, it is
possible to use an Open Licence.
Currently a strong state incentive
around open data and formats
Once data is linked and open, what
comes next?
First, changes in general use, since
people can now find BnF’s resources
directly on the Web.
Mailing address: lots of mail, « new
audiences »
Usage statistics: over 80% of users arrive
from search engines
R&D: improvements to integrate into the
main catalogues and archives
Secondly, the data is being used by
broader communities.
For small public libraries, new procedures
are being explored for re-use of the
dataset in local catalogues.
Example of « OpenCat » with Fresnes
Use in other contexts: example of IF verso
(translations), Institut français
http://ifverso.com/
Specific catalogues (bindings)
In the long term?
Semantic Web technologies could set
a standard for library data,
if we keep them
linked
and
open.
Library missions
Strengths or weaknesses?
Descriptive information: trust
produced to handle a collection and not for marketing purposes
Describing local « concepts » : local use
For documents, not encyclopaedically
Use of standards: long-time perspective
MARC catalogues, EAD archives, DC digital collection
Already « machine-readable »
But not with Web standards yet
Thanks
[email protected]
Project:
[email protected]