SWIB 2012
Linked Open Library Data in Practice:
Lessons Learned and Opportunities for data.bnf.fr
Romain Wenz
Bibliothèque nationale de France
Curator
Département de l’information bibliographique et numérique (Bibliographic and Digital Information Department)
What it looks like
- Web pages about
  - Authors,
  - Works,
  - Subjects
- Gathering information
  - Library records (12 million at BnF)
  - Archive materials
  - Digital objects (2 million at BnF: Gallica)
Part I
- The purpose and difficulties
- Build Web pages
- About writers, books, subjects
- Linking to all resources in the library
- Completely automatic
Example
- Information about Cicero,
http://data.bnf.fr/11885977/ciceron/
- Most studied books, editions of these books
- Digitized books,
- Activities, such as translations by Cicero
Grouping by « Works »
http://data.bnf.fr/11952658/dante_alighieri_la_divine_comedie/
- Manuscripts
- Editions
- Digital books
About a « theme »
- Books about swimming
http://data.bnf.fr/12647518/natation/
Several formats
- MARC catalogues
- XML-EAD archives and manuscripts
- Dublin Core digital library
- Authorities:
  - Persons and Organisations
  - Works (Uniform titles)
  - Subject Headings
Several structures
- Library records: flat structure
- Archival fonds with hierarchical structure and heritage
- Digital content that can be processed: tables of contents, OCR
Purpose: info about concepts
- Pages for humans
- Structure for machines
Links and authorities
- ARK identifiers from authorities
- Materials used for matching:
  - Dates
  - Preferred and alternative labels
- Graph of links: relations, roles
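The matching materials listed above (labels and dates, resolved to an ARK identifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the function names, the date rule, and the placeholder ARK are hypothetical and are not taken from the data.bnf.fr implementation.

```python
def normalize(label):
    """Lowercase a label and collapse commas and extra whitespace."""
    return " ".join(label.lower().replace(",", " ").split())


def match_authority(name, publication_year, authorities):
    """Return the ARK of the first authority whose labels and dates fit, else None."""
    wanted = normalize(name)
    for auth in authorities:
        # Compare against the preferred label and every alternative label.
        labels = {normalize(auth["pref_label"])}
        labels.update(normalize(alt) for alt in auth["alt_labels"])
        # Assumed date rule: nothing can be published before the author's birth.
        dates_ok = publication_year >= auth["birth_year"]
        if wanted in labels and dates_ok:
            return auth["ark"]
    return None


# Hypothetical authority record; the ARK is a placeholder, not a real BnF identifier.
AUTHORITIES = [
    {
        "ark": "ark:/99999/example-dante",
        "pref_label": "Dante Alighieri",
        "alt_labels": ["Dante", "Alighieri, Dante"],
        "birth_year": 1265,
    }
]

print(match_authority("Alighieri, Dante", 1472, AUTHORITIES))
```

Dates act here only as a sanity filter on top of label equality; a production matcher would likely weigh several clues rather than return the first hit.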
Workflow
[Diagram: digital documents, archives and manuscripts, and library catalogue records go through matchings and alignments to produce Web pages for humans and data for computers]
Data model
Complex ontology
Part II
- Feedback on activities
How?
- FRBR principles
- Things that work
FRBR principles
- Functional Requirements for Bibliographic Records
- Uses
  - Dates
  - Labels
  - Related roles
- Which roles:
  - creation of a work
  - production of a version: language, type,
  - material production: publication,
  - life of an item
Why FRBR?
Linking writers and works with a useful type of links:
- Writer of a work
- Contributor of an edition: translator, preface, …
- Producer: physical copy with a printer, distributor
- Associated with a unique item: owner, annotator
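The role types above map naturally onto the four FRBR levels (work, expression, manifestation, item). The following is a small sketch of that mapping; the role names, the `link` helper, and the tuple layout are illustrative assumptions, not the vocabulary actually used by data.bnf.fr.

```python
from enum import Enum


class Level(Enum):
    WORK = "work"
    EXPRESSION = "expression"
    MANIFESTATION = "manifestation"
    ITEM = "item"


# Hypothetical role vocabulary, grouped as on the slide.
ROLE_LEVEL = {
    "writer": Level.WORK,
    "translator": Level.EXPRESSION,
    "preface writer": Level.EXPRESSION,
    "printer": Level.MANIFESTATION,
    "distributor": Level.MANIFESTATION,
    "owner": Level.ITEM,
    "annotator": Level.ITEM,
}


def link(person, role, target):
    """Build a typed link (person, role, FRBR level, target) for a known role."""
    level = ROLE_LEVEL[role]  # raises KeyError for roles outside the vocabulary
    return (person, role, level.value, target)


print(link("Cicero", "translator", "an edition of a translated text"))
```

Keeping the level alongside the role is what makes the link "useful": a page about a person can then separate works written from editions contributed to.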
From a bibliographic record
- Make the link towards a work
- Common properties
- Possible « expressions »
- Author
- Dates
- Name
- Role
- Type of document
- Language
- Date
- Title
Matching (« Aligning »)
- Using a « prediction function » to:
  - Predict which Work a bibliographic resource is associated with:
    - Words of all titles
    - Groups of words
  - Give a threshold
  - Stopwords and improvements
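A prediction function of the kind outlined above can be sketched as follows. This is only an illustration under stated assumptions: the Jaccard similarity, the tiny stopword list, the default threshold of 0.5, and the work identifiers are all hypothetical choices, since the slides do not specify the actual scoring.

```python
import re

# Hypothetical stopword list; a real one would be larger and language-aware.
STOPWORDS = {"la", "le", "les", "de", "du", "des", "the", "of", "a"}


def features(title):
    """Return the title's words and two-word groups, without stopwords."""
    words = [w for w in re.findall(r"\w+", title.lower()) if w not in STOPWORDS]
    groups = {" ".join(pair) for pair in zip(words, words[1:])}
    return set(words) | groups


def score(record_title, work_titles):
    """Best Jaccard similarity between the record title and any title of the work."""
    rec = features(record_title)
    best = 0.0
    for title in work_titles:
        union = rec | features(title)
        if union:
            best = max(best, len(rec & features(title)) / len(union))
    return best


def predict_work(record_title, works, threshold=0.5):
    """Return the best-scoring work id, or None if no score reaches the threshold."""
    best_id, best_score = None, 0.0
    for work_id, titles in works.items():
        s = score(record_title, titles)
        if s > best_score:
            best_id, best_score = work_id, s
    return best_id if best_score >= threshold else None


# Hypothetical work identifiers and known titles.
WORKS = {
    "work:divine-comedy": ["La Divine Comédie", "Divina Commedia"],
    "work:de-oratore": ["De oratore"],
}

print(predict_work("La divine comédie", WORKS))
```

The threshold trades recall for precision: records that fall below it stay unmatched rather than being attached to the wrong work.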
Clustering
- From the manifestations that are not matched
- If there are enough common points
- What it looks like in theory…
- and in practice
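The clustering step for unmatched manifestations can be sketched as a simple greedy grouping. The notion of « common points » used here (same author plus shared title words), the minimum of 2, and the record fields are assumptions made for the example; the slides do not say how common points are counted.

```python
def common_points(a, b):
    """Count shared clues: one for the same author, one per shared title word."""
    points = 1 if a["author"] == b["author"] else 0
    shared_words = set(a["title"].lower().split()) & set(b["title"].lower().split())
    return points + len(shared_words)


def cluster(records, min_common=2):
    """Greedily group records that share enough common points with a whole group."""
    clusters = []
    for rec in records:
        for group in clusters:
            if all(common_points(rec, other) >= min_common for other in group):
                group.append(rec)
                break
        else:
            # No existing group is close enough: start a new candidate work.
            clusters.append([rec])
    return clusters


# Hypothetical unmatched manifestations.
UNMATCHED = [
    {"author": "A", "title": "Voyage en Orient"},
    {"author": "A", "title": "Voyage en Orient illustré"},
    {"author": "B", "title": "Poésies"},
]

print([len(group) for group in cluster(UNMATCHED)])
```

Each resulting group is a candidate work; singletons are simply left alone until more evidence appears.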
The purpose
- Gather data
- Make them useful on the Web
- Upgrade the catalogs
Part III
- « Linked Open Library »
Open:
Technical
Legal
- With the “Open data” initiatives led by the French government, it is possible to use an Open Licence.
- Currently a strong state incentive around open data and formats
- Once data is linked and open, what comes next?
- First, changes in general use, since people can now find BnF’s resources directly on the Web.
  - Contact mailbox: lots of mail from « new audiences »
  - Usage statistics: over 80% of users come from search engines
  - R&D: improvements to integrate into the main catalogues and archives
- Secondly, the data is being used by broader communities.
  - For small public libraries, new procedures are being explored for re-use of the dataset in local catalogues. Example of « OpenCat » with Fresnes
  - Use in other contexts: example of IF Verso (translations), Institut français
http://ifverso.com/
  - Specific catalogues (bindings)
In the long term?
- Semantic Web technologies could set a standard for library data, if we keep them
  - linked
  and
  - open.
Library missions
- Strengths or weaknesses?
- Descriptive information: trust
  Produced to handle a collection and not for marketing purposes
- Describing local « concepts »: local use
  For documents, not encyclopaedically
- Use of standards: long-term perspective
  MARC catalogues, EAD archives, DC digital collection
- Already « machine-readable »
  But not with Web standards yet
Thanks
[email protected]
Project:
[email protected]