VIAF and WorldCat/WorldCat identities. Improving works and

VIAF Council Meeting, Singapore, 2013-08-16
Improving works and expressions in VIAF
VIAF and WorldCat/WorldCat Identities.
Janifer Gatenby
OCLC
The world’s libraries. Connected.
Please note:
IFLA, Singapore, 2013-08-19
Multilingual WorldCat
presented by Janifer Gatenby
Karen Smith Yoshimura
Eric Childress
Robert Bremer
Janifer Gatenby
JD Shipengrover
Jean Godby
Gail Thornburg
Richard Greene
Jenny Toves
Diane Vizine Goetz
Jay Weitz
WorldCat Today
• Resources in nearly all languages
• Contributed by more than 20,000 libraries worldwide
• More than half the database is for works not in English
WorldCat Today
• Bibliographic Records
• Hybrid records
• Parallel records
• Clustered at Work level (FRBR)
Existing Architecture
[Diagram labels: Subjects, Classifications, Authors, Holdings, Bibliographic record, Work cluster, Content cluster, Manifestation cluster]
Complementary Initiatives
• GLIMIR: Manifestation & Content Clusters
• Work Level Record
• Multi-lingual Bibliographic Structure
Work Level Record
http://www.oclc.org/research/activities/workrecs.html
Work Level Record: Objective
Create a landing page summarizing content for a work
GLIMIR
• The Content Cluster
   • Enables better work record displays by reducing the number of lines that display for large works
   • Enables a choice of format and presents the formats that could be acceptable substitutes
   • Consolidates holdings for identical content
• The Manifestation Cluster is important
   • Consolidates holdings at manifestation level
   • In the short term allows the record catalogued in the language of the interface to be chosen for display
   • Reduces apparent duplication
   • Allows a more accurate count of the number of manifestations in WorldCat (as opposed to the number of records)
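The manifestation-cluster points above can be illustrated with a minimal Python sketch. It is not GLIMIR's implementation: the record tuples, cluster IDs, library symbols and function names are all hypothetical, and the sketch assumes each record already carries the ID of the manifestation cluster it belongs to.

```python
from collections import defaultdict

# Hypothetical sample data: (record id, manifestation cluster id,
# language of cataloguing, symbols of holding libraries).
RECORDS = [
    ("rec1", "m1", "eng", {"LIB_A", "LIB_B"}),
    ("rec2", "m1", "ger", {"LIB_B", "LIB_C"}),
    ("rec3", "m2", "fre", {"LIB_D"}),
]


def consolidate(records):
    """Group records by manifestation cluster and merge their holdings."""
    clusters = defaultdict(lambda: {"records": [], "holdings": set()})
    for rec_id, cluster_id, cat_lang, holdings in records:
        clusters[cluster_id]["records"].append((rec_id, cat_lang))
        clusters[cluster_id]["holdings"] |= holdings
    return dict(clusters)


def record_for_display(cluster, interface_lang):
    """Prefer the record catalogued in the interface language,
    otherwise fall back to the first record in the cluster."""
    for rec_id, cat_lang in cluster["records"]:
        if cat_lang == interface_lang:
            return rec_id
    return cluster["records"][0][0]


clusters = consolidate(RECORDS)
manifestation_count = len(clusters)  # counts manifestations
record_count = len(RECORDS)          # counts bibliographic records
```

With this sample data, three records collapse into two manifestations, and a German interface would be shown rec2 for cluster m1 while its holdings reflect all three libraries.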
Multilingual Bibliographic Structure Project
Creates true multi-lingual displays
• At work and manifestation levels
• Using all available data instead of “most appropriate record”
• Generates data
Corrects many of the 28 million records coded “und”
Better control and linking of translations
Input to refinement of work clusters
Smarter data storage
“Most appropriate” questioned
• WorldCat.org selects the most appropriate record to show to a user as representative of the work in the short result list and beyond
• The end result will not be very satisfactory from a multi-lingual viewpoint… here’s why
Which record is better to present to a German speaker?
Incomplete Swedish Record
Hybrid record
Most appropriate display
Multilingual Bibliographic Structure Project
• Work-level data, mined from all associated bibliographic records, will be displayed, supplemented with expression/manifestation-level data as the user drills through from the short to the fuller versions of the metadata.
The end-user interface will show works and manifestations, not bibliographic records; the cataloguing client will also show bibliographic records.
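As a rough illustration of mining work-level data from every record in a work cluster rather than from one "most appropriate" record, the following Python sketch merges summaries and subject headings by language. The record layout, the sample values and the function name are hypothetical, and the sketch assumes that the language of cataloguing can stand in as the language tag of each element.

```python
from collections import defaultdict

# Hypothetical records belonging to a single work cluster.
WORK_RECORDS = [
    {"cat_lang": "eng", "summary": "A novel about two sisters.",
     "subjects": ["Sisters -- Fiction"]},
    {"cat_lang": "ger", "summary": "Ein Roman über zwei Schwestern.",
     "subjects": ["Schwestern -- Fiktion"]},
    {"cat_lang": "eng", "summary": None,
     "subjects": ["Sisters -- Fiction", "Canada -- Fiction"]},
]


def mine_work_level(records):
    """Collect language-tagged summaries and subjects from all records."""
    summaries = {}
    subjects = defaultdict(list)
    for rec in records:
        lang = rec["cat_lang"]
        # Keep the first summary found for each language.
        if rec["summary"] and lang not in summaries:
            summaries[lang] = rec["summary"]
        # Union of subject headings per language, without duplicates.
        for heading in rec["subjects"]:
            if heading not in subjects[lang]:
                subjects[lang].append(heading)
    return {"summaries": summaries, "subjects": dict(subjects)}


work = mine_work_level(WORK_RECORDS)
```

The resulting work-level view holds English and German summaries plus the combined English subject headings, even though no single record contains all of them.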
Proposed new architecture
[Diagram labels: Work; Authors, Notes, Contents, Subjects, Classification and Holdings, each tagged by language (eng, fre, ger, jpn); Translations (Language of work); language-tagged Manifestations with their Holdings]
Important principles
• Language tagging of elements, particularly
   • Summaries (MARC 21 field 520)
   • Subject headings
• Display in script preferred by the user if data is available
• Improve translated interfaces
• Show consolidated holdings as appropriate
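One way to picture how language tagging supports display in the user's preferred language or script is a simple preference lookup, sketched below in Python. The function name, the sample values and the default fallback to English are assumptions made for illustration, not a description of how WorldCat chooses what to display.

```python
def pick_for_display(tagged_values, preferred, fallbacks=("eng",)):
    """Return (language, value) for the first language available in
    preference order, or None when there is no data at all."""
    for lang in (preferred, *fallbacks):
        if lang in tagged_values:
            return lang, tagged_values[lang]
    if tagged_values:
        # Nothing matched: show whatever exists, in a stable order.
        lang = sorted(tagged_values)[0]
        return lang, tagged_values[lang]
    return None


# Hypothetical language-tagged summaries for one work.
summaries = {"eng": "A summary.", "jpn": "要約。"}

japanese_view = pick_for_display(summaries, "jpn")
german_view = pick_for_display(summaries, "ger")
```

Here a Japanese-speaking user gets the Japanese summary, while a German-speaking user falls back to the English one because no German data is available.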
Translations
Great works are translated
• The cream of the world’s cultural and knowledge heritage is shared by being translated
• WorldCat contains many rich cataloguing records for these translations
GOAL: Data mine the really good records to improve clustering, presentation, authority records and linked data
Translations
• Inconsistencies cause incomplete work clusters and less-than-optimal search results
   • Titles without subtitles
   • Different forms of uniform title or missing uniform title
   • Inverted title
   • Different coding of original and translated information
Generated uniform title authority records will overcome most of these differences without needing to edit individual records
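To make the title inconsistencies concrete, the Python sketch below builds a crude match key that ignores subtitles, a trailing inverted article, case and accents. It is only an illustration of the kind of normalisation that generated uniform titles make unnecessary, not OCLC's clustering method; the article list, the function name and the reading of "inverted title" as "Title, The" are assumptions.

```python
import re
import unicodedata

# Small, illustrative list of articles; far from complete.
ARTICLES = {"the", "a", "an", "le", "la", "les", "der", "die", "das", "el"}


def title_key(title):
    """Build a crude match key for comparing variant forms of a title."""
    # Drop a subtitle or statement of responsibility after ":", ";" or "/".
    main = re.split(r"\s*[:;/]\s+", title, maxsplit=1)[0]
    # Undo an inversion such as "Blind assassin, The".
    head, sep, tail = main.rpartition(",")
    if sep and tail.strip().lower() in ARTICLES:
        main = head
    # Fold accents and case.
    decomposed = unicodedata.normalize("NFKD", main)
    folded = "".join(c for c in decomposed if not unicodedata.combining(c))
    words = re.findall(r"\w+", folded.lower())
    # Drop a leading article.
    if words and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)
```

Under these assumptions, "The blind assassin : a novel" and "Blind assassin, The" produce the same key, as do "Pêcheur d’Islande" and "Pecheur d'Islande: roman".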
Generate uniform title authority records
• Improve FRBR work groups
• Made by data mining
• Contribute to VIAF
• Diffuse via VIAF as linked data
• Possibility to create web page / web service
Translation records in VIAF
• Will enrich VIAF significantly
• New elements - translated title and translator
Author    Title                  Expressions in VIAF   Translation count in WorldCat
Atwood    Blind assassin         8                     31
Guevara   Notas de viaje         0                     11
Hawking   Grand design           0                     18
Lenard    Grosse naturforscher   1                     3
Loti      Pêcheur d’Islande      1                     31
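The gap these figures suggest can be tallied with a few lines of Python. The numbers are taken from the table above; treating VIAF expressions and WorldCat translation counts as directly comparable is a simplifying assumption, and the function name is hypothetical.

```python
# (author, title, expressions in VIAF, translation count in WorldCat)
ROWS = [
    ("Atwood", "Blind assassin", 8, 31),
    ("Guevara", "Notas de viaje", 0, 11),
    ("Hawking", "Grand design", 0, 18),
    ("Lenard", "Grosse naturforscher", 1, 3),
    ("Loti", "Pêcheur d’Islande", 1, 31),
]


def missing_from_viaf(rows):
    """For each work, count WorldCat translations not yet reflected in VIAF."""
    return {
        title: max(worldcat - viaf, 0)
        for _author, title, viaf, worldcat in rows
    }


gaps = missing_from_viaf(ROWS)
total_viaf = sum(row[2] for row in ROWS)      # 10
total_worldcat = sum(row[3] for row in ROWS)  # 94
```

For these five works, VIAF holds 10 expressions against 94 translations counted in WorldCat, so up to 84 translation records could be added.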
Diffusion of Translation records
• Records are freely available to the world from VIAF in
   • MARC-21
   • XML
   • RDF (linked data)
   • Just links in JSON
   • And other formats as introduced
We don’t know now, but soon will
• # of manifestations as opposed to # of records
• # of works that have translations
• Top translated authors and works
• And more