Union catalogue models

Union Catalogue Models
Presentation for IFLA Libraries for the Blind Section
Janifer Gatenby, Strategic Research
Agenda
• Needs
• Models
• Models and Needs
• Infrastructure
• Models and Infrastructure
• Advanced Needs
• Summary
2
Needs for Union Catalogues
• Discovery
• Web presence – Exposure
• Delivery
• Cataloguing
• Collection Development
• Advanced Needs
– Gestalt effect – data quality, data mining
– Collective Innovation
3
Purposes of Union Catalogues
Discovery and Exposure
• Unified point of discovery
– larger resources are better exposed in general (single shop versus row of shops versus mall)
– comprehensive resources are more attractive to users
• more likely to use; more likely to contribute via wiki
• Ease load on local OPAC
• Provide identity to a library cooperative
4
Purposes of Union Catalogues
FRBR – Alternative editions
7
Purposes of Union Catalogues
Without clustering
8
FRBR
With Clustering
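The difference between the unclustered and clustered result lists can be illustrated with a minimal sketch. It assumes a very simple work key built from a normalized author and title; production work clustering relies on richer matching. The sample records, the field names and the `work_key` helper are hypothetical and are not taken from the presentation.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record):
    """Hypothetical work key: normalized author plus normalized title."""
    return normalize(record["author"]) + "/" + normalize(record["title"])


def cluster_by_work(records):
    """Group edition-level records under one key per work."""
    clusters = defaultdict(list)
    for record in records:
        clusters[work_key(record)].append(record)
    return dict(clusters)


# Invented sample records: three editions of one work, one edition of another.
records = [
    {"author": "Brontë, Emily", "title": "Wuthering Heights", "format": "print"},
    {"author": "Bronte, Emily", "title": "Wuthering heights.", "format": "braille"},
    {"author": "Brontë, Emily", "title": "Wuthering Heights", "format": "audio"},
    {"author": "Austen, Jane", "title": "Emma", "format": "print"},
]

clusters = cluster_by_work(records)
for key, editions in clusters.items():
    print(key, "->", [edition["format"] for edition in editions])
```

Without clustering the user sees four separate hits; with clustering the three editions appear under one work, and the alternative formats (print, braille, audio) become visible as choices.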
9
FRBR
Expansion of search
Global
Group
Local
10
Purposes of Union Catalogues
WorldCat.org Statistics
• March 2007 Reference: 14,118,777
• Change on 03 2006: 96%
– Daily average: 455,444
• Links to libraries: 834,866
– Change: 363%
11
Exposition
Referrals to WorldCat.org pages, month over month
[Chart: referrals in millions, July to December, 2005 versus 2006]
12
Exposure
WorldCat.org Referrals March / April 2007
[Pie chart: share of referrals by source]
• library sites: 31%
• Yahoo: 26%
• Google books: 14%
• education sites: 9%
• blogs, CNN, AltV +: 9%
• Google Scholar: 8%
• Google: 3%
13
Delivery
• Resource sharing
• Sharing of users
– Your users find resources elsewhere
– Others' users find your resources
• Delivery becoming increasingly separated from discovery
• Increasing delivery options
– Digitisation on demand, reference look-up, purchase, resolution
– Union catalogues need to link to multiple providers
14
Purposes of Union Catalogues
Appropriate Delivery Options
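[Diagram: a wanted item is matched to delivery options according to its attributes]
• Item attributes: Digital, Physical; Free, Licensed; In Print, Out of Print; Rare, Common; In Copyright, Out of Copyright; Access Restricted
• Delivery options: Link, Resolve, Print on Demand Copy; Loan, Digitize, Lookup; Authenticate / Authorise, Purchase / Pay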
15
International Discovery to Delivery
Digitisation on Demand
• Encourage use of eTen DoD
– libraries only need a scanner
– Software plug in on HTML page ++
– “cover to cover”: books, pictures
– Out of copyright; exceptions for blind, country specific
– On demand / just in time
– Payment options: direct, credit card, cash to library
– Evolving service
• ILS interfaces: loan, reservation, recording URL
• Regional & international registration of digital masters
16
Discovery Universe
• Portals, OPACs, Union Catalogues +++
[Diagram labels: Request Transfer Msg, OpenURL; Union Catalogue; RDS (five instances)]
Delivery Universe
• WorldCat RS
• national / regional services (nuc)
• Subito, BLDSC, CISTI, Doc Del
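As an illustration of how a discovery system can hand a found item to a delivery service, the following minimal sketch builds an OpenURL 1.0 link in key/encoded-value form. The resolver address and the bibliographic values are invented placeholders; only the key names (`url_ver`, `rft_val_fmt`, `rft.isbn` and so on) come from the OpenURL standard.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(resolver_base, isbn, title, author):
    """Return an OpenURL 1.0 (KEV) link describing a book."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.isbn", isbn),
        ("rft.btitle", title),
        ("rft.au", author),
    ]
    return resolver_base + "?" + urlencode(params)


# Hypothetical resolver and invented bibliographic values.
link = build_openurl(
    "https://resolver.example.org/openurl",
    isbn="9780000000019",
    title="Example Work",
    author="Author, Example",
)
print(link)

# Decoding the link again shows what a resolver would receive.
received = parse_qs(urlsplit(link).query)
print(received["rft.btitle"][0])
```

Because the link carries a description of the item rather than a fixed address, the same discovery record can be routed to whichever delivery service suits the user.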
17
Data Management
• shared data maintenance
– copy cataloguing - efficiencies
– ability to improve the data quality: FRBR grouping, authority control
– data mining and statistical possibilities
• cooperative collection management
– collection analysis
– cooperative digitisation
– cooperative storage, weeding, purchasing
18
Purposes of Union Catalogues
2/2
Union catalogue models
• Virtual
– Meta search engines
• Physical
– Created and maintained by online shared cataloguing
– Created and maintained via batch load from local systems
• Combinations
– Partly virtual; partly physical
– Online cataloguing and batch load
• Subset of a larger catalogue
– WorldCat Group Catalog / WorldCat Local
19
Virtual Union Catalogue
• Advantages:
– Light weight organization
– Relatively easy to implement; easy to get agreements
– Need metasearch engine and portal
– Ad-hoc basis: easy to extend, easy to disconnect
• Disadvantages:
– Slow
– Limited common search access points
– Limited sorting / relevance
– Duplicates
– Searching inefficiencies
– Varying quality
– Availability of all systems
– Configuration maintenance
• Search / Retrieve Standards: Z39.50, SRU, OpenSearch
• Maintenance Standards: none
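The virtual model can be sketched as a fan-out over several SRU targets, which also shows where the listed disadvantages come from: every target must be queried and must be available, and duplicates have to be removed after the fact. This is a minimal sketch under stated assumptions: the target addresses are invented, the `fetch` function is a stand-in that returns canned records instead of calling a server, and the duplicate test (ISBN, else lowercased title) is deliberately naive. Only the SRU parameter names (`operation`, `version`, `query`, `maximumRecords`) come from the SRU specification.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU searchRetrieve request URL for one target."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


def metasearch(targets, cql_query, fetch):
    """Query every target in turn, merge the hits and drop duplicates."""
    merged, seen, failed = [], set(), []
    for name, base_url in targets.items():
        url = sru_search_url(base_url, cql_query)
        try:
            records = fetch(url)
        except OSError:
            failed.append(name)  # one unavailable system reduces coverage
            continue
        for record in records:
            key = record.get("isbn") or record["title"].lower()
            if key not in seen:
                seen.add(key)
                merged.append(dict(record, source=name))
    return merged, failed


def fake_fetch(url):
    """Stand-in for an HTTP call: returns canned records per hypothetical target."""
    if "library-b" in url:
        raise OSError("target unavailable")
    if "library-a" in url:
        return [{"isbn": "111", "title": "Sample Title One"},
                {"isbn": "222", "title": "Sample Title Two"}]
    return [{"isbn": "111", "title": "Sample title one"},
            {"isbn": "333", "title": "Sample Title Three"}]


targets = {
    "library-a": "https://library-a.example.org/sru",
    "library-b": "https://library-b.example.org/sru",
    "library-c": "https://library-c.example.org/sru",
}
merged, failed = metasearch(targets, 'dc.title = "sample"', fake_fetch)
print([record["isbn"] for record in merged], failed)
```

In a real portal the targets would be queried in parallel, but the response time is still bounded by the slowest target, and sorting or relevance ranking can only be applied to whatever each target happens to return.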
20
Virtual Model
Z39.50 Searches - SUDOC
[Charts, January to April 2007: connections and successful searches per target, and the ratio of connections to successful searches, for Helka, NLA, Libris, BNE, DBD, ISSN, BN-Opale, L. C. and RLG]
21
Physical union catalogue
– system performance advantages of single resource
• Up time & speed
– consistently indexed data offers better precision and recall in retrieval
– easier and cheaper to create and maintain data links centrally
• TOC, index, article level data
• thumbnails, reviews
– Ease load on local OPAC
22
Physical Model
Subset – WorldCat.org
• Speed of searching a single system
• Exposure
– WorldCat.org partner network: Google, Yahoo, MSN, online bookshops, antiquarian bookshops, Ask.com etc.
• Enriched data
– Audience level, holdings count, copyright
– WorldCat Identities
– Book covers, reviews, tables of contents
• Modern, Library 2.0 interface
– Searching at local / group / global levels
– Perpetual beta, rapidly evolving
– Improved displays, faceted searching
– Social networking features
• Registry of Digital Masters
• Collection Analysis
• Work Cluster Service
23
Models / Needs
[Table: rows Virtual, Physical, Subset; columns Discovery, Exposure, Delivery, Cataloguing, Collection Development, Gestalt / Data mining, Innovation]
Standards: Z39.50, SRU, RTM, NCIP, ISO ILL
• Possible combinations
– Physical union catalogue + global subset
– Global subset + additional virtual component
24
Infrastructure
• Maintain portal
– Load on Local OPACs
• Maintain physical catalogue
– Local Extract
– Loading and Conversion
– Online maintenance
• Cataloguing Support
– Error corrections
– Quality police - consistency
• Global exposure
– Export to union catalogue
25
Infrastructure
Physical union catalogue
Batch versus Online
• batch data less consistent but converted & de-duplicated
• Not as up to date
• Most union catalogues have some batch contribution
• Optimal = SRU update
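The conversion and de-duplication step of a batch load can be sketched as follows. This is a minimal illustration, not the matching logic of any particular union catalogue system: the `match_key` rule (ISBN if present, otherwise title and year), the library codes and the sample records are all invented for the example.

```python
def match_key(record):
    """Hypothetical match key: ISBN characters if present, else title and year."""
    isbn = "".join(ch for ch in record.get("isbn", "") if ch.isdigit() or ch in "xX")
    if isbn:
        return "isbn:" + isbn.upper()
    title = " ".join(record["title"].lower().split())
    return "title:" + title + "|" + str(record.get("year", ""))


def batch_load(union, library_code, incoming):
    """Load one library's batch: attach holdings to matches, add the rest."""
    added = merged = 0
    for record in incoming:
        key = match_key(record)
        if key in union:
            union[key]["holdings"].add(library_code)
            merged += 1
        else:
            union[key] = {"title": record["title"], "holdings": {library_code}}
            added += 1
    return added, merged


union = {}
first = batch_load(union, "LIB-A", [
    {"isbn": "978-0-00-000001-9", "title": "Example Work"},
    {"title": "Local  Pamphlet", "year": 1999},
])
second = batch_load(union, "LIB-B", [
    {"isbn": "9780000000019", "title": "Example work"},
    {"title": "local pamphlet", "year": 1999},
    {"title": "Another Item", "year": 2001},
])
print(first, second, len(union))
```

The second batch adds only one new record; the other two become extra holdings on existing records, which is the de-duplication effect the slide refers to. The load is still only as current as the most recent file received.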
26
Infrastructure
Online Contribution to CBS
• Windows client WinIBW
• Web client
• Holdings
• Record capture
• Simple new record input (Australia)
• Z39.50 / UCP
• From Aleph systems in Germany
• SRU update
• Real time update from an alternative interface
• Q1 2006
27
Infrastructure
Batch Contribution to CBS
• FTP
– Files sent to FTP server associated with CBS (PUSH)
• OAI-PMH
– CBS OAI server enacts transactions to retrieve pre-defined sets of data from another database (PULL)
• Email / SMTP
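The pull style of contribution can be sketched with the OAI-PMH `ListRecords` verb. The sketch below only builds the request URLs and parses a canned response; it does not contact a server and says nothing about how CBS itself implements harvesting. The repository address, the set name and the sample XML are invented; the verb, the argument names and the XML namespace come from the OAI-PMH 2.0 specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return the record identifiers and the resumption token, if any."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_element.text if token_element is not None and token_element.text else None
    return identifiers, token


# Canned response standing in for what a repository would return.
sample_response = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

first_url = list_records_url("https://repository.example.org/oai", set_spec="braille")
identifiers, token = parse_list_records(sample_response)
next_url = list_records_url("https://repository.example.org/oai", resumption_token=token)
print(first_url, identifiers, next_url, sep="\n")
```

A harvester repeats the request with each new resumption token until the response no longer carries one, which is how a pre-defined set is pulled in manageable pages.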
28
Infrastructure
External Output
[Diagram labels: Catalogue Mirror, OPAC, Documents Repository, Catalogue, Union catalogue; links: Push, OAI Pull, SRU Update, Push]
29
Infrastructure
[Diagram labels: NCC, PUSH, LOG, GGC, SRU UPDATE M21, WorldCat; WorldCat identifiers, FRBR work id]
30
Infrastructure
Update protocols compared
• OAI & FTP
– Suited for batch loading
– OAI is used where the data source exposes the metadata to be harvested for discovery purposes
– Where the data source does not expect the data to be changed by the harvesting site or does not care
– Hence one way: no mechanism for response or future synchronisation
• SRU Update
– Two databases are not identical, e.g. a union catalogue is not a slave of the contributing catalogues
– Immediate availability of records: scalability, no scheduling & batch maintenance
– Interactive feedback (alignment, enrichment): identifiers; inter-record links, authorities
– Diagnostics: diffused error resolution
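The contrast between the two styles can be made concrete with a small in-memory simulation. It does not implement the SRU Record Update or OAI-PMH wire protocols; the `UnionCatalogue` class, its methods, the identifier format and the validation rule are all hypothetical. The point is only that a one-way load gives the sender nothing back, whereas an interactive update returns an identifier or a diagnostic per record.

```python
import itertools


class UnionCatalogue:
    """Toy union catalogue used to contrast one-way and interactive loading."""

    def __init__(self):
        self.records = {}
        self._ids = itertools.count(1)

    def harvest(self, source_records):
        """One-way load: records are copied in, nothing is reported back."""
        for record in source_records:
            self.records[f"UC{next(self._ids)}"] = dict(record)

    def update(self, record):
        """Interactive update: the sender gets an identifier or diagnostics."""
        if not record.get("title"):
            return {"status": "fail", "diagnostics": ["title is required"]}
        union_id = f"UC{next(self._ids)}"
        self.records[union_id] = dict(record)
        return {"status": "success", "union_id": union_id}


catalogue = UnionCatalogue()

# Batch style: a faulty record slips in and the source never hears about it.
harvest_result = catalogue.harvest([{"title": "Record A"}, {"title": ""}])

# Interactive style: each record gets an immediate response.
accepted = catalogue.update({"title": "Record B"})
rejected = catalogue.update({"title": ""})
print(harvest_result, accepted, rejected)
```

The identifier returned in `accepted` is what allows the contributing system to store a link to the union record, and the diagnostic in `rejected` is what allows errors to be resolved where they arise.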
31
Infrastructure
Subset – WorldCat.org
• Independent loading from institutions regardless of
– Location
– Data formats and standards
• No infrastructure
– Lowest setup and maintenance costs
• WorldCat Local option
– Ability to add virtual collections to search
32
Models / Infrastructure
[Table: rows Virtual, Physical, Subset; columns Maintain Portal, Load on local OPAC, Cataloguing Support, Local Extract, Loading & Conversion, Export to global, Data difference tolerance]
33
Faceted Searching Sorting
34
Audience Level and Rareness
35
Data Mining
36
37
Collection Analysis: Gestalt effect
38
39
Innovation
40
We can do it!!
41