Making WorldCat Work Harder
A Future for the Library Catalogue
T. Hickey
ACRL/DVC
Bryn Mawr
3 November 2006
OCLC Research
Research for both
• OCLC services
• Membership
Metadata management
Knowledge organization
Content management
Interoperability
Systems & interaction design
~30 employees
Reports
Melvyl Recommender Project (CDL)
• http://www.cdlib.org/inside/projects/melvyl_recommender/report_docs/Mellon_final.pdf
Changing Nature of the Catalog and its Integration…
• http://www.loc.gov/catdir/calhoun-report-final.pdf
Future of Cataloging at Indiana University
• http://www.iub.edu/~libtserv/pub/Future_of_Cataloging_White_Paper.pdf
Martha Yee: Beyond the OPAC: future directions for web-based catalogues
• http://www.nla.gov.au/lis/stndrds/grps/acoc/papers2006.html
Influences
FRBR
Faceting
Google, Yahoo, etc.
Digital content
Ranking
Consortia
Connectivity
Remote users
Basic Approach
Go to the users
• Bring data to the user
• Make it as inviting as possible
• Invite their participation
Use the data we have
• Classification
• Controlled vocabularies
• Controlled names
• FRBR
• Usage data
OCLC’s Role
Largest consortium, largest catalog
• 72 million records, growing at 12 million/year
• 1.1 billion holdings
• http://www.oclc.org/worldcat/grow.htm
Open WorldCat
Relationships with Web indexers
Authority control
Data mining
General Observations
Grouping and ranking are critical
Simpler is better
Faster is better
Faceting needs to be visible
Authority control is important
Local is not as important as it was
OCLC Research
Data mining
• WorldCat Identities
• Audience level
Authority control
• VIAF
• Heading control
FRBR
• Algorithm
• xISBN
Live search
FictionFinder
WorldCat Identities
Create a page for each person in WorldCat
• Name(s)
• Works by and about
• Subjects
• Dates
• Fiction/non-fiction
• Roles
• Co-authors
Add links
• Wikipedia
• Authority files
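One way to picture the per-person page described above is as a summary record aggregated from bibliographic records. The sketch below is illustrative only and is not OCLC's implementation: the `Identity` class, the record layout (`title`, `creators`, `about`, `subjects`), and the sample data are all hypothetical, chosen to mirror the bullets on the slide.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Identity:
    """Hypothetical per-person summary; fields mirror the slide's bullets."""
    name: str
    works_by: set = field(default_factory=set)
    works_about: set = field(default_factory=set)
    subjects: Counter = field(default_factory=Counter)
    roles: Counter = field(default_factory=Counter)
    co_authors: Counter = field(default_factory=Counter)
    links: dict = field(default_factory=dict)  # e.g. Wikipedia, authority file


def build_identities(records):
    """Aggregate simplified bibliographic records into one Identity per name."""
    identities = {}

    def get(name):
        return identities.setdefault(name, Identity(name))

    for rec in records:
        creators = rec.get("creators", [])  # list of (name, role) pairs
        names = [name for name, _ in creators]
        for name, role in creators:
            ident = get(name)
            ident.works_by.add(rec["title"])
            ident.roles[role] += 1
            ident.subjects.update(rec.get("subjects", []))
            ident.co_authors.update(n for n in names if n != name)
        for name in rec.get("about", []):
            get(name).works_about.add(rec["title"])
    return identities


# Invented sample data, for illustration only.
sample_records = [
    {"title": "Book One",
     "creators": [("Author A", "author"), ("Author B", "illustrator")],
     "subjects": ["Whales"]},
    {"title": "Book Two",
     "creators": [("Author A", "author")],
     "subjects": ["Whales", "Ships"]},
    {"title": "Life of Author A",
     "creators": [("Author B", "author")],
     "about": ["Author A"],
     "subjects": ["Biography"]},
]

identities = build_identities(sample_records)
print(sorted(identities["Author A"].works_by))
print(sorted(identities["Author A"].works_about))
```

Because the summary is computed in one pass over the records, it fits pages that are generated ahead of time rather than on each request.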
Approach
Borrow from
• FictionFinder
• RedLightGreen
• FRBR
• VIAF
• PeopleAustralia
• Wikipedia
Pages are ‘static’
• Easier to do complicated analysis
• Some parts may be editable
• Use cover art in lieu of photos
Statistics
80 million (nominally) controlled headings
18 million different identities in WorldCat
2 million with at least five citations
12 million with only one citation
400,000 identities with non-Latin script forms
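The identity counts above imply a long-tailed distribution. The short calculation below works out the shares, assuming (the slide does not say so explicitly) that the 2 million and 12 million figures are both subsets of the 18 million identities; the function name and labels are made up for this sketch.

```python
def citation_shares(total, single, five_plus):
    """Split identities into citation bands and return (count, share) per band."""
    two_to_four = total - single - five_plus  # remainder, derived not reported
    bands = {
        "1 citation": single,
        "2 to 4 citations": two_to_four,
        "5 or more citations": five_plus,
    }
    return {label: (count, count / total) for label, count in bands.items()}


# Figures from the slide; the "2 to 4" band is inferred by subtraction.
shares = citation_shares(total=18_000_000, single=12_000_000, five_plus=2_000_000)
for label, (count, share) in shares.items():
    print(f"{label}: {count:,} ({share:.1%})")
# 1 citation: 12,000,000 (66.7%)
# 2 to 4 citations: 4,000,000 (22.2%)
# 5 or more citations: 2,000,000 (11.1%)
```

Under that assumption, about two thirds of identities appear on a single record.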
Plans, etc.
Make WorldCat Identities public this year
Revised version of FictionFinder soon
Improve authority control
Extend authority control
Improve FRBR matching
Do We Need It?
Just have Google harvest everything
• Our experience with Google
• Fielded searching
• Reliable searching
Possibility of user-supplied metadata
Cost of good metadata
Cost of non-existent metadata
Conclusions
Shift to remote users forces new approaches
Online availability – trend towards centralization
More flexibility in implementations
Patrons are better served
Less emphasis on physical collections
Thom Hickey
[email protected]