Making WorldCat Work Harder

A Future for the Library Catalogue
T. Hickey
ACRL/DVC
Bryn Mawr
3 November 2006
OCLC Research
Research for both
• OCLC services
• Membership
Metadata management
Knowledge organization
Content management
Interoperability
Systems & interaction design
~30 employees
Reports
 Melvyl Recommender Project (CDL)
• http://www.cdlib.org/inside/projects/melvyl_recommender/report_docs/Mellon_final.pdf
 Changing Nature of the Catalog and its Integration…
• http://www.loc.gov/catdir/calhoun-report-final.pdf
 Future of Cataloging at Indiana University
• http://www.iub.edu/~libtserv/pub/Future_of_Cataloging_White_Paper.pdf
 Martha Yee: Beyond the OPAC: future directions for web-based catalogues
• http://www.nla.gov.au/lis/stndrds/grps/acoc/papers2006.html
Influences
FRBR
Faceting
Google, Yahoo, etc.
Digital content
Ranking
Consortia
Connectivity
Remote users
Basic Approach
 Go to the users
• Bring data to the user
• Make it as inviting as possible
• Invite their participation
 Use the data we have
• Classification
• Controlled vocabularies
• Controlled names
• FRBR
• Usage data
OCLC’s Role
 Largest consortium, largest catalog
• 72 million records, growing at 12 million/year
• 1.1 billion holdings
• http://www.oclc.org/worldcat/grow.htm
 Open WorldCat
 Relationships with Web indexers
 Authority control
 Data mining
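The catalog figures above imply a couple of derived numbers. A quick sketch of the arithmetic, using the slide's rounded figures and assuming, purely for illustration, that growth stays linear at the stated rate:

```python
# Rounded figures taken from the slide.
records = 72_000_000
holdings = 1_100_000_000
growth_per_year = 12_000_000

# Average number of library holdings attached to each record.
holdings_per_record = holdings / records

# Annual growth as a fraction of the current record count.
growth_rate = growth_per_year / records


def projected_records(years):
    """Linear projection; the constant growth rate is an assumption."""
    return records + growth_per_year * years


print(f"{holdings_per_record:.1f} holdings per record")
print(f"{growth_rate:.1%} growth per year")
```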
General Observations
Grouping and ranking are critical
Simpler is better
Faster is better
Faceting needs to be visible
Authority control is important
Local is not as important as it was
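To make "grouping and ranking" concrete, here is a minimal sketch, not OCLC's actual method. It assumes records are grouped by a crude normalized author/title key and that groups are ranked by total holdings. The record fields, `work_key`, and the sample holdings counts are all hypothetical.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Crude matching key: lowercase, punctuation to spaces, collapse spaces."""
    def norm(text):
        text = re.sub(r"[^\w\s]", " ", text.lower())
        return " ".join(text.split())
    return norm(author) + "/" + norm(title)


def group_and_rank(records):
    """Group records by work key, then order groups by total holdings."""
    groups = defaultdict(list)
    for rec in records:
        groups[work_key(rec["author"], rec["title"])].append(rec)
    return sorted(
        groups.items(),
        key=lambda item: sum(r["holdings"] for r in item[1]),
        reverse=True,
    )


# Made-up sample data for illustration only.
SAMPLE = [
    {"author": "Melville, Herman", "title": "Moby Dick", "holdings": 900},
    {"author": "Melville, Herman", "title": "Moby-Dick.", "holdings": 400},
    {"author": "Austen, Jane", "title": "Emma", "holdings": 1000},
]

for key, recs in group_and_rank(SAMPLE):
    print(key, len(recs), sum(r["holdings"] for r in recs))
```

Grouping first means the two Melville records rank as one work with combined holdings, instead of competing as separate, lower-ranked entries.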
OCLC Research
 Data mining
• WorldCat Identities
• Audience level
 Authority control
• VIAF
• Heading control
 FRBR
• Algorithm
• xISBN
 Live search
 FictionFinder
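The idea behind an xISBN-style lookup is that, once editions are grouped into works, an ISBN can be mapped to the other ISBNs of the same work. A minimal sketch under that assumption: the function names, work identifiers, and placeholder ISBN strings below are hypothetical and are not the service's API.

```python
from collections import defaultdict


def build_isbn_index(editions):
    """editions: iterable of (work_id, isbn) pairs from some prior grouping."""
    isbn_to_work = {}
    work_to_isbns = defaultdict(set)
    for work_id, isbn in editions:
        isbn_to_work[isbn] = work_id
        work_to_isbns[work_id].add(isbn)
    return isbn_to_work, work_to_isbns


def related_isbns(isbn, isbn_to_work, work_to_isbns):
    """Return the other ISBNs in the same work, or [] if the ISBN is unknown."""
    work_id = isbn_to_work.get(isbn)
    if work_id is None:
        return []
    return sorted(work_to_isbns[work_id] - {isbn})


# Placeholder identifiers, not real ISBNs.
EDITIONS = [
    ("work-1", "isbn-a"),
    ("work-1", "isbn-b"),
    ("work-1", "isbn-c"),
    ("work-2", "isbn-d"),
]

isbn_to_work, work_to_isbns = build_isbn_index(EDITIONS)
print(related_isbns("isbn-a", isbn_to_work, work_to_isbns))
```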
WorldCat Identities
 Create a page for each person in WorldCat
• Name(s)
• Works by and about
• Subjects
• Dates
• Fiction/non-fiction
• Roles
• Co-authors
 Add links
• Wikipedia
• Authority files
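The elements listed for each identity page could be modeled as a simple record. This is only a sketch of one possible shape: the class and field names are hypothetical, and reading "Dates" as life dates is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IdentityPage:
    """One page per person; field names are illustrative, not OCLC's schema."""
    preferred_name: str
    alternate_names: List[str] = field(default_factory=list)
    works_by: List[str] = field(default_factory=list)
    works_about: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)
    birth_year: Optional[int] = None   # "Dates" read as life dates (assumption)
    death_year: Optional[int] = None
    fiction_count: int = 0
    nonfiction_count: int = 0
    roles: List[str] = field(default_factory=list)
    co_authors: List[str] = field(default_factory=list)
    links: Dict[str, str] = field(default_factory=dict)  # e.g. Wikipedia, authority file

    def citation_count(self) -> int:
        """Works by plus works about the person."""
        return len(self.works_by) + len(self.works_about)


page = IdentityPage(
    preferred_name="Melville, Herman",
    works_by=["Moby Dick", "Typee"],
    works_about=["A hypothetical critical study"],
    birth_year=1819,
    death_year=1891,
    roles=["Author"],
    links={"wikipedia": "https://en.wikipedia.org/wiki/Herman_Melville"},
)
print(page.preferred_name, page.citation_count())
```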
Approach
 Borrow from
• FictionFinder
• RedLightGreen
• FRBR
• VIAF
• PeopleAustralia
• Wikipedia
 Pages are ‘static’
• Easier to do complicated analysis
• Some parts may be editable
• Use cover art in lieu of photos
Statistics
80 million (nominally) controlled headings
18 million different identities in WorldCat
2 million with at least five citations
12 million with only one citation
400,000 identities with non-Latin script forms
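These figures imply a long-tailed distribution. A short sketch of the arithmetic, assuming the slide's rounded counts all describe subsets of the same 18 million identities:

```python
# Rounded figures from the slide.
headings = 80_000_000
identities = 18_000_000
five_or_more = 2_000_000
single_citation = 12_000_000
non_latin = 400_000

# Identities with two to four citations, by subtraction.
two_to_four = identities - five_or_more - single_citation


def share(count):
    """Percentage of all identities, rounded to one decimal place."""
    return round(100 * count / identities, 1)


print("one citation:     ", share(single_citation), "%")
print("two to four:      ", share(two_to_four), "%")
print("five or more:     ", share(five_or_more), "%")
print("non-Latin forms:  ", share(non_latin), "%")
print("headings/identity:", round(headings / identities, 1))
```

On these numbers, roughly two thirds of identities rest on a single citation, which is part of why grouping and authority control matter for the remaining pages.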
Plans, etc.
Make WorldCat Identities public this year
Revised version of FictionFinder soon
Improve authority control
Extend authority control
Improve FRBR matching
Do We Need It?
 Just have Google harvest everything
• Our experience with Google
• Fielded searching
• Reliable searching
 Possibility of user-supplied metadata
 Cost of good metadata
 Cost of non-existent metadata
Conclusions
 Shift to remote users forces new approaches
 Online availability – trend towards centralization
 More flexibility in implementations
 Patrons are better served
 Less emphasis on physical collections
Thom Hickey