Collecting web resources : Selecting, harvesting, cataloging

When the Books Leave the Building
Metadata for a Digital Age
RLG Partnership Symposium 2010
Robert Wolven
• What we’re doing now doesn’t make sense
• We need to think differently before we can act appropriately
• We’re all in this together and …
• We’re not alone.
When books were books …
• 20th Century Research Process
• Library as metadata repository
• Library as content repository
[Diagram: Scholarly Research Cycle. Labels: Research Question, Metadata, Content, Analysis]
[Diagram: Research Question. Labels: Metadata, Indexes, Bibliographies, Finding Aids, Library Catalog, Content, Archives, Journals, Books]
[Diagram: Research Question. Labels: Library Catalog, Books]
[Diagram: Research Question. Labels: Library Catalog, Books, Web Search, Digital Collections, Data, News, Articles]
Library Super-Catalog: Web-Scale Discovery
Articles, News, Images, Data, Chapters …
Name Authorities, Subject Headings …
Library focus on content: From Analog to Digital
• From: units in which resources are managed (published, purchased, stored …)
• To: units in which resources are accessed (chapter-level DOIs, iTunes, article-linking …)
Library focus on content (cont’d)
• From: published vs unique (shared cataloging, standards vs local access, practice)
• To: limited access vs open access (outsourced responsibility vs no responsibility?)
Library focus on content (cont’d)
• From: mediated access via metadata (metadata as surrogate)
• To: searchable content vs viewable content (metadata as supplement)
Library focus on metadata creation and management
• From: emphasis on discovery
• To: emphasis on access
• From: design for homogeneous, controlled environment
• To: design for blended, web-scale environment
Outside the library: content providers
• From: domain-specific content
• To: cross-domain search
• From: limited cumulation
• To: indefinite cumulation
From web search to web research
• “Conversation” as determinant of relevance
• Many participants provide effective filter
• Finding what matters most
• Finding what hasn’t been discussed
• Finding everything that matters
Some challenges:
• Consistent discovery across heterogeneous objects
• Defining appropriate “targets” of discovery
• Enhancing retrospective metadata
• Parsing ambiguous data to improve retrieval
Some implications for metadata practice
• Design metadata for primary audience
• Deprecate consistency as a value
• Use identifiers to compensate for lack of consistency
• Maximize use of linked data
• Apply expertise based on mission, not ownership
• Focus on metadata to bridge communities of practice
• Focus on improving ability to parse large results
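One way to picture the point about using identifiers to compensate for lack of consistency is the minimal Python sketch below. It is not taken from the talk: the records, the field names (`creator`, `creator_id`), and the `ex:person/...` identifiers are all invented for illustration. It only shows that grouping by a shared identifier can bring together records whose name strings were entered inconsistently.

```python
from collections import defaultdict

# Hypothetical records: the same person appears under inconsistent name
# strings, but each record carries the same (made-up) identifier.
records = [
    {"title": "Report A", "creator": "Smith, John", "creator_id": "ex:person/001"},
    {"title": "Report B", "creator": "J. Smith", "creator_id": "ex:person/001"},
    {"title": "Report C", "creator": "Smith, Jane", "creator_id": "ex:person/002"},
]


def group_titles(recs, key):
    """Group record titles by the value of the given field."""
    groups = defaultdict(list)
    for rec in recs:
        groups[rec[key]].append(rec["title"])
    return dict(groups)


# Grouping on the name string splits one person's work across two headings;
# grouping on the identifier keeps it together despite the inconsistent names.
by_name = group_titles(records, "creator")
by_id = group_titles(records, "creator_id")
```

In this toy data, grouping by name string yields three separate headings, while grouping by identifier yields two, one per person. Real metadata would need identifiers assigned and maintained by some authority, which the sketch simply assumes.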