
RDN, e-Prints UK and NOFDigitise: a (very) small
sample of UK OAI activity
Andy Powell
[email protected]
UKOLN, University of Bath
THE 4TH INTERNATIONAL JISC/CNI CONFERENCE
June 2002
a centre of expertise in digital information management
www.ukoln.ac.uk
About the RDN
• Resource Discovery Network (RDN)
• co-operative network of subject gateways
– BIOME (health, medicine and life sciences)
– EEVL (engineering, maths and computing)
– HUMBUL (humanities)
– PSIgate (physical sciences)
– SOSIG (social science, business and law)
• databases of descriptions of high quality
Internet resources… and other services
• 50,000+ records in total
• funded by JISC for UK HE and FE
About the RDN
• subject-focused communities
– cataloguers (subject specialists)
– end-users (community-oriented collaborative
services)
• individual Web interfaces but shared
policy framework
– collection development
– cataloguing guidelines
– technical standards and interoperability
– IPR
• DC metadata (with some extensions)
Searching… using Z39.50
[Diagram: the EEVL, HUMBUL, PSIgate, BIOME and SOSIG gateways and the RDN ResourceFinder]
• central cross-searching
service based on Z39.50
• but...
– performance issues
– difficult to build flexible browse
interface
• so... looking for a record
sharing solution
Sharing… using OAI
[Diagram: the EEVL, HUMBUL, PSIgate, BIOME and SOSIG gateways and the RDN ResourceFinder]
• gather records using OAI
• central database of all RDN
records using Cheshire
software
• basis for central search,
browse interface and other
services (e.g. Web index)
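The gathering step can be sketched as follows. This is a minimal illustration in Python, not the RDN gatherer itself (which is based on Perl modules); the sample response, the identifier and the `parse_list_records` helper are assumptions made for the example, and a plain dictionary stands in for the central Cheshire database.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical ListRecords response from one gateway, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:gateway.example.org:0001</identifier>
        <datestamp>2002-06-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Internet resource</dc:title>
          <dc:identifier>http://www.example.org/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_list_records(xml_text):
    """Extract the identifier and title of each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        records.append({"identifier": identifier, "title": title})
    return records


# A dictionary keyed by OAI identifier stands in for the central database.
central_database = {}
for rec in parse_list_records(SAMPLE_RESPONSE):
    central_database[rec["identifier"]] = rec
```

Keying the store on the OAI identifier means a record gathered again later replaces the earlier copy rather than duplicating it.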
Sharing… and searching
[Diagram: the EEVL, HUMBUL, PSIgate, BIOME and SOSIG gateways linked to the RDN ResourceFinder (label: OAI); the RDN ResourceFinder linked to the Renardus cross-search (label: Z39.50)]
• central database of all RDN
records available for
searching using Z39.50
• e.g. offered as part of the
Renardus European subject
gateway pilot service
Implementation
• implementation not tied to particular
subject gateway software
• repository built from two Perl scripts...
– first provides record conversion from gateway format
to OAI XML format
– second supports the OAI protocol
• very simple repository
– records stored in UNIX filestore as XML files
– sub-directories for each OAI ‘set’
• gatherer based on Southampton’s Perl
modules - Perl scripts freely available
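The filestore layout described above (XML files, one sub-directory per OAI set) can be sketched as follows. The RDN repository is written in Perl; this Python version is only an illustration, and the set names, the file naming and the `set:local-id` identifier scheme are assumptions, not the RDN conventions.

```python
import tempfile
from pathlib import Path


def list_sets(root):
    """Each sub-directory of the store is treated as one OAI set."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def list_identifiers(root, set_name=None):
    """List record identifiers, optionally restricted to one set."""
    root = Path(root)
    set_dirs = [root / set_name] if set_name else [root / s for s in list_sets(root)]
    identifiers = []
    for set_dir in set_dirs:
        for record_file in sorted(set_dir.glob("*.xml")):
            # Hypothetical identifier scheme: "<set>:<file name without .xml>"
            identifiers.append(f"{set_dir.name}:{record_file.stem}")
    return identifiers


def get_record(root, identifier):
    """Return the stored XML for one identifier."""
    set_name, _, local_id = identifier.partition(":")
    return (Path(root) / set_name / f"{local_id}.xml").read_text(encoding="utf-8")


def build_demo_store():
    """Create a small throwaway store so the functions above can be tried out."""
    root = Path(tempfile.mkdtemp())
    for set_name, local_id in [("engineering", "0001"), ("engineering", "0002"), ("maths", "0001")]:
        (root / set_name).mkdir(exist_ok=True)
        (root / set_name / f"{local_id}.xml").write_text(
            f"<record id='{local_id}'/>", encoding="utf-8"
        )
    return root
```

Because the store is just files and directories, the conversion script and the protocol script only need to agree on the directory layout, which is what keeps the approach independent of any particular gateway software.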
Issues - record richness
• default unqualified DC record format in
OAI is broadly in line with RDN records
• but...
• does not support full richness of RDN
records, e.g.
– cannot indicate the subject classification
scheme in use
• need to investigate whether specific
RDN record format is required
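The loss of the classification scheme can be illustrated with a small sketch. The record structure, the scheme name "DDC" and the value "620" are assumptions chosen for the example; the point is only that an unqualified dc:subject element carries a value and nothing else.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_unqualified_dc(record):
    """Serialise a gateway record as unqualified Dublin Core (oai_dc)."""
    dc = ET.Element(f"{{{OAI_DC_NS}}}dc")
    ET.SubElement(dc, f"{{{DC_NS}}}title").text = record["title"]
    for subject in record["subjects"]:
        # Only the value is written: unqualified DC has no place for
        # subject["scheme"], so that information is dropped here.
        ET.SubElement(dc, f"{{{DC_NS}}}subject").text = subject["value"]
    return ET.tostring(dc, encoding="unicode")


# Hypothetical gateway record with a classified subject term.
example_record = {
    "title": "Example resource",
    "subjects": [{"scheme": "DDC", "value": "620"}],
}
example_xml = to_unqualified_dc(example_record)
```

A harvester receiving `example_xml` sees the subject value but cannot tell which scheme it was drawn from, which is the gap a richer RDN-specific record format would need to close.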
Issues - branding
• need to indicate ownership of records
• need to brand records appropriately in
central search/browse results
• can use the OAI <about> section as
follows:
– dc:creator - the name of the cataloguer (author of
the record)
– dc:publisher - the name of the gateway that
originally made the record available
– dc:rights - a simple rights statement about the
record
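One possible shape for such an about section is sketched below. It assumes the three elements are wrapped in an oai_dc container inside the OAI about element; the wrapping, the `build_about` helper and the sample values are assumptions for illustration, not a description of the RDN output.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai", OAI_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_about(cataloguer, gateway, rights):
    """Build an about element describing the metadata record itself."""
    about = ET.Element(f"{{{OAI_NS}}}about")
    dc = ET.SubElement(about, f"{{{OAI_DC_NS}}}dc")
    # dc:creator - the cataloguer (author of the record)
    ET.SubElement(dc, f"{{{DC_NS}}}creator").text = cataloguer
    # dc:publisher - the gateway that originally made the record available
    ET.SubElement(dc, f"{{{DC_NS}}}publisher").text = gateway
    # dc:rights - a simple rights statement about the record
    ET.SubElement(dc, f"{{{DC_NS}}}rights").text = rights
    return about


# Hypothetical values, for illustration only.
about = build_about(
    "A. Cataloguer",
    "Example Gateway",
    "Record copyright Example Gateway; not for redistribution.",
)
about_xml = ET.tostring(about, encoding="unicode")
```

A central search or browse service could then read dc:publisher from the about section to brand each result with the name of the originating gateway.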
Issues - how open?
• RDN records not freely available to
anyone
• open protocol/standard but ‘closed’
service
• can use standard HTTP-level methods to
restrict access
– IP address checking
– HTTP basic authentication (username and
password)
– SSL
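The three methods above can be sketched from both sides of the connection. This is an illustration only: the network range, the base URL, the user name and the password are hypothetical, and no request is actually sent.

```python
import base64
import ipaddress
import urllib.request


def ip_allowed(client_ip, allowed_networks):
    """Repository side: accept only harvesters from known address ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


def build_harvest_request(base_url, username, password):
    """Harvester side: a ListRecords request carrying HTTP basic credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"{base_url}?verb=ListRecords&metadataPrefix=oai_dc")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical values; an https:// base URL means the credentials and the
# records travel over SSL rather than in the clear.
harvest_request = build_harvest_request(
    "https://gateway.example.org/oai", "harvester", "secret"
)
```

Basic authentication only encodes the password, it does not encrypt it, so in practice it would be combined with SSL as in the https:// URL above.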
Conclusions
• OAI protocol met a requirement already
identified by the RDN to move from
searching to sharing
• repository code deployed at all RDN
gateways
• gatherer running centrally – importing
records into Cheshire database
• now need to:
– investigate requirements for richer record
format
e-Prints UK
• 2 year project funded by JISC
• access to eprints via existing RDN services
• harvesting metadata from OAI-compliant
eprint archives in the UK (and elsewhere)
• automatically enhanced metadata records
based on Web services offered by
– OCLC (name authority and subject
classification)
– University of Southampton (citation analysis,
OpCit)
e-Prints UK architecture
[Diagram: e-Prints UK architecture]
• e-print archives (institutional, personal and
non-institutional) linked to e-Prints UK (label: OAI-PMH)
• Web services offered by OCLC (subject classification
service, name authority service) and Web service offered
by Southampton (citation analysis service) linked to
e-Prints UK (label: SOAP)
• e-Prints UK linked to three RDN gateway/portal services
(labels: SOAP, Javascript/HTTP, Z39.50)
• end-user services through the RDN
NOF-Digitise Programme
• programme of 149 ‘digitisation’ projects
• funded by New Opportunities Fund,
£50M, 2001 - 2004
• creation of innovative on-line learning
resources
• three broad themes: cultural
enrichment, citizenship, and re-skilling
• can expect 1,000,000 items digitised
during life of programme
NOF-Digitise and OAI
• considering use of OAI-PMH to create
single point of search across all
projects...
• ...and to expose NOF-digitise outputs to
other OAI service providers
• not likely to be mandatory but some
interest by projects in use of OAI-PMH
already
Open Archives Forum
• European Union IST accompanying
measure
• partners:
– Humboldt University, Berlin
– IEI-CNR, Pisa
– UKOLN, University of Bath
• started autumn 2001
project objectives
• provide focus for dissemination
• encourage collaborative development of
software
• support European liaison with OAI
• exchange information
• evaluate ‘open archives’ approach
… build community of interest
further information
• OA-Forum Web site
<www.oaforum.org>
• UKOLN OA-Forum Web site
<www.ukoln.ac.uk/metadata/oa-forum/>
• for more info, contact Leona Carpenter
[email protected]