New Approaches To Resource Discovery In The UK HE Community

Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath, BA2 7AY
Email: [email protected]
URL: http://www.ukoln.ac.uk/
Aims of Talk:
• Review approaches taken by UK HE community
• Overview of eLib phase 3 projects and development of the DNER
• Discussion of architectural models, software development and funding regimes
UKOLN is funded by Resource: The Council for Museums, Archives and Libraries,
the Joint Information Systems Committee (JISC) of the Higher and Further
Education Funding Councils, as well as by project funding from the JISC and the
European Union.
UKOLN also receives support from the University of Bath where it is based.
Contents
Web Manager’s View
Issues:
• Software
• Server or site?
• File formats
• User interface
• Administrator’s interface
Librarian’s View
Issues:
• Web-enabled OPAC
• Integration with other OPACs
• Cross-searching or union catalogue
• Z39.50
• Metadata
• Identifiers
Other approaches
eLib Phase 3: Hybrid Libraries and Clumps
DNER: Distributed National Electronic Resource
Other Initiatives: EU & US projects
Which To Choose?
Software from <http://searchtools.com/tools/tools.html>
• Alkaline (Vestris)
• AltaVista - Search Intranet
• ASTAWare SearchKey
• atomz Search (remote)
• BooleanSearch
• BBDBot
• BRS/Search (Dataware)
• Compass Server (Netscape)
• Cybotics
• DataWare BRS/Search
• DocFather (formerly SiteSearch)
• dtSearch Web
• Excalibur RetrievalWare
• EWS (Excite)
• Excerpt (Obsolete)
• Extense
• FAST Search Server
• Findex (code library)
• Folio siteDirector
• FreeFind (remote)
• Fulcrum
• Glimpse
• Harvest
• ht://Dig
• ICE
• iHound (ICATT)
• Index Search (Xavatoria)
• Index Server (Microsoft)
• IndexMySite (remote)
• Infoseek - Ultraseek
• Intermediate Search
• intraSearch (remote)
• I-Search
• Isearch
• ITMS
• Isys:web
• Java Applets
• JHLSearch
• JObjects QuestAgent
• Lycos / InMagic
• Magnifi Enterprise Server
• Matt's SimpleSearch
• Microsoft Index Server
• Microsoft Site Server
• MiniSearch (remote)
• MondoSearch
• Muscat
• NetResults (now SearchKey Plus)
• Netscape - Compass Server
• OpenText - LiveLink
• Perl Scripts
• Perlfect Search
• Phantom (Maxum)
• PicoSearch (remote)
• Etc.
Can choose by reading reviews, web sites, etc. or by looking at usage in community
Which to choose? What software may be obsolete? What does remote mean?
Findings: UK HE Web Sites
Main findings of 3 surveys:
Software     Nos. in Jul 1999   Nos. in Mar 2000   Nos. in Aug 2000
ht://Dig            25                 32                 42
eXcite              19                 17                  9
Microsoft           12                 15                 18
Harvest              8                  6                  3
Ultraseek            7                  9                 11
Other               29                 34                 31
None                60                 50                 44
Totals             160                163                163
[Charts: Nos. of sites using ht://Dig, eXcite, Microsoft, Harvest, Ultraseek, Other and None]
• Article published in Ariadne issue 21 <http://www.ariadne.ac.uk/issue21/webwatch/>
• Results (including update on survey) available from: <http://www.ukoln.ac.uk/web-focus/surveys/uk-he-search-engines/>
Popular Product: ht://Dig
ht://Dig
See <http://www.htdig.org/>
• Now used at 42 (up from 25 then 32) UK HEIs
• Freely available
• Own domain with well-designed web site
• Robot to index multiple servers
Oxford Case Study
• 131 servers
• 438,500 resources
• Indexes MS Office, PDF, etc. files (external parser)
Issue: Web community not interested in non-Web resources?
National Search Engines
ACDC (Academic Directory)
http://acdc.hensa.ac.uk/
• (Unfunded) pilot of index of ac.uk domain based on distributed approach using Harvest
• Set up in March 1996
• Lack of development effort resulted in degraded service (e.g. indexer not aware of JavaScript code)
Issues:
• Problems with volunteer effort once enthusiasm wanes
• Lack of user involvement can limit acceptance
• Lack of funding body involvement can mean lessons learnt are lost
Institutional Developments
Maestro robot (Dundee):
• Indexes Scottish resources
• Individual or all sites
• Volunteer effort
• Interesting application for OS/2
North East Universities (UNIS4NE):
• Appearance of cross-searching
• Actually interface to HotBot / AltaVista
eLib Subject Gateways
SOSIG is an example of subject gateway initially funded by eLib
SOSIG provides access to manually catalogued resources in Social Sciences
Involvement with Social Science community has helped acceptance
ROADS
ROADS software used to support several gateways
Key features:
• Open source
• Support for whois++
• Momentum behind software meant:
– Uptake in other communities
– Additional developments (e.g. ROADS/Z39.50 gateway)
But:
• Whois++ standard failed to take off
Approaches Taken By Hybrid Libraries Projects
Let’s look at some of the approaches taken by some of the eLib Phase 3 Hybrid Libraries projects which help users find electronic and "real world" resources:
Agora:
• Use of Z39.50 and Collection Level Descriptions
• Working with a commercial software vendor
Headline:
• Provision of a personalised interface
• An open source approach
BUILDER:
• Searching across Hybrid Library Web sites
• Authenticated access to exam papers
• Making use of locally available applications
Agora (1)
In the Agora Hybrid Library the user can choose a Landscape
Agora (2)
The landscape may be a collection of resources; individual collections can be selected
Agora (3)
Collections are defined using the Collection Level Description agreed by eLib projects
Agora (4)
Results from local collections are usually returned first
Agora (5)
The results can be viewed directly or requested using ILL
Agora (6)
The results are retrieved simultaneously
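The idea of querying several collections at once and listing local results first can be sketched as follows. This is a minimal illustration only, not Agora's implementation: the target names and the `search_target` stub are hypothetical, and a real system would issue Z39.50 or HTTP queries rather than return canned data.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical targets; "local" flags collections held by the user's own library.
TARGETS = [
    {"name": "Remote catalogue A", "local": False},
    {"name": "Local OPAC", "local": True},
    {"name": "Remote catalogue B", "local": False},
]


def search_target(target, query):
    """Stand-in for a real Z39.50 or HTTP search; returns canned hits."""
    return [f"{target['name']}: result for '{query}'"]


def cross_search(query, targets=TARGETS):
    """Query all targets in parallel, then list local collections first."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        # map() keeps the order of `targets`, whatever order replies arrive in.
        hit_lists = list(pool.map(lambda t: search_target(t, query), targets))
    paired = list(zip(targets, hit_lists))
    # Stable sort: local targets (key False) come before remote ones (key True).
    paired.sort(key=lambda pair: not pair[0]["local"])
    return [(target["name"], hits) for target, hits in paired]


for name, hits in cross_search("seaside"):
    print(name, hits)
```

The searches run concurrently, but the display order is decided afterwards, which is one simple way to keep local collections at the top of the result list.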
Agora (7)
Results from AltaVista obtained using “HTML-scraping” technique
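As a rough illustration of what HTML-scraping involves, the sketch below pulls result links out of a page of markup using Python's standard `html.parser` module. The sample page, the `class="result"` convention and the `ResultScraper` name are all invented for this example; they do not describe AltaVista's pages or Agora's code.

```python
from html.parser import HTMLParser

# Invented sample markup standing in for a search engine's results page.
SAMPLE_PAGE = """
<html><body>
<ol>
  <li><a class="result" href="http://www.example.org/one">First hit</a></li>
  <li><a class="result" href="http://www.example.org/two">Second hit</a></li>
</ol>
<a href="/help">Help</a>
</body></html>
"""


class ResultScraper(HTMLParser):
    """Collect (url, title) pairs from anchors marked class="result"."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None  # href of the result anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "result":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape(page):
    """Return the list of (url, title) result pairs found in the page."""
    scraper = ResultScraper()
    scraper.feed(page)
    return scraper.results


print(scrape(SAMPLE_PAGE))
```

Because the scraper depends on the layout of someone else's page, it can break whenever that layout changes, which is the usual weakness of this technique.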
Headline (1)
Headline’s PIE (Personal Information Environment) provides a personalised interface to Hybrid Libraries resources.
Here is Pete’s (an Economics UG student) default information landscape
http://www.headline.ac.uk/publications/pie/Pete'sPage1.html
Headline (2)
Pete selects the All Resources link
This gives a list of all the Library resources and services that Pete is entitled to use
Headline (3)
Pete adds the Economic Systems Research journal to his list of resources
Headline (4)
Pete now clicks on the Customise option near the top of the window
He can now add the journal to his resources for This Week’s Essay
Headline (5)
Pete now carries out additional research
He selects collections of interest and then searches for “Japan and emerging markets”
Headline (6)
Pete expands the results for Unicorn …
Headline (7)
… and then views a map showing the physical location
This illustrates how Headline supports access to physical objects as well as digital resources.
Headline (8)
Finally Pete expands the results from Decomate
These are PDF documents which can be viewed directly
BUILDER (1)
BUILDER (Birmingham University Integrated Library Development and Electronic Resource) provides a number of hybrid library demonstrators
The Microsoft SiteServer indexer is used to index across other Hybrid Libraries (and Clumps) projects
Notice branding of the results
Authentication is provided using the Novell NDS which provides access to the institutional network
Issues
The different approaches to software development:
• Make use of (and work with) commercial products:
– Benefit from market-tested products
– More realistic awareness of commercial acceptance
– Relationships may be difficult
– May be sucked into use of proprietary solutions
• Develop open source software and use complementing open source products
– Flexibility in adopting emerging new standards
– Requires technical expertise to develop and maintain
– Management resistance, esp. if fails to gain momentum
• Pragmatic approach in using existing tools
– Makes use of existing tools and expertise
– Can quickly develop prototypes which can help gain support for services
– May be architecturally flawed and make use of proprietary solutions
Tools (1)
A variety of open source tools are being developed within the community.
DC-dot, developed by UKOLN, can be used to assist the creation of Dublin Core metadata.
The metadata can be generated in various formats such as HTML and RDF.
http://www.ukoln.ac.uk/metadata/dcdot/
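To make the HTML form of Dublin Core metadata concrete, here is a minimal sketch that renders a few elements as `<meta>` tags using the widely used `DC.element` naming convention. It is not DC-dot's code and its output may differ in detail from what DC-dot produces; the `dublin_core_meta` function is hypothetical and the sample values are taken from the title slide of this talk.

```python
from html import escape


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags."""
    # Schema declaration assumed here; real tools may use a different href.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        # Escape quotes and angle brackets so the attribute stays well-formed.
        lines.append(
            f'<meta name="DC.{element}" content="{escape(value, quote=True)}">'
        )
    return "\n".join(lines)


# Illustrative values only.
record = {
    "title": "New Approaches To Resource Discovery In The UK HE Community",
    "creator": "Brian Kelly",
    "publisher": "UKOLN",
}
print(dublin_core_meta(record))
```

The generated tags would be pasted into the `<head>` of the page they describe.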
Tools (2)
UKOLN has also developed a tool for creating collection level descriptions to support projects funded by RSLP (Research Support Libraries Programme), another HE funded programme
http://www.ukoln.ac.uk/metadata/rslp/tool/
From Hybrid Libraries to the DNER
Hybrid Libraries projects are addressing:
• Need for users to find a variety of resources
• Need to gain experience from projects
The DNER:
• Distributed National Electronic Resource
• Building on Hybrid Libraries project experiences
• Focus on services rather than projects
• Aims to provide seamless access to quality resources
• Is developing a standards-based architectural framework
DNER Architecture
Areas of interest include:
• Collection descriptions
• User profiles
• Identifiers
Emphasis on interoperability through use of standards
Work currently in progress
Currently...
[Diagram: Local content, National content and International content, each with its own Web interfaces, and the End user]
Currently...
[Diagram: Local content, National content and International content with their Web interfaces, together with Collection Description (e.g. Agora), User Profile (e.g. Headline), Authentication (Athens) and the End user]
Future...
[Diagram: Content with its Web interfaces, Collection description, User profile, Authentication (Sparta) and the End user]
Future...
[Diagram: Content, Collection description, Portal (subject portal or institutional portal or MLE or ...), User profile, Authentication (Sparta) and the End user]
Sharing content
How do ‘portals’ and content servers interact?
Technologies currently being investigated:
• HTTP
• Z39.50 - Bath Profile
• OAI - Open Archives Initiative
• RSS - Rich Site Summary / RDF Site Summary
Open Archives Initiative
OAI Metadata Harvesting Framework:
• Simple mechanism for sharing metadata records
• Records shared over HTTP...
• ... as XML
• Client can ask metadata server for
– all records
– all records modified in last ‘n’ days
– info about databases, formats, etc.
See <http://www.openarchives.org/>
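The three kinds of request listed above can be sketched as plain HTTP URLs. This is a hedged illustration: the base URL is hypothetical, and the verb and parameter names (`ListRecords`, `Identify`, `ListMetadataFormats`, `metadataPrefix`, `from`) are those of the later OAI-PMH specification, which may differ from the version under discussion when this talk was given.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE_URL = "http://archive.example.org/oai"  # hypothetical metadata server


def oai_request(verb, **params):
    """Build the URL for one harvesting request."""
    return f"{BASE_URL}?{urlencode({'verb': verb, **params})}"


def all_records():
    """Ask for every record, as simple Dublin Core."""
    return oai_request("ListRecords", metadataPrefix="oai_dc")


def records_since(days, today=None):
    """Ask only for records modified in the last `days` days."""
    today = today or date.today()
    start = today - timedelta(days=days)
    # "from" is a Python keyword, so it is passed via a dict.
    return oai_request(
        "ListRecords", metadataPrefix="oai_dc", **{"from": start.isoformat()}
    )


print(all_records())
print(records_since(7))
print(oai_request("Identify"))             # information about the server
print(oai_request("ListMetadataFormats"))  # metadata formats on offer
```

Each URL would be fetched over HTTP and the XML response parsed by the harvesting client.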
RSS
RSS (Rich Site Summary):
• XML application for syndicated news feeds
• Pointers and simple descriptions of news items (not the items themselves)
• Being transitioned to more generic RDF/XML application (RSS 1.0)
• No querying - just regular ‘gathering’ of RSS file
• See <http://rssxpress.ukoln.ac.uk/>
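A small sketch of what ‘gathering’ an RSS file amounts to: the client fetches the whole file and reads out the pointers it contains. The feed below is invented for illustration and follows the simple pre-1.0 RSS layout (channel, item, title, link, description); the `gather_items` function is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented feed; a real client would fetch this file over HTTP at intervals.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example news</title>
    <link>http://www.example.org/</link>
    <description>Invented feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://www.example.org/news/1</link>
      <description>Short summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://www.example.org/news/2</link>
      <description>Short summary of the second story.</description>
    </item>
  </channel>
</rss>
"""


def gather_items(feed_xml):
    """Return (title, link) for every item; the stories stay on the source site."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in gather_items(SAMPLE_FEED):
    print(title, link)
```

There is no query step: a portal simply re-reads the file on a schedule and displays whatever items it currently lists.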
Future... Z, OAI, RSS
[Diagram: Content; OAI, RSS, HTTP and Z39.50; Collection description; Portal or MLE or ...; User profiles; Authentication (Sparta); HTTP; End user]
Content Identification
Need to persistently identify stuff to:
• Enable lecturers to embed it into learning resources
• Enable students to embed it into multimedia essays
• Enable people to cite it
... so let’s look at a current example (from VADS)
Content Example
Content example - the URL
http://vads.ahds.ac.uk/ixbin/hixclient?_IXDB_=vads&_IXSPFX_=t&_MREF_=3392&_IXSR_=ea1&_IXSP_=0&_IXSS_=%2524%2brec%2bvads%2band%2bseaside%2band%2b%2528%2528Basic%2bDesign%2bCollection%2bin%2btitle_vads_collection%2529%2bor%2b%2528Halliwell%2bCollection%2bin%2btitle_vads_collection%2529%2bor%2b%2528Imperial%2bWar%2bMuseum%2bConcise%2bArt%2bCollection%2bin%2btitle_vads_collection%2529%2bor%2b%2528London%2bCollege%2bof%2bFashion%2bCollege%2bArchive%2bin%2btitle_vads_collection%2529%2529%2bsort%2btitle%2b%3d%252e%26_IXDB_%3dvads&_IXRECNUM=3392&_IXASEARCH=&SUBMIT-BUTTON=DISPLAY
Be nicer if the content URL was something like:
http://vads.ahds.ac.uk/id=137234-849783
http://dx.doi.org/10.3456/1096493
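Part of the reason the URL above is so unwieldy is that it carries an entire search query, percent-encoded twice. The short sketch below decodes the opening fragment of the `_IXSS_` parameter with Python's standard `urllib.parse` functions; only the fragment is taken from the URL, the variable names are illustrative.

```python
from urllib.parse import unquote, unquote_plus

# Opening fragment of the _IXSS_ parameter value from the VADS URL.
fragment = "%2524%2brec%2bvads%2band%2bseaside"

once = unquote(fragment)     # first pass: %25 becomes %, %2b becomes +
twice = unquote_plus(once)   # second pass: %24 becomes $, + becomes a space

print(once)   # %24+rec+vads+and+seaside
print(twice)  # $ rec vads and seaside
```

A URL like this records how the item was found in one session rather than what the item is, which is why it makes a poor citation.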
Identifiers
Could use URLs, PURLs, DOIs, ... but...
• URLs are locators not identifiers
• DOIs and PURLs resolved centrally
• All resolve to same thing irrespective of who/where the user is e.g.
– 10.1045/october2000-granger always resolves to US version even though D-Lib mirrored in UK
– http://purl.org/dc always resolves to US version even though DC pages mirrored in UK
DOI and PURL are resolved using a US resolver
Identifiers
Need some way to encode:
• identifier
• citation
in such a way that resolution happens in the context of:
• The location of the end user
• The access rights of the end user
This can be achieved with OpenURL and SFX
See <http://www.sfxit.com/> for further information
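The idea of context-sensitive resolution can be sketched as follows: the citation is encoded once as a query string, and the resolver it is sent to depends on who the user is. This is an assumption-laden illustration, not the SFX implementation. The resolver addresses, the `make_openurl` function and the ISSN are invented, and the key names (`genre`, `issn`, `volume`, `spage`, `date`) follow the early OpenURL draft as best understood here.

```python
from urllib.parse import urlencode

# Hypothetical resolver addresses, one per institution.
RESOLVERS = {
    "bath": "http://resolver.bath.example/sfx",
    "default": "http://resolver.example.org/sfx",
}


def make_openurl(citation, institution):
    """Attach a citation to the resolver appropriate for the user's institution."""
    base = RESOLVERS.get(institution, RESOLVERS["default"])
    return f"{base}?{urlencode(citation)}"


citation = {
    "genre": "article",
    "issn": "1234-5678",  # invented ISSN for illustration
    "volume": "12",
    "spage": "1",
    "date": "2000",
}

print(make_openurl(citation, "bath"))
print(make_openurl(citation, "elsewhere"))
```

The same citation yields different links for different users, so each institution's resolver can point to the copy its own users are entitled to reach.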
Development of Standards
As well as designing an architecture to support interoperability based on open standards there is a need to be involved in standards development work:
Warwick Framework
• A framework for metadata applications, which informed W3C’s RDF work
Dublin Core
• eLib community has been actively involved with Dublin Core development
Bath Profile
• Bath Profile for Z39.50 defines core attributes for library applications
What’s Happening Elsewhere?
A number of EU-funded projects and joint UK/US projects are involved in related activities, including:
Renardus
• EU project to develop an academic subject gateway service for Europe
SCHEMAS
• SCHEMAS provides a forum for metadata schema designers involved in EU-funded projects and national initiatives
IMESH
• Joint JISC/NSF funded project to develop a configurable, reusable and extensible toolkit for subject gateway providers
Renardus
http://www.renardus.org/
Renardus:
• Will build a pilot European broker service offering subject-based access to collections of information to support learning, teaching & research using Z39.50
• An open source approach – e.g. making use of Zebra (www.indexdata.dk)
SCHEMAS
http://www.schemas-forum.org/
To support EU projects SCHEMAS will:
• Monitor metadata developments
• Organise workshops
• Provide a registry of schemas
The use of RDF to store schemas in a machine-readable way is being investigated
Will make use of commercial software (EOR from OCLC)
IMESH
http://www.desire.org/html/subjectgateways/community/imesh/
A joint JISC/NSF funded project
Will develop open source tools for use by developers of subject gateways
Conclusions
This talk has provided examples of new approaches to resource discovery within the UK Higher Education community
A number of case studies have been looked at and the following issues addressed:
• Standards
• Approaches to software development
• The funding regime
Standards
There is:
• Awareness of the importance of standards
• Some involvement in development of standards (e.g. Dublin Core) and community agreements (e.g. collection level descriptions)
Key standards:
• XML
• Dublin Core
• Z39.50: political backing and support from the library community, but less enthusiasm from software developers
• RDF: some enthusiasts, used in some projects, but also sceptics (too complex, lack of widespread support)
• DOIs, OpenURLs, etc: Interest by early adopters
• Authentication (digital signatures, etc): difficult
• User profiles: early days
Software Development
There are a variety of approaches to software development:
• Development of Open Source software
• Use of commercial software / joint projects with commercial software vendors, etc.
The pros and cons of these approaches are well known
There is probably no best single approach applicable for all
Interoperability through use of open standards is the key – let’s be agnostic over this argument
Funding Regimes
• Volunteer effort by enthusiasts can be useful (cf. the Web in 1993) but this approach has limitations
• Large scale programmes, such as eLib, can result in significant developments
• The transition from projects to services is essential – and may be difficult
• Building on national initiatives through international collaboration will provide fresh insights and address unforeseen interoperability issues
Question Time
Any questions?
Acknowledgements: Thanks to Andy Powell, Leona Carpenter, Rachel Heery and my other colleagues in UKOLN and members of eLib Hybrid Libraries projects for their help with this presentation