Creating a National Electronic Thesis and Dissertation
Lawrence Webley, Hussein Suleman, Tatenda Chipeperekwa
{chippytdm,lwebley}@gmail.com
University of Cape Town
Department of Computer Science
Digital Libraries Laboratory
Present
History
Requirements
ETD Environment in SA
Design Principles
System Architecture
Screen Snaps
Future work
Mid-1990s
• Establishment of NDLTD
Late-1990s
• Early South African ETD sites at Rhodes/Wits/Pretoria
2001
• OAI-PMH developed to interconnect repositories
2007
• SA National working group formed
2009
• First hosted collections at NRF
2010
• First Version of Portal + Repository
2011
• ETD 2011 in Cape Town!
To link South Africa into international efforts
To gather data on university output
To deal with specific local issues
To showcase local accomplishments locally and internationally
To promote local universities
To motivate institutions to have active ETD projects
Metadata only
Metadata standards
What to expose – Masters/PhD only?
Who will provide support?
What about small institutions?
How to provide access? – OAI-PMH
Create reusable, customisable, open source ETD portal management software
◦ Preferable not to reinvent the wheel!
◦ Composed entirely of open source components
◦ Can be customised to meet other use cases
Scalability
◦ National archives are constantly growing
[Figure: ETD environment in SA. Panel labels: SA Universities and Technikons (Institution A @ NRF, Institution B @ NRF, ..., Institution X, Institution Y, ..., each with an ETD archive); SA NRF (NRF Central Archive, NRF ETD Portal); International Partners (NDLTD Union Archive, SCIRUS, ...).]
ETD collections at approximately 12 institutions
◦ Mostly larger, research-intensive institutions
Various software packages in use
◦ EPrints, DSpace, ETD-db, other
OAI-PMH support in all systems
ETD collections hosted remotely at the NRF
For smaller institutions with few resources and few ETDs
Multiple instances of DSpace
Temporary arrangement
Technical support from NRF; collection management from institutions
Our repository software fits in here
Collection of metadata records from all institutions
Any/all metadata formats
Harvested from institutions using OAI-PMH
Provides OAI-PMH and RSS interfaces
No digital objects
Web interface to collection
Search/Browse/View metadata
Statistics for collections
Latest entries
Administrative interface
◦ For managing source repositories
NDLTD Union Archive
◦ International Collection
SCIRUS
◦ Science-specific search engine
All modern Linux-based software components
Multi-tiered, simple architecture of complex components
Clean separation between components
◦ Scalability
◦ More easily customised (simply replace a component)
◦ Failure resistant
Any metadata
Simplicity (minimal dependencies)
Java/Tomcat/Lucene
[Figure: repository architecture. Labels: Institutions, Harvester, Database, Web Interface (summary info), RSS Feed, OAI-PMH data provider, portals, higher-up repositories.]
Retrieves metadata from a set of ETD repositories
◦ Via OAI-PMH interfaces
Performs incremental harvests
Performs record validation
◦ Simple validation checks
Performs twice-daily harvests
Configurable via web frontend.
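The harvesting steps above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the portal's own harvester (the slides describe a Java-based system). The base URL, the function names, the sample response, and the rule that a valid record needs an identifier and a title are all assumptions made for illustration. Only the request arguments (`verb`, `metadataPrefix`, `from`, `resumptionToken`) and the XML namespaces come from the OAI-PMH specification. Scheduling of the twice-daily runs is left out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, last_harvest=None, token=None, prefix="oai_dc"):
    """Build a ListRecords request; `from` makes the harvest incremental."""
    if token:
        # A resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": prefix}
        if last_harvest:
            params["from"] = last_harvest
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (records, resumption_token) for one ListRecords response page."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        header = rec.find(OAI + "header")
        if header is None:
            continue
        records.append({
            "identifier": header.findtext(OAI + "identifier", default=""),
            "datestamp": header.findtext(OAI + "datestamp", default=""),
            # First dc:title found in the record, or "" when there is none.
            "title": next((e.text or "" for e in rec.iter(DC + "title")), ""),
        })
    token = root.findtext(f".//{OAI}resumptionToken") or None
    return records, token


def is_valid(record):
    """Assumed validation rule: non-empty identifier and title."""
    return bool(record["identifier"].strip() and record["title"].strip())


# Hypothetical response page, standing in for a real repository's reply.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2011-09-02</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:2</identifier>
        <datestamp>2011-09-03</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_page(SAMPLE)
accepted = [r for r in records if is_valid(r)]
next_url = list_records_url("http://repository.example.org/oai", token=token)
```

A real harvester would fetch each URL over HTTP, keep following the resumption token until none is returned, and store the date of the run so that the next harvest can pass it as `from`.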
Provides machine access points to harvested metadata
◦ OAI-PMH interface
Can use any SQL-compliant DB
◦ Our implementation used MySQL
Additional services provided
◦ RSS feed of latest records
◦ Summary statistics for records from each institution
Designed to fit into a hierarchy of OAI-PMH compliant DLs
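The repository services listed above (SQL storage, per-institution statistics, an RSS feed of the latest records) can be sketched as follows. This is an illustration only, not the project's code: it uses Python with an in-memory SQLite database instead of the MySQL setup mentioned in the slides, and the table layout, function names, feed title, and link are all hypothetical. `INSERT OR REPLACE` is SQLite-specific syntax, used here so that a re-harvested record overwrites its older copy.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_store():
    """Create an in-memory store with an assumed, minimal record table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE records ("
        "identifier TEXT PRIMARY KEY, institution TEXT, "
        "title TEXT, datestamp TEXT)"
    )
    return db


def upsert(db, identifier, institution, title, datestamp):
    """Insert a record, replacing any earlier copy with the same identifier."""
    db.execute(
        "INSERT OR REPLACE INTO records VALUES (?, ?, ?, ?)",
        (identifier, institution, title, datestamp),
    )


def summary(db):
    """Return (institution, record_count) pairs, sorted by institution."""
    return db.execute(
        "SELECT institution, COUNT(*) FROM records "
        "GROUP BY institution ORDER BY institution"
    ).fetchall()


def latest_rss(db, limit=10):
    """Serialise the most recent records as a small RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Latest ETD records"
    ET.SubElement(channel, "link").text = "http://portal.example.org/"
    ET.SubElement(channel, "description").text = "Most recently harvested records"
    rows = db.execute(
        "SELECT identifier, title FROM records "
        "ORDER BY datestamp DESC, identifier LIMIT ?",
        (limit,),
    )
    for identifier, title in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # OAI identifiers are not URLs, so the guid is marked as non-permalink.
        ET.SubElement(item, "guid", isPermaLink="false").text = identifier
    return ET.tostring(rss, encoding="unicode")


db = open_store()
upsert(db, "oai:a.example.org:1", "Institution A", "First Thesis", "2011-09-01")
upsert(db, "oai:a.example.org:2", "Institution A", "Second Thesis", "2011-09-03")
upsert(db, "oai:b.example.org:1", "Institution B", "Third Thesis", "2011-09-02")
# A later harvest delivers an updated copy of an existing record.
upsert(db, "oai:a.example.org:1", "Institution A", "First Thesis (revised)", "2011-09-04")

stats = summary(db)
feed = latest_rss(db, limit=2)
```

Keeping the identifier as the primary key means incremental harvests can be replayed safely: a record that arrives twice is stored once.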
[Figure: portal architecture. Labels: Repository (RSS), Harvester, Harvester Web Admin, Portal Database, Lucene, Portal Web Interface (search, browse, statistics).]
Harvests from Repository into portal DB
Lucene indexes records
Portal provides human interface
◦ Allows keyword searching, browsing, category searches
◦ Also offers links to OAI-PMH and RSS interfaces
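To give a feel for what the indexing and keyword search steps involve, here is a toy inverted index in Python. It is a stand-in for illustration and not how Lucene is implemented or called; the class name, the tokenisation rule, and the sample records are all assumptions. Lucene adds analysis, ranking, and on-disk index structures that this sketch omits.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Maps each term to the set of record identifiers that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, identifier, text):
        for term in tokenize(text):
            self.postings[term].add(identifier)

    def search(self, query):
        """Return identifiers of records containing every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


index = TinyIndex()
index.add("oai:a.example.org:1", "Scalable Digital Libraries for Theses")
index.add("oai:a.example.org:2", "Metadata Harvesting in Digital Repositories")
index.add("oai:b.example.org:1", "Libraries and Open Access")

both_terms = index.search("digital libraries")
one_term = index.search("Digital")
```

Here `both_terms` holds only the records whose text contains both query terms, while `one_term` matches on a single term regardless of case.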
Packaging into Ubuntu repository
Generic browsing categories
Content Management System
◦ Favourites, citation
Social media buttons
◦ Facebook Like, Google Plus
Bug fixes
Questions?
Links
• Live portal @ www.netd.ac.za
• Source code available @ http://dl.cs.uct.ac.za/projects/etd_portal