Creating a National Electronic Thesis and Dissertation


Lawrence Webley, Hussein Suleman, Tatenda Chipeperekwa
{chippytdm,lwebley}@gmail.com, [email protected]
University of Cape Town
Department of Computer Science
Digital Libraries Laboratory








Present
History
Requirements
ETD Environment in SA
Design Principles
System Architecture
Screen Snaps
Future work
Mid-1990s
• Establishment of NDLTD
Late-1990s
• Early South African ETD sites at Rhodes/Wits/Pretoria
2001
• OAI-PMH developed to interconnect repositories
2007
• SA National working group formed
2009
• First hosted collections at NRF
2010
• First Version of Portal + Repository
2011
• ETD 2011 in Cape Town!






To link South Africa into international efforts
To gather data on university output
To deal with specific local issues
To showcase local accomplishments locally and internationally
To promote local universities
To motivate institutions to have active ETD projects






Metadata only
Metadata standards
What to expose – Masters/PhD only?
Who will provide support?
What about small institutions?
How to provide access? – OAI-PMH

Create reusable, customisable, open source ETD portal management software
◦ Preferably without reinventing the wheel!
◦ Composed entirely of open source components
◦ Can be customised to meet other use cases

Scalability
◦ National archives are constantly growing
[Figure: ETD environment in South Africa. Three groups are shown: SA universities and technikons (Institution A @ NRF, Institution B @ NRF, ...; Institution X, Institution Y, ..., with ETD archives); the SA NRF (NRF Central Archive and NRF ETD Portal); and international partners (NDLTD Union Archive, SCIRUS, ...).]

ETD collections at approximately 12 institutions
◦ Mostly larger, research-intensive institutions

Various software packages in use
◦ EPrints, DSpace, ETD-db, other

OAI-PMH support in all systems





ETD collections hosted remotely at the NRF
For smaller institutions with few resources and few ETDs
Multiple instances of DSpace
Temporary arrangement
Technical support from the NRF; collection management by the institutions






Our repository software fits in here
Collection of metadata records from all institutions
Any/all metadata formats
Harvested from institutions using OAI-PMH
Provides OAI-PMH and RSS interfaces
No digital objects
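As a rough illustration of what harvesting over OAI-PMH looks like at the request level, the sketch below builds request URLs for two standard OAI-PMH verbs: ListMetadataFormats (to discover which metadata formats a repository offers) and ListRecords (to fetch records in one format). The endpoint URL and the helper name oai_request_url are hypothetical and are not taken from the project's code.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute a real repository's OAI-PMH base URL.
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    # Drop unset arguments so that optional ones (e.g. 'set') can be omitted.
    query.update({k: v for k, v in params.items() if v is not None})
    return base_url + "?" + urlencode(query)


# Ask the repository which metadata formats it can disseminate.
formats_url = oai_request_url(BASE_URL, "ListMetadataFormats")

# Request all records in unqualified Dublin Core, the format every
# OAI-PMH repository is required to support.
records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(formats_url)
print(records_url)
```

A central archive that accepts any metadata format could, under this sketch, issue one ListRecords request per format reported by ListMetadataFormats.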





Web interface to collection
Search/Browse/View metadata
Statistics for collections
Latest entries
Administrative interface
◦ For managing source repositories

NDLTD Union Archive
◦ International Collection

SCIRUS
◦ Science specific search engine



All modern Linux-based software components
Multi-tiered, simple architecture of complex components
Clean separation between components
◦ Scalability
◦ More easily customised (simply replace a component)
◦ Failure resistant

Any metadata
Simplicity (minimal dependencies)

Java/Tomcat/Lucene

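[Figure: Repository architecture. A harvester, managed through a harvester web interface, collects records from institutions into a database. The database backs three outputs: summary info, an RSS feed and an OAI-PMH data provider. These serve portals and higher-up repositories.]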
Retrieves metadata from a set of ETD repositories
◦ Via OAI-PMH interfaces

Performs incremental harvests
Performs record validation
◦ Simple validation checks

Performs twice-daily harvests
Configurable via web frontend
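The harvesting steps above can be sketched as follows. This is a minimal illustration, not the project's harvester: it assumes records are requested in oai_dc, that "simple validation" means checking for an identifier, a datestamp and a title (the slides do not say which checks are used), and that a caller-supplied function named fetch (hypothetical) performs the HTTP request and returns the response XML. The from argument and the resumptionToken element are part of OAI-PMH itself.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        header = rec.find(OAI + "header")
        records.append({
            "identifier": header.findtext(OAI + "identifier"),
            "datestamp": header.findtext(OAI + "datestamp"),
            "title": rec.findtext(".//" + DC + "title"),
        })
    token = root.findtext(".//" + OAI + "resumptionToken")
    # A missing or empty token means this was the last page.
    return records, (token or None)


def is_valid(record):
    # Assumed checks only: every field must be present and non-empty.
    return all(record.get(f) for f in ("identifier", "datestamp", "title"))


def harvest(fetch, last_harvest=None):
    """Harvest records changed since last_harvest.

    fetch(params) is a hypothetical callable returning response XML.
    Returns (valid_records, newest_datestamp_seen).
    """
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if last_harvest:
        # Incremental harvest: only records changed on or after this date.
        params["from"] = last_harvest
    kept, newest = [], last_harvest
    while True:
        records, token = parse_list_records(fetch(params))
        for rec in records:
            if is_valid(rec):
                kept.append(rec)
                newest = max(newest or "", rec["datestamp"])
        if not token:
            return kept, newest
        # A resumption token replaces all other arguments except the verb.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Storing the returned datestamp and passing it as last_harvest on the next run gives incremental behaviour. Because from is inclusive, records carrying that exact datestamp are fetched again, so the store should overwrite rather than duplicate them.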

Provides machine access points to the harvested metadata
◦ OAI-PMH interface

Can use any SQL-compliant DB
◦ Our implementation used MySQL

Additional services provided
◦ RSS feed of latest records
◦ Summary statistics for records from each institution

Designed to fit into a hierarchy of OAI-PMH compliant DLs
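To make the storage and the additional services more concrete, here is a small sketch of a metadata-only record table with a per-institution summary query and a latest-records query (the kind of data an RSS feed would draw on). It uses Python's built-in sqlite3 as a stand-in for MySQL; the table layout, column names and sample rows are assumptions for illustration, not the project's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute("""
    CREATE TABLE records (
        identifier  TEXT,
        institution TEXT,
        prefix      TEXT,   -- metadata format, e.g. oai_dc
        datestamp   TEXT,
        metadata    TEXT,   -- raw metadata XML; no digital objects
        PRIMARY KEY (identifier, prefix)
    )""")


def store(conn, rec):
    """Insert a record, replacing any earlier copy in the same format."""
    conn.execute("DELETE FROM records WHERE identifier = ? AND prefix = ?",
                 (rec[0], rec[2]))
    conn.execute("INSERT INTO records VALUES (?, ?, ?, ?, ?)", rec)


def summary(conn):
    """Number of stored records per institution."""
    return conn.execute(
        "SELECT institution, COUNT(*) FROM records "
        "GROUP BY institution ORDER BY institution").fetchall()


def latest(conn, limit=10):
    """Most recently changed records, newest first."""
    return conn.execute(
        "SELECT identifier, datestamp FROM records "
        "ORDER BY datestamp DESC, identifier LIMIT ?", (limit,)).fetchall()


# Hypothetical sample rows; re-storing oai:x:1 does not duplicate it.
for row in [
    ("oai:x:1", "Institution X", "oai_dc", "2011-01-02", "<dc/>"),
    ("oai:x:2", "Institution X", "oai_dc", "2011-03-04", "<dc/>"),
    ("oai:y:1", "Institution Y", "oai_dc", "2011-02-03", "<dc/>"),
    ("oai:x:1", "Institution X", "oai_dc", "2011-01-02", "<dc/>"),
]:
    store(conn, row)

print(summary(conn))
print(latest(conn, 2))
```

Keying on identifier plus metadata format lets the same thesis be held in several formats, in line with the "any metadata" goal.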
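[Figure: Portal architecture. The portal's harvester, managed through a web admin interface, pulls records from the repository into the portal database. Lucene indexes the records, and the portal web interface provides search, browse and statistics. RSS is also shown.]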



Harvests from repository into portal DB
Lucene indexes records
Portal provides human interface
◦ Allows keyword searching, browsing, category searches
◦ Also offers links to OAI-PMH and RSS interfaces
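The portal itself relies on Lucene for indexing. The sketch below is not Lucene and does not use its API; it is a toy inverted index in plain Python, included only to illustrate the idea behind keyword search over harvested records. The record ids and titles are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each term to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for term in tokenize(text):
            index[term].add(rec_id)
    return index


def search(index, query):
    """Return the ids of records containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical record titles, keyed by record id.
records = {
    "r1": "Digital libraries in South Africa",
    "r2": "Metadata harvesting for digital repositories",
    "r3": "Harvesting maize in South Africa",
}
index = build_index(records)
print(sorted(search(index, "south africa")))
```

A real search engine adds ranking, stemming and fielded queries on top of this basic structure, which is part of why reusing an existing component is preferable to writing one.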



Packaging into Ubuntu repository
Generic browsing categories
Content Management System
◦ Favourites, citation

Social media buttons
◦ Facebook Like, Google+

Bug fixes
Questions?
Links
• Live portal @ www.netd.ac.za
• Source code available @ http://dl.cs.uct.ac.za/projects/etd_portal