ETD-db 2.0: Rewriting ETD-db
Sung Hee Park*†, Paul Mather†, Kimberli Weeks†, Collin Brittle†, Gail McMillan†, Edward Fox*
†Digital Library and Archives, University Libraries, Virginia Tech
*Digital Library Research Lab, Computer Science, Virginia Tech
14th International Symposium on Electronic Theses & Dissertations
Sept. 15, 2011, Cape Town, South Africa
Contents
Introduction
Requirements
Design
Implementation
Tests
Summary and Future Work
Introduction
ETD-db 1.0
◦ Workflow: submit, manage, search, browse
◦ 1995/97: James Powell; 1998/2001: Tony Atkins (Technical Directors, VT Digital Library and Archives)
◦ Web applications, Perl, MySQL
◦ Improvements suggested by
Stakeholders (e.g., system admins, managers, authors)
R. Jones 2004 article compares DSpace & ETD-db
ETD-db 2.0
New version of Virginia Tech’s ETD system
Web application
Ruby on Rails
◦ Model-View-Controller-based Web app framework
◦ Any database
◦ Any server supported by Ruby on Rails
Major objectives of rewriting ETD-db
1. Improve the original, powerful functionalities
2. Handle ETD collections more reliably and securely
Use Cases
[Figure: use case diagram of the ETD-db system. Blue: new use cases. Yellow: existing use cases.]
Actors: Administrator, Manager, Cataloger, Author, Reviewer, Patron, Mail System, Host OS system
Use cases shown:
◦ Login, Show Help Pages, Display Confirmation
◦ Manage system, Manage users, Show users, Add users, Modify users, Delete users
◦ Submit ETD, Fill Title Page, Upload Files, Change ETD, Send Email
◦ Review ETD, Approve ETD, Withhold ETD, Catalog ETDs
◦ Manage ETD, Manage Submitted, Manage Available, Manage Withheld, Modify, Delete, View
◦ Change Availability, Move Files
◦ Search ETD, Fulltext Search, Metadata Search
◦ Browse ETD, Browse by Advisor, Browse by Department, Browse by Year, Browse by Author
◦ Report ETD Statistics, Report Usage
◦ Generate_title_pages, Generate_browse_pages, Generate html header, Generate html footer
Major Requirements
Improving single password per role
◦ ETD-db 1.0
One password per role, multiple staff per role
◦ More reliable and safer system
Role management functionalities
Finer-grained permissions
Supporting fine-grained access control
◦ ETD-db 1.0
Database-level access permission control
For example, ETD-db 1.0 has separate submitted, available, and withheld ETD databases.
◦ ETD-db 2.0
Digital object-level and action-level access permission control
Design
[Figure: class diagram.]
Classes: Person, Role, Person_Role, Permission, Action, DigitalObject, ETD, Content, Document, Audio, Video, Committee, Chair, Co-chair, Member, Keyword, Provenance, AvailabilityDescription, CopyrightStatement, DegreeDescription, DoctypeDescription, DepartmentList, UrnRegistry
Actors (from Use Case View): Administrator, Manager, Author, Reviewer, Cataloger
Designed Objects
Person
◦ Class neutral to roles
◦ Information about users
Role
◦ Class neutral to person
◦ Information about the role itself
Permission
◦ Class defines actions
Action
◦ Class describes activity like CRUD (Create, Read, Update, and Delete)
Digital Object
◦ Class represents something such as metadata or content
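The designed objects above can be pictured with a small plain-Ruby sketch. This is an illustration only, not the ETD-db 2.0 source: the Struct-based classes, the `allowed_actions` helper, and the sample names are all assumptions chosen to mirror the slide.

```ruby
# Hypothetical plain-Ruby sketch of the designed objects (not the real models).
Person        = Struct.new(:name)   # information about a user, neutral to roles
Role          = Struct.new(:name)   # information about the role itself, neutral to persons
Action        = Struct.new(:name)   # an activity such as :create, :read, :update, :delete
DigitalObject = Struct.new(:name)   # something such as metadata or content
Permission    = Struct.new(:role, :action, :digital_object)  # what a role may do to an object
PersonRole    = Struct.new(:person, :role)                   # links a person to a role

reviewer   = Role.new("reviewer")
metadata   = DigitalObject.new("metadata")
membership = PersonRole.new(Person.new("Example Staff"), reviewer)

permissions = [
  Permission.new(reviewer, Action.new(:read), metadata),
  Permission.new(reviewer, Action.new(:update), metadata)
]

# List the action names a role may perform on one digital object.
def allowed_actions(role, object, permissions)
  permissions.select { |p| p.role == role && p.digital_object == object }
             .map { |p| p.action.name }
end

p allowed_actions(membership.role, metadata, permissions)
```

Keeping Person and Role separate, with a join record between them, is what lets several staff share a role without sharing a password.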
Reference Metadata Schema
Authorization
Submission Process
Show ETDs by Author View
Implementation: Role Management
Administrative functionalities
1. Register Digital Objects
2. Register Actions
3. Register Roles
4. Assign Permissions to Roles
Authorization for Multiple Roles
1. Different users with the same role
2. Same user with multiple roles
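The four administrative steps and the two multi-role cases can be sketched as a small in-memory registry. This is a minimal sketch under stated assumptions: `AccessRegistry`, its method names, and the sample roles and users are hypothetical, and a Rails application would persist these records in database tables instead.

```ruby
require "set"

# Hypothetical in-memory registry; class and method names are illustrative only.
class AccessRegistry
  def initialize
    @objects = Set.new
    @actions = Set.new
    @roles = Set.new
    @permissions = Set.new                              # [role, action, object] triples
    @memberships = Hash.new { |h, k| h[k] = Set.new }   # user => set of roles
  end

  def register_digital_object(name)
    @objects.add(name)
  end

  def register_action(name)
    @actions.add(name)
  end

  def register_role(name)
    @roles.add(name)
  end

  # A permission may only combine things that were registered first.
  def assign_permission(role, action, object)
    unless @roles.include?(role) && @actions.include?(action) && @objects.include?(object)
      raise ArgumentError, "register the role, action, and digital object first"
    end
    @permissions.add([role, action, object])
  end

  def assign_role(user, role)
    raise ArgumentError, "unknown role: #{role}" unless @roles.include?(role)
    @memberships[user].add(role)
  end

  # Authorized when any one of the user's roles holds a matching permission.
  def authorized?(user, action, object)
    @memberships[user].any? { |role| @permissions.include?([role, action, object]) }
  end
end

registry = AccessRegistry.new
registry.register_digital_object(:etd_metadata)     # step 1
registry.register_action(:read)                     # step 2
registry.register_action(:update)
registry.register_role(:reviewer)                   # step 3
registry.register_role(:cataloger)
registry.assign_permission(:reviewer, :read, :etd_metadata)     # step 4
registry.assign_permission(:cataloger, :update, :etd_metadata)

registry.assign_role("staff_a", :reviewer)    # different users with the same role
registry.assign_role("staff_b", :reviewer)
registry.assign_role("staff_b", :cataloger)   # same user with multiple roles

puts registry.authorized?("staff_a", :read, :etd_metadata)
puts registry.authorized?("staff_a", :update, :etd_metadata)
puts registry.authorized?("staff_b", :update, :etd_metadata)
```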
Implementation: Submission Process
ETD metadata
◦ Gets information from ED-ID
◦ Gets information from Banner
◦ Gets information from VT-Specific Banner
ETD file(s)
◦ Multiple file uploads
◦ Various file types
◦ Designed as child classes
Committee members
◦ Roles
◦ Association with the Person model and the Role model
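One way to read "designed as child classes" is a common content parent with one subclass per file type. The sketch below is an assumption-laden illustration: the class names follow the design diagram (Content, Document, Audio, Video), but the extension table and the `build_content` helper are hypothetical.

```ruby
# Hypothetical content hierarchy; the extension table is an assumption.
class Content
  attr_reader :filename

  def initialize(filename)
    @filename = filename
  end

  # Lower-cased class name, e.g. "document" or "video".
  def kind
    self.class.name.downcase
  end
end

class Document < Content; end
class Audio < Content; end
class Video < Content; end

CONTENT_TYPES = {
  ".pdf" => Document,
  ".mp3" => Audio,
  ".wav" => Audio,
  ".mp4" => Video,
  ".mov" => Video
}.freeze

# Build the matching child class for an uploaded file; unknown types stay generic Content.
def build_content(filename)
  klass = CONTENT_TYPES.fetch(File.extname(filename).downcase, Content)
  klass.new(filename)
end

uploads = ["thesis.pdf", "defense.MP4", "interview.wav", "dataset.csv"]
uploads.each do |name|
  puts "#{name}: #{build_content(name).kind}"
end
```

Because every upload is some kind of Content, a single submission can carry several files of different types without special cases in the submission code.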
Test Driven Development (TDD)
A characteristic of Ruby on Rails
Quality First Model
Types of tests we are using
1. Unit tests
2. Functional tests
3. Integration tests
TDD in Ruby on Rails
Unit Tests
◦ Examine our models (objects)
◦ Models in the Model-View-Controller (MVC)
◦ Object oriented programming
Functional Tests
◦ Appropriate response to users’ requests?
Integration Tests
◦ Study users’ workflow/scenarios/usages
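As an illustration of the unit-test level, here is a minimal sketch using Minitest, the test library that ships with Ruby. The `Etd` class is a hypothetical plain-Ruby stand-in, not the project's model; in a Rails application the model would be an ActiveRecord class, and functional and integration tests would additionally exercise controllers and full workflows.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for an ETD model.
class Etd
  attr_reader :title, :author

  def initialize(title:, author:)
    @title = title
    @author = author
  end

  # An ETD record needs a non-blank title and a non-blank author.
  def valid?
    [title, author].none? { |field| field.to_s.strip.empty? }
  end
end

# Unit tests examine the model in isolation.
class EtdTest < Minitest::Test
  def test_valid_with_title_and_author
    assert Etd.new(title: "Sample Thesis", author: "Example Author").valid?
  end

  def test_invalid_without_title
    refute Etd.new(title: " ", author: "Example Author").valid?
  end
end
```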
Authentication
Authorization
Register New Staff
Register New Role
Discussion
Comparison of ETD-db 1.0 and ETD-db 2.0 by stakeholder
System Administrators
ETD-db 1.0
◦ Shares a single password per role (e.g., reviewer and system administrator)
◦ Does not support import & export functions
◦ Written in Perl scripts, which provide software libraries that depend on the back-end database
◦ Does not support transaction processing
◦ Character set encoding is not strongly enforced, which leads to inconsistent output
◦ Does not support log and audit files
◦ External PHP scripts support batch BTD loading into ETD-db 1.0
ETD-db 2.0
◦ More maintainable administrator functions like user role management
◦ New import & export function for easier migration from existing repositories
◦ Exploits the state-of-the-art agile web development paradigm (improved maintenance)
◦ Eliminates inconsistencies between the file structure and database
◦ Supports explicit UTF-8 character set encoding
◦ Records log and audit files
◦ Incorporates features for BTDs (scanned bound theses/dissertations)
Managers
ETD-db 1.0
◦ Does not support transaction processing and relevant rollback
ETD-db 2.0
◦ Safer, more reliable ‘change_availability’ process through concurrency control and transactional processing support
◦ Better approval and release notification
Authors
ETD-db 1.0
◦ Does not connect to Banner/HR system
ETD-db 2.0
◦ Connection to Banner/HR system
◦ Provides authors with better feedback about progress and submission status
◦ Supports explicit UTF-8 character set encoding
Cataloger & Reviewer
ETD-db 1.0
◦ Share a single username and password per role
◦ Hardcoded release date, and release by request from staff or authors
ETD-db 2.0
◦ User role management and authentication (LDAP)
◦ Better queue management and notification
◦ Set date for automatic release notification
◦ Turn off or extend date for automatic release
Summary: System Admin View
More maintainable administrator functions
New import & export function
Exploiting agile web development paradigm
Eliminates inconsistencies between file structure and database
Supports explicit UTF-8 character set encoding
Records log and audit files
Incorporated features for BTDs
Summary: Manager View
Safer, more reliable ‘change_availability’
◦ Concurrency control
◦ Transactional processing support
Better approval and release notification
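The idea behind a transactional ‘change_availability’ can be sketched as follows: the availability state and the file location change together or not at all, and a lock serializes concurrent changes. Everything here is a hypothetical in-memory illustration (`EtdStore`, its methods, and the `fail_on_move` switch used to simulate a failure); a Rails application would more likely rely on database transactions for the rollback.

```ruby
# Hypothetical in-memory store illustrating all-or-nothing availability changes.
class EtdStore
  STATES = [:submitted, :available, :withheld].freeze

  def initialize
    @availability = {}   # etd_id => state
    @files = {}          # etd_id => directory holding the files
    @lock = Mutex.new    # serializes concurrent changes
  end

  def add(etd_id, state)
    @availability[etd_id] = state
    @files[etd_id] = "#{state}/#{etd_id}"
  end

  def state_of(etd_id)
    @availability[etd_id]
  end

  def location_of(etd_id)
    @files[etd_id]
  end

  # Change the state and move the files together, or change nothing.
  # Returns true on success and false after a rollback.
  def change_availability(etd_id, new_state, fail_on_move: false)
    @lock.synchronize do
      snapshot = [@availability.dup, @files.dup]
      begin
        raise ArgumentError, "unknown state: #{new_state}" unless STATES.include?(new_state)
        raise KeyError, "unknown ETD: #{etd_id}" unless @availability.key?(etd_id)
        @availability[etd_id] = new_state
        raise IOError, "simulated file move failure" if fail_on_move
        @files[etd_id] = "#{new_state}/#{etd_id}"
        true
      rescue StandardError
        @availability, @files = snapshot   # roll back both pieces of state
        false
      end
    end
  end
end

store = EtdStore.new
store.add("etd-1", :withheld)
puts store.change_availability("etd-1", :available)
puts store.change_availability("etd-1", :submitted, fail_on_move: true)
puts store.state_of("etd-1")
puts store.location_of("etd-1")
```

After the simulated failure the record is still `available` and its files are still under `available/etd-1`, which is the kind of file structure and database consistency the slide refers to.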
Summary: Author View
Connection to Banner/HR system
Authors get better feedback
Supports explicit UTF-8 character set encoding
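Enforcing UTF-8 explicitly might look like the sketch below, which normalizes submitted text before it is stored. The `to_utf8` helper and the choice of ISO-8859-1 as the fallback legacy encoding are assumptions for illustration; the slides do not say how ETD-db 2.0 does this.

```ruby
# Hypothetical helper; the assumed legacy encoding (ISO-8859-1) is illustrative.
def to_utf8(text, legacy_encoding: Encoding::ISO_8859_1)
  candidate = text.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?

  # The bytes are not valid UTF-8: reinterpret them in the legacy encoding and transcode.
  text.dup.force_encoding(legacy_encoding).encode(Encoding::UTF_8)
end

legacy = "Universit\xE9".b        # raw bytes as a Latin-1 form submission might send them
clean = to_utf8(legacy)

puts clean.encoding
puts clean.valid_encoding?
puts clean == "Universit\u00E9"
```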
Summary: Cataloger & Reviewer View
User role management and authentication
Better queue management and notification
Set date for automatic release notification
Turn off or extend date for automatic release
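The release-date options above (set a date, extend it, or turn automatic release off) can be sketched with Ruby's standard `Date` class. The `WithheldEtd` class, its method names, and the sample dates are hypothetical and only illustrate the behavior named on the slide.

```ruby
require "date"

# Hypothetical record for a withheld ETD; field and method names are illustrative.
class WithheldEtd
  attr_reader :release_on

  def initialize(release_on)
    @release_on = release_on
  end

  # Push the automatic release date back by a number of months.
  def extend_release(months)
    @release_on = @release_on >> months if @release_on
  end

  # Turn automatic release off entirely.
  def cancel_release
    @release_on = nil
  end

  # True when the ETD is due for release (and notification) on the given day.
  def due?(today)
    !@release_on.nil? && today >= @release_on
  end
end

etd = WithheldEtd.new(Date.new(2012, 9, 15))
puts etd.due?(Date.new(2012, 9, 15))
etd.extend_release(12)
puts etd.release_on
puts etd.due?(Date.new(2012, 9, 15))
etd.cancel_release
puts etd.due?(Date.new(2020, 1, 1))
```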
Conclusion & Future Work
ETD-db 2.0
◦ Improved ETD-db reliability and security
◦ Benefits all stakeholders
◦ State of the art Web development framework
Security
◦ Fine-grained access control and increased audit logging
◦ Eliminate inconsistencies between file structure and database
◦ More reliable content management
Increase the consistency between contents and their metadata
Ensure content integrity
Version control
Plans and Future Work
◦ Access and integrate Banner system
◦ Interview more users
◦ Implement audit logging and provenance
◦ Design import and export functions
19,315 VT ETDs as of Sept. 7, 2011
e.g., ETD-db (1997-2011) works really, really well
◦ 10,051 accessible worldwide [240 BTDs]: 52%
◦ 8,551 VT-only access [7,089 BTDs]: 44%
◦ 611 withheld from access: 3%
◦ 102 mixed access: 1%
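The collection figures can be cross-checked with a few lines of arithmetic. The counts below are taken from the slide; the short labels and variable names are only for this illustration.

```ruby
# Counts per access category, as reported on the slide.
counts = {
  "accessible worldwide" => 10_051,
  "VT-only access"       => 8_551,
  "withheld from access" => 611,
  "mixed access"         => 102
}

total = counts.values.sum
puts total

# Print each category with its share of the whole collection.
counts.each do |label, n|
  puts format("%-22s %6d  %5.1f%%", label, n, 100.0 * n / total)
end
```

The four categories add up to the reported 19,315 ETDs, and each share rounds to the whole-number percentage shown on the slide.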
ETD-db 2.0
Comments? Questions?
Contact
Sung Hee Park
[email protected]