ETD-db 2.0: Rewriting ETD-db
Sung Hee Park*†, Paul Mather†, Kimberli Weeks†, Collin Brittle†, Gail McMillan†, Edward Fox*
†Digital Library and Archives, University Libraries, Virginia Tech
*Digital Library Research Lab, Computer Science, Virginia Tech
14th International Symposium on Electronic Theses & Dissertations
Sept. 15, 2011, Cape Town, South Africa
Contents
• Introduction
• Requirements
• Design
• Implementation
• Tests
• Summary and Future Work

Introduction

ETD-db 1.0
◦ Workflow: submit, manage, search, browse
◦ 1995/97: James Powell; 1998/2001: Tony Atkins
▪ Technical Directors, VT Digital Library and Archives
◦ Web applications, Perl, MySQL
◦ Improvements suggested by
▪ Stakeholders (e.g., system admins, managers, authors)
▪ R. Jones's 2004 article comparing DSpace and ETD-db
ETD-db 2.0
• New version of Virginia Tech’s ETD system
• Web application
• Ruby on Rails
◦ Model-View-Controller-based Web app framework
◦ Any database
◦ Any server supported by Ruby on Rails

Major objectives of rewriting ETD-db
1. Improve the original, powerful functionalities
2. Handle ETD collections more reliably and securely
Use Cases
[Figure: use case diagram of the ETD-db system. Blue: new use cases; yellow: existing use cases.]
Actors: Administrator, Manager, Cataloger, Author, Reviewer, Patron, Mail System, Host OS system
Use cases: Login, Submit ETD, Fill Title Page, Upload Files, Change ETD, Review ETD, Approve ETD, Withhold ETD, Catalog ETDs, Change Availability, Move Files, Manage ETD, Manage Submitted, Manage Available, Manage Withheld, Modify, Delete, View, Search ETD, Fulltext Search, Metadata Search, Browse ETD, Browse by Author, Browse by Advisor, Browse by Department, Browse by Year, Manage system, Manage users, Add users, Show users, Modify users, Delete users, Send Email, Report ETD Statistics, Report Usage, Show Help Pages, Display Confirmation, Generate_title_pages, Generate_browse_pages, Generate html header, Generate html footer
Major Requirements

• Improving on the single password per role
◦ ETD-db 1.0
▪ One password per role, shared by multiple staff in that role
◦ More reliable and safer system
▪ Role management functionalities
▪ Finer-grained permissions

• Supporting fine-grained access control
◦ ETD-db 1.0
▪ Database-level access permission control
▪ For example, ETD-db 1.0 has separate databases for submitted, available, and withheld ETDs.
◦ ETD-db 2.0
▪ Digital object-level and action-level access permission
Design
[Figure: class diagram.]
Classes: Person, Role, Person_Role, Permission, Action, DigitalObject, ETD, Content, Document, Audio, Video, Committee, Chair, Co-chair, Member, Keyword, Provenance, AvailabilityDescription, CopyrightStatement, DegreeDescription, DoctypeDescription, DepartmentList, UrnRegistry
Actors (from Use Case View): Administrator, Manager, Author, Reviewer, Cataloger
Designed Objects

Person
◦ Class neutral to roles
◦ Information about users

Role
◦ Class neutral to person
◦ Information about the role itself

Permission
◦ Class defines actions

Action
◦ Class describes an activity such as CRUD (Create, Read, Update, and Delete)

Digital Object
◦ Class represents something such as metadata or content
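
The relationship between these classes can be illustrated with a small plain-Ruby sketch. It is not the ETD-db 2.0 source: the attribute names (`name`, `email`, `description`) and the helper methods `roles_of` and `people_with` are assumptions made for illustration. It shows Person and Role staying neutral to each other, linked only through a Person_Role association, with a Permission pairing an Action with a Digital Object.

```ruby
# Hypothetical plain-Ruby sketch of the designed objects. The attribute
# names are assumptions; the real models are Rails models.
Person        = Struct.new(:name, :email)       # holds user information only
Role          = Struct.new(:name, :description) # holds role information only
Action        = Struct.new(:name)               # e.g. create, read, update, delete
DigitalObject = Struct.new(:name)               # e.g. metadata or content
Permission    = Struct.new(:role, :action, :digital_object)
PersonRole    = Struct.new(:person, :role)      # links a person to a role

# Roles held by a person, looked up through the association only.
def roles_of(person, assignments)
  assignments.select { |link| link.person == person }.map { |link| link.role.name }
end

# People holding a role, looked up the same way from the other side.
def people_with(role, assignments)
  assignments.select { |link| link.role == role }.map { |link| link.person.name }
end

author   = Role.new("author", "Submits an ETD")
reviewer = Role.new("reviewer", "Reviews submitted ETDs")
alice    = Person.new("Alice", "alice@example.edu")
bob      = Person.new("Bob", "bob@example.edu")

assignments = [
  PersonRole.new(alice, author),
  PersonRole.new(alice, reviewer),
  PersonRole.new(bob, reviewer)
]

# A permission names a role, an action, and the digital object it applies to.
review = Permission.new(reviewer, Action.new("read"), DigitalObject.new("metadata"))

p roles_of(alice, assignments)
p people_with(reviewer, assignments)
puts "#{review.role.name} may #{review.action.name} #{review.digital_object.name}"
```

Because neither Person nor Role refers to the other, adding a role or a staff member only adds association records.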
Reference Metadata Schema
Authorization
Submission Process
Show ETDs by Author View
Implementation: Role Management

Administrative functionalities
1. Register Digital Objects
2. Register Actions
3. Register Roles
4. Assign Permissions to Roles
Authorization for Multiple Roles
1. Different users with the same role
2. Same user with multiple roles
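
The four administrative steps and the two multiple-role cases can be sketched as follows. This is a minimal plain-Ruby illustration under stated assumptions, not the ETD-db 2.0 implementation: the `AccessControl` class, its method names, and the sample role and object names are all hypothetical.

```ruby
require "set"

# Hypothetical registry mirroring the administrative steps:
# register digital objects, actions, and roles, then assign permissions.
class AccessControl
  def initialize
    @digital_objects = Set.new
    @actions         = Set.new
    @roles           = Set.new
    @permissions     = {} # role => Set of [action, digital_object]
    @assignments     = {} # user => Set of roles
  end

  def register_digital_object(name)
    @digital_objects << name
  end

  def register_action(name)
    @actions << name
  end

  def register_role(name)
    @roles << name
  end

  # Only registered roles, actions, and objects may appear in a permission.
  def permit(role, action, digital_object)
    raise ArgumentError, "unknown role: #{role}" unless @roles.include?(role)
    raise ArgumentError, "unknown action: #{action}" unless @actions.include?(action)
    raise ArgumentError, "unknown object: #{digital_object}" unless @digital_objects.include?(digital_object)
    (@permissions[role] ||= Set.new) << [action, digital_object]
  end

  def assign(user, role)
    raise ArgumentError, "unknown role: #{role}" unless @roles.include?(role)
    (@assignments[user] ||= Set.new) << role
  end

  # A user is allowed if any one of their roles carries the permission.
  def allowed?(user, action, digital_object)
    @assignments.fetch(user, []).any? do |role|
      @permissions.fetch(role, Set.new).include?([action, digital_object])
    end
  end
end

acl = AccessControl.new
acl.register_digital_object(:etd_metadata)
acl.register_action(:read)
acl.register_action(:update)
acl.register_role(:reviewer)
acl.register_role(:cataloger)
acl.permit(:reviewer, :read, :etd_metadata)
acl.permit(:cataloger, :update, :etd_metadata)

acl.assign("staff_a", :reviewer)
acl.assign("staff_b", :reviewer)  # different users with the same role
acl.assign("staff_b", :cataloger) # same user with multiple roles

puts acl.allowed?("staff_a", :read, :etd_metadata)
puts acl.allowed?("staff_a", :update, :etd_metadata)
puts acl.allowed?("staff_b", :update, :etd_metadata)
```

Checking permissions per action and per digital object, rather than per database, is what makes the control fine-grained in this sketch.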
Implementation: Submission Process
• ETD metadata
◦ Gets information from ED-ID
◦ Gets information from Banner
◦ Gets information from VT-Specific Banner
• ETD file(s)
◦ Multiple file uploads
◦ Various file types
◦ Designed as child classes
• Committee members
◦ Roles
◦ Association with the Person model and the Role model
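
One way to read "designed as child classes" is a base class for uploaded content with one subclass per file type. The sketch below assumes that reading. The class names Content, Document, Audio, and Video come from the class diagram, while the `from_filename` method and the extension table are hypothetical.

```ruby
# Hypothetical sketch: a base class for uploaded files with one child
# class per file type. The extension table is an illustrative assumption.
class Content
  attr_reader :filename

  def initialize(filename)
    @filename = filename
  end

  # Choose the child class from the file extension; unknown types
  # fall back to the generic Content class.
  def self.from_filename(filename)
    extension = File.extname(filename).downcase
    class_by_extension.fetch(extension, Content).new(filename)
  end

  def self.class_by_extension
    {
      ".pdf" => Document,
      ".mp3" => Audio,
      ".wav" => Audio,
      ".mp4" => Video,
      ".mov" => Video
    }
  end
end

class Document < Content; end
class Audio < Content; end
class Video < Content; end

# Multiple files of various types uploaded with one submission.
uploads = ["thesis.pdf", "defense.mp4", "interview.WAV", "data.zip"]
uploads.each do |name|
  puts "#{name} -> #{Content.from_filename(name).class}"
end
```

Type-specific behavior can then live in the child classes while the upload code handles every file as a Content.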
Test Driven Development (TDD)
• A characteristic of Ruby on Rails
• Quality First Model
• Types of tests we are using
1. Unit tests
2. Functional tests
3. Integration tests
TDD in Ruby on Rails

Unit Tests
◦ Examine our models (objects)
◦ Models in the Model-View-Controller (MVC)
◦ Object oriented programming

Functional Tests
◦ Appropriate response to users’ requests?

Integration Tests
◦ Study users’ workflow/scenarios/usages
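
A unit test in this style might look like the sketch below. It uses Minitest, which is distributed with Ruby, and a plain-Ruby stand-in for a model. The `Etd` class and its `valid?` rule (title and author required) are assumptions for illustration; the actual ETD-db 2.0 models and their validations are not shown in these slides.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a model; in the Rails application this
# would be a model class with its own validations.
class Etd
  attr_accessor :title, :author

  def initialize(title: nil, author: nil)
    @title = title
    @author = author
  end

  # Assumed rule: an ETD needs a non-blank title and author.
  def valid?
    [title, author].none? { |value| value.to_s.strip.empty? }
  end
end

# Unit test: examines the model on its own, without controllers or views.
class EtdTest < Minitest::Test
  def test_valid_with_title_and_author
    assert Etd.new(title: "ETD-db 2.0", author: "S. H. Park").valid?
  end

  def test_invalid_without_title
    refute Etd.new(author: "S. H. Park").valid?
  end

  def test_invalid_with_blank_author
    refute Etd.new(title: "ETD-db 2.0", author: "   ").valid?
  end
end
```

Functional and integration tests follow the same pattern at a larger scope: the former exercise a single request and its response, the latter a sequence of requests that makes up a user's workflow.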
Authentication
Authorization
Register New Staff
Register New Role
Discussion

• System Administrators
◦ ETD-db 1.0
▪ Shares a single password per role (e.g., reviewer and system administrator)
▪ Does not support import and export functions
▪ Written as Perl scripts that provide software libraries dependent on the back-end database
▪ Does not support transaction processing
▪ Character set encoding is not strongly enforced, which leads to inconsistent output
▪ Does not support log and audit files
▪ External PHP scripts support batch BTD loading into ETD-db 1.0
◦ ETD-db 2.0
▪ More maintainable administrator functions, such as user role management
▪ New import and export functions for easier migration from existing repositories
▪ Exploits the state-of-the-art agile web development paradigm for improved maintenance
▪ Eliminates inconsistencies between the file structure and the database
▪ Supports explicit UTF-8 character set encoding
▪ Records log and audit files
▪ Incorporates features for BTDs (scanned bound theses/dissertations)

• Managers
◦ ETD-db 1.0
▪ Does not support transaction processing and the relevant rollback
◦ ETD-db 2.0
▪ Safer, more reliable ‘change_availability’ process through concurrency control and transactional processing support
▪ Better approval and release notification

• Authors
◦ ETD-db 1.0
▪ Does not connect to the Banner/HR system
◦ ETD-db 2.0
▪ Connection to the Banner/HR system
▪ Provides authors with better feedback about progress and submission status
▪ Supports explicit UTF-8 character set encoding

• Cataloger & Reviewer
◦ ETD-db 1.0
▪ Shares a single username and password per role
▪ Hardcoded release date; release on request from staff or authors
◦ ETD-db 2.0
▪ User role management and authentication (LDAP)
▪ Better queue management and notification
▪ Set date for automatic release notification
▪ Turn off or extend date for automatic release
Summary: System Admin View
• More maintainable administrator functions
• New import & export function
• Exploiting agile web development paradigm
• Eliminates inconsistencies between file structure and database
• Supports explicit UTF-8 character set encoding
• Record log and audit files
• Incorporated features for BTDs

Summary: Manager View

Safer, more reliable ‘change_availability’
◦ Concurrency control
◦ Transactional processing support

Better approval and release notification
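
The combination of concurrency control and transactional processing for ‘change_availability’ can be illustrated with a small plain-Ruby sketch. It is not the ETD-db 2.0 code: the `EtdStore` class, its in-memory hashes, and the block that stands in for moving files are assumptions, and a Rails application would more likely rely on database transactions. The point is only that the availability record and the file location change together or not at all.

```ruby
# Hypothetical sketch: a lock gives one-at-a-time changes, and a rescue
# clause rolls back so metadata and file location never disagree.
class EtdStore
  STATES = [:submitted, :available, :withheld].freeze

  def initialize
    @lock = Mutex.new
    @availability = {} # etd_id => state recorded in the database
    @location = {}     # etd_id => directory holding the files
  end

  def add(etd_id, state)
    @lock.synchronize do
      @availability[etd_id] = state
      @location[etd_id] = state.to_s
    end
  end

  def state_of(etd_id)
    @lock.synchronize { [@availability[etd_id], @location[etd_id]] }
  end

  # The optional block stands in for moving files on disk and may raise.
  def change_availability(etd_id, new_state)
    raise ArgumentError, "unknown state: #{new_state}" unless STATES.include?(new_state)

    @lock.synchronize do # concurrency control: one change at a time
      old_state = @availability.fetch(etd_id)
      old_location = @location.fetch(etd_id)
      begin
        @availability[etd_id] = new_state
        yield if block_given?
        @location[etd_id] = new_state.to_s
      rescue StandardError
        @availability[etd_id] = old_state # roll back both parts
        @location[etd_id] = old_location
        raise
      end
    end
  end
end

store = EtdStore.new
store.add("etd-1", :submitted)
store.change_availability("etd-1", :available)

begin
  store.change_availability("etd-1", :withheld) { raise IOError, "disk full" }
rescue IOError
  # The failed change was rolled back.
end

p store.state_of("etd-1")
```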
Summary: Author View
• Connection to Banner/HR system
• Authors get better feedback
• Supports explicit UTF-8 character set encoding
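
Explicit UTF-8 enforcement could take roughly the following form. This is a sketch under assumptions: the `to_utf8` helper is hypothetical, and the choice of ISO-8859-1 as the legacy encoding is only an example. Input is either declared to be in a known source encoding and converted, or it must already be valid UTF-8.

```ruby
# Hypothetical helper: return the text as valid UTF-8, converting from a
# declared legacy encoding when one is given, and rejecting anything else.
def to_utf8(raw, source_encoding = nil)
  text = raw.dup # work on a copy so the caller's string is untouched
  if source_encoding
    text = text.force_encoding(source_encoding).encode(Encoding::UTF_8)
  else
    text.force_encoding(Encoding::UTF_8)
  end
  raise ArgumentError, "input is not valid UTF-8" unless text.valid_encoding?
  text
end

# Bytes as a legacy Latin-1 system might have stored them (assumed example).
legacy_bytes = "Universit\xE9".b
converted = to_utf8(legacy_bytes, Encoding::ISO_8859_1)
puts converted.encoding
puts converted.valid_encoding?

# Without a declared encoding, the same bytes are rejected instead of
# being passed through and displayed inconsistently.
begin
  to_utf8(legacy_bytes)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```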

Summary: Cataloger & Reviewer View
• User role management and authentication
• Better queue management and notification
• Set date for automatic release notification
• Turn off or extend date for automatic release

Conclusion & Future Work

ETD-db 2.0
◦ Improved ETD-db reliability and security
◦ Benefits all stakeholders
◦ State of the art Web development framework

Security
◦ Fine-grained access control and increased audit logging
◦ Eliminate inconsistencies between file structure and database
◦ More reliable content management
▪ Increase the consistency between contents and their metadata
▪ Ensure content integrity
▪ Version control

Plans and Future Work
◦ Access and integrate Banner system
◦ Interview more users
◦ Implement audit logging and provenance
◦ Design import and export functions
19,315 VT ETDs as of Sept. 7, 2011
e.g., ETD-db (1997-2011) works really, really well
◦ 10,051 accessible worldwide [240 BTDs]: 52%
◦ 8,551 VT-only access [7,089 BTDs]: 44%
◦ 611 withheld from access: 3%
◦ 102 mixed access: 1%
ETD-db 2.0
Comments? Questions?
Contact
Sung Hee Park
[email protected]