ETD-db 2.0: Rewriting ETD-db
Sung Hee Park*†
Paul Mather†, Kimberli Weeks†, Collin Brittle†, Gail McMillan†, Edward Fox*
†Digital Library and Archives, University Libraries, Virginia Tech
*Digital Library Research Lab, Computer Science, Virginia Tech
14th International Symposium on Electronic Theses & Dissertations
Sept. 15, 2011, Cape Town, South Africa
Contents
◦ Introduction
◦ Requirements
◦ Design
◦ Implementation
◦ Tests
◦ Summary and Future Work
Introduction
ETD-db 1.0
◦ Workflow: submit, manage, search, browse
◦ 1995/97: James Powell; 1998/2001: Tony Atkins; Technical Directors, VT Digital Library and Archives
◦ Web applications, Perl, MySQL
◦ Improvements suggested by stakeholders (e.g., system admins, managers, authors)
R. Jones 2004 article compares DSpace & ETD-db
ETD-db 2.0
New version of Virginia Tech’s ETD system
Web application
Ruby on Rails
◦ Model-View-Controller-based Web app framework
◦ Any database
◦ Any server supported by Ruby on Rails
Major objectives of rewriting ETD-db
1. Improve the original, powerful functionalities
2. Handle ETD collections more reliably and securely
Use Cases
[Use case diagram. Yellow: existing use cases. Blue: new use cases.]
System: ETD-db system
Actors: Patron, Author, Reviewer, Cataloger, Manager, Administrator
External systems: Mail System, Host OS system
Use cases:
◦ Login
◦ Browse ETD: Browse by Advisor, Browse by Author, Browse by Department, Browse by Year
◦ Search ETD: Fulltext Search, Metadata Search
◦ Submit ETD: Fill Title Page, Upload Files, Show Help Pages, Display Confirmation, Send Email
◦ Change ETD: View, Delete, Modify
◦ Review ETD: Approve ETD, Withhold ETD
◦ Catalog ETDs
◦ Manage ETD: Manage Available, Manage Submitted, Manage Withheld, Change Availability, Move Files
◦ Manage system: Generate html header, Generate html footer, Generate_title_pages, Generate_browse_pages, Report Usage, Report ETD Statistics
◦ Manage users: Show users, Add users, Modify users, Delete users
◦ Addison
Major Requirements
Improving single password per role
◦ ETD-db 1.0: one password per role, multiple staff per role
◦ More reliable and safer system: role management functionalities, finer-grained permissions
Supporting fine-grained access control
◦ ETD-db 1.0: database-level access permission control. For example, ETD-db 1.0 has submitted, available, and withheld ETD databases.
◦ ETD-db 2.0: digital object-level and action-level access permission
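The difference between the two access models can be sketched in plain Ruby. This is an illustration only, not code from ETD-db: the constants `DB_LEVEL_ACCESS` and `OBJECT_LEVEL_ACCESS`, the two helper methods, and the sample roles and grants are all hypothetical.

```ruby
require "set"

# ETD-db 1.0 style (hypothetical data): a role is granted a whole database.
DB_LEVEL_ACCESS = {
  "reviewer" => Set["submitted"],
  "manager"  => Set["submitted", "available", "withheld"]
}.freeze

# ETD-db 2.0 style (hypothetical data): each grant names a role, an action,
# and a kind of digital object.
OBJECT_LEVEL_ACCESS = Set[
  ["reviewer", :read,   "metadata"],
  ["reviewer", :update, "metadata"],
  ["reviewer", :read,   "content"],
  ["manager",  :delete, "content"]
].freeze

# Coarse check: may this role touch anything in this database?
def db_level_allowed?(role, database)
  DB_LEVEL_ACCESS.fetch(role, Set.new).include?(database)
end

# Fine-grained check: may this role perform this action on this object?
def object_level_allowed?(role, action, digital_object)
  OBJECT_LEVEL_ACCESS.include?([role, action, digital_object])
end
```

With the database-level table, a reviewer who can reach the submitted database can do anything there. With the object-level table, the same reviewer may update metadata but has no grant to delete content.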
Design
[Class diagram]
Actors (from Use Case View): Author, Reviewer, Cataloger, Administrator, Manager
Classes: ETD, Content, Document, Audio, Video, Committee, Chair, Co-chair, Member, Keyword, Provenance, Person, Person_Role, Role, Permission, Action, DigitalObject, AvailabilityDescription, CopyrightStatement, DegreeDescription, DoctypeDescription, DepartmentList, UrnRegistry
Designed Objects
Person
◦ Class neutral to roles
◦ Information about users
Role
◦ Class neutral to person
◦ Information about the role itself
Permission
◦ Class defines actions
Action
◦ Class describes activity like CRUD (Create, Read, Update, and Delete)
Digital Object
◦ Class represents something such as metadata or content
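One way to picture how these five classes relate is with plain Ruby structs. This is a hedged sketch rather than the actual ETD-db 2.0 models (which would be Rails models backed by a database). The field names and the sample constants `READ`, `METADATA`, `REVIEWER`, and `ALICE` are assumptions for illustration, as is the choice to pair an action with a digital object inside `Permission`.

```ruby
# Hypothetical plain-Ruby stand-ins for the designed classes.
Person        = Struct.new(:name, :roles)            # information about a user only
Role          = Struct.new(:name, :permissions)      # information about the role itself
Action        = Struct.new(:name)                    # e.g. :create, :read, :update, :delete
DigitalObject = Struct.new(:name)                    # e.g. "metadata" or "content"
Permission    = Struct.new(:action, :digital_object) # assumed: an action on an object

# Sample data (assumed): a reviewer role that may read metadata.
READ     = Action.new(:read)
METADATA = DigitalObject.new("metadata")
REVIEWER = Role.new("reviewer", [Permission.new(READ, METADATA)])
ALICE    = Person.new("Alice", [REVIEWER])
```

Because `Person` holds only a list of roles and `Role` holds only a list of permissions, either side can change without touching the other. That is what "neutral" means in the list above.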
Reference Metadata Schema
Authorization
Submission Process
Show ETDs by Author View
Implementation: Role Management
Administrative functionalities
1. Register Digital Objects
2. Register Actions
3. Register Roles
4. Assign Permissions to Roles
Authorization for Multiple Roles
1. Different users with the same role
2. Same user with multiple roles
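The four administrative steps and the two multiple-role cases can be sketched as a small in-memory registry. The class `RoleRegistry` and all its method names are hypothetical. The real application would persist this data through Rails models rather than hold it in memory.

```ruby
require "set"

# Hypothetical in-memory registry following the four administrative steps.
class RoleRegistry
  def initialize
    @digital_objects = Set.new
    @actions = Set.new
    @grants = {}   # role name => Set of [action, digital object]
    @assignments = Hash.new { |hash, user| hash[user] = Set.new }
  end

  def register_digital_object(name)   # step 1
    @digital_objects << name
  end

  def register_action(name)           # step 2
    @actions << name
  end

  def register_role(name)             # step 3
    @grants[name] ||= Set.new
  end

  def assign_permission(role, action, digital_object)   # step 4
    raise ArgumentError, "unknown role" unless @grants.key?(role)
    raise ArgumentError, "unknown action" unless @actions.include?(action)
    unless @digital_objects.include?(digital_object)
      raise ArgumentError, "unknown digital object"
    end
    @grants[role] << [action, digital_object]
  end

  def assign_role(user, role)
    raise ArgumentError, "unknown role" unless @grants.key?(role)
    @assignments[user] << role
  end

  # A user is authorized if any one of their roles carries the grant.
  def authorized?(user, action, digital_object)
    @assignments.fetch(user, Set.new).any? do |role|
      @grants[role].include?([action, digital_object])
    end
  end
end
```

Two staff members assigned the same role receive identical answers from `authorized?`, each under their own account. A user assigned two roles is authorized for the union of both roles' grants.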
Implementation: Submission Process
ETD metadata
◦ Gets information from ED-ID
◦ Gets information from Banner
◦ Gets information from VT-Specific Banner
ETD file(s)
◦ Multiple file uploads
◦ Various file types
◦ Designed as child classes
Committee members
◦ Roles
◦ Association with the Person model and the Role model
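The "designed as child classes" point for ETD files can be illustrated with a small hierarchy. The class names `Content`, `Document`, `Audio`, and `Video` appear in the design diagram. Everything else here is an assumption made for the sketch: the extension table `CONTENT_CLASSES`, the helpers `build_content` and `build_contents`, and the idea of choosing a class by file extension.

```ruby
# Base class for an uploaded ETD file (fields are assumed for illustration).
class Content
  attr_reader :filename

  def initialize(filename)
    @filename = filename
  end

  # Lower-cased class name, e.g. "document" or "video".
  def kind
    self.class.name.downcase
  end
end

# Child classes, one per file type.
class Document < Content; end
class Audio < Content; end
class Video < Content; end

# Assumed mapping from file extension to child class, for illustration only.
CONTENT_CLASSES = {
  ".pdf" => Document,
  ".mp3" => Audio,
  ".mp4" => Video
}.freeze

# Pick the child class for one file; unknown types fall back to Content.
def build_content(filename)
  klass = CONTENT_CLASSES.fetch(File.extname(filename).downcase, Content)
  klass.new(filename)
end

# Multiple file uploads for one ETD become a list of typed objects.
def build_contents(filenames)
  filenames.map { |name| build_content(name) }
end
```

Type-specific behavior can then live in the child class while the submission workflow handles every upload as a `Content`.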
Test Driven Development (TDD)
Characteristic of Ruby on Rails
Quality First Model
Types of tests we are using
1. Unit tests
2. Functional tests
3. Integration tests
TDD in Ruby on Rails
Unit Tests
◦ Examine our models (objects)
◦ Models in the Model-View-Controller (MVC)
◦ Object oriented programming
Functional Tests
◦ Appropriate response to users’ requests?
Integration Tests
◦ Study users’ workflow/scenarios/usages
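As an illustration of the unit-test level, here is a minimal Minitest sketch. The `Etd` class is a hypothetical plain-Ruby model, not the actual ETD-db 2.0 model, and the three availability states are borrowed from the ETD-db 1.0 databases as an assumption.

```ruby
require "minitest"

# Hypothetical model under test; not the actual ETD-db 2.0 class.
class Etd
  STATES = %w[submitted available withheld].freeze
  attr_reader :title, :availability

  def initialize(title, availability = "submitted")
    @title = title
    @availability = availability
  end

  # Valid when the title is non-blank and the state is a known one.
  def valid?
    !title.to_s.strip.empty? && STATES.include?(availability)
  end
end

# Unit test: examines the model on its own, without any web request.
class EtdTest < Minitest::Test
  def test_new_etd_defaults_to_submitted
    assert_equal "submitted", Etd.new("A Thesis").availability
  end

  def test_blank_title_is_invalid
    refute Etd.new("  ").valid?
  end

  def test_unknown_availability_is_invalid
    refute Etd.new("A Thesis", "lost").valid?
  end
end
```

Swapping `require "minitest"` for `require "minitest/autorun"` runs the tests when the script exits. In a Rails application the test class would typically inherit from `ActiveSupport::TestCase` instead. Functional and integration tests would exercise controllers and multi-step workflows in the same style.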
Authentication
Authorization
Register New Staff
Register New Role
Discussion
Stakeholder: System Administrators
ETD-db 2.0
◦ More maintainable administrator functions, such as user role management
◦ New import & export function for easier migration from existing repositories
◦ Exploits the state of the art agile web development paradigm for improved maintenance
◦ Eliminates inconsistencies between the file structure and database
◦ Supports explicit UTF-8 character set encoding
◦ Records log and audit files
◦ Incorporates features for BTDs (scanned bound theses/dissertations)
ETD-db 1.0
◦ Shares a single password per role (e.g., reviewer and system administrator)
◦ Does not support import & export function
◦ Written in Perl scripts which provide software libraries depending on back end database
◦ Does not support transaction processing
◦ Character set encoding is not strongly enforced, which leads to inconsistent output
◦ Does not support log and audit files
◦ External PHP scripts support batch BTD loading into ETD-db 1.0
Stakeholder: Managers
ETD-db 2.0
◦ Safer, more reliable ‘change_availability’ process through concurrency control and transactional processing support
◦ Better approval and release notification
ETD-db 1.0
◦ Does not support transaction processing and relevant rollback
Stakeholder: Authors & Managers
ETD-db 2.0
◦ Connection to Banner/HR system
◦ Provides authors with better feedback about progress and submission status
◦ Supports explicit UTF-8 character set encoding
ETD-db 1.0
◦ Does not connect to Banner/HR system
Stakeholder: Cataloger & Reviewer
ETD-db 2.0
◦ User role management and authentication (LDAP)
◦ Better queue management and notification
◦ Set date for automatic release notification
◦ Turn off or extend date for automatic release
ETD-db 1.0
◦ Shares a single username and password per role
◦ Hardcoded release date, and release by request from staff or authors
Summary: System Admin View
◦ More maintainable administrator functions
◦ New import & export function
◦ Exploiting agile web development paradigm
◦ Eliminates inconsistencies between file structure and database
◦ Supports explicit UTF-8 character set
◦ Record log and audit files
◦ Incorporated features for BTDs
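To show what enforcing an explicit UTF-8 character set can mean in practice, here is a small Ruby sketch. The helper `to_utf8` is hypothetical, and it assumes that any input that is not valid UTF-8 is ISO-8859-1 (Latin-1). A real deployment would need to know or detect the actual source encoding.

```ruby
# Hypothetical helper: normalise incoming metadata to valid UTF-8.
# Assumes bytes that are not valid UTF-8 are ISO-8859-1 (Latin-1).
def to_utf8(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Not valid UTF-8: reinterpret the same bytes as Latin-1 and transcode.
  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end
```

Normalising every title and abstract at the point of entry is one way to avoid the inconsistent output described for ETD-db 1.0, since everything stored afterwards shares one encoding.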
Summary: Manager View
Safer, more reliable ‘change_availability’
◦ Concurrency control
◦ Transactional processing support
Better approval and release notification
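The idea behind a transactional ‘change_availability’ can be sketched with an in-memory stand-in. The class `EtdStore`, its methods, and the `fail_on_move` switch are hypothetical. The real system would rely on database transactions and actual file moves, and this sketch only mimics the all-or-nothing behavior.

```ruby
# Hypothetical in-memory stand-in for the ETD database and file structure.
class EtdStore
  def initialize
    @records = {}          # etd id => availability recorded in the database
    @files   = {}          # etd id => directory that holds the ETD's files
    @lock    = Mutex.new   # concurrency control: one change at a time
  end

  def add(id, availability)
    @records[id] = availability
    @files[id] = availability
  end

  def availability(id)
    @records[id]
  end

  def location(id)
    @files[id]
  end

  # Update the record and move the files as one unit. If any step raises,
  # both are restored to their previous values (rollback).
  def change_availability(id, new_state, fail_on_move: false)
    @lock.synchronize do
      old_record = @records.fetch(id)
      old_files  = @files.fetch(id)
      begin
        @records[id] = new_state
        raise IOError, "simulated file move failure" if fail_on_move
        @files[id] = new_state
      rescue StandardError
        @records[id] = old_record
        @files[id] = old_files
        raise
      end
    end
  end
end
```

If the simulated move fails, the record returns to its earlier state, so the database and the file structure never disagree about where an ETD lives.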
Summary: Author View
◦ Connection to Banner/HR system
◦ Authors get better feedback
◦ Supports explicit UTF-8 character set encoding
Summary: Cataloger & Reviewer View
◦ User role management and authentication
◦ Better queue management and notification
◦ Set date for automatic release notification
◦ Turn off or extend date for automatic release
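The release-date controls listed above (set, extend, turn off) could look roughly like the following. The class `ReleaseSchedule` and its methods are hypothetical and stand in for whatever the application stores per withheld ETD.

```ruby
require "date"

# Hypothetical record of a withheld ETD's scheduled automatic release.
class ReleaseSchedule
  attr_reader :release_on

  def initialize(release_on)
    @release_on = release_on   # a Date, or nil when automatic release is off
  end

  # Push the release date back by a number of months.
  def extend_by_months(months)
    raise "automatic release is turned off" if @release_on.nil?
    @release_on = @release_on >> months
  end

  def turn_off
    @release_on = nil
  end

  # True when the release (or its notification) should fire on the given day.
  def due?(today)
    !@release_on.nil? && today >= @release_on
  end
end
```

Storing the date per ETD, rather than hardcoding it, is what lets staff extend or cancel a release without editing code.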
Conclusion & Future Work
ETD-db 2.0
◦ Improved ETD-db reliability and security
◦ Benefits all stakeholders
◦ State of the art Web development framework
Security
◦ Fine-grained access control and increased audit logging
◦ Eliminate inconsistencies between file structure and database
◦ More reliable content management: increase the consistency between contents and their metadata, ensure content integrity, version control
Plans and Future Work
◦ Access and integrate Banner system
◦ Interview more users
◦ Implement audit logging and provenance
◦ Design import and export functions
19,315 VT ETDs as of Sept. 7, 2011
e.g., ETD-db (1997-2011) works really, really well
◦ 10,051 accessible worldwide [240 BTDs]: 52%
◦ 8,551 VT-only access [7,089 BTDs]: 44%
◦ 102 mixed access: 1%
◦ 611 withheld from access: 3%
ETD-db 2.0
Comments? Questions?
Contact Sung Hee [email protected]