Feb 23, 2016
ETD-db 2.0: Rewriting ETD-db
Sung Hee Park*, Paul Mather, Kimberli Weeks, Collin Brittle, Gail McMillan, Edward Fox*
Digital Library and Archives, University Libraries, Virginia Tech
*Digital Library Research Lab, Computer Science, Virginia Tech
14th International Symposium on Electronic Theses & Dissertations
Sept. 15, 2011, Cape Town, South Africa
ETD-db is the ETD database software developed at Virginia Tech, which we make freely available to members of the Networked Digital Library of Theses and Dissertations (NDLTD). ETD-db provides a workflow that enables ETD submission, approval, and access through a series of Web pages and Perl scripts that interact with a MySQL database.
The ETD database is a series of Web pages and Perl scripts that interact with a MySQL database. These scripts provide a standard interface for Web users and researchers, ETD authors, graduate school personnel, and library personnel to enter and manage the files and metadata related to a collection of electronic theses and dissertations.
Contents
  Introduction
  Requirements
  Design
  Implementation
  Tests
  Summary and Future Work
This presentation is organized into five parts. It describes the requirements of ETD-db 2.0 gathered from stakeholders, illustrates the authentication and authorization workflows and the design object diagrams, discusses the evaluation of the functions and objects in the Tests section, demonstrates usage with scenarios, and concludes with our contributions and future work plan.
Introduction
  ETD-db 1.0
    Workflow: submit, manage, search, browse
    1995/97: James Powell; 1998/2001: Tony Atkins (Technical Directors, VT Digital Library and Archives)
    Web applications, Perl, MySQL
  Improvements suggested by
    Stakeholders (e.g., system admins, managers, authors)
    R. Jones: 2004 article compares DSpace & ETD-db
The ETD-db system is a digital repository system for submitting, searching, browsing, and managing ETD collections. It was originally developed by James Powell, the first Technical Director of the Scholarly Communications Project, Virginia Tech's precursor to DLA. In 1998 Tony Atkins, the second Technical Director of the Digital Library and Archives, upgraded Powell's work, which has been functioning beautifully for over a decade with patches and band-aids from our subsequent Technical Directors and Systems Administrators, including, currently, Paul Mather.
Virginia Tech has been using the ETD-db system as its main repository since we first required ETDs in Jan. 1997. In addition to born-digital ETDs, paper versions of theses and dissertations have been scanned and integrated into the ETD-db system. Currently, the VT ETD-db hosts over 19,000 ETDs.
In Oct. 2010 Virginia Tech was honored by OCLC when the one millionth ETD harvested into WorldCat via the Digital Collection Gateway turned out to be the dissertation of Xingwei Wang, who completed her doctorate in electrical and computer engineering at Virginia Tech. It is titled "Label-free DNA Sequence Detection" and can be viewed at http://scholar.lib.vt.edu/theses/available/etd-08142006-211154/.
Over the years that ETD-db has been in production, stakeholders (e.g., system administrators, managers, reviewers, authors) have suggested various improvements. Moreover, a researcher compared DSpace ("DSpace home page") with the ETD-db system when choosing software to manage ETDs and pointed out some advantages of ETD-db over DSpace: Jones (2004) compared ETD-db 1.7c with DSpace v1.1.1.
ETD-db 2.0
  New version of Virginia Tech's ETD system
  Web application
    Ruby on Rails: Model-View-Controller-based Web app framework
    Any database
    Any server supported by Ruby on Rails
  Major objectives of rewriting ETD-db
    Improve the original, powerful functionalities
    Handle ETD collections more reliably and securely
This paper describes a new version of Virginia Tech's ETD digital library system, which is still based on free, open-source software. ETD-db 2.0 is a Web application built with Ruby on Rails, a Model-View-Controller-based framework. It works with any database (e.g., MySQL, PostgreSQL, Oracle) and can be hosted on any Web server supported by Ruby on Rails (e.g., Apache, NGINX). The major objectives of rewriting ETD-db were (1) to improve the original powerful functionalities of the current ETD-db and (2) to handle ETD collections more reliably and securely.
To provide new, reliable, and secure features, we conducted interviews with current ETD-db users, including system administrators, managers, authors, catalogers, and graduate school reviewers. Sung Hee analyzed their requirements using the Use Case Based Requirement Engineering approach.
This diagram shows the use cases. Blue represents new features for ETD-db 2.0 and yellow indicates features of the existing ETD-db system.
Major Requirements
  Improving single password per role
    ETD-db: one password per role, multiple staff per role
    More reliable and safer system
    Role management functionalities
    Finer-grained permissions
  Supporting fine-grained access control
    ETD-db: database-level access permission control (for example, ETD-db 1.0 has submitted, available, and withheld ETD databases)
    ETD-db 2.0: digital object-level and action-level access permission
Among the requirements, we focused on (1) improving the single password per role and (2) supporting fine-grained access control as new features for administration and security.
Improving the single password per role: In ETD-db, only one password exists per role even when two or more users share that role (e.g., there are multiple Graduate School reviewers). For ETD-db 2.0 to be a more reliable and safer system, role management functionalities with finer-grained permissions need to be extended to multiple admin users.
Supporting fine-grained access control: Currently, ETD-db provides different access permissions to different databases. For example, ETD-db 1.0 has submitted, available, and withheld ETD databases. System administrators, managers, and reviewers have access to all ETD databases, while authors have access permission only to the submitted ETD database. For more secure access control, roles should have not only database-level access permission but also digital object-level and action-level access permission.
Design
Based on the functional and non-functional requirements drawn from use case analysis, the models for ETD-db 2.0 were designed as shown in this illustration. Sung Hee identified seven controllers through requirements engineering. In this section, he focuses on the workflows of the authentication, authorization, and submission processes. Unlike ETD-db 1.0, ETD-db 2.0 introduces Person, Role, and Permission classes.
Designed Objects
  Person
    Class neutral to roles
    Information about users
  Role
    Class neutral to person
    Information about the role itself
  Permission
    Class defines actions
  Action
    Class describes activity like CRUD (Create, Read, Update and Delete)
  Digital Object
    Class represents something such as metadata or content
Person: A class that is neutral to roles and contains information about users such as a first name, a last name, an email address, and roles.
Role: A class that is neutral to person and contains information about the role itself, like role names.
Permission: A class that defines which actions may be taken on which digital objects by a given role.
Action: A class that describes an action, such as the CRUD (Create, Read, Update, and Delete) functions, as well as approve, catalog, import, and export.
DigitalObject: A class that represents a digital object, for example, metadata, contents, and each role.
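The relationships among these five objects can be illustrated with a minimal plain-Ruby sketch. It is not the ETD-db 2.0 source: the method names (`allow`, `permits?`, `can?`), the sample roles, and the sample people are assumptions made for illustration, and a Rails implementation would more likely use database-backed models. The sketch only shows that a Person holds roles, a Role holds permissions, and a Permission pairs one Action with one DigitalObject.

```ruby
# Hypothetical stand-ins for the designed objects (illustration only).
DigitalObject = Struct.new(:kind)                     # e.g. :metadata, :content
Action        = Struct.new(:name)                     # e.g. :create, :read, :approve
Permission    = Struct.new(:action, :digital_object)  # one action on one object kind

class Role
  attr_reader :name, :permissions

  def initialize(name)
    @name = name
    @permissions = []
  end

  # Grant this role one action on one kind of digital object.
  def allow(action_name, object_kind)
    @permissions << Permission.new(Action.new(action_name), DigitalObject.new(object_kind))
    self
  end

  # Structs compare by value, so an equal Permission counts as a match.
  def permits?(action_name, object_kind)
    @permissions.include?(Permission.new(Action.new(action_name), DigitalObject.new(object_kind)))
  end
end

class Person
  attr_reader :first_name, :last_name, :email, :roles

  def initialize(first_name, last_name, email, roles = [])
    @first_name = first_name
    @last_name = last_name
    @email = email
    @roles = roles
  end

  # A person may act if any one of their roles permits the action.
  def can?(action_name, object_kind)
    roles.any? { |role| role.permits?(action_name, object_kind) }
  end
end

author   = Role.new("author").allow(:create, :metadata).allow(:update, :metadata)
reviewer = Role.new("reviewer").allow(:read, :metadata).allow(:approve, :content)

alice = Person.new("Alice", "Example", "alice@example.edu", [author])
bob   = Person.new("Bob", "Example", "bob@example.edu", [author, reviewer])

puts alice.can?(:approve, :content)  # false: the author role alone cannot approve
puts bob.can?(:approve, :content)    # true: granted through the reviewer role
```

Because Person and Role are kept neutral to each other, several people can share one role while each keeps their own identity, and one person can hold several roles at once.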
Reference Metadata Schema
Authentication is the process of checking whether a user is appropriately registered. Credentials are verified either by a centralized authentication Web service (e.g., Google ID, OpenID, Shibboleth, CAS, or ED-Auth) or against a user table in the database that ETD-db 2.0 hosts. [Although we designed for centralized authentication by Web services in the design phase, currently we have only implemented authentication by ED-Auth, an LDAP-based authentication method used at Virginia Tech.] If a centralized authentication Web service is used, the corresponding authentication method verifies the user's credentials. If the user is successfully authenticated, the workflow passes to the authorization process. If not, control is redirected to the Forgot Password and Help page.
Authorization
Authorization is the process of verifying whether a user has the appropriate privileges. This slide shows the authorization flowchart for a user who has multiple roles. In the ETD-db rewrite, we focused on this case and on different users who share the same role. ETD-db 2.0 is designed to allow different users to have the same role (e.g., multiple Graduate School staff are authorized to process a queue of ETDs awaiting approval) and to authenticate with their own user name and password. In addition, ETD-db 2.0 aims to authorize a user with multiple roles (e.g., administrator or manager and reviewer).
Submission Process
Strictly speaking, the submission process is different from the process of logging in as an author. Once authentication as a valid user and authorization in the author role have both succeeded, the user sees a link for a new ETD submission or a list of incomplete (pending) ETD submissions, if any.
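The login path that leads to a submission (authenticate, then authorize as an author, then show the submission entry points) can be sketched in plain Ruby as follows. Everything here is a hypothetical stand-in: the in-memory hashes replace the central authentication service and the ETD-db 2.0 database, the account names and passwords are invented, and the `:not_authorized` page is an assumption, since the presentation does not say where a user without the author role is sent.

```ruby
# Hypothetical in-memory stand-ins (assumptions for illustration only).
CENTRAL_ACCOUNTS = { "jdoe" => "s3cret" }.freeze    # central authentication service
LOCAL_USERS      = { "guest" => "letmein" }.freeze  # user table in the local database
ROLES            = { "jdoe" => [:author, :reviewer], "guest" => [] }.freeze
PENDING          = { "jdoe" => ["etd-draft-1"] }.freeze

# Authentication: the credentials must match either the central service
# or the local user table.
def authenticated?(user, password)
  [CENTRAL_ACCOUNTS, LOCAL_USERS].any? do |accounts|
    accounts.key?(user) && accounts[user] == password
  end
end

# Returns the page a user lands on after trying to log in as an author.
def author_landing(user, password)
  # Failed authentication goes to the Forgot Password and Help page.
  return { page: :forgot_password_and_help } unless authenticated?(user, password)

  # Authorization: the user must hold the author role (possibly among others).
  return { page: :not_authorized } unless ROLES.fetch(user, []).include?(:author)

  # Success: offer a new submission link plus any pending submissions.
  { page: :author_home, links: [:new_submission], pending: PENDING.fetch(user, []) }
end

p author_landing("jdoe", "wrong")
p author_landing("guest", "letmein")
p author_landing("jdoe", "s3cret")
```

Keeping authentication and authorization as separate steps mirrors the workflow described above: the first answers who the user is, the second decides what that user may do, so a user with several roles is checked against all of them.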
To make it easier for an author to submit a new ETD and to improve metadata consistency, ETD-db 2.0 derives author information from the Banner administrative system. Basic information about a user (e.g., department, degree, document type) is provided by the Banner system and injected into the submission process. The author also enters information through the Web interface.
Show ETDs by Author View
We are implementing each view (e.g., show, new, edit, delete).