DSpace: Technical Basics Iryna Kuchma Open Access Programme Manager www.eifl.net Attribution 3.0 Unported

Dec 16, 2015


DSpace: Technical Basics

Iryna Kuchma
Open Access Programme Manager

www.eifl.net
Attribution 3.0 Unported


Application Architecture

The DSpace system is organised into three tiers which consist of a number of components

Each layer invokes only the layer directly below it, i.e. the application layer may not use the storage layer directly
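The layering rule can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not DSpace's actual Java classes; a plain dictionary stands in for the database. The point is only that the application layer holds a reference to the business logic layer and never to storage.

```python
class StorageLayer:
    """Bottom tier: a dict stands in for the relational database."""

    def __init__(self):
        self._records = {}

    def save(self, item_id, metadata):
        self._records[item_id] = dict(metadata)

    def load(self, item_id):
        return self._records[item_id]


class BusinessLogicLayer:
    """Middle tier: the only layer that talks to storage."""

    def __init__(self, storage):
        self._storage = storage

    def archive_item(self, item_id, title):
        self._storage.save(item_id, {"title": title})

    def item_title(self, item_id):
        return self._storage.load(item_id)["title"]


class ApplicationLayer:
    """Top tier: faces the outside world and calls only the business logic layer."""

    def __init__(self, logic):
        self._logic = logic

    def render_item_page(self, item_id):
        return f"<h1>{self._logic.item_title(item_id)}</h1>"


# Hypothetical usage: the web UI never touches StorageLayer directly.
logic = BusinessLogicLayer(StorageLayer())
logic.archive_item("item-1", "Annual Report")
app = ApplicationLayer(logic)
page = app.render_item_page("item-1")
```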


The Storage Layer

The storage layer is responsible for physical storage of metadata and content

DSpace uses a relational database to store all information about the organization of content, metadata about the content, information about e-people and authorization, and the state of currently-running workflows.


The Business Logic Layer

The business logic layer deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow


The Application Layer

The application layer contains components that communicate with the world outside of the individual DSpace installation, for example the Web user interface and the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) service

The DSpace Web UI is the largest and most-used component in the application layer. Two versions:

1. JSPUI: Built on Java Servlet and JavaServer Pages technology
2. XMLUI (Manakin): Built on XML and Cocoon technology


Server Architecture

Web Application Server

User Interface

These systems may reside on a single server or be hosted separately on dedicated servers


Structural Overview

DSpace is split into three directory trees:

Source Directory [dspace-src]
– Surprisingly, this is where the source code resides

Install Directory [dspace]
– Populated during install and during normal operation
– Contains:
  • Configuration files
  • Command line tools
  • Libraries
  • DSpace archive (depending on configuration)

Web Deployment Directory [tomcat]/webapps/dspace
– Contains the JSPs and Java classes and libraries necessary to run DSpace


Persistent Identifiers

The use of location-based identifiers such as the Uniform Resource Locator (URL) often leads to resources becoming inaccessible over time

Often when accessing a resource via a hyperlink users receive a “404 - page not found” error

Persistent identifiers are an attempt at solving the issues surrounding resource identification and long term preservation

A persistent identifier allows the resource to be uniquely identified in a way that will not change if the resource is renamed or relocated


Persistent Identifiers

This means that a resource can be reliably referenced for future access by humans and software

Caveat: Persistence is heavily dependent on organisation policy, i.e. persistence of an object is only effective if an organisation maintains and manages this persistence

Different systems in use for persistent identifiers:
– Persistent Uniform Resource Locators (PURLs)
– Digital Object Identifiers (DOI)
– Handle (used by DSpace)


The Handle

In a handle system, the address of a resource is identified by a unique handle assigned by a common registration service

Registration Service     Handle Prefix   Local Identifier
http://hdl.handle.net    2160            568

http://hdl.handle.net/2160/568
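As a minimal sketch of how the three parts fit together, the helper functions below (hypothetical names, not part of DSpace or the Handle software) join a prefix and a local identifier into a resolvable address and split such an address back into its parts. The values are the ones shown on the slide.

```python
RESOLVER = "http://hdl.handle.net"  # registration/resolution service from the slide


def build_handle_url(prefix, local_id, resolver=RESOLVER):
    """Join resolver, handle prefix and local identifier into one address."""
    return f"{resolver.rstrip('/')}/{prefix}/{local_id}"


def split_handle_url(url, resolver=RESOLVER):
    """Return (prefix, local_id) for an address under the given resolver."""
    base = resolver.rstrip("/") + "/"
    if not url.startswith(base):
        raise ValueError(f"not an address under {resolver}: {url}")
    prefix, _, local_id = url[len(base):].partition("/")
    if not prefix or not local_id:
        raise ValueError(f"missing prefix or local identifier: {url}")
    return prefix, local_id


handle_url = build_handle_url("2160", "568")
parts = split_handle_url(handle_url)
```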


Practical: Using a Handle

Navigate to Aberystwyth’s DSpace repository, Cadair

Select an item from a collection and note the handle address

Open this address in a new browser window

The handle will resolve and redirect back to your original item


Configuring the Handles service

Out of the box, a DSpace installation uses the default handle prefix 123456789 (hdl:123456789)

These aren't really Handles, since the global Handle system doesn't actually know about them

3 Steps to handle configuration


Configuring the Handles service

To use Handles in DSpace, you must register for a prefix with the Corporation for National Research Initiatives (CNRI)

How to register with CNRI?
– Complete the registration form on the CNRI website
– Create and upload the sitebndl.zip to CNRI
– Pay a small annual fee

http://www.handle.net/service_agreement.html


Generating the sitebndl.zip

The Site Bundle is an archive which contains information about your DSpace installation and is used to generate your handle

To generate the sitebndl.zip run the command:

[dspace]/bin/dsrun net.handle.server.SimpleSetup [dspace]/handle-server

You will be asked to answer a series of questions

Once completed, the sitebndl.zip can be found at:

[dspace]/handle-server/sitebndl.zip

Complete the registration and upload the sitebndl.zip


Configuring the Handle Server

Once registration is complete, CNRI should return your assigned handle prefix

Edit the [dspace]/handle-server/config.dct to include the lines in the “server_config” clause:

"storage_type" = "CUSTOM"
"storage_class" = "org.dspace.handle.HandlePlugin"

Update all references to YOUR_NAMING_AUTHORITY to your assigned handle prefix:

300:0.NA/YOUR_NAMING_AUTHORITY -> 300:0.NA/2097
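The substitution above is a plain find-and-replace, which could be sketched as follows. The function name is hypothetical, and the sample text is only a stand-in built from the line shown on the slide, not the real contents of config.dct; the prefix 2097 is the example value from the slide.

```python
PLACEHOLDER = "YOUR_NAMING_AUTHORITY"


def set_naming_authority(config_text, prefix):
    """Replace every placeholder occurrence with the assigned handle prefix."""
    return config_text.replace(PLACEHOLDER, prefix)


# Stand-in text only: a real config.dct contains other settings as well.
sample = "300:0.NA/YOUR_NAMING_AUTHORITY\n300:0.NA/YOUR_NAMING_AUTHORITY\n"
updated_config = set_naming_authority(sample, "2097")
```

In practice it is worth keeping a backup copy of the file before editing it.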


Updating the Handle Prefix

Edit [dspace]/config/dspace.cfg and update the handle prefix

A restart of Tomcat will be required

If items have already been deposited into DSpace, their handles will need updating:

[dspace]/bin/update-handle-prefix 123456789 YourHandle
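To make the effect of a prefix change concrete, here is a minimal sketch of what it means for existing handles. This is an illustration only and not how the update-handle-prefix script is implemented; the function name is hypothetical and 2097 is the example prefix used earlier in these slides.

```python
def reprefix(handle, old_prefix, new_prefix):
    """Swap the prefix of a handle, leaving the local identifier untouched."""
    prefix, sep, local_id = handle.partition("/")
    if sep and prefix == old_prefix:
        return f"{new_prefix}/{local_id}"
    return handle  # handles under any other prefix are left alone


handles = ["123456789/1", "123456789/42", "2160/568"]
updated = [reprefix(h, "123456789", "2097") for h in handles]
```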


Starting the Handle Server

Finally start the handle server

[dspace]/bin/start-handle-server

A startup script is needed to start the handle server automatically when the server boots

Once configured, handles should resolve as demonstrated in the practical earlier in this module


Workflow scenarios

Scenario 1: Head of research

I want to be able to see everything my researchers deposit for quality control purposes


Workflow scenarios

Scenario 2: Repository manager

I want to approve everything that goes into the repository to make sure there are no copyright issues or bad metadata


Workflow scenarios

Scenario 3: Cataloguer

I want to be able to see everything my researchers deposit for quality control purposes


The three workflows

DSpace has three workflow steps:

1. Accept/Reject Step

2. Accept/Reject/Edit Metadata Step

3. Edit Metadata Step

You can use any combination of the three

Steps are worked through in order

Which might be used in each of the previous scenarios?
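The two rules above (any combination of steps, always worked through in order) can be sketched as follows. The names are hypothetical and this is not DSpace's workflow code; it assumes, as the step names suggest, that only the first two steps can reject a submission.

```python
from enum import IntEnum


class Step(IntEnum):
    ACCEPT_REJECT = 1
    ACCEPT_REJECT_EDIT = 2
    EDIT_METADATA = 3


def run_workflow(enabled_steps, decide):
    """Run the enabled steps in numeric order.

    decide(step) returns "accept" or "reject" for steps that allow rejection.
    Returns (archived, visited_steps).
    """
    visited = []
    for step in sorted(enabled_steps):
        visited.append(step)
        # The Edit Metadata step has no reject option, so it always passes.
        if step is not Step.EDIT_METADATA and decide(step) == "reject":
            return False, visited
    return True, visited


# A collection that enables only steps 1 and 3, given in arbitrary order.
archived, visited = run_workflow(
    {Step.EDIT_METADATA, Step.ACCEPT_REJECT}, lambda step: "accept"
)
rejected, visited_on_reject = run_workflow(
    {Step.EDIT_METADATA, Step.ACCEPT_REJECT}, lambda step: "reject"
)
```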


RSS feeds

RSS feeds:
– Site level (all new items)
– Community level (new items in all contained collections)
– Collection level (new items in that collection)

Can be read in modern web browsers

Can be subscribed to in news reader software


Alerts

Alerts:
– Created by users
– Created for a collection
– Emails sent each day for new items
– Script must run daily:
  • [dspace]/bin/sub-daily
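The idea behind the daily alert run can be sketched as matching each user's collection subscriptions against the day's new items. This is an illustration under assumed data structures, not the implementation of sub-daily; the function name, email addresses, collections and titles are all made up.

```python
from collections import defaultdict


def build_daily_alerts(subscriptions, new_items):
    """Return {email: [titles]} for subscribers with at least one new item.

    subscriptions: {email: set of collection names}
    new_items: list of (collection name, item title) added today
    """
    by_collection = defaultdict(list)
    for collection, title in new_items:
        by_collection[collection].append(title)

    alerts = {}
    for email, collections in subscriptions.items():
        titles = [t for c in sorted(collections) for t in by_collection.get(c, [])]
        if titles:  # no email is produced when nothing new was added
            alerts[email] = titles
    return alerts


subscriptions = {"a@example.org": {"Theses"}, "b@example.org": {"Maps"}}
new_items = [("Theses", "Thesis A"), ("Reports", "Report B")]
alerts = build_daily_alerts(subscriptions, new_items)
```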

DSpace statistics

DSpace statistics:
– Collated from DSpace log files
– Reports generated daily (daily and monthly reports)
– http://dspace.example.com/dspace/statistics
  • Or via the Administer menu
– Can be private (must be logged in) or public
  • In dspace.cfg:
    – report.public = [true|false]


Statistics collected

The following statistics are collected:
– General overview (e.g. number of items archived / number of item views / user logins)
– Archive Information (numbers of each type of item)
– Item view counts
– Actions performed
– Search terms used
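Collating counts like these from log files amounts to tallying log lines by action. The sketch below assumes a simplified, hypothetical line format (date, action, detail); real DSpace log lines are more verbose, and the sample entries are invented for illustration.

```python
from collections import Counter

# Hypothetical, simplified log lines: "<date> <action> <detail>"
SAMPLE_LOG = """\
2015-12-01 view_item handle=2160/568
2015-12-01 search query=open access
2015-12-01 view_item handle=2160/568
2015-12-02 login user=editor
2015-12-02 view_item handle=2160/570
"""


def collate(log_text):
    """Return (actions performed, views per item) as two Counters."""
    actions = Counter()
    item_views = Counter()
    for line in log_text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        action = parts[1]
        actions[action] += 1
        if action == "view_item" and len(parts) == 3 and parts[2].startswith("handle="):
            item_views[parts[2][len("handle="):]] += 1
    return actions, item_views


actions, item_views = collate(SAMPLE_LOG)
```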


Google Analytics

Google Analytics allows a richer and more detailed suite of statistics:

• Time visitors spent on the site
• Where they came from
• Terms they used in search engines to find items
• The geographic location of visitors
• How many pages they looked at
• Which pages they started and ended their visit on

– JSPUI requires a small code change, Manakin has a configurable option.


Credits

These slides have been produced re-using The DSpace Course by:
– Stuart Lewis & Chris Yates
– Repository Support Project http://www.rsp.ac.uk/
– Part of the RepositoryNet
– Funded by JISC http://www.jisc.ac.uk/


Thank you! Questions?