@ulcc www.ulcc.ac.uk EPrints User Group The meeting will begin at 10.30am on Tuesday 13th January 2015 #EPrintsUG
Jul 14, 2015
Welcome & Matters from previous meeting
http://bit.ly/EPrintsUG130115
Valerie McCutcheon, Research Information Manager
End-to-End
Open Access (OA) Process Review and Improvements
#e2eoa
http://e2eoa.org/
[email protected]
http://openaccess.jiscinvolve.org/wp/pathfinder-projects/
Research Information Systems
• Student System
• Research Mapping System & Web Pages
• Research System
• Enlighten - Institutional Repository
• Finance System
• Human Resources System
PROJECT OUTPUTS
• Publications
• Datasets
• Artwork
• Exhibitions
• Impact (soon)
ALL REVIEWED BY LIBRARY STAFF
Smug dissatisfaction?
Popular Process – Low Barrier for Authors
• Email details of every paper accepted for publication to:
and attach the accepted final version
• Library open access team checks funder details, what a publication allows, whether payment is required. We do the admin if we can – pay invoices, enter date into our repository.
REPOSITORY
SPREADSHEET
LIBRARY SERIALS SYSTEM
FINANCE SYSTEM
PUBLISHERS
@Acceptance date?
INTERMEDIARY SERVICE
/
RCUK
REF
HORIZON 2020
LOCAL INFO
3500003285 618.35
3500003285 184.82
3500003285 251.69
3500003285 401.44
2520000360 1258.47
MDPI
Luminescent Measurement Systems for the Investigation of a Scramjet Inlet-Isolator Sensors
One Publication – Many Transactions
E2E Project Planned Outputs – Workshops
• OA Issues and Potential Solution
• Embedding future REF requirement
• Advocacy
• Outcomes and Sustainability
E2E Project Planned Outputs
• Metadata specification
• Improve OA reporting
• Improve award linkage functionality
RCUK
REF
HORIZON 2020
LOCAL INFO
Not all organisations have Funder’s Reference Linked to Publications in their Repository
Metadata: CASRAI Standard
– 'CASRAIfying' – standard definitions and generic technical specification
Consortia Advancing Standards in Research Administration Information UK
– Many inputs
REF, RCUK/RIOXX, Horizon 2020, Charities Open Access Fund, Jisc (Meeting 16th Jan 2015), Institutional Requirements…
– Community build - avoid replication
PURE, EPrints, Hydra, Symplectic, Converis, ARMA, SCONUL, SCURL, RLUK, UKCoRR, COAR, CCC, Jisc initiatives…
Use Case: REF Submission Prep
As an institution preparing for a REF
submission we need to ensure that those of our
publications which are in-scope for REF OA
policy meet the repository deposit deadline
requirements so that we can remove any
remaining compliance gaps before submitting.
Use Case: APC Intake
As a funder collecting reports from funded
institutions our evaluation unit needs
information on expenditure of funds related to
OA article-processing-charges so that we
can effectively monitor our OA policy.
Why CASRAI?
● Lack of interoperability a root cause of admin
burden and assessment blindness
● Need a sustainable agreement mechanism
● Avoid forcing agreement at the wrong layer
To be interoperable...
“...one should be actively engaged in the
ongoing process of ensuring that the systems,
procedures and culture of an organisation are
managed in such a way as to maximise
opportunities for exchange and reuse of
information, whether internally or externally.”
Paul Miller UKOLN/Jisc (2000)
What is CASRAI?
● nonprofit global network of organizations
● projects commissioned to develop common
reusable agreements for interoperability
● institutions, funders, vendors, libraries
Common Interop Patterns
● A - B Interop
o information stored with one database needs to get into another database (so a local step can occur)
● A + B Interop
o information in multiple databases needs to be aggregated (so analysis can occur)
● Coarse-grained or Fine-grained Interop
o bulk reports or detailed queries
Applied to Open Access Interop
● Article submission (A - B)
● CRIS/Repository (A - B)
● APC reporting (A - B)
● OA Policy Monitoring (A + B)
● Others?
CASRAI-UK Open Access Group
● Glasgow (Chair)
● EPSRC (Chair)
● Jisc (Sponsor)
● MRC
● NERC
● Wellcome Trust
● HEFCE
● St. Andrews
● Kings
● Edinburgh
● Elsevier
● EPrints (new)
CASRAI-UK Open Access Profile
● Output Metadata
o (ID; title; authors; version; etc)
● Access Metadata
o (license type; freedoms; embargos)
● Compliance Metadata
o (first/subsequent deposits; checks)
● Financing Metadata
o (funder; APC; EU component; etc)
CASRAI-UK OA Interop Use Cases
● REF Post-2014 (A - B)
● APC Intake (A - B)
● Horizon 2020 (A - B)
● Internal Institutional Workflows
● Others??
Next Steps
● Finalize the policy view of the OA Data
Profile
● Create the technology view
● Public review/comment
● Implementation pilots / phase 2 work
Who is involved?
• Developed for:
• Funded by:
• Developed by: Paul Walk, Sheridan Brown
• Public consultation: V4OA project; rioxx website
What is rioxx?
An application profile
… a set of metadata elements and
guidelines
Note: applies solely to publications
What is rioxx for?
the specific purpose of enabling funders to monitor compliance with their Open Access policies
Mandatory elements
1. dc:identifier
a persistent identifier for the resource (http uri)
2. dc:language
3. dc:source
e.g. ISSN or ISBN for the electronic resource
4. dc:title
5. dcterms:dateAccepted
the date on which the resource was accepted for publication
6. rioxxterms:author
the use of an ORCID is recommended
7. rioxxterms:project
funder_name
recorded as text, and/or
funder_id
recommend using FundRef DOI or ISNI
an alphanumeric identifier for the project must be recorded
8. rioxxterms:type
a controlled list of resource types is provided
9. rioxxterms:version
uses recommendations from NISO/ALPSP Journal Article Versions Technical Working Group, e.g. Accepted Manuscript (AM), Version of Record (VoR)
10. ali:license_ref
• Specified by National Information Standards Organisation (NISO)
• http URI points to a license
• License has a start_date
• Multiple licenses may apply, allowing for expression of embargos
• If license is unknown, default position is “all rights reserved”
<ali:license_ref start_date="2015-02-17">http://creativecommons.org/licenses/by/4.0</ali:license_ref>
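As a rough illustration of how multiple licenses with start dates can express an embargo, the sketch below picks the license in force on a given date. It assumes each ali:license_ref has already been read into a (start_date, URI) pair and that the most recently started license is the one that applies; the function name `license_in_effect` and the embargo URI are hypothetical and not part of rioxx or the NISO recommendation.

```python
from datetime import date

# Fallback taken from the slide: an unknown license means "all rights reserved".
DEFAULT_LICENSE = "all rights reserved"


def license_in_effect(license_refs, on_date):
    """Return the URI of the license with the latest start_date not after on_date.

    license_refs is a list of (start_date, uri) pairs (assumed representation).
    """
    started = [(start, uri) for start, uri in license_refs if start <= on_date]
    if not started:
        return DEFAULT_LICENSE
    # Assumption: the most recently started license supersedes earlier ones.
    return max(started, key=lambda pair: pair[0])[1]


refs = [
    # Hypothetical URI standing in for the publisher's embargo-period terms.
    (date(2014, 8, 17), "http://example.org/embargo-terms"),
    (date(2015, 2, 17), "http://creativecommons.org/licenses/by/4.0"),
]

during_embargo = license_in_effect(refs, date(2015, 1, 1))
after_embargo = license_in_effect(refs, date(2015, 3, 1))
before_any = license_in_effect(refs, date(2014, 1, 1))
```

Under these assumptions the CC BY license only becomes the answer once its start_date has passed, which is how the embargo is expressed.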
Help!
Jisc has funded an EPrints plugin for rioxx
Jisc is funding technical support for the plugin:
• EPrints Services (hosted repositories)
• ULCC (hosted repositories)
• Peter West (non-hosted repositories)
Developing a Data Repository
Stephen Grace, David McElroy, Rory McNicholl and Timothy Miles-Board
Developing a Data Repository
Stephen Grace and David McElroy, UEL
Rory McNicholl and Timothy Miles-Board, ULCC
EPrints User Group, Senate House London, 13 January 2015
ULCC and UEL
Developing a Data Repository
1. Developing data.uel
2. Demos
– CoinDOI
– Repository Links
3. Look and Feel
“Research organisations will ensure that EPSRC-funded research data is securely preserved for a minimum of 10 years…” (EPSRC, Expectation VII)
Developing data.uel
• Rationale
• Functional Specifications
• Mock-ups
• Metadata Schemas
• Relational Diagrams
Rationale
• UEL adopted RDM policy March 2012
– Library & Learning Services (LLS) will create
a register of datasets…
– [and] a portal for datasets which are suitable
for sharing
• Separate to ROAR
– Different workflows
– Not everything will be Open Access
Functional Specifications
• Excel spreadsheets describing:
– What we wanted
– Why we wanted it
– Who was responsible for doing it
Functional Specifications
Description of what we think we need
Our reasoning for this. By including this we were able to take advantage of our developers' knowledge. If there is a better way of doing something they let us know.
Some aspects of development were shared. Above we were to provide the metadata profile
Metadata Schemas
• Based on ReCollect and DataCite
• Only mandatory DataCite fields are mandatory in data.uel
“Research organisations will ensure that appropriately structured metadata describing the research data they hold is published and
made freely accessible on the internet…” (EPSRC, Expectation V)
Relational Diagrams
• How projects can be linked to data collections
• Potentially to each other over time
Look & Feel
• Developed in-house at UEL
– (with help from ULCC of course)
• Important to make the repositories feel like
part of UEL
– Branding (More later)
– Single Sign On!
Thank you
UEL:
Stephen Grace
http://orcid.org/0000-0001-8874-2671
@StephenGraceful
David McElroy
http://orcid.org/0000-0002-0966-8862
@davidlmcelroy
Research Data Services at UEL
Repo data.uel.ac.uk
Web www.uel.ac.uk/researchdata/
Blog datamanagementuel.wordpress.com
ULCC:
Rory McNicholl
@RoryMcN
Timothy Miles-Board
@drtjmb
Arkivum in 60 seconds
• SLA with 100% data integrity guaranteed
• World-wide professional indemnity insurance
• Long term contracts for enterprise data archiving
• Fully automated and managed solution
• Audited and certified to ISO27001
• Data escrow, exit plan, no lock-in
EPrints: data deposit
Diagram labels: Researcher, Editor, Review, Approve, EPrints, EPrints Storage, Researcher files, Arkivum Appliance, Appliance Cache, Arkivum Service, Files, Metadata, Files safe, Clear cache, Delete originals
Chain of custody
• Checksums to confirm correct transfers
• Lock-down the content
• Confirmation of replication
• Update metadata and audit trail
• Remove local copy (if desired)
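The first and last steps of this chain of custody can be sketched in a few lines. This is only an illustration, assuming a plain file copy stands in for the transfer and SHA-256 for the checksum; the slides do not say which algorithm or transfer mechanism the Arkivum appliance uses, and the function names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_check(source, destination, remove_local=False):
    """Copy a file, confirm the copy by checksum, then optionally drop the original."""
    expected = sha256_of(source)
    shutil.copyfile(source, destination)  # stand-in for the real transfer
    if sha256_of(destination) != expected:
        raise IOError("checksum mismatch after transfer")
    if remove_local:
        os.remove(source)  # only reached once the copy is verified
    return expected


# Small demonstration using throwaway files.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "dataset.bin")
    archived = os.path.join(workdir, "archived.bin")
    with open(source, "wb") as handle:
        handle.write(b"example research data")
    checksum = transfer_with_check(source, archived, remove_local=True)
    local_copy_removed = not os.path.exists(source)
    archive_matches = sha256_of(archived) == checksum
```

The local copy is deleted only after the checksums agree, so a failed transfer never leaves the data without a good copy.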
EPrints: data access
Diagram labels: Researcher, Editor, Review, Approve, EPrints, Researcher files, Arkivum Appliance, Appliance Cache, Arkivum Service, Request data, Wait, Retrieve files, Files ready, Files
Access workflow
• Immediate access to small files
• Review/approve/notify for large files
• Simple expectation of retrieval time
• Caching of files for repeated access
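The access workflow above could be modelled roughly as follows. This is a toy sketch, not the plugin's logic: the size threshold, the class name `AccessDesk` and the status strings are all assumptions made for illustration.

```python
# Assumed cut-off between "small" and "large" files; the slides give no figure.
SMALL_FILE_LIMIT = 50 * 1024 * 1024


class AccessDesk:
    """Toy model: small or cached files are served at once, large ones wait for approval."""

    def __init__(self):
        self.cache = set()   # files already retrieved from the archive
        self.pending = []    # large-file requests awaiting an editor

    def request(self, file_id, size_bytes):
        if file_id in self.cache or size_bytes <= SMALL_FILE_LIMIT:
            return "ready"
        if file_id not in self.pending:
            self.pending.append(file_id)
        return "awaiting approval"

    def approve(self, file_id):
        # After approval the file is retrieved and cached for repeated access.
        self.pending.remove(file_id)
        self.cache.add(file_id)
        return "requester notified"


desk = AccessDesk()
first_small = desk.request("readme.txt", 2048)
first_large = desk.request("scan-series.tar", 10 * 1024 ** 3)
notice = desk.approve("scan-series.tar")
repeat_large = desk.request("scan-series.tar", 10 * 1024 ** 3)
```

Once a large file has been approved and cached, a repeated request is served immediately, which matches the caching bullet.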
Access control done by EPrints
• Open access: no license, no barriers
• Unrestricted use, but request for access
• Unrestricted use, but charge for access
• Restricted use: request/approve access
• Embargoed: no access for set period
• Locked down: no access
Limitations of current approach
• All data goes through EPrints
• Can’t link to data already in the archive
• Can’t push data to publication platforms
• Can’t grant direct access to the archive
Direct data deposit into Arkivum
Diagram labels: Researcher, Editor, Review, Approve, EPrints, Arkivum Appliance, Appliance Cache, Arkivum Service, Deposit file, Add URL, Get URL, Retrieve File, Check Data
Automated uploader
Diagram labels: Researcher, Editor, Review, Approve, EPrints, Arkivum Appliance, Appliance Cache, Arkivum Service, Uploader, Local storage, Prepare dataset, Check Data, Files, Metadata, URL
Work to be done
• Seamless experience for the user
• Security model
• Permissions: who can deposit, link, approve
• Efficient and reliable transfer of big files
• Relationship to SWORD
• Timestamps, UUIDs, thumbnails, indexing
• Quality control, sign-off
Questions?
Plugin: http://bazaar.eprints.org/378/
Documentation: http://files.eprints.org/978/1/ArkivumEPrintsPlugin21V11A.pdf
Wiki Page: http://wiki.eprints.org/w/Files/Configuration_and_User_Guide_for_version_2.1_of_the_EPrints/Arkivum_storage_plugin
It’s EPrints, but not as you know it…
Timothy Miles-Board & David McElroy
<rioxxterms:project funder_name="EPSRC">JA/139K/02</rioxxterms:project>
<rioxxterms:project funder_name="NERC">GLX_456_P</rioxxterms:project>
<rioxxterms:project funder_name="NERC">JP/894Y/77</rioxxterms:project>
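A consumer of such a record might group the project identifiers by funder. The sketch below is a minimal illustration using Python's standard XML parser; the wrapping `<record>` element is hypothetical, and the namespace URI is an assumption that should be checked against the rioxx schema documentation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed namespace URI for rioxxterms; verify against the published schema.
NS = {"rioxxterms": "http://www.rioxx.net/schema/v2.0/rioxxterms/"}

# Hypothetical wrapper element so the fragment parses as a document.
RECORD = """<record xmlns:rioxxterms="http://www.rioxx.net/schema/v2.0/rioxxterms/">
  <rioxxterms:project funder_name="EPSRC">JA/139K/02</rioxxterms:project>
  <rioxxterms:project funder_name="NERC">GLX_456_P</rioxxterms:project>
  <rioxxterms:project funder_name="NERC">JP/894Y/77</rioxxterms:project>
</record>"""


def projects_by_funder(xml_text):
    """Map each funder_name to the list of project identifiers recorded for it."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for element in root.findall("rioxxterms:project", NS):
        grouped[element.get("funder_name")].append((element.text or "").strip())
    return dict(grouped)


grouped_projects = projects_by_funder(RECORD)
```

Grouping by the funder_name attribute shows why one publication can carry several rioxxterms:project elements, including more than one for the same funder.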
Community Engagement Lead
• Understand and Promote Community Priorities
• Support Community Interactivity, Productivity and Outputs
• Manage EPrints Services’ interactions and engagement with the community
How can I help?
Response Number of Responders
Documentation IIIIIIII
Better Communication III
Roadmap III
New Features / More Development II
Training Materials / Webinars I
Developer Meets I
Capturing Community Needs I
Define and Strengthen Community I
Better Web Site I
Documentation
• Documentation means many things to many people
• Community based activity – Everyone can contribute
• Lots of ideas how to move forwards on this
• Let's have a discussion session this afternoon
Better Communication
• New role (Community Engagement Lead)
– Email: [email protected]
– Telephone: 023 8059 8814
– Twitter: @gobfrey
• Nascent Comms Strategy
– Website Relaunch (early 2015)
– UK User Group Google Group
– Social Media and Mailing List
– Webinars
Roadmap
Milestone Date
EPrints 3.3.13 January 2015
EPrints 3.3.14 Early 2015
EPrints 3.4 Easter 2015
EPrints 4 End of 2015
EPrints 3.3.13
• Released January 8th, 2015
• Minor release
– 150 Commits
– 88 files changed
• General Stability and Security Improvements
• Improvements to
– DOI handling, Multilingual support, Search Indexing, Embargo Handling, Filetypes, Metadata Normalisation, Abstract Page Rendering, CC Licenses, Subject Trees, Database Layer, Unicode Ordering, reCAPTCHA, Audio Handling, OAI Exports, RSS, EndNote Imports, ISI Imports, List Rendering, EPScript, LaTeX handling, Automated Tests, Apache 2.4 Support
EPrints 3.3.14
• To be released early 2015
• Minor Release
– Stability Improvements
• New Bazaar Features
– Channels and Badges
– Assigned Wiki Pages for Documentation
• Xapian Faceted Search by default
EPrints 3.4.0
• Coming around Easter 2015
• Response to a more diverse repository landscape
• Stepping Stone towards EPrints 4.0
EPrints 4 / 3.4.0 Key Philosophy
• “Base” EPrints storing and handling of generic data and objects
• “Layers” to handle specific metadata schema, import/export, rendering, search, etc. for specific domains
3.4 Releases
• Collections of metadata schemas, renderers, plugins and packages tested together for a specific purpose (the pizza model)
• Initial Releases for:
– Open Access Publications
– Open Education
– Open Data
EPrints 3.4 for Publications
• Base
• Publications Metadata Schema
• IRUS Tracker
• Publications Router
• ORCID Support
• WOK / Scopus
• Projects / Funder & Rioxx2
• IRStats 2
EPrints 3.4 for Open Education
• Base
• Education Metadata Schema
• MePrints
• Bookmarks
• Collections
• EdShare Style User Interface
EPrints 3.4 for Open Research Data
• Base
• Research Data Metadata Schema
• ReCollect
• DataCite
• Arkivum
• Exemplar SimpleStorage Plugin
• Large File Upload Mechanisms
EPrints User Access Control
Also known as: EPrints Access Control Layer (ACL)
John Salter & John Beaman
Introduction
• We have spent some time trying to write an Access
Control system for EPrints
• It’s been a horror!
• One of our use-cases is for Research Data, but it could be
used on other repository types
Out of the box User Access Control
• EPrints (you all know what this is, right..?) has basic
control at the document level - the 'security' field:
– ‘public’ (Open Access)
– ‘validuser’ (anyone who's got an account on that EPrints instance)
– ‘staffonly’ (Repository editors/admins)
• This doesn't cover the requirements for some
repositories...
Requirements
• Control access to Eprints, Documents
• Control access based on:
– User attributes (e.g. signed-in via Shibboleth)
– Location / IP address (e.g. on-campus)
• Simple interface to assign restrictions
EPACL: EPrints Access Control Layer
• Modular Design
– Doesn't overwrite any existing 'security' specified on documents.
– Will be available as a standard EPrints Plugin (EPM package) via the EPrints Bazaar
– ACL Authority modules governing different methods of Authentication (see later) can be developed separately
– These modules can simply be 'dropped in' to the existing framework as required
ACL Objects
• ACL Authority
– Corresponds to a method of authentication (e.g. Shibboleth, LDAP etc.)
• ACL Role
– Authorised by an ACL Authority with zero or more 'filters' applied
– Filters act as additional requirements (e.g. ACL_Authority='LDAP',
Filter='dc=leeds.ac.uk')
• ACL Group
– Each ACL Group consists of one or more ACL Roles
– The ACL Roles are combined within an ACL Group by being 'OR'ed or 'AND'ed
together
– One or more ACL Groups are applied to EPrints data objects (e.g. eprints,
documents)
– The ACL Groups applied to EPrints data objects are 'OR'ed or 'AND'ed together
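The relationship between Authorities, Roles and Groups can be sketched as a small data model. This is an illustrative Python outline only (EPACL itself is an EPrints plugin, so the real code is Perl); the class names, the `user_attrs` structure and the "AND"/"OR" strings are assumptions, not the plugin's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AclRole:
    """A role authorised by one authority, narrowed by zero or more filters."""
    authority: str                      # e.g. "LDAP" or "Shibboleth"
    filters: List[str] = field(default_factory=list)

    def matches(self, user_attrs):
        # user_attrs (assumed shape): authority name -> set of attribute strings
        attrs = user_attrs.get(self.authority)
        if attrs is None:
            return False                # user did not authenticate via this authority
        return all(f in attrs for f in self.filters)


@dataclass
class AclGroup:
    """One or more roles combined with 'OR' or 'AND'."""
    roles: List[AclRole]
    combine: str = "OR"

    def matches(self, user_attrs):
        results = [role.matches(user_attrs) for role in self.roles]
        return all(results) if self.combine == "AND" else any(results)


def object_allows(groups, user_attrs, combine="OR"):
    """Combine the groups applied to a data object (eprint or document)."""
    results = [group.matches(user_attrs) for group in groups]
    return all(results) if combine == "AND" else any(results)


leeds_ldap = AclRole("LDAP", ["dc=leeds.ac.uk"])
any_shib = AclRole("Shibboleth")
user = {"LDAP": {"dc=leeds.ac.uk"}}    # authenticated via LDAP only

either_role = AclGroup([leeds_ldap, any_shib], "OR").matches(user)
both_roles = AclGroup([leeds_ldap, any_shib], "AND").matches(user)
```

The same OR/AND choice is applied twice, once inside a group and once across the groups attached to an object, which is what makes the combinations expressive but also hard to reason about.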
Request vs User
• Two basic steps to authorise access - request and user
• First EPrints checks if the request has appropriate access
rights
• If so, any additional user requirements are also checked
• If not, the request is denied (since the user's credentials
are irrelevant at that point)
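The two-step check can be illustrated as below. This is a sketch under stated assumptions: the request-level test is reduced to an on-campus IP check, the network range is a documentation-only placeholder, and `authorise` and `user_check` are hypothetical names rather than EPrints functions.

```python
import ipaddress

# Placeholder "campus" range (a reserved documentation network), purely illustrative.
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def request_allowed(client_ip):
    """Step 1: does the request itself (here, its source address) qualify?"""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in CAMPUS_NETWORKS)


def authorise(client_ip, user, user_check):
    """Step 2 runs only if step 1 passes; otherwise the user is never consulted."""
    if not request_allowed(client_ip):
        return False
    return user_check(user)


calls = []


def is_staff(user):
    calls.append(user["name"])          # record that the user check actually ran
    return user.get("staff", False)


on_campus = authorise("192.0.2.10", {"name": "alice", "staff": True}, is_staff)
off_campus = authorise("198.51.100.7", {"name": "bob", "staff": True}, is_staff)
```

Here the off-campus request is refused without the user check ever running, mirroring the point that the user's credentials are irrelevant once the request has been denied.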
Dealing with rejection
• What happens when someone is denied access?
–Document landing pages
–Restricted summary pages
–Contact details to request access?
Summary Page citation style
• How do we deal with rejected requests (i.e. what do we
show)?
• We can define different citation styles depending on
whether the request is allowed or denied
• 'Restricted' citations may be required (e.g. to satisfy the
minimum metadata requirements of a DOI)
Document-level landing pages
• Partly born out of DOI requirements
• Individual documents may have their own DOI, so ideally
need their own ‘landing page’
• Document landing pages should contain at least the
mandatory metadata fields if linked from a DOI
The horrors (and the solutions)
• The out-of-the-box document security is deep within
EPrints (we learnt about doing some really cranky Perl-fu)
• Documents don't use the SummaryPage - again this is
deep in the code (we learnt lots about EPrints Triggers)
• Apache 2.2 and 2.4 differ in the way they handle IP
addresses (Seb did some patching of EPrints code - but
this hasn't been released - may be in 3.3.13?)
The horrors (and the solutions) [continued…]
• All sorts of crazy inheritance complexities
• EPrints doesn't use 'sessions' in the same way as most
other web software. Sessions are only created when a user
logs in (we think user accounts need to be automatically
created - is this right? It would be useful for tracking re-use)
• EPrints documentation is *ahem* offering room for
improvement */ahem*...
Thank you for listening!
• These presentation notes and other ACL documentation
can be found on the EPrints wiki at
http://wiki.eprints.org/w/EPrints_User_Group_2015-01-13
• Our contact details:
• Any questions?