Interoperability: architectures and connections. John Gilby, M25 Systems Team, LSE; Ashley Sanders, Copac Team, MIMAS. "Hyper Clumps, Mini Clumps and National Catalogues: resource discovery for the 21st century", 11th November 2004, British Library, London
1
Interoperability: architectures and connections
John Gilby, M25 Systems Team, LSE
Ashley Sanders, Copac Team, MIMAS
"Hyper Clumps, Mini Clumps and National Catalogues: resource discovery for the 21st century"
11th November 2004, British Library, London
2
Contents
• Overview of technical architecture of union catalogues (Copac & InforM25)
• Introduce Z39.50 to Z39.50 middleware & issues to consider
• CC-interop and JAFER
– Installation, configuration & testing
• Results set issues and searching times
3
A reminder, Z39.50 is:
• a standard for information retrieval
• a client/server relationship
– Z-client - stand-alone on a PC or associated with a web server/user interface
– Z-server - generally a module in library systems
• a method for communication between disparate computer systems (such as a library catalogue and a user’s PC)
4
Copac
• has 26 libraries (including large research, academic and BL, NLS)
• geographically covers whole of UK
• JISC funded, administered by MIMAS
• has “control” over indexes and searching process
• can be searched via Z39.50
• periodic data loads
• live circulation data via Z39.50 = very successful and popular with users
• Copac V3 – experimental Z39.50 searching of Copac and National Library of Wales
5
CURL/Copac database creation
[Diagram. Labels: Incoming MARC records from contributing institutions (MARC21 & UKMARC); Record pre-processing: standardisation & problem identification; Duplicate checks (pass/fail); Formation of consolidated and individual records & indexes; Copac database; Z-server, OpenURL & web interface (web, Z39.50)]
6
Distributed catalogue
• typically has up to 40 library catalogues (academic – CAIRNS, InforM25, RIDING; public – WiLL)
• regionally based
• funded by regional organisation
• rely on institutional catalogues for record standards, indexing and Z-server configurations
• some control over Z39.50 searching process
• data is as up to date as library OPAC
• ‘clump’ software combines result sets and presents them to user
• generally cannot accept queries outside of user interface
7
Union catalogues
[Diagram. Labels: User; network; Copac: single, large database; Distributed catalogue: Z-client software and user interface; network; Z-server/institutional library systems]
8
Z to Z Middleware
[Diagram. Labels: Remote user Z-client; ‘Local’ user web interface; Z39.50 to Z39.50 Middleware; Z39.50 (two connections); Institution Z-server A; Institution Z-server B; e.g. M25 libraries; e.g. Copac V3]
9
Connection Issues
• When to make connections?
• Which Z-servers?
– selecting some/all, landscaping
• Access & Authentication
– handled by middleware
• Timing of middleware response
– user’s client is expecting single response
– middleware has to wait for Z-servers to respond before it responds to client
– automatic time-out advisable
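The timing issue above can be sketched roughly as follows. This is a minimal illustration, not the JAFER implementation: `search_all`, the stub servers and the timeout values are all hypothetical, and plain Python callables stand in for real Z39.50 searches. The middleware queries every Z-server in parallel, waits up to a fixed time-out, then returns one combined response.

```python
import concurrent.futures as cf
import time


def search_all(targets, query, timeout_s=2.0):
    """Query every target in parallel and return ONE combined response.

    `targets` maps a catalogue name to a callable that takes the query and
    returns a list of records (a stand-in for a real Z39.50 search).
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(targets)))
    futures = {pool.submit(fn, query): name for name, fn in targets.items()}

    # Wait for all servers, but never longer than the automatic time-out.
    done, not_done = cf.wait(futures, timeout=timeout_s)

    records, failed = {}, []
    for fut in done:
        name = futures[fut]
        if fut.exception() is None:
            records[name] = fut.result()
        else:
            failed.append(name)          # server answered with an error
    timed_out = sorted(futures[fut] for fut in not_done)

    pool.shutdown(wait=False)            # do not block on slow servers
    return {"records": records, "failed": sorted(failed), "timed_out": timed_out}


# Hypothetical stub servers, for illustration only.
def fast_server(query):
    return [f"record matching {query}"]


def slow_server(query):
    time.sleep(1.0)
    return ["arrives too late"]


def broken_server(query):
    raise ConnectionError("connection refused")


response = search_all(
    {"fast": fast_server, "slow": slow_server, "broken": broken_server},
    "austen",
    timeout_s=0.2,
)
print(response)
```

The client still sees a single response; slow or failing catalogues are reported rather than holding everything up.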
10
Search & Result Set Issues
• Query transformation
– multiple Z-servers behave differently to an incoming query
– user sends query in their own ‘format’ (attribute set)
– need to avoid failed searches
– middleware transforms query to form suitable for individual Z-servers
• Response aggregation
– user’s client cannot know hits/Z-server
– client must display origin of record
– various options
11
and so to JAFER
• Middleware options for CC-interop:
– graft Z39.50 server onto existing InforM25 software
– develop completely new software
– use existing available software
• JAFER Toolkit Project (JISC 5/99 Programme)
– readily available & supported
– could do most of what was required
12
Working with JAFER
• JAFER: http://www.jafer.org/
– increased the JAFER logging facilities
– established subsets of libraries for searching
– produced XSLT stylesheets
• Created new Copac Interface
– copy of standard Copac web interface tailored for testing JAFER
17
Search tests
• Search set 1 - Copac Z39.50 criteria
– no query transformations
• Access succeeded
– some searches received no response
19
Search test results — 2
• Response with Copac search settings
– 203 searches carried out
– 95 failed to return a result (0 or more records)
• Response with InforM25 settings
– 199 searches carried out
– 3 failed to return a result (0 or more records)
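As a quick check on the figures above, the failure rates can be worked out directly. The helper name `failure_rate` is ours; the counts are those on the slide.

```python
def failure_rate(failed, total):
    """Percentage of searches that returned no result at all."""
    return 100.0 * failed / total


copac_settings = failure_rate(95, 203)
inform25_settings = failure_rate(3, 199)

print(f"Copac search settings: {copac_settings:.1f}% failed")     # 46.8% failed
print(f"InforM25 settings:     {inform25_settings:.1f}% failed")  # 1.5% failed
```

Roughly 47% of searches failed with the untransformed Copac settings, against about 1.5% with the InforM25 settings.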
20
Middleware benefits
• Simplifies access to range of catalogues
• Query transformation improves search success rate
• Virtual catalogue staff can:
– provide centralised development and maintenance
– identify and investigate problems
– act as a central contact point
• Can interconnect the (JISC) Information Environment
• Potentially useful for a National Catalogue
21
Search problems/solutions
• Users lose control of query
• Search consistency
– failure of catalogues to respond
– lowest common denominator or all options?
– catalogues searching different fields
– catalogues searching fields in different ways
• Standardisation
– profiles, e.g. Bath Profile
– work on index standardisation
22
Response times
• Improved access to resources
– benefits end-user and library staff
BUT
– impacts on local catalogue
– over-large result sets
– duplication of material
• Response times
– impact on local catalogue searcher
– impact on virtual catalogue searcher
23
Response time test
• Hourly search for ‘Austen’
– record time taken to obtain search result
– does not include record collection or result processing
• Number of searches responding
– c.90% within 2 seconds
– c.4% within 4-27 seconds
• Overall response time governed by slowest catalogue
– Timeouts for slow-to-respond or non-responding catalogues
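The last point can be put as simple arithmetic: without a time-out the user waits for the slowest catalogue; with one, the wait is capped. A minimal sketch, in which the function name and the sample timings are illustrative assumptions rather than measured data:

```python
def overall_response_time(times_s, timeout_s=None):
    """Time the user waits for a combined result.

    `times_s` holds each catalogue's response time in seconds, with None
    for a catalogue that never responds.
    """
    if timeout_s is None:
        # No time-out: a single non-responder blocks the search indefinitely.
        if any(t is None for t in times_s):
            return float("inf")
        return max(times_s)
    # With a time-out, slow or silent catalogues cost at most `timeout_s`.
    return max(timeout_s if t is None else min(t, timeout_s) for t in times_s)


sample = [0.4, 1.2, 27.0]  # illustrative timings only
print(overall_response_time(sample))                 # 27.0
print(overall_response_time(sample, timeout_s=4.0))  # 4.0
```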
24
Restricted searches
• Should all searches be sent to all catalogues?
– control where searches are sent initially
– pre-defined search groups - by location/subject?
• Better to deal with large result sets through ranking and/or sorting?
– which brings us back to response times…
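Pre-defined search groups could be as simple as a lookup table consulted before any connection is made. The group and catalogue names below are hypothetical, as is the `select_targets` helper:

```python
# Hypothetical search groups, by location or subject.
SEARCH_GROUPS = {
    "central-london": ["library-a", "library-b"],
    "medicine": ["library-b", "library-c"],
}

ALL_CATALOGUES = ["library-a", "library-b", "library-c", "library-d"]


def select_targets(groups):
    """Return the catalogues to search, without duplicates.

    With no group chosen, the search goes to every catalogue.
    """
    if not groups:
        return list(ALL_CATALOGUES)
    targets = []
    for group in groups:
        for catalogue in SEARCH_GROUPS[group]:
            if catalogue not in targets:  # a catalogue may sit in several groups
                targets.append(catalogue)
    return targets


print(select_targets(["central-london", "medicine"]))
# ['library-a', 'library-b', 'library-c']
```

Restricting the targets up front reduces both the load on local catalogues and the size of the combined result set.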
25
Summary & what next?
• JAFER tests - middleware works
• Enables distributed catalogues to be ‘plugged into’ the IE
• Dynamic resource selection is technically feasible
• Clump services interested
• Further investigations:
– Response-time tests
– Results processing
26
Further details
• Reports on the project website: http://ccinterop.cdlr.strath.ac.uk/documents.htm