Union Catalog Architecture Tsach Moshkovits, Development Team Leader Olybris, Ex Libris Seminar 2005 Kos, April 2005

Dec 13, 2015


Linette Martin

Union Catalog Architecture

Tsach Moshkovits, Development Team Leader

Olybris, Ex Libris Seminar 2005

Kos, April 2005


Overview

The Union Catalog is a sophisticated mechanism that supports the integration of disparate libraries into a single environment.

By environment, we mean a unified user view, rather than a single database or a merged index.


Overview

The following will be discussed in this session:

Union catalog structure
Union catalog vs. Unified catalog
Equivalency
Merge


A Unified Catalog

Usually, a union catalog is a catalog in which all equivalent records are merged into one new record. In this scenario, the original records are not saved, and the index is built on the merged version of the records.

Obviously, the merged record must include information about its different parts to allow navigation from the record to remote resources.


Unified Catalog Drawbacks

Match and Merge is performed at load time, record by record. This is a slow process when additional resources are added.

A new resource may not be available until the slow load process is completely finished.

Updating a record is complex, since it may require more than just updating its merged record. This is because the equivalence relation is not necessarily transitive.


Unified Catalog Drawbacks

Merging becomes even more problematic if the merge algorithm does not preserve all data from every source record. In such a case, any match and merge process must re-access all remote resources to retrieve all original records.

It is also impossible to update the unified catalog with a standard Cataloging GUI.


Unified Catalog Structure – Virtual Approach

[Diagram: Contributors feed the ALEPH Union Catalog in three stages. A: Import Load / Catalog (New/Update/Delete), with the Original Records and Indices. B: Create Equivalence, with the Equivalence Table (Z120). C: Merge “Just in Time”.]


Union Structure – Level A

Records are stored as distinct entities in the database.

Records can be loaded from an external resource or cataloged with the ALEPH Cataloging module.

Records from an external resource can hold an identifier to the external resource to allow simple updating or navigation to an external resource.

Indices are created using the standard ALEPH indexing scheme.


Union Structure – Level B

An Equivalence table is created by mapping each record to its equivalent records.

The equivalence relation is not necessarily transitive.

This table can be recreated any time, leaving the records intact.
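As a rough illustration of Level B, the sketch below builds an equivalence table as a plain mapping from each record to the set of records it matches. The function name, the record identifiers, and the pairwise test are all hypothetical and only stand in for the real match program. The sample data shows why the relation need not be transitive: A matches B and B matches C, yet A and C do not match.

```python
def build_equivalence_table(record_ids, is_equivalent):
    """Map every record id to the set of ids it is equivalent to (itself included)."""
    table = {rid: {rid} for rid in record_ids}
    for a in record_ids:
        for b in record_ids:
            if a != b and is_equivalent(a, b):
                table[a].add(b)
    return table


# Hypothetical pairwise match results: A~B and B~C, but not A~C.
matching_pairs = {frozenset(("A", "B")), frozenset(("B", "C"))}

table = build_equivalence_table(
    ["A", "B", "C"],
    lambda a, b: frozenset((a, b)) in matching_pairs,
)
```

Because the table is derived only from the stored records, it can be thrown away and rebuilt without touching them.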


Union Structure – Level C

Result sets will be de-duplicated to contain only one record per group of equivalents.

Browse lists will de-duplicate their counters to count only one record per group of equivalents.

User View uses on-the-fly Merge to present a single record that is built from a group of equivalents.

The Merge algorithm can vary from user to user.
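The de-duplication described above can be sketched as a single pass over a result set. This is a minimal illustration, not the ALEPH implementation: the function name and the dictionary-of-sets shape of the equivalence table are assumptions.

```python
def deduplicate(result_ids, equivalence_table):
    """Keep one record per group of equivalents, preserving result order."""
    seen = set()
    unique = []
    for rid in result_ids:
        if rid in seen:
            continue
        unique.append(rid)
        seen.add(rid)
        # Everything equivalent to the kept record is suppressed from here on.
        seen.update(equivalence_table.get(rid, set()))
    return unique


# Hypothetical table: records 1 and 2 are equivalent, record 3 stands alone.
sample_table = {1: {1, 2}, 2: {1, 2}, 3: {3}}
deduplicated = deduplicate([1, 2, 3], sample_table)
browse_count = len(deduplicated)
```

A browse-list counter would then report `browse_count` rather than the raw number of hits.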


Virtual Approach Advantages

It is simple to update a record by unlinking it from the Equivalence table and marking it as “New.” This action breaks all existing connections in the group.

A new record is simply inserted as equivalent only to itself.

In all cases, the data of each record stays intact in the database.
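One possible reading of the update and insert steps is sketched below. The function names, the dictionary-of-sets table, and the `new_flags` set are hypothetical. In particular, this sketch only removes the links that involve the updated record; the slide's wording could also mean that the whole group is dissolved.

```python
def unlink_record(rid, equivalence_table, new_flags):
    """Detach an updated record from its group and flag it for re-evaluation."""
    for other, group in equivalence_table.items():
        if other != rid:
            group.discard(rid)
    equivalence_table[rid] = {rid}  # equivalent only to itself again
    new_flags.add(rid)              # picked up later by the equivalency job


def insert_record(rid, equivalence_table, new_flags):
    """A new record starts out equivalent only to itself."""
    equivalence_table[rid] = {rid}
    new_flags.add(rid)


# Hypothetical group of three equivalent records.
eq_table = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {1, 2, 3}}
pending = set()
unlink_record(2, eq_table, pending)
insert_record(4, eq_table, pending)
```

Neither function touches the record data itself, which matches the point that records stay intact in the database.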


Virtual Approach Advantages

A separate job runs on all equivalency tables marked as “New.” The job ensures that records in a group are evaluated for their real equivalency.

It takes no longer to load external resources here than it does to load and index in ALEPH.


Virtual Approach Advantages

In the worst case, between the time a record is updated, inserted, or deleted and the time its equivalency entries are (re)created, the group of equivalent records appears as non-equivalent.

There is 100% uptime.


Virtual Approach Advantages

The same uptime considerations apply if the match algorithm is to be changed.

Changing the merge algorithm has absolutely no effect, since it is executed “just in time.”


Equivalency Table Creation

An equivalency table is created for each record in the database, and points to itself.

Pool selection:

The equivalency search is minimized to a certain number of candidates. This is usually done on a direct index, such as ISBN, ISSN, or LCCN, and is therefore relatively fast.

If the number of candidates exceeds a certain limit, the record itself will be considered as the only candidate.
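Pool selection as described above might look roughly like the following. The record layout, the `direct_index` mapping from `(key, value)` to record ids, and the default limit of 50 are assumptions made for illustration; the real candidate program is table-defined.

```python
def select_candidates(record, direct_index, max_candidates=50):
    """Collect candidate ids that share a direct identifier with the record."""
    candidates = {record["id"]}
    for key in ("isbn", "issn", "lccn"):
        value = record.get(key)
        if value:
            candidates |= direct_index.get((key, value), set())
    if len(candidates) > max_candidates:
        # Too many hits to compare: fall back to the record alone.
        return {record["id"]}
    return candidates


# Hypothetical direct index and record.
sample_index = {("isbn", "0123456789"): {1, 2, 3}}
sample_record = {"id": 1, "isbn": "0123456789"}
pool = select_candidates(sample_record, sample_index)
small_pool = select_candidates(sample_record, sample_index, max_candidates=2)
```

The lookup is a direct key access per identifier, which is why this stage stays fast even for large databases.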


Equivalency Table Creation

Final match:

The equivalent records from the pool are found.

Matching and conflicting fields are searched.

Matching adds a positive weight, while conflicts add a negative weight.

The total weight is checked against a threshold.


Equivalency Table Creation

When both stages are complete, each record has a Z120 record, holding the numbers of all equivalent records.

Z120 is never empty. It holds the record’s own number if no equivalencies are found.

Both the pool selection program and the match program are table-defined, not hard-coded.


Merge

When a user wants to view a record, a merge is done on all the records in its equivalency table, combining them into a single display.

No merged record actually exists in the database. This is a virtual display created on request.


Merge

A merged record display is built by taking the “basic” fields from the preferred record and adding other fields from each of its equivalent records.

The preferred record is selected by assigning weights to all the equivalent records based on table-defined criteria, and the top weight wins.

The merge program is also table-defined.
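A minimal sketch of such an on-the-fly merge follows. Records are modelled as lists of `(tag, value)` pairs, and the set of “basic” tags is purely an assumption; in ALEPH the merge program is table-defined and the actual field selection may differ.

```python
def merge_for_display(records, preferred_id,
                      basic_tags=frozenset({"LDR", "100", "245", "260"})):
    """Build a display-only merged record; nothing is written to the database."""
    preferred = records[preferred_id]
    # Basic fields come from the preferred record only.
    merged = [field for field in preferred if field[0] in basic_tags]
    ordered = [preferred] + [rec for rid, rec in records.items() if rid != preferred_id]
    # Other fields are collected from every equivalent record, skipping exact repeats.
    for record in ordered:
        for field in record:
            if field[0] not in basic_tags and field not in merged:
                merged.append(field)
    return merged


# Hypothetical group of two equivalent records.
group = {
    1: [("245", "Title A"), ("650", "Maps")],
    2: [("245", "Title B"), ("650", "Maps"), ("852", "Library Two")],
}
display = merge_for_display(group, preferred_id=1)
```

Since the merged list is rebuilt on each request, changing `basic_tags` or the preferred record changes the display immediately, without reloading anything.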


Implementation

The union_global_param table defines the programs (algorithms) used for different Union Catalog tasks.

! 1   2 3                    4
!!!!!-!-!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!
USM90 B candidate_prog       union_candidate_cdl
USM90 B match_prog           union_match_cdl
USM90 B preferred_prog       union_preferred_cdl
USM90 B merge_prog           union_merge_aleph
USM90 B normalize_prog       union_normalize_cdl
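To show how a table like this maps tasks to programs, here is a small parser sketch. It assumes whitespace-separated columns and treats lines starting with `!` as header or comment lines; the function name and the interpretation of the columns (library, an unexplained second column, task, program) are assumptions, not documented ALEPH behaviour.

```python
def parse_union_global_param(text):
    """Return {(library, task): program} from a union_global_param-style table."""
    programs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and the "!" ruler/header lines
        library, _col2, task, program = line.split()
        programs[(library, task)] = program
    return programs


sample_table_text = """
! 1   2 3                    4
!!!!!-!-!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!
USM90 B candidate_prog       union_candidate_cdl
USM90 B match_prog           union_match_cdl
USM90 B preferred_prog       union_preferred_cdl
USM90 B merge_prog           union_merge_aleph
USM90 B normalize_prog       union_normalize_cdl
"""

programs = parse_union_global_param(sample_table_text)
```

Looking up `programs[("USM90", "merge_prog")]` then gives the name of the merge program configured for that library.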


Preferred Table – An Example

!!!!!-!!!!!!-!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!!!!!!-!!!
LDR   F05-01 EQUAL      d                         -10
LDR   F17-01 NOT-EQUAL  1,2,3,4,5,7,8,u,z         001
100##        PRESENT                              001
110##        PRESENT                              001
111##        PRESENT                              001
130##        PRESENT                              001

The table defines a value for each field. All values are added according to the specifications in the middle columns.

The record with the highest value is selected as the preferred record.
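The weighting in the example table can be sketched as follows. Several points are assumptions: `F05-01` is read as "leader offset 5, length 1", `001` and `-10` are read as +1 and -10, records are modelled as a dictionary from tag to field text, and `sample_leader` is a hypothetical helper that only builds test leaders.

```python
# Rules transcribed from the example table; the tuple layout is an assumption.
RULES = [
    ("LDR", (5, 1), "EQUAL", {"d"}, -10),
    ("LDR", (17, 1), "NOT-EQUAL", {"1", "2", "3", "4", "5", "7", "8", "u", "z"}, 1),
    ("100", None, "PRESENT", None, 1),
    ("110", None, "PRESENT", None, 1),
    ("111", None, "PRESENT", None, 1),
    ("130", None, "PRESENT", None, 1),
]


def preferred_weight(record, rules=RULES):
    """Sum the weights of all rules the record satisfies."""
    total = 0
    for tag, span, op, values, weight in rules:
        value = record.get(tag)
        if value is None:
            continue  # an absent field satisfies no rule in this sketch
        if op == "PRESENT":
            hit = True
        else:
            offset, length = span
            piece = value[offset:offset + length]
            hit = (piece in values) if op == "EQUAL" else (piece not in values)
        if hit:
            total += weight
    return total


def choose_preferred(records):
    """Return the id of the record with the highest weight."""
    return max(records, key=lambda rid: preferred_weight(records[rid]))


def sample_leader(status, level):
    # 24-character leader with `status` at offset 5 and `level` at offset 17.
    return "00000" + status + "am a22" + "00000" + level + "a 4500"


candidates = {
    "A": {"LDR": sample_leader("n", " "), "100": "Author, Some"},
    "B": {"LDR": sample_leader("d", "7")},
}
preferred = choose_preferred(candidates)
```

In this sample, record A gains one point for its leader value at offset 17 and one for having a 100 field, while record B is penalised for the `d` at offset 5, so A is chosen.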


Match Table – An Example

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!-!-!!!>
date exact match                               + 200
date within 2                                  - 025
date mismatch                                  - 250
short title match                              + 450
full title match                               + 600
full title occur within                        + 350
full title mismatch                            - 600
full title keywords                            + 450
full title keywords order                      + 050
260b exact match                               + 100
260b occur within                              + 100
260b mismatch                                  - 025

The cumulative sum is compared against a defined threshold.


Match Table – An Example

Different fields are compared to determine whether two records match.

For each field, if a match is found, the plus value is added to the total match weight. Otherwise, the minus value is subtracted from the total match weight.

The threshold in the first line defines the weight above which two records are considered a match.
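The weighted comparison can be sketched with the values from the example match table. How each outcome (for instance "date within 2") is detected is not shown in the slides, so the sketch takes the list of observed outcomes as input. The threshold line is not visible in the transcript, so the value 700 used in the usage lines is purely hypothetical.

```python
# Signed weights transcribed from the example match table.
MATCH_WEIGHTS = {
    "date exact match": 200,
    "date within 2": -25,
    "date mismatch": -250,
    "short title match": 450,
    "full title match": 600,
    "full title occur within": 350,
    "full title mismatch": -600,
    "full title keywords": 450,
    "full title keywords order": 50,
    "260b exact match": 100,
    "260b occur within": 100,
    "260b mismatch": -25,
}


def match_weight(outcomes):
    """Add up the signed weights of the comparison outcomes observed for a pair."""
    return sum(MATCH_WEIGHTS[outcome] for outcome in outcomes)


def is_match(outcomes, threshold):
    """Two records match when their total weight is above the threshold."""
    return match_weight(outcomes) > threshold


good_pair = ["date exact match", "full title match", "260b mismatch"]
bad_pair = ["date mismatch", "full title mismatch"]
good_total = match_weight(good_pair)   # 200 + 600 - 25
bad_total = match_weight(bad_pair)     # -250 - 600
```

With the hypothetical threshold of 700, `is_match(good_pair, 700)` holds and `is_match(bad_pair, 700)` does not.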


Workflow Illustration

[Diagram: Resources and Contributors feed a queue of new/updated records. For a single BIB record, the diagram shows the BIB's pool of candidates and the BIB's pool of matched records (= equiv table).]


Two Types of Union Catalogs

“Union Catalog” - On top of Bibliographic + Holdings database

“Union View” - On top of ALEPH 500 administrative database


Bibliographic and Holdings Database

[Diagram: SOURCE 1, SOURCE 2, and SOURCE 3 are connected to the UNION CATALOG. The labels “Normalize records” and “JUMP” appear on the connections.]


Bibliographic and Holdings Database

When records are loaded from various resources, fixes are done to normalize their structure and data.

Checks could be performed prior to the load so that incompatible records are rejected.


Bibliographic and Holdings Database

[Screen illustration with the labels “Jump to original”, “View in union”, and “holdings”.]


ALEPH 500 Database

[Diagram: BIB 1, BIB 2, and BIB 3 together with ADM 1, ADM 2, and ADM 3, labelled “Librarian View” and “Union Catalog - User View”.]


ALEPH 500 Database

Records are managed in standard ALEPH 500 in a single BIB and ADM library, but separately per sub-library or administrative unit.

The Staff User view does not change from an administrative GUI perspective.

A user (patron) has a unified view on the PAC.
