LexEVS 101 Craig Stancl Rick Kiefer February, 2010

LexEVS 101

Craig Stancl

Rick Kiefer

February, 2010

LexGrid Model Overview

• The LexGrid Model is Mayo’s proposed mechanism for standard storage of controlled vocabularies and ontologies:

• Defines HOW vocabularies should be formatted and represented

• Flexible enough to accurately represent a wide variety of vocabularies and other lexically-based resources

• Defines several different server storage mechanisms and an XML format

• Provides the core representation for all data managed and retrieved through the LexEVS system

• Once the vocabulary information is represented in a standardized format, it becomes possible to build common repositories to store vocabulary content and common programming interfaces and tools to access and manipulate that content.

• Terminologies from widely varying resources such as RRF, OWL, and OBO can be loaded into a single database management system and accessed with a single API.

• The LexGrid model stands alone as a complete terminology model
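The idea of one common representation behind one API can be illustrated with a toy sketch. This is not the LexGrid model itself, which is far richer; the class names (`Entity`, `CodingScheme`, `Repository`), the scheme names, and the sample codes are all hypothetical and exist only to show vocabularies from different source formats sitting in one store and being queried the same way.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One coded entry (concept) in a vocabulary."""
    code: str
    designation: str  # preferred text for the code


@dataclass
class CodingScheme:
    """A vocabulary held in one common shape, whatever its source format."""
    name: str
    source_format: str  # for example "OWL", "OBO" or "RRF"
    entities: dict = field(default_factory=dict)

    def add(self, code, designation):
        self.entities[code] = Entity(code, designation)


class Repository:
    """A single store for many coding schemes, queried one way."""

    def __init__(self):
        self._schemes = {}

    def load(self, scheme):
        self._schemes[scheme.name] = scheme

    def lookup(self, scheme_name, code):
        return self._schemes[scheme_name].entities[code].designation


# Hypothetical sample content from two different source formats.
repo = Repository()
owl_scheme = CodingScheme("ExampleOWL", "OWL")
owl_scheme.add("C1", "Heart")
obo_scheme = CodingScheme("ExampleOBO", "OBO")
obo_scheme.add("X:0001", "cell")
repo.load(owl_scheme)
repo.load(obo_scheme)
```

Once both schemes are loaded, `repo.lookup(...)` works identically for either one, which is the point the slide makes about common repositories and common programming interfaces.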

LexBIG Model Overview

• LexBIG provides references into the data model without requiring resolution of a complete terminology node set or graph.

• As such it functions as a kind of lazy loading mechanism, similar to what can be found in Hibernate.

• Elements of LexBIG that are resolved in a minimal manner can often avoid database calls by referring to a Lucene index, saving response time.
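The lazy-loading behaviour described above can be sketched in a few lines. This is a generic illustration of the pattern, not LexBIG code: `FakeDatabase` and `ConceptReference` are hypothetical names, and the "index" is simply a value handed to the reference when it is created.

```python
class FakeDatabase:
    """Stand-in for the terminology database; counts how often it is hit."""

    def __init__(self, rows):
        self._rows = rows
        self.calls = 0

    def fetch(self, code):
        self.calls += 1
        return self._rows[code]


class ConceptReference:
    """Lightweight reference holding only the code and the indexed text."""

    def __init__(self, code, indexed_name, db):
        self.code = code
        self.name = indexed_name  # served from the index, no database call
        self._db = db
        self._full = None

    def resolve(self):
        """Fetch the full record on first use, then reuse the cached copy."""
        if self._full is None:
            self._full = self._db.fetch(self.code)
        return self._full


db = FakeDatabase(
    {"C1": {"code": "C1", "name": "Heart", "definition": "A muscular organ."}}
)
ref = ConceptReference("C1", "Heart", db)
```

Reading `ref.name` costs no database call; only `ref.resolve()` does, and only the first time.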

LexGrid, LexBIG and LexEVS

• LexEVS: Optimizing query code that retrieves LexBIG objects.

• LexBIG: How the terminology service looks as objects returned to the user.

• LexGrid: How the terminology service looks in a database.

[Diagram: LexGrid Database; LexBIG Objects; LexEVS API]

• LexEVS uses the LexBIG model in conjunction with the LexGrid model.

LexEVS Environment Architecture

• LexEVS consists of LexGrid Model & Storage, LexEVS Java API, LexEVS Distributed Service and LexEVS caGrid Service.

[Diagram: layer stack, listed top to bottom: LexEVS caGrid Service; LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]

LexEVS API - Local/Direct

• Now we will discuss the LexEVS API.

[Diagram: LexEVS Java API; LexGrid Model & Storage]

LexEVS API - Local/Direct

• The direct/local API consists of LexEVS installed on a local system. The API uses JDBC to query the LexEVS database.

[Diagram: Direct. LexEVS on Local System (LexEVS Install) connected to the Database Server via JDBC]

LexEVS API - Local/Direct

• The LexEVS Local Installation is the foundation of the LexEVS System as a whole. All of the other Environments rely on this being available and configured properly. Some characteristics of the Local installation are:

• Java based

• Installed via GUI install program, or command line

• Some indexes (Lucene-based) are held on the local file system

• Includes the LexEVS GUI

• Includes a full set of Administration scripts to maintain the server

• Optionally includes Testing resources, Source Code, JavaDocs, and more…

[Diagram: LexEVS architecture. Components shown: Client Application; LexEVS caGrid API and Web/Grid Service (SOAP/HTTP/REST); LexEVS caCORE SDK APIs with Java (QBE) and Application Service; Distributed Java API over Java (RMI); Core Java API and Model Objects; Data Source consisting of the LexGrid MySQL DB and Lucene index files]

LexEVS API - Local/Direct

• In a local environment, an application uses the Java API to access content in LexEVS.

LexEVS API – Distributed / SDK

• Now we will discuss the LexEVS Distributed API.

[Diagram: LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]

LexEVS API – Distributed / SDK

• The distributed Java/SDK environment consists of a client system that uses RMI to communicate with a distributed LexEVS server where the LexEVS API is installed. The API uses JDBC to query the LexEVS database.

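[Diagram: two deployments side by side. Direct: LexEVS on Local System (LexEVS Install) connected to a Database Server via JDBC. Distributed / SDK: Client System (LexEVS Client Proxy) connected via RMI to a Distributed LexEVS Server (LexBIG API Proxy, LexEVS Install), which connects to a Database Server via JDBC]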

LexEVS API – Distributed / SDK

• The LexEVS Distributed Installation is actually two services combined as one application.

• Remote Access (via Remote Method Invocation) to the local LexEVS API

• A caCORE SDK Service conforming to all of the caCORE Service Interfaces.

• The key feature of the Distributed environment is that it exposes the full LexEVS API to clients while centralizing the actual vocabulary content in one place. This lets users share a single set of loaded ontologies, instead of multiple sets for multiple users, which reduces maintenance and increases usability.

• The Distributed Layer is also the first LexEVS environment to employ any type of ontology-based security. It uses Security Tokens to restrict licensed ontologies (e.g., MedDRA).
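Token-based restriction of licensed content can be sketched as follows. This is only an illustration of the idea under simple assumptions: `VocabularyServer`, `LicensedContentError`, the scheme names, and the token value are hypothetical, and the actual LexEVS security token mechanism is not shown in these slides.

```python
class LicensedContentError(Exception):
    """Raised when a restricted vocabulary is requested without a valid token."""


class VocabularyServer:
    """Central server holding one shared set of loaded vocabularies."""

    def __init__(self):
        self._content = {}  # scheme name -> content
        self._tokens = {}   # restricted scheme name -> required token

    def load(self, scheme, content, required_token=None):
        self._content[scheme] = content
        if required_token is not None:
            self._tokens[scheme] = required_token

    def get(self, scheme, token=None):
        required = self._tokens.get(scheme)
        if required is not None and token != required:
            raise LicensedContentError(scheme)
        return self._content[scheme]


# Hypothetical content: one open scheme and one licensed scheme.
server = VocabularyServer()
server.load("OpenScheme", {"C1": "Heart"})
server.load("LicensedScheme", {"L1": "Restricted term"}, required_token="secret-token")
```

Unrestricted content is returned to any caller, while the licensed scheme is returned only when the caller presents the matching token.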

LexEVS API – Distributed / SDK

• Remote Access (via Remote Method Invocation) to the local LexEVS API

• Any method that can be called locally is also available remotely (with the exception of certain administration functionality, which is disabled for security purposes).
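A minimal sketch of "everything local is available remotely, except administration" is a proxy that forwards calls and blocks a fixed set of method names. The names below (`LocalService`, `RemoteProxy`, `get_supported_schemes`, `remove_scheme`) are hypothetical stand-ins, and no real RMI transport is involved.

```python
class LocalService:
    """Stand-in for a locally installed service."""

    def get_supported_schemes(self):
        return ["ExampleScheme"]

    def remove_scheme(self, name):  # administrative operation
        return f"removed {name}"


class RemoteProxy:
    """Forwards calls to the local service, except blocked admin methods."""

    BLOCKED = {"remove_scheme"}

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        if name in self.BLOCKED:
            raise PermissionError(f"{name} is disabled for remote clients")
        return getattr(self._target, name)


local = LocalService()
proxy = RemoteProxy(local)
```

Query methods behave the same through `proxy` as through `local`; the administrative method is reachable only on the local object.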

LexEVS API – Distributed / SDK

• In a distributed environment, the client application accesses LexEVS content through either the Distributed Java API (RMI) or the caCORE SDK services, which include REST, SOAP, and RMI interfaces for QBE, HQL, Hibernate Detached Criteria, SDK CQL, and caGrid CQL.

LexEVS API – Distributed / SDK

• A caCORE SDK Service conforming to all of the caCORE Service Interfaces. This includes:

• RESTful service

• caCORE SDK SOAP Web Service

• Query By Example (QBE) Java RMI interfaces

• A Web-based interface to the RESTful service

LexEVS API – caGrid Service

• Now we will discuss the LexEVS caGrid Service.

[Diagram: LexEVS caGrid Service; LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]

[Diagram: three deployments side by side: Direct, Distributed / SDK, and Grid. The Grid deployment shows a Client System (LexEVS Client Proxy), a caGrid Host Server (LexEVS Proxy), a Distributed LexEVS Server (LexEVS Install), and a Database Server. Connection labels: Grid, TCP, RMI, JDBC]

LexEVS API – caGrid Service

• The caGrid Service environment consists of a client system, a caGrid Host Server, a Distributed LexEVS Server, and a Database Server.

LexEVS API – caGrid Service

• LexEVS has two deployed caGrid Services, one Analytical Service and one Data Service. Both are available and discoverable through the caGrid Portal/Index Service.

• Analytical Service: Exposes the LexEVS API in much the same way as the Local and Distributed Environments do, except as a Grid Service. A user may again use familiar interfaces (LexBIGService, CodedNodeSet, CodedNodeGraph, etc.) to interact with the Analytical Grid Service.

• Data Service: Exposes the LexGrid model as a caGrid Data Service. As with any standard caGrid Data Service, CQL queries are used to query the data source.

LexEVS API – caGrid Service

• In a grid services environment, the client application calls the grid service interfaces, which in turn call the distributed Java API to access content in LexEVS.

Choosing an Environment

• LexEVS Environments: Which one to use?

Choosing the right Environment for your needs is important. Each Environment adds complexity and maintenance to the system. Performance is also a factor, because each added Environment adds overhead.

• Local: Best performance, easiest installation. Use this when performance is critical and there is no need to directly expose the LexEVS API to other users.

• Distributed: Use this to directly expose the LexEVS API to multiple users while sharing only one set of loaded ontologies. The RMI overhead decreases performance slightly compared with the Local Environment. This Environment is also required if caCORE SDK-like functionality is needed.

• Grid: The most complex to set up. Use this if users need a functioning caGrid node. It adds another layer of overhead, so performance is impacted the most in this Environment.
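The guidance above can be written down as a small decision helper. This is just one reading of the slide, not an official selection rule; the function and parameter names are hypothetical.

```python
# Each environment sits on top of the previous one, so later entries
# pass through more layers on every call.
LAYERS = ["Local", "Distributed", "Grid"]


def choose_environment(needs_grid_node=False, shares_content=False,
                       needs_cacore_sdk=False):
    """Pick the lightest environment that satisfies the stated needs."""
    if needs_grid_node:
        return "Grid"
    if shares_content or needs_cacore_sdk:
        return "Distributed"
    return "Local"


def layers_traversed(environment):
    """Layers a request passes through in the given environment."""
    return LAYERS[: LAYERS.index(environment) + 1]
```

For example, `choose_environment(needs_cacore_sdk=True)` returns `"Distributed"`, and `layers_traversed("Grid")` lists all three layers, which is why the Grid environment carries the most overhead.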

Services Overview

The LexEVS Service is designed to run standalone or as part of a larger network of services. It is comprised of four primary subsystems:

• Service Manager

• Service Metadata

• Query Service

• Extensions

Services: Service Manager

• LexEVS Service - Service Manager

The service manager provides a centralized access point for administrative functions, including write and update access for a service's content. For example, the service manager allows new coding schemes to be validated and loaded, existing coding schemes to be retired and removed, and the status of various coding schemes to be updated and changed.
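The administrative life cycle described above can be sketched as a small state tracker. The status names (`pending`, `active`, `retired`), the rule that a scheme must be retired before removal, and the trivial validation check are assumptions made for this illustration; the slide does not specify them.

```python
class ServiceManager:
    """Toy administrative entry point tracking coding scheme status."""

    def __init__(self):
        self._status = {}

    def _set(self, name, status):
        if name not in self._status:
            raise KeyError(name)
        self._status[name] = status

    def load(self, name, entities):
        # Assumed validation rule: a scheme must contain at least one entity.
        if not entities:
            raise ValueError("validation failed: coding scheme has no entities")
        self._status[name] = "pending"

    def activate(self, name):
        self._set(name, "active")

    def retire(self, name):
        self._set(name, "retired")

    def remove(self, name):
        # Assumed rule: only retired schemes may be removed.
        if self._status[name] != "retired":
            raise RuntimeError("retire a coding scheme before removing it")
        del self._status[name]

    def status(self, name):
        return self._status.get(name)
```

A typical sequence would be `load`, `activate`, `retire`, then `remove`, with `status` reporting the current state at each step.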

Services: Metadata Service

• LexEVS Service – Metadata Service

The Service Metadata provides external clients with information about the vocabulary content (e.g. NCI Thesaurus) and appropriate licensing information.

Services: Query Operations

• LexEVS Service - Query Operations

The Query Operations provide numerous functions for querying and traversing vocabulary content.

The Query Service is comprised of:

• Lexical Operations

• Graph Operations

• Metadata Operations

• History Operations

Query Service: Lexical Set Operations

• Query Service - Lexical Set Operations

• Lexical Set Operations provide methods to return lists or iterators of coded entries. Supported query criteria include the application of match/filter algorithms, sorting algorithms, and property restrictions. Support is also provided to resolve the union, intersection, or difference of two node sets.
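Filtering, sorting, and combining node sets can be illustrated with a toy class. It is loosely inspired by the node-set idea above but is not the CodedNodeSet interface; `Entry`, `NodeSet`, the method names, and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    code: str
    text: str


class NodeSet:
    """An immutable set of coded entries that can be restricted and combined."""

    def __init__(self, entries):
        self._entries = frozenset(entries)

    def restrict_to_matching(self, substring):
        """Keep entries whose text contains the substring (case-insensitive)."""
        needle = substring.lower()
        return NodeSet(e for e in self._entries if needle in e.text.lower())

    def union(self, other):
        return NodeSet(self._entries | other._entries)

    def intersect(self, other):
        return NodeSet(self._entries & other._entries)

    def difference(self, other):
        return NodeSet(self._entries - other._entries)

    def resolve(self):
        """Return the entries as a list sorted by code."""
        return sorted(self._entries, key=lambda e: e.code)


everything = NodeSet(
    [Entry("C1", "Heart"), Entry("C2", "Heart valve"), Entry("C3", "Lung")]
)
heart = everything.restrict_to_matching("heart")
valve = everything.restrict_to_matching("valve")
lung = everything.restrict_to_matching("lung")
```

Each restriction or set operation returns a new `NodeSet`, and nothing is materialized as a list until `resolve()` is called.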

Query Service: Graph Operations

• Query Services - Graph Set Operations

• Graph Operations support the subsetting of concepts according to relationship and distance, identification of relation source and target concepts, and graph traversal. Additional operations include enumeration and traversal of concepts by relation, walking of directed acyclic graphs (DAGs), enumeration of source and target concepts for a relation, and enumeration of relations for a concept.
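Several of these graph operations can be sketched over a plain list of (source, relation, target) triples. The edge data and function names below are hypothetical and only illustrate the operations named above; they are not the CodedNodeGraph interface.

```python
from collections import deque

# Hypothetical example data: (source, relation, target) triples.
EDGES = [
    ("Heart", "is_a", "Organ"),
    ("Organ", "is_a", "Anatomic Structure"),
    ("Heart valve", "part_of", "Heart"),
]


def targets_of(source, relation):
    """Enumerate target concepts of a relation for one source concept."""
    return [t for s, r, t in EDGES if s == source and r == relation]


def sources_of(target, relation):
    """Enumerate source concepts of a relation for one target concept."""
    return [s for s, r, t in EDGES if t == target and r == relation]


def relations_for(concept):
    """Enumerate the relations a concept takes part in."""
    return sorted({r for s, r, t in EDGES if concept in (s, t)})


def within_distance(start, relation, max_distance):
    """Breadth-first walk: concepts reachable in at most max_distance hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_distance:
            continue  # do not expand beyond the distance limit
        for nxt in targets_of(node, relation):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen
```

`within_distance` covers subsetting by relationship and distance, while the three smaller helpers cover source, target, and relation enumeration.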

Query Service: Metadata Operations

• Query Services - Metadata Operations

• Metadata Operations allow for the query and resolution of registered code system metadata according to specified coding scheme references, property names, or values.
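Querying metadata by coding scheme, property name, or value amounts to filtering records on whichever criteria are supplied. The records and the `query_metadata` function below are hypothetical and made up for illustration only.

```python
# Hypothetical metadata records; the values are invented for this example.
METADATA = [
    {"scheme": "SchemeA", "property": "license", "value": "open"},
    {"scheme": "SchemeA", "property": "format", "value": "OWL"},
    {"scheme": "SchemeB", "property": "license", "value": "restricted"},
]


def query_metadata(scheme=None, property_name=None, value=None):
    """Return the records matching every criterion that is not None."""
    return [
        row for row in METADATA
        if (scheme is None or row["scheme"] == scheme)
        and (property_name is None or row["property"] == property_name)
        and (value is None or row["value"] == value)
    ]
```

Criteria can be combined freely, for example `query_metadata(scheme="SchemeA", property_name="format")`.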

Query Service: History

• Query Services - History

• History provides vocabulary-specific information about concept insertions, modifications, splits, merges, and retirements when supplied by the content provider.
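Concept history can be pictured as a log of dated change events that is filtered by code or by date. The `ChangeEvent` structure, the action names, and the sample events are assumptions for this sketch; the actual history data depends on what each content provider supplies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChangeEvent:
    code: str
    action: str          # "insert", "modify", "split", "merge" or "retire"
    when: date
    reference: str = ""  # for example, the code a concept was merged into


# Hypothetical history log.
HISTORY = [
    ChangeEvent("C1", "insert", date(2008, 1, 15)),
    ChangeEvent("C2", "insert", date(2008, 1, 15)),
    ChangeEvent("C2", "merge", date(2009, 6, 1), reference="C1"),
    ChangeEvent("C2", "retire", date(2009, 6, 1)),
]


def history_for(code):
    """All events for one concept, oldest first (ties keep log order)."""
    return sorted((e for e in HISTORY if e.code == code), key=lambda e: e.when)


def changes_since(cutoff):
    """All events on or after the cutoff date, in log order."""
    return [e for e in HISTORY if e.when >= cutoff]
```

Here `history_for("C2")` shows the concept being inserted, merged into `C1`, and then retired.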

Services: Extensions

• Extensions

The Extensions component provides a mechanism to extend the specific service functions, such as Loaders, or re-wrap specific query operations into convenience methods.
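One common way to support this kind of extension is a registry that maps names to pluggable functions. The sketch below assumes that approach; `ExtensionRegistry`, the extension names, and the pipe-delimited input format are hypothetical and are not taken from LexEVS.

```python
class ExtensionRegistry:
    """Maps extension names to callables such as loaders or convenience methods."""

    def __init__(self):
        self._extensions = {}

    def register(self, name, func):
        if name in self._extensions:
            raise ValueError(f"extension already registered: {name}")
        self._extensions[name] = func

    def get(self, name):
        return self._extensions[name]

    def names(self):
        return sorted(self._extensions)


def load_delimited(text):
    """Hypothetical loader: one 'code|designation' pair per line."""
    return dict(line.split("|", 1) for line in text.strip().splitlines())


def codes_starting_with(entries, prefix):
    """Hypothetical convenience method wrapping a simple code filter."""
    return sorted(code for code in entries if code.startswith(prefix))


registry = ExtensionRegistry()
registry.register("DelimitedLoader", load_delimited)
registry.register("CodesStartingWith", codes_starting_with)
```

Client code looks an extension up by name and calls it, so new loaders or convenience methods can be added without changing the core service.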

For More Information…

• Vocabulary Knowledge Center Wiki

• https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/Main_Page

• Vocabulary Knowledge Center Forums

• https://cabig-kc.nci.nih.gov/Vocab/forums/

• Vocabulary Knowledge Center eMail

[email protected]