LexEVS 101
Craig Stancl, Rick Kiefer
February 2010
Dec 28, 2015
LexGrid Model Overview
• The LexGrid Model is Mayo’s proposed mechanism for standard storage of controlled vocabularies and ontologies:
• Defines HOW vocabularies should be formatted and represented
• Flexible enough to accurately represent a wide variety of vocabularies and other lexically-based resources
• Defines several different server storage mechanisms and an XML format
• Provides the core representation for all data managed and retrieved through the LexEVS system
• Once the vocabulary information is represented in a standardized format, it becomes possible to build common repositories to store vocabulary content and common programming interfaces and tools to access and manipulate that content.
• Terminologies from widely varying resources such as RRF, OWL, and OBO can be loaded into a single database management system and accessed with a single API.
• The LexGrid model stands alone as a complete terminology model
LexBIG Model Overview
• LexBIG provides references into the data model without requiring resolution of a complete terminology node set or graph.
• As such it functions as a kind of lazy loading mechanism, similar to what can be found in Hibernate.
• Elements of LexBIG that are resolved in a minimal manner can often avoid database calls by referring to a Lucene index, saving response time.
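The lazy-loading idea can be illustrated with a minimal sketch. This is not the LexBIG API: the class `ConceptReference`, the `fake_database_lookup` function, and the sample codes are hypothetical names chosen for illustration. It only shows the general pattern of holding a lightweight reference and fetching the full record on first use.

```python
class ConceptReference:
    """Lightweight handle: holds only a code until details are requested."""

    def __init__(self, code, loader):
        self.code = code
        self._loader = loader    # hypothetical callable: code -> full record
        self._resolved = None    # cache for the resolved record

    def resolve(self):
        # Fetch the full record on first use only; later calls reuse the cache.
        if self._resolved is None:
            self._resolved = self._loader(self.code)
        return self._resolved


calls = []


def fake_database_lookup(code):
    # Stand-in for an expensive database query; records each invocation.
    calls.append(code)
    return {"code": code, "name": "Concept " + code}


# Three references are created, but only one is ever resolved.
refs = [ConceptReference(c, fake_database_lookup) for c in ("C1", "C2", "C3")]
first = refs[0].resolve()
again = refs[0].resolve()
```

Creating the three references costs no lookups; only the reference that is actually resolved triggers one, and the second call reuses the cached record.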
LexGrid, LexBIG and LexEVS
• LexEVS: Optimizing query code that retrieves LexBIG objects.
• LexBIG: How the terminology service looks as objects returned to the user.
• LexGrid: How the terminology service looks in a data base.
[Diagram: LexGrid (database), LexBIG (objects), LexEVS (API)]
• LexEVS uses the LexBIG model in conjunction with the LexGrid model.
LexEVS Environment Architecture
• LexEVS consists of LexGrid Model & Storage, LexEVS Java API, LexEVS Distributed Service and LexEVS caGrid Service.
[Diagram: layered stack, top to bottom: LexEVS caGrid Service; LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]
LexEVS API - Local/Direct
• Now we will discuss the LexEVS API.
[Diagram: LexEVS Java API layered on LexGrid Model & Storage]
LexEVS API - Local/Direct
• The direct/local API consists of LexEVS on a local system (LexEVS installed). The API uses JDBC to query the LexEVS database.
[Diagram: Direct access. LexEVS on a local system (LexEVS install) connects to the database server over JDBC.]
LexEVS API - Local/Direct
• The LexEVS Local Installation is the foundation of the LexEVS System as a whole. All of the other Environments rely on this being available and configured properly. Some characteristics of the Local installation are:
• Java based
• Installed via GUI install program, or command line
• Some indexes (Lucene-based) are held on the local file system
• Includes the LexEVS GUI
• Includes a full set of Administration scripts to maintain the server
• Optionally includes Testing resources, Source Code, JavaDocs, and more…
[Diagram: LexEVS architecture. Labels: Client Application; LexEVS caGrid API; LexEVS caCORE SDK APIs (Java QBE ApplicationService; Web/Grid Service client over SOAP/HTTP/REST); Distributed Java API (RMI); Core Java API; Model Objects; Data Source (LexGrid MySQL DB, Lucene index files).]
LexEVS API - Local/Direct
• In a local environment, an application uses the Java API to access content in LexEVS.
LexEVS API – Distributed / SDK
• Now we will discuss the LexEVS Distributed API.
[Diagram: layered stack, top to bottom: LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]
LexEVS API – Distributed / SDK
• The distributed Java/SDK environment consists of a client system that uses RMI to communicate with a distributed LexEVS server where the LexEVS API is installed. The API uses JDBC to query the LexEVS database.
[Diagram: two configurations. Direct: LexEVS on a local system (LexEVS install) connects to a database server over JDBC. Distributed / SDK: a client system (LexEVS client proxy) connects over RMI to a distributed LexEVS server (LexEVS install), which connects to a database server over JDBC. Other labels: LexBIG API Proxy.]
LexEVS API – Distributed / SDK
• The LexEVS Distributed Installation is actually two services combined as one application.
• Remote Access (via Remote Method Invocation) to the local LexEVS API
• A caCORE SDK Service conforming to all of the caCORE Service Interfaces.
• The key feature of the Distributed environment is that it exposes the full LexEVS API to clients while centralizing the actual vocabulary content in one place. Users share a single set of loaded ontologies instead of maintaining multiple sets for multiple users, which reduces maintenance and increases usability.
• The Distributed Layer is also the first LexEVS environment to employ any type of ontology-based security. It uses Security Tokens to restrict access to licensed ontologies (e.g., MedDRA).
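A token-based restriction of this kind can be sketched as follows. This is an illustration under stated assumptions, not the LexEVS security mechanism: `LICENSED_SCHEMES`, `can_access`, and the token value are hypothetical, and a real deployment would validate the token rather than merely check that one is present.

```python
# Schemes that require a security token (illustrative set only).
LICENSED_SCHEMES = {"MedDRA"}


def can_access(scheme, tokens):
    """Return True if the scheme is unrestricted or a token is supplied for it.

    `tokens` maps a scheme name to the token the client presented.
    Assumption: any non-empty token counts as valid in this sketch.
    """
    if scheme not in LICENSED_SCHEMES:
        return True
    return bool(tokens.get(scheme))


open_ok = can_access("NCI Thesaurus", {})
denied = can_access("MedDRA", {})
granted = can_access("MedDRA", {"MedDRA": "example-token"})
```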
LexEVS API – Distributed / SDK
• Remote Access (via Remote Method Invocation) to the local LexEVS API
• Any method that can be called locally is also available remotely (with the exception of certain administration functionality, which is disabled for security purposes).
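The "everything except administration" rule can be pictured as a forwarding proxy with a block list. The sketch below is a conceptual model only; it does not use RMI, and `LocalService`, `RemoteProxy`, and both method names are hypothetical.

```python
class LocalService:
    """Stand-in for a locally installed service (hypothetical methods)."""

    def get_supported_schemes(self):
        return ["NCI Thesaurus"]

    def remove_scheme(self, name):
        # Administrative operation that should not be reachable remotely.
        return "removed " + name


class RemoteProxy:
    """Forwards calls to the wrapped service, except blocked admin methods."""

    BLOCKED = {"remove_scheme"}

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Invoked only for attributes not found on the proxy itself.
        if name in self.BLOCKED:
            raise PermissionError(name + " is disabled for remote clients")
        return getattr(self._target, name)


proxy = RemoteProxy(LocalService())
schemes = proxy.get_supported_schemes()
```

Query methods pass straight through, while a call such as `proxy.remove_scheme("X")` raises `PermissionError`.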
[Diagram: LexEVS architecture. Labels: Client Application; LexEVS caGrid API; LexEVS caCORE SDK APIs (Java QBE ApplicationService; Web/Grid Service client over SOAP/HTTP/REST); Distributed Java API (RMI); Core Java API; Model Objects; Data Source (LexGrid MySQL DB, Lucene index files).]
LexEVS API – Distributed / SDK
• In a distributed environment, the client application accesses LexEVS content through either the Distributed Java API (RMI) or the caCORE SDK Services, which include REST, SOAP, and RMI interfaces for QBE, HQL, Hibernate Detached Criteria, SDK CQL, and caGrid CQL.
LexEVS API – Distributed / SDK
• A caCORE SDK Service conforming to all of the caCORE Service Interfaces. This includes:
• REST-ful service
• caCORE-SDK SOAP Web Service
• Query By Example (QBE) Java RMI Interfaces
• A Web-based interface to the REST-ful service
LexEVS API – caGrid Service
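• Now we will discuss the LexEVS caGrid Service.
[Diagram: layered stack, top to bottom: LexEVS caGrid Service; LexEVS Distributed / SDK Service; LexEVS Java API; LexGrid Model & Storage]
[Diagram: three configurations. Direct: LexEVS on a local system (LexEVS install) connects to a database server over JDBC. Distributed / SDK: a client system (LexEVS client proxy) connects over RMI to a distributed LexEVS server (LexEVS install), which connects to a database server over JDBC. caGrid: a client system (LexEVS client proxy) connects over the Grid to a caGrid host server (LexEVS proxy), which connects over RMI to a distributed LexEVS server (LexEVS install), which connects to a database server over JDBC. Other labels: LexBIG API Proxy.]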
LexEVS API – caGrid Service
The caGrid Service consists of a client system, a caGrid host server, a distributed LexEVS server, and a database server.
LexEVS API – caGrid Service
• LexEVS has two deployed caGrid Services, one Analytical Service and one Data Service. Both are available and discoverable through the caGrid Portal/Index Service.
• Analytical Service
• Exposes the LexEVS API in much the same way as the Local and Distributed Environments do, except as a Grid Service. A user may again use familiar interfaces (LexBIGService, CodedNodeSet, CodedNodeGraph, etc.) to interact with the Analytical Grid Service.
• Data Service
• The Data Service simply exposes the LexGrid model as a caGrid Data Service. Like any standard caGrid Data Service, it is queried with CQL.
[Diagram: LexEVS architecture. Labels: Client Application; LexEVS caGrid API; LexEVS caCORE SDK APIs (Java QBE ApplicationService; Web/Grid Service client over SOAP/HTTP/REST); Distributed Java API (RMI); Core Java API; Model Objects; Data Source (LexGrid MySQL DB, Lucene index files).]
LexEVS API – caGrid Service
• In a grid services environment, the client application calls the grid service interfaces, which in turn call the distributed Java API to access content in LexEVS.
Choosing an Environment
• LexEVS Environments – Which one to use?
Choosing the right Environment for your needs is important. Each Environment adds complexity and maintenance to the system. Performance is also a factor, because each added Environment adds overhead.
• Local
• Best performance, easiest installation. Use this when performance is critical and there isn’t a need to directly expose the LexEVS API to other users.
• Distributed
• Use this to directly expose the LexEVS API to multiple users while sharing only one set of loaded ontologies. The RMI overhead decreases performance slightly compared with the Local Environment. This Environment is also required if caCORE SDK-like functionality is needed.
• Grid
• The most complex to set up. Use this if users need a functioning caGrid node. This adds another layer of overhead, so performance is impacted the most in this Environment.
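The guidance above can be condensed into a small decision helper. This is only a sketch of the slide's reasoning: the function name and its three boolean parameters are hypothetical, and real deployments may weigh other factors.

```python
def choose_environment(needs_grid_node, shares_api_with_users, needs_cacore_sdk):
    """Pick the simplest Environment that satisfies the stated needs.

    Checks run from the most demanding requirement to the least, because
    each added layer brings more setup effort and more overhead.
    """
    if needs_grid_node:
        return "Grid"
    if shares_api_with_users or needs_cacore_sdk:
        return "Distributed"
    return "Local"


single_user = choose_environment(False, False, False)
shared = choose_environment(False, True, False)
grid = choose_environment(True, True, False)
```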
Services Overview
The LexEVS Service is designed to run standalone or as part of a larger network of services. It is comprised of four primary subsystems:
• Service Manager
• Service Metadata
• Query Service
• Extensions
Services: Service Manager
• LexEVS Service - Service Manager
The service manager provides a centralized access point for administrative functions, including write and update access for a service's content. For example, the service manager allows new coding schemes to be validated and loaded, existing coding schemes to be retired and removed, and the status of various coding schemes to be updated and changed.
Services: Metadata Service
• LexEVS Service – Metadata Service
The Service Metadata provides external clients with information about the vocabulary content (e.g. NCI Thesaurus) and appropriate licensing information.
Services: Query Operations
• LexEVS Service - Query Operations
The Query Operations provide numerous functions for querying and traversing vocabulary content.
The Query Service is comprised of:
• Lexical Operations
• Graph Operations
• Metadata Operations
• History Operations
Query Service: Lexical Set Operations
• Query Service - Lexical Set Operations
• Lexical Set Operations provide methods to return lists or iterators of coded entries. Supported query criteria include the application of match/filter algorithms, sorting algorithms, and property restrictions. Support is also provided to resolve the union, intersection, or difference of two node sets.
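The match, sort, and set-combination ideas can be shown with plain Python sets. This is a conceptual sketch, not the LexEVS query API: the `entries` data, the codes, and the `match` helper (a simple case-insensitive "contains" filter) are made up for illustration.

```python
# Toy coded entries: code -> designation (illustrative data only).
entries = {
    "C1": "Heart",
    "C2": "Heart valve",
    "C3": "Liver",
    "C4": "Liver cell",
}


def match(text):
    """Return the set of codes whose designation contains the text."""
    needle = text.lower()
    return {code for code, name in entries.items() if needle in name.lower()}


heart = match("heart")
valve_or_cell = match("valve") | match("cell")   # union of two node sets
both = heart & valve_or_cell                     # intersection
only_heart = heart - valve_or_cell               # difference
sorted_names = sorted(entries[code] for code in heart)
```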
Query Service: Graph Operations
• Query Services - Graph Set Operations
• Graph Operations support the subsetting of concepts according to relationship and distance, identification of relation source and target concepts, and graph traversal. Additional operations include enumeration and traversal of concepts by relation, walking of directed acyclic graphs (DAGs), enumeration of source and target concepts for a relation, and enumeration of relations for a concept.
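Subsetting by relationship and distance can be sketched as a bounded breadth-first walk. The edge list, relation names, and the `targets_within` function below are hypothetical and stand in for a real terminology graph; this is not the CodedNodeGraph interface.

```python
from collections import deque

# (source, relation, target) edges; illustrative data only.
EDGES = [
    ("Heart valve", "part_of", "Heart"),
    ("Heart", "part_of", "Cardiovascular system"),
    ("Cardiovascular system", "part_of", "Body"),
    ("Heart", "is_a", "Organ"),
]


def targets_within(start, relation, max_distance):
    """Walk one relation breadth-first, up to max_distance hops from start.

    Returns a dict mapping each reached concept to its distance.
    """
    found = {}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_distance:
            continue  # do not expand beyond the distance limit
        for src, rel, tgt in EDGES:
            if src == node and rel == relation and tgt not in found:
                found[tgt] = dist + 1
                queue.append((tgt, dist + 1))
    return found


near = targets_within("Heart valve", "part_of", 2)
kinds = targets_within("Heart", "is_a", 1)
```

With a limit of two hops along `part_of`, the walk from "Heart valve" reaches "Heart" and "Cardiovascular system" but stops before "Body".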
Query Service: Metadata Operations
• Query Services - Metadata Operations
• Metadata Operations allow for the query and resolution of registered code system metadata according to specified coding scheme references, property names, or values.
Query Service: History
• Query Services - History
• History provides vocabulary-specific information about concept insertions, modifications, splits, merges, and retirements when supplied by the content provider.
Services: Extensions
• Extensions
The Extensions component provides a mechanism to extend the specific service functions, such as Loaders, or re-wrap specific query operations into convenience methods.
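One common way to build such an extension point is a registry that maps a name to a pluggable function. The sketch below assumes that pattern; it is not the LexEVS extension mechanism, and `LOADERS`, `register_loader`, `load_obo`, and `load` are hypothetical names. The toy loader reads only the `name:` lines of OBO-style text.

```python
LOADERS = {}  # format name -> loader function


def register_loader(fmt):
    """Decorator that registers a loader function under a format name."""
    def wrap(func):
        LOADERS[fmt] = func
        return func
    return wrap


@register_loader("OBO")
def load_obo(text):
    # Toy parser: collects the value of every "name:" line.
    return [
        line.split(":", 1)[1].strip()
        for line in text.splitlines()
        if line.startswith("name:")
    ]


def load(fmt, text):
    """Dispatch to whichever loader is registered for the format."""
    if fmt not in LOADERS:
        raise ValueError("no loader registered for " + fmt)
    return LOADERS[fmt](text)


names = load("OBO", "id: X:1\nname: heart\n")
```

New formats are added by registering another function; the `load` entry point itself does not change.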
For More Information…
• Vocabulary Knowledge Center Wiki
• https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/Main_Page
• Vocabulary Knowledge Center Forums
• https://cabig-kc.nci.nih.gov/Vocab/forums/
• Vocabulary Knowledge Center eMail