Semantic Web use cases in outcomes research: Experiences from building a patient repository and developing standards. Chimezie Ogbuji, Metacognition Inc. (Owner). http://metacognition.info/presentations/SW-usecases-outcomes-research.ppt

Apr 01, 2015

Transcript
Page 1

Semantic Web use cases in outcomes research
Experiences from building a patient repository and developing standards

Chimezie Ogbuji
Metacognition Inc. (Owner)

http://metacognition.info/presentations/SW-usecases-outcomes-research.ppt

Page 2

Outline
• Me
• Semantic Web and Semantic Web technologies
  • RDF, GRDDL, OWL, RIF, and SPARQL
• Cleveland Clinic Semantic DB project
  • Content repository
  • Data collection workflow
  • Quality and outcomes reporting
  • Cohort identification
• Use of the system

Page 3

Me and the Semantic Web
• I’ve been developing software using Semantic Web standards since 2001
• Began working on the Cleveland Clinic SemanticDB project in 2003
• Began working in the World Wide Web Consortium (W3C), developing the SPARQL and GRDDL standards in 2007 and 2006, respectively
• I contribute to and maintain several open source software projects related to Semantic Web technologies:
  • RDFLib (https://code.google.com/p/rdflib/)
  • FuXi (https://code.google.com/p/fuxi/)
  • Akamu (https://code.google.com/p/akamu/)

Page 4

The Semantic Web
• The Semantic Web
  • A vision of how the existing WWW can be extended such that machines can interpret the meaning of data involved in protocol interactions
  • A vision of Tim Berners-Lee, inventor of the World Wide Web and founder of the World Wide Web Consortium (W3C)
• Semantic Web technologies / standards
  • A technological roadmap that attempts to realize this vision
  • Layers of W3C standards (“Layer cake”)

Page 5

http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/

Page 6

http://www.bnode.org/blog/2009/07/08/the-semantic-web-not-a-piece-of-cake

Page 7

“Focus” standards
• Resource Description Framework (RDF)
• Gleaning Resource Descriptions from Dialects of Languages (GRDDL)
• SPARQL Protocol and RDF Query Language (SPARQL)
• Web Ontology Language (OWL)

Page 8

RDF
• A framework for representing information in the Web
• Motivation
  • machine-interpretable metadata about web resources
  • mashup of application data
  • automated processing of web information by software agents
• Graph data model (directed, labeled graph)
  • Nodes and links are labeled with URIs
  • Some nodes are not labeled (blank nodes)
  • Links are called RDF sentences or triples

http://www.w3.org/TR/rdf-concepts/
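
The graph data model can be sketched with plain Python tuples. This is only an illustration: the `example.org` URIs, the `BlankNode` class, and the `objects` helper are hypothetical names, and a real application would more likely use an RDF library than hand-rolled sets.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)


@dataclass(frozen=True)
class BlankNode:
    """An unlabeled node: it has local identity but no URI."""
    local_id: int


def new_blank_node():
    return BlankNode(next(_ids))


EX = "http://example.org/vocab#"  # hypothetical vocabulary namespace

patient = "http://example.org/patient/1"  # a node labeled with a URI
operation = new_blank_node()              # a node without a label

# Each triple is one directed, labeled link: (subject, predicate, object)
graph = {
    (patient, EX + "hasOperation", operation),
    (operation, EX + "procedureType", EX + "CoronaryBypass"),
}


def objects(graph, subject, predicate):
    """Follow the links labeled `predicate` that leave `subject`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Walk from the patient to the operation, then to its procedure type
for op in objects(graph, patient, EX + "hasOperation"):
    print(objects(graph, op, EX + "procedureType"))
```

Storing the graph as a set of tuples mirrors the idea that an RDF graph is a set of triples, so adding the same triple twice has no effect.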

Page 9

GRDDL
• A protocol for sowing semantics in structured (XML) web content for harvest
• Vast amount of latent semantics in web documents
• Web content today is primarily built for human consumption

http://www.w3.org/TR/grddl/

Page 10

Faithful Rendition
“By specifying a GRDDL transformation, the author of a document states that the transformation will provide a faithful rendition in RDF of information (or some portion of the information) expressed through the XML dialect used in the source document.”

• Licenses an interpretation of an XML document that is certified by the author

Page 11

Architectural value
• XML is well suited for messaging, data collection, and structural validation
• RDF is well suited for expressive logical assertions, querying, and inference
• RDF graphs can be created, updated, deleted, etc. (managed) using a particular XML vocabulary
  • vocabulary can be specific to a particular purpose rather

• GRDDL facilitates mutually beneficial use of XML and RDF processing and representation
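
The idea of managing an RDF graph through a purpose-specific XML vocabulary can be sketched as follows. GRDDL transformations are typically written in XSLT; here a Python function stands in for the transformation, which is a substitution made for brevity. The XML element names, the `example.org` URIs, and `xml_to_triples` are all hypothetical and are not taken from SemanticDB.

```python
import xml.etree.ElementTree as ET

EX = "http://example.org/vocab#"  # hypothetical vocabulary namespace

# Hypothetical purpose-specific XML vocabulary for a patient record
RECORD_XML = """
<patientRecord id="http://example.org/record/42">
  <operation id="http://example.org/operation/7">
    <procedure>CoronaryBypass</procedure>
  </operation>
</patientRecord>
"""


def xml_to_triples(xml_text):
    """Stand-in for a GRDDL transformation: derive triples from the XML."""
    root = ET.fromstring(xml_text)
    record = root.get("id")
    triples = {(record, "rdf:type", EX + "PatientRecord")}
    for op in root.findall("operation"):
        op_uri = op.get("id")
        triples.add((record, EX + "hasOperation", op_uri))
        triples.add((op_uri, "rdf:type", EX + "Operation"))
        procedure = op.findtext("procedure")
        if procedure is not None:
            triples.add((op_uri, EX + "procedureType", EX + procedure.strip()))
    return triples


triples = xml_to_triples(RECORD_XML)
for triple in sorted(triples):
    print(triple)
```

The XML document remains the thing that is edited and validated, while the derived triples are what gets queried and reasoned over.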

Page 12

SPARQL
• The query language for RDF content
• It operates over an RDF dataset
  • Composed of named RDF graphs and a single RDF graph without a name (the default graph)
• Operationally and structurally similar to SQL
• Many implementations (including the one we used) build on existing relational database management systems
  • Translate SPARQL queries into SQL queries

Elliott et al. A complete translation from SPARQL into efficient SQL. 2009

http://www.w3.org/TR/sparql11-query/
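
To give a feel for how a SPARQL query can be translated into SQL, here is a deliberately naive sketch. It is not the translation described in the cited paper and not the one used in SemanticDB: it handles only a list of triple patterns, assumes a single hypothetical `triples(s, p, o)` table, and uses the variable names directly as column aliases.

```python
import sqlite3


def bgp_to_sql(patterns):
    """Translate (s, p, o) patterns into one SQL query over `triples`.

    Terms starting with '?' are variables; everything else is a constant.
    Each pattern becomes one alias of the table, and a variable shared
    between patterns becomes a join condition.
    """
    first_seen = {}  # variable -> "alias.column" where it first appeared
    where, params = [], []
    for i, pattern in enumerate(patterns):
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in first_seen:
                    where.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
            else:
                where.append(f"{ref} = ?")
                params.append(term)
    select = ", ".join(f"{ref} AS {var[1:]}" for var, ref in first_seen.items())
    tables = ", ".join(f"triples AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {select} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:patient1", "ex:hasOperation", "ex:op7"),
    ("ex:op7", "ex:procedureType", "ex:CoronaryBypass"),
    ("ex:patient2", "ex:hasOperation", "ex:op8"),
    ("ex:op8", "ex:procedureType", "ex:ValveRepair"),
])

# SPARQL: SELECT ?patient ?op WHERE {
#           ?patient ex:hasOperation ?op .
#           ?op ex:procedureType ex:CoronaryBypass }
sql, params = bgp_to_sql([
    ("?patient", "ex:hasOperation", "?op"),
    ("?op", "ex:procedureType", "ex:CoronaryBypass"),
])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The shared variable `?op` is what turns two triple patterns into a self-join, which is the structural similarity to SQL that the slide points at.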

Page 13

OWL
• Language for describing and constraining the semantics of an RDF vocabulary
• Such constraints (often hierarchical) are called ontologies
• An ontology specifies a conceptualization of a particular domain as categories, relationships between them, and constraints on both

• By defining an OWL document for the terms in an RDF graph, additional RDF sentences can be inferred

• Additionally, an RDF graph can be determined to be consistent or inconsistent with respect to the ontology

• Both tasks can be done by a logical reasoning engine
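
Both reasoning tasks can be illustrated with a toy forward-chaining sketch. It covers only two constructs (`rdfs:subClassOf` and `owl:disjointWith`), which is a tiny fraction of what an OWL reasoner handles, and the `ex:` class names and function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
DISJOINT_WITH = "owl:disjointWith"

# Hypothetical ontology: a small class hierarchy plus one constraint
ontology = {
    ("ex:CoronaryBypass", SUBCLASS_OF, "ex:CardiacSurgery"),
    ("ex:CardiacSurgery", SUBCLASS_OF, "ex:Operation"),
    ("ex:Operation", DISJOINT_WITH, "ex:Patient"),
}
data = {("ex:op7", RDF_TYPE, "ex:CoronaryBypass")}


def infer_types(ontology, data):
    """Forward-chain: if x is a C and C is a subclass of D, then x is a D."""
    inferred = set(data)
    changed = True
    while changed:
        changed = False
        for x, p, c in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, q, sup in ontology:
                new = (x, RDF_TYPE, sup)
                if q == SUBCLASS_OF and sub == c and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


def is_consistent(ontology, graph):
    """Inconsistent if any individual belongs to two disjoint classes."""
    types = {}
    for x, p, c in infer_types(ontology, graph):
        if p == RDF_TYPE:
            types.setdefault(x, set()).add(c)
    for a, q, b in ontology:
        if q == DISJOINT_WITH:
            if any(a in classes and b in classes for classes in types.values()):
                return False
    return True


inferred = infer_types(ontology, data)
print(sorted(inferred - data))       # the additional RDF sentences
print(is_consistent(ontology, data))
```

Adding `("ex:op7", "rdf:type", "ex:Patient")` to the data would make `is_consistent` return `False`, because the operation would then belong to two classes declared disjoint.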

Page 14

Semantic Database (SDB)
• Cleveland Clinic’s Heart and Vascular Institute (HVI)
• Challenges:
  • fragmented gathering and storing of clinical research data
  • compartmentalization of medical science and practice
  • clinical knowledge is typically expressed in ambiguous, idiosyncratic terminology
  • problematic for longitudinal patient data that can feasibly span multiple, geographically separated sources and disciplines
• Longitudinal patient record:
  • patient records from different times, providers, and sites of care that are linked to form a lifelong view of a patient’s health care experience

http://www.w3.org/2001/sw/sweo/public/UseCases/ClevelandClinic/

Page 15

Project goals
• Create a framework for context-free data management
• Usable for any domain with nothing (or little) assumed about the domain
• Expert-provided, domain-specific knowledge is used to control most aspects of
  • Data entry
  • Storage
  • Display
  • Retrieval
  • Formatting for external systems

Page 16

Components
• Content repository
  • supports data collection, document management, and knowledge representation for use in managing longitudinal clinical data
  • manages patient record documents as XML and converts them to RDF graphs for downstream semantic processing
• Data collection workflow
  • process of transcribing details of a heart procedure from the EHR into a registry
  • RDF used as the state machine of a workflow engine

Pierce et al. SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting. 2012
Ogbuji. A Role for Semantic Web Technologies in Patient Record Data Collection. 2009

Page 17

Workflow State as RDF Dataset
• Each task is an XML document in a content repository
  • Mirrored into a named RDF graph that shares a web location (the name) with the document
• A (SPARQL) query is dispatched against a workflow dataset to find tasks in particular states or assigned to particular people
• Applications interact with task information and fetch:
  • JSON and XML representations (for client-side web applications)
  • XHTML documents that render as faceted views of a collection of tasks
  • faceted view includes links to subsequent stages in workflow and into other web applications on server
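
The workflow arrangement can be sketched as a dictionary of named graphs, one per task document, filtered by state or assignee. This only illustrates the idea and is not SemanticDB's implementation: the `example.org` URIs, the `state` and `assignedTo` properties, and the function names are hypothetical, and the real system dispatches SPARQL queries rather than Python loops.

```python
import json

EX = "http://example.org/workflow#"  # hypothetical vocabulary
TASKS = "http://example.org/tasks/"
STAFF = "http://example.org/staff/"

# One named graph per task document; the graph name is the document's URL
workflow_dataset = {
    TASKS + "1": {
        (TASKS + "1", EX + "state", EX + "AwaitingAbstraction"),
        (TASKS + "1", EX + "assignedTo", STAFF + "alice"),
    },
    TASKS + "2": {
        (TASKS + "2", EX + "state", EX + "Completed"),
        (TASKS + "2", EX + "assignedTo", STAFF + "alice"),
    },
    TASKS + "3": {
        (TASKS + "3", EX + "state", EX + "AwaitingAbstraction"),
        (TASKS + "3", EX + "assignedTo", STAFF + "bob"),
    },
}


def find_tasks(dataset, state=None, assignee=None):
    """Return the names of task graphs matching the given criteria."""
    matches = []
    for name, graph in sorted(dataset.items()):
        if state is not None and (name, EX + "state", state) not in graph:
            continue
        if assignee is not None and (name, EX + "assignedTo", assignee) not in graph:
            continue
        matches.append(name)
    return matches


def tasks_as_json(dataset, **criteria):
    """A JSON representation such as a client-side application might fetch."""
    return json.dumps({"tasks": find_tasks(dataset, **criteria)})


print(tasks_as_json(workflow_dataset, state=EX + "AwaitingAbstraction"))
```

Because each graph name doubles as the task document's web location, every match is also a link the application can follow to the underlying XML document.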

Page 18

Page 19

Reporting challenges
• Reporting places a heavy burden on institutions to produce data in specific formats with precise definitions
• Definitions vary across reports
  • makes it difficult to use the same source data for all reports
• Institutions are typically forced to manually abstract the data for each report
• This is done separately to conform to the requirements for each report

Pierce et al. SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting. 2012

Page 20

Components: reporting
• Quality and outcomes reporting
  • generate outcomes reports both for internal and external consumption
  • internal reports are generated monthly and external reports are generated quarterly
  • quarterly reports are submitted to the Society of Thoracic Surgeons (STS) Adult Cardiac Surgery National Database and the American College of Cardiology (ACC) CathPCI Database
  • submissions are required for certification

Pierce et al. SemanticDB: A Semantic Web Infrastructure for Clinical Research and Quality Reporting. 2012

Page 21

Page 22

Cohort identification
• SPARQL and RDF datasets are well-suited as infrastructure for a longitudinal patient record data warehouse
• HVI software development team partnered with Cycorp to build a cohort identification interface called the Semantic Research Assistant (SRA)
• Based on the Cyc inference engine
  • a powerful reasoning system and knowledge base with built-in capability for natural language (NL) processing, forward-chaining inference, and backward-chaining inference
  • incorporates Cyc's NL processing to permit a user to compose a cohort selection query by typing an English sentence or sentence fragment

Lenat et al. Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries. 2010.

Page 23

Page 24

RDF dataset warehouse
• CycL to SPARQL
  • domain-specific medical ontologies in conjunction with the Cyc general ontology are used to convert the NL query into a formal representation and then into SPARQL queries
  • SPARQL queries are submitted to the SemanticDB RDF store for execution
• Cleveland Clinic’s registry of 200,000 patient records comprises an RDF graph of roughly 80 million RDF assertions

Page 25

Dataset topology
• An RDF dataset with no default graph and one named graph per patient record (a patient record graph)
• Beyond identifying the cohort, most subsequent query processing happens within a single patient record graph
• In our vocabulary, there are instances of PatientRecord, Operation, Patient, MedicalEvent, HospitalEpisode, etc.
• PatientRecord resources share a URI with their containing graph

Page 26

• GRAPH operator can be used to optimize the search space
• Optimal for the following cohort querying paradigm
  • Constraints in the first part of the query are cross-graph and those in the second part are intra-graph
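
The two-part querying paradigm can be sketched over a dataset that holds one named graph per patient record. This is an illustration under stated assumptions rather than the SemanticDB query engine: the `example.org` URIs, the `procedureType` and `date` properties, and the helper names are hypothetical, and in SPARQL the per-graph scoping would be expressed with the GRAPH operator.

```python
EX = "http://example.org/vocab#"  # hypothetical vocabulary
RDF_TYPE = "rdf:type"
REC1 = "http://example.org/record/1"
REC2 = "http://example.org/record/2"

# No default graph: every triple lives in exactly one patient record graph,
# and each PatientRecord resource shares its URI with the containing graph.
dataset = {
    REC1: {
        (REC1, RDF_TYPE, EX + "PatientRecord"),
        (REC1 + "#op", RDF_TYPE, EX + "Operation"),
        (REC1 + "#op", EX + "procedureType", EX + "CoronaryBypass"),
        (REC1 + "#op", EX + "date", "2010-03-14"),
    },
    REC2: {
        (REC2, RDF_TYPE, EX + "PatientRecord"),
        (REC2 + "#op", RDF_TYPE, EX + "Operation"),
        (REC2 + "#op", EX + "procedureType", EX + "ValveRepair"),
        (REC2 + "#op", EX + "date", "2011-01-05"),
    },
}


def match(graph, s=None, p=None, o=None):
    """Return the triples in one graph that fit a pattern (None = wildcard)."""
    return [(ts, tp, to) for ts, tp, to in graph
            if s in (None, ts) and p in (None, tp) and o in (None, to)]


def identify_cohort(dataset, procedure):
    """Cross-graph step: test every patient record graph against the criterion."""
    return sorted(name for name, graph in dataset.items()
                  if match(graph, p=EX + "procedureType", o=procedure))


def operation_dates(dataset, record):
    """Intra-graph step: read details from a single patient record graph."""
    return sorted(o for _, _, o in match(dataset[record], p=EX + "date"))


cohort = identify_cohort(dataset, EX + "CoronaryBypass")
for record in cohort:
    print(record, operation_dates(dataset, record))
```

Only the cohort step has to look across all graphs. Once a record is selected, every later lookup is confined to that one graph, which is what keeps the search space small.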

Page 27

Use of system
• From 2009 through June of 2011
  • over 200 clinical investigations utilized SemanticDB to identify study cohorts and retrieve appropriate data for analysis
  • studies ranged from relatively simple feasibility assessments to extremely complex investigations of time-related events and competing risks of the patient experiencing a certain outcome after treatment
• Previously, cohort identification and data export queries for studies would have been performed by a skilled database administrator (DBA) interpreting instructions from domain experts
• Using SemanticDB and the SRA, a non-technical domain expert performed most of the queries