Formulate User Instructions, adapted from caBench-to-Bedside (caB2B) Web Application v3.2
An easy-to-use tool for searching across caGrid
Mukesh Sharma, Washington University School of Medicine
With funding support provided by the National Institute of Standards and Technology
Granular, role-based security with single sign-on to both private and public data resources
• Informaticist: Form and expose queries to find data
– Desktop (thick): Use UML models
– Define queries in terms of RDF triples driven by ontologies (not UML)
• IT systems expert: Configure knowledge resources and data services
– Server-side: Configure resources that use caGrid
– Prepare and support a more flexible method (e.g., SPARQL) so as not to be limited by caGrid
caB2B overview
– caB2B is a tool designed to integrate and analyze diverse biomedical datasets seamlessly. It has been developed to facilitate individual steps of cancer research analyses and reduce the bench-to-bedside barrier.
– caB2B is a caGrid client that permits bench scientists, translational researchers, and clinicians to leverage data services developed under caBIG® through a graphical user interface. Its metadata-based query interface enables end users to search virtually any caGrid data service.
Example Use Cases
• User can query for all pre-cancerous biospecimens from various caTissue instances, such as those at Washington University, Thomas Jefferson University, and Holden Comprehensive Cancer Center.
• User can identify the sample obtained for Glioblastoma multiforme (GBM) and the corresponding CT image information. This query can be performed by querying across caTissue and NBIA using caB2B.
• User can find out if a sample used in an expression profiling experiment is available for a SNP analysis experiment. This query can be performed by querying across caTissue and caArray using caB2B.
• User can search for a particular gene based on the EntrezGeneID and its related information e.g. messenger RNA and protein information from GeneConnect.
caB2B Dependencies
• Availability of data on the caGrid
• Metadata registered in caDSR
• caGrid core services that support security, query federation and metadata
• Performance of the caGrid and data services
caB2B v3.2 Components
• caB2B Server
– Caches metadata (concept codes, class and attribute descriptions, and permissible values) from caDSR and service instances to query
– Permits caB2B server customization by the Administrator
– Allows for model metadata caching and service instance selection
– Permits the Administrator to curate models (frequently used paths, creating categories, defining intermodel joins) in order to facilitate end user queries
• caB2B Client Application
– Allows end users to query virtually any caGrid data service, persist salient results, and examine this information using visualization windows
• caB2B Web Application
– Allows users to query microarray data, imaging data, and biospecimen data available on the caGrid
– Permits keyword searches or the use of highly relevant parameterized queries (saved searches)
Target Audience
• caB2B Administrative Module
Bioinformaticist (the caB2B administrator). Knowledge of UML models/domain models of caBIG tools is required. For activities such as creating a multi-model category, knowledge of Extensible Markup Language (XML) and basic knowledge of executing commands is desired.
• caB2B Client Application
Clinical and translational research scientist. Knowledge of UML models/domain models of caBIG tools is required to create and execute queries using caB2B.
• caB2B Web Application
Clinical and translational research scientist. No special knowledge or skill is required to use the caB2B web application.
caB2B Web Application Capabilities
The caB2B Web Application allows users to:
• Sign in (optional)
• Select the type of data to search
• Select the services (databases) from which data could be retrieved
• Perform a keyword or a parameterized query
• Execute queries offline
• Export data into a CSV file
caB2B Administrative Module
Administrative module features
• Web-based administration.
• UI to search caDSR, retrieve models, and load them into the MDR.
• Discover services dynamically.
• Curate frequently used paths to speed up query building.
• Create categories to bridge the gap between the end user's view of data and the real object-oriented representation.
• Define intermodel joins based on CDE and DEC matches, with manual override to connect underspecified models.
• Automatic cache update between the administrative module and the caB2B server.
• Ability to reconfigure previously configured service instances.
Load Models from caDSR
[Diagram labels: Administrative interface; Select models to load; caB2B MDR; Get all Model names; Fetch selected model]
Discover Services Dynamically
[Diagram labels: Select models to discover services; caB2B MDR; Discover data services by domain model; Get loaded Models; Select service instances]
Curating frequently used paths for connecting classes
Identifying the most relevant paths between classes and storing them.
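The idea of finding candidate paths between two classes can be sketched as a small graph search. This is only an illustration, not caB2B's implementation: the class names and associations are hypothetical, and ranking shorter paths first is an assumption standing in for the administrator's own judgment of relevance.

```python
from collections import deque

# Hypothetical association graph between model classes (illustrative names only).
ASSOCIATIONS = {
    "Participant": ["Registration", "MedicalRecord"],
    "Registration": ["Specimen"],
    "MedicalRecord": ["Registration"],
    "Specimen": [],
}

def simple_paths(graph, source, target):
    """Return every cycle-free path from source to target, shortest first."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == target:
            paths.append(path)
            continue
        for neighbor in graph.get(last, []):
            if neighbor not in path:  # skip classes already on this path
                queue.append(path + [neighbor])
    return paths

def curate(graph, source, target, keep=1):
    """Store only the `keep` shortest paths (assumed proxy for 'most relevant')."""
    return simple_paths(graph, source, target)[:keep]
```

Because the search is breadth-first, shorter paths are found before longer ones, so `curate` can simply keep a prefix of the result.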
caB2B Category
A UML class is a collection of attributes that makes sense technically to developers and bioinformaticians, but may not be intuitive to researchers and clinicians.
Defining intermodel joins using semantic metadata and manual override to consider underspecified models
Connecting two models using the common bridging attributes between them.
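A join on a bridging attribute can be sketched as follows. This is a minimal illustration under stated assumptions, not caB2B's join logic: the record layout, the attribute names `sample_label` and `specimen_name`, and the sample values are all hypothetical, and the two attributes are simply assumed to carry the same semantic meaning in both models.

```python
from collections import defaultdict

# Hypothetical records from two different models (illustrative data only).
MODEL_A_ROWS = [
    {"sample_label": "S-001", "tissue": "brain"},
    {"sample_label": "S-002", "tissue": "lung"},
]
MODEL_B_ROWS = [
    {"specimen_name": "S-001", "array_id": "A-17"},
    {"specimen_name": "S-003", "array_id": "A-18"},
]

def join_on_bridge(left_rows, right_rows, left_attr, right_attr):
    """Pair rows whose bridging attributes hold equal values."""
    index = defaultdict(list)
    for row in right_rows:
        index[row[right_attr]].append(row)
    joined = []
    for left in left_rows:
        for right in index.get(left[left_attr], []):
            joined.append({"left": left, "right": right})
    return joined

matches = join_on_bridge(MODEL_A_ROWS, MODEL_B_ROWS, "sample_label", "specimen_name")
```

Only rows that share a value on the bridging attribute appear in `matches`; rows without a counterpart in the other model are dropped.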
caB2B End User Client
caB2B Client Features
The end user client is a Java application that enables end users to query for and persist data available on the caGrid. The end user client offers the following features:
• caGrid-based authentication of users. Anonymous login for users without a grid account.
• The query component consists of a diagrammatic view.
• The diagrammatic viewer allows the user to create a directed acyclic graph of the query that is to be executed and also helps the user to connect two or more classes to be searched.
• User-based access control for experiments and saved queries. The experiments and queries saved by a user are visible only to that user. "My Experiments" and "My Search Queries" menus on the home page dashboard give easy access to the user's experiments and queries.
• Category popularity to display the most used categories. The "Popular Categories" menu on the home page dashboard now displays categories searched by all caB2B users in descending order of popularity.
• User override of administrator-defined service instances. The user can change the service instances configured by the administrator without using the administrative module, either through the "My Settings" link on the home page dashboard or from the third step of the search data wizard.
• Read-only view of DCQL. The DCQL that will be executed for a particular query is available for review from the third step of the search data wizard.
• Grouping of query results by service instance. The results obtained for a query can be narrowed down to those obtained from a particular service instance.
• Queries generated/executed can be saved.
• The data obtained from the query may be saved as a ‘virtual experiment’ and analyzed further.
• Saved data may be filtered to generate a custom data view.
• The end user may also visualize data in the experiment by using various graphical components.
Diagrammatic viewer
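Since the diagrammatic viewer has the user build a directed acyclic graph of the query, a simple way to picture the constraint is a check that the links between classes contain no cycle. The sketch below is illustrative only and is not taken from caB2B: the function name, the edge-list representation, and the class names are assumptions. It relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

def execution_order(query_edges):
    """Return the classes in dependency order, or None if the links form a cycle.

    query_edges is a list of (source_class, target_class) links, a hypothetical
    stand-in for the connections drawn in the viewer.
    """
    predecessors = {}
    for source, target in query_edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None

# Hypothetical query linking three classes in a chain.
chain = [("Participant", "Specimen"), ("Specimen", "Image")]
order = execution_order(chain)

# Adding a link back to the start makes the graph cyclic.
cyclic = execution_order(chain + [("Image", "Participant")])
```

A valid query graph yields an ordering in which every class appears after the classes that link to it; a cyclic graph is rejected.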
My Search Queries, Popular categories and My Experiments menus on home page for easy access
User can override administrator-defined service instances from “My Settings” or from the third step of the query
The DCQL that will be executed for a particular query can be viewed
Grouping of query results by service instances.
Available results can be grouped by service instance to view results from the service of interest.
348 Samples filtered to view only the results from
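Grouping results by service instance amounts to bucketing each record under the service it came from. The following is a minimal sketch, not caB2B code: the `service_url` field, the example URLs, and the record contents are hypothetical.

```python
from collections import defaultdict

# Hypothetical query results, each tagged with the service that returned it.
RESULTS = [
    {"service_url": "https://site-a.example.org/service", "sample": "S-001"},
    {"service_url": "https://site-b.example.org/service", "sample": "S-101"},
    {"service_url": "https://site-a.example.org/service", "sample": "S-002"},
]

def group_by_service(results):
    """Bucket result records by the service instance that produced them."""
    grouped = defaultdict(list)
    for record in results:
        grouped[record["service_url"]].append(record)
    return dict(grouped)

grouped = group_by_service(RESULTS)
# Narrow the view to a single service of interest.
site_a_only = grouped["https://site-a.example.org/service"]
```

Selecting one key from the grouped dictionary corresponds to filtering the result view down to one service instance.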
Scenarios for caB2B Use
• Use GU/WashU instances to access public data available via production caGrid
• Use your institutional instance to access
– Public data available via production caGrid
– Private data available via institutional caGrid (local services)
• Use public/private instance to access
– Private data which only your (or your collaborator’s) caGrid credentials allow access to
caB2B Development and Support
• caB2B Knowledge Center at Washington University Medical School