Page 1
>>> Korean BioInformation Center
>>> KRIBB Korea Research Institute of Bioscience and Biotechnology
GS2PATH:
Linking Gene Ontology
and Pathways
Jin Ok Yang, Korean BioInformation Center
6th InCoB 2007
Page 2
KOBIC (Korean BioInformation Center)
• The national bioinformatics center of Korea
• Integration of diverse biological information: genome information, biodiversity information, bioresource information
• Bioinformatics training
• International exchange program
• Collaborative development of bioinformatic tools: Bioportal (BioWiki), Biopipeline (BioWorkflow engine)
Page 3
BioWiki
Wiki
• a web technology that enables anyone to create and update website content
• suited for developing online knowledge bases (e.g., Wikipedia)
BioWiki
• To adopt the wiki paradigm in biology
• Collaborative development of biological knowledge bases
• BioWiki Contest (http://biowiki.net)
Page 4
BioPipe (http://www.biopipe.net)
Design View
Ontology View
Monitoring View
Toolbar
Drag the module from the list and drop it into the design view.
• BioWorkFlow Engine
• No installation required
• Drag & Drop, and then Connect
• BioPipe Contest!!
– Aug 15th ~ Sep 20th
– Open, free, Web 2.0
Page 5
Page 6
Background
• Efforts to analyze functional relationships among gene sets using GO terms and pathways
• Gene Ontology (GO) term-based analysis: analysis focused on function
• GO term-related pathways: more useful information
How do you interpret the gene set?
GO & Pathways
Page 7
Gene set enrichment
Enrichment test
• A test to investigate which specific GO terms the given gene set is enriched for
• The P-value for each GO term is calculated using the hyper-geometric probability
Gene set enrichment
• Derives its power by focusing on gene sets, that is, groups of genes that share a common biological function, chromosomal location, or regulation
• Evaluates microarray data at the level of gene sets, which are defined based on prior biological knowledge
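The hyper-geometric P-value mentioned above can be written out as follows. This is the standard upper-tail form of the test, not a formula taken from the slides. The symbols are assumptions introduced here: N is the number of annotated genes in the organism, K the number of those genes annotated to the GO term, n the size of the submitted gene set, and k the number of genes in the set annotated to the term.

```latex
P(X \ge k) \;=\; \sum_{i=k}^{\min(n,\,K)}
  \frac{\binom{K}{i}\,\binom{N-K}{\,n-i\,}}{\binom{N}{n}}
```

A small P-value means that finding k or more annotated genes in a random set of size n would be unlikely, so the term is reported as enriched.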
Page 8
Introduction: GO
GO databases and tools
• GO terms have mostly been used to analyze data sets to identify significant biological changes
• Pathways can also be exploited to find functional relationships among genes
Page 9
Introduction: Pathways
Page 10
GS2PATH
• A system that finds gene set enrichment in each Gene Ontology (GO) term and maps the part of the gene set annotated to each GO term onto biological pathways (KEGG and BioCarta)
• An integrated search tool for analyzing the functional relationships in gene sets and for providing comprehensive results
Page 11
Features
• Functional relationships between GO terms and pathways
• Hyper-geometric test for gene set enrichment
• Dual search for up- and down-regulated gene sets
• Various filtering options for GO terms: the number of descendant nodes, the evidence of GO terms, and statistical values for the gene set mapped to each GO term
• User-specified coloring of genes on pathways
Page 12
Implementation (1/3)
GS2PATH consists of
• one internal database (mapping database)
• four components: Query Processor, GO Accessor, KEGG Accessor, and BioCarta Accessor
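One way to picture how the four components fit together is a Query Processor that cleans the gene list, hands it to the three accessors, and waits for all of them to answer. The sketch below is only an illustration of that pattern, not the GS2PATH code. The function names, the thread pool, and the stub accessors are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_query(raw_gene_ids, accessors):
    """Hypothetical Query Processor: normalize the query, fan it out, collect results."""
    # Convert the raw query into a clean, de-duplicated gene list.
    genes = sorted({g.strip() for g in raw_gene_ids if g.strip()})
    if not accessors:
        return {}
    # Send the gene list to every accessor and wait for all results.
    with ThreadPoolExecutor(max_workers=len(accessors)) as pool:
        futures = {name: pool.submit(fn, genes) for name, fn in accessors.items()}
        return {name: future.result() for name, future in futures.items()}


# Stub accessors (assumptions): real ones would query GO, KEGG, and BioCarta.
def go_accessor(genes):
    return {"source": "GO", "n_genes": len(genes)}


def kegg_accessor(genes):
    return {"source": "KEGG", "n_genes": len(genes)}


def biocarta_accessor(genes):
    return {"source": "BioCarta", "n_genes": len(genes)}


results = process_query(
    [" geneA", "geneB", "geneA", ""],
    {"GO": go_accessor, "KEGG": kegg_accessor, "BioCarta": biocarta_accessor},
)
```

Running the accessors in parallel is a design choice of this sketch. The slides only state that the Query Processor distributes the query and waits for the results.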
Page 13
Schema of internal mapping DB
Page 14
Architecture
Page 15
Implementation (2/3)
Query Processor
• Receives a user query
• Converts the query into gene-related information
• Distributes it to the other components and waits for their results
GO Accessor
• Retrieves statistical values for mapping the gene set in each GO term to KEGG and BioCarta pathways
• Calculates the P-value using the cumulative hyper-geometric distribution
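The cumulative hyper-geometric P-value can be sketched in a few lines of Python. This is a minimal illustration of the standard upper-tail test using only the standard library, not the GO Accessor's actual code. The function name and the parameter names (population size, genes annotated to the term, gene set size, overlap) are assumptions.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, set_size, overlap):
    """Upper-tail probability P(X >= overlap) of the hyper-geometric distribution.

    population: annotated genes in the organism (background)
    annotated:  background genes annotated to the GO term
    set_size:   genes in the submitted gene set
    overlap:    genes in the set that are annotated to the GO term
    """
    if not (0 <= annotated <= population and 0 <= set_size <= population):
        raise ValueError("annotated and set_size must lie in [0, population]")
    upper = min(set_size, annotated)
    if overlap > upper:
        return 0.0  # more hits than could possibly be drawn
    lower = max(overlap, 0)
    # Sum the probabilities of drawing i annotated genes for i = overlap..upper.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, set_size - i)
        for i in range(lower, upper + 1)
    )
    return favourable / comb(population, set_size)


p = hypergeom_enrichment_pvalue(population=10, annotated=5, set_size=5, overlap=5)
```

With these toy numbers only one of the 252 possible gene sets contains all five annotated genes, so `p` equals 1/252.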
Page 16
Implementation (3/3)
BioCarta and KEGG Accessors
• Retrieve results from the BioCarta and KEGG databases, respectively
• To support user-specified coloring:
– For KEGG, we exploit the web service API (SOAP/WSDL) of KEGG
– BioCarta provides no API for user-defined coloring. Thus, after retrieving the image of a pathway from the BioCarta database, we color the genes in the image on the fly.
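Before genes can be colored on either kind of pathway, each gene needs a color. The sketch below shows one way to derive that mapping for an up- and down-regulated pair of gene sets. It does not call the KEGG or BioCarta services, and the function name, the default colors, and the rule for genes present in both sets are assumptions made for illustration.

```python
def assign_gene_colors(up_genes, down_genes,
                       up_color="red", down_color="blue", both_color="yellow"):
    """Map each gene to a user-chosen color (hypothetical helper)."""
    up, down = set(up_genes), set(down_genes)
    colors = {}
    for gene in up | down:
        if gene in up and gene in down:
            colors[gene] = both_color  # listed in both sets
        elif gene in up:
            colors[gene] = up_color
        else:
            colors[gene] = down_color
    return colors


colors = assign_gene_colors(["geneA", "geneB"], ["geneB", "geneC", "geneC"])
```

The resulting dictionary could then be passed to whichever rendering step draws the pathway, whether a remote coloring service or local image editing.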
Page 17
GO Term based Pathways Analysis
Page 18
Search
• Gene set enrichment test in organism total profile: GO, KEGG and BioCarta
• Single or two-part analysis (up- and down-regulation)
• Pathway viewer for KEGG and BioCarta
Page 19
Input
Database
• GO category: Biological Process, Molecular Function, Cellular Component
• Pathways: KEGG and BioCarta
Organism
• Human, Mouse, Rat, and Yeast
Gene ID list
Page 20
Test
Enrichment test
• P-value: hyper-geometric probability
FDR (False Discovery Rate)
• Adjustment of P-value
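The slides do not say which FDR procedure is used. As an illustration only, the sketch below applies the widely used Benjamini-Hochberg adjustment to a list of P-values. The choice of this procedure and the function name are assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each adjusted value is the raw P-value scaled by the number of tests divided by its rank, then capped so that a smaller raw P-value never receives a larger adjusted value.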
Page 21
Filtering
GO Term
• Evidence
• Slim
• Number of genes in term
• P-value
Pathways: KEGG and BioCarta
• Number of genes in term
• P-value
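The filters listed above can be thought of as simple predicates applied to each result row. The sketch below is a hypothetical illustration, not the GS2PATH implementation. The `TermResult` record, its field names, the 0.05 P-value cutoff, and the example identifiers are assumptions, while the default of 5 genes per term follows the example used later in the slides.

```python
from dataclasses import dataclass


@dataclass
class TermResult:
    term_id: str               # GO term or pathway identifier
    gene_count: int            # genes from the query set mapped to the term
    p_value: float             # enrichment P-value
    evidence_codes: frozenset  # GO evidence codes supporting the mapping


def filter_terms(results, min_genes=5, max_p_value=0.05, allowed_evidence=None):
    """Keep terms that pass the gene-count, P-value, and evidence filters."""
    kept = []
    for r in results:
        if r.gene_count < min_genes:
            continue
        if r.p_value > max_p_value:
            continue
        # Skip the evidence filter when no allowed codes are given.
        if allowed_evidence is not None and not (r.evidence_codes & allowed_evidence):
            continue
        kept.append(r)
    # Report the most significant terms first.
    return sorted(kept, key=lambda r: r.p_value)


rows = [
    TermResult("term_a", 8, 0.001, frozenset({"IDA"})),
    TermResult("term_b", 3, 0.0005, frozenset({"IDA"})),
    TermResult("term_c", 12, 0.20, frozenset({"IEA"})),
    TermResult("term_d", 6, 0.01, frozenset({"IEA"})),
]
kept_ids = [r.term_id for r in filter_terms(rows)]
curated_ids = [r.term_id for r in filter_terms(rows, allowed_evidence=frozenset({"IDA"}))]
```

Here `term_b` is dropped for having fewer than 5 genes and `term_c` for its P-value. Restricting the evidence to IDA additionally removes `term_d`.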
Page 22
Example: microarray clustering data
Part A
Part B
Page 23
Interface
Select Organism
Enter the gene set
Select GO category or Pathways
Page 24
Click
Page 25
Retaining only GO terms having at least 5 genes
Page 26
Page 27
Select customized colors
Page 28
Page 29
Page 30
Genes colored in KEGG and BioCarta
Page 31
Conclusion
Using GS2PATH, users can
• Get integrated Gene Ontology term and pathway information together
• Filter the results with various conditions
• Capture relationships between Gene Ontology terms and pathways
Available at http://array.kobic.re.kr:8080/arrayport/gs2path/