Association of variations in I kappa B-epsilon with Graves' disease using classical and myGrid methodologies
Peter Li, School of Computing Science, University of Newcastle upon Tyne
Jan 17, 2016
In silico experiments in bioinformatics
Bioinformatics analyses - in silico experiments - workflows
Resources/Services: EMBL, BLAST, Clustal-W, Genscan
Example workflow: Investigate the evolutionary relationships between proteins
Query → Protein sequences → Clustal-W → Multiple sequence alignment
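The example workflow above can be read as a chain of steps in which each step consumes the output of the previous one. The following is a minimal sketch of that idea only: `run_workflow`, `fetch_sequences` and `align` are hypothetical names, the sequences are toy data, and the "alignment" merely pads sequences to equal length as a stand-in for a real Clustal-W call.

```python
from typing import Callable, List, Sequence


def run_workflow(steps: Sequence[Callable], data):
    """Run each step in order, feeding its output to the next step."""
    for step in steps:
        data = step(data)
    return data


def fetch_sequences(query: str) -> List[str]:
    """Hypothetical stand-in for a sequence database query service."""
    toy_database = {"demo": ["MKTAYL", "MKSA"]}  # toy data, not real proteins
    return toy_database.get(query, [])


def align(sequences: List[str]) -> List[str]:
    """Placeholder for Clustal-W: pads with gaps, it does NOT align."""
    width = max((len(s) for s in sequences), default=0)
    return [s.ljust(width, "-") for s in sequences]


if __name__ == "__main__":
    for row in run_workflow([fetch_sequences, align], "demo"):
        print(row)
```

In a real workflow each function would wrap a call to a remote service; the point of the sketch is only that the experiment becomes an explicit, reusable sequence of steps.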
Issues in bioinformatics
Large amounts of distributed data in different databases
Highly heterogeneous
Lack of standards
Many applications
Different algorithms, implementations
Lack of programmatic access
Difficult to perform bioinformatics analyses
e-Science and myGrid
• e-Science for in silico experiments
– Potential solution to problems in bioinformatics
– Using the Grid as a framework
– Bioinformatics resources deployed as Web Services
• myGrid
– Development of middleware to support the performance of in silico experiments in biology
– Investigated the use of myGrid workflow technology in the genetic analysis of Graves’ disease
Graves’ disease
• Autoimmune thyroid disease
• Lymphocytes attack thyroid gland cells causing hyperthyroidism
• An inherited disorder
• Complex genetic basis
• Symptoms:
– Increased pulse rate, sweating, heat intolerance
– goitre, exophthalmos
In silico experiments in Graves’ disease
• Microarray data analysis
• Gene annotation pipeline
• Design of genotype assays for SNP variations
Classical approach to the bioinformatics of Graves’ disease
• Data analysis (microarray): import microarray data into the Affymetrix Data Mining Tool, run analyses and select
• Experiment design to test hypotheses: find restriction sites and design primers by eye for genotyping experiments
• Study annotations for many different genes
• Select a gene and visually examine the SNPs lying within it
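Finding restriction sites by eye, as in the classical approach above, is the kind of step that is easy to automate. The sketch below checks whether each allele of a SNP creates or removes a recognition site in its flanking sequence. The flanking sequences are invented for illustration, the function names are hypothetical, and the motif `CCTC` is used on the assumption that it is the recognition sequence of the enzyme named later in the results; verify it against an enzyme database before relying on it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str) -> list:
    """Return start positions where the motif occurs on either strand."""
    seq, motif = seq.upper(), motif.upper()
    targets = {motif, reverse_complement(motif)}
    size = len(motif)
    return [i for i in range(len(seq) - size + 1) if seq[i:i + size] in targets]


def sites_per_allele(flank5: str, alleles: str, flank3: str, motif: str) -> dict:
    """Map each allele to the motif positions in flank5 + allele + flank3."""
    return {a: find_sites(flank5 + a + flank3, motif) for a in alleles}


if __name__ == "__main__":
    # Hypothetical flanking sequence around a C/A SNP (not the real NFKBIE locus).
    print(sites_per_allele("AGTTGAC", "CA", "TCAGTA", "CCTC"))
```

An allele whose list of positions differs from the other allele's can, in principle, be distinguished by digesting a PCR product with the enzyme, which is the basis of a restriction-based genotyping assay.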
Taverna workflow system
• Used to compose and enact in silico experiments in myGrid
• Freefluo enactor
• Scufl language
• Workbench GUI
– Service browser
– Model explorer for workflow composition
– Graphical view of workflow
• Free and open source: http://taverna.sf.net
Modelling in silico experiments as workflows
• Semantic, syntactic and format typing of data in workflow
• Data has to be filtered, transformed, parsed for consumption by services
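As an illustration of the filtering, transforming and parsing mentioned above, the sketch below takes FASTA text as one service might return it, parses it into records, drops short sequences and re-serialises the rest for the next service. The function names and the length threshold are illustrative assumptions, not part of Taverna or myGrid.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {identifier: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""  # identifier is the first token
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def filter_min_length(records: dict, min_len: int) -> dict:
    """Keep only sequences with at least min_len residues."""
    return {n: s for n, s in records.items() if len(s) >= min_len}


def to_fasta(records: dict) -> str:
    """Serialise records back to single-line FASTA for the next service."""
    return "".join(f">{n}\n{s}\n" for n, s in records.items())


if __name__ == "__main__":
    raw = ">seq1 example protein\nMKT\nAYL\n>seq2\nMK\n"
    print(to_fasta(filter_min_length(parse_fasta(raw), 3)), end="")
```

Small adapter steps like these typically sit between services in a workflow, because the output format of one service rarely matches the input format of the next.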
Annotation Pipeline
Query: GO, MEDLINE, KEGG, SwissProt, InterPro, PDB, BLAST, HGBASE
Results: Differential expression and variations of the I kappa B-epsilon gene
Mean NFKBIE expression levels:
• Controls: 1.60 +/- 0.11 (SEM)
• GD: 2.22 +/- 0.20 (SEM)
• P = 0.0047 (t-test)
• n = 30
3' UTR SNP – 3948 C/A
– Mnl restriction site
– χ² = 9.1, p = 0.0025, odds ratio = 1.4
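The reported statistics can be sanity-checked with a few lines of standard-library Python. This is a sketch under stated assumptions, not the analysis actually performed: the chi-square test is assumed to have one degree of freedom (as for a 2×2 allele-by-status table), and the t statistic is computed in the unequal-variance form directly from the reported means and SEMs. The group sizes are not given in full, so no t-test p-value is derived here.

```python
import math


def chi2_p_value_1df(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With 1 df, P(X > x) equals P(|Z| > sqrt(x)) for a standard normal Z,
    which is erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def t_from_means_and_sems(mean_a: float, sem_a: float,
                          mean_b: float, sem_b: float) -> float:
    """Unequal-variance t statistic from two means and their standard errors."""
    return (mean_b - mean_a) / math.sqrt(sem_a ** 2 + sem_b ** 2)


if __name__ == "__main__":
    # Values taken from the results above.
    print(f"p (chi2 = 9.1, 1 df) = {chi2_p_value_1df(9.1):.4f}")
    print(f"t (controls vs GD)   = {t_from_means_and_sems(1.60, 0.11, 2.22, 0.20):.2f}")
```

Under the one-degree-of-freedom assumption, the p-value computed from χ² = 9.1 agrees closely with the reported p = 0.0025.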
Comparison between conventional bioinformatics and Taverna workflow approaches
• Advantages
– Graphical composition of experiments
– Automation and speed
– Management of workflow information
– Share and reuse workflows for other diseases
• Issues
– Initial cost of learning/activation energy
– Lack of Web Service interface to required resources
Acknowledgements
• Institute of Human Genetics
– Simon Pearce and Claire Jennings
• School of Computing Science
– Anil Wipat, Matthew Pocock and Keith Hayward
• European Bioinformatics Institute
– Tom Oinn