Page 1
Issues for metabolomics and systems biology
Douglas Kell
School of Chemistry, University of Manchester,
MANCHESTER M60 1QD, U.K.
[email protected] http://dbkgroup.org/
http://www.mib.ac.uk www.mcisb.org
Page 2
What I can’t do now and would like to
Page 3
Some facts I ‘know’ (i.e. think I can remember…)
• Epidemiologically, statins enhance longevity
• Cholesterol is barely a risk factor when within the normal range of 120-240 mg%
• Statins supposedly act (only) via HMG-CoA reductase to lower cholesterol
• Actually many have (and from the above logically must have) off-target effects
Page 4
More ‘facts’
• Although originating as natural products, many/most statins bear comparatively little structural relationship to them or to each other
• Are there QSAR-type relations between the various off-target effects and the drugs that cause them?
atorvastatin lovastatin
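One common starting point for QSAR-type comparisons is a structural similarity score between drugs. The sketch below computes a Tanimoto coefficient over sets of substructure labels. The feature labels and the names `drug_a` and `drug_b` are placeholders for illustration, not real fingerprints of any statin; in practice the features would come from a cheminformatics toolkit.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two sets of substructure features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(union)

# Placeholder feature sets (hypothetical labels, not derived from real structures)
drug_a = {"f1", "f2", "f3", "f4"}
drug_b = {"f3", "f4", "f5"}

# 2 shared features out of 5 distinct features overall
print(tanimoto(drug_a, drug_b))  # 0.4
```

A low score between two drugs that share an off-target effect would suggest that the effect is not explained by gross structural similarity alone.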
Page 5
The software tool I want would integrate all of those questions by:
• Finding the facts from the literature (and the Web) by reading the articles ‘intelligently’
• Displaying and setting out the facts sensibly
• Allowing the QSARs to be derived directly from the papers, as the structures and substructures would be ‘known’ (or knowable via PubChem, DrugBank etc)
• Classifying/clustering the off-target effects and the papers that described them (via TM and ML)
• Without me having to write any actual code
Page 6
Westerhoff & Palsson NBT 22, 1249-52 (2004)
But despite everything science is in some ways becoming LESS effective in an applied context
Page 7
Declining numbers of drug launches
Leeson & Springthorpe, NRDD 6, 881-890 (2007)
Page 8
Drug Discovery/Development Pipeline
• Multifaceted, complicated, lengthy process
[Figure: pipeline from Idea to Drug, 12-15 years, on a timeline marked 0, 5, 10 and 15 years. Stages: Discovery, Exploratory Development, Full Development. Clinical phases: Phase I, Phase II, Phase III. Activities: Pre-clinical Pharmacology, Pre-clinical Safety, Clinical Pharmacology & Safety. Output: Products. Chemical structures of example compounds not reproduced.]
Peter S. Dragovich, Pfizer
Page 9
Attrition
Kola & Landis, NRDD 3, 711-5 (2004)
Page 10
Issues of attrition
• PK/PD less of an issue in last decade
• Now mostly due to (i) lack of efficacy, (ii) toxicity
• Both problems are underpinned by the fact that drugs are typically first developed on the basis of molecular assays before being tested in the intact system
• These failures turn drug discovery – if it was not already – into a problem of systems biology
Page 11
Nature Rev Drug Disc 7, 205-220 (March 2008)
Page 12
Poor correlation between different artificial membrane (Corti & PAMPA) assays
Corti et al EJ Pharm Sci 7, 354-362 (2006)
Page 13
Poor correlation between Caco-2 cells and artificial membrane (PAMPA) assays
Note axis scales
Balimane et al., AAPS J 8, e1-e13 (2006)
Page 14
Poor relationship between PAMPA permeability and log Ko/w
Corti et al. EJ Pharm Sci 7, 354-362 (2006)
Page 15
Poor relationship between Caco-2 permeability and log Ko/w
Corti et al. EJ Pharm Sci 7, 354-362 (2006)
r2 = 0.097
THESE THEORIES OF DRUG UPTAKE WERE BIOPHYSICAL, ‘LIPID-ONLY’ THEORIES
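For reference, an r2 value like the one quoted above is the coefficient of determination of a straight-line fit. The sketch below shows how it is computed for permeability against log Ko/w. The data points are invented purely for illustration and are not taken from Corti et al.; the function name `r_squared` is likewise an assumption.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares straight-line fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For a single predictor, r2 is the squared Pearson correlation
    return (sxy * sxy) / (sxx * syy)

# Hypothetical values, NOT data from the cited paper
log_kow = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
log_perm = [-5.2, -6.1, -4.9, -6.0, -5.0, -5.8]

print(round(r_squared(log_kow, log_perm), 3))
```

An r2 near 0.1 means that roughly a tenth of the variance in permeability is accounted for by log Ko/w, which is why the relationship is described as poor.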
Page 16
Narcotics (‘general anaesthetics’)
• Potency also correlates with log P (up to a cut-off) (Meyer & Overton)
• Negligible structure-activity relationships
• Was assumed that they also act by a ‘biophysical’ mechanism by partitioning ‘nonspecifically’ into membrane and e.g. ‘squeezing’ nerve channels
• This too was a ‘lipid-only’ theory
• None of this now stands up
Page 17
Anaesthetic potency does largely correlate with partitioning into membrane, suggesting (to
many) a ‘lipid-only’ mechanism
P. Seeman, Pharmacol Rev 24, 583-655 (1972)
Page 18
But…narcotics inhibit luciferase, a soluble protein, with the same potency with which
they anaesthetise animals, over 5 logs!
Franks & Lieb, Nature 310, 599-601 (1984)
No lipid involved!
Page 19
The structural basis is known
Franks et al, Biophys J 75, 2205-11 (1998)
Binding of bromoform to luciferase
Page 20
Halothane affects narcosis in part via a TREK-1 K+ channel
Heurteaux et al. EMBO J 23, 2684-95 (2004)
Page 21
How to integrate all this information with biological and
physiological networks?
• One strategy is Integrative Systems Biology
Page 22
One view of systems biology
Computation/Modelling
Experiment
Technology
Theory
Page 23
Bringing together metabolomics and systems biology models
Drug Discovery Today 11, 1085-1092 (2006)
Page 24
There is a convergence between systems biology models from whole-genome reconstruction and the number of
experimental metabolome peaks (ca 3000 for human serum)
Page 25
The human metabolic network (1)
• 8 cellular compartments
• 2,712 compartment-specific metabolites
• ~ 1,500 different chemical entities
• 1,496 genes
• 2,233 metabolic reactions (1,795 unique)
• 1,078 transport reactions (32.6%)
PNAS 104, 1777-1782 (2007)
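The 32.6% figure can be reproduced from the counts on this slide if it is read as the share of transport reactions among all reactions (metabolic plus transport). That reading is an assumption; the slide does not state the denominator.

```python
metabolic_reactions = 2233
transport_reactions = 1078

# Assumed denominator: metabolic plus transport reactions
total_reactions = metabolic_reactions + transport_reactions
transport_fraction = transport_reactions / total_reactions

print(total_reactions)                     # 3311
print(f"{100 * transport_fraction:.1f}%")  # 32.6%
```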
Page 26
The human metabolic network (2)
• Not yet compartmentalised
• 2,823 reactions (incl 300 ‘orphans’), of which 2,215 have disease associations, plus 1,189 transport reactions and 457 exchange reactions
• 2,322 genes (1,069 common with Palsson model)
Molecular Systems Biology 3, 135 (2007)
Page 27
Systems biology and modelling are all about representation
Page 28
The main representation for systems biology models is SBML
www.sbml.org
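To make the representation concrete, here is a minimal sketch of an SBML document and how its species and reactions can be read. The toy model (one compartment, two species, one reaction) is invented for illustration. A dedicated parser such as libSBML would normally be used; the Python standard library is used here only so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Toy SBML Level 2 Version 4 document; contents are illustrative only
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cytosol" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="glc" name="glucose" compartment="cytosol" initialConcentration="5"/>
      <species id="g6p" name="glucose 6-phosphate" compartment="cytosol" initialConcentration="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hexokinase" reversible="false">
        <listOfReactants><speciesReference species="glc"/></listOfReactants>
        <listOfProducts><speciesReference species="g6p"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

def summarise(sbml_text):
    """Return (species ids, {reaction id: (reactant ids, product ids)})."""
    model = ET.fromstring(sbml_text).find("sbml:model", NS)
    species = [s.get("id") for s in model.findall("sbml:listOfSpecies/sbml:species", NS)]
    reactions = {}
    for r in model.findall("sbml:listOfReactions/sbml:reaction", NS):
        reactants = [x.get("species") for x in
                     r.findall("sbml:listOfReactants/sbml:speciesReference", NS)]
        products = [x.get("species") for x in
                    r.findall("sbml:listOfProducts/sbml:speciesReference", NS)]
        reactions[r.get("id")] = (reactants, products)
    return species, reactions

species, reactions = summarise(SBML_TEXT)
print(species)
print(reactions)
```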
Page 29
BIOCHEMICAL MODEL (assumed to be in SBML)
• Create, edit
• Store in dB
• Compare with other models
• VISUALISE: layouts and views, SBGN, overlays, dynamics
• LINK WORKFLOWS: Soaplab, Taverna, Web services, etc.
• Store results of manipulations
• Compare with and fit to real data (parameters and variables) with constraints
• How to deal with fitting, including as f(global parameters like pH)
• Integrate various levels
• Run, analyse (sensitivities, etc)
• Automatic characterisation of parameter space and constraint checking
• Model merging: (not) LEGO blocks
• Optimal DoE for Sys Identification, incl identifiability
• Network Motif discovery
• Literature mining
• Cheminformatic analyses
THERE ARE MANY POSSIBLE THINGS THAT ONE MIGHT DO WITH THIS REPRESENTATION, AND THESE ACTIONS CAN BE SEEN AS MODULES
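As an illustration of one such module, the sketch below estimates scaled sensitivities (d ln v / d ln p) by central finite differences and scans one parameter. The Michaelis-Menten rate law stands in for a full SBML model, and all names and parameter values are assumptions chosen for the example.

```python
def rate(params, s):
    """Michaelis-Menten rate; a stand-in for running a full model."""
    return params["vmax"] * s / (params["km"] + s)

def scaled_sensitivity(model, params, name, s, rel_step=1e-6):
    """Central-difference estimate of (dv/dp) * (p/v) for parameter `name`."""
    up = dict(params)
    down = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down[name] = params[name] * (1 - rel_step)
    base = model(params, s)
    return (model(up, s) - model(down, s)) / (2 * rel_step * base)

params = {"vmax": 10.0, "km": 1.0}  # hypothetical values
substrate = 1.0

for name in params:
    print(name, round(scaled_sensitivity(rate, params, name, substrate), 4))

# A simple one-dimensional parameter scan over Km
for km in (0.1, 1.0, 10.0):
    print(km, rate({"vmax": 10.0, "km": km}, substrate))
```

For this rate law the analytical values are 1 for Vmax and -Km/(Km + S) for Km, which gives a check on the numerical estimates.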
Page 30
BIOCHEMICAL MODEL (assumed to be in SBML)
Compare with and fit to real data (parameters and variables) with constraints
FEBS J 274, 5576-5585
4, 74-97
Page 31
The Data Management Infrastructure of the Manchester Centre for Integrated
Systems Biology
Norman Paton
University of Manchester
Page 32
Capabilities
• We require software to support:
– Data capture: Pedro.
– Data access: Pierre.
– Integration of data and analyses: Taverna.
Page 33
Pipeline Pilot workflow
etc…
Page 34
SYSTEMS BIOLOGY WORKFLOWS
METABOLIC MODEL IN SBML
CREATE MODEL
VISUALISE
STORE MODEL IN DB
RUN BASE MODEL
SENSITIVITY ANALYSES
SCAN PARAMETER SPACE
COMPARE WITH ‘REAL’ DATA
DIFFERENT METABOLIC MODEL IN SBML
LITERATURE MINING ANNOTATE
STORE NEW MODEL IN DB
COMPARE MODELS
STORE DIFFERENCES AS NEW MODEL IN DB
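The "compare models" and "store differences" steps can be sketched as a set comparison over reactions. Here each model is reduced to a dictionary mapping reaction id to (reactants, products); this simplified representation and the function name `diff_models` are assumptions for illustration, not part of any MCISB tool.

```python
def diff_models(model_a, model_b):
    """Compare two models given as {reaction id: (reactants, products)}."""
    ids_a, ids_b = set(model_a), set(model_b)
    return {
        "only_in_a": {r: model_a[r] for r in ids_a - ids_b},
        "only_in_b": {r: model_b[r] for r in ids_b - ids_a},
        "changed": {r: (model_a[r], model_b[r])
                    for r in ids_a & ids_b if model_a[r] != model_b[r]},
    }

# Hypothetical toy models
model_a = {
    "r1": (("glc",), ("g6p",)),
    "r2": (("g6p",), ("f6p",)),
}
model_b = {
    "r1": (("glc", "atp"), ("g6p", "adp")),  # same id, different stoichiometry
    "r3": (("f6p",), ("fbp",)),
}

differences = diff_models(model_a, model_b)
print(differences)
```

The resulting dictionary is itself a small model-like object, so it could be stored alongside the originals.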
Page 35
Scientists Decoupled suppliers & consumers
Collaboration
Knowledge
Management
Science
Page 36
‘Warehouse’ vs distributed workflows
• Different ‘modules’ developed in different labs can reside on different computers anywhere, and expose themselves as Web Services
• Labs can then specialise in what they are best at
• All that is then needed is an environment for enacting bioinformatic workflows by coupling together these service-oriented architectures
• One such is Taverna
• This is arguably the best way to combine metabolomic SBML models with metabolomic data, and is what we are using at MCISB
Page 37
Overall Architecture
Experiment 1 … Experiment n
Repository 1 … Repository n
Model Repository
Analysis 1 … Analysis n
Consistent Web Interfaces
Consistent Web Service Interfaces
Data Integration Using Workflows
Workflow Repository
Page 38
The Taverna API consumer along with libSBML allows many of these
transformations to be performed
Details: http://www.mcisb.org/software/taverna/libsbml/index.html
Page 39
Relating Models to Expression
Read gene names of enzymes from SBML model
Query maxd transcriptome database using gene names
Compute colour for expression readings
Create new SBML model
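The four steps above can be sketched in plain Python. In this sketch a dictionary stands in for the maxd transcriptome database and another for the enzyme-to-gene mapping read from the SBML model; the gene names, readings, colour scale and function names are all assumptions for illustration, not the actual Taverna workflow.

```python
def expression_to_colour(value, low, high):
    """Map a reading onto a blue (low) to red (high) hex colour."""
    if high == low:
        frac = 0.5
    else:
        frac = min(1.0, max(0.0, (value - low) / (high - low)))
    red = round(255 * frac)
    blue = 255 - red
    return f"#{red:02x}00{blue:02x}"

def colour_model(enzyme_genes, expression):
    """enzyme_genes: reaction id -> gene name; expression: gene name -> reading."""
    readings = [expression[g] for g in enzyme_genes.values() if g in expression]
    low, high = (min(readings), max(readings)) if readings else (0.0, 0.0)
    coloured = {}
    for reaction, gene in enzyme_genes.items():
        if gene in expression:
            colour = expression_to_colour(expression[gene], low, high)
        else:
            colour = "#808080"  # grey where no reading is available
        coloured[reaction] = {"gene": gene, "colour": colour}
    return coloured

# Step 1: gene names as they might be read from a model (hypothetical)
enzyme_genes = {"r_hexokinase": "HXK1", "r_isomerase": "PGI1", "r_pfk": "PFK1"}
# Step 2: stand-in for the database query result (hypothetical readings)
expression = {"HXK1": 2.0, "PGI1": 6.0}
# Steps 3 and 4: compute colours and build the annotated copy of the model
coloured_model = colour_model(enzyme_genes, expression)
print(coloured_model)
```

Keeping the colour mapping separate from the lookup means the scale can be swapped without touching the rest of the workflow.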
Page 40
Visualise Models Using Cell Designer
JC_C-0.07-1_Measurement JC_N-0.07-1_Measurement
Page 41
Potential Solutions
• Semantic annotation
• Chemical and bio-text mining
• RDF annotations – that can also be included within the SBML
• Integrated reasoning engine
• Allowing literature-based discovery
• But we still lack a proper and useful (bio)chemical ontology integrating roles, pathways, diseases, chemical (sub)structures, targets, etc.
• This last is probably the most damaging lack and thus most important need