Investigations of helminth kinomes - from understanding parasite biology to drug discovery
Andreas J. Stroehlein
Paul M. Selzer
BSc Bioinformatics
“The Pan-Proteome of the Bacterial Pathogens Causing Bovine Respiratory Disease”
Molecular Discovery Sciences at
MSD Animal Health Innovation GmbH
Bingen
MSc Bioinformatics
“The Metabolic Potential of Balamuthia mandrillaris – A Computational Approach for Effective Pathway Identification”
Robert Koch Institute Mycotic and Parasitic Agents and Mycobacteria
Berlin
B. mandrillaris draft genome/transcriptome
The Gasser laboratory
Aaron Jex
Abdul Jabbar
Sarah Preston
Clare Anstead
Namitha Mohandas
Anson Koehler
Pasi Korhonen
Neil Young
Robin Gasser
Fields of research
Parasite genomics
Bioinformatics
Drug discovery & drug screening
Mitochondrial genomes
Human parasites: Schistosoma, Trichuris, Ascaris…
Food-borne parasites: Opisthorchis…
Veterinary parasites: Haemonchus…
Water-borne parasites: Cryptosporidium, Giardia
Parasites of wildlife
Zoonoses
Worms (helminths) cause human diseases worldwide
Veterinary importance
Fascioliasis (F. hepatica)
Ascariasis (A. suum)
Haemonchosis (H. contortus)
Mass treatment leads to development of drug resistance
New, effective drugs needed
Using parasite genomics to define new drug targets
Ascaris
Next-generation sequencing technologies
Unlock parasite biology
Define essential pathways/proteins
Computational Drug Discovery
Drug screening & hit-to-lead optimisation
Parasite genomics and transcriptomics
Ascaris suum (Nature 2011)
Haemonchus contortus (Genome Biology 2013)
Trichuris suis (Nat. Gen. 2014)
Lucilia cuprina (Nat. Comm. 2015)
Toxocara canis (Nat. Comm. 2015)
Schistosoma haematobium (Nat. Gen. 2012)
Opisthorchis viverrini (Nat. Comm. 2014)
Trichinella spp., Fasciola spp., Oesophagostomum dentatum…
Praziquantel (PZQ)
60% urogenital schistosomiasis
>200 million cases of schistosomiasis (S. mansoni, S. haematobium)
S. haematobium and schistosomiasis
- Chronic inflammation
- Granulomas and fibrosis
- Bladder cancer
- HIV predisposition
Risk of resistance development against PZQ
Need for new drugs that kill all schistosomes
Lack of studies in S. haematobium
Miracidia
Cercariae
Adults ♂♀
Eggs
Enter via skin
Develop in snail
Penetrate snail
Exit through bladder wall via urine
Liver via lungs
Urogenital vasculature
Drug discovery guided by genomics and transcriptomics
Nature Genetics 44, 221-225 (2012)
Protein kinases as drug targets
• Regulate essential signalling processes
• Relatively conserved catalytic domain and structure
• Mode of action well-understood
Drug development pipeline
Genes
Drug Targets
Small Molecules
Cell Assays
Drug Tests
Drug Leads
Clin. Trials
FDA
Drug Sales
Takes 10-15 years
Costs billions of $
Defining the S. haematobium kinome
Kinase identification and classification: Kinannote (Goldberg et al. 2013)
Pairwise orthologs: OrthoMCL (Li et al. 2003)
S. haematobium genome and transcriptome
S. mansoni genome and kinome (Protasio et al. 2012; Andrade et al. 2011)
Iterative curation of gene models: BLAT (Kent 2002), Exonerate (Slater et al. 2005)
Phylogenetic analysis: MrBayes (Ronquist et al. 2012)
Group- and trematode-specific probability models (HMMs)
Multi-stage approach tailored to schistosome kinase identification and classification
S. mansoni as reference
Improved identification, classification and gene models in both species
Consistent orthology and classification using independent methods
Curation
Consensus
The challenge of functional annotation of trematode proteins
C. elegans
D. melanogaster
M. musculus
Parasitic flatworms: up to 50% of proteins uncharacterised
3,525 genomes 9,455,299 proteins 46,169 bioprojects
1,312 genomes 3,809,380 proteins 6,355 bioprojects
304 genomes 408,904 proteins 533 bioprojects
NCBI (July 2015)
Refining the alignment of catalytic kinase domains
Pkinase_Tyr (PF07714) 145 sequences
Pkinase (PF00069) 54 sequences
Kinase identification (Kinannote)
9 group- and trematode-specific hidden Markov models (HMMs):
Trematode_AGC
Trematode_Other
Trematode_CAMK
Trematode_RGC
Trematode_CMGC
Trematode_STE
Trematode_CK1
Trematode_TK
Trematode_TKL
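One way to use a set of group-specific models is to score each candidate sequence against all nine and keep the best-scoring group. The sketch below assumes the HMM searches were run separately and their bit scores collected into a dictionary; the function name `assign_group` and the cut-off `min_score` are hypothetical and not taken from the talk.

```python
from typing import Dict, Optional

# The nine group-specific model names listed on the slide.
TREMATODE_HMMS = [
    "Trematode_AGC", "Trematode_Other", "Trematode_CAMK",
    "Trematode_RGC", "Trematode_CMGC", "Trematode_STE",
    "Trematode_CK1", "Trematode_TK", "Trematode_TKL",
]


def assign_group(scores: Dict[str, float], min_score: float = 25.0) -> Optional[str]:
    """Return the kinase group of the best-scoring model, or None.

    `scores` maps a model name to a bit score obtained elsewhere.
    `min_score` is an illustrative cut-off (assumption).
    """
    # Ignore scores for models that are not part of the nine-model set.
    known = {name: s for name, s in scores.items() if name in TREMATODE_HMMS}
    if not known:
        return None
    best = max(known, key=known.get)
    if known[best] < min_score:
        return None
    # "Trematode_AGC" -> "AGC"
    return best.split("_", 1)[1]


if __name__ == "__main__":
    print(assign_group({"Trematode_AGC": 310.2, "Trematode_CAMK": 120.5}))
```

Sequences that score below the cut-off against every model return `None` and would be left for manual inspection.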
Using functional domain patterns for kinase identification
Good patterns representing protein kinases
Bad patterns representing NO protein kinases
ipro_output.csv
Extract domain patterns
Pattern known?
…
Prot_X ‘ABC’
Prot_Y ‘ABC’
Prot_Z ‘ABC’
Prot_I ‘DEF’
Prot_J ‘AEF’
Prot_K ‘BDF’
…
Domain info:
- Pfam
- PANTHER
- SUPERFAMILY
Knowledgebase: library of known patterns
Yes: automatically decide if kinase or not based on knowledgebase
No: prompt user, providing information for each domain
Expert knowledge & literature
Store user decision in knowledgebase
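The decision loop on this slide can be sketched in a few lines. This is a minimal illustration, not the tool used in the study: the function name `classify_patterns`, the callback `ask_user` and the use of a plain dictionary as the knowledgebase are assumptions. Patterns such as 'ABC' stand for a protein's combination of annotated domains, as in the slide's example.

```python
from typing import Callable, Dict, Iterable, Tuple


def classify_patterns(
    proteins: Iterable[Tuple[str, str]],
    knowledgebase: Dict[str, bool],
    ask_user: Callable[[str], bool],
) -> Dict[str, bool]:
    """Label each protein as kinase (True) or not (False) by its domain pattern.

    Known patterns are decided from the knowledgebase. Unknown patterns are
    passed to `ask_user` once, and the answer is stored for later proteins.
    """
    results: Dict[str, bool] = {}
    for protein_id, pattern in proteins:
        if pattern not in knowledgebase:
            # Pattern not seen before: ask the expert and remember the decision.
            knowledgebase[pattern] = ask_user(pattern)
        results[protein_id] = knowledgebase[pattern]
    return results


if __name__ == "__main__":
    kb = {"ABC": True}  # one pattern already curated as "kinase"
    proteins = [("Prot_X", "ABC"), ("Prot_Y", "ABC"), ("Prot_I", "DEF")]
    # Stand-in for an interactive prompt: treat every unknown pattern as "not a kinase".
    print(classify_patterns(proteins, kb, ask_user=lambda pattern: False))
```

Because each decision is stored, the number of prompts shrinks as the knowledgebase grows.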
[Figure: phylogenetic tree of the eukaryotic protein kinases (ePKs) of S. haematobium (Sh) and S. mansoni (Sm), labelled by family and subfamily; scale bar 0.5]
ePKs: 261 (Sh) / 259 (Sm)
CMGC 51/49
CAMK 41/41
AGC 39/39
Other 40/40
TK 31/31
STE 27/27
TKL 20/20
CK1 9/9
RGC 3/3
The S. haematobium kinome and interspecific comparisons
Total: 269/267
ePKs: 261/259
PKLs: RIO 2/2, ABC 2/2
Unclassified: 4/4
Closest human homologs:
- CDC5
- KSR
- WNK1
- RSK
Conserved groups, families, subfamilies
Conserved pairs of orthologs
Two kinases unique in S. haematobium
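The totals on this slide can be checked against the per-group counts shown in the kinome tree. The short script below is only an arithmetic check. Where the extracted figure labels were fused, the pairing of counts with group names is an assumption, and the variable names are illustrative.

```python
# Per-group ePK counts as (S. haematobium, S. mansoni), read from the kinome tree.
EPK_COUNTS = {
    "CMGC": (51, 49), "CAMK": (41, 41), "AGC": (39, 39),
    "Other": (40, 40), "TK": (31, 31), "STE": (27, 27),
    "TKL": (20, 20), "CK1": (9, 9), "RGC": (3, 3),
}
# Protein kinase-like (PKL) and unclassified sequences from this slide.
NON_EPK_COUNTS = {"RIO": (2, 2), "ABC": (2, 2), "Unclassified": (4, 4)}


def totals(counts):
    """Sum the first and second element of every (Sh, Sm) pair."""
    sh = sum(pair[0] for pair in counts.values())
    sm = sum(pair[1] for pair in counts.values())
    return sh, sm


epk_sh, epk_sm = totals(EPK_COUNTS)
other_sh, other_sm = totals(NON_EPK_COUNTS)
# Groups whose counts differ between the two species.
differing = {g: sh - sm for g, (sh, sm) in EPK_COUNTS.items() if sh != sm}

print("ePKs:", epk_sh, epk_sm)
print("Total:", epk_sh + other_sh, epk_sm + other_sm)
print("Groups that differ:", differing)
```

Under this reading, the group sums reproduce the 261/259 ePKs and the 269/267 totals, and the only group that differs is CMGC, by two sequences.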
Sequence conservation of S. haematobium kinases
[Figure: histogram of the number of kinases (y-axis) by pairwise sequence similarity in % (x-axis, 0 to 100), for comparisons with S. mansoni and H. sapiens]
Lowest similarity:
H. sapiens: Unclassified (58%)
S. mansoni: RGC group (87%)
Highest similarity:
H. sapiens: CK1 group (76%)
S. mansoni: CMGC group (94%)
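As a rough illustration of how a pairwise percentage is obtained from two aligned sequences, the sketch below computes percent identity over aligned, non-gap columns. This is a simplification: the slide reports similarity, which normally also credits conservative substitutions through a scoring matrix, and the method used in the study is not stated here. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy aligned fragments, not real kinase sequences.
    print(round(percent_identity("GKGSFG", "GKGTFG"), 1))
```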
Functional annotation and variable transcription profiles
[Figure: percentage of transcripts per functional category in egg, male and female. Panel A: all transcribed kinase sequences. Panel B: highly transcribed kinase sequences (top 10%).]
Values in parentheses: left, panel A; right, panel B.
(2; 2; 2) Transcription (0; 1; 0)
(2; 2; 2) Replication and repair (0; 1; 0)
(3; 3; 3) Translation (1; 3; 3)
(3; 1; 2) Nucleotide metabolism (0; 0; 0)
(8; 8; 5) Signalling molecules and interaction (0; 1; 0)
(8; 9; 9) Folding, sorting and degradation (1; 1; 0)
(12; 10; 11) Excretory system (1; 1; 1)
(15; 12; 15) Sensory system (3; 0; 2)
(16; 13; 16) Digestive system (5; 1; 3)
(18; 18; 16) Transport and catabolism (1; 2; 2)
(20; 16; 18) Circulatory system (2; 0; 1)
(23; 21; 21) Environmental adaptation (5; 2; 6)
(25; 24; 21) Cell motility (3; 1; 0)
(31; 30; 29) Cell growth and death (4; 5; 9)
(35; 35; 29) Development (6; 4; 3)
(41; 39; 39) Nervous system (9; 5; 7)
(45; 43; 42) Endocrine system (10; 12; 10)
(53; 50; 48) Cell communication (8; 7; 8)
(91; 88; 84) Signal transduction (22; 18; 18)
(46; 44; 42) Immune system (8; 6; 4)
Kinase transcription is conserved throughout the S. haematobium life cycle
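Selecting the "top 10%" subset amounts to ranking transcribed sequences by expression level and keeping the highest-ranked tenth. The sketch below assumes a dictionary of expression values per kinase for one stage; the function name, the kinase identifiers and the rule of rounding the subset size up are assumptions, not details from the talk.

```python
from typing import Dict, List


def top_percent(expression: Dict[str, float], percent: int = 10) -> List[str]:
    """Return the IDs of the most highly transcribed sequences.

    Sequences with zero expression are treated as not transcribed and are
    excluded. The subset size is rounded up using integer arithmetic.
    """
    transcribed = {k: v for k, v in expression.items() if v > 0}
    n = (len(transcribed) * percent + 99) // 100  # ceiling of len * percent / 100
    ranked = sorted(transcribed, key=transcribed.get, reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Hypothetical expression values for one developmental stage.
    example = {"kinase_%d" % i: float(i) for i in range(0, 21)}
    print(top_percent(example))
```

Running the same selection for egg, male and female would give the three sets compared in panel B.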
Prediction of druggable targets and associated compounds
57 kinases
219 kinases
Impaired/lethal phenotype?
257 S. haematobium kinases (male + female)
Pathway chokepoint?
Associated small molecule?
Clinical candidate or approved drug?
Drug-like biochemical properties?
40 druggable kinases
43 prioritised compounds
33 cancer drugs
17 FDA-approved
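The prioritisation on this slide is a sequence of yes/no filters applied to a shrinking list of kinases. A minimal sketch of such a funnel follows. The field names (`lethal_phenotype`, `chokepoint` and so on), the order of the filters and the toy records are all assumptions made for illustration; the slide does not specify how each criterion was evaluated.

```python
from typing import Callable, Dict, List, Tuple

Kinase = Dict[str, object]

# One (label, predicate) pair per question on the slide; field names are hypothetical.
FILTERS: List[Tuple[str, Callable[[Kinase], bool]]] = [
    ("Impaired/lethal phenotype?", lambda k: bool(k["lethal_phenotype"])),
    ("Pathway chokepoint?", lambda k: bool(k["chokepoint"])),
    ("Associated small molecule?", lambda k: bool(k["has_small_molecule"])),
    ("Clinical candidate or approved drug?", lambda k: bool(k["clinical_or_approved"])),
    ("Drug-like biochemical properties?", lambda k: bool(k["drug_like"])),
]


def run_funnel(kinases: List[Kinase], filters=FILTERS):
    """Apply the filters in order; return survivors and the count after each step."""
    remaining = list(kinases)
    report = [("start", len(remaining))]
    for label, keep in filters:
        remaining = [k for k in remaining if keep(k)]
        report.append((label, len(remaining)))
    return remaining, report


if __name__ == "__main__":
    toy = [
        {"id": "a", "lethal_phenotype": True, "chokepoint": True,
         "has_small_molecule": True, "clinical_or_approved": True, "drug_like": True},
        {"id": "b", "lethal_phenotype": True, "chokepoint": False,
         "has_small_molecule": True, "clinical_or_approved": True, "drug_like": True},
    ]
    survivors, report = run_funnel(toy)
    for label, count in report:
        print(label, count)
```

Recording the count after every step makes it easy to see which criterion removes the most candidates.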
The perfect drug – first steps and future avenues
Not protected by patent
Safe for host
Easy to synthesise
Medicinal chemistry
Discontinued & non-active analogs of approved drugs
Natural products
Incentive for big pharma to pursue a drug candidate
Fasciola spp. veterinary importance
Unprotected analogs to bypass patents
Treatment period
ADMET
Selective efficacy against (a group of) parasites
Drug screening
Repurpose approved drugs
Comparison of kinomes (broad-spectrum targets)
Structural comparisons
Virtual screening
Excellent patient compliance
Safe for environment
Free or cheap
Dosage & Administration
Conclusions and future directions
[Figure: phylogenetic tree of the ePKs of S. haematobium (Sh) and S. mansoni (Sm); 261 (Sh) / 259 (Sm) ePKs]