NP - Positive set Negative Set Full length ORFs Genome Annotated Candidate NPs Top ranked NPs Input Training NP catalogue Negative Set NP processing tools Translated proteome ML quality: Cross validation NeuroPID prediction
[Figure: NeuroPID workflow. Labels: Input; Training; NP catalogue; NP positive set; Negative set; Annotated genome; Full length ORFs; Translated proteome; NP processing tools; ML quality: cross validation; NeuroPID prediction; Candidate NPs; Top ranked NPs]
[Figure, panel A: amino acids Q, Y, C, N, H, L, D, R, W, M, T, S, G, A, V, P, F, I, E, K; axis: 1-log(p-value, t-test); ticks 0 to 100]
[Figure, panel B: GRAVY, Instability, Molecular Weight, PI, Aromaticity; axis: 1-log(p-value, t-test); ticks 0 to 30]
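The poster does not say how the sequence features were computed. As a minimal sketch, the snippet below uses the standard textbook definitions of two of the features named in the figure labels above: GRAVY as the mean Kyte-Doolittle hydropathy, and aromaticity as the fraction of F, W and Y residues. It also computes per-residue composition. The function names are hypothetical and are not taken from NeuroPID.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aromaticity(seq):
    """Fraction of residues that are F, W or Y."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def composition(seq):
    """Relative frequency of each of the 20 standard amino acids."""
    return {aa: seq.count(aa) / len(seq) for aa in KYTE_DOOLITTLE}


# Hypothetical usage on a short peptide taken from the precursor shown in this poster.
peptide = "YSFGLG"
print(round(gravy(peptide), 3), round(aromaticity(peptide), 3))
```

Features like these, computed for the positive and negative sets, are the kind of values a per-feature t-test would compare.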
MRSRTSVLTSSLAFLYFFGIVGRSALAMEETPASSMNLQHYNN
MLNPMVFDDTMPEKRAYTYVSEYKRLPVYNFGIGKRWIDTNDN
KRGRDYSFGLGKRRQYSFGLGKRNDNADYPLRLNLDYLPVDNP
AFHSQENTDDFLEEKRGRQPYSFGLGKRAVHYSGGQPLGSKRP
NDMLSQRYHFGLGKRMSEDEEESSQR
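The precursor above contains repeated GKR motifs. As a rough illustration of what an NP processing step might do, the sketch below splits a sequence at dibasic sites (KR, RR, KK, RK) and flags pieces ending in glycine as possibly amidated. These are the textbook cleavage and amidation signals, assumed here for illustration; the poster does not list the exact rules NeuroPID applies, and `candidate_peptides` is a hypothetical helper.

```python
import re

# Assumption: classic dibasic cleavage motifs, not NeuroPID's actual rule set.
DIBASIC = re.compile(r"KR|RR|KK|RK")


def candidate_peptides(precursor, min_len=3):
    """Split at dibasic sites; return (peptide, ends_with_glycine) pairs."""
    result = []
    for piece in DIBASIC.split(precursor):
        if len(piece) >= min_len:
            # A C-terminal G is the usual amidation signal.
            result.append((piece, piece.endswith("G")))
    return result


# Third line of the precursor sequence shown above.
fragment = "KRGRDYSFGLGKRRQYSFGLGKRNDNADYPLRLNLDYLPVDNP"
for peptide, gly_end in candidate_peptides(fragment):
    note = "C-terminal G: possible amidation" if gly_end else ""
    print(peptide, note)
```

Real processing depends on sequence context around each site, so a plain split like this overpredicts cleavage.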
[Figure: cross validation performance (axis 0 to 1) per classifier: Random Forest Classifier, RBF, Linear SVC, Gradient Boosting, SVC Sigmoid, Polynomial SVM. Metrics: 'accuracy', 'precision', 'recall', area under ROC curve]
S. frugiperda (Fall armyworm) 5
H. armigera (Cotton bollworm) 6
S. gregaria (Desert locust) 4
A. florea (Little honeybee) 0
M. rotundata (Alfalfa leafcutter bee) 1
C. floridanus (Florida carpenter ant) 2
A. echinatior (Leafcutter ant) 3
                SW Arthropods                                 UniProt Arthropods
                Random Forest  Gradient Boosting  Linear SVC  Random Forest  Gradient Boosting  Linear SVC
Mean Accuracy   0.94           0.95               0.94        0.92           0.92               0.86
Mean Precision  0.94           0.95               0.93        0.93           0.94               0.95
Mean Recall     0.92           0.92               0.92        0.95           0.95               0.85
Mean AUC        0.94           0.95               0.94        0.89           0.90               0.87

                SW Chordata                                   UniProt Chordata
                Random Forest  Gradient Boosting  Linear SVC  Random Forest  Gradient Boosting  Linear SVC
Mean Accuracy   0.96           0.97               0.95        0.90           0.91               0.85
Mean Precision  0.94           0.94               0.88        0.91           0.92               0.89
Mean Recall     0.91           0.92               0.93        0.91           0.91               0.83
Mean AUC        0.95           0.95               0.94        0.90           0.91               0.85
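
For readers unfamiliar with the "Mean" rows, the sketch below shows one common way such numbers are obtained: accuracy, precision and recall are computed on each cross-validation fold and then averaged. This is an assumption about the procedure, not a description of the authors' code. The labels are toy values (1 = NP, 0 = non-NP), the function names are hypothetical, and AUC is left out because it needs classifier scores rather than hard labels.

```python
from statistics import mean


def fold_metrics(y_true, y_pred):
    """Accuracy, precision and recall for one fold of binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def mean_over_folds(folds):
    """Average each metric over a list of (y_true, y_pred) folds."""
    per_fold = [fold_metrics(t, p) for t, p in folds]
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}


# Toy data only; not taken from the poster.
folds = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 0, 1, 0], [1, 1, 1, 0]),
]
print(mean_over_folds(folds))
```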
Organism  # sequences (UniProtKB)  # of full length (UniProtKB)  # of SP  # of NP & SP  # NeuroPID (all methods)  Functional annotation enrichment
B. mori   17908                    17069                         138      6             69                        Innate immunity; Insulin-like; Chorion, Hormone (NP)