ProSanos Corporation Confidential and Proprietary

Modeling and clustering disease progression for correlation with genetic and demographic factors

Robert Kingan

Jan 01, 2016

Transcript
Page 1

ProSanos Corporation Confidential and Proprietary

Modeling and clustering disease progression for correlation with genetic and demographic factors

Robert Kingan

Page 2

What is SSIFT?

“To address […] common diseases, which include schizophrenia, depression, and breast cancer, it is essential to incorporate observations of the clinical progression of the disease to refine the definition of phenotype.” – Michael N. Liebman, U. Penn.

Yes, but what is SSIFT?

– SSIFT = Stratification and Synchronization Inference Technology

Page 3

What is SSIFT?

Stratification: Dividing a patient population into groups which are meaningful for diagnosis, prognosis, treatment selection, or genotype-phenotype correlation.

Synchronization: Recognizing a pattern of disease progression, regardless of disease stage for a particular patient.

Page 4

SSIFT overview

Assumptions: what is SSIFT-able

Other constraints on data selection

Outline of technique

– Identifying variables
– Modeling disease progression
– Parameterizing different models
– Clustering patients by progression patterns
– Interpreting the results

Page 5

Pattern of disease progression

[Figure: disease marker plotted against time, annotated with the initial value, the period of change, and the final value]

Page 6

SSIFT workflow

Survey the data

Select useful variables

Fit disease progression models

Construct feature vectors

Assign feature weights

Cluster weighted feature vectors

Evaluate the clustering results

Complete?

– No: revise and repeat
– Yes: done

Page 7

SSIFT workflow

[Figure: ten panels, Patient 1 through Patient 10, each plotting a marker level at time points 1 to 10. SSIFT sorts the patients into two groups, each shown as marker level against time (years).]

– Group A: Patients 1, 3, 7, 10
– Group B: Patients 2, 4, 5, 6, 8, 9

Page 8

SSIFT curve types

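Logistic:
$$\hat{y}(t) = a + (b - a)\,\frac{e^{[\ldots]}}{1 + e^{[\ldots]}}$$

Constant:
$$\hat{y}(t) = c$$

Linear:
$$\hat{y}(t) = y_0 + m t$$

Early stable and late stable (one formula each, both built on a logarithm of the same form):
$$\hat{y}(t) = [\ldots]\,\ln\!\left(1 + e^{[\ldots]}\right)$$

The exponents and coefficients marked […] did not survive in the transcript.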
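A minimal Python sketch of the constant, linear, and logistic progression curves may help make the curve types concrete. The function names are illustrative. Because the logistic exponent is not legible in this transcript, the `rate` and `midpoint` parameters are assumptions, not the presenter's parameterization.

```python
import math


def constant_curve(t, c):
    """Constant model: the marker stays at c for all t."""
    return c


def linear_curve(t, y0, m):
    """Linear model: y(t) = y0 + m * t."""
    return y0 + m * t


def logistic_curve(t, a, b, rate, midpoint):
    """Logistic model moving from initial value a to final value b.

    ASSUMPTION: the exponent is rate * (t - midpoint); the slide's own
    exponent is not recoverable.
    """
    z = rate * (t - midpoint)
    # Evaluate the sigmoid in a numerically stable way for large |z|.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        s = ez / (1.0 + ez)
    return a + (b - a) * s
```

At `t == midpoint` the logistic curve sits halfway between `a` and `b`, which corresponds to the middle of the "period of change" in the progression pattern.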

Page 9

Converting parameters

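Logistic:
$$(a, b, m, [\ldots]) = \left(a,\; b,\; [\ldots]\,\frac{b - a}{4},\; [\ldots]\,\frac{(a + b)/2 - y^*}{m}\right)$$

Constant:
$$(a, b, m, [\ldots]) = (c,\; c,\; \text{NULL},\; \text{NULL})$$

Linear:
$$(a, b, m, [\ldots]) = \left(\hat{y}(t_1),\; \hat{y}(t_n),\; m,\; [\ldots]\,\frac{y_0 - y^*}{m}\right)$$

Early stable:
$$(a, b, m, [\ldots]) = \left([\ldots],\; \hat{y}(t_n),\; [\ldots],\; [\ldots]\right)$$

Late stable:
$$(a, b, m, [\ldots]) = \left(\hat{y}(t_1),\; [\ldots],\; [\ldots],\; [\ldots]\right)$$

$y^*$ = population mean, $t_1$ = first time point, $t_n$ = last time point

The name of the fourth parameter and the terms marked […] did not survive in the transcript.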
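As a hedged illustration of converting different models to one shared parameter tuple, the sketch below handles the constant and linear cases. The function names are hypothetical. Reading the fourth entry of the linear case as the time at which the fitted line crosses the population mean y* is an interpretation of a garbled slide, not a confirmed definition.

```python
def constant_to_common(c):
    """Constant model: initial and final value are both c; no slope or timing."""
    return (c, c, None, None)


def linear_to_common(y0, m, t_first, t_last, y_star):
    """Linear model y(t) = y0 + m * t mapped to (a, b, m, fourth).

    a and b are the fitted values at the first and last time points.
    ASSUMPTION: the fourth entry is the time at which the line reaches the
    population mean y_star; it is undefined for a flat line.
    """
    a = y0 + m * t_first
    b = y0 + m * t_last
    crossing_time = (y_star - y0) / m if m != 0 else None
    return (a, b, m, crossing_time)
```

Putting every model into the same tuple is what lets patients fitted with different curve types be compared in a single feature space.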

Page 10

Modified Mahalanobis distance

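$$d(p, q) = (v_p - v_q)\,\Sigma^{-1}\,(v_p - v_q)^T$$

$$d(p, q) = (v_p - v_q)\,\tfrac{1}{2}\left(\Sigma_p^{-1} + \Sigma_q^{-1}\right)(v_p - v_q)^T$$

$$d(p, q) = (v_p - v_q)\,Q\,\tfrac{1}{2}\left(g(\Sigma_p) + g(\Sigma_q)\right)Q^T\,(v_p - v_q)^T$$

The matrix symbol is missing from the transcript; $\Sigma$ (the covariance matrix of the standard Mahalanobis distance) is assumed here. The function $g$ and the matrix $Q$ are not defined in the transcript.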
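The following Python sketch shows one way a Mahalanobis-style distance between two patients' feature vectors could be computed. It assumes the inverse covariance matrices are already available as nested lists. Averaging the two patients' inverse covariances and applying per-feature weights are assumptions about what "modified" means here; all names are illustrative.

```python
def quadratic_form(diff, matrix):
    """Return diff * matrix * diff^T for a row vector and a square matrix."""
    n = len(diff)
    return sum(diff[i] * matrix[i][j] * diff[j]
               for i in range(n) for j in range(n))


def mahalanobis_sq(v_p, v_q, cov_inv, weights=None):
    """Squared Mahalanobis distance given an inverse covariance matrix.

    ASSUMPTION: optional weights scale each feature difference, as a
    stand-in for the weight matrix on the slide.
    """
    diff = [x - y for x, y in zip(v_p, v_q)]
    if weights is not None:
        diff = [w * d for w, d in zip(weights, diff)]
    return quadratic_form(diff, cov_inv)


def averaged_mahalanobis_sq(v_p, v_q, cov_inv_p, cov_inv_q, weights=None):
    """Variant using the mean of the two patients' inverse covariances.

    ASSUMPTION: the two matrices are combined by a simple average.
    """
    n = len(v_p)
    mean_inv = [[0.5 * (cov_inv_p[i][j] + cov_inv_q[i][j]) for j in range(n)]
                for i in range(n)]
    return mahalanobis_sq(v_p, v_q, mean_inv, weights)
```

With identity matrices and no weights, both functions reduce to the squared Euclidean distance, which is a convenient sanity check.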

Page 11

SSIFT workflow

Survey the data

Select useful variables

Fit disease progression models

Construct feature vectors

Assign feature weights

Cluster weighted feature vectors

Evaluate the clustering results

Complete?

– No: revise and repeat
– Yes: done

Page 12

SSIFT workflow

Survey the data

Select useful variables

Fit disease progression models

Construct feature vectors

Assign feature weights

Cluster weighted feature vectors

Evaluate the clustering results

Complete?

– No: revise and repeat
– Yes: correlate results with demographic data and genetic data

Page 13

Application of SSIFT to NIDDK

About NIDDK

SSIFT and transplant data

Variable selection

Modeling

Results

Page 14

Candidate variables

α-Fetoprotein
Albumin
Alkaline phosphatase (AP)
Bicarbonate
Blood urea nitrogen (BUN)
Calcium
Creatinine clearance
Cholesterol
Chlorine
Corrected PT control
Creatinine
Direct bilirubin
FK506 level
Glomerular filtration rate
Gamma GTP
Glucose
Hematocrit (HCT)
Hemoglobin
CSA HPLC level
Potassium
CSA monoclonal level
Sodium
Platelet count
Prothrombin time
Part. thromboplastin CT
Part. thromboplastin PT
CSA RIA level
SGOT (AST)
SGPT (ALT)
Total bilirubin
CSA TDX level
Total protein
White blood cells (WBC)
Weight in kg

Page 15

Selected variables

Variable          Log?   Ŝu     Weights              Ŝw
                                a   b   m   […]
AST               Yes    0.19   0   1   1   0        0.32
AP                Yes    0.18   0   1   1   0        0.28
Hemoglobin        No     0.13   0   1   0   1        0.24
Total bilirubin   Yes    0.15   0   1   0   1        0.21
Potassium         No     0.20   1   1   1   1        0.20
Hematocrit        No     0.19   1   1   1   1        0.19
WBC               Yes    0.17   1   1   1   1        0.17
BUN               Yes    0.14   1   1   1   1        0.14
Creatinine        Yes    0.12   1   1   1   1        0.12
Sodium            No     0.11   1   1   1   1        0.11

Page 16

Evaluating Kaplan-Meier curves

[Figure: Kaplan-Meier survival curves, annotated with Ŝ]
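For readers unfamiliar with Kaplan-Meier curves, here is a minimal Python sketch of the standard product-limit estimator. It only builds the survival curve for one group of patients. It does not compute the Ŝ score, whose definition is not given in this transcript. The function and argument names are illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time for each patient
    events: True if the event was observed, False if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve
```

Computing one such curve per cluster gives the set of curves that a separation score would then compare.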

Page 17

Final selected variables

Best pair: AST + AP, Ŝ=0.34

Best triple: AST + AP + hematocrit, Ŝ=0.42

No set of four variables exceeded Ŝ=0.42
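The search for the best pair, then the best triple, and so on can be sketched as an exhaustive subset search that stops once a larger subset no longer improves the score. The `score` callable is a placeholder for whatever produces Ŝ from a clustering on the chosen variables. It, the function names, and the stopping rule are assumptions made for illustration.

```python
from itertools import combinations


def best_subset(variables, size, score):
    """Return the highest-scoring subset of the given size."""
    return max(combinations(variables, size), key=score)


def grow_until_no_gain(variables, score, start_size=2):
    """Try subsets of increasing size; stop when the best score stops rising.

    ASSUMPTION: `score` maps a tuple of variable names to a number where
    larger is better (a stand-in for the S-hat statistic).
    """
    best, best_score = None, float("-inf")
    for size in range(start_size, len(variables) + 1):
        candidate = best_subset(variables, size, score)
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            break
        best, best_score = candidate, candidate_score
    return best, best_score
```

Exhaustive search is feasible here because only a handful of variables survive the earlier selection step; with many more variables the number of subsets would grow too quickly.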

Page 18

Survival by clustered SSIFT AST, AP and HCT parameters

Ŝ = 0.42

Page 19

Cluster mean curves

[Figure: mean curves for the best cluster and the worst cluster]

Page 20

SSIFT in Gene Discovery: Simulation

[Diagram relating markers over time, SSIFT™, the disease progression pattern, and disease genes, with steps labelled Determine, Analyze, and Discover]

Page 21

Simulated data

[Figure: marker value (relative scale) plotted against time (years)]

Page 22

Clustering Results

[Figure: Dendrogram of agnes(x = distance.matrix, diss = TRUE, method = "average"), with Height on the vertical axis and numbered leaves. Agglomerative Coefficient = 0.99]
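The dendrogram title shows average-linkage agglomerative clustering on a precomputed dissimilarity matrix. As a rough illustration of that technique, and not of the R `agnes` implementation itself, here is a naive pure-Python version that merges clusters until a requested number remain. The function name and the `n_clusters` stopping rule are assumptions.

```python
def average_linkage(dist, n_clusters):
    """Agglomerative clustering with average linkage.

    dist:       symmetric dissimilarity matrix as a list of lists
    n_clusters: number of clusters at which to stop merging
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def average_distance(c1, c2):
        # Mean of all pairwise dissimilarities between the two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)
```

This version recomputes every pairwise average on each pass, which is fine for around a hundred items but would be too slow for large data sets.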

Page 23

Nearest-Neighbor Analysis

Gene    Genotype for Nearest Neighbors, based on SSIFT Pattern
        123456789012345678901234567890123456789012345678901234567890123456789
C9254   352131423364133346331543231331365461513311265564413451642314514456463
D7562   323136554461643162336542432246213615526446315262641541251463645444165
J1789   122552224555422461565335314552346323516442534343355552444332245543456
A2109   451422556633426215561542335632532551544111436664632366416662551652621
J2602   261323412652652223665466252216452111435542263542444161536324633341322
W4147   321143532244464333634436443621115464641422635644662235654536525633252
C2353   555555566666666666645526566351142222222222224412114442333333333344344
L9800   336242634156534231126616432343525335453144614443526334516552645522411
P3336   134335463316364241553225312351666146252445354642364643361143152565441
R2489   333645353163166613452462363523625226142335415145144513456124654144534
K805    125622234521326136152541524635445125324132524235224536261424613321411
K8420   121246521534522166135433555544321611651615366165466361564321116551155
D7336   656426432634424564226153331645113553652653122613653166613412212454536
S4207   336631132522113543146663466524336526152322153355236454211112421554435
S9560   612133524211556321334441342343121422445113565223663146264256263231422
B3833   111451165325313114512436245536622531545565455124463645111525655366466
B1192   641336314611531121361246426232236435651132226313322353452353441515446
S939    235312511132633313662343516122256413432554515433213166261326465216115
T3285   441652143434363443164114445154263532135434413455513346363354553424142

C2353 is related to the SSIFT pattern of disease progression ($p < 10^{-41}$).
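One way to quantify whether a gene's genotypes line up with the SSIFT ordering is to count how often neighbouring subjects share a genotype and compare that count with random orderings. The transcript does not say which test produced the reported p-value, so the permutation test below is a stand-in. It also assumes that adjacent columns in the table are nearest neighbours. All names are illustrative.

```python
import random


def adjacent_matches(genotypes):
    """Count neighbouring positions that carry the same genotype code."""
    return sum(1 for left, right in zip(genotypes, genotypes[1:])
               if left == right)


def permutation_p_value(genotypes, n_permutations=1000, seed=0):
    """Estimate how often a random ordering matches neighbours as well.

    ASSUMPTION: a permutation test stands in for the unspecified test
    behind the p-value on the slide.
    """
    rng = random.Random(seed)
    observed = adjacent_matches(genotypes)
    shuffled = list(genotypes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if adjacent_matches(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

A row with long runs of one genotype, like the C2353 row, yields many adjacent matches, whereas a row unrelated to the ordering yields roughly as many as a shuffled row. A permutation estimate cannot reach extremely small p-values without an impractical number of shuffles, so very small values would need an analytic test instead.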

Page 24

SSIFT: Stratification and Synchronization Inference Technology

Discussion