Page 1: Rodriguez_NRMC_Presentation

Ameliorating Statistical Methodologies as Genomic Data Burgeon: Refined Proportional Odds Model with Application to New Dravet Dataset

Ivan Rodriguez∗,⊥,†
Joseph C. Watkins, Ph.D.∗,⊥,§

∗The University of Arizona
⊥Department of Mathematics
†UROC-PREP/STAR Program
§Graduate Interdisciplinary Program in Statistics, Chair

October 1, 2016

Page 2: Rodriguez_NRMC_Presentation

Focus

Ivan Rodriguez Ameliorating Statistical Methodologies October 1, 2016

Page 4: Rodriguez_NRMC_Presentation

Research Overview

Challenge: making sense of this abundant data.
Motivation: ≈150,000 newborns diagnosed with genetic disease annually (Nussbaum, McInnes, & Willard, 2007).
Objectives:

Match data and diagnosis by improving existing technique.
Apply model to new and exclusive dataset.

Page 10: Rodriguez_NRMC_Presentation

Methods: Ordinal Categorical Data Analysis

Analysis of data with non-arbitrary categorical ordering.
Relevant example: disease severity scale.
Complications:

Assigning numeric values to categories.
Nonequidistance between categories.

Naïve solution: dichotomize ordinal outcome.

Page 17: Rodriguez_NRMC_Presentation

Methods: Proportional Odds Model, General

Better method: the proportional odds model (McCullagh, 1980).
Extends binary logistic regression (Cox, 1958).
Celebrated method for ordinal data analysis (Bender & Grouven, 1998).
Applications: surveys, quality assurance, radiology, clinical research (McCullagh, 1999).

\[ \operatorname{logit}\bigl[P(Y_i \le j \mid X_i)\bigr] = \theta_j - \beta^{T} X_i, \qquad j \in \{1, \dots, J-1\}, \]

\[ \operatorname{logit}(\pi) = \log\left(\frac{\pi}{1 - \pi}\right). \]
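
As a minimal sketch of how this model turns cutpoints and coefficients into probabilities, the Python below computes P(Y ≤ j | x) and, by differencing, P(Y = j | x). The function names and the numeric values of `theta`, `beta`, and `x` are hypothetical and chosen only for illustration; they are not taken from the presentation.

```python
import math

def logistic(t):
    """Inverse of the logit: 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))

def cumulative_probs(theta, beta, x):
    """P(Y <= j | x) for j = 1..J-1 under logit[P(Y <= j | x)] = theta_j - beta^T x."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    return [logistic(t - eta) for t in theta]

def category_probs(theta, beta, x):
    """P(Y = j | x) for j = 1..J, by differencing the cumulative probabilities."""
    cum = [0.0] + cumulative_probs(theta, beta, x) + [1.0]
    return [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical values: J = 3 categories and two predictors.
theta = [-1.0, 1.0]   # cutpoints, assumed increasing
beta = [0.5, -0.25]
x = [1.0, 2.0]

probs = category_probs(theta, beta, x)
print(probs, sum(probs))
```

Because the same β appears at every cutpoint, changing x shifts all J − 1 cumulative logits by the same amount, which is the proportional odds assumption.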

Page 23: Rodriguez_NRMC_Presentation

Methods: Proportional Odds Model, Limitations

Great on paper, but not in practice.
Proportional odds assumption often violated (Long & Freese, 2006).

A standard workaround: modify the model.
Refine the latent variable.
Fine-tune the null hypothesis.

Page 29: Rodriguez_NRMC_Presentation

Methods: Latent Variable

Variables that are inferred, not directly observed.
The focus is to make better inferences.

\[ Y^{*} = \beta^{T} X + \varepsilon, \]

\[ P(Y \le j \mid X) = \frac{1}{\exp(\beta^{T} X - \theta_j) + 1}. \]
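
The step from the latent variable to the cumulative probability can be written out as follows. This is a sketch under two assumptions that the slide does not state explicitly: ε follows a standard logistic distribution, and the observed category is Y = j exactly when θ_{j−1} < Y* ≤ θ_j, with θ_0 = −∞ and θ_J = +∞.

```latex
\begin{align*}
P(Y \le j \mid X)
  &= P(Y^{*} \le \theta_j \mid X) \\
  &= P(\varepsilon \le \theta_j - \beta^{T} X) \\
  &= \frac{1}{1 + \exp\bigl(-(\theta_j - \beta^{T} X)\bigr)} \\
  &= \frac{1}{\exp(\beta^{T} X - \theta_j) + 1}.
\end{align*}
```

Taking the logit of both sides gives θ_j − β^T X, which agrees with the general form of the proportional odds model.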

Page 33: Rodriguez_NRMC_Presentation

Methods: Hypothesis Testing

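Null versus alternative hypotheses: H0 against HA.
Traditionally, H0 is the status quo.

\[ H_0 : \beta_1 = \cdots = \beta_q = 0. \]

\[ \beta = \tau \xi, \qquad \tau \in F, \qquad S_i = \xi^{T} x_i. \]

\[ H_0 : \tau = 0, \qquad P(Y \le j \mid X) = \frac{1}{\exp(S\tau - \theta_j) + 1}. \]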
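
A small sketch of the reduction β = τξ: once the weights ξ are fixed, each individual's q genotypes collapse to a single score S_i = ξ^T x_i, and the q-parameter null hypothesis becomes the one-parameter hypothesis τ = 0. The genotype matrix, the weights `xi`, and the value of `tau` below are made up for illustration.

```python
def burden_scores(genotypes, xi):
    """S_i = xi^T x_i for each row x_i of the genotype matrix."""
    return [sum(w * g for w, g in zip(xi, row)) for row in genotypes]

# Hypothetical data: 3 individuals, q = 3 variants coded as minor-allele counts.
genotypes = [
    [0, 1, 2],
    [1, 0, 0],
    [0, 0, 0],
]
xi = [1.0, 0.5, 2.0]   # hypothetical fixed weights

scores = burden_scores(genotypes, xi)

# With beta = tau * xi, the linear predictor beta^T x_i equals tau * S_i.
tau = 0.3
beta = [tau * w for w in xi]
linear_predictors = [sum(b * g for b, g in zip(beta, row)) for row in genotypes]
print(scores, linear_predictors)
```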

Page 39: Rodriguez_NRMC_Presentation

Methods: Score Function

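Allows for quantification of performance of model.

\[ u(\theta_1, \dots, \theta_{J-1}, \tau) = -\sum_{j=1}^{J} \sum_{i=1}^{n_j} S_{ij} \bigl[ 1 - \psi(\theta_j - \tau S_{ij}) - \psi(\theta_{j-1} - \tau S_{ij}) \bigr], \]

\[ \psi(t) = \frac{1}{1 + \exp(-t)}. \]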
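
The score function can be evaluated directly from its definition. The sketch below assumes the usual conventions θ_0 = −∞ and θ_J = +∞ (so that ψ takes the values 0 and 1 at the ends), and it stores the scores S_ij as one list per response category. The data and parameter values are hypothetical.

```python
import math

def psi(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def score(theta, tau, scores_by_category):
    """u(theta_1..theta_{J-1}, tau) as the double sum over categories j and individuals i.

    scores_by_category[j - 1] holds the scores S_ij of the n_j individuals
    observed in category j.
    """
    cut = [-math.inf] + list(theta) + [math.inf]   # theta_0 and theta_J
    total = 0.0
    for j, group in enumerate(scores_by_category, start=1):
        for s in group:
            total += s * (1.0 - psi(cut[j] - tau * s) - psi(cut[j - 1] - tau * s))
    return -total

# Hypothetical example: J = 3 categories with two individuals in each.
theta = [-0.5, 0.8]
scores_by_category = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
print(score(theta, 0.0, scores_by_category))
```

Under these conventions, u is the derivative with respect to τ of the log-likelihood whose category probabilities are ψ(θ_j − τS) − ψ(θ_{j−1} − τS), so evaluating it at τ = 0 gives a score test of H0.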

Page 42: Rodriguez_NRMC_Presentation

Methods: Simulations

Criteria: type I error frequency and power.
Algorithm:
1. Generate genotype data.
2. Obtain error terms.
3. Fix latent variables.
4. Produce ordinal categorical responses.
5. Estimate θ_j under modified H0.
6. Plug θ̂_j into score function.
7. Receive p-values.
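
The seven steps can be sketched end to end in Python. Several choices here are assumptions rather than details from the presentation: genotypes are independent binomial minor-allele counts, all weights ξ_k equal 1, the errors are standard logistic, the cutpoints under τ = 0 are estimated as logits of cumulative sample proportions, and the p-value comes from a permutation test used as a stand-in because the slides do not say how p-values were computed. All function names and parameter values are hypothetical.

```python
import math
import random

def psi(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def simulate_dataset(n, q, maf, tau, cutpoints, rng):
    """Steps 1-4: genotypes, error terms, latent variables, ordinal responses."""
    scores, responses = [], []
    for _ in range(n):
        # Step 1: minor-allele counts in {0, 1, 2} at q variants (assumed independent).
        genotype = [(rng.random() < maf) + (rng.random() < maf) for _ in range(q)]
        s = float(sum(genotype))  # S_i with all weights xi_k = 1 (an assumption)
        # Step 2: standard logistic error by inverse-CDF sampling.
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        eps = math.log(u / (1.0 - u))
        # Step 3: latent variable Y* = tau * S + eps.
        y_star = tau * s + eps
        # Step 4: category j such that theta_{j-1} < Y* <= theta_j.
        responses.append(1 + sum(y_star > c for c in cutpoints))
        scores.append(s)
    return scores, responses

def null_cutpoints(responses, n_categories):
    """Step 5: with tau = 0, theta_j is the logit of the cumulative sample proportion."""
    n = len(responses)
    theta_hat = []
    for j in range(1, n_categories):
        p = sum(y <= j for y in responses) / n
        p = min(max(p, 0.5 / n), 1.0 - 0.5 / n)  # keep the logit finite
        theta_hat.append(math.log(p / (1.0 - p)))
    return theta_hat

def null_weights(theta_hat):
    """Bracketed term of the score function at tau = 0, one value per category."""
    cut = [-math.inf] + list(theta_hat) + [math.inf]
    return [1.0 - psi(cut[j]) - psi(cut[j - 1]) for j in range(1, len(cut))]

def score_at_null(scores, responses, weights):
    """Step 6: score function evaluated at tau = 0 and the estimated cutpoints."""
    return -sum(s * weights[y - 1] for s, y in zip(scores, responses))

def permutation_p_value(scores, responses, n_categories, n_perm, rng):
    """Step 7 (stand-in): two-sided permutation p-value for the score statistic."""
    weights = null_weights(null_cutpoints(responses, n_categories))
    observed = abs(score_at_null(scores, responses, weights))
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(score_at_null(shuffled, responses, weights)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def rejection_rate(tau, n_rep, rng, alpha=0.05):
    """Fraction of simulated datasets rejected: type I error if tau = 0, power otherwise."""
    cutpoints = [-0.5, 1.0]  # hypothetical true cutpoints, J = 3
    rejected = 0
    for _ in range(n_rep):
        scores, responses = simulate_dataset(100, 10, 0.1, tau, cutpoints, rng)
        p = permutation_p_value(scores, responses, 3, 199, rng)
        rejected += p < alpha
    return rejected / n_rep

rng = random.Random(2016)
print("estimated type I error:", rejection_rate(0.0, 50, rng))
print("estimated power:", rejection_rate(0.8, 50, rng))
```

Setting τ = 0 in `rejection_rate` estimates the type I error frequency and a nonzero τ estimates power, which are the two criteria listed. Far more replicates than the 50 used here would be needed for stable estimates.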

Page 52: Rodriguez_NRMC_Presentation

Methods: Application

Response: Dravet syndrome patient severity.
Predictor: 12 stress-related single nucleotide polymorphisms.
Sample size: 22 relatively isolated individuals.
Categories: 2, mild and severe.
Other data: sex, status, IQ, allele count.

Page 58: Rodriguez_NRMC_Presentation

Results: The Proposed Model Is Successful

Type I error and power comparable to:
Sequence kernel association test (Wu et al., 2011).
Optimized sequence kernel association test (Lee et al., 2012).

In terms of power, outperforms:
Variable threshold test (Price et al., 2010).
Cohort allelic sums test (Morgenthaler & Thilly, 2007).
Cumulative minor-allele test (Zawistowski et al., 2010).

Page 66: Rodriguez_NRMC_Presentation

Results: Stress and Dravet Are Intricately Correlated

Rare phenotypes prevalent for young severe patients.
Several genes protect or exacerbate Dravet.

Likely varies on case-by-case basis.

Stress-Dravet link contingent on sample heterogeneity.

Page 71: Rodriguez_NRMC_Presentation

Discussion

Preliminary evaluation of model and dataset analysis.
Severe modifying genes significantly determine quality-of-life.
Identification of modifying genes is paramount.

Provides impetus for new medication and treatment.

Personalized care will rise with genomic information.

Page 77: Rodriguez_NRMC_Presentation

In Conclusion

Focus:
Improving the proportional odds model.
Unknown link between stress and Dravet.

Takeaways:
The proposed model is formidable.
A new stress-Dravet link has been established.

Page 84: Rodriguez_NRMC_Presentation

Acknowledgments

Joseph C. Watkins, Ph.D.
Miao Zhang, M.S.
Michael Hammer, Ph.D., and the Hammer Lab.
Andrew Huerta, Ph.D. and Reneé Reynolds, M.A.
Andrew Carnie, Ph.D.

This research was supported in part by the Western Alliance to Expand Student Opportunities (WAESO) Louis Stokes Alliance for Minority Participation (LSAMP) National Science Foundation (NSF) Cooperative Agreement No. HRD-1101728.

Page 91: Rodriguez_NRMC_Presentation

References

Bender, R., & Grouven, U. (1998). Using binary logistic regression models for ordinal data with non-proportional odds. Journal of Clinical Epidemiology, 51(10), 809–816. doi:10.1016/S0895-4356(98)00066-3

Cox, D. R. (1958). The regression analysis of binary sequences (with discussion). Journal of the Royal Statistical Society, Series B, 20, 215–242.

Lee, S., Emond, M. J., Bamshad, M. J., Barnes, K. C., Rieder, M. J., Nickerson, D. A., NHLBI GO Exome Sequencing Project, . . . , Lin, X. (2012). Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. The American Journal of Human Genetics, 91(2), 224–237. doi:10.1016/j.ajhg.2012.06.007

Long, J. S., & Freese, J. (2006). Regression models for categorical dependent variables using Stata. College Station, TX: Stata Press.

McCullagh, P. (1980). Regression models for ordinal data. Journal of the Royal Statistical Society, 42(2), 109–142.

McCullagh, P. (1999). The proportional odds model. In P. Armitage, Encyclopedia of Biostatistics Vol. 5 (3560–3563). Hoboken, NJ: John Wiley & Sons.

Morgenthaler, S., & Thilly, W. G. (2007). A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: A cohort allelic sums test (CAST). Mutation Research, 615(1–2), 28–56. doi:10.1016/j.mrfmmm.2006.09.003

Nussbaum, R. L., McInnes, R. R., & Willard, H. F. (2007). Thompson & Thompson genetics in medicine (6th ed.). Philadelphia, PA: W. B. Saunders. doi:10.1016/S0015-0282(02)03084-4

Price, A. L., Kryukov, G. V., de Bakker, P. I., Purcell, S. M., Staples, J., Wei, L. J., & Sunyaev, S. R. (2010). Pooled association tests for rare variants in exon-resequencing studies. The American Journal of Human Genetics, 86(6), 832–838. doi:10.1016/j.ajhg.2010.04.005

Wu, M. C., Lee, S., Cai, T., Li, Y., Boehnke, M., & Lin, X. (2011). Rare-variant association testing for sequencing data with the sequence kernel association test. The American Journal of Human Genetics, 89(1), 82–93. doi:10.1016/j.ajhg.2011.05.029

Zawistowski, M., Gopalakrishnan, S., Ding, J., Li, Y., Grimm, S., & Zöllner, S. (2010). Extending rare-variant testing strategies: Analysis of noncoding sequence and imputed genotypes. The American Journal of Human Genetics, 87(5), 604–617. doi:10.1016/j.ajhg.2010.10.012

Page 92: Rodriguez_NRMC_Presentation

Questions?

Ivan Rodriguez: 〈[email protected]〉 .
