Association Modeling With iPlant


Dec 18, 2015


Loren Logan
Transcript
Page 1: Association Modeling With iPlant. Goals of this Section Familiarize with the basic concepts of quantitative genetics: – Traits, phenotypes, genotypes.

Association Modeling With iPlant

Page 2:

Goals of this Section

• Become familiar with the basic concepts of quantitative genetics:
  – Traits, phenotypes, genotypes
• Understand the basics of trait mapping
• Understand the conceptual foundations of association studies
• Learn how to perform a genome-wide association study in the iPlant Discovery Environment
  – Obtain genotypes
  – Run a Mixed Linear Model

Page 3:

Phenotype

Observable (measurable) trait (character) of an organism

Trait: eye color

Phenotype: wild type (red), white eyed, orange eyed

http://www.unc.edu/depts/our/hhmi/hhmi-ft_learning_modules/fruitflymodule/phenotypes.html

Page 4:

Qualitative Traits

Campbell, 8e

Page 5:

Controlled by One Locus

Page 6:

Donahue, R. P., et al., Probable assignment of the Duffy blood group locus to chromosome 1 in man, Proceedings of the National Academy of Sciences 61, 949-955 (1968).

Co-segregation in Pedigree

Page 7:

Quantitative Trait

Carlos Harjes

Page 8:

Trait Varies on a Continuous Scale

[Figure: distribution of trait values; x-axis: Trait Value, y-axis: Frequency]

Page 9:

Quantitative Traits

• Probably caused by multiple loci
  – Interaction effects
  – Environment

If the mean trait value for individuals with marker state MM is different from the mean trait value of individuals with marker state mm (i.e. the marker is associated with the phenotype), then the marker is linked to a quantitative trait locus.

Page 10:

[Figure: marker states (Markers) across Individuals, shown alongside each individual's Trait value]

Marker      State     Mean Trait Value
Marker #3   Present   99 ± 5
Marker #3   Absent    118 ± 8
Marker #6   Present   110 ± 10
Marker #6   Absent    115 ± 13
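The comparison of mean trait values by marker state can be sketched in a few lines of Python. This is only an illustration: the individual trait values below are invented so that the group means resemble the slide (99 vs. 118 for marker #3, 110 vs. 115 for marker #6), and Welch's t statistic is one possible test choice, not necessarily the one used for the slide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in means of two groups."""
    var_a, var_b = variance(group_a), variance(group_b)
    std_err = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / std_err


# Hypothetical trait values, grouped by marker state (not data from the slides).
marker3_present = [94, 99, 104, 97, 101]
marker3_absent = [110, 118, 126, 114, 122]
marker6_present = [100, 110, 120, 105, 115]
marker6_absent = [102, 115, 128, 108, 122]

t3 = welch_t(marker3_present, marker3_absent)
t6 = welch_t(marker6_present, marker6_absent)

print(f"marker #3: t = {t3:.2f}")
print(f"marker #6: t = {t6:.2f}")
```

With these made-up values the statistic for marker #3 is much larger in magnitude than for marker #6, which is the pattern expected when a marker is associated with the trait.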

Page 11:

Quantitative Genetics

Exploring the Genetic Architecture* Underlying Quantitative Traits

*Genetic Architecture
• How many loci?
• Which location?
• How strong?

Page 12:

Tools for Statistical Genetics in the DE

Tool                             Purpose
Genotype by Sequencing Workflow  Automatic pipeline for extracting SNPs from GBS data (with genome from user or from iPlant database)
UNEAK pipeline                   Automatic pipeline for extracting SNPs from GBS data without reference genomes
MLM workflow                     Automatic workflow for fitting Mixed Linear Model
GLM workflow                     Automatic workflow for fitting General Linear Model
QTLC workflow                    Automatic workflow for composite interval mapping
QTL simulation workflow          Automatic workflow for simulating trait data with given linkage map
PLINK                            PLINK implementation of various association models
Zmapqtl                          Interval mapping and composite interval mapping with the options to perform a permutation test
LRmapqtl                         Linear regression modeling
SRmapqtl                         Stepwise regression modeling
AntEpiSeeker                     Epistatic interaction modeling
Random Jungle                    Random Forest implementation for GWAS
FaST-LMM                         Factored Spectrally Transformed Linear Mixed Modeling
Qxpak                            Versatile mixed modeling
gluH2P                           Convert Hapmap format to Ped format
LD                               Linkage Disequilibrium plot
Structure                        Estimation of population structure
PGDSpider                        Data conversion tool
GLMstrucutre                     GLM with population structure as fixed effect

Page 13:

A Model for Quantitative Traits

P = G + E + GG + GE

P = Phenotype
G = Genotype
E = Environment
GG = Interaction between genotypes
GE = Interaction between genotype and environment

Simplified: P = G + e (Phenotype = Genotype + Environment)

Page 14:

A Statistical Model for QTLs

P=G + e

y_ij = β0 + β1 x_i + ε_ij

y_ij  trait value in individual j with genotype i
β0    population average of trait value
β1    effect of marker i on trait value
x_i   marker genotype i
ε_ij  error term

General Linear Model (in matrix notation): Y=Xb + e

Note: If errors are not normally distributed, use generalized linear models
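A minimal sketch of fitting the single-marker model by ordinary least squares, using only the Python standard library. The genotype coding (0 for mm, 1 for MM) and the trait values are assumptions made for illustration. Note that with 0/1 coding the fitted intercept is the mean of the group coded 0; it equals the overall trait average only if the genotype code is centered.

```python
def fit_single_marker(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # estimated marker effect
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


# Hypothetical data: genotype coded 0 (mm) or 1 (MM), one trait value per individual.
genotype = [0, 0, 0, 0, 1, 1, 1, 1]
trait = [118, 114, 122, 118, 99, 97, 101, 99]

b0, b1 = fit_single_marker(genotype, trait)
print(f"intercept = {b0:.1f}, marker effect = {b1:.1f}")
```

Here the estimated marker effect is simply the difference between the two genotype means, which ties the regression view back to the comparison of mean trait values by marker state.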

Page 15:

http://concord.org/publications/newsletter/2009-spring/genetics

Page 16:

Linkage Mapping (QTL Mapping)

• Designed population
  – F2
  – Recombinant inbred (RIL)
  – Double-Haploid (DH)
  – Back-cross (B2)

Page 17:
Page 18:

Limitation of Linkage Mapping

• Needs large number of related individuals
• Resolution limited (interval contains 100s of genes)
• QTL position and effect are confounded

Page 19:

Association Mapping

• Use random collection of individuals from natural population

• Very dense marker map = very high resolution

Page 20:

Linkage & Recombination

Recombination causes linkage decay

Other factors affecting LD:
• Selection (artificial or natural)
• Drift
• Mutations
• Population structure
• Demography
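As a worked illustration of LD and its decay, the sketch below computes the standard disequilibrium coefficient D = p_AB − p_A·p_B and r² for two biallelic loci, and applies the textbook decay relation D_t = (1 − c)^t · D_0 for recombination fraction c after t generations of random mating. The frequencies and the value of c are hypothetical, and the other factors listed above are ignored.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


def decayed_d(d0, c, generations):
    """Expected D after some generations of random mating with recombination fraction c."""
    return d0 * (1 - c) ** generations


# Hypothetical frequencies, chosen only for illustration.
d, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
d_later = decayed_d(d, c=0.1, generations=10)

print(f"D = {d:.3f}, r2 = {r2:.2f}, D after 10 generations = {d_later:.3f}")
```

The smaller the recombination fraction between a marker and a causal locus, the more slowly their association decays, which is why dense marker maps are needed for association mapping.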

Page 21:

Linkage Disequilibrium

Page 22:

Pitfalls: Population Structure

• Difference in allele frequencies between subpopulations

• Due to neutral or adaptive processes

• Can create spurious association

Page 23:

[Figure: alleles T and G shown for each of two groups]

No association within groups
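A toy example of how population structure can create a spurious association. All numbers are invented: subpopulation 1 mostly carries allele T and has a low trait mean, subpopulation 2 mostly carries allele G and has a high trait mean, and within each subpopulation the allele has no effect on the trait.

```python
from statistics import mean


def allele_gap(records):
    """Mean trait of G carriers minus mean trait of T carriers."""
    t_values = [value for allele, value in records if allele == "T"]
    g_values = [value for allele, value in records if allele == "G"]
    return mean(g_values) - mean(t_values)


# Hypothetical (allele, trait value) observations.
subpop1 = [("T", 10), ("T", 12), ("T", 11), ("T", 11), ("G", 10), ("G", 12)]
subpop2 = [("T", 19), ("T", 21), ("G", 20), ("G", 19), ("G", 21), ("G", 20)]

gap_within_1 = allele_gap(subpop1)
gap_within_2 = allele_gap(subpop2)
gap_pooled = allele_gap(subpop1 + subpop2)

print("within subpopulation 1:", gap_within_1)
print("within subpopulation 2:", gap_within_2)
print("pooled sample:", gap_pooled)
```

The pooled sample shows a difference between alleles even though neither subpopulation does; the difference comes entirely from allele frequency and trait mean both varying between subpopulations.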

Page 24:

• Similar effect due to presence of related individuals (esp. in plants)

• Can be accounted for using the data:
  – Estimate number of subpopulations
  – Assign individuals to subpopulation
  – Estimate kinship
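To make "estimate kinship" concrete, here is one very simple marker-based similarity measure: the average proportion of alleles two individuals share across markers (identity by state). This is a sketch under assumptions, not the estimator used by TASSEL or the DE apps; genotypes are assumed to be coded as counts (0, 1, 2) of one allele, and the data are invented.

```python
def ibs_kinship(genotypes):
    """Pairwise identity-by-state similarity.

    genotypes: one list per individual, holding allele counts (0, 1 or 2)
    for the same markers in the same order.
    Returns an n x n matrix with values between 0 and 1.
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # At each marker, two individuals share 2 - |difference| alleles.
            shared = sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j]))
            k[i][j] = shared / (2 * n_markers)
    return k


# Hypothetical genotypes for 4 individuals at 4 markers.
genotypes = [
    [0, 2, 2, 0],
    [0, 2, 2, 0],
    [2, 0, 0, 2],
    [0, 2, 1, 0],
]
k = ibs_kinship(genotypes)
for row in k:
    print(row)
```

Identical genotypes score 1, opposite homozygotes at every marker score 0, and the resulting symmetric matrix has the same shape as the kinship matrix that a mixed model takes as input.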

Page 25:

Accounting for Random Effects: Mixed Linear Models

• "Cost" associated with estimating a parameter
• We are not interested in the value of the parameter, only the variance
• Q-K method (structured association)

y=Xβ+Sα+Qv+Zu+e

Fixed effects:
β  Vector of fixed effects
α  Vector of SNP effects
v  Vector of subpopulation effects

Random effects:
u  Vector of kinship effects
e  Residuals

Q  Matrix of population association (STRUCTURE)
X, S, Z  Incidence matrices
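The kinship matrix does not appear as a term in the model equation; in the usual formulation of this kind of mixed model it enters through the covariance of the random effect u. A sketch, assuming independent residuals and writing K for the kinship matrix and σ_g², σ_e² for the genetic and residual variances (these symbols are not on the slide, and some formulations scale K by a factor of 2):

```latex
y = X\beta + S\alpha + Qv + Zu + e, \qquad
u \sim N\!\left(0,\; K\sigma_g^{2}\right), \qquad
e \sim N\!\left(0,\; I\sigma_e^{2}\right)
```

Under these assumptions only the two variances are estimated for the random part, rather than one parameter per individual, which matches the point that the variance, not the individual values of u, is of interest.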

Page 26:

[Figure: workflow diagram with the labels Traits, Markers, Population Structure, Kinship, STRUCTURE, TASSEL, MLM]

Page 27:

Obtain Markers

Genome Resequencing Workflow

Genotyping By Sequencing

Page 28:

MLM Pipeline for GWAS

[Figure: pipeline diagram with the steps marker, trait, filter, convert, impute, K, GLM, MLM]

http://www.maizegenetics.net/statistical-genetics

Zhang et al. Nature Genetics. 2010; doi:10.1038/ng.546

TASSEL: Ed Buckler (Cornell University)

http://www.maizegenetics.net/tassel/docs/Tassel_User_Guide_3.0.pdf

Page 29:

MLM Input Files

• Hapmap file
• Phenotype data
• Kinship matrix*
• Population structure*

Phenotype data: [table with columns: strain, traits]

Population structure: [table with a strain column and 3 population columns that sum to 1]

* Kinship matrix & population structure data can be generated using TASSEL or with the “MLM Workflow” App in the DE

Page 30:

Origin

• Hapmap file:
  – Download (e.g. http://triticeaetoolbox.org/)
  – Convert from PLINK (.map/.ped) using Tassel 3 Conversion
  – Impute with NPUTE
  – Transform to numerical format with NumericalTransform
• Phenotype data
• Kinship matrix
  – Generate from hapmap marker data with Kinship
• Population structure
  – Generate using ParallelStructure
  – Convert to matrix with Structure2Tassel

Page 31:

MLM Output

• MLM1.txt
  – Marker
  – “df” degrees of freedom
  – “F” F statistic for the test of the marker
  – “p” p-value
  – “errordf” df used for denominator of F-test
  – etc.
• MLM2.txt
  – Estimated effect for each allele for each marker
• MLM3.txt
  – The compression results show the likelihood, genetic variance, and error variance for each compression level tested during the optimization process.

See TASSEL manual for details: http://www.maizegenetics.net/tassel/docs/Tassel_User_Guide_3.0.pdf
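A hedged sketch of post-processing a per-marker results table such as MLM1.txt. It assumes a tab-delimited file with a header row containing columns named Marker and p (the names listed above); the real file may contain more columns and a different layout, so check it against the TASSEL manual. The sample rows are invented, and the Bonferroni cutoff (alpha divided by the number of markers tested) is one common convention for multiple testing, not something prescribed by the slides.

```python
import csv
import io

# Invented sample in the assumed layout; replace with open("MLM1.txt") for a real file.
sample_output = (
    "Marker\tdf\tF\tp\terrordf\n"
    "m1\t2\t12.4\t0.00002\t270\n"
    "m2\t2\t1.6\t0.2\t270\n"
    "m3\t2\t4.6\t0.011\t270\n"
    "m4\t2\t0.5\t0.6\t270\n"
)


def significant_markers(handle, alpha=0.05):
    """Return (hits, threshold), where hits are (marker, p) pairs below the cutoff."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    threshold = alpha / len(rows)  # Bonferroni: one test per marker row
    hits = [(r["Marker"], float(r["p"])) for r in rows if float(r["p"]) < threshold]
    return sorted(hits, key=lambda hit: hit[1]), threshold


hits, threshold = significant_markers(io.StringIO(sample_output))
print("threshold:", threshold)
for marker, p_value in hits:
    print(marker, p_value)
```

Sorting by p-value puts the strongest candidate markers first; their estimated allele effects can then be looked up in MLM2.txt.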

Page 32:

THANKS!