ASHG Interactive Workshop: Overview and Interpretation of GTEx Resources: eQTLs and Gene Expression

Disclosure for: Kristin G. Ardlie, François Aguet, Ayellet V. Segrè, Jared L. Nedzel, Stephen Montgomery

No Relevant Conflicts to Disclose


Overview and Interpretation of GTEx Resources: eQTLs and Gene Expression

ASHG 2017 Annual Meeting, 10/18/2017

GTEx Workshop Agenda

• Overview of study and data
• Portal demonstration
• Jupyter notebook
• GWAS-eQTL challenges


Association of common DNA variants with diseases and traits

[Figure: aligned DNA sequences from controls and cases that differ at common variant sites. Source: https://www.ebi.ac.uk/gwas]

Genome-wide association studies (GWAS) led to the discovery of >10,000 common DNA variants associated with >600 diseases/traits.

~95% of GWAS SNPs are located in non-coding regions.


eQTLs: expression quantitative trait loci

Hypothesis: the functional effect of most (non-coding) GWAS variants is modification of gene expression.

Regulatory variation is measured as expression quantitative trait loci (eQTLs).

Measured in a population:

[Figure: expression (y-axis) plotted against genotype (x-axis: AA, AT, TT) for a T/A variant.]
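As a rough illustration of what "expression by genotype" means numerically, the sketch below fits a least-squares line of expression on allele dosage. The data, the dosage coding (AA=0, AT=1, TT=2), and the function name `eqtl_slope` are all hypothetical; real eQTL mapping also includes covariates and normalization steps not shown here.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope and intercept of expression on genotype dosage."""
    x_bar, y_bar = mean(dosages), mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical toy data: dosage counts copies of the T allele (AA=0, AT=1, TT=2).
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, intercept = eqtl_slope(dosages, expression)
print(f"effect per T allele: {slope:.2f}, baseline: {intercept:.2f}")
```

A non-zero slope across individuals is what an eQTL association test looks for; its significance still has to be assessed statistically.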


Regulation of gene expression: multi-tissue and multi-individual

Across a population (e.g., eQTL studies in blood)

Across tissues or cell types: functional genomic maps (e.g., ENCODE, Roadmap Epigenomics)

Assessing role of genetic variation on gene function requires both dimensions

The Genotype-Tissue Expression (GTEx) project

[Figure: body map labeled with the sampled tissues: Breast - Mammary Tissue; Artery - Coronary; Heart - Left Ventricle; Esophagus - Muscularis; Esophagus - Gastroesophageal Junction; Thyroid; Esophagus - Mucosa; Heart - Atrial Appendage; Artery - Aorta; Lung; Spleen; Colon - Sigmoid; Testis; Skin - Not Sun Exposed (Suprapubic); Ovary; Colon - Transverse; Pancreas; Adipose - Subcutaneous; Liver; Stomach; Pituitary; Artery - Tibial; Nerve - Tibial; Adrenal Gland; Adipose - Visceral (Omentum); Small Intestine - Terminal Ileum; Prostate; Vagina; Whole Blood; Uterus; Muscle - Skeletal; Skin - Sun Exposed (Lower leg); Cells - Transformed fibroblasts; Cells - EBV-transformed lymphocytes.
Brain: Anterior cingulate cortex (BA24); Caudate (basal ganglia); Cerebellar Hemisphere; Cerebellum; Cortex; Frontal Cortex (BA9); Hippocampus; Hypothalamus; Nucleus accumbens (basal ganglia); Putamen (basal ganglia).]

Atlas of gene expression and eQTLs in non-diseased human tissues from up to 960 recently deceased donors

• 53 tissue sites
• 11 distinct brain regions
• 2 cell lines

• Core molecular assays:
  • WGS/WES (primarily whole blood)
  • RNA-seq
  • Small RNA-seq (future)

This workshop

eGTEx: the Enhancing GTEx project

[Embedded page from the eGTEx Commentary, Nature Genetics, advance online publication, page 2. The excerpt begins and ends mid-sentence.]

[…] tissue-specific mechanisms for disease-associated variants, there remains a need to obtain multi-omics reference data to study the effects of genetic variation across multiple tissues and multiple layers of molecular complexity. In addition to complementing studies of complex genetic diseases, expanding multitissue molecular data from ‘normal’ individuals can enhance cancer studies (refs. 20,21) (which currently comprise 28% of all requests for GTEx data use), by distinguishing cancer-specific alterations and elucidating the tissue specificity of certain cancers and their mutations (refs. 22,23).

Here we introduce the US National Institutes of Health (NIH) Common Fund’s Enhancing GTEx (eGTEx) project, which seeks to complement the gene expression phenotypes determined in the GTEx project with intermediate phenotypes across the same tissues and individuals (Fig. 1). These additional data types will provide a more complete reference of how genetic differences cascade through molecular and cellular phenotypes to impact organismal phenotypes. To achieve this goal, eGTEx is applying diverse molecular assays to the GTEx sample collection, including DNase I hypersensitivity, ChIP–seq, DNA and RNA methylation, ASE, protein expression, somatic mutation, and telomere length assays. Together, the eGTEx reference aims to enable high-resolution identification of the mechanistic impacts of genetic variants and their role in human diseases, and it will serve as an enabling resource that will facilitate novel integrative and holistic computational methods development and biological insights.

The eGTEx project: study design and assays

The goal of the GTEx project is to establish a national multitissue cohort for molecular phenotypes. The current release of GTEx (database of Genotypes and Phenotypes (dbGaP) accession phs000424.v7.p2) provides 11,688 transcriptomes from 714 individuals and 53 tissues (median of 17 tissues per individual, 173 samples per tissue). The next release, v8, is expected to include 17,500 transcriptomes from ~850 individuals, and final data production for the project is targeted for late 2017. In addition to molecular data, GTEx includes pathology reports, histology images and reports, and donor characteristics, including ethnicity, age, and sex. Within GTEx, tissues are obtained from deceased donors with next-of-kin consent to the collection and banking of anonymized samples for scientific research (ref. 24). Two existing strengths of the GTEx project are the large number of tissues collected from each donor, facilitating characterization of gene expression across a wide variety of tissues, and the relatively large size of the donor population, allowing one to evaluate the contribution of individual genetic variation. The first steps of assaying genetic variation and its impact on gene expression are the focus of two accompanying consortium papers (refs. 25,26). However, fully understanding how a genetic variant regulates gene expression, such as through changes in DNA methylation or the binding affinity of a transcription factor, and subsequently connecting the downstream effects of differential gene expression through to protein abundance require additional molecular assays.

The goal of the eGTEx initiative is to enhance understanding of gene regulation by performing additional molecular analyses on the same tissues that underwent gene expression analysis. Because of the large size of the GTEx tissue collection (over 25,000 samples), the variable quality across the samples collected, and the relatively small aliquot remaining for each sample, the eGTEx initiative will analyze a subset of the entire collection. The study design for eGTEx activities was allocated across two ‘dimensions’ of analysis: phase I, involving a relatively small number of donors (~15) analyzed for a large number of different tissues (>20); and phase II, involving a relatively small number of tissues (4–6) analyzed in a larger number of donors (150–200). eGTEx has planned to use the same tissues from the same individuals for as many assays as possible. However, because available aliquots are limited, some assays require frozen tissue as input, and the throughput differs by assay, the extent of overlap and the number of phenotypes that will be generated from each individual sample will vary. The molecular phenotypes being studied are shown in Table 1 and described in the following sections.

DNA accessibility

Systematic understanding of the impact of genetic variation on gene expression requires both comprehensive delineation of regulatory DNA and an understanding of the degree to which individual regulatory regions vary at the population level. DNA is tightly packaged into chromatin inside our cells, with 147-nucleotide segments of DNA wrapped around each histone octamer (themselves separated by ~50-nucleotide linkers). Displacement of nucleosomes through the binding of transcriptional regulators results in accessible regions of ‘open chromatin’, which can be mapped using endonucleases such as DNase I (refs. 27,28). Past work has shown that disease and trait associations are highly concentrated in accessible elements (ref. 29) and that allelic variation in DNA accessibility can precisely map the effects of sequence variation on transcription factor activity (refs. 3,30,31). In eGTEx, we will examine DNA accessibility using both the DNase I hypersensitivity assay and the higher-resolution DNase I footprinting assay to map transcription factor occupancy within regulatory DNA at nucleotide-level resolution. The footprints revealed by DNA accessibility are highly unbiased and capture variation in diverse regulatory elements, including promoters, enhancers, silencers, insulators, and locus-control regions.

Histone modifications

Each histone protein in the chromatin fiber has a long amino acid tail that can be […]

[Figure 1 labels: Telomere length; Histone modifications; DNA accessibility; Protein quantification; Translation; Transcription; Allele-specific expression; Somatic mutation; DNA methylation; RNA methylation]

Figure 1 Quantifying layers of molecular and cellular phenotypes. The eGTEx project plans to study telomere length, DNA accessibility, histone modifications, DNA and RNA methylation, somatic mutation, allele-specific expression, and protein quantification across individuals and tissues.

eGTEx Project, Nat. Genet., 2017

eGTEx data types
• Protein quantifications (x2)
• Methylation (WGBS)
• Histone modifications (ChIP-seq)
• DNase-seq
• mmPCR-seq (deep ASE)
• Somatic DNA-seq (deep exome seq)
• Analysis of telomere structure

Sample and data processing overview

[Flowchart labels: Donor; Pathology review; Tissues; Blood & skin; Brain; U Miami Brain Bank; 9-11 sub-regions; (Liquid N2); (PAXgene); LCLs; Fibroblasts; RNA; DNA; Quality control]

RNA sequencing
• QC: RIN ≥ 5.5
• polyA+ (Illumina TruSeq)
• 2x76bp, ≥ 50M reads

DNA Analysis
• OMNI 2.5M/5M: 450 donors
• WES (100x)
• WGS (30x): HiSeq 2000, HiSeq X

Data processing and quality control pipelines

[Flowchart labels: Genotype QC: samples & variants; VCF; RNA-seq alignment, quantification & QC; Expression tables, Covariates; eQTL mapping]


Genotype QC pipeline


RNA-seq pipeline: alignment, quantification, QC

https://github.com/broadinstitute/gtex-pipeline/tree/master/rnaseq


eQTL mapping pipeline

https://github.com/broadinstitute/gtex-pipeline/tree/master/qtl

RNA-seq and eQTL pipeline details

• Pipeline components selected and updated based on internal and published benchmarks (e.g., Teng et al., Genome Biology, 2016).

Release | V6p | V7 | V8 | V9
Genome build | GRCh37 | GRCh37 | GRCh38 | GRCh38
GENCODE annotation | v19 | v19 | v26 | v26
Aligner | TopHat 1.4.1 | STAR 2.4.2a | STAR 2.5.3a | STAR 2.5.3a
Gene expression | RNA-SeQC 1.1.8 | RNA-SeQC 1.1.9 | RNA-SeQC 1.1.9 | RNA-SeQC 1.1.9
Transcript expression | FluxCapacitor 1.6 | RSEM 1.2.22 | RSEM 1.3.0 | RSEM 1.3.0
Quality control metrics | RNA-SeQC 1.1.8 | RNA-SeQC 1.1.9 | RNA-SeQC 1.1.9 | RNA-SeQC 1.1.9
QTL mapper | FastQTL

Current public release

Overview of GTEx resources: open-access data

• Expression
  • Gene-level expression (TPM, counts)
  • Transcript-level expression (TPM, counts, isoform proportions)
  • Exon read counts
• QTLs
  • Single-tissue eQTLs (cis- and trans-)
  • Multi-tissue eQTLs
  • Future: splicing QTLs
• Histology images
• De-identified public access sample and subject metadata

All open-access data is available at gtexportal.org

Overview of GTEx resources: protected data

• Sequence data:
  • RNA-seq (2x76 bp, unstranded, >50M reads/sample)
  • WGS (30x coverage) and WES (100x coverage)
  • Illumina Omni2.5/5 microarray genotypes (subset of 450 donors)
• Allele-specific expression (ASE)
• Full sample and subject metadata
• Future: eGTEx sequence data
  • ChIP-seq
  • WGBS-seq

All protected-access data is available at dbGaP, under accession phs000424

GTEx data releases

Release | V6/V6p | V7 | V8 | V9
RNA-seq | 8,555 | 11,688 | 17,382 | ~20,000
WGS | 148 | 635 | 838 | ~960
WES | 520 | 603 | | ~960
OMNI | 450 | 450 | 450 | 450
RNA-seq w/ GT | 7,333 | 10,361 | 15,253 | ~20,000
eQTL tissues | 44 | 48 | 49 | 49

Analysis freezes

Midpoint publications: V6p
• Full list available at https://gtexportal.org/home/publicationPage
• Data remains available on GTEx Portal

No publication embargo on V7

Current public release

GTEx data production: samples per donor

[Figure: samples available per donor, by tissue. One axis indexes donors (1 to 948); the other lists the tissues: Adipose - Subcutaneous; Adipose - Visceral (Omentum); Adrenal Gland; Artery - Aorta; Artery - Coronary; Artery - Tibial; Bladder; Brain - Amygdala; Brain - Anterior cingulate cortex (BA24); Brain - Caudate (basal ganglia); Brain - Cerebellar Hemisphere; Brain - Cerebellum; Brain - Cortex; Brain - Frontal Cortex (BA9); Brain - Hippocampus; Brain - Hypothalamus; Brain - Nucleus accumbens (basal ganglia); Brain - Putamen (basal ganglia); Brain - Spinal cord (cervical c-1); Brain - Substantia nigra; Breast - Mammary Tissue; Cells - Cultured fibroblasts; Cells - EBV-transformed lymphocytes; Cervix - Ectocervix; Cervix - Endocervix; Colon - Sigmoid; Colon - Transverse; Esophagus - Gastroesophageal Junction; Esophagus - Mucosa; Esophagus - Muscularis; Fallopian Tube; Heart - Atrial Appendage; Heart - Left Ventricle; Kidney - Cortex; Kidney - Medulla; Liver; Lung; Minor Salivary Gland; Muscle - Skeletal; Nerve - Tibial; Ovary; Pancreas; Pituitary; Prostate; Skin - Not Sun Exposed (Suprapubic); Skin - Sun Exposed (Lower leg); Small Intestine - Terminal Ileum; Spleen; Stomach; Testis; Thyroid; Uterus; Vagina; Whole Blood.]

Expression data on GTEx Portal

Transcript-level expression
• Based on full GENCODE annotation
• Quantified with RSEM
• TPM
• Expected read counts
• No covariate correction

Gene-level expression
• Based on collapsed GENCODE annotation
• Quantified with RNA-SeQC
• TPM
• Read counts
• No covariate correction

eQTL inputs
• Based on gene-level quantifications
• Additional normalization: TMM of read counts; inverse normal transform
• Covariates (hidden + known) in separate file
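The sketch below shows one way to read a gene-level expression table into memory using only the standard library. It assumes the table is in GCT v1.2 layout (a version line, a dimensions line, then a header with `Name`, `Description`, and one column per sample); the slides do not state the file format, and the gene IDs, the second sample ID, and the values are made up.

```python
import csv
import io


def read_gct(handle):
    """Parse a GCT v1.2 table into {gene_id: {sample_id: value}}."""
    version = handle.readline().strip()            # e.g. "#1.2"
    if not version.startswith("#1."):
        raise ValueError("not a GCT file")
    n_rows, n_cols = (int(v) for v in handle.readline().split()[:2])
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)                           # Name, Description, sample IDs
    samples = header[2:]
    table = {}
    for row in reader:
        if row:
            table[row[0]] = dict(zip(samples, (float(v) for v in row[2:])))
    if len(table) != n_rows or len(samples) != n_cols:
        raise ValueError("dimensions do not match the GCT header")
    return table


# Hypothetical two-gene, two-sample table (values are invented TPMs).
toy_gct = (
    "#1.2\n"
    "2\t2\n"
    "Name\tDescription\tGTEX-1117F-0226-SM-5GZZ7\tGTEX-TOY01-0001-SM-00000\n"
    "GENE_ID_1\tGENE_A\t12.5\t0.0\n"
    "GENE_ID_2\tGENE_B\t3.25\t7.0\n"
)
expression = read_gct(io.StringIO(toy_gct))
print(expression["GENE_ID_1"]["GTEX-1117F-0226-SM-5GZZ7"])
```

For a real, gzip-compressed file the same function could be given a handle from `gzip.open(path, "rt")`.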

Annotation used for gene-level expression quantification

• RNA-seq protocol:
  • polyA+
  • Unstranded
• Ambiguity in quantifying exon domains shared between sense and anti-sense transcripts
• Collapsing procedure:
  • Masks overlapping intervals
  • Masks ‘readthrough’ and ‘retained intron’ transcripts


Definition of cis-eQTLs in GTEx

• cis-eQTL: genome-wide significant association between ≥ 1 eVariant and eGene, with associations tested within ±1Mb cis-window around TSS. Does not imply evidence of allelic effects at each locus.

• eGene: gene with at least one significant eQTL (at 5% FDR).
• eVariant: variant with a significant association to ≥1 eGene.
• Effect allele: ALT allele (not necessarily the minor allele).

[Figure: expression (y-axis) by genotype (x-axis: AA, AT, TT).]
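The definitions above can be expressed as two small filters: which variants fall in the ±1 Mb cis-window of a gene, and which genes count as eGenes at 5% FDR. This is a minimal sketch; the tuple layout, function names, variant IDs, and q-values are hypothetical.

```python
CIS_WINDOW_BP = 1_000_000  # ±1 Mb around the TSS, as stated on the slide


def cis_variants(gene_chrom, tss, variants, window=CIS_WINDOW_BP):
    """Keep (chrom, position, variant_id) tuples within the cis-window of a TSS."""
    return [v for v in variants
            if v[0] == gene_chrom and abs(v[1] - tss) <= window]


def egenes(q_value_by_gene, fdr=0.05):
    """Genes whose gene-level q-value passes the FDR threshold."""
    return sorted(g for g, q in q_value_by_gene.items() if q <= fdr)


# Hypothetical variants around a gene with TSS at chr1:5,000,000.
variants = [
    ("chr1", 3_999_999, "var_a"),   # just outside the window
    ("chr1", 4_000_000, "var_b"),   # on the window boundary
    ("chr1", 5_500_000, "var_c"),
    ("chr2", 5_000_000, "var_d"),   # wrong chromosome
]
tested = cis_variants("chr1", 5_000_000, variants)
significant_genes = egenes({"GENE_A": 0.01, "GENE_B": 0.20, "GENE_C": 0.05})
print([v[2] for v in tested], significant_genes)
```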


Data normalization for eQTL analyses

• Expression thresholds:
  • ≥6 counts in ≥ 20% of samples AND
  • ≥0.1 TPM in ≥ 20% of samples
• Normalization:
  • Between-sample normalization: TMM (from edgeR)
    • Corrects for library size differences and expression outlier effects
  • Within-gene normalization: inverse normal transform
    • Attenuates outliers
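Two of the steps above are simple enough to sketch with the standard library: the expression thresholds and the within-gene inverse normal transform. TMM is not reproduced here. The rank offset `rank / (n + 1)` and the arbitrary handling of ties are assumptions of this sketch, not something the slides specify.

```python
from statistics import NormalDist


def passes_thresholds(counts, tpm, min_count=6, min_tpm=0.1, min_fraction=0.2):
    """True if >=6 counts AND >=0.1 TPM are each reached in >=20% of samples."""
    n = len(counts)
    enough_counts = sum(c >= min_count for c in counts) >= min_fraction * n
    enough_tpm = sum(t >= min_tpm for t in tpm) >= min_fraction * n
    return enough_counts and enough_tpm


def inverse_normal_transform(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # ties keep input order
    transformed = [0.0] * n
    for rank, i in enumerate(order, start=1):
        transformed[i] = NormalDist().inv_cdf(rank / (n + 1))
    return transformed


# Hypothetical expression values for one gene; 1000 is an extreme outlier.
normalized = inverse_normal_transform([10.0, 1000.0, 3.0])
kept = passes_thresholds([10, 10, 0, 0, 0], [0.5, 0.5, 0.0, 0.0, 0.0])
dropped = passes_thresholds([0] * 9 + [10], [0.0] * 9 + [0.5])
print(normalized, kept, dropped)
```

Because only ranks are used, the outlier at 1000 ends up no further from the center than the smallest value, which is the "attenuates outliers" effect noted above.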


Covariate correction in eQTL analyses

• Genotype: top 3 PCs, sex, sequencing platform (HiSeq 2000, HiSeq X)
• Expression: significant technical confounders may be unknown; estimation of hidden confounders is key (e.g., through PEER factors)
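One common way to account for covariates in a linear model is to remove their contribution from the expression values before testing genotype. The sketch below does this by projecting out an intercept and each covariate column with Gram-Schmidt orthogonalization. It does not estimate PEER factors; any hidden factors would simply be passed in as additional columns. The function name and toy data are hypothetical.

```python
def residualize(y, covariates):
    """Remove from y the part explained by an intercept plus the covariates."""
    n = len(y)
    basis = []  # orthonormal basis of the covariate space
    for column in [[1.0] * n] + [list(c) for c in covariates]:
        v = column[:]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > 1e-12:  # skip columns that add no new direction
            basis.append([vi / norm for vi in v])
    residual = list(y)
    for b in basis:
        dot = sum(ri * bi for ri, bi in zip(residual, b))
        residual = [ri - dot * bi for ri, bi in zip(residual, b)]
    return residual


# Hypothetical example: expression is fully explained by one covariate.
covariate = [1.0, 2.0, 3.0, 4.0]
expression = [2.0 + 3.0 * c for c in covariate]
residuals = residualize(expression, [covariate])
print(residuals)
```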

[Supplementary Figure 8: heatmap of variance explained for each tissue/sample-covariate pair; color scale from 0 to 0.25. Covariate groups: Nucleic acid isolation; Sample collection; Sequencing Metrics. Covariates shown include RIN, total ischemic time, time in PAXgene fixative, autolysis score, mapping rate, duplication rate, rRNA rate, intronic and exonic rate, and genes detected.]

Supplementary Figure 8. Sample covariates associated with PEER factors in each tissue. For each tissue, adjusted R² reflecting the proportion of variance explained by each sample-specific covariate, for the entire PEER component removed from the expression data. Each cell reflects variance explained for a tissue/covariate pair, color scale at bottom. Grey cells represent pairs with insufficient data for estimation. (Source: www.nature.com/nature, page 31)

[Supplementary Figure 9: heatmap of variance explained for each tissue/donor-covariate pair; color scale from 0 to 0.25. Covariate groups: Death parameters; Demography; Medical history; Blood parameters; Collection; Tissue recovery. Covariates shown include age, gender, race, height, weight, BMI, Hardy scale, ischemic time, cohort, manner and place of death, ventilator use immediately prior to death, and medical history items.]

Supplementary Figure 9. Donor covariates associated with PEER factors in each tissue. For each tissue, adjusted R² reflecting the proportion of variance explained by each donor-specific covariate, for the entire PEER component removed from the expression data. Each cell reflects variance explained for a tissue/covariate pair, color scale at bottom. Grey cells represent pairs with insufficient data for estimation. (Source: www.nature.com/nature, page 32)


eQTL mapping and eGene discovery

• Variants in cis-window (±1Mb from TSS) may be correlated due to linkage disequilibrium (LD)

• LD must be incorporated in multiple hypothesis testing correction when establishing genome-wide significance

• Empirical p-values from permutation of genotypes
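The idea in the bullets above can be sketched as a direct permutation test: take the strongest association across all variants in the window as the gene's statistic, then compare it with the same statistic after shuffling sample labels. Shuffling the expression vector is equivalent to permuting sample labels of all genotypes jointly, so the LD between variants is preserved. This is a plain illustration with hypothetical data, not the adaptive scheme used by FastQTL.

```python
import random


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def best_abs_r(genotypes_by_variant, expression):
    """Strongest |correlation| over all variants in the cis-window."""
    return max(abs(pearson_r(g, expression)) for g in genotypes_by_variant)


def empirical_p_value(genotypes_by_variant, expression, n_perm=200, seed=0):
    rng = random.Random(seed)
    observed = best_abs_r(genotypes_by_variant, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-expression link, keeps LD
        if best_abs_r(genotypes_by_variant, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical gene with two correlated (in-LD) variants and 12 samples.
genotypes = [
    [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 1],
]
expression = [1.0, 1.1, 0.9, 1.2, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
p_gene = empirical_p_value(genotypes, expression)
print(p_gene)
```

Because the maximum is taken over all variants in each permutation, the resulting p-value is already corrected for the (correlated) tests within the window.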


Multiple hypothesis correction for eGene detection

[Figure, three panels (Gene A, Gene B, Gene C): frequency (y-axis) of permutation p-values against -log10(min. p-value) (x-axis), with the fitted Beta distribution, the nominal p-value, and the empirical p-value marked.]

[Figure: histogram of the empirical p-value distribution across genes (x-axis: p-value, 0 to 1; y-axis: number of genes).]

Empirical p-value distribution → q-values (Storey) → eGenes at ≤ 0.05 FDR

Delaneau et al., Bioinformatics, 2016
Storey & Tibshirani, PNAS, 2003
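The last step of this workflow, turning gene-level empirical p-values into q-values, can be sketched as follows. This is a simplified version of Storey's procedure that estimates the null proportion with a single fixed tuning value (`lam=0.5`, an assumption of this sketch) rather than the smoothed estimate used by the `qvalue` software; the p-values are invented.

```python
def storey_q_values(p_values, lam=0.5):
    """Simplified Storey q-values with a fixed-lambda estimate of pi0."""
    m = len(p_values)
    # Estimated proportion of true nulls, capped at 1.
    pi0 = min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values are monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q


# Hypothetical gene-level empirical p-values.
empirical_p = [0.001, 0.002, 0.003, 0.6, 0.7, 0.8, 0.9, 0.95]
q_values = storey_q_values(empirical_p)
n_egenes = sum(q <= 0.05 for q in q_values)
print(q_values, n_egenes)
```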


Threshold for significant variant-gene pairs

Nominal p-value threshold for each gene $g$: $F_g^{-1}(p_t)$, where $F_g^{-1}$ is the inverse cumulative Beta distribution of the gene.

$p_t$: empirical p-value of gene closest to 0.05 FDR threshold

[Figure: frequency (y-axis) of permutation p-values against -log10(min. p-value) (x-axis), with the fitted Beta distribution, the nominal p-value, and the empirical p-value marked.]
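To make the threshold formula concrete, the sketch below evaluates the inverse Beta CDF in the one case where it has a closed form: a Beta distribution with first shape parameter 1 and second shape parameter n, whose CDF is 1 - (1 - x)^n. This corresponds to the minimum of n independent uniform p-values. In practice both shape parameters are fitted per gene from the permutations, so the fixed first parameter, the values of n, and the value of `p_t` here are all illustrative assumptions.

```python
def nominal_threshold(p_t, n_effective_tests):
    """Inverse CDF of Beta(1, n) at p_t, where F(x) = 1 - (1 - x) ** n."""
    return 1.0 - (1.0 - p_t) ** (1.0 / n_effective_tests)


# Hypothetical values: p_t = 0.001, genes with 50 or 500 effective tests.
p_t = 0.001
threshold_small_window = nominal_threshold(p_t, 50)
threshold_large_window = nominal_threshold(p_t, 500)
print(threshold_small_window, threshold_large_window)
```

A gene with more effectively independent variants in its window gets a stricter nominal threshold, which is why the threshold is gene-specific.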

Example for portal demonstration

NDRG4, SETD6, CNOT1 near QT interval-associated variant, rs37062

GTEx Consortium, Science, 2015

Jupyter notebook: overview of expression and eQTL data

• The interactive parts of the workshop will be conducted using a Jupyter notebook, GTEx_ASHG17_workshop.ipynb
• On the GTEx Portal, go to https://gtexportal.org/workshop.html
• Click on “Start the notebook” to begin. This will launch a cloud-based instance of the notebook, with access to all data examples. Please note that the notebook is read-only.
• The notebook is also available for download at https://github.com/broadinstitute/gtex-ashg2017-workshop


Organization of GTEx data: common identifiers

• All sample attributes are indexed by Sample ID
• All donor attributes are indexed by Donor ID
• The donor-specific tissue collection ID is not a proxy for tissue type

Sample ID: GTEX-1117F-0226-SM-5GZZ7
• Donor ID: GTEX-1117F
• Donor-specific tissue collection ID: 0226
• Aliquot ID: SM-5GZZ7
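A sample ID with this layout can be split into its components with a few string operations. The sketch assumes every ID follows the five-field pattern of the example on the slide; the function name and dictionary keys are hypothetical.

```python
def parse_sample_id(sample_id):
    """Split a sample ID laid out like GTEX-1117F-0226-SM-5GZZ7."""
    parts = sample_id.split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected sample ID layout: {sample_id}")
    return {
        "donor_id": "-".join(parts[:2]),       # GTEX-1117F
        "tissue_collection_id": parts[2],      # 0226 (not a tissue type)
        "aliquot_id": "-".join(parts[3:]),     # SM-5GZZ7
    }


parsed = parse_sample_id("GTEX-1117F-0226-SM-5GZZ7")
print(parsed)
```

Since the tissue collection ID is donor-specific, the tissue type has to be looked up in the sample attributes by Sample ID rather than inferred from this field.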

Implications of GTEx for interpreting GWAS signals

Many genes in the same region have eQTLs

[Figure: eQTL significance (y-axis) against genomic position (x-axis), shown separately for gene A, gene B, and gene C in the same region.]

… with different effects across tissues

[Figure: eQTL significance (y-axis) against genomic position (x-axis) for gene A, gene B, and gene C.]

… with different effects across tissues

[Figure: eQTL significance (y-axis) against genomic position (x-axis) for gene A, gene B, and gene C.]

Which one(s) explain the disease risk?

Lots of eQTL data means that seemingly significant associations are the norm

eQTL/GWAS interpretation needs to be examined more cautiously

~1/3 of all variants could meet this criterion

Co-localization approaches combine eQTL and GWAS signals

[Figure: GWAS signal and eQTL signal.]

Giambartolomei et al., PLoS Genet, 2014

iPython notebook task
• Correlation of GWAS and eQTL summary statistics over an associated hit for BMI.
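A minimal version of this kind of comparison is the correlation between -log10 p-values from the GWAS and from the eQTL analysis at the same variants. The sketch below uses invented p-values for five shared variants and two hypothetical genes; it is a descriptive check, not a formal colocalization test.

```python
import math


def neg_log10(p_values):
    return [-math.log10(p) for p in p_values]


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical p-values at five variants shared by the GWAS and the eQTL data.
gwas_p = [1e-2, 1e-4, 1e-8, 1e-5, 1e-1]
eqtl_p_gene1 = [1e-1, 1e-3, 1e-9, 1e-4, 0.5]   # peaks at the same variant
eqtl_p_gene2 = [1e-6, 1e-2, 0.3, 1e-5, 1e-7]   # peaks elsewhere

r_gene1 = pearson_r(neg_log10(gwas_p), neg_log10(eqtl_p_gene1))
r_gene2 = pearson_r(neg_log10(gwas_p), neg_log10(eqtl_p_gene2))
print(round(r_gene1, 3), round(r_gene2, 3))
```

A high correlation suggests the two signals share a shape across the locus, but LD alone can produce this, so it does not establish a shared causal variant.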

Co-localization of eQTLs and GWAS in GTEx

• Correlation of GWAS and eQTL summary statistics for two separate genes over an associated hit for BMI

[Figure: -log10(eQTL P-value) on the y-axis; BMI association rs2008514.]

Toy example. Not necessarily the causal gene or tissue.

Bimodal distribution of tissue-specificity of cis-eQTLs

[Figure: # eGenes (y-axis) by # tissues (x-axis); V6p eQTLs, 44 tissues.]

Multi-tissue eQTL meta-analysis: Metasoft (Han, B and Eskin, E, AJHG 2011)

Number of tissues per eQTL in LD with GWAS variants increases with increased power (multi-tissue analysis)

[Figure: # GWAS variants (y-axis) by # tissues per GWAS variant (x-axis). Multi-tissue analysis: median = 31 tissues. Single-tissue analysis: median = 5 tissues.]

[Figure: single-tissue eQTL p-value against multi-tissue eQTL posterior probability, highlighting eQTLs detected only with multi-tissue analysis.]

Co-localization of eQTLs and GWAS in GTEx

• Correlation of GWAS and eQTL summary statistics for two separate tissues over an associated hit for BMI

Toy example. Not necessarily the causal gene or tissue.

Detecting GWAS/eQTL overlap is easy in principle

[Figure: GWAS significance and eQTL significance (y-axes) against position (x-axis) around a gene, with both signals peaking at the same place.]

Same signal in GWAS and eQTL: colocalization!

Difficult in practice

[Figure: GWAS significance and eQTL significance (y-axes) against position (x-axis).]

Unclear if same signal in both

Methods to detect colocalization

Method | Method archetype | Identifies causal variants? | Multiple causal variants?
COLOC | Bayesian | No | No
Sherlock | Bayesian | No | Yes
eCAVIAR | complete likelihood (exhaustive search) | Yes | Yes, but intractable
FINEMAP | complete likelihood (stochastic search) | Yes | Yes
SMR | Mendelian randomization | No | No
TWAS | TWAS | No | Yes
MetaXcan | TWAS | No | Yes

Page 47

Further caveats: Some genes have multiple independent eQTLs

• Could explain complexity of GWAS/eQTL signals

• Conditional eQTLs not yet tested for colocalization with GWAS

Zeng et al. (2016) bioRxiv.

[Figure: significance by position across a gene for the primary eQTL and, after regressing out the primary eQTL signal, the secondary eQTL.]
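The "regress out primary eQTL signal" step can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the conditional analysis used by GTEx: the genotypes, effect sizes (1.0 and 0.5), and sample size are invented, and a single-covariate least-squares fit stands in for a full model with additional covariates.

```python
import random
import statistics

def residualize(y, x):
    """Residuals of y after simple least-squares regression on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]

def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den

random.seed(0)
n = 500
# Hypothetical genotype dosages (0, 1, 2) at two independent variants.
primary = [random.choice([0, 1, 2]) for _ in range(n)]
secondary = [random.choice([0, 1, 2]) for _ in range(n)]
# Simulated expression: strong primary effect, weaker secondary effect, noise.
expression = [1.0 * p + 0.5 * s + random.gauss(0, 1)
              for p, s in zip(primary, secondary)]

# Regress out the primary eQTL, then test the secondary variant on the residuals.
residual_expression = residualize(expression, primary)
r_primary_after = correlation(residual_expression, primary)
r_secondary_after = correlation(residual_expression, secondary)
print(f"primary vs residuals:   {r_primary_after:.3f}")
print(f"secondary vs residuals: {r_secondary_after:.3f}")
```

After residualization the primary variant no longer correlates with expression, while the secondary variant still does. That remaining association is what marks it as an independent signal.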

Page 48

Another challenge: Identifying causal variants in eQTL regions

• CAVIAR (Hormozdiari et al. Genetics 2014) results will be on GTEx Portal soon!

Fine-mapping methods propose credible sets of causal variants for an eQTL
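To make "credible set" concrete, the sketch below builds one from per-variant posterior probabilities. It assumes a single causal variant and invented posteriors for hypothetical variants `var_A` to `var_E`. CAVIAR itself models multiple causal variants, so this shows the general idea rather than that method.

```python
def credible_set(posteriors, coverage=0.95):
    """Smallest set of top-ranked variants whose posteriors sum to >= coverage."""
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen, total

# Hypothetical posterior probabilities of being the causal variant at one eQTL.
posteriors = {
    "var_A": 0.60,
    "var_B": 0.25,
    "var_C": 0.11,
    "var_D": 0.03,
    "var_E": 0.01,
}

variants, mass = credible_set(posteriors)
print(variants, round(mass, 2))  # ['var_A', 'var_B', 'var_C'] 0.96
```

The more the posterior concentrates on a few variants, the smaller the credible set and the easier the experimental follow-up.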

Page 49

eQTL limited in capturing rare variant effects

Gene expression outliers can point to rare variants with large effects

[Figure: examples of an overexpression outlier and an underexpression outlier.]
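One simple way to flag over- and underexpression outliers for a gene is a z-score cutoff across individuals. The sketch below is a generic illustration: the expression values, sample IDs, and the threshold of 2 are all assumptions, and published outlier analyses use more careful normalization and covariate correction.

```python
import statistics

def expression_outliers(values, threshold=2.0):
    """Label samples whose z-score is at or beyond +/- threshold."""
    mean = statistics.fmean(values.values())
    sd = statistics.stdev(values.values())
    calls = {}
    for sample, value in values.items():
        z = (value - mean) / sd
        if z >= threshold:
            calls[sample] = "over"
        elif z <= -threshold:
            calls[sample] = "under"
    return calls

# Hypothetical normalized expression of one gene in twelve individuals.
expression = {
    "s01": 4.8, "s02": 5.1, "s03": 5.0, "s04": 4.9,
    "s05": 5.2, "s06": 5.0, "s07": 4.9, "s08": 5.1,
    "s09": 5.0, "s10": 5.0, "s11": 9.0, "s12": 1.0,
}

outliers = expression_outliers(expression)
print(outliers)  # {'s11': 'over', 's12': 'under'}
```

Individuals flagged this way are candidates for carrying a rare, large-effect variant near the gene. The outlier call itself does not identify that variant.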

Page 50

Interpreting personal variants using genetic and functional genomics data

Li, Kim, Tsang, Davis, Nature, 2017

Page 51

Using GTEx to help solve rare disease cases.

Patient muscle (n=63)

GTEx control muscle (n=184)

[Figure: variant calling from RNA and WES; aberrant splicing and allele imbalance.]

Cummings et al., Science Transl Med, 2017

See talk by Beryl Cummings, Friday 10:45 AM

Page 52

Interpreting genetic variants in disease

Genetic variation influences gene expression of ~90% of all known protein-coding genes

Abundance of eQTL data requires care when conducting GWAS follow-up
- Multiple testing can lead to false discoveries
- Co-localization methods required
- 40% of all variants do not co-localize with their nearest gene

Gene expression outliers can identify large-effect rare variants
- Can be used to interpret individual risk factors and identify rare disease genes and variants
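The multiple-testing point can be made concrete with the Benjamini-Hochberg procedure, used here as a generic example of false discovery rate control and not necessarily the correction applied in GTEx analyses. The p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Indices of tests rejected by the Benjamini-Hochberg procedure at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold.
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical p-values from looking up one GWAS variant in ten tissues.
pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = [i for i, p in enumerate(pvalues) if p < 0.05]
fdr_hits = benjamini_hochberg(pvalues)
print(len(naive_hits), "naive hits;", len(fdr_hits), "after FDR control")
# 5 naive hits; 2 after FDR control
```

With many tissues and genes tested per GWAS variant, an uncorrected 0.05 cutoff reports associations that do not survive correction.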

Page 53

GTEx pipelines

• Source code is available at https://github.com/broadinstitute/gtex-pipeline

• Includes wrapper scripts, Dockerfiles

• Pipelines are available on FireCloud (http://firecloud.org)

• Namespace: broadinstitute_gtex

[Figure: pipeline modules, Docker image, WDL scripts.]

Page 54

Biobank

• The biobank from the GTEx project is hosted at the Broad Institute.

• Samples can be searched and requested at https://gtexportal.org/home/samplesPage.

• Sample requests for research complementing the primary project are welcome.

Page 55

Acknowledgements

GTEx Consortium

Laboratory, Data Analysis & Coordinating Center (LDACC) - Analysis Working Group François Aguet1, Kristin G. Ardlie1, Beryl B. Cummings1,2, Ellen T. Gelfand1, Gad Getz1,3, Kane Hadley1, Robert E. Handsaker1,4, Katherine H. Huang1, Seva Kashin1,4, Konrad J. Karczewski1,2, Monkol Lek1,2, Xiao Li1, Daniel G. MacArthur1,2, Jared L. Nedzel1, Duyen T. Nguyen1, Michael S. Noble1, Ayellet V. Segrè1, Casandra A. Trowbridge1, Taru Tukiainen1,2

Statistical Methods groups—Analysis Working Group Nathan S. Abell5,6, Brunilda Balliu6, Ruth Barshir7, Omer Basha7, Alexis Battle8, Gireesh K. Bogu9,10, Andrew Brown11,12,13, Christopher D. Brown14, Stephane E. Castel15,16, Lin S. Chen17, Colby Chiang18, Donald F. Conrad19,20, Nancy J. Cox21, Farhan N. Damani8, Joe R. Davis5,6, Olivier Delaneau11,12,13, Emmanouil T. Dermitzakis11,12,13, Barbara E. Engelhardt22, Eleazar Eskin23,24, Pedro G. Ferreira25,26, Laure Frésard5,6, Eric R. Gamazon21,27,28, Diego Garrido-Martín9,10, Ariel D.H. Gewirtz29, Genna Gliner30, Michael J. Gloudemans5,6,31, Roderic Guigo9,10,32, Ira M. Hall18,19,33, Buhm Han34, Yuan He35, Farhad Hormozdiari23, Cedric Howald11,12,13, Hae Kyung Im36, Brian Jo29, Eun Yong Kang23, Yungil Kim8, Sarah Kim-Hellmuth15,16, Tuuli Lappalainen15,16, Gen Li37, Xin Li6, Boxiang Liu5,6,38, Serghei Mangul23, Mark I. McCarthy39,40,41, Ian C. McDowell42, Pejman Mohammadi15,16, Jean Monlong9,10,43, Stephen B. Montgomery5,6, Manuel Muñoz-Aguirre9,10,44, Anne W. Ndungu39, Dan L. Nicolae36,45,46, Andrew B. Nobel47,48, Meritxell Oliva36,49, Halit Ongen11,12,13, John J. Palowitch47, Nikolaos Panousis11,12,13, Panagiotis Papasaikas9,10, YoSon Park14, Princy Parsana8, Anthony J. Payne39, Christine B. Peterson50, Jie Quan51, Ferran Reverter9,10,52, Chiara Sabatti53,54, Ashis Saha8, Michael Sammeth55, Alexandra J. Scott18, Andrey A. Shabalin56, Reza Sodaei9,10, Matthew Stephens45,46, Barbara E. Stranger36,49,57, Benjamin J. Strober35, Jae Hoon Sul58, Emily K. Tsang6,31, Sarah Urbut46, Martijn van de Bunt39,40, Gao Wang46, Xiaoquan Wen59, Fred A. Wright60, Hualin S. Xi51, Esti Yeger-Lotem7,61, Zachary Zappala5,6, Judith B. Zaugg62, Yi-Hui Zhou60

Enhancing GTEx (eGTEx) groups Joshua M. Akey29,63, Daniel Bates64, Joanne Chan5, Lin S. Chen17, Melina Claussnitzer1,65,66, Kathryn Demanelis17, Morgan Diegel64, Jennifer A. Doherty67, Andrew P. Feinberg35,68,69,70, Marian S. Fernando36,49, Jessica Halow64, Kasper D. Hansen68,71,72, Eric Haugen64, Peter F. Hickey72, Lei Hou1,73, Farzana Jasmine17, Ruiqi Jian5, Lihua Jiang5, Audra Johnson64, Rajinder Kaul64, Manolis Kellis1,73, Muhammad G. Kibriya17, Kristen Lee64, Jin Billy Li5, Qin Li5, Xiao Li5, Jessica Lin5,74, Shin Lin5,75, Sandra Linder5,6, Caroline Linke36,49, Yaping Liu1,73, Matthew T. Maurano76, Benoit Molinie1, Stephen B. Montgomery5,6, Jemma Nelson64, Fidencio J. Neri64, Meritxell Oliva36,49, Yongjin Park1,73, Brandon L. Pierce17, Nicola J. Rinaldi1,73, Lindsay F. Rizzardi68, Richard Sandstrom64, Andrew Skol36,49,57, Kevin S. Smith5,6, Michael P. Snyder5, John Stamatoyannopoulos64,74,77, Barbara E. Stranger36,49,57, Hua Tang5, Emily K. Tsang6,31, Li Wang1, Meng Wang5, Nicholas Van Wittenberghe1, Fan Wu36,49, Rui Zhang5

NIH Common Fund Concepcion R. Nierras78

NIH/NCI Philip A. Branton79, Latarsha J. Carithers79,80, Ping Guan79, Helen M. Moore79, Abhi Rao79, Jimmie B. Vaught79

NIH/NHGRI Sarah E. Gould81, Nicole C. Lockart81, Casey Martin81, Jeffery P. Struewing81, Simona Volpi81

NIH/NIMH Anjene M. Addington82, Susan E. Koester82

NIH/NIDA A. Roger Little83

Biospecimen Collection Source Site—NDRI Lori E. Brigham84, Richard Hasz85, Marcus Hunter86, Christopher Johns87, Mark Johnson88, Gene Kopen89, William F. Leinweber89, John T. Lonsdale89, Alisa McDonald89, Bernadette Mestichelli89, Kevin Myer86, Brian Roe86, Michael Salvatore89, Saboor Shad89, Jeffrey A. Thomas89, Gary Walters88, Michael Washington88, Joseph Wheeler87

Biospecimen Collection Source Site—RPCI Jason Bridge90, Barbara A. Foster91, Bryan M. Gillard91, Ellen Karasik91, Rachna Kumar91, Mark Miklos90, Michael T. Moser91

Biospecimen Core Resource—VARI Scott D. Jewell92, Robert G. Montroy92, Daniel C. Rohrer92, Dana R. Valley92

Brain Bank Repository—University of Miami Brain Endowment Bank David A. Davis93, Deborah C. Mash93

Leidos Biomedical—Project Management Anita H. Undale94, Anna M. Smith95, David E. Tabor95, Nancy V. Roche95, Jeffrey A. McLean95, Negin Vatanian95, Karna L. Robinson95, Leslie Sobin95, Mary E. Barcus96, Kimberly M. Valentino95, Liqun Qi95, Steven Hunter95, Pushpa Hariharan95, Shilpi Singh95, Ki Sung Um95, Takunda Matose95, Maria M. Tomaszewski95

ELSI Study Laura K. Barker97, Maghboeba Mosavel98, Laura A. Siminoff97, Heather M. Traino97

Genome Browser Data Integration & Visualization—EBI Paul Flicek99, Thomas Juettemann99, Magali Ruffier99, Dan Sheppard99, Kieron Taylor99, Stephen J. Trevanion99, Daniel R. Zerbino99

Genome Browser Data Integration & Visualization—UCSC Genomics Institute, University of California Santa Cruz Brian Craft100, Mary Goldman100, Maximilian Haeussler100, W. James Kent100, Christopher M. Lee100, Benedict Paten100, Kate R. Rosenbloom100, John Vivian100, Jingchun Zhu100

1The Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02142, USA. 2Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 3Massachusetts General Hospital Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 4Department of Genetics, Harvard Medical School, Boston, Massachusetts 02114, USA. 5Department of Genetics, Stanford University, Stanford, California 94305, USA. 6Department of Pathology, Stanford University, Stanford, California 94305, USA. 7Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel. 8Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA. 9Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain. 10Universitat Pompeu Fabra (UPF), 08002 Barcelona, Spain. 11Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland. 12Institute for Genetics and Genomics in Geneva (iG3), University of Geneva, 1211 Geneva, Switzerland. 13Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland. 14Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA. 15New York Genome Center, New York, New York 10013, USA. 16Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA. 17Department of Public Health Sciences, The University of Chicago, Chicago, Illinois 60637, USA. 18McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 19Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 20Department of Pathology & Immunology, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 
21Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA. 22Department of Computer Science, Center for Statistics and Machine Learning, Princeton University, Princeton, New Jersey 08540, USA. 23Department of Computer Science, University of California, Los Angeles, California 90095, USA. 24Department of Human Genetics, University of California, Los Angeles, California 90095, USA. 25Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal. 26Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, 4200-625 Porto, Portugal. 27Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands. 28Department of Psychiatry, Academic Medical Center, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands. 29Lewis Sigler Institute, Princeton University, Princeton, New Jersey 08540, USA. 30Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08540, USA. 31Biomedical Informatics Program, Stanford University, Stanford, California 94305, USA. 32Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), 08003 Barcelona, Spain. 33Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 34Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul 138-736, South Korea. 35Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA. 36Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois 60637, USA. 37Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York 10032, USA. 38Department of Biology, Stanford University, Stanford, California 94305, USA. 
39Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK. 40Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Churchill Hospital, Oxford OX3 7LE, UK. 41Oxford NIHR Biomedical Research Centre, Churchill Hospital, Oxford OX3 7LJ, UK. 42Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA. 43Human Genetics Department, McGill University, Montreal, Quebec H3A 0G1, Canada. 44Departament d’Estadística i Investigació Operativa, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain. 45Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA. 46Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA. 47Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, North Carolina 27599, USA. 48Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, USA. 49Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois 60637, USA. 50Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA. 51Computational Sciences, Pfizer Inc, Cambridge, Massachusetts 02139, USA. 52Universitat de Barcelona, 08028 Barcelona, Spain. 53Department of Biomedical Data Science, Stanford University, Stanford, California 94305, USA. 54Department of Statistics, Stanford University, Stanford, California 94305, USA. 55Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), 21941902 Rio de Janeiro, Brazil. 56Department of Psychiatry, University of Utah, Salt Lake City, Utah 84108, USA. 57Center for Data Intensive Science, The University of Chicago, Chicago, Illinois 60637, USA. 58Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, California 90095, USA. 
59Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA. 60Bioinformatics Research Center and Departments of Statistics and Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA. 61National Institute for Biotechnology in the Negev, Beer-Sheva 84105, Israel. 62European Molecular Biology Laboratory, 69117 Heidelberg, Germany. 63Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey 08540, USA. 64Altius Institute for Biomedical Sciences, Seattle, Washington 98121, USA. 65Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts 02215, USA. 66University of Hohenheim, 70599 Stuttgart, Germany. 67Huntsman Cancer Institute, Department of Population Health Sciences, University of Utah, Salt Lake City, Utah 84112, USA. 68Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA. 69Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA. 70Department of Mental Health, Johns Hopkins University School of Public Health, Baltimore, Maryland 21205, USA. 71McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland 21205, USA. 72Department of Biostatistics, Johns Hopkins

© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


1The Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02142, USA. 2Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 3Massachusetts General Hospital Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts 02114, USA. 4Department of Genetics, Harvard Medical School, Boston, Massachusetts 02114, USA. 5Department of Genetics, Stanford University, Stanford, California 94305, USA. 6Department of Pathology, Stanford University, Stanford, California 94305, USA. 7Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel. 8Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA. 9Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain. 10Universitat Pompeu Fabra (UPF), 08002 Barcelona, Spain. 11Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland. 12Institute for Genetics and Genomics in Geneva (iG3), University of Geneva, 1211 Geneva, Switzerland. 13Swiss Institute of Bioinformatics, 1211 Geneva, Switzerland. 14Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA. 15New York Genome Center, New York, New York 10013, USA. 16Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA. 17Department of Public Health Sciences, The University of Chicago, Chicago, Illinois 60637, USA. 18McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 19Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 20Department of Pathology & Immunology, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 
21Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA. 22Department of Computer Science, Center for Statistics and Machine Learning, Princeton University, Princeton, New Jersey 08540, USA. 23Department of Computer Science, University of California, Los Angeles, California 90095, USA. 24Department of Human Genetics, University of California, Los Angeles, California 90095, USA. 25Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal. 26Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, 4200-625 Porto, Portugal. 27Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands. 28Department of Psychiatry, Academic Medical Center, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands. 29Lewis Sigler Institute, Princeton University, Princeton, New Jersey 08540, USA. 30Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08540, USA. 31Biomedical Informatics Program, Stanford University, Stanford, California 94305, USA. 32Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), 08003 Barcelona, Spain. 33Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63108, USA. 34Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul 138-736, South Korea. 35Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA. 36Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois 60637, USA. 37Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York 10032, USA. 38Department of Biology, Stanford University, Stanford, California 94305, USA. 
39Wellcome Trust Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK. 40Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Churchill Hospital, Oxford OX3 7LE, UK. 41Oxford NIHR Biomedical Research Centre, Churchill Hospital, Oxford OX3 7LJ, UK. 42Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA. 43Human Genetics Department, McGill University, Montreal, Quebec H3A 0G1, Canada. 44Departament d’Estadística i Investigació Operativa, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain. 45Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA. 46Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA. 47Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, North Carolina 27599, USA. 48Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, USA. 49Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois 60637, USA. 50Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA. 51Computational Sciences, Pfizer Inc, Cambridge, Massachusetts 02139, USA. 52Universitat de Barcelona, 08028 Barcelona, Spain. 53Department of Biomedical Data Science, Stanford University, Stanford, California 94305, USA. 54Department of Statistics, Stanford University, Stanford, California 94305, USA. 55Institute of Biophysics Carlos Chagas Filho (IBCCF), Federal University of Rio de Janeiro (UFRJ), 21941902 Rio de Janeiro, Brazil. 56Department of Psychiatry, University of Utah, Salt Lake City, Utah 84108, USA. 57Center for Data Intensive Science, The University of Chicago, Chicago, Illinois 60637, USA. 58Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, California 90095, USA. 
59Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA. 60Bioinformatics Research Center and Departments of Statistics and Biological Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA. 61National Institute for Biotechnology in the Negev, Beer-Sheva 84105, Israel. 62European Molecular Biology Laboratory, 69117 Heidelberg, Germany. 63Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey 08540, USA. 64Altius Institute for Biomedical Sciences, Seattle, Washington 98121, USA. 65Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts 02215, USA. 66University of Hohenheim, 70599 Stuttgart, Germany. 67Huntsman Cancer Institute, Department of Population Health Sciences, University of Utah, Salt Lake City, Utah 84112, USA. 68Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA. 69Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA. 70Department of Mental Health, Johns Hopkins University School of Public Health, Baltimore, Maryland 21205, USA. 71McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland 21205, USA. 72Department of Biostatistics, Johns Hopkins

© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

Donors and their families