Assessment of genomewide association studies Tuan V. Nguyen Garvan Institute of Medical Research Sydney, Australia
WHICH GENES ?
Gene variants ?
False positive problem
Candidate gene studies: reproducibility problem
600 positive associations between common gene variants and disease reported 1986-2000
J N Hirschhorn et al. Genetics in Medicine 2002
166 were studied 3+ times
6 have been consistently replicated
Introduction to genomewide association studies
Genomewide association studies (GWA)
• Revolution in gene search
• Hypothesis-free approach
• Scan 100,000-500,000 gene variants (SNPs)
• Case-control design (>1000 individuals)
Massive number of tests of hypothesis
Recent GWA studies in osteoporosis
• Styrkarsdottir U, et al (2008) Multiple genetic loci for bone mineral density and fractures. N Engl J Med 358:2355-2365.
• van Meurs JB, et al (2008) Large-scale analysis of association between LRP5 and LRP6 variants and osteoporosis. JAMA 299:1277-1290.
• Richards JB, et al (2008) Bone mineral density, osteoporosis, and osteoporotic fractures: a genome-wide association study. Lancet 371:1505-1512.
Some gene variants from GWA

Gene variant (SNP)   Gene or location          Trait and P-value
rs3736228            11q13 (LRP5)              BMD (p = 2.6 × 10^-9); Fracture (p = 0.02)
rs3736228            11q13 (LRP5)              BMD (p = 6.3 × 10^-12); Fracture (p = 0.002)
rs4355801            LRP5
rs4988321            11q13 (LRP5)              BMD (p = 3.3 × 10^-8); Fracture (p = 0.002)
rs2302685            12p12 (LRP6)              BMD (p = 0.97); Fracture (p = 0.95)
rs4355801            8q24 (TNFRSF11B)          BMD (p = 7.6 × 10^-10)
rs7524102            1p36 (ZBTB40)             BMD (p = 9.2 × 10^-19); Fracture (p = 8.4 × 10^-4)
rs6696981            1p36 (close to ZBTB40)    BMD (p = 1.7 × 10^-7); Fracture (p = 2.4 × 10^-4)
rs3130340            6p21 ()                   BMD (p = 1.2 × 10^-7); Fracture (p = 0.008)
rs9479055            6q25 (1)                  BMD (p = 6.2 × 10^-7)
rs4870044            6q25 (1)                  BMD (p = 1.6 × 10^-11)
rs1038304            6q25 (1)                  BMD (p = 4.0 × 10^-11)
rs6929137            6q25 (1)                  BMD (p = 2.5 × 10^-10)
rs1999805            6q25 (1)                  BMD (p = 2.2 × 10^-8)
rs6993813            8q24 (OPG)                BMD (p = 1.8 × 10^-14); Fracture (p = 0.04)
rs6469804            8q24 (OPG)                BMD (p = 7.4 × 10^-15); Fracture (p = 0.052)
rs9594738            13q14 (RANKL)             BMD (p = 2.0 × 10^-21)
rs9594759            13q14 (RANKL)             BMD (p = 1.1 × 10^-16)
rs11898505           2p16 (SPTBN1)             Fracture (p = 1.8 × 10^-4)
rs3018362            18q21 (RANK)              Fracture (p = 0.005)
rs2306033            11p11 (LRP4)              Fracture (p = 0.007)
rs7935346            11p11 (LRP4)              Fracture (p = 0.02)
What is the credibility of a GWA finding ?
An observed association with p < 0.05 does not necessarily mean that the association exists.
In 100,000 tests at the 0.05 level, about 5000 significant findings are expected by chance alone, so most positive findings could be false positives
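The arithmetic behind this count can be sketched in a few lines of Python. The function names are illustrative, and the calculation assumes that none of the tested variants is truly associated, so it gives the number of chance findings in the worst case. The Bonferroni threshold is shown as one common correction, not necessarily the one used in the studies cited in this talk.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of significant tests when no variant is truly associated."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


n_tests = 100_000  # number of SNPs scanned, as in the slide
print(f"Expected chance findings at 0.05: {expected_false_positives(n_tests, 0.05):.0f}")
print(f"Bonferroni per-test threshold:    {bonferroni_threshold(n_tests):.1e}")
```

With 500,000 SNPs instead of 100,000, the same functions give 25,000 expected chance findings and a per-test threshold of 1 × 10^-7.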
Diagnostic test and association test
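Diagnostic test: disease status (YES / NO) against test result (+ve / -ve)
Association test: association status (True / False) against test result (+ve / -ve)
Sensitivity, P(+ve | D), corresponds to power
False +ve rate corresponds to the P-value, P(+ve | False)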
What do we want to know?
• Probability of association given observed data (eg posterior probability of association)
or
• Probability of observing data if there is no association (P-value)
Posterior probability of association
is a function of
• Prior probability of association
• Power = Pr(significance | association), which depends on sample size and effect size
• P-value = Pr(significance | no association)
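Written out as Bayes' rule, this dependence can be sketched as follows. The symbols are assumptions introduced here for illustration, not the slide's own notation: \pi is the prior probability of association, 1 - \beta is the power, and \alpha is the significance threshold (what the slides call the P-value).

```latex
\Pr(\text{association} \mid \text{significant})
  = \frac{\pi\,(1-\beta)}{\pi\,(1-\beta) + (1-\pi)\,\alpha}
```

The numerator is the probability of a true positive, and the second term of the denominator is the probability of a false positive, so the posterior falls as the prior shrinks or as the threshold is relaxed.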
What is the prior probability of association for a gene variant ?
Gene search = finding small needles in a VERY large haystack
• Human genome ~3 billion base pairs long
• 99.9% identical between any two individuals
BUT: most variants are vanishingly rare
• ~90% of the differences between any two individuals are due to common variants
Hypotheses
• Common disease / common variants (CD/CV)
(Reich & Lander 2001, Pritchard et al 2005)
• ~90% of the differences between any two individuals are due to common variants
Prior probability of association
• Common variants in the human population: 10 million (Kruglyak and Nickerson, Nat Genet 2001)
• No. of genetic variants associated with a common disease: ~100 or less (Yang et al, Int J Epidemiol 2005)
Prior probability of association = 100 / 10,000,000 = 0.00001
A Bayesian interpretation of association
Power = 95%; P-value = 0.00001
10,000,000 common variants
  True association (100)
    Significant (95)
    Non-significant (5)
  No association (9,999,900)
    Significant (100)
    Non-significant (9,999,800)
P(True association given a significant result) = 95 / (95 + 100) ≈ 48.7%
A Bayesian interpretation of association
Power = 95%; P-value = 0.00000001
10,000,000 common variants
  True association (100)
    Significant (95)
    Non-significant (5)
  No association (9,999,900)
    Significant (1, rounded up from an expected 0.1)
    Non-significant (9,999,899)
P(True association given a significant result) = 95 / (95 + 1) ≈ 99%
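The two trees can be reproduced with a short calculation. This is a minimal sketch: the function name `posterior_from_counts` is illustrative, and it assumes that the "P-value" on the slides acts as the significance threshold applied to every unassociated variant.

```python
def posterior_from_counts(n_variants: int, n_true: int, power: float, alpha: float):
    """Return (true positives, false positives, posterior probability of association)."""
    true_pos = n_true * power                    # associated and significant
    false_pos = (n_variants - n_true) * alpha    # unassociated but significant
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


# The two thresholds used in the trees above
for alpha in (1e-5, 1e-8):
    tp, fp, post = posterior_from_counts(10_000_000, 100, 0.95, alpha)
    print(f"threshold={alpha:g}: true +ve={tp:.1f}, false +ve={fp:.1f}, posterior={post:.3f}")
```

At the threshold of 0.00000001 the expected number of false positives is about 0.1, which gives a posterior of roughly 99.9%. Rounding that count up to 1, as in the tree, gives the more conservative 95 / 96 ≈ 99%.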
P-value and “true” association
Associations with P-values in the range 0.05 to 0.001 will almost always be false positives, even in large-scale studies
P-value for a reliable association
P < 5 × 10^-5
or P < 5 × 10^-8
For 1000 cases and 1000 controls, associations with P < 10^-8 are more likely to be true than false
Some gene variants from GWA

Gene variant (SNP)   Gene or location          Trait and P-value
rs3736228            11q13 (LRP5)              BMD (p = 2.6 × 10^-9); Fracture (p = 0.02)
rs3736228            11q13 (LRP5)              BMD (p = 6.3 × 10^-12); Fracture (p = 0.002)
rs4355801            LRP5
rs4988321            11q13 (LRP5)              BMD (p = 3.3 × 10^-8); Fracture (p = 0.002)
rs4355801            8q24 (TNFRSF11B)          BMD (p = 7.6 × 10^-10)
rs7524102            1p36 (ZBTB40)             BMD (p = 9.2 × 10^-19); Fracture (p = 8.4 × 10^-4)
rs9479055            6q25 (1)                  BMD (p = 6.2 × 10^-7)
rs4870044            6q25 (1)                  BMD (p = 1.6 × 10^-11)
rs1038304            6q25 (1)                  BMD (p = 4.0 × 10^-11)
rs6929137            6q25 (1)                  BMD (p = 2.5 × 10^-10)
rs1999805            6q25 (1)                  BMD (p = 2.2 × 10^-8)
rs6993813            8q24 (OPG)                BMD (p = 1.8 × 10^-14); Fracture (p = 0.04)
rs6469804            8q24 (OPG)                BMD (p = 7.4 × 10^-15); Fracture (p = 0.052)
rs9594738            13q14 (RANKL)             BMD (p = 2.0 × 10^-21)
rs9594759            13q14 (RANKL)             BMD (p = 1.1 × 10^-16)
rs11898505           2p16 (SPTBN1)             Fracture (p = 1.8 × 10^-4)
rs3018362            18q21 (RANK)              Fracture (p = 0.005)
rs2306033            11p11 (LRP4)              Fracture (p = 0.007)
rs7935346            11p11 (LRP4)              Fracture (p = 0.02)
Number of individuals needed to screen in population and family
Hypothetical gene               Fracture risk in
                                Population   Family
Relative risk                   5            10
Cumulative risk                 40%          80%
Cumulative risk after Rx        20%          40%
Number needed to treat          5            2.5
Frequency of risk "genotype"    0.2%         50%
Number needed to screen         2500         5
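The last rows of the table follow from the rows above them. The sketch below assumes the usual definitions, which the slide does not state explicitly: number needed to treat = 1 / (absolute risk reduction), and number needed to screen = number needed to treat / frequency of the risk genotype. The function names are illustrative.

```python
def number_needed_to_treat(risk_untreated: float, risk_treated: float) -> float:
    """Carriers who must be treated to prevent one fracture (1 / absolute risk reduction)."""
    return 1.0 / (risk_untreated - risk_treated)


def number_needed_to_screen(nnt: float, genotype_frequency: float) -> float:
    """Individuals who must be genotyped to find enough carriers to prevent one fracture."""
    return nnt / genotype_frequency


# (cumulative risk, cumulative risk after Rx, frequency of risk genotype) from the table
settings = {"Population": (0.40, 0.20, 0.002), "Family": (0.80, 0.40, 0.50)}

for name, (risk, risk_rx, freq) in settings.items():
    nnt = number_needed_to_treat(risk, risk_rx)
    nns = number_needed_to_screen(nnt, freq)
    print(f"{name}: NNT = {nnt:.1f}, NNS = {nns:.0f}")
```

Under these assumptions the calculation returns the table's values: 5 and 2500 in the population, 2.5 and 5 within a family.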
How many genes are required for a “good” fracture prognosis ?
Odds ratio   Genotype frequency   Number of genes needed for AUC of
                                  0.70    0.80    0.90    0.95
1.1          5%                   >400    >400    >400    >400
             10%                  330     >400    >400    >400
             30%                  150     >400    >400    >400
1.5          5%                   33      100     280     >400
             10%                  19      50      150     330
             30%                  9       23      70      160
Assessment of GWA finding
• Genetic components of BMD and fracture
• Finding genes of osteoporosis: a challenge
• Genes can help improve the prognosis of fracture