BSHI 2002, Glasgow, Scotland. STATISTICAL ANALYSIS OF HLA AND DISEASE ASSOCIATIONS. M. Tevfik DORAK, Department of Epidemiology, University of Alabama at Birmingham, U.S.A. (2002). http://www.dorak.info
This workshop will cover categorical data analysis for case-control design and some concepts in population genetics.
Fisher's Exact Test
Cell (1,1) Frequency (F)     45
Left-sided Pr <= F           0.0033
Right-sided Pr >= F          0.9983
Table Probability (P)        0.0016
Two-sided Pr <= P            0.0066
The SAS System: FREQ Procedure Output – II
Estimates of the Common Relative Risk (Row1/Row2)

Type of Study               Method            Value     95% Confidence Limits
Case-Control (Odds Ratio)   Mantel-Haenszel   0.5359    0.3461    0.8299
                            Logit             0.5359    0.3461    0.8299
Cohort (Col1 Risk)          Mantel-Haenszel   0.6595    0.4892    0.8891
                            Logit             0.6595    0.4892    0.8891
Cohort (Col2 Risk)          Mantel-Haenszel   1.2306    1.0666    1.4198
                            Logit             1.2306    1.0666    1.4198
Parent-Case Trios in TDT/HRR
[Figure: pedigree diagrams of parent-case trios labelled with genotypes. The transmitted allele is treated as the "case" and the non-transmitted allele as the "control".]
- AN EXAMPLE OF TDT -
TRANSMISSION DISEQUILIBRIUM OF HLA-B62 TO THE PATIENTS WITH CHILDHOOD AML
(Dorak et al, BSHI 2002)
Out of 13 parents heterozygous for B62, 12 transmitted B62 to the affected child and 1 did not.
McNemar’s test results: P = 0.0055 (with continuity correction); odds ratio = 12.0, 95% CI = 1.8 to 513
                        Nontransmitted Allele
Transmitted Allele      B62       Other
B62                     x         12
Other                   1         y
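The McNemar calculation on the discordant cells above (12 and 1) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the software used for the original analysis; the function name `mcnemar_tdt` is hypothetical. It reproduces the continuity-corrected P value and the odds ratio, but not the 95% CI quoted on the slide, which appears to come from an exact method.

```python
import math

def mcnemar_tdt(transmitted: int, not_transmitted: int) -> tuple[float, float, float]:
    """McNemar's test with continuity correction on TDT discordant counts.

    Returns (chi-squared statistic, two-sided P value, odds ratio).
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    # Continuity-corrected McNemar statistic, 1 degree of freedom
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of chi-squared with 1 df equals erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    # Ratio of discordant counts; infinite if nothing was non-transmitted
    odds_ratio = b / c if c else math.inf
    return chi2, p, odds_ratio

# Counts from the slide: 12 parents transmitted B62, 1 did not
chi2, p, odds_ratio = mcnemar_tdt(12, 1)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}, OR = {odds_ratio:.1f}")
```

Only the discordant cells enter the test; the concordant cells (x and y in the table) carry no information about transmission distortion.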
Multiple comparisons
Not needed if the study is not hypothesis driven (i.e., a fishing experiment)
Not needed if the study is hypothesis driven ('Possible relevance of the HLA system' is not a valid hypothesis in this context. Those studies belong to the fishing experiments group)
Therefore, it is not clear when it is needed in HLA association studies. Most frequently, it is an excuse for a busy reviewer to avoid a comprehensive review
The best solution is to avoid facing this problem, ideally by providing replication and/or functional data to support the statistical association before it is dismissed as a spurious result of multiple comparisons
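For readers unfamiliar with the arithmetic behind the multiple-comparison concern, here is a small sketch. It assumes independent tests, which is rarely true for HLA alleles in linkage disequilibrium, and the numbers (20 tests, three P values) are hypothetical, chosen only for illustration.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni(p_values: list[float]) -> list[float]:
    """Bonferroni-corrected P values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical example: 20 alleles each tested at alpha = 0.05
fwer = familywise_error_rate(0.05, 20)
# Hypothetical uncorrected P values from three comparisons
corrected = bonferroni([0.001, 0.02, 0.3])
print(f"FWER = {fwer:.2f}")
print("corrected:", [round(p, 3) for p in corrected])
```

The correction controls false positives at the cost of power, which is why replication or functional support is the preferred way out.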
Common Mistakes in Statistical Evaluation of Association Study Results - I
Confusion between corrections (Yates/Williams for continuity VS Bonferroni)
Confusion between RR and OR (they are not the same)
Confusion between expected and observed values in cells of a contingency table
Small sample size issue: do not confuse a negative result with lack of power (‘No significant difference between the two groups and they were pooled’ VS ‘the difference did not reach significance due to small sample size’ are different interpretations of the same phenomenon, i.e., lack of power)
Using Chi-squared test for small sample size (why not use Fisher all the time?)
Using Chi-squared test for HWE (use exact test or G-test)
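Two of the confusions listed above (RR versus OR, and expected versus observed cell values) can be illustrated with a short sketch. The 2x2 counts are hypothetical, and the layout (rows = marker present/absent, columns = diseased/not diseased) is an assumption made for the example. The check for expected counts below 5 is the usual rule of thumb for when the chi-squared approximation becomes doubtful.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Risk in row 1 divided by risk in row 2 (column 1 = diseased)."""
    return (a / (a + b)) / (c / (c + d))

def expected_counts(table: list[list[int]]) -> list[list[float]]:
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical counts, for illustration only
table = [[30, 70], [10, 90]]
(a, b), (c, d) = table
or_value = odds_ratio(a, b, c, d)
rr_value = relative_risk(a, b, c, d)
expected = expected_counts(table)
# Rule of thumb: prefer an exact test if any expected count is below 5
small_expected = any(e < 5 for row in expected for e in row)
print(f"OR = {or_value:.2f}, RR = {rr_value:.2f}")
print("expected:", expected, "small expected counts:", small_expected)
```

With these counts the OR is noticeably larger than the RR, which is the general pattern when the outcome is not rare. Note that the RR is only interpretable when the row totals reflect a cohort-type sampling scheme; in a case-control design the OR is the estimable quantity.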
Common Mistakes in Statistical Evaluation of Association Study Results - II
One-tailed and two-tailed P values (always use two-tailed)
Trend test for a multicontingency table? (if appropriate, more powerful)
Multiple comparison issue
Failure to give the strength of the association (OR, RR, RH)
Use of the word ‘proof’. Does statistics prove anything? (A ‘P value’ provides a sense of the strength of the evidence for or against the null hypothesis of no association)
Reliance on large sample effect to achieve significance
Showing P values as 0.000 (this means P < 0.001)
Confusion between association and linkage
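The point about reporting P values as 0.000 can be handled with a small formatting helper. This is a sketch only; the function name `format_p` and the default reporting floor of 0.001 are assumptions for the example.

```python
def format_p(p: float, floor: float = 0.001) -> str:
    """Format a P value for reporting, never showing it as 0.000."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P value must lie between 0 and 1")
    if p < floor:
        # Report as an inequality instead of rounding down to zero
        return f"P < {floor}"
    return f"P = {p:.3f}"

print(format_p(0.00004))
print(format_p(0.0312))
```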
Association and Causality?
However strong, an association does not necessarily mean causation. Several criteria have been proposed to assess the role of an associated marker in causation. Some of those are as follows:
1. Biological plausibility
2. Strength of association (this is not measured by the P value)
3. Dose response (are heterozygotes intermediate between the two homozygotes, or is homozygosity showing a stronger association than just having the marker?)
4. Time sequence (this is inherent in the germ-line nature of HLA genes)
5. Consistency (next slide lists reasons for inconsistency in HLA association studies)
6. Specificity of the association to the disease studied
Why Are There Inconsistencies? (I)
1. Mistakes in genotyping (lack of HWE in controls is usually an indication of problems with typing rather than selection, admixture, nonrandom mating or other reasons for departure from HWE)
2. Poor control selection (would your controls be in the case group if they had the disease, and would the cases be in your control group if they were free of the disease?)
3. Design problems including the statistical power issue (negative results due to lack of statistical power should be distinguished from truly negative results observed despite having sufficient power)
4. Publication bias (are there many more studies with negative results that we have never heard about?)
5. Disease misclassification or misclassification bias
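Item 1 above relies on checking HWE in controls. A minimal sketch of a G-test (likelihood-ratio test) for HWE follows, under a simplifying assumption: the locus is collapsed to two alleles (for example, the marker allele versus all others), giving 1 degree of freedom. The genotype counts are hypothetical. For small samples or rare alleles an exact test would be preferable, and it is not implemented here.

```python
import math

def hwe_g_test(n_aa: int, n_ab: int, n_bb: int) -> tuple[list[float], float, float]:
    """G-test for Hardy-Weinberg equilibrium at a biallelic locus.

    Returns (expected genotype counts, G statistic, P value with 1 df).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    # Cells with zero observations contribute nothing to G
    g = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    # Upper tail of chi-squared with 1 df; max() guards against tiny negative rounding
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2))
    return expected, g, p_value

# Hypothetical control genotypes with a marked heterozygote deficit
expected, g, p_value = hwe_g_test(40, 20, 40)
print(f"expected = {expected}, G = {g:.2f}, P = {p_value:.2e}")
```

A heterozygote deficit of this kind in controls would, following the slide, first raise suspicion about the typing method rather than about selection or admixture.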
Why Are There Inconsistencies? (II)
6. Excessive type I errors (are the positive results due to using P < 0.05 as the threshold for statistical significance?)
7. Posthoc and subgroup analysis (are positive results due to fishing (data dredging)?)
8. Unjustified multiple comparisons and subsequent type II error
9. Failure to consider the mode of inheritance in a genetic disease
10. Failure to account for the LD structure of the gene (only haplotype-tagging markers will show the association, other markers within the same gene may fail to show an association and generate background noise)
11. Likelihood that the gene studied accounts for a small proportion of the variability in risk
Further Information
Select ‘Biostatistics’ or ‘Epidemiology’ at
http://www.dorak.info
or write to me at
dorakmt :at: lycos.com [please do not add to your address book as it will change periodically]