Effective Use of Advanced Statistical Methods in Research
OLUWADIYA KS
Professor of Surgery
Ekiti State University, Ado-Ekiti
www.oluwadiya.com
Objectives
1. Data transformation
2. Limitations of P-value
3. Statistics for comparing 2 or more groups with continuous data
4. Regression and Correlation
5. Risk Ratios and Odds Ratios
6. Survival Analysis
7. Sensitivity, Specificity and ROC Curves
8. Finding the right test for specific data
9. Introduction to Endnote
Why Transform Data?
The assumptions of most parametric methods:
Homogeneity of variance (Homoscedasticity)
Normality
Linearity
Data transformation is used to make your data
conform to the assumptions of the statistical
methods
Skewed data
Homoscedasticity and Normality
The data deviate from both homoscedasticity and normality.
Homoscedasticity and Normality
These data are both normal and have equal variance.
Wouldn’t it be nice if we could make the previous data look this way?
When to transform data?
• We should always check the assumptions that the data follow a normal distribution with uniform variance:
i. If the data meet the assumptions, we can analyze the raw data as described.
ii. If they are not met, we have two possible strategies:
1. We can use a method which does not require these assumptions, such as a rank-based (non-parametric) method.
2. We can transform the data mathematically to make them fit the assumptions more closely before analysis.
Methods of data transformation
In healthcare research, there are three commonly
used transformations for quantitative data:
1. Logarithmic transformation,
2. Square root transformation
3. Inverse (reciprocal) transformation.
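The three transformations listed above can be sketched in a few lines of Python. This is only an illustration: the `ages` values and the function names are made up for the example, and the slides themselves do the equivalent step in SPSS.

```python
import math

# Hypothetical right-skewed sample (illustrative values, not from the slides).
ages = [1, 2, 2, 3, 4, 5, 8, 12, 20, 45]

def log_transform(values):
    """Base-10 logarithm; every value must be greater than 0."""
    return [math.log10(v) for v in values]

def sqrt_transform(values):
    """Square root; every value must be 0 or greater."""
    return [math.sqrt(v) for v in values]

def reciprocal_transform(values):
    """Inverse (1/x); every value must be non-zero."""
    return [1.0 / v for v in values]

log_age = log_transform(ages)
sqrt_age = sqrt_transform(ages)
inv_age = reciprocal_transform(ages)
print(log_age[:3], sqrt_age[:3], inv_age[:3])
```

Note that the reciprocal reverses the ordering of the values, which matters when interpreting the direction of any effect afterwards.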
Determining normality of data
Graphical method
Histograms and Normality plots
Boxplots
Normal Q-Q plots and detrended Q-Q plots
Statistical method
Skewness and kurtosis
Kolmogorov-Smirnov statistic
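As a rough illustration of the statistical method, skewness and kurtosis can be computed from the central moments of a sample. The sketch below uses the simple moment-based definitions (an assumption made for brevity); statistics packages often report bias-adjusted versions, so their values may differ slightly. The function name and sample values are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both are 0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# Hypothetical right-skewed sample: the long right tail gives positive skewness.
skew, kurt = skewness_and_kurtosis([1, 2, 2, 3, 4, 5, 8, 12, 20, 45])
print(skew, kurt)
```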
Determining normality of data
Which of the two variables has a normal distribution?
[Figures: distribution plots for Age and PCV]
Normalizing data
We know that Age is not normally distributed
Log transformation: use the SPSS Compute submenu (Transform > Compute):
Normalize Age
The transformed variable (log_age), which we asked SPSS to create, has been created
Normalize Age: Result
Original Data (Age) Transformed data
Normalize PCV: Result
Log_Age Age
The problem with P
P values provide less information than confidence intervals.
Statistical significance tells us whether there is a difference and not how much difference there is.
A P value provides only a probability: how likely a result at least as extreme as the one observed would be if chance alone were at work (the null hypothesis).
A P value could be statistically significant but of limited clinical significance.
o A very large study might find that a difference of 0.5 mmHg in BP between two treatment groups is statistically significant, but is this clinically relevant?
“A large study dooms you to statistical significance” (Anonymous Statistician)
Statistical tests in inferential statistics are designed to answer the question “how likely is it that the difference found in a sample is due to chance (when actually no such difference exists in the population, the null hypothesis)?”.
This is the only purpose they serve: the calculation of a probability value.
They do not indicate clinical significance!
The problem with P
The clinical significance of a research finding – the extent
to which it may influence clinical practice – depends on
many factors.
I. Many of these factors are related to study design.
II. Are the adverse effects of a treatment also studied in addition to benefits?
III. Are the outcome measures clinically relevant (e.g., improvement in symptoms and functioning rather than cognitive distortions)?
IV. Are the effects lasting?
V. Is the cost of treatment worth the effect?
VI. Can the study findings be generalized to patients across social and clinical settings?
Clinical significance
Effect size (Cohen’s d)
Odds ratio
Absolute Risk Reduction (ARR)
Number needed to treat (NNT): this is the reciprocal of ARR
Magnitude and clinical significance
            No response   Response   Total
Placebo     A (60)        B (40)     A+B (100)
Drug        C (40)        D (60)     C+D (100)
Risk of response on antidepressant = 60/100 = 0.6
Risk of response on placebo = 40/100 = 0.4
Absolute risk reduction = 0.6 - 0.4 = 0.2
Number Needed to Treat (NNT) = 1/0.2 = 5
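The arithmetic above can be reproduced directly from the counts in the 2 × 2 table. A minimal sketch, with variable names chosen only for this example:

```python
def proportion(events, total):
    """Risk (probability) of the outcome: events divided by people in the group."""
    return events / total

# Counts from the table: 60 of 100 respond on the drug, 40 of 100 on placebo.
response_drug = proportion(60, 100)
response_placebo = proportion(40, 100)

# Absolute difference between the two groups, and its reciprocal (NNT).
absolute_difference = response_drug - response_placebo
nnt = 1 / absolute_difference

print(round(absolute_difference, 3), round(nnt, 1))
```

The rounding in the last line is only there because floating-point subtraction of 0.6 and 0.4 is not exact.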
Statistical Tests
Parametric tests
Continuous data; normally distributed
Non-parametric tests
Continuous data that are not normally distributed, or categorical or ordinal data
Comparison of two sample means
Student’s t-test
Assumes normally distributed continuous data.
No need to do the math; it is generated by most statistics software
But…
Understand the underlying theory and assumptions
Paired T-Test
The (independent-samples) t-test assumes independence of observations
The paired t-test is for related samples, i.e.
“before” and “after” studies
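To make the difference concrete, the sketch below computes both t statistics for the same numbers. It assumes the pooled (equal-variance) form for the independent test; the function names and the before/after values are invented for illustration, and no P value is computed.

```python
import math
import statistics

def independent_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

def paired_t(before, after):
    """Paired t: a one-sample t statistic on the within-subject differences."""
    diffs = [y - x for x, y in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# Hypothetical measurements on the same four patients before and after treatment.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
print(independent_t(after, before), paired_t(before, after))
```

With these numbers the paired statistic is much larger, because pairing removes the between-patient variability that the independent test has to treat as noise.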
ANalysis Of VAriance (ANOVA)
Used to determine if two or more samples are from the same population, i.e. there is no significant difference between their means
Requires that…
Dependent variable is continuous data
Independent variable is categorical data
Independent variable = Grouping variable = Factor
The variable will consist of a number of categories or levels
There will be 2 or more such categories or levels.
If there are only 2 categories, the result will be equivalent to that of the t-test
ANalysis Of VAriance
An example
You measure the PCV of 50 patients who have sustained fractures.
Does the number of bones fractured affect the PCV of the patients?
Number of bones fractured is the independent variable or factor
0 bones fractured is one level of the factor
1 bone fractured is one level of the factor
2 bones fractured is one level of the factor
3 bones fractured is one level of the factor
PCV is the dependent variable
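A one-way ANOVA for an example like this boils down to comparing the variability between the group means with the variability within the groups. The sketch below computes the F ratio by hand; the PCV values are invented for illustration (they are not the 50 patients in the example), and turning F into a P value would still need an F distribution table or statistics software.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of individual values around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical PCV (%) by number of bones fractured: 0, 1, 2 and 3 bones.
pcv_by_level = [[40, 42, 41], [38, 39, 37], [35, 36, 34], [30, 32, 31]]
print(one_way_anova_f(pcv_by_level))
```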
ANOVA in SPSS
Significant result…now what?
[Cartoon dialogue]
“Yes oo! There is a significant difference among the means. I’m so happy.”
“There are more than 2 means. But is it among all means, or just two?”
“Don’t know.”
“Better do a post hoc test, my friend.”
“What is that?”
“Na wa o.”
“Better do it now.”
“Ok then.”
Post Hoc tests
After-the-fact comparisons of means, used to identify which specific pairs of means are significantly different
Designed to keep the overall error rate under control regardless of how many pairs of means are compared
Post Hoc tests
Also called follow-up tests
Should be computed only after a significant ANOVA
They are like a collection of little t-tests
But they control the overall type 1 error comparatively well
They do not have as much power as the omnibus test (the ANOVA), so you might get a significant ANOVA and no significant follow-up
Purpose is to identify the locus of the effect (which means are different, exactly?)
Post Hoc tests
Post Hoc tests: Homogeneous subset
Correlation
Assesses the linear relationship between two variables
Example: height and weight
Strength of the association is described by a correlation coefficient- r
r = 0 - .2 low, probably meaningless
r = .2 - .4 low, possible importance
r = .4 - .6 moderate correlation
r = .6 - .8 high correlation
r = .8 - 1 very high correlation
Can be positive or negative
Pearson’s or Spearman’s correlation coefficient
Tells nothing about causation
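For illustration, Pearson's r can be computed from the deviations of each variable about its mean, and its size mapped onto the descriptive bands above. The height and weight values and both function names are hypothetical, and the handling of the band boundaries (0.2, 0.4, ...) is an assumption, since the bands above overlap at their edges.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Describe |r| using the bands listed above (lower bound inclusive: an assumption)."""
    size = abs(r)
    if size < 0.2:
        return "low, probably meaningless"
    if size < 0.4:
        return "low, possible importance"
    if size < 0.6:
        return "moderate correlation"
    if size < 0.8:
        return "high correlation"
    return "very high correlation"

# Hypothetical heights (cm) and weights (kg).
heights = [150, 160, 165, 170, 180]
weights = [52, 58, 63, 70, 79]
r = pearson_r(heights, weights)
print(r, strength(r))
```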
Correlation
[Figure: scatter plots illustrating positive correlation and negative correlation]
Regression
Based on fitting a line to data
Provides a regression coefficient, which is the slope of the line
o For example: $y = \beta_0 + \beta_1 x$ (simple linear regression)
Used to predict a dependent variable’s value based on the value of an independent variable.
E.g. in an analysis of height and weight, one can predict weight for a known height.
Much more useful than correlation
Allows prediction of values of Y rather than just whether there is a relationship between two variables.
Regression
Types of regression
Linear and Multiple - use continuous data to predict a continuous outcome
Logistic - uses continuous data to predict the probability of a dichotomous outcome
Poisson regression - counts of rare events.
Cox proportional hazards regression - survival analysis.
Simple Linear Regression Equation
• The simple linear regression equation is:
$y = \beta_0 + \beta_1 x$
• The graph of the regression equation is a straight line.
• $\beta_0$ is the intercept of the regression line on the y-axis.
• $\beta_1$ is the slope of the regression line.
• $y$ is the expected value of the dependent variable for a given x value.
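As a sketch of how the intercept and slope of the simple linear regression line are estimated, the ordinary least-squares formulas can be written out directly. The function names and the height/weight data are made up for this example.

```python
def fit_line(x, y):
    """Least-squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x_value):
    """Expected value of y for a given x on the fitted line."""
    return intercept + slope * x_value

# Hypothetical heights (cm) and weights (kg).
heights = [150, 160, 165, 170, 180]
weights = [52, 58, 63, 70, 79]
b0, b1 = fit_line(heights, weights)
print(b0, b1, predict(b0, b1, 175))
```

A positive slope corresponds to a positive linear relationship, a negative slope to a negative one, and a slope of 0 to no linear relationship.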
Simple Linear Regression Equation
[Figure: Positive linear relationship. Regression line of y against x with intercept $\beta_0$; slope $\beta_1$ is positive.]
[Figure: Negative linear relationship. Regression line of E(y) against x with intercept $\beta_0$; slope $\beta_1$ is negative.]
[Figure: No relationship. Regression line of E(y) against x with intercept $\beta_0$; slope $\beta_1$ is 0.]
Risk Ratios
Risk is the probability that an event will happen.
Number of events divided by the number of people at risk.
Risks are compared by creating a ratio
Example: risk of colon cancer in those exposed to a factor vs. those unexposed
Risk Ratios
Typically used in cohort studies
Prospective observational studies comparing groups with various exposures.
Allows exploration of the probability that certain factors are associated with outcomes of interest
For example: association of smoking with lung cancer
Usually require large and long-term studies to determine risks and risk ratios.
Interpreting Risk Ratios
A risk ratio of 1 indicates no difference in risk
A risk ratio of greater than 1 indicates increased risk
A risk ratio of less than 1 indicates decreased risk
95% confidence intervals are usually presented
The interval must not include 1 for the estimate to be statistically significant.
Example: Risk ratio of 3.1 (95% CI 0.97 - 9.41) includes 1, thus would not be statistically significant.
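A risk ratio and its 95% confidence interval can be sketched as follows. The interval uses the common large-sample method on the log scale, which is an assumption of this example rather than something stated in the slides; the cohort counts and function names are hypothetical.

```python
import math

def risk_ratio_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log_rr = math.sqrt(1 / events_exposed - 1 / n_exposed
                          + 1 / events_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

def includes_one(lower, upper):
    """True when the interval contains 1, i.e. not statistically significant."""
    return lower <= 1 <= upper

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the outcome.
rr, lower, upper = risk_ratio_ci(30, 100, 10, 100)
print(rr, lower, upper, includes_one(lower, upper))

# The interval quoted in the example above (0.97 to 9.41) does include 1.
print(includes_one(0.97, 9.41))
```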
Odds Ratios
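Odds: the probability of an event occurring divided by the probability of the event not occurring.
Odds are calculated as the number of times an event happens divided by the number of times it does not happen.
o Odds of heads vs. the odds of tails is 1:1 or 1.
An odds ratio is the odds of the event in one group divided by the odds in another group.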
Odds Ratios
Are calculated from case control studies
Case control: patients with a condition (often rare) are compared to a group of selected controls for exposure to one or more potential etiologic factors.
Cannot calculate risk from these studies as that requires the observation of the natural occurrence of an event over time in exposed and unexposed patients (prospective cohort study).
Instead we can calculate the odds for each group.
Comparing Risk and Odds Ratios
For rare events, the ratios are very similar. If 5 of 100 people have a complication:
o The odds are 5/95 or .0526.
o The risk is 5/100 or .05.
For more common events, the ratios begin to differ. If 30 of 100 people have a complication:
o The odds are 30/70 or .43
o The risk is 30/100 or .30
For very common events, the ratios are very different. Male versus female births:
o The odds are .5/.5 or 1
o The risk is .5/1 or .5
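The three cases above can be checked with a short calculation. The function name is chosen only for this sketch, and the male/female birth case is represented as 50 events in 100.

```python
def risk_and_odds(events, total):
    """Risk = events / everyone; odds = events / non-events."""
    risk = events / total
    odds = events / (total - events)
    return risk, odds

# Rare, more common and very common events (events out of 100 people).
for events in (5, 30, 50):
    risk, odds = risk_and_odds(events, 100)
    print(events, round(risk, 4), round(odds, 4))
```

The gap between risk and odds widens as the event becomes more common, which is why odds ratios approximate risk ratios only for rare outcomes.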
Risk reduction
Absolute risk reduction: amount by which risk is reduced.
Relative risk reduction: proportion or percentage reduction.
Example:
Death rate without treatment: 10 per 1000
Death rate with treatment: 5 per 1000
ARR = 5 per 1000
RRR = 50%
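The example above works out as follows in code. This is a small sketch with a made-up function name; the number needed to treat is included because it is simply the reciprocal of the ARR.

```python
def risk_reduction(risk_untreated, risk_treated):
    """Return (ARR, RRR, NNT) given the risk in each group as proportions."""
    arr = risk_untreated - risk_treated  # absolute risk reduction
    rrr = arr / risk_untreated           # relative risk reduction
    nnt = 1 / arr                        # number needed to treat
    return arr, rrr, nnt

# Death rate 10 per 1000 without treatment and 5 per 1000 with treatment.
arr, rrr, nnt = risk_reduction(10 / 1000, 5 / 1000)
print(round(arr * 1000, 3), "per 1000;", round(rrr * 100, 1), "%; NNT =", round(nnt))
```

A 50% relative reduction sounds large, yet 200 patients must be treated to prevent one death, which is why both figures are worth reporting.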
What is survival analysis?
Survival analysis is a form of regression technique which models time to an event (death, recurrence, recovery).
Unlike linear regression, survival analysis has a dichotomous (binary) outcome
Unlike logistic regression, survival analysis analyzes the time to an event
Able to account for censoring
Can compare survival between groups
Assesses the relationship between covariates (variables) and survival time
Survival analysis requires
A defined event of interest (such as death, recurrence, new primary)
A specified start and end time for the study’s observation period
The time to event or censoring for each subject
Survival Data
[Figure: follow-up timelines for individual subjects, from start to end of follow-up]
† death from specific cancer = event
? lost to follow-up = censor
alive at last visit = censor
What is Censored data?
Patients who do not reach the event by the end of
the study or who are lost to follow-up.
Types of censored data include:
Event did not occur by the end of the study
Subject died from an unrelated cause
Subject lost to follow-up
Subject withdrew from study
Regression vs. Survival Analysis
Technique             Mathematical model                                Yields
Linear regression     $Y = B_1 X + B_0$ (linear)                        Linear changes
Logistic regression   $\ln(P/(1-P)) = B_1 X + B_0$ (sigmoidal prob.)    Odds ratios
Survival analyses     $h(t) = h_0(t)\exp(B_1 X + B_0)$                  Hazard rates
When to use survival analysis
To estimate time-to-event for a group of individuals, such as time until second heart attack for a group of MI patients.
To compare time-to-event between two or more groups, such as treated vs. placebo patients in a randomized controlled trial.
To assess the relationship of covariates to time-to-event, such as: does weight, smoking, or cholesterol influence survival time of MI patients?
Kaplan-Meier survival curves
Accounts for censoring
Provides a graphical means of comparing the
outcomes of two groups that vary by intervention
or other factor.
Survival rates can be measured directly from curve.
Difference between curves can be tested for
statistical significance.
Does not account for confounding or effect
modification by other covariates
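A minimal sketch of the Kaplan-Meier (product-limit) calculation shows how censoring is accounted for: censored subjects leave the risk set without lowering the survival estimate. The follow-up times below are invented for illustration, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  : follow-up time for each subject
    events : 1 if the event occurred at that time, 0 if the subject was censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk  # step down at each event time
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve

# Hypothetical follow-up in days; 0 marks a censored subject.
days = [30, 45, 45, 60, 90, 120]
status = [1, 0, 1, 1, 0, 1]
for t, s in kaplan_meier(days, status):
    print(t, round(s, 3))
```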
Survival after Surgery (Kaplan-Meier curves)
[Figure: survival probability against days for males and females. Ticks = censor; step = event.]
Cox Regression Model
Also called Proportional Hazards Survival Model.
Used to investigate relationship between an event (death,
recurrence) occurring over time and possible explanatory
factors (Covariates).
Reported result: Hazard ratio (HR).
Ratio of the hazard in one group divided by the hazard in another.
Interpreted in the same way as risk ratios and odds ratios
HR = 1: no effect
HR > 1: increased risk
HR < 1: decreased risk
Cox Regression Model
Commonly used in long-term studies where various factors, called covariates, might predispose to an event.
Example: after radiotherapy, which factors (age, race, tumor staging, etc.) might make recurrence more likely?
Sensitivity
Sensitivity = A/(A+C)
Ability of a test to correctly identify individuals who are affected
Proportion of people testing positive among affected individuals
Specificity
Specificity = D/(D+B)
Ability of a test to correctly identify individuals who are not affected
Proportion of people testing negative among non-affected individuals
Performance of a test
              Disease
              Yes    No
Test   +      TP     FP
       -      FN     TN
$Se = \frac{TP}{TP + FN}$
$Sp = \frac{TN}{TN + FP}$
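Sensitivity and specificity follow directly from the four cells of the test-versus-disease table. The counts below are hypothetical and the function names are chosen only for this sketch.

```python
def sensitivity(tp, fn):
    """Proportion of affected individuals who test positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of non-affected individuals who test negative."""
    return tn / (tn + fp)

# Hypothetical counts: 90 true positives, 10 false negatives,
# 160 true negatives, 40 false positives.
print(sensitivity(tp=90, fn=10), specificity(tn=160, fp=40))
```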
Distribution of quantitative test results among affected and non-affected people (ideal case)
[Figure: number of people tested against quantitative result of the test (0 to 20). The non-affected (TN) and affected (TP) distributions are separated by the threshold for a positive result.]
Distribution of quantitative results among affected and non-affected people (real life)
[Figure: number of people tested against quantitative result of the test (0 to 20). The non-affected and affected distributions overlap around the threshold for a positive result, giving TN, TP, FN and FP.]
Effect of Decreasing the Threshold
[Figure: number of people tested against quantitative result of the test (0 to 20), with the threshold for a positive result moved lower; regions TN, TP, FN and FP for the non-affected and affected distributions.]
[Figure: table of test result (+/-) against disease (yes/no) with cells TP, FP, FN and TN, alongside $Se = \frac{TP}{TP + FN}$ and $Sp = \frac{TN}{TN + FP}$.]
Effect of Increasing the Threshold
[Figure: number of people tested against quantitative result of the test (0 to 20), with the threshold for a positive result moved higher; regions TN, TP, FN and FP for the non-affected and affected distributions.]
[Figure: table of test result (+/-) against disease (yes/no) with cells TP, FP, FN and TN, alongside $Se = \frac{TP}{TP + FN}$ and $Sp = \frac{TN}{TN + FP}$.]
Performance of a Test and Threshold
Sensitivity and specificity vary in opposite directions when changing the threshold
The choice of a threshold is a compromise to best reach the objectives of the test
What are the consequences of having false positives?
What are the consequences of having false negatives?
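The trade-off can be seen by recomputing sensitivity and specificity at different thresholds. The sketch assumes that higher test results indicate disease and that a result at or above the threshold counts as positive; the test results and the function name are hypothetical.

```python
def se_sp_at_threshold(affected, non_affected, threshold):
    """Sensitivity and specificity when results >= threshold are called positive."""
    tp = sum(1 for v in affected if v >= threshold)
    fn = len(affected) - tp
    tn = sum(1 for v in non_affected if v < threshold)
    fp = len(non_affected) - tn
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical quantitative results with overlapping distributions.
affected = [8, 10, 11, 12, 14, 15]
non_affected = [3, 5, 6, 7, 9, 11]

for threshold in (7, 10, 13):
    se, sp = se_sp_at_threshold(affected, non_affected, threshold)
    print(threshold, round(se, 2), round(sp, 2))
```

With these numbers, lowering the threshold raises sensitivity at the cost of specificity, and raising it does the opposite.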
When false diagnosis (FP)
is worse than missed diagnosis (FN)
Example: Screening for congenital
toxoplasmosis
One should minimise false positives
Prioritise SPECIFICITY
When missed diagnosis (FN)
is worse than false diagnosis (FP)
Example: Testing for Helicobacter pylori
infection
One should minimise the false negatives
Prioritise SENSITIVITY
Receiver Operating Characteristics curve
(ROC curve)
Representation of relationship
between sensitivity and specificity for a test
Simple tool to:
define best cut-off value of a test
compare performance of two tests
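A ROC curve is built by sweeping the threshold across all observed results and plotting sensitivity against 1 - specificity; the area under it (AUC) summarizes the test in a single number. The sketch below assumes that higher results indicate disease and uses the trapezoid rule for the area; the data and function names are hypothetical.

```python
def roc_points(affected, non_affected):
    """(1 - specificity, sensitivity) pairs, from the strictest threshold to the laxest."""
    thresholds = sorted(set(affected) | set(non_affected), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every result: nobody tests positive
    for t in thresholds:
        sens = sum(v >= t for v in affected) / len(affected)
        false_pos_rate = sum(v >= t for v in non_affected) / len(non_affected)
        points.append((false_pos_rate, sens))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical quantitative results with overlapping distributions.
affected = [8, 10, 11, 12, 14, 15]
non_affected = [3, 5, 6, 7, 9, 11]
print(round(area_under_curve(roc_points(affected, non_affected)), 3))
```

An AUC of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation, so two tests can be compared by their AUCs.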
ROC Curve
[Figure: ROC curves for GCS, KTS, RTS and Triage RTS. x-axis: 100 - Specificity (0 to 100); y-axis: Sensitivity (0 to 100).]
ROC Curve
Score   AUC     95% C.I.         Optimal cutoff point   Sensitivity at cutoff (%)   Specificity at cutoff (%)
GCS     0.880   0.825 to 0.923   9                      100                         68.4
tRTS    0.881   0.825 to 0.924   9                      100                         54
RTS     0.883   0.827 to 0.925   5.7                    83.3                        83.3
KTS     0.914   0.864 to 0.950   12                     100                         70.7
Determining the Test (I)
What kind of variables are they?
1. Numerical variable
2. Ordinal variable
3. Categorical variable (Nominal)
How many groups are there?
T-test vs ANOVA
Determining the Test (II)
Are they normally distributed?
Parametric vs. nonparametric methods.
T-test vs. Mann-Whitney U test
ANOVA vs. Kruskal-Wallis test
Determining the Test (III)
When measurements are taken from the same patient more than once (e.g. before and after treatment), you should use
Paired t-test
Repeated-measures ANOVA
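The decisions in "Determining the Test" (I) to (III) can be collected into a small lookup for continuous outcome data. This is only a sketch of the rules as stated on these slides: the function name and arguments are invented, and combinations the slides do not cover (repeated measures on non-normal data) deliberately return None rather than guessing.

```python
def choose_test(n_groups, normal, repeated=False):
    """Suggest a test for comparing groups on a continuous outcome.

    n_groups : number of groups (or repeated measurements) being compared
    normal   : True if the data are normally distributed
    repeated : True if the measurements come from the same patients
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if repeated:
        if not normal:
            return None  # not covered by these slides
        return "Paired t-test" if n_groups == 2 else "Repeated-measures ANOVA"
    if n_groups == 2:
        return "T-test" if normal else "Mann-Whitney U test"
    return "ANOVA" if normal else "Kruskal-Wallis test"

print(choose_test(2, normal=True))
print(choose_test(3, normal=False))
print(choose_test(2, normal=True, repeated=True))
```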
Determining the Test (IV)
Usually, data are analyzed after collection is complete (all the measurements are finished), but in some studies data are still being collected when the researchers start their analysis.
For such data, use survival analysis
Selecting the appropriate procedure among the common statistical procedures
                        Dependent Variable
Independent Variable    Categorical                        Continuous
Categorical             Chi Square, Logistic regression    t-tests, One-way ANOVA
Continuous              Logistic regression                Correlation, Linear regression
For a more complete table, please see the book:
Getting to know SPSS. Second Edition
by Oluwadiya Kehinde
Further studies?
Getting to know SPSS.
Second Edition by
Oluwadiya Kehinde
www.oluwadiya.sitesled.com
Final thoughts…….
When reading a journal article…….
Keep In Mind That
No study is perfect
All data is dirty in some way or another;
research is what you do with that dirty data
Measurement involves making choices
Be Critical About Numbers
Every statistic is a way of summarizing complex
information into relatively simple numbers.
How did the researchers arrive at these numbers?
Who produced the numbers and what is their bias?
How were key terms defined, and in how many different ways?
Be Critical About Numbers
How was the choice for the measurement made?
What type of sample was gathered & how does that affect result?
Is the statistical result interpreted correctly?
If comparisons are made, were they appropriate?
Are there competing statistics?
Be Critical About Numbers
With one foot in a
bucket of ice water,
and one foot in a
bucket of boiling
water, you are, on
the average,
comfortable.
Be critical about numbers:
Bias and Error
Thanks for your attention