Basic Biostatistics Prof. Dr. Jamalludin Ab Rahman MD MPH Department of Community Medicine
Basic biostatistics for medical students

Jan 25, 2017

Transcript
Page 1: Basic biostatistics for medical students

Basic Biostatistics
Prof. Dr. Jamalludin Ab Rahman MD MPH

Department of Community Medicine

Page 2: Basic biostatistics for medical students

Learning outcomes
At the end of this workshop, you should be able to:
- Describe population & sample, causation, level of measurement and distribution of data
- Summarise categorical & numerical data
- Use the appropriate statistical test for bivariable analyses using SPSS

26 October 2016, Basic Biostatistics (C) Jamalludin Ab Rahman 2015

Page 3: Basic biostatistics for medical students

We observe, we believe.
What we observe might not be the truth.

Page 4: Basic biostatistics for medical students

Population vs. Sample

Page 5: Basic biostatistics for medical students

Parameter vs. Statistics
- Parameter: characteristic of the whole population
- Statistics: characteristic of a sample, presumably measurable

Page 6: Basic biostatistics for medical students

Statistics estimate parameters
- Representative
- Sampling error
- Different samples yield different estimates
- Statistics ≈ Parameter if sampling is done properly
- How to prove?

Page 7: Basic biostatistics for medical students

Population: N = 35, Red = 18/35 = 51.4% (Parameter)

Sample 1: N = 6, Red = 3/6 = 50%
Sample 2: N = 6, Red = 2/6 = 33.3%
Sample 3: N = 6, Red = 4/6 = 66.7%

Average % Red = (50% + 33.3% + 66.7%) / 3 = 50% (Statistics)

50% ≈ 51.4%, i.e. Statistics ≈ Parameter
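
The "How to prove?" question can be explored with a small simulation. The sketch below is only an illustration: it rebuilds the population from the slide (35 items, 18 red), draws many random samples of 6, and compares the average of the sample percentages with the parameter. The number of repeats and the random seed are arbitrary assumptions.

```python
import random
import statistics

# Population from the slide: 35 items, 18 of them red (parameter = 51.4%).
population = ["red"] * 18 + ["other"] * 17
parameter = population.count("red") / len(population)

def sample_proportion(rng, n=6):
    """Proportion of red items in one simple random sample of size n."""
    sample = rng.sample(population, n)
    return sample.count("red") / n

rng = random.Random(2016)  # fixed seed so the run is repeatable
estimates = [sample_proportion(rng) for _ in range(10_000)]

print(f"Parameter:              {parameter:.3f}")
print(f"Mean of the statistics: {statistics.mean(estimates):.3f}")
print(f"Smallest / largest:     {min(estimates):.3f} / {max(estimates):.3f}")
```

Individual samples vary widely (sampling error), but their average settles close to the parameter.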

Page 8: Basic biostatistics for medical students

Variable & its role
A variable is a characteristic whose value may change.

Roles: Dependent, Independent

Page 9: Basic biostatistics for medical students

Causation
- Relation of events (cause and effect)
- But correlation (between two events) does not (always) imply causation
- Rooster's crow does not cause the sun to rise
- Switch does not cause the bulb to light

Page 10: Basic biostatistics for medical students

Time & causation

[Diagram: a time axis with three exposures and one outcome]

Page 11: Basic biostatistics for medical students

Time & causation (example)

[Diagram: a time axis with age, exposure to silica and smoking as exposures, and lung cancer as the outcome]

Page 12: Basic biostatistics for medical students

Causal Model

[Diagram: four exposures and one outcome]

Page 13: Basic biostatistics for medical students

Type of data (Level of measurement)

Categorical
- Nominal, e.g. Gender, Race
- Ordinal, e.g. Cancer staging, Severity of CXR for PTB

Numerical
- Discrete, e.g. Parity, Gravida
- Continuous, e.g. Hb, RBS, cholesterol

Page 14: Basic biostatistics for medical students

Distribution (shape) of data
- Applicable to numerical values
- Discrete or Continuous
- Discrete ~ Binomial, Poisson, Negative Binomial, Hypergeometric, Multinomial etc.
- Continuous ~ Normal, t, chi-square, F etc.

Page 15: Basic biostatistics for medical students

Central limit theorem
“Given a distribution with a mean μ and variance σ², the sampling distribution of the mean approaches a normal distribution with a mean (μ) and a variance σ²/N as N, the sample size, increases” (David M. Lane)
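
The theorem can be checked numerically. The following is a minimal sketch, assuming a skewed Exponential(1) population (mean μ = 1, variance σ² = 1); the population, the sample sizes and the number of repeats are choices made for this illustration only.

```python
import random
import statistics

rng = random.Random(42)

def sample_means(n, repeats=5_000):
    """Means of `repeats` samples of size n from a skewed Exponential(1) population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(repeats)]

# Exponential(1) has mean mu = 1 and variance sigma^2 = 1.
for n in (2, 10, 50):
    means = sample_means(n)
    print(f"N={n:>2}: mean of means={statistics.fmean(means):.3f}, "
          f"variance of means={statistics.variance(means):.4f}, "
          f"sigma^2/N={1 / n:.4f}")
```

As N grows, the mean of the sample means stays near μ while their variance shrinks towards σ²/N.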

Page 16: Basic biostatistics for medical students

Normal Distribution

$$ f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} $$

Why Normal?
- Because many biological & psychological variables are distributed normally
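
As a quick check of the Normal density, the sketch below writes it out term by term and compares it with `statistics.NormalDist` from the Python standard library. The mean and SD (a hypothetical Hb of 14.0 g% with SD 1.5) are assumptions for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-z^2 / 2) / (sigma * sqrt(2 * pi)), with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 14.0, 1.5  # hypothetical Hb mean and SD in g%
for x in (11.0, 14.0, 17.0):
    print(x, round(normal_pdf(x, mu, sigma), 4), round(NormalDist(mu, sigma).pdf(x), 4))
```

The values at 11.0 and 17.0 are equal, which reflects the symmetry of the curve around the mean.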

Page 17: Basic biostatistics for medical students

Characteristics
- Bell shaped curve
- Symmetrical
- Unimodal

Page 18: Basic biostatistics for medical students

Test of Normality
- Anderson–Darling Test
- Corrected Kolmogorov–Smirnov Test (Lilliefors Test)
- Cramér–von Mises Criterion
- D'Agostino's K-squared Test
- Jarque–Bera Test
- Pearson's Chi-square Test
- Shapiro–Francia Test
- Shapiro–Wilk Test

Page 19: Basic biostatistics for medical students

Use Normality test with caution
- Small samples almost always pass a normality test. Normality tests have little power to tell whether or not a small sample of data comes from a Gaussian distribution.
- With large samples, minor deviations from normality may be flagged as statistically significant, even though small deviations from a normal distribution won’t affect the results of a t test or ANOVA.
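
To see how sample size drives a normality test, here is a rough sketch using a hand-written Jarque–Bera test (chosen only because it needs nothing beyond the standard library). Both samples come from the same clearly skewed Exponential population; the seed and sample sizes are arbitrary, and the asymptotic P-value is itself unreliable for very small samples.

```python
import math
import random

def jarque_bera(data):
    """Jarque-Bera statistic and its asymptotic P-value (chi-square, 2 df)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)  # survival function of chi-square with 2 df

rng = random.Random(7)
# Same skewed population (Exponential), two sample sizes.
small = [rng.expovariate(1.0) for _ in range(8)]
large = [rng.expovariate(1.0) for _ in range(2_000)]

for label, data in (("n = 8", small), ("n = 2000", large)):
    jb, p = jarque_bera(data)
    print(f"{label}: JB = {jb:.2f}, P = {p:.4f}")
```

With 2000 observations the skew is flagged decisively; with 8 observations the test has little power, so a non-significant P there says very little about Normality.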

Page 20: Basic biostatistics for medical students

Why run statistical test?

1. Determine presence of difference (or similarity)
2. Determine degree of difference
3. Determine the direction of changes (trend)
4. Predict changes (outcomes)

Page 21: Basic biostatistics for medical students

A B C

Is there any difference between A & B?

How big is the difference between A & B?

Which one is taller? A or B?

Is C different from A & B?

Is there any pattern now?

If there were a D, could you predict how tall D would be?

Page 22: Basic biostatistics for medical students

Statistical analysis

Descriptive (Univariable: IV or DV alone)
- e.g. Describe socio-demographic characteristics: Age, Sex, Race etc.
- e.g. Prevalence of hypertension

Analytical, Bivariable (one IV, one DV)
- e.g. Compare demographic characteristics between two populations, such as age between males & females
- e.g. Distribution of gender by hypertension status

Analytical, Multivariable (more than one IV, one DV)
- e.g. How demographic characteristics (more than one factor) explain hypertension

Page 23: Basic biostatistics for medical students

Descriptive statistics

Page 24: Basic biostatistics for medical students

Descriptive Statistics
- Explain one variable at a time
- Method based on type of measure
- Categorical: Frequency (Percentage)
- Numerical: Central measures (e.g. mean, median) & Dispersion (e.g. variance, standard deviation, range, min-max, interquartile range)

Page 25: Basic biostatistics for medical students

How to describe data
- Categorical: Frequency (count) & Percentage
- Numerical, Normal: Mean (SD)
- Numerical, Not Normal: Median (Range/IQR)
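
The rules on this slide translate directly into a few lines of code. The sketch below uses made-up values for gender and age (they are not the workshop data), and the quartiles come from Python's default method, which may differ slightly from SPSS.

```python
import statistics
from collections import Counter

# Hypothetical data, for illustration only.
gender = ["Male", "Female", "Female", "Male", "Female", "Female"]
age = [23, 29, 31, 33, 36, 41, 52, 67]

# Categorical: frequency (count) and percentage.
for category, count in Counter(gender).most_common():
    print(f"{category}: {count} ({100 * count / len(gender):.1f}%)")

# Numerical, Normal: mean (SD).
print(f"Mean (SD): {statistics.mean(age):.1f} ({statistics.stdev(age):.1f})")

# Numerical, not Normal: median (IQR).
q1, _, q3 = statistics.quantiles(age, n=4)
print(f"Median (IQR): {statistics.median(age)} ({q3 - q1})")
```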

Page 26: Basic biostatistics for medical students

Analytical statistics

Page 27: Basic biostatistics for medical students

Comparing difference

[Figure: three panels, A, B and C]

Which of the following shows a true difference between two populations?

Page 28: Basic biostatistics for medical students

3 methods to compare values
1. P-value
2. Confidence interval
3. Effect size

Page 29: Basic biostatistics for medical students

Hypothesis testing

                           Truth: Ho True      Truth: Ho False
Test: Do not reject Ho     Correct             Type II error (β)
Test: Reject Ho            Type I error (α)    Correct

Please revise the concept on your own

Page 30: Basic biostatistics for medical students

P value
- The P-value tells us whether the observed data would be ‘likely’ or ‘unlikely’ if Ho were true
- Taking 0.05 as the cut-off point (α), if P ≤ 0.05, the data are ‘unlikely’ under Ho, therefore reject Ho

Page 31: Basic biostatistics for medical students

One-tailed vs. two-tailed
Is there a difference between Hb 14 g% vs. Hb 12 g% in male & female respectively?
- Ho: HbM - HbF = 0
- H1: HbM ≠ HbF
- H2: HbM > HbF
- H3: HbM < HbF

Note: Should be determined a priori
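
The choice of alternative hypothesis changes the P-value obtained from the same test statistic. A minimal sketch, assuming a z statistic (Normal approximation) and a hypothetical value of z = 1.8 for HbM - HbF:

```python
from statistics import NormalDist

def p_values(z):
    """P-values for a z statistic under the three alternative hypotheses."""
    cdf = NormalDist().cdf
    return {
        "two-sided (HbM != HbF)": 2 * (1 - cdf(abs(z))),
        "right-sided (HbM > HbF)": 1 - cdf(z),
        "left-sided (HbM < HbF)": cdf(z),
    }

z = 1.8  # hypothetical test statistic for HbM - HbF
for alternative, p in p_values(z).items():
    print(f"{alternative}: P = {p:.3f}")
```

Here the right-sided test falls below 0.05 while the two-sided test does not, which is why the direction has to be fixed before looking at the data.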

Page 32: Basic biostatistics for medical students

One-tailed vs. two-tailed

[Figure: two-sided, left-sided and right-sided tests]

Page 33: Basic biostatistics for medical students

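Hypothesis testing

                           Truth: Ho True      Truth: Ho False
Test: Do not reject Ho     Correct             Type II error (β)
Test: Reject Ho            Type I error (α)    Correct

α is the probability of making a Type I error. The P-value is the probability, when Ho is true, of a result at least as extreme as the one observed (based on frequentist inference); rejecting Ho only when P ≤ α keeps the Type I error rate at α.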

Page 34: Basic biostatistics for medical students

P & Sample Size

Page 35: Basic biostatistics for medical students

The truth about P value
- Used to judge effectiveness (even by US FDA)
- Small P means statistically significant, NOT clinically significant
- But, be careful when interpreting P value
- P is affected BOTH by effectiveness AND sample size
- P can be < 0.05 even though the effectiveness is marginal when sample size is huge
- Comparing Ps between studies is only appropriate if the sampling & sample size are the same
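
The dependence of P on sample size can be shown with a fixed, marginal effect. The sketch below assumes a hypothetical 0.1 g% difference in mean Hb with SD 1.5 g% in both groups, equal group sizes, and a z approximation for the two-sample test.

```python
import math
from statistics import NormalDist

def two_sample_z_p(diff, sd, n_per_group):
    """Two-sided P for a difference in means, equal SD and group size (z approximation)."""
    se = sd * math.sqrt(2 / n_per_group)
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical marginal effect: 0.1 g% difference in Hb, SD 1.5 g%.
for n in (50, 500, 5_000, 50_000):
    print(f"n per group = {n:>6}: P = {two_sample_z_p(0.1, 1.5, n):.4f}")
```

The effect never changes, yet P moves from clearly non-significant to highly significant as the groups grow.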

Page 36: Basic biostatistics for medical students

P < 0.05
- Why 5%?
- Cut-off point proposed by Sir Ronald A. Fisher (1925) to reject or not to reject a hypothesis
- If we reject Ho only when P < 0.05, the probability of making a Type I error is kept below 5%
- If P > 0.05, a difference this large would occur by chance more than 5% of the time even when there is no TRUE difference

Page 37: Basic biostatistics for medical students

Hypothesis Testing using bivariable analysis
- Try to prove that Exposure causes the Disease
- e.g. Smoking causing Lung Cancer
- Ho: No difference of risk to get Lung Cancer between smoker and non-smoker

Page 38: Basic biostatistics for medical students

                Lung Cancer    No Lung Cancer
Smoking         20 (18.2%)     90 (81.8%)
Not Smoking     5 (4.5%)       105 (95.5%)

The occurrence of lung cancer is significantly higher among smokers (18.2%) than among non-smokers (4.5%) (χ²(df=1) = 10.15, P = 0.001, OR = 4.7 (95% CI 1.7 – 13.0))
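
The reported figures can be reproduced by hand from the four cell counts. The sketch below is not SPSS output: it computes the Pearson chi-square without continuity correction, and it assumes Woolf's log-scale method for the confidence interval of the odds ratio, so the last decimal may differ slightly from the slide.

```python
import math

# 2x2 table from the slide: rows = smoking / not smoking, columns = cancer / no cancer.
a, b = 20, 90
c, d = 5, 105
n = a + b + c + d

# Pearson chi-square: sum of (observed - expected)^2 / expected over the four cells.
observed = [[a, b], [c, d]]
row_totals = [a + b, c + d]
col_totals = [a + c, b + d]
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2) for j in range(2)
)
p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, valid for df = 1

# Odds ratio with a 95% CI on the log scale (Woolf's method, an assumption).
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
print(f"OR = {odds_ratio:.1f} (95% CI {ci_low:.1f} to {ci_high:.1f})")
```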

Page 39: Basic biostatistics for medical students

Confidence Interval
- Range of plausible values
- Narrow interval: high precision
- Wide interval: poor precision
- How narrow is narrow? And how wide is wide?
- Base on your clinical judgment

Page 40: Basic biostatistics for medical students

Interpret single CI
- Compare with the null value, i.e. 0 for a difference (e.g. in %) or 1 for a ratio (e.g. risk)
- Compare with practical significance, or the clinical significance/indifference

[Figure: four confidence intervals, A to D, each plotted against the null value]

Source: http://www.childrens-mercy.org/stats/journal/confidence.asp
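
Comparing a single CI with the null value can be sketched as follows, reusing the smoker and non-smoker counts from the lung cancer 2x2 table. The Wald interval for a difference of two proportions is an assumption made for simplicity; other interval methods exist.

```python
import math

def risk_difference_ci(events1, n1, events2, n2, z=1.96):
    """Wald 95% CI for a difference of two proportions (a simple large-sample method)."""
    p1, p2 = events1 / n1, events2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se

# Lung cancer among smokers (20/110) vs. non-smokers (5/110).
diff, low, high = risk_difference_ci(20, 110, 5, 110)
null_value = 0.0  # null for a difference; it would be 1 for a ratio such as OR or RR
excludes_null = not (low <= null_value <= high)

print(f"Risk difference = {diff:.1%} (95% CI {low:.1%} to {high:.1%})")
print("CI excludes the null value" if excludes_null else "CI includes the null value")
```

Whether the lower limit is also clinically important is a separate judgment that the interval alone cannot make.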

Page 41: Basic biostatistics for medical students

Comparing multiple CIs

[Figure: panels A, B and C, each comparing the confidence intervals of two groups, 1 and 2]

Page 42: Basic biostatistics for medical students

Effect size
- The measure of effect irrespective of sample size
- Cohen (1988) classifies ES into Low (< 0.3), Medium (0.3-0.7) and Large (> 0.7)
- Manual calculation or web-based calculation
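
A manual calculation might look like the sketch below. It assumes Cohen's d (difference in means over the pooled SD) as the effect size, which the slide does not name explicitly, and it applies the cut-offs exactly as quoted on the slide. The Hb values are hypothetical.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

def classify(es):
    """Cut-offs as quoted on the slide: Low < 0.3, Medium 0.3-0.7, Large > 0.7."""
    size = abs(es)
    if size < 0.3:
        return "Low"
    return "Medium" if size <= 0.7 else "Large"

# Hypothetical summary data: Hb (g%) in males vs. females.
d = cohens_d(14.0, 1.5, 40, 12.0, 1.5, 40)
print(f"d = {d:.2f} ({classify(d)})")
```

Unlike P, this value does not change if both groups are made ten times larger with the same means and SDs.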

Page 43: Basic biostatistics for medical students

Page 44: Basic biostatistics for medical students

Statistical Test
- Bivariable (univariate) ~ One dependent & one independent variable
- Multivariate ~ Multiple dependent & multiple independent variables

Page 45: Basic biostatistics for medical students

What test to use?

Variable 1                      Variable 2                        Test
Categorical                     Categorical                       Chi-square
Categorical (2 pop)             Numerical (Normal)                Independent sample t-test
Categorical (2 pop)             Numerical (Not Normal)            Mann-Whitney U test
Categorical (> 2 pop)           Numerical (Normal)                One-way ANOVA
Categorical (> 2 pop)           Numerical (Not Normal)            Kruskal-Wallis test
Numerical (Normal)              Numerical (Normal)                Pearson Correlation Coefficient Test
Numerical (Normal/Not Normal)   Numerical (Not Normal)            Spearman Correlation Coefficient Test
Numerical (Normal)              Numerical (Normal), Paired        Paired t-test
Numerical (Not Normal)          Numerical (Not Normal), Paired    Wilcoxon Signed Rank Test
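
The table can also be read as a small decision function. The sketch below is only a lookup aid; the function name `choose_test` and its parameters are hypothetical and not part of SPSS.

```python
def choose_test(var1, var2, groups=None, normal=True, paired=False):
    """Look up the bivariable test from the table.

    var1, var2: "categorical" or "numerical"
    groups:     number of populations when one variable is categorical
    normal:     whether all numerical data involved are Normally distributed
    paired:     whether two numerical measurements are paired
    """
    kinds = {var1, var2}
    if kinds == {"categorical"}:
        return "Chi-square"
    if kinds == {"categorical", "numerical"}:
        if groups is None or groups < 2:
            raise ValueError("groups must be 2 or more when one variable is categorical")
        if groups == 2:
            return "Independent sample t-test" if normal else "Mann-Whitney U test"
        return "One-way ANOVA" if normal else "Kruskal-Wallis test"
    if kinds == {"numerical"}:
        if paired:
            return "Paired t-test" if normal else "Wilcoxon Signed Rank Test"
        return "Pearson correlation" if normal else "Spearman correlation"
    raise ValueError("variables must be 'categorical' or 'numerical'")

print(choose_test("categorical", "numerical", groups=2, normal=True))
print(choose_test("categorical", "numerical", groups=3, normal=False))
print(choose_test("numerical", "numerical", normal=False))
```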

Page 46: Basic biostatistics for medical students

Bivariable Analyses
- Compare means
  - Independent sample t-test (Unpaired t-test) ~ Two unrelated means
  - Paired t-test ~ Two related means
  - One-way ANOVA ~ More than 2 means
- χ² Test ~ Between categorical variables
- Non-parametric tests (Kruskal-Wallis, Mann-Whitney U tests) ~ If data are not normally distributed

Page 47: Basic biostatistics for medical students

Writing plan for statistical analysis #1

Data were analyzed using the complex sample function of SPSS (version 13.0). Sampling errors were estimated using the primary sampling units and strata provided in the data set. Sampling weights were used to adjust for nonresponse bias and the oversampling of blacks, Mexican Americans, and the elderly in NHANES. The prevalence of hypertension, as well as the awareness, treatment, and control rates, were age adjusted by direct standardization to the US 2000 standard population.10 To analyze differences over time, the 2003–2004 data were compared with the 1999–2000 data. Estimates with a coefficient of variation >0.3 were considered unreliable. A 2-tailed P value <0.05 was considered statistically significant. (Ong et al. 2009)

Page 48: Basic biostatistics for medical students

Writing plan for statistical analysis #2

To assess the effect of the selection process on the characteristics of the cases, we compared cases included in the final analysis to the rest of the cases. Since controls included in the present analysis were different from the rest of the diabetes free participants by design, no similar comparisons were performed for that group. To compare baseline characteristics of cases and controls appropriate univariate statistics were used. Similar binary logistic and multiple linear regression models were built with incident diabetes or HbA1c as respective outcomes and additive block entry of adiponectin and potential confounders. For linear regression CRP and triglycerides were log transformed. Since HbA1c could be modified by drug treatment, we ran a sensitivity analysis excluding all participants on antidiabetic medication. A p-value of <0.05 was considered significant. Analyses were performed with SPSS 14.0 for Windows.

Page 49: Basic biostatistics for medical students

Reporting analysis (example)

Page 50: Basic biostatistics for medical students

Reporting analysis (example)

Page 51: Basic biostatistics for medical students

Reporting analysis (example)

Page 52: Basic biostatistics for medical students

Summary
1. Identify & define variables
2. Type – independent vs. dependent
3. Level of measurements – nominal, ordinal or continuous
4. Check distribution – Normal vs. Not Normal
5. Decide what to do – descriptive vs. analytical

Page 53: Basic biostatistics for medical students

Practical
Download data from http://bit.ly/1RC6Zte

Page 54: Basic biostatistics for medical students

Chapter 2
Introduction to SPSS
IBM SPSS Statistics v21 for Windows
Jamalludin Ab Rahman MD MPH
Department of Community Medicine
Kulliyyah of Medicine

Page 55: Basic biostatistics for medical students

IBM SPSS Statistics

IBM Corporation Software Group, Route 100, Somers, NY 10589
Produced in the United States of America
May 2012

Page 56: Basic biostatistics for medical students

SPSS Layout

Page 57: Basic biostatistics for medical students

Layout

Main menu

Toolbar

Variables

Page 58: Basic biostatistics for medical students

Data editor

Variable View: rows = variables. Define & describe your variables here.

Data View: rows = each data record. Enter your data here.

Page 59: Basic biostatistics for medical students

Viewer

The output of analyses will be displayed here.

Output is separated from data.

Page 60: Basic biostatistics for medical students

Syntax

- We can compile all the steps of the analyses here.
- Extends the programming function in SPSS.
- Ability to perform complex steps e.g. “looping”

Page 61: Basic biostatistics for medical students

Creating Dataset

Page 62: Basic biostatistics for medical students

Before you even start SPSS!

You must identify & define relevant variables. Define means:

1. Name – preferably a short single name that begins with a letter, with no special characters and no spaces
2. Type of data – e.g. Numeric, Date, String
3. Width & Decimal Places (if numeric)
4. Label – description for the Name (will be displayed in Viewer)
5. Values – labels for value e.g. 1=Male, 2=Female
6. Missing – define missing value e.g. 999 for N/A

Page 63: Basic biostatistics for medical students

Define your variables

Page 64: Basic biostatistics for medical students

Variable Types

Page 65: Basic biostatistics for medical students

Variable Type

- Decide the suitable variable type
- For numeric, determine Width & Decimal. Decimal < Width
- For String, no option for Decimal Place
- New option for Numerics with leading zeros

Page 66: Basic biostatistics for medical students

Value Labels

Page 67: Basic biostatistics for medical students

Missing Value

This question is Not Applicable to male

e.g. Assign 999 to represent N/A value & this won’t be included in any analysis

Page 68: Basic biostatistics for medical students

Chapter 3
Descriptive Statistics
IBM SPSS Statistics v21 for Windows
Jamalludin Ab Rahman MD MPH
Department of Community Medicine
Kulliyyah of Medicine

Page 69: Basic biostatistics for medical students

Exercise data

- Hypothetical
- Study to describe factors related to HbA1c & Homocysteine (HCY)
- N = 301
- 13 variables

Page 70: Basic biostatistics for medical students

Retrieve file information

Page 71: Basic biostatistics for medical students

healthstatus001

Page 72: Basic biostatistics for medical students

Task 1

1. Describe socio-demographic characteristics of the respondent (age, gender & race)

2. Describe the explanatory variables
   1. Exercise
   2. Smoking status
   3. BMI status
   4. BP status

3. Describe HbA1c (taking cut-off for Poor HbA1c ≥ 6.5%) & HCY

Page 73: Basic biostatistics for medical students

Describe numerical data

Page 74: Basic biostatistics for medical students

Explore

Page 75: Basic biostatistics for medical students

Page 76: Basic biostatistics for medical students

Results

Check for Normality.
Is the Age data distributed Normally?

Page 77: Basic biostatistics for medical students

Is this a Normal distribution?

Page 78: Basic biostatistics for medical students

Describe age

If Normal: The subjects' ages ranged from 23 to 67 years, with a mean of 34 (SD=8) years.

If not Normal: The subjects' ages ranged from 23 to 67 years, with a median of 33 (IQR=11) years.

Page 79: Basic biostatistics for medical students

Describe categorical data

Page 80: Basic biostatistics for medical students

Frequency

Page 81: Basic biostatistics for medical students

Page 82: Basic biostatistics for medical students

Results

Page 83: Basic biostatistics for medical students

TRANSFORM

Page 84: Basic biostatistics for medical students

Compute

Page 85: Basic biostatistics for medical students

weight / ((height / 100) ** 2)

Page 86: Basic biostatistics for medical students

Visual binning

Page 87: Basic biostatistics for medical students

- Normal: < 23
- Overweight: 23 to < 27.5
- Obese: >= 27.5
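
Outside SPSS, the Compute and Visual Binning steps could be mirrored as in the sketch below. It reuses the expression `weight / ((height / 100) ** 2)` and the cut-offs shown in the slides; the function names, the units (kg and cm) and the example respondents are assumptions for illustration.

```python
def bmi(weight_kg, height_cm):
    """Same expression as the Compute step: weight / ((height / 100) ** 2)."""
    return weight_kg / ((height_cm / 100) ** 2)

def bmi_status(value):
    """Cut-offs from the binning step: Normal < 23, Overweight 23 to < 27.5, Obese >= 27.5."""
    if value < 23:
        return "Normal"
    if value < 27.5:
        return "Overweight"
    return "Obese"

# Hypothetical respondents: (weight in kg, height in cm).
for weight, height in [(55, 165), (70, 165), (85, 165)]:
    value = bmi(weight, height)
    print(f"{weight} kg, {height} cm -> BMI {value:.1f} ({bmi_status(value)})")
```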

Page 88: Basic biostatistics for medical students

Page 89: Basic biostatistics for medical students

Chapter 4
Bivariable analyses
IBM SPSS Statistics v21 for Windows
Jamalludin Ab Rahman MD MPH
Department of Community Medicine
Kulliyyah of Medicine

Page 90: Basic biostatistics for medical students

To check association of two variables?

[Diagram: HbA1c and Age]

Page 91: Basic biostatistics for medical students

The steps

1. Determine which is dependent & which is independent
2. Determine level of measurements
3. Determine Normality of the numerical measurement
4. Determine the suitable statistical test

Page 92: Basic biostatistics for medical students

What are the tests?

Variable 1                      Variable 2                        Test
Categorical                     Categorical                       Chi-square
Categorical (2 pop)             Numerical (Normal)                Independent sample t-test
Categorical (2 pop)             Numerical (Not Normal)            Mann-Whitney U test
Categorical (> 2 pop)           Numerical (Normal)                One-way ANOVA
Categorical (> 2 pop)           Numerical (Not Normal)            Kruskal-Wallis test
Numerical (Normal)              Numerical (Normal)                Pearson Correlation Coefficient Test
Numerical (Normal/Not Normal)   Numerical (Not Normal)            Spearman Correlation Coefficient Test
Numerical (Normal)              Numerical (Normal), Paired        Paired t-test
Numerical (Not Normal)          Numerical (Not Normal), Paired    Wilcoxon Signed Rank Test

Page 93: Basic biostatistics for medical students

Tasks 2

1. Determine association between socio-demographic characteristics & all the risk factors with HbA1c

2. Determine association between socio-demographic characteristics & all the risk factors with HCY

Note: It would be good to construct dummy tables for the answers even before the analyses start

Page 94: Basic biostatistics for medical students

HCY normal range

Page 95: Basic biostatistics for medical students

Independent sample t-Test
Comparing Two Means

Page 96: Basic biostatistics for medical students

Age vs. BP

Page 97: Basic biostatistics for medical students

Page 98: Basic biostatistics for medical students

Results

Levene’s test checks equality of variances. Ho: There is no difference of variances. So if P is significant, we reject Ho, and therefore equal variances are NOT assumed; if P is not significant, equal variances are assumed.

The original t-test (Student’s t-test) assumes equal variances. If the variances are equal, it is robust to different sample sizes.

Welch's correction: when the variances are not equal, read the "Equal variances not assumed" row.

Page 99: Basic biostatistics for medical students

Table – Distribution of age by blood pressure status

             N     Mean   SD    Statistics   df    P
Normal BP    156   33.9   7.9   t=0.431      299   0.667
High BP      145   34.4   8.9
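
For readers who want to see where t and df come from, the sketch below recomputes Student's t from the summary values in the table. Because the means and SDs are rounded to one decimal, the result (about 0.5) does not match the reported t = 0.431 exactly, and the P-value uses a Normal approximation in place of the t distribution; the conclusion of no significant difference is unchanged.

```python
import math
from statistics import NormalDist

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic (equal variances assumed) and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Rounded summary values from the table (High BP vs. Normal BP).
t, df = pooled_t(34.4, 8.9, 145, 33.9, 7.9, 156)

# With df = 299 the t distribution is close to Normal, so this P is an approximation.
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
print(f"t = {t:.3f}, df = {df}, approximate P = {p_approx:.3f}")
```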

Page 100: Basic biostatistics for medical students

Chi-squared Test
Difference of Two Proportions

Page 101: Basic biostatistics for medical students

Gender vs. BP

Page 102: Basic biostatistics for medical students

Page 103: Basic biostatistics for medical students

Results

Describe this table first. What is your impression? 49% women vs. 47% men with high BP

Some books may suggest the use of Continuity Correction at ALL times, but recent simulations showed that CC (or Yates's correction) is OVERCONSERVATIVE. Hence, use Pearson χ² when < 20% of cells have expected count < 5

When ≥ 20% of cells have EC < 5, use Fisher’s Exact Test

This is given because we code the variables using numbers. Can be used to measure P-trend

Page 104: Basic biostatistics for medical students

One-way ANOVA
Comparing More than Two Means

Page 105: Basic biostatistics for medical students

Race vs. HbA1c

Page 106: Basic biostatistics for medical students

Page 107: Basic biostatistics for medical students

Results

Describe these results first. What is your impression? HbA1c between races? 6.4 (SD 2.1) vs. 6.7 (SD 2.2) vs. 6.5 (SD 2.2)

The F test shows that there is no significant difference between the group means

Page 108: Basic biostatistics for medical students

Results – BMI status vs. HbA1c

The F test shows that there is at least ONE pair with a significant difference: either N vs. OW, N vs. OB or OW vs. OB

We need to run a Post-hoc test to determine which PAIR is significantly different.

To decide which Post-hoc test to choose, we have to test for equality of variances i.e. Homogeneity of variances (Levene’s test)

Page 109: Basic biostatistics for medical students

Page 110: Basic biostatistics for medical students

Results – Post hoc

The significant difference is only for Normal vs. Obese (P=0.002)

Page 111: Basic biostatistics for medical students

Report

There is a significant association between BMI Status and HbA1c (F(2,298)=13.129, P<0.001). Post-hoc test showed that Obese subjects have significantly higher HbA1c compared to Normal and Overweight subjects (P=0.001 and P < 0.001 respectively).

Page 112: Basic biostatistics for medical students

Mann-Whitney U
Non-Parametric Tests

Page 113: Basic biostatistics for medical students

Gender vs. HCY

Page 114: Basic biostatistics for medical students

Results

This ranks table is not to be cited in the research paper. Instead, describe their MEDIAN

Page 115: Basic biostatistics for medical students

Kruskal-Wallis
Non-Parametric Tests

Page 116: Basic biostatistics for medical students

Race vs. HCY

Page 117: Basic biostatistics for medical students

Results

Page 118: Basic biostatistics for medical students

Correlation Test
Relationship of Two Numerical Data

Page 119: Basic biostatistics for medical students

Age vs. HbA1c

Page 120: Basic biostatistics for medical students

Results

Page 121: Basic biostatistics for medical students

Age vs. HCY

Page 122: Basic biostatistics for medical students
