Applications of Statistics in Research
Bandit Thinkhamrop, Ph.D. (Statistics)
Department of Biostatistics and Demography, Faculty of Public Health, Khon Kaen University

Jan 11, 2016

Page 1: Applications of Statistics in Research Bandit Thinkhamrop, Ph.D.(Statistics) Department of Biostatistics and Demography Faculty of Public Health Khon Kaen.

Applications of Statistics in Research

Bandit Thinkhamrop, Ph.D. (Statistics)
Department of Biostatistics and Demography
Faculty of Public Health
Khon Kaen University

Page 2:

Steps of Statistical Applications
(Practical guides for beginners)

Begin at the conclusion
Identify the primary research question
Identify the primary study outcome
Identify type of the study outcome
Identify type of the study design
Generate a mock data set
Identify type of the main statistical goal
List choices of the statistical methods
Select the most appropriate statistical method
Perform the data analysis using a software
Report and interpret the results from the outputs

Page 4:

Begin at the conclusion

Page 6:

Identify the primary research question

Where to find the research question?
– Title of the study
– The objective(s)
– The conclusion(s)

If more than one, find the primary aim.

Try to make the question “quantifiable”

Page 8:

Identify the primary study outcome

It is the “primary” dependent variable
It is the main finding that was used as the basis for the conclusion of the study
It is the target of the statistical inference
It is the basis for sample size calculation
It resides in the:
– Title
– Research question
– Objective
– Sample size calculation
– Main finding in the RESULTS section of the report
– Conclusion

Page 10:

Type of the study outcome: Key for selecting appropriate statistical methods

Study outcome
– Dependent variable or response variable
– Focus on the primary study outcome if there is more than one

Type of the study outcome
– Continuous
– Categorical (dichotomous, polytomous, ordinal)
– Numerical (Poisson) count
– Event-free duration

Page 11:

Continuous outcome

Primary target of estimation:
– Mean (SD)
– Median (Min:Max)
– Correlation coefficient: r and ICC

Modeling:
– Linear regression
  The model coefficient = Mean difference
– Quantile regression
  The model coefficient = Median difference

Example:
– Outcome = Weight, BP, score of ?, level of ?, etc.
– RQ: Factors affecting birth weight
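To make "the model coefficient = mean difference" concrete, the sketch below fits a least-squares line to a binary predictor and compares the slope with the difference in group means. The birth weights, the `smoker` variable, and the helper `ols_fit` are hypothetical and are not taken from the slides.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical birth weights in grams; smoker = 1, non-smoker = 0.
smoker = [0, 0, 0, 0, 1, 1, 1, 1]
weight = [3200, 3400, 3100, 3300, 2900, 3000, 2800, 3100]

intercept, slope = ols_fit(smoker, weight)

# Difference in group means (smokers minus non-smokers).
mean_smoker = mean(w for w, s in zip(weight, smoker) if s == 1)
mean_nonsmoker = mean(w for w, s in zip(weight, smoker) if s == 0)
mean_diff = mean_smoker - mean_nonsmoker

print(intercept, slope, mean_diff)
```

With a single 0/1 predictor the slope equals the mean difference between the two groups, and the intercept equals the mean of the group coded 0.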

Page 12:

Categorical outcome

Primary target of estimation:
– Proportion or Risk

Modeling:
– Logistic regression
  The exponentiated model coefficient = Odds ratio (OR)

Example:
– Outcome = Disease (y/n), Dead (y/n), cured (y/n), etc.
– RQ: Factors affecting low birth weight
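As a rough illustration of the proportion and the odds ratio, the sketch below works from a hypothetical 2x2 table of low birth weight by exposure (the counts are invented). For a single binary predictor, the fitted logistic regression coefficient equals the log of this cross-product odds ratio, so exponentiating it gives the OR.

```python
import math

# Hypothetical 2x2 counts: low birth weight (LBW) by exposure status.
exposed_lbw, exposed_normal = 30, 70
unexposed_lbw, unexposed_normal = 15, 85

# Proportion (risk) of LBW in each group.
risk_exposed = exposed_lbw / (exposed_lbw + exposed_normal)
risk_unexposed = unexposed_lbw / (unexposed_lbw + unexposed_normal)

# Cross-product odds ratio.
odds_ratio = (exposed_lbw * unexposed_normal) / (exposed_normal * unexposed_lbw)

# The logistic coefficient lives on the log-odds scale.
log_odds_ratio = math.log(odds_ratio)

print(risk_exposed, risk_unexposed, round(odds_ratio, 2), round(log_odds_ratio, 3))
```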

Page 13:

Numerical (Poisson) count outcome

Primary target of estimation:
– Incidence rate (e.g., rate per person-time)

Modeling:
– Poisson regression
  The exponentiated model coefficient = Incidence rate ratio (IRR)

Example:
– Outcome = Total number of falls / Total time at risk of falling
– RQ: Factors affecting elderly fall
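The incidence rate and the incidence rate ratio can be computed by hand for two groups. The fall counts and person-years below are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical data: number of falls and total person-years at risk per group.
falls_a, person_years_a = 24, 120.0
falls_b, person_years_b = 10, 100.0

# Incidence rate = events divided by person-time at risk.
rate_a = falls_a / person_years_a
rate_b = falls_b / person_years_b

# Incidence rate ratio (IRR) compares the two rates by division.
irr = rate_a / rate_b

print(rate_a, rate_b, irr)
```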

Page 14:

Event-free duration outcome

Primary target of estimation:
– Median survival time

Modeling:
– Cox regression
  The exponentiated model coefficient = Hazard ratio (HR)

Example:
– Outcome = Overall survival, disease-free survival, progression-free survival, etc.
– RQ: Factors affecting survival
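The median survival time is usually read off a Kaplan-Meier curve. The sketch below is a minimal Kaplan-Meier calculation on hypothetical follow-up times (an event flag of 0 means censored); the function name `km_median` is an assumption for illustration, and the sketch covers only the estimation target, not Cox regression.

```python
def km_median(times, events):
    """Kaplan-Meier median: first event time at which S(t) is 0.5 or below."""
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1 - deaths / at_risk
        if survival <= 0.5:
            return t
    return None  # the curve never reaches 0.5, so the median is not reached

# Hypothetical follow-up times in months; 1 = event observed, 0 = censored.
times = [2, 3, 3, 5, 8, 9, 12, 14]
events = [1, 1, 0, 1, 1, 0, 1, 1]

median_survival = km_median(times, events)
print(median_survival)
```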

Page 15:

The outcome determines the statistics

Outcome type   Statistic                                  Model
Continuous     Mean, Median                               Linear Reg.
Categorical    Proportion (Prevalence or Risk)            Logistic Reg.
Count          Rate per “space”                           Poisson Reg.
Survival       Median survival, Risk of events at T(t)    Cox Reg.

Page 16:

Statistics quantify errors for judgments

Parameter estimation [95% CI]
Hypothesis testing [P-value]

Page 19:

Types of Research

Qualitative
– Phenomenology
– Grounded Theory
– Ethnography
– Description

Quantitative
– Observational
  – Descriptive: cross-sectional descriptive, prevalence survey, poll
  – Analytical:
    – Cross-sectional
    – Case-control: prevalence case-control, nested case-control, case-cohort case-control
    – Cohort: prospective cohort, retrospective cohort, ambi-spective cohort
– Experimental
  – Quasi-experimental or randomized-controlled
  – Clinical trial, field trial, community intervention trial
  – Parallel, cross-over, or factorial; fixed length or group sequential; with or without baseline

Systematic review, meta-analysis

Page 20:

Caution about biases

Selection bias
Information bias
Confounding bias

Research design:
– Prevent them
– Minimize them

Page 21:

Caution about biases

Selection bias (SB)
Information bias (IB)
Confounding bias (CB)

If data are available:
– SB & IB can be assessed
– CB can be adjusted using multivariable analysis

Page 23:

Generate a mock data set

General format of the data layout

id   y   x1   x2   x3
1
2
3
4
5
…
n
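A mock data set in this layout can be produced with a few lines of code before any real data are collected. The sketch below is one possible way to do it; the function name `make_mock_data`, the value ranges, and the choice of a binary outcome are all assumptions made for illustration.

```python
import random

def make_mock_data(n, seed=1):
    """Return n mock rows in the layout id, y, x1, x2, x3."""
    rng = random.Random(seed)  # fixed seed so the mock data are repeatable
    rows = []
    for i in range(1, n + 1):
        rows.append({
            "id": i,
            "y": rng.randint(0, 1),    # outcome (binary in this sketch)
            "x1": rng.randint(0, 1),   # a binary factor
            "x2": rng.randint(0, 90),  # a numeric factor
            "x3": rng.randint(18, 22), # another numeric factor
        })
    return rows

mock = make_mock_data(6)
for row in mock:
    print(row)
```

Running the planned analysis on such rows is a cheap check that the outcome type, the data layout, and the chosen method fit together.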

Page 24:

Generate a mock data set

Continuous outcome example

id   y    x1   x2   x3
1    2    1    21   22
2    2    0    12   19
3    0    1    4    20
4    2    0    89   21
5    14   1    0    18
…
n    6    0    45   21

Summary of y: Mean (SD)

Page 25:

Generate a mock data set

Categorical outcome example

id   y   x1   x2   x3
1    1   1    21   22
2    1   0    12   19
3    0   1    4    20
4    0   0    89   21
5    0   1    0    18
…
n    0   0    45   21

Summary of y: n, percentage
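The two summaries named on these mock tables, Mean (SD) for a numeric y and n (percentage) for a 0/1 y, can be sketched as follows. Treating the six rows shown (including the row labeled n) as the whole data set is an assumption made only for this illustration.

```python
from statistics import mean, stdev

# y columns copied from the two mock tables; six rows assumed to be all the data.
y_continuous = [2, 2, 0, 2, 14, 6]
y_binary = [1, 1, 0, 0, 0, 0]

# Continuous outcome: report Mean (SD).
mean_y = mean(y_continuous)
sd_y = stdev(y_continuous)  # sample SD, divisor n - 1

# Binary outcome: report n and percentage with y = 1.
n_events = sum(y_binary)
pct_events = 100 * n_events / len(y_binary)

print(f"Mean (SD): {mean_y:.2f} ({sd_y:.2f})")
print(f"n (%): {n_events} ({pct_events:.1f}%)")
```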

Page 27:

Common types of the statistical goals

Single measurements (no comparison)
Difference (compared by subtraction)
Ratio (compared by division)
Prediction (diagnostic test or predictive model)
Correlation (examine a joint distribution)
Agreement (examine concordance or similarity between pairs of observations)
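The difference and ratio goals use the same two numbers and differ only in the operation. A minimal sketch with hypothetical risks in a treated and a control group:

```python
# Hypothetical risks of an event in two groups.
risk_treated = 0.10
risk_control = 0.20

# Difference: compared by subtraction (an absolute effect).
risk_difference = risk_treated - risk_control

# Ratio: compared by division (a relative effect).
risk_ratio = risk_treated / risk_control

print(round(risk_difference, 2), round(risk_ratio, 2))
```

A difference of 0 and a ratio of 1 both mean "no effect", which is why the null value depends on the chosen goal.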

Page 31:

Dependency of the study outcome requires special statistical methods to handle it

Examples of dependency or correlated data:
– Before-after or pre-post design
– Measuring paired organs, e.g., ears, eyes, arms, etc.
– Longitudinal data, repeated measurements
– Clustered data, many observation units within a cluster

Choices of approaches:
– Ignore it => use ordinary analysis as if the data were independent: not safe
– Simplify it => use a summary measure, then analyze the data as if they were independent: not efficient
– Handle it => mixed model, multilevel modeling, GEE: recommended
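Why ignoring dependency is unsafe can be seen in a small before-after example. The sketch below uses hypothetical blood pressure readings and compares the standard error of the mean change when the pairing is respected with the one obtained by wrongly treating the two sets of readings as independent groups. It does not implement a mixed model or GEE.

```python
import math
from statistics import mean, stdev

# Hypothetical systolic blood pressure for the same six people.
before = [140, 152, 138, 160, 147, 155]
after = [135, 147, 134, 154, 143, 150]
n = len(before)

# Respecting the pairing: analyze the within-person differences.
diffs = [a - b for a, b in zip(after, before)]
mean_diff = mean(diffs)
se_paired = stdev(diffs) / math.sqrt(n)

# Ignoring the pairing: treat before and after as two independent groups.
se_independent = math.sqrt(stdev(before) ** 2 / n + stdev(after) ** 2 / n)

print(round(mean_diff, 2), round(se_paired, 2), round(se_independent, 2))
```

The mean change is the same either way, but the standard error, and therefore the confidence interval and p-value, changes considerably once the correlation between paired readings is taken into account.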

Page 32:

Dependency of the study outcome requires special statistical methods to handle it

Outcome type   Statistic                                  Model
Continuous     Mean, Median                               Linear Reg.
Categorical    Proportion (Prevalence or Risk)            Logistic Reg.
Count          Rate per “space”                           Poisson Reg.
Survival       Median survival, Risk of events at T(t)    Cox Reg.

For dependent (correlated) outcomes: Mixed model, multilevel model, GEE

Page 33:

Back to the conclusion

Outcome type   Statistic
Continuous     Mean, Median
Categorical    Proportion (Prevalence or Risk)
Count          Rate per “space”
Survival       Median survival, Risk of events at T(t)

Appropriate statistical methods => Magnitude of effect, 95% CI, P-value

Answer the research question based on the lower or upper limit of the CI

Page 34:

Always report the magnitude of effect and its confidence interval

Absolute effects:
– Mean, Mean difference
– Proportion or prevalence, Rate or risk, Rate or Risk difference
– Median survival time

Relative effects:
– Relative risk, Rate ratio, Hazard ratio
– Odds ratio

Other magnitude of effects:
– Correlation coefficient (r), Intra-class correlation (ICC)
– Kappa
– Diagnostic performance
– Etc.

Page 35:

Touch the variability (uncertainty) to understand statistical inference

id        A     $x - \bar{X}$   $(x - \bar{X})^2$
1         2     -2              4
2         2     -2              4
3         0     -4              16
4         2     -2              4
5         14    10              100
Sum (Σ)   20    0               128
Mean      4     0               32.0
SD        5.66
Median    2

Mean: $\bar{X} = \frac{2+2+0+2+14}{5} = \frac{20}{5} = 4$
Median: the middle value of the sorted data 0, 2, 2, 2, 14, which is 2
Variance = $SD^2$ = 32.0 (the sum of squares, 128, divided by n - 1 = 4)
Standard deviation = SD = $\sqrt{32.0}$ = 5.66

Page 36:

Touch the variability (uncertainty) to understand statistical inference

For data set A (2, 2, 0, 2, 14):
– Measure of central tendency: Mean = 4, Median = 2
– Measure of variation: SD = 5.66

Page 37:

$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$

n - 1 = Degrees of freedom

Standard deviation (SD) = The average distance between each data item and their mean

Page 38:

Same mean BUT different variation

id        A      B      C
1         2      0      4
2         2      4      3
3         0      12     5
4         2      4      4
5         14     0      4
Sum (Σ)   20     20     20
Mean      4      4      4
SD        5.66   4.90   0.71
Median    2      4      4

A: Heterogeneous data, skewed distribution
B: Heterogeneous data, symmetric distribution
C: Homogeneous data, symmetric distribution
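The three data sets can be checked directly with the sample SD formula (sum of squared deviations divided by n - 1, then the square root). The helper name `sample_sd` is an assumption; the data are the A, B, and C columns from this page.

```python
import math

def sample_sd(values):
    """Sample standard deviation with n - 1 degrees of freedom."""
    n = len(values)
    x_bar = sum(values) / n
    sum_sq = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(sum_sq / (n - 1))

data = {
    "A": [2, 2, 0, 2, 14],
    "B": [0, 4, 12, 4, 0],
    "C": [4, 3, 5, 4, 4],
}

# Map each data set to (mean, SD rounded to 2 decimals).
summary = {
    name: (sum(values) / len(values), round(sample_sd(values), 2))
    for name, values in data.items()
}
print(summary)
# → {'A': (4.0, 5.66), 'B': (4.0, 4.9), 'C': (4.0, 0.71)}
```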

Page 39:

Facts about Variation

Because of variability, repeated samples will NOT obtain the same statistic such as mean or proportion:
– Statistics vary from study to study because of the role of chance
– Hard to believe that the statistic is the parameter
– Thus we need statistical inference to estimate the parameter based on the statistics obtained from a study

Data varied widely = heterogeneous data
Heterogeneous data require a large sample size to achieve a conclusive finding

Page 40:

The Histogram

A: 2, 2, 0, 2, 14
B: 4, 3, 5, 4, 4

[Figure: histograms of A and B, each on a horizontal axis from 0 to 14]

Page 41:

The Frequency Curve

A: 2, 2, 0, 2, 14
B: 4, 3, 5, 4, 4

[Figure: frequency curves of A and B, each on a horizontal axis from 0 to 14]

Page 42:

Area Under The Frequency Curve

A: 2, 2, 0, 2, 14
B: 4, 3, 5, 4, 4

[Figure: area under the frequency curves of A and B, each on a horizontal axis from 0 to 14]

Page 43:

Central Limit Theorem

[Figure: three raw-data distributions (right skew, symmetry, left skew), labeled X1, X2, X3]

Normally distributed: the sample means $\bar{X}_1, \ldots, \bar{X}_n$, centered on their overall mean $\bar{\bar{X}}$

Page 44:

Central Limit Theorem

Distribution of the raw data: [Figure: three distributions labeled X1, X2, X3]
Distribution of the sampling mean: $\bar{X}_1, \ldots, \bar{\bar{X}}, \ldots, \bar{X}_n$

Page 45:

Central Limit Theorem

Distribution of the raw data
Distribution of the sampling mean (large sample): $\bar{X}_1, \ldots, \bar{\bar{X}}, \ldots, \bar{X}_n$

(Theoretical) Normal Distribution

Page 46:

Central Limit Theorem

Raw data: many X, with mean $\bar{X}$ and SD
Sampling means (large sample): many $\bar{X}$, with mean $\bar{\bar{X}}$ and SE
Standardized for whatever n: Mean = 0, Standard deviation = 1

Standard deviation of the sampling mean = Standard error (SE), estimated by

$SE = \frac{SD}{\sqrt{n}}$
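A short simulation can make the Central Limit Theorem and the SE formula tangible. The sketch below draws repeated samples from a right-skewed population and compares the SD of the sample means with SD divided by the square root of n. The exponential population, the seed, and the sample sizes are assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 50                   # size of each sample
n_samples = 2000         # number of repeated samples

def sample_mean():
    # Exponential population with mean 1 and SD 1 (right-skewed raw data).
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

sample_means = [sample_mean() for _ in range(n_samples)]

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)  # SD of the sampling means
theoretical_se = 1.0 / math.sqrt(n)           # SD / sqrt(n) with SD = 1

print(round(mean_of_means, 3), round(observed_se, 3), round(theoretical_se, 3))
```

Even though the raw data are skewed, the sample means cluster symmetrically around the population mean, and their spread is close to the value the SE formula predicts.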

Page 47:

(Theoretical) Normal Distribution

Page 49:

Mean ± 3SD

99.73% of AUC

Page 50:

Mean ± 2SD

95.45% of AUC

Page 51:

Mean ± 1SD

68.26% of AUC
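These areas under the normal curve can be reproduced with the error function from the standard library. The helper name `area_within` is an assumption. Rounded to two decimals the 1 SD area comes out as 68.27, so the last digit can differ from the figure quoted on the slide depending on rounding.

```python
import math

def area_within(k):
    """Fraction of a normal distribution within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

# Percentage of the area under the curve (AUC) within 1, 2, and 3 SD.
areas = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
print(areas)
# → {1: 68.27, 2: 95.45, 3: 99.73}
```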

Page 52:

Sample: n = 25, $\bar{X}$ = 52, SD = 5

From the sample to the population:
– Parameter estimation [95% CI]
– Hypothesis testing [P-value]

Page 53:

$SE = \dfrac{SD}{\sqrt{n}} = \dfrac{5}{\sqrt{25}} = \dfrac{5}{5} = 1$

Z = 2.58, Z = 1.96, Z = 1.64 (two-sided critical values for 99%, 95% and 90% confidence, respectively)
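The three Z values are the two-sided critical values of the standard Normal distribution for 99%, 95% and 90% confidence. A minimal sketch of how they can be obtained with `statistics.NormalDist` (the helper name `z_critical` is hypothetical); the slide's 1.64 corresponds to the value that is more often rounded to 1.645.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of the standard Normal for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```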

Page 54:

Sample: $n = 25$, $\bar{X} = 52$, $SD = 5$, $SE = 1$

Parameter estimation (from the sample to the population)

[95%CI]: 52 − 1.96(1) to 52 + 1.96(1) = 50.04 to 53.96
We are 95% confident that the population mean lies between 50.04 and 53.96.

Z = 2.58, Z = 1.96, Z = 1.64
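The interval on this slide can be reproduced in a few lines. This is a sketch of the Z-based calculation shown above, with a hypothetical helper name `mean_ci`; with a sample as small as n = 25, many analysts would use a t critical value instead (about 2.06 for 24 degrees of freedom), which gives a slightly wider interval.

```python
import math


def mean_ci(mean, sd, n, z=1.96):
    """Z-based confidence interval for a mean: mean ± z * SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se


low, high = mean_ci(mean=52, sd=5, n=25)
print(f"95% CI: {low:.2f} to {high:.2f}")
```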

Page 55:

Sample: $n = 25$, $\bar{X} = 52$, $SD = 5$, $SE = 1$

Hypothesis testing (from the sample to the population)

$H_0: \mu = 55$
$H_A: \mu \neq 55$

$Z = \dfrac{55 - 52}{1} = 3$

Page 56:

Hypothesis testing

$H_0: \mu = 55$
$H_A: \mu \neq 55$

If the true mean in the population is 55, the chance of obtaining a sample mean of 52, or one more extreme, is 0.0027.

$Z = \dfrac{55 - 52}{1} = 3$

P-value = 1 − 0.9973 = 0.0027

[Figure: Normal curve showing 55 and 52, with −3SE and +3SE marked.]
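The same test can be sketched in code. The function name `z_test` is hypothetical, and the statistic is written here as (sample mean − null mean) / SE, so it comes out as −3 rather than +3; the two-sided P-value is unaffected by the sign.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, null_mean, sd, n):
    """One-sample Z test; returns the Z statistic and the two-sided P-value."""
    se = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p = z_test(sample_mean=52, null_mean=55, sd=5, n=25)
print(f"Z = {z:.2f}, two-sided P-value = {p:.4f}")
```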

Page 57:

Report and interpret p-value appropriately

Example of over-reliance on the p-value:
– Real results: n = 5900; OR (Drug A vs Drug B) = 1.02 (P < 0.001)
– Inappropriate: Quote the p-value as < 0.05, or put * or **** (star) to indicate significant results
– Wrong: Drug A is highly significantly better than Drug B (P < 0.001)
– What if 95%CI: 1.001 to 1.300?
– This is not clinically meaningful at all!

Page 58:

Report and interpret p-value appropriately

Example of over-reliance on the p-value:
– Real results: n = 30; OR (Drug A vs Drug B) = 9.2 (P = 0.715)
– Inappropriate: Quote the p-value as > 0.05
– Wrong: There is no statistically significant difference in the treatment effect (P > 0.05). Thus Drug A is as effective as Drug B
– What if 95%CI: 0.99 to 28.97?
– This study indicates low power; it does NOT suggest equivalence!
– Correct: There was insufficient information to conclude that . . . => inconclusive findings

Page 59:

P-value is the magnitude of chance, NOT the magnitude of effect

P-value < 0.05 = Significant findings
Small chance of being wrong in rejecting the null hypothesis
If in fact there is no [effect], it is unlikely to get the [effect] = [magnitude of effect] or more extreme
Significance DOES NOT MEAN importance
Any extra-large study can give a very small P-value even if the [magnitude of effect] is very small

Page 60:

P-value is the magnitude of chance, NOT the magnitude of effect

P-value > 0.05 = Non-significant findings
High chance of being wrong in rejecting the null hypothesis
If in fact there is no [effect], the [effect] = [magnitude of effect] or more extreme can occur by chance
Non-significance DOES NOT MEAN no difference, equal, or no association
Any small study can give a very large P-value even if the [magnitude of effect] is very large
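The dependence of the P-value on sample size, rather than on the magnitude of effect alone, can be illustrated numerically. The sketch below assumes a two-sample Z test for a difference in means expressed in SD units, with equal group sizes; the effect sizes, sample sizes and the helper name `two_sided_p` are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def two_sided_p(effect_in_sd, n_per_group):
    """Two-sided P-value of a two-sample Z test for a mean difference in SD units."""
    se = math.sqrt(2 / n_per_group)   # SE of the difference when each group has SD = 1
    z = effect_in_sd / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# A very small effect becomes "significant" once the study is large enough.
for n in (50, 5_000, 500_000):
    print(f"effect = 0.02 SD, n = {n:>7} per group: P = {two_sided_p(0.02, n):.4f}")

# A large effect can remain "non-significant" in a very small study.
print(f"effect = 0.80 SD, n =       5 per group: P = {two_sided_p(0.80, 5):.4f}")
```

The effect is identical in the first three lines; only the sample size, and therefore the P-value, changes.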

Page 61:

P-value vs. 95%CI (1)

An example of a study with a dichotomous outcome

A study compared the cure rate between Drug A and Drug B.

Setting:
Drug A = Alternative treatment
Drug B = Conventional treatment

Results:
Drug A: n1 = 50, Pa = 80%
Drug B: n2 = 50, Pb = 50%

Pa − Pb = 30% (95%CI: 26% to 34%; P = 0.001)
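For readers who want to see how a confidence interval for a difference in proportions is usually obtained, here is a minimal sketch assuming a simple Wald (Normal-approximation) interval for two independent proportions; the function name `risk_difference_ci` is hypothetical. Applied to the group sizes and cure rates on the slide, this approach gives a noticeably wider interval than the one quoted there, so the quoted limits are best read as illustrative rather than as the output of this formula.

```python
import math


def risk_difference_ci(p1, n1, p2, n2, z=1.96):
    """Wald confidence interval for the difference between two independent proportions."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


diff, low, high = risk_difference_ci(p1=0.80, n1=50, p2=0.50, n2=50)
print(f"Pa - Pb = {diff:.0%} (95% CI: {low:.1%} to {high:.1%})")
```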

Page 62:

P-value vs. 95%CI (2)

Pa − Pb = 30% (95%CI: 26% to 34%; P < 0.05)

[Figure: the difference and its 95%CI shown on an axis with "Pa > Pb" on one side and "Pb > Pa" on the other.]

Page 63:

P-value vs. 95%CI (3)

Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford. 1994. page 99

Page 64:

Tips #6 (b) P-value vs. 95%CI (4)

Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford. 1994. page 99

There was a statistically significant difference between the two groups.

Page 65:

Tips #6 (b) P-value vs. 95%CI (5)

Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford. 1994. page 99

There was no statistically significant difference between the two groups.

Page 66:

P-value vs. 95%CI (4)

Safe tips:
– Always report the 95%CI with the p-value; do NOT report the p-value alone
– Always interpret based on the lower or upper limit of the confidence interval; the p-value is optional
– Never interpret a p-value > 0.05 as an indication of no difference or no association; only the CI can provide this message
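One way to make the "interpret the limits of the CI" advice concrete is sketched below. It is an illustration only: the function name `interpret_ratio_ci`, the wording of the returned messages, and above all the threshold `important` (the smallest ratio regarded as clinically meaningful, here 1.5) are hypothetical, study-specific assumptions. The sketch only considers effects above the null value of a ratio measure such as an OR.

```python
def interpret_ratio_ci(low, high, important, null=1.0):
    """Read a ratio's 95% CI against the null value and a clinical-importance threshold."""
    if low > null:                       # whole interval lies above the null value
        if high < important:
            return "significant but not clinically meaningful"
        return "significant; effect may be clinically meaningful"
    if high >= important:                # compatible with no effect and with a large effect
        return "inconclusive"
    return "no clinically meaningful increase"


# The two "What if" intervals from the earlier slides, with an assumed threshold of 1.5.
print(interpret_ratio_ci(1.001, 1.300, important=1.5))
print(interpret_ratio_ci(0.99, 28.97, important=1.5))
```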

Page 67:

Steps of Statistical Applications
(Practical guides for beginners)

Begin at the conclusion
Identify the primary research question
Identify the primary study outcome
Identify type of the study outcome
Identify type of the study design
Generate a mock data set
Identify type of the main statistical goal
List choices of the statistical methods
Select the most appropriate statistical method
Perform the data analysis using a software
Report and interpret the results from the outputs

Page 68:

The outcome determines the statistics

Continuous: Mean, Median; Linear Reg.
Categorical: Proportion (Prevalence or Risk); Logistic Reg.
Count: Rate per “space”; Poisson Reg.
Survival: Median survival, Risk of events at T(t); Cox Reg.
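The mapping on this slide can be written down as a small lookup table, which is sometimes handy when drafting an analysis plan. The dictionary and function names below are hypothetical; the entries simply transcribe the slide.

```python
# Hypothetical lookup table transcribing the slide: outcome type -> summary and model.
OUTCOME_GUIDE = {
    "continuous": {"summary": ["mean", "median"],
                   "model": "linear regression"},
    "categorical": {"summary": ["proportion (prevalence or risk)"],
                    "model": "logistic regression"},
    "count": {"summary": ['rate per "space"'],
              "model": "Poisson regression"},
    "survival": {"summary": ["median survival", "risk of events at T(t)"],
                 "model": "Cox regression"},
}


def suggest(outcome_type):
    """Return the summary measures and regression model listed for an outcome type."""
    key = outcome_type.strip().lower()
    if key not in OUTCOME_GUIDE:
        raise ValueError(f"unknown outcome type: {outcome_type!r}")
    return OUTCOME_GUIDE[key]


print(suggest("Count"))
```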

Page 69:

Dependency of the study outcome requires special statistical methods to handle it

Continuous: Mean, Median; Linear Reg.
Categorical: Proportion (Prevalence or Risk); Logistic Reg.
Count: Rate per “space”; Poisson Reg.
Survival: Median survival, Risk of events at T(t); Cox Reg.

Mixed model, multilevel model, GEE

Page 70:

Back to the conclusion

Continuous: Mean, Median
Categorical: Proportion (Prevalence or Risk)
Count: Rate per “space”
Survival: Median survival, Risk of events at T(t)

Appropriate statistical methods

Magnitude of effect, 95% CI, P-value

Answer the research question based on the lower or upper limit of the CI

Page 71:

Steps of Statistical Applications
(Practical guides for beginners)

Begin at the conclusion
Identify the primary research question
Identify the primary study outcome
Identify type of the study outcome
Identify type of the study design
Generate a mock data set
Identify type of the main statistical goal
List choices of the statistical methods
Select the most appropriate statistical method
Perform the data analysis using a software
Report and interpret the results from the outputs

Page 72:

Perform the data analysis using software

Use the generated (mock) data as if they were the data obtained after completion of the research
Analyze according to the analysis plan
Try to understand the computer output and find out whether the research question has been answered:
– What is the magnitude of effect and its 95% confidence interval?
– Were the results due to chance?
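A mock data set can be generated and analysed in a few lines before any real data exist. The sketch below is an illustration under assumed values (two groups of 100, means 55 and 52, SD 5, an arbitrary seed, and a Z-based interval); all names are hypothetical. It reports the magnitude of effect with its 95% confidence interval, which is the form of answer the research question needs.

```python
import math
import random
import statistics

rng = random.Random(72)  # arbitrary seed so the mock data are reproducible

# Mock data: assumed group means and SD, NOT real study results.
group_a = [rng.gauss(55.0, 5.0) for _ in range(100)]
group_b = [rng.gauss(52.0, 5.0) for _ in range(100)]


def mean_difference_ci(a, b, z=1.96):
    """Difference in means between two independent groups with a Z-based CI."""
    diff = statistics.fmean(a) - statistics.fmean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff, diff - z * se, diff + z * se


diff, low, high = mean_difference_ci(group_a, group_b)
print(f"Mean difference = {diff:.2f} (95% CI: {low:.2f} to {high:.2f})")
```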

Page 73:

Steps of Statistical Applications
(Practical guides for beginners)

Begin at the conclusion
Identify the primary research question
Identify the primary study outcome
Identify type of the study outcome
Identify type of the study design
Generate a mock data set
Identify type of the main statistical goal
List choices of the statistical methods
Select the most appropriate statistical method
Perform the data analysis using a software
Report and interpret the results from the outputs

Page 74:

Writing Results Sections

Outline sections:
– Study algorithm
– Characteristics of the study sample
– Results of an exploratory analysis to support ways to answer the research question (RQ)
– Results to answer the RQ
– Results of an exploratory analysis to know more about the answer to the RQ
Follow the formats required by the research sponsor or the target journal
Best done with a SAP (Statistical Analysis Plan)
Narrate tables or figures with key messages and avoid repetition
Do not include explanations in the Results section

Page 75:

Tips for Writing Results Section

Report results with purpose
Refer to the corresponding table or figure early, at the beginning of the description
Report sufficient data to allow evaluation of the calculation while avoiding redundancy
Document the steps of data analysis from which the results were transcribed
Provide statistical inference for the main findings that are the basis for the conclusions
Always report the confidence intervals; the p-value is optional, not the main target
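A small helper can keep the reporting format consistent, with the confidence interval always present and the p-value optional. This is a sketch with a hypothetical function name and formatting convention; the numbers are the "What if" examples from the earlier slides, and the value 0.0001 merely stands in for "P < 0.001".

```python
def format_result(label, estimate, low, high, p_value=None, digits=2):
    """Format an estimate with its 95% CI; the p-value is appended only if given."""
    text = f"{label} = {estimate:.{digits}f} (95% CI: {low:.{digits}f} to {high:.{digits}f}"
    if p_value is not None:
        text += "; P < 0.001" if p_value < 0.001 else f"; P = {p_value:.3f}"
    return text + ")"


print(format_result("OR", 9.2, 0.99, 28.97, p_value=0.715))
print(format_result("OR", 1.02, 1.001, 1.300, p_value=0.0001, digits=3))
print(format_result("OR", 9.2, 0.99, 28.97))
```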

Page 76:

Q & A

Thank you