N318b Winter 2002 Nursing Statistics
Lecture 3: Data presentation, distributions and data handling
NOTE: Normal distribution deferred to next week
Dec 27, 2015
Nur 318b 2002 Lecture 3: page 2
School of Nursing
Institute for Work & Health
Today’s Class(es)
• Data presentations: bar graphs, pie charts, histograms, lines
• Skewness and kurtosis
<< 10 min break >>
• Applying knowledge to assigned readings (Kilpack et al.; Paulsen & Altmaier)
• Followed by small groups from 12-2 PM (focuses on preparing a histogram and interpreting the normal distribution)
A Quick Review from Last Week
• Measures of central tendency: mean, median, mode
• Dispersion: standard deviation, range
Presentation of Data
What are the major options available for descriptive data presentations?
“A picture tells a thousand words”
• The meaning of this premise takes on additional value in research studies, where you are usually VERY strictly limited in publication space (e.g. 2500 words, 3 tables OR figures)
Types of presentations
Know defining features and advantages/disadvantages
• Bar graphs
• Pie charts
• Histograms
• Polygons (and line graphs)
• Box plots
Why use figures?
• Visual impressions easier to convey
• “Meaning” of data usually more apparent
• Create a stronger, longer-lasting impression
“Most effective way to describe, explore, and summarize a set of numbers” (class textbook, pg. 8)
When are tables a better choice?
Bar Graphs - notes
Most appropriate for comparing categories within nominal or ordinal data (e.g. countries)
• Usually used for comparing distinct groups
• Good for comparing %’s across groups
• Gaps left between bars used to indicate non-numeric nature of underlying data
Bar Graphs – example 1
[Figure: bar chart, “Back and/or neck pain frequency in past week”. Y-axis: Percent (0 to 50). Pain status categories: None/Low, Moderate, High. Bar labels: 30.8, 25.3, 43.9]
• 1 in 4 nurses have pain most or all of the time
Bar Graphs – example 2
Horizontal bar chart
Bar Graphs – example 3
Multi-group or cluster bar chart
Pie Charts - notes
Most appropriate for comparing only a few categories within nominal or ordinal data
• Similar to bar chart for usage
• Gives relative impression of group size
• Good for comparing %’s across groups
• Sometimes less intuitive for readers
• Can sometimes be “exploded”
Pie Charts – example 1
Basic pie chart
[Figure: pie chart, “Sales by Quarter”. Legend: 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr. Slice labels: 13%, 17%, 57%, 13%]
Pie Charts – example 2
Exploded pie chart
[Figure: exploded pie chart, “Sales by Quarter”. Legend: 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr. Labels shown: 13%, 17%, 57%, 13%, 70%]
Histograms - notes
Most appropriate for examining distribution of interval or ratio data (e.g. age groups, BP)
• Usually used for illustrating shape of underlying distribution (e.g. skewed?)
• Histograms have a total area of 100%
• Bars typically placed side by side to show numeric (continuous) nature of data
• Selecting number of bars and ratio of width to height are important points
Histograms – example 1
Choosing size of groups makes a clear difference!
Usually choose a convenient group size (e.g. 5 or 10)
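The effect of group size can be tried out with a short script. This is only a sketch: the ages are invented for illustration, and the helper names `bin_counts` and `bin_percents` are hypothetical. It counts the same sample in groups of 5 and then groups of 10 and prints a rough text histogram for each.

```python
from collections import Counter


def bin_counts(values, width):
    """Count how many values fall in each group [lower, lower + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


def bin_percents(values, width):
    """Express each group as a percent of the sample (bars total 100%)."""
    n = len(values)
    return {lower: 100 * c / n for lower, c in bin_counts(values, width).items()}


# Hypothetical ages, invented for illustration only
ages = [21, 23, 27, 34, 35, 38, 41, 42, 47, 49, 52, 58]

for width in (5, 10):
    print(f"Group size {width}")
    for lower, count in bin_counts(ages, width).items():
        # One '#' per person in the group
        print(f"  {lower}-{lower + width - 1}: {'#' * count}")
```

With groups of 5 the picture looks nearly flat; with groups of 10 a slight peak in the 40s shows up. Same data, different impression.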
Histograms – example 2
For continuous data usually write data labels to the side
How would you describe these data?
Line graphs - Polygons
Most appropriate for ratio or interval data when comparing shapes of two distributions
• Can sometimes give better visual impression of distribution (i.e. smoother)
• Good for comparing % distributions between two (or three) samples
• Area under curve same as for histogram (i.e. 100% of sample)
Polygons – example 1
Histograms and polygons are really just the same thing
Polygons – example 2
Polygons are especially useful when comparing distributions
Measures of Skew
In the previous class we saw that the mean, median and mode of “real” data points rarely align perfectly; the degree of asymmetry is referred to as skew
[Figure: distribution curve with the mean, median and mode marked]
This sample is “left” skewed as the mean < median
Measures of Skew
Pearson’s skewness coefficient
Skewness = (mean – median)/SD
• Puts a numeric value to asymmetry
• Range for measure is about -1 to +1 (the mean and median differ by at most 1 SD)
• Values beyond +/- 0.2 indicate skew, with more extreme values indicating more skew
Measures of Skew
Sample calculation – marks from first assignment

                        Mean   Median   SD     Skew
Before removing zeros   8.5    9        1.92   -0.26
After removing zeros    8.9    9        0.67   -0.15
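The two skew values can be checked by plugging the summary statistics for the assignment marks into Skewness = (mean – median)/SD. The sketch below does that; the function names are hypothetical, and the second helper (working from raw data with the sample SD) is an added convenience, not something from the slides.

```python
import statistics


def pearson_skew(mean, median, sd):
    """Pearson's skewness coefficient: (mean - median) / SD."""
    return (mean - median) / sd


def pearson_skew_from_data(values):
    """Same coefficient computed from raw data, using the sample SD."""
    return pearson_skew(
        statistics.mean(values),
        statistics.median(values),
        statistics.stdev(values),
    )


# Summary statistics for the marks from the first assignment
before = pearson_skew(8.5, 9, 1.92)   # before removing zeros
after = pearson_skew(8.9, 9, 0.67)    # after removing zeros

print(f"Before removing zeros: {before:.2f}")
print(f"After removing zeros:  {after:.2f}")
```

Both values are negative because the mean sits below the median in each case (left skew). Removing the zeros brings the coefficient inside the +/- 0.2 guideline.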
What if data are Skewed?
• Sometimes removing influential observations is enough (marks example)
• Transformations may be required:
  – moderate skew: square root
  – substantial skew: log transformation
  – severe skew: inverse transformation
• Negative skewness corrected first by “reflecting” data (i.e. create mirror images)
• Dichotomizing (Y/N) possible (easiest!)
Any problems with transformations?
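A minimal sketch of these transformations follows. The function names and the scores are hypothetical. The reflection rule used here (new value = largest value + 1 – old value) is one common convention and is an assumption, not something stated on the slide.

```python
import math


def reflect(values):
    """Mirror the data so negative skew becomes positive skew.

    Assumed convention: new value = (max + 1) - old value.
    """
    top = max(values) + 1
    return [top - v for v in values]


def sqrt_transform(values):
    """For moderate positive skew; values must be >= 0."""
    return [math.sqrt(v) for v in values]


def log_transform(values):
    """For substantial positive skew; values must be > 0."""
    return [math.log10(v) for v in values]


def inverse_transform(values):
    """For severe positive skew; values must be non-zero."""
    return [1 / v for v in values]


# Hypothetical negatively skewed scores, invented for illustration
scores = [2, 7, 8, 9, 9, 10, 10, 10]

reflected = reflect(scores)  # now positively skewed, smallest value is 1
print(reflected)
print([round(v, 2) for v in sqrt_transform(reflected)])
print([round(v, 2) for v in log_transform(reflected)])
```

One problem shows up straight away: after reflecting, high original scores become low transformed scores, so the direction of any result has to be read in reverse. The transformed units are also harder to explain to readers.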
Measure of Kurtosis
While skew refers to whether the distribution “leans” to one side or not, kurtosis refers to how “bell-shaped” it is
Indicates whether distribution is too peaked or too flat
• No easy calculation (use computer)
• Values of zero indicate bell-shape
• Large +’ve values: data too peaked
• Large -’ve values: data too flat
• Done only if skew measure is OK
Measure of Kurtosis
Example – marks from first assignment

                        Mean   Median   SD     Skew    Kurtosis
Before removing zeros   8.5    9        1.92   -0.26   14.6
After removing zeros    8.9    9        0.67   -0.15   -0.14
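Since kurtosis is left to the computer, here is a rough sketch of what the computer does. It uses the simple moment-based formula (fourth moment divided by the squared variance, minus 3), which is zero for a bell-shaped curve. Statistical packages usually add a small-sample correction, so this will not reproduce the class values exactly. The marks below are invented for illustration and are not the class data.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3.

    Zero for a normal curve, positive when too peaked / heavy-tailed,
    negative when too flat. No small-sample correction is applied.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


# Hypothetical marks out of 10, invented for illustration
with_zero = [0, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 10]
without_zero = [m for m in with_zero if m != 0]

print(round(excess_kurtosis(with_zero), 2))
print(round(excess_kurtosis(without_zero), 2))
```

A single zero among otherwise similar marks is enough to push the value well above zero, and dropping it brings the value below zero. This is the same pattern as the before/after comparison for the assignment marks.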
10 minute break !
Box plots - notes
Graphical display using descriptive statistics based on percentiles (box size = IQR)
• Usually used for comparing several groups
• Good for identifying outlying values
• Line in box indicates median (at centre only if data are evenly distributed)
• Box ends are 25th and 75th %’iles (IQR)
• “Whisker” ends are min/max values that are not outliers
Box plots – example 1
[Figure: annotated box plot]
• Extreme outlier: more than 3 x IQR from box edge
• Minor outlier: more than 1.5 x IQR (up to 3 x IQR) from box edge
• Lower whisker: farthest non-outlier from box edge
• IQR = 75th - 25th %’iles
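The box plot rules can be sketched in a few lines. The data and the function name `box_plot_summary` are hypothetical. Two assumptions are made: values more than 1.5 x IQR beyond a box edge count as minor outliers and those more than 3 x IQR as extreme, and quartiles use the "inclusive" method of Python's `statistics.quantiles`. Software packages differ slightly in how they compute quartiles, so the box edges may not match other programs exactly.

```python
import statistics


def box_plot_summary(values):
    """Return box edges, IQR, whisker ends and outliers for one group."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    def distance(v):
        """How far a value lies beyond the nearer box edge (0 if inside the box)."""
        return max(q1 - v, v - q3, 0)

    extreme = sorted(v for v in values if distance(v) > 3 * iqr)
    minor = sorted(v for v in values if 1.5 * iqr < distance(v) <= 3 * iqr)
    inside = [v for v in values if distance(v) <= 1.5 * iqr]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whiskers": (min(inside), max(inside)),  # farthest non-outliers
        "minor": minor,
        "extreme": extreme,
    }


# Hypothetical data, invented for illustration
data = [2, 3, 4, 5, 6, 7, 8, 9, 10, 18, 40]
summary = box_plot_summary(data)
print(summary)
```

Here the box runs from 4.5 to 9.5 (IQR = 5), the whiskers stop at 2 and 10, the value 18 is a minor outlier and 40 is an extreme outlier.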
Box plots – example 2
Box plots give a nice visual summary when comparing group distributions but are typically less intuitive than histograms
Dealing with Outliers
• First need to identify them (usually using the box plot method)
• Are they “real” or errors?
• If “real” then can do sensitivity-type analysis with/without them to show influence on final results (e.g. trimmed mean)
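A trimmed mean is one simple way to run that with/without comparison. The sketch below drops a fixed proportion of observations from each end before averaging. The marks are invented for illustration, and trimming 10% per end is an arbitrary choice.

```python
import statistics


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from EACH end.

    `proportion` should be below 0.5 so that some data remain.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number dropped per end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)


# Hypothetical marks out of 10, invented for illustration
marks = [0, 8, 9, 9, 9, 9, 9, 9, 10, 10]

print("Ordinary mean:   ", statistics.mean(marks))
print("10% trimmed mean:", trimmed_mean(marks, 0.10))
```

Reporting both values (8.2 and 9 here) shows the reader how much a single low mark pulls the ordinary mean down.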
Dealing with Missing Data
• First need to identify reasons
  – Design flaw, e.g. confusing or sensitive questions
• Is pattern random (OK) or systematic (more of a problem)?
• Options include listwise or pairwise deletions or imputations (see pages 54-55)
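The three options can be compared on a toy data set. Everything below is hypothetical (records, variable names and function names), and mean imputation is used only as the simplest example of imputation.

```python
import statistics

# Hypothetical survey records; None marks a missing answer
records = [
    {"age": 34, "pain": 3, "years": 10},
    {"age": 41, "pain": None, "years": 15},
    {"age": None, "pain": 5, "years": 8},
    {"age": 29, "pain": 2, "years": None},
    {"age": 50, "pain": 4, "years": 25},
]


def listwise(rows):
    """Listwise deletion: keep only respondents with no missing values at all."""
    return [r for r in rows if None not in r.values()]


def pairwise(rows, var_a, var_b):
    """Pairwise deletion: keep respondents who answered both variables in one analysis."""
    return [r for r in rows if r[var_a] is not None and r[var_b] is not None]


def impute_mean(rows, var):
    """Imputation: replace missing values of `var` with the mean of the observed values."""
    observed = [r[var] for r in rows if r[var] is not None]
    fill = statistics.mean(observed)
    return [{**r, var: fill if r[var] is None else r[var]} for r in rows]


print(len(listwise(records)), "complete cases")
print(len(pairwise(records, "age", "pain")), "cases usable for age vs pain")
print([r["pain"] for r in impute_mean(records, "pain")])
```

Listwise deletion leaves only 2 of 5 respondents, pairwise deletion keeps 3 for the age vs pain comparison, and imputation keeps all 5 but fills one pain score with an estimate (3.5).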
Part 2: Application to the Assigned Readings
Kilpack et al. (1991)
Quick summary of the paper:
– an intervention study aimed at decreasing falls in the elderly within medical-surgical specialty units
– used rest of hospital as the control group
– pre/post quasi-experimental design
Some questions …
How would you describe this distribution?
What might influence these numbers (i.e. bias)?
Paulsen & Altmaier (1995)
Quick summary of the paper:
– examined effects of spousal support on pain behaviour
– used chronic LBP patients and spouses obtained via a chronic pain clinic
– looked at pain behaviours with and without spouse being present
– observed a significant spouse effect in both positive and negative directions
A question …
Q1. Which gives you a stronger impression - Table III or Figure 1?
Demonstrates the interpretation advantage gained by graphing data!
Next Week - Lecture 4: Normal curve, probability, statistical inference, and hypothesis testing
For next week’s class please review:
1. Page 14 in syllabus
2. Textbook Chapter 3, pages 63-80
NO group work next week!!
“In Group” Session – Q#1:
Please note: There is an error in the syllabus – mean value should be 92.7 not 90.9
Did not cover last assignment question in lecture but please do it anyway …
The (Standard) Normal Curve
The Normal Distribution