Slide 1
Measures of dispersion
Lecture by Dr Zahid Khan, King Faisal University, KSA.
Slide 2
Learning Objectives
• Calculate common measures of dispersion from grouped and ungrouped data (including the range, interquartile range, mean deviation, and standard deviation)
• Calculate and interpret the coefficient of variation
Slide 3
What are measures of dispersion? (Definition)
Central tendency measures do not reveal the variability present in the data.
Dispersion is the scatter of a data series around its average.
Dispersion is the extent to which values in a distribution differ from the average of the distribution.
Slide 4
Why do we need measures of dispersion? (Significance)
• Determine the reliability of an average
• Serve as a basis for the control of variability
• Compare the variability of two or more series
• Facilitate the use of other statistical measures
Slide 5
Dispersion Example
Number of minutes 20 clients waited to see a consulting doctor
Consultant Doctor X: 05 15 12 03 04 19 37 11 06 34
Consultant Doctor Y: 15 16 12 18 15 14 13 17 11 15
X: Mean waiting time 14.6 minutes
Y: Mean waiting time 14.6 minutes
What is the difference between the two series?
X: High variability, less consistency. Y: Low variability, more consistency.
Slide 6
Characteristics of an Ideal Measure of Dispersion
1. It should be rigidly defined.
2. It should be easy to understand and easy to calculate.
3. It should be based on all the observations of the data.
4. It should be easily subjected to further mathematical treatment.
5. It should be least affected by sampling fluctuation.
6. It should not be unduly affected by extreme values.
Slide 7
Measures of Dispersion
There are three main measures of dispersion:
– The range
– The interquartile range (IQR)
– Variance / standard deviation
Slide 8
Measures of dispersion
Range
The range is defined as the difference between the largest score in the set of data and the smallest score in the set of data, XL - XS.
• Sensitive to extreme scores
• Compensate by calculating the interquartile range (the distance between the 25th and 75th percentile points), which represents the range of scores for the middle half of a distribution
• Usually used in combination with other measures of dispersion
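A short Python sketch of the range and its sensitivity to one extreme score. The scores are hypothetical, chosen only for illustration, and `value_range` is an illustrative name.

```python
def value_range(scores):
    """Range = largest score minus smallest score (XL - XS)."""
    return max(scores) - min(scores)

# Hypothetical scores, not taken from the lecture.
scores = [8, 10, 12, 14, 16]
with_outlier = scores + [60]

print(value_range(scores))        # 8
print(value_range(with_outlier))  # 52
```

Adding a single extreme value changes the range from 8 to 52, although the other five scores are unchanged.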
Slide 9
Range
Source: www.animatedsoftware.com/statglos/sgrange.htm
Slide 10
Interquartile range
• Measures the range of the middle 50% of the values only
• Is defined as the difference between the upper and lower quartiles
Interquartile range = upper quartile - lower quartile = Q3 - Q1
Slide 11
The Semi-Interquartile Range
The semi-interquartile range (or SIR) is defined as the difference between the third and first quartiles divided by two
– The first quartile is the 25th percentile
– The third quartile is the 75th percentile
SIR = (Q3 - Q1) / 2
Slide 12
SIR Example
What is the SIR for the data below?
Data: 2, 4, 6, 8, 10, 12, 14, 20, 30, 60
• 25% of the scores are below 5
– 5 is the first quartile (25th percentile)
• 25% of the scores are above 25
– 25 is the third quartile (75th percentile)
SIR = (Q3 - Q1) / 2 = (25 - 5) / 2 = 10
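The arithmetic of this example can be checked with a short Python sketch. The quartiles 5 and 25 are the ones stated on the slide; the data list assumes the slide's column reads 2, 4, 6, 8, 10, 12, 14, 20, 30, 60. Quartile conventions differ between textbooks and software, so a library routine such as `statistics.quantiles` may return different quartiles for the same data. `semi_interquartile_range` is an illustrative name.

```python
from statistics import quantiles

def semi_interquartile_range(q1, q3):
    """SIR = (Q3 - Q1) / 2."""
    return (q3 - q1) / 2

# Quartiles as stated on the slide.
print(semi_interquartile_range(5, 25))  # 10.0

# Assumed reading of the slide's data column.
data = [2, 4, 6, 8, 10, 12, 14, 20, 30, 60]

# Python's 'inclusive' interpolation rule; other conventions give other quartiles.
q1, _median, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q3, q3 - q1, semi_interquartile_range(q1, q3))
```

Whichever convention is used, the interquartile range and SIR ignore the extreme score of 60, which only affects the top quarter of the data.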
Slide 13
When To Use the SIR
The SIR is often used with skewed data as it is insensitive to the extreme scores
Slide 14
The mean deviation
Measures the ‘average’ distance of each observation away from the mean of the data
Gives an equal weight to each observation
Generally more sensitive than the range or interquartile range, since a change in any value will affect it
Slide 15
Actual and absolute deviations from mean
A set of x values has a mean of $\bar{x}$
The residual of a particular x-value is:
Residual or deviation = $x - \bar{x}$
The absolute deviation is:
$|x - \bar{x}|$
Slide 16
Mean deviation
The mean of the absolute deviations
$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$
Slide 17
To calculate mean deviation
1. Calculate the mean of the data: find $\bar{x}$
2. Subtract the mean from each observation and record the differences: for each x, find $x - \bar{x}$
3. Record the absolute value of each residual: find $|x - \bar{x}|$ for each x
4. Calculate the mean of the absolute values: add up the absolute values and divide by n
$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$
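The four steps can be sketched in Python as follows. The observations are hypothetical, and `mean_deviation` is an illustrative name rather than a library function.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n                   # step 1: mean of the data
    residuals = [x - mean for x in values]   # step 2: x minus the mean
    absolute = [abs(r) for r in residuals]   # step 3: absolute values
    return sum(absolute) / n                 # step 4: mean of absolute values

# Hypothetical observations, chosen only for illustration.
observations = [2, 4, 6, 8, 10]
print(mean_deviation(observations))  # 2.4
```

Here the mean is 6, the absolute deviations are 4, 2, 0, 2, 4, and their mean is 12 / 5 = 2.4.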
Slide 18
The standard deviation
• Measures the variation of observations from the mean
• The most common measure of dispersion
• Takes into account every observation
• Measures the ‘average deviation’ of observations from the mean
• Works with squares of residuals, not absolute values, so it is easier to use in further calculations
Slide 19
Standard deviation of a population σ
Every observation in the population is used.
$\text{Standard deviation } \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
The square of the population standard deviation is called the variance.
$\text{Variance} = \sigma^2$
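A minimal Python sketch of the population formulas, dividing by n. The data are hypothetical; `population_variance` is an illustrative name, while `pstdev` and `pvariance` are the standard library's population versions, used here only as a cross-check.

```python
from math import sqrt
from statistics import pstdev, pvariance

def population_variance(values):
    """Sum of squared deviations from the mean, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

# Hypothetical population, chosen only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(values)
print(variance, sqrt(variance))  # 4.0 2.0
print(pvariance(values), pstdev(values))
```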
Slide 20
Standard deviation of a sample s
In practice, most populations are very large and it is more common to calculate the sample standard deviation.
$\text{Sample standard deviation } s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Where: n is the number of observations in the sample, so the sum of squares is divided by (n-1)
Slide 21
To calculate standard deviation
1. Calculate the mean $\bar{x}$
2. Calculate the residual for each x: $x - \bar{x}$
3. Square the residuals: $(x - \bar{x})^2$
4. Calculate the sum of the squares: $\sum (x - \bar{x})^2$
5. Divide the sum in Step 4 by (n-1): $\frac{\sum (x - \bar{x})^2}{n - 1}$
6. Take the square root of the quantity in Step 5: $\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
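The six steps can be sketched in Python and compared with the standard library's `statistics.stdev`, which also divides by (n-1). The sample is hypothetical and `sample_standard_deviation` is an illustrative name.

```python
from math import sqrt
from statistics import stdev

def sample_standard_deviation(values):
    """Sample standard deviation, following the six steps one by one."""
    n = len(values)
    mean = sum(values) / n                   # step 1: mean
    residuals = [x - mean for x in values]   # step 2: residuals
    squares = [r ** 2 for r in residuals]    # step 3: squared residuals
    total = sum(squares)                     # step 4: sum of squares
    return sqrt(total / (n - 1))             # steps 5 and 6

# Hypothetical sample, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
s = sample_standard_deviation(sample)
print(round(s, 4), round(stdev(sample), 4))  # 2.1381 2.1381
```

For this sample the sum of squares is 32, so s is the square root of 32 / 7.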
Slide 22
Uses of the standard deviation
– The standard deviation enables us to determine, with a great deal of accuracy, where the values of a frequency distribution are located in relation to the mean. We can do this according to a theorem devised by the Russian mathematician P.L. Chebyshev (1821-1894).
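Chebyshev's theorem states that, for any distribution, at least 1 - 1/k² of the values lie within k standard deviations of the mean, for any k > 1. A minimal Python sketch, using hypothetical data and the population standard deviation, compares that lower bound with the observed share. The function names are illustrative.

```python
from statistics import mean, pstdev

def chebyshev_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def share_within(values, k):
    """Observed share of values within k population SDs of the mean."""
    centre, sd = mean(values), pstdev(values)
    inside = [x for x in values if abs(x - centre) <= k * sd]
    return len(inside) / len(values)

# Hypothetical data, chosen only for illustration.
values = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
for k in (1.5, 2, 3):
    print(k, round(chebyshev_bound(k), 3), share_within(values, k))
```

The observed share is never below the bound; for many data sets it is well above it, because the theorem makes no assumption about the shape of the distribution.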
Slide 23
Coefficient of variation
Is a measure of relative variability used to:
– measure changes that have occurred in a population over time
– compare the variability of two populations that are expressed in different units of measurement
It is expressed as a percentage rather than in terms of the units of the particular data.
Slide 24
Formula for coefficient of variation
Denoted by V
$V = \frac{s}{\bar{x}} \times 100\%$
where $\bar{x}$ = the mean of the sample, s = the standard deviation of the sample
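A short Python sketch of the formula, comparing two samples measured in different units. Both samples are hypothetical, and `coefficient_of_variation` is an illustrative name.

```python
from statistics import mean, stdev

def coefficient_of_variation(sample):
    """V = (s / sample mean) * 100, expressed as a percentage."""
    return stdev(sample) / mean(sample) * 100

# Hypothetical samples in different units, chosen only for illustration.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

print(f"heights: {coefficient_of_variation(heights_cm):.1f}%")
print(f"weights: {coefficient_of_variation(weights_kg):.1f}%")
```

The two samples have the same standard deviation in their own units, but the weights vary more relative to their mean, so their coefficient of variation is larger.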
Slide 25
Summary
Measures of Dispersion
– No ideal measure of dispersion exists
– Standard deviation is the most important measure of dispersion
• it is the most frequently used
• the value is affected by the value of every observation in the data
• extreme values in the population may distort the data