Measures of dispersion

Slide 1

Lecture by Dr Zahid Khan, King Faisal University, KSA.


Slide 2

Learning Objectives

• Calculate common measures of dispersion from grouped and ungrouped data (including the range, interquartile range, mean deviation, and standard deviation)

• Calculate and interpret the coefficient of variation


Slide 3

What are measures of dispersion? (Definition)

Central tendency measures do not reveal the variability present in the data.

Dispersion is the scatteredness of a data series around its average.

Dispersion is the extent to which values in a distribution differ from the average of the distribution.

Slide 4

Why do we need measures of dispersion? (Significance)

Determine the reliability of an average

Serve as a basis for the control of variability

Compare the variability of two or more series

Facilitate the use of other statistical measures

Slide 5

Dispersion Example

Number of minutes 20 clients waited to see a consulting doctor

Consultant Doctor X: 05 15 12 03 04 19 37 11 06 34

Consultant Doctor Y: 15 16 12 18 15 14 13 17 11 15

X: Mean waiting time = 14.6 minutes

Y: Mean waiting time = 14.6 minutes

What is the difference between the two series?

X: High variability, less consistency.

Y: Low variability, more consistency.
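The contrast between the two series can be checked numerically. The following is a minimal Python sketch, assuming the flattened table splits into ten waiting times per doctor as listed in the code. That split is a reconstruction, chosen because it reproduces the stated mean of 14.6 minutes for both doctors; the variable names are illustrative.

```python
from statistics import mean, stdev

# Waiting times in minutes. The split between doctors is an assumption
# reconstructed from the flattened table (both sets average 14.6).
doctor_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
doctor_y = [15, 16, 12, 18, 15, 14, 13, 17, 11, 15]

for name, times in (("X", doctor_x), ("Y", doctor_y)):
    spread = max(times) - min(times)  # range = largest - smallest
    print(f"Doctor {name}: mean={mean(times):.1f}, "
          f"range={spread}, sample sd={stdev(times):.2f}")
```

Both means agree, while the range and the standard deviation are far larger for X than for Y, which is the difference the slide points to.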

Slide 6

Characteristics of an Ideal Measure of Dispersion

1. It should be rigidly defined.

2. It should be easy to understand and easy to calculate.

3. It should be based on all the observations of the data.

4. It should be easily subjected to further mathematical treatment.

5. It should be least affected by sampling fluctuations.

6. It should not be unduly affected by extreme values.

Slide 7

Measures of Dispersion

There are three main measures of dispersion:

– The range

– The interquartile range (IQR)

– Variance / standard deviation

Slide 8


Range

The range is defined as the difference between the largest score in the set of data and the smallest score in the set of data, XL - XS

• sensitive to extreme scores;

• compensate by calculating the interquartile range (distance between the 25th and 75th percentile points), which represents the range of scores for the middle half of a distribution

Usually used in combination with other measures of dispersion.

Slide 9

Range

[Figure: illustration of the range. Source: http://www.animatedsoftware.com/statglos/sgrange.htm]

Slide 10

Interquartile range

Measures the range of the middle 50% of the values only

Is defined as the difference between the upper and lower quartiles

Interquartile range = upper quartile - lower quartile = Q3 - Q1
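To illustrate why the interquartile range is less sensitive to extreme scores than the range, here is a short Python sketch. The scores are hypothetical, and the quartiles come from the default rule of `statistics.quantiles`; other quartile conventions can give slightly different values.

```python
from statistics import quantiles

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

def iqr(values):
    """Interquartile range Q3 - Q1, using Python's default quartile rule."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1

# Hypothetical scores, for illustration only.
scores = [10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25]
with_outlier = scores[:-1] + [250]  # swap the largest score for an extreme one

print(value_range(scores), iqr(scores))
print(value_range(with_outlier), iqr(with_outlier))
```

The single extreme score inflates the range, while the interquartile range is unchanged because the middle half of the data did not move.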

Slide 11

The Semi-Interquartile Range

The semi-interquartile range (or SIR) is defined as the difference of the first and third quartiles divided by two

– The first quartile is the 25th percentile

– The third quartile is the 75th percentile

SIR = (Q3 - Q1) / 2

Slide 12

SIR Example

What is the SIR for the data below?

2, 4, 6, 8, 10, 12, 14, 20, 30, 60

25% of the scores are below 5

– 5 is the first quartile (25th percentile)

25% of the scores are above 25

– 25 is the third quartile (75th percentile)

SIR = (Q3 - Q1) / 2 = (25 - 5) / 2 = 10
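The arithmetic of this example can be reproduced in a few lines of Python. This is a sketch only: it assumes the ten scores on the slide read 2, 4, 6, 8, 10, 12, 14, 20, 30, 60, and the function name is illustrative. It also shows that a software quartile rule (here the default of `statistics.quantiles`) need not return the quartiles read off the slide, so the SIR depends on the convention used.

```python
from statistics import quantiles

def semi_interquartile_range(q1, q3):
    """SIR = (Q3 - Q1) / 2."""
    return (q3 - q1) / 2

# Quartiles as read off the slide.
print(semi_interquartile_range(5, 25))

# Scores as read from the slide (assumed); quartiles from Python's default rule.
scores = [2, 4, 6, 8, 10, 12, 14, 20, 30, 60]
q1, _median, q3 = quantiles(scores, n=4)
print(q1, q3, semi_interquartile_range(q1, q3))
```

With the slide's quartiles the result is 10; the library's interpolation rule places the quartiles slightly differently and yields a somewhat smaller SIR.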

Slide 13

When To Use the SIR

The SIR is often used with skewed data as it is insensitive to the extreme scores

Slide 14

The mean deviation

Measures the ‘average’ distance of each observation away from the mean of the data

Gives an equal weight to each observation

Generally more sensitive than the range or interquartile range, since a change in any value will affect it

Slide 15

Actual and absolute deviations from mean

A set of x values has a mean of $\bar{x}$

The residual of a particular x-value is:

Residual or deviation = $x - \bar{x}$

The absolute deviation is:

$|x - \bar{x}|$

Slide 16

Mean deviation

The mean of the absolute deviations

$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$

Slide 17

To calculate mean deviation

1. Calculate the mean of the data: find $\bar{x}$

2. Subtract the mean from each observation and record the differences: for each x, find $x - \bar{x}$

3. Record the absolute value of each residual: find $|x - \bar{x}|$ for each x

4. Calculate the mean of the absolute values: add up the absolute values and divide by n

$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$
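The four steps can be sketched in Python as follows. The function name and the data are hypothetical and serve only to illustrate the procedure.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    x_bar = sum(values) / n                       # step 1: mean of the data
    residuals = [x - x_bar for x in values]       # step 2: x - mean
    abs_residuals = [abs(r) for r in residuals]   # step 3: |x - mean|
    return sum(abs_residuals) / n                 # step 4: mean of absolute values

# Hypothetical observations, for illustration only.
data = [2, 4, 6, 8, 10]
print(mean_deviation(data))
```

For these values the mean is 6, the absolute deviations are 4, 2, 0, 2, 4, and their mean is 12 / 5 = 2.4.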

Slide 18

The standard deviation

Measures the variation of observations from the mean

The most common measure of dispersion

Takes into account every observation

Measures the ‘average deviation’ of observations from the mean

Works with squares of residuals, not absolute values, which makes it easier to use in further calculations

Slide 19

Standard deviation of a population σ

Every observation in the population is used.

$\text{Standard deviation} = \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

The square of the population standard deviation is called the variance.

$\text{Variance} = \sigma^2$

Slide 20

Standard deviation of a sample s

In practice, most populations are very large and it is more common to calculate the sample standard deviation.

$\text{Sample standard deviation} = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Where: n is the number of observations in the sample, so the divisor is (n-1)

Slide 21

To calculate standard deviation

1. Calculate the mean $\bar{x}$

2. Calculate the residual for each x: $x - \bar{x}$

3. Square the residuals: $(x - \bar{x})^2$

4. Calculate the sum of the squares: $\sum (x - \bar{x})^2$

5. Divide the sum in Step 4 by (n-1): $\frac{\sum (x - \bar{x})^2}{n - 1}$

6. Take the square root of the quantity in Step 5: $\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
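The six steps translate directly into code. Below is a minimal Python sketch with hypothetical data; the `sample` flag is an illustrative addition that switches the divisor between n-1 (sample) and n (population), and the result is compared with the standard library's `stdev` and `pstdev`.

```python
import math
from statistics import pstdev, stdev

def standard_deviation(values, sample=True):
    """Standard deviation following the six steps; divisor n-1 or n."""
    n = len(values)
    x_bar = sum(values) / n                        # step 1: mean
    residuals = [x - x_bar for x in values]        # step 2: residuals
    squares = [r ** 2 for r in residuals]          # step 3: squared residuals
    total = sum(squares)                           # step 4: sum of squares
    divisor = n - 1 if sample else n               # step 5: divide
    return math.sqrt(total / divisor)              # step 6: square root

# Hypothetical observations, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data))                # sample standard deviation
print(standard_deviation(data, sample=False))  # population standard deviation
print(stdev(data), pstdev(data))               # library values for comparison
```

Dividing by n-1 always gives a slightly larger value than dividing by n; the difference shrinks as the number of observations grows.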

Slide 22

Uses of the standard deviation

– The standard deviation enables us to determine, with a great deal of accuracy, where the values of a frequency distribution are located in relation to the mean. We can do this according to a theorem devised by the Russian mathematician P.L. Chebyshev (1821-1894).
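The slide does not state the theorem. In its usual form, Chebyshev's theorem says that for any k > 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean, whatever the shape of the distribution. The Python sketch below checks this bound on illustrative data; the function names and values are assumptions, not part of the lecture.

```python
from statistics import mean, pstdev

def share_within(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    centre = mean(values)
    sd = pstdev(values)  # population form (divisor n)
    inside = [x for x in values if abs(x - centre) <= k * sd]
    return len(inside) / len(values)

def chebyshev_bound(k):
    """Minimum share guaranteed by Chebyshev's theorem for k > 1."""
    return 1 - 1 / k ** 2

# Illustrative values only.
data = [3, 4, 5, 6, 11, 12, 15, 19, 34, 37]
for k in (1.5, 2, 3):
    print(k, share_within(data, k), chebyshev_bound(k))
```

The observed share is never below the bound; for many real data sets it is considerably higher, since the theorem is a worst-case guarantee.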

Slide 23

Coefficient of variation

Is a measure of relative variability used to:

– measure changes that have occurred in a population over time

– compare the variability of two populations that are expressed in different units of measurement

It is expressed as a percentage rather than in terms of the units of the particular data.

Slide 24

Formula for coefficient of variation

Denoted by V

$V = \frac{s}{\bar{x}} \times 100\%$

where $\bar{x}$ = the mean of the sample, s = the standard deviation of the sample
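Because V is unit-free, it allows a comparison of spread between variables measured in different units. A minimal Python sketch follows; the heights and weights are hypothetical and the function name is illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """V = (s / x_bar) * 100, expressed as a percentage."""
    return stdev(values) / mean(values) * 100

# Hypothetical measurements in different units, for illustration only.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 62, 70, 78, 85]

print(f"Height: V = {coefficient_of_variation(heights_cm):.2f}%")
print(f"Weight: V = {coefficient_of_variation(weights_kg):.2f}%")
```

In this made-up sample the weights vary more, relative to their mean, than the heights do, a comparison the raw standard deviations (in cm and kg) could not support directly.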

Slide 25

Summary

Measures of Dispersion

– no ideal measure of dispersion exists

Standard deviation is the most important measure of dispersion.

• it is the most frequently used

• its value is affected by the value of every observation in the data

• extreme values in the data may distort it

