Measures of Dispersion, Skewness, and Kurtosis
• Descriptive summary measure
• Helps characterize data
• Variation of observations
• Determine degree of dispersion of observations about the
center of the distribution
• Absolute dispersion
• Same unit as the observations
• Relative dispersion
• No unit
• Measures of dispersion cannot be negative
• Smallest possible value is zero
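• The range is the simplest measure of dispersion and the easiest to use
• Difference between the highest and the lowest observation
$$\text{Range} = \text{maximum} - \text{minimum}$$
$$\text{Range} = UCL_{HCI} - LCL_{LCI}$$
where $UCL_{HCI}$ is the upper class limit of the highest class interval and $LCL_{LCI}$ is the lower class limit of the lowest class interval (grouped data)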
Disadvantages
• Description of data is not
comprehensive
• Affected by outliers
• Tends to be smaller for small samples and larger for large samples
• Cannot be computed when
there is an open-ended class
interval
Advantages
• Simple
• Easy to compute
• Easy to understand
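A minimal sketch of the range and its sensitivity to outliers, using the barangay counts of underweight children given as example data in this section. The function name `data_range` and the added extreme value 40 are illustrative assumptions, not part of the original material.

```python
# Number of underweight children per barangay (example data from this section)
counts = [3, 5, 6, 4, 7, 8, 6, 9, 10, 4, 6, 7, 5, 8, 9, 8, 3, 4, 5, 5]

def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)

print(data_range(counts))         # 10 - 3 = 7

# One hypothetical extreme observation is enough to change the range drastically
print(data_range(counts + [40]))  # 40 - 3 = 37
```

Only the two extreme observations enter the computation, which is why the range is easy to compute but gives an incomplete description of the data.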
• The variance describes the variation of the measurements
• Average squared difference of each observation from the mean
• May also be used as a measure of how good the mean is as a
measure of central tendency
• Unit of the variance is the squared unit of the observations
• The standard deviation, the positive square root of the variance, is often preferred because it is in the same unit as the observations and is easier to interpret
Population Variance
• Denoted by σ²
• N elements
• Parameter
• Cannot be computed using
sample data
Sample Variance
• Denoted by s²
• n elements
• Statistic
• Estimates the value of the population variance
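The difference between the two variances can be sketched as follows, using the incorrect-answer counts of the 20 sampled students given as example data in this section. The helper `variance` and its `sample` flag are hypothetical names chosen for illustration; the standard-library calls are used only as a cross-check.

```python
import statistics

# Incorrect answers of a random sample of 20 students (example data from this section)
errors = [2, 1, 3, 2, 3, 2, 1, 3, 0, 1, 3, 6, 0, 3, 3, 5, 2, 1, 4, 2]

def variance(values, sample=True):
    """Average squared deviation from the mean.

    sample=True divides by n - 1 (sample variance, s^2);
    sample=False divides by N (population variance, sigma^2).
    """
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    return sum_sq_dev / (n - 1) if sample else sum_sq_dev / n

s2 = variance(errors, sample=True)
# Dividing by N is appropriate only if the 20 values were the whole population
sigma2 = variance(errors, sample=False)

print(round(s2, 4), round(sigma2, 4))  # about 2.3447 and 2.2275
# Cross-check against the standard library
print(round(statistics.variance(errors), 4), round(statistics.pvariance(errors), 4))
```

Since these data are a random sample, s² is the appropriate quantity here; the population version is shown only for contrast.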
• The standard deviation utilizes every observation
• Affected by outliers; extreme values inflate the standard deviation
• Cannot be computed when there are open-ended intervals
• Addition or subtraction of a constant c to each observation
would yield the same standard deviation as the original data
set
• Multiplication or division of each observation by a constant c would result in a standard deviation multiplied or divided by |c|, the absolute value of the constant
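The two properties on adding and multiplying by a constant can be checked numerically. This is a small sketch on hypothetical data; a negative constant is chosen on purpose to show that the scale factor enters through its absolute value, since a standard deviation cannot be negative.

```python
import statistics

data = [3, 5, 6, 4, 7]  # hypothetical observations
c = -2                  # hypothetical constant

sd = statistics.stdev(data)
sd_shifted = statistics.stdev([x + c for x in data])  # add c to each observation
sd_scaled = statistics.stdev([x * c for x in data])   # multiply each observation by c

print(sd)          # standard deviation of the original data
print(sd_shifted)  # unchanged by the shift
print(sd_scaled)   # |c| times the original standard deviation
```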
• The coefficient of variation (CV) is used to compare the variability of two or more data sets even if they have different means or different units of measurement
• Ratio of the standard deviation to the mean, expressed as a percentage
• Small CV means less variability; large CV means greater
variability
• Not to be used when mean is 0 or negative
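A minimal sketch of the CV as a unit-free comparison. The height and weight figures are hypothetical, the function name `coefficient_of_variation` is an assumption, and the sample standard deviation is used throughout.

```python
import statistics

def coefficient_of_variation(values):
    """Sample CV in percent: (s / mean) * 100."""
    mean = statistics.mean(values)
    if mean <= 0:
        # The CV is not to be used when the mean is 0 or negative
        raise ValueError("CV is not meaningful when the mean is 0 or negative")
    return statistics.stdev(values) / mean * 100

heights_cm = [150, 160, 165, 170, 180]  # hypothetical heights in centimeters
weights_kg = [45, 60, 62, 70, 88]       # hypothetical weights in kilograms

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# The units cancel, so the two percentages can be compared directly
print(round(cv_heights, 2), round(cv_weights, 2))
```

In this made-up example the weights have the larger CV, so they are relatively more variable than the heights even though the two sets are measured in different units.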
• A sample survey in a certain province showed the number of
underweight children under five years of age in each
barangay: 3 5 6 4 7 8 6 9 10 4 6 7 5 8 9 8 3 4 5 5
• Given the frequency distribution table of scores
The number of incorrect answers on
a true-false exam for a random
sample of 20 students was
recorded as follows: 2, 1, 3, 2,
3, 2, 1, 3, 0, 1, 3, 6, 0, 3, 3,
5, 2, 1, 4, and 2.
Given the frequency distribution of
scores of 200 students in an
entrance exam in college.
Scores     Freq.   <CFD    LCB     UCB
59 – 62      2       2     58.5    62.5
63 – 66     12      14     62.5    66.5
67 – 70     24      38     66.5    70.5
71 – 74     46      84     70.5    74.5
75 – 78     62     146     74.5    78.5
79 – 82     36     182     78.5    82.5
83 – 86     16     198     82.5    86.5
87 – 90      2     200     86.5    90.5
(<CFD = less-than cumulative frequency; LCB = lower class boundary; UCB = upper class boundary)
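For the frequency table above, the grouped-data measures can be sketched as follows. The class mark (midpoint of the class limits) stands in for every score in a class, so the results are approximations; whether the 200 students are treated as a sample or a population is not stated, so both variances are computed. All variable names are illustrative.

```python
# (lower class limit, upper class limit, frequency) taken from the table above
classes = [
    (59, 62, 2), (63, 66, 12), (67, 70, 24), (71, 74, 46),
    (75, 78, 62), (79, 82, 36), (83, 86, 16), (87, 90, 2),
]

freqs = [f for _, _, f in classes]
marks = [(lo + hi) / 2 for lo, hi, _ in classes]  # class marks X_i
n = sum(freqs)

sum_fx = sum(f * x for f, x in zip(freqs, marks))       # sum of f_i * X_i
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))  # sum of f_i * X_i^2

mean = sum_fx / n
sample_var = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
pop_var = (n * sum_fx2 - sum_fx ** 2) / n ** 2

# Grouped range: upper limit of the highest class minus lower limit of the lowest class
grouped_range = classes[-1][1] - classes[0][0]

print(n, mean, grouped_range)
print(round(sample_var, 4), round(sample_var ** 0.5, 4))
print(round(pop_var, 4), round(pop_var ** 0.5, 4))
```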
• Relying solely on the mean and standard deviation may be
misleading
• Possible for two data sets to have same mean and standard
deviations, yet different shapes
• If it is possible to divide the histogram at the center into two
identical halves where each half is a mirror image of the other,
then the distribution is symmetric. Otherwise, it is skewed.
Positively Skewed
• Skewed to the right
• Values concentrated at the
left
• Upper tail stretches out more
than the lower tail
Negatively Skewed
• Skewed to the left
• Values concentrated to the
right
• Lower tail stretches out more
than the upper tail
• A measure of skewness is a single value that indicates the degree and direction of asymmetry
• Denoted by Sk
Sk = 0: Symmetric
Sk > 0: Positively skewed
Sk < 0: Negatively skewed
• To determine degree of skewness, use |Sk| (magnitude of Sk)
• If |Sk| is far from 0, then it is an indication that the distribution
is seriously skewed
• Most commonly used measures
• Pearson’s first and second coefficients of skewness
• Coefficient of skewness based on third moment
• Coefficient of skewness based on the quartiles
• Pearson’s coefficients of skewness are based on the relationships among the mean, median, and mode
• Signs of the measures depend only on the sign of the numerator because S is never negative
• Pearson’s first coefficient of skewness inherits the problems of using the mode (the mode may not exist or may not be unique)
• The coefficient of skewness based on the quartiles follows from the definition of quartiles
• Around 25 percent of the observations fall between Q1 and the median
• Around 25 percent of the observations fall between the median and Q3
• Symmetric distribution: distance between Q1 and Md = distance between Md and Q3
• Skewed distribution
• Positively skewed: Md is closer to Q1
• Negatively skewed: Md is closer to Q3
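The slides do not state the quartile-based formula itself. The sketch below assumes the usual form (often called Bowley's coefficient), which compares the two distances described above: ((Q3 − Md) − (Md − Q1)) / (Q3 − Q1). The function name is hypothetical, and `statistics.quantiles` uses one of several quartile conventions, so its values may differ slightly from hand computation.

```python
import statistics

def quartile_skewness(q1, md, q3):
    """((Q3 - Md) - (Md - Q1)) / (Q3 - Q1); assumes Q3 > Q1."""
    return ((q3 - md) - (md - q1)) / (q3 - q1)

print(quartile_skewness(4, 6, 8))  # 0.0: Md is equidistant from Q1 and Q3
print(quartile_skewness(4, 5, 8))  # 0.5: Md closer to Q1, positively skewed
print(quartile_skewness(4, 7, 8))  # -0.5: Md closer to Q3, negatively skewed

# Applied to the barangay data from this section
counts = [3, 5, 6, 4, 7, 8, 6, 9, 10, 4, 6, 7, 5, 8, 9, 8, 3, 4, 5, 5]
q1, md, q3 = statistics.quantiles(counts, n=4)
print(quartile_skewness(q1, md, q3))
```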
• Kurtosis is a term coined by Karl Pearson
• From the Greek word kurtos, which means convex
• Describes the shape of the hump of a relative frequency distribution compared to the normal distribution
• Three classifications
• Mesokurtic
• Leptokurtic
• Platykurtic
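The slides give no formula for kurtosis. One common choice, assumed here, is the moment coefficient m4 / m2², which equals 3 for the normal distribution: values near 3 are read as mesokurtic, above 3 as leptokurtic, and below 3 as platykurtic. The tolerance `tol` and both function names are arbitrary illustrative choices.

```python
def moment_kurtosis(values):
    """Moment coefficient of kurtosis: m4 / m2**2 (3 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

def classify(kurtosis, tol=0.1):
    """Label a kurtosis value relative to the normal benchmark of 3."""
    if abs(kurtosis - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if kurtosis > 3 else "platykurtic"

flat = [1, 2, 3, 4, 5]                      # hypothetical, evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, 10, -10]  # hypothetical, sharp peak with long tails

print(moment_kurtosis(flat), classify(moment_kurtosis(flat)))
print(moment_kurtosis(peaked), classify(moment_kurtosis(peaked)))
```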
1. To construct a boxplot, draw a rectangle with one end at the first quartile (Q1) and the other end at the third quartile (Q3).
2. Put a vertical line at the median, across the interior of the rectangle.
3. Compute the interquartile range (IQR), lower fence (FL), and upper fence (FU).
4. Locate the smallest value within the interval [FL, Q1] and the largest value within the interval [Q3, FU]. Draw a line from each of these values to the nearest quartile.
5. Values falling outside the fences are considered outliers, denoted by “x”.
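The numerical part of the steps above can be sketched as follows. The fence definitions are not given in the slides; the common convention FL = Q1 − 1.5·IQR and FU = Q3 + 1.5·IQR is assumed, as is the quartile method of `statistics.quantiles`. The function name, the multiplier `k`, and the appended value 25 are illustrative.

```python
import statistics

def boxplot_summary(values, k=1.5):
    """Quartiles, fences, whisker ends, and outliers for a boxplot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr  # assumed convention, see lead-in
    upper_fence = q3 + k * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        # Whiskers end at the most extreme observations inside the fences
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }

# Barangay data from this section plus one hypothetical extreme value
counts = [3, 5, 6, 4, 7, 8, 6, 9, 10, 4, 6, 7, 5, 8, 9, 8, 3, 4, 5, 5]
summary = boxplot_summary(counts + [25])
for name, value in summary.items():
    print(name, value)
```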
Definition
• Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
• Sample Variance
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Computational Formula
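• Population Variance
$$\sigma^2 = \frac{N \sum_{i=1}^{N} X_i^2 - \left( \sum_{i=1}^{N} X_i \right)^2}{N^2}$$
• Sample Variance
$$S^2 = \frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n - 1)}$$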
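As a quick check that the definitional and computational forms of the sample variance agree, the sketch below applies both to the barangay data from this section. Variable names are illustrative.

```python
# Number of underweight children per barangay (example data from this section)
counts = [3, 5, 6, 4, 7, 8, 6, 9, 10, 4, 6, 7, 5, 8, 9, 8, 3, 4, 5, 5]
n = len(counts)

# Definitional formula: squared deviations from the mean, divided by n - 1
mean = sum(counts) / n
definitional = sum((x - mean) ** 2 for x in counts) / (n - 1)

# Computational formula: needs only the sum and the sum of squares
sum_x = sum(counts)                   # 122
sum_x2 = sum(x * x for x in counts)   # 826
computational = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

print(round(definitional, 4), round(computational, 4))  # both about 4.3053
```

The computational form avoids computing the mean first, which is convenient for hand calculation; the two results differ only by floating-point rounding.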
Definition
• Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N}$$
• Sample Variance
$$S^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{n - 1}$$
Computational Formula
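• Population Variance
$$\sigma^2 = \frac{N \sum_{i=1}^{k} f_i X_i^2 - \left( \sum_{i=1}^{k} f_i X_i \right)^2}{N^2}$$
• Sample Variance
$$S^2 = \frac{n \sum_{i=1}^{k} f_i X_i^2 - \left( \sum_{i=1}^{k} f_i X_i \right)^2}{n(n - 1)}$$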
Population CV
$$CV = \frac{\sigma}{\mu} \times 100\%$$
• Where
• σ is the population standard deviation
• μ is the population mean
Sample CV
$$CV = \frac{s}{\bar{X}} \times 100\%$$
• Where
• s is the sample standard deviation
• $\bar{X}$ is the sample mean
First Coefficient of Skewness
$$Sk_1 = \frac{\bar{X} - Mo}{S}$$
Second Coefficient of Skewness
$$Sk_2 = \frac{3(\bar{X} - Md)}{S}$$
Where $\bar{X}$ = sample mean; Md = sample median; Mo = sample mode; S = sample standard deviation
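A short sketch applying both of Pearson's coefficients to the barangay data from this section. Standard-library functions are used for the mean, median, mode, and sample standard deviation; variable names are illustrative.

```python
import statistics

# Number of underweight children per barangay (example data from this section)
counts = [3, 5, 6, 4, 7, 8, 6, 9, 10, 4, 6, 7, 5, 8, 9, 8, 3, 4, 5, 5]

mean = statistics.mean(counts)      # 6.1
median = statistics.median(counts)  # 6.0
mode = statistics.mode(counts)      # 5, the only value that occurs four times
s = statistics.stdev(counts)        # sample standard deviation

sk1 = (mean - mode) / s             # first coefficient, uses the mode
sk2 = 3 * (mean - median) / s       # second coefficient, uses the median

print(round(sk1, 3), round(sk2, 3))  # about 0.53 and 0.145
```

Both coefficients are positive, which suggests a distribution skewed to the right. The two magnitudes differ noticeably, a reminder that the first coefficient depends on the mode and is therefore less stable.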