POLS 7000X
STATISTICS IN POLITICAL SCIENCE
CLASS 3
BROOKLYN COLLEGE – CUNY
SHANG E. HA
Leon-Guerrero and Frankfort-Nachmias, Essentials of Social Statistics for a Diverse Society
Leon-Guerrero/Frankfort-Nachmias: Essentials of Social Statistics for a Diverse Society© 2012 SAGE Publications
Chapter 3: Measures of Central Tendency
• The Mode
• The Median
• The Mean
• Finding the Mean in a Frequency Distribution
• The Shape of the Distribution
• Considerations for Choosing a Measure of Central Tendency
• Statistics in Practice: Representing Income
What is a measure of Central Tendency?
• Numbers that describe what is average or typical of the distribution
• You can think of this value as where the middle of a distribution lies.
The Mode
• The category or score with the largest frequency (or percentage) in the distribution.
• The mode can be calculated for variables with levels of measurement that are: nominal, ordinal, or interval-ratio.
The Mode: An Example
Example: Number of Votes for Candidates for Mayor of Camarillo, California.
• Sheriff Tupper – 11,769 votes
• Jessica Fletcher – 39,443 votes
• Dr. Seth Hazlett – 78,331 votes
The Mode: “Dr. Seth Hazlett”
The mode, in this case, gives you the “central” response of the voters: the most popular candidate.
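As a minimal sketch in Python (not part of the original slides), the mode of a nominal variable can be read off a frequency table by picking the category with the largest count. The names `votes` and `mode_candidate` are illustrative assumptions; the counts are the ones from the mayoral example.

```python
# Vote counts from the example, keyed by candidate (a nominal variable).
votes = {
    "Sheriff Tupper": 11_769,
    "Jessica Fletcher": 39_443,
    "Dr. Seth Hazlett": 78_331,
}

# The mode is the category with the largest frequency.
mode_candidate = max(votes, key=votes.get)
print(mode_candidate)
```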
The Median
The score that divides the distribution into two equal parts, so that half the cases are above it and half below it.
The median is the middle score, or average of middle scores in a distribution.
The Mean
The arithmetic average obtained by adding up all the scores and dividing by the total number of scores.
Formula for the Mean
\[ \bar{Y} = \frac{\sum Y}{N} \]
Where
ΣY = sum of all scores
N = the number of scores
An Example
Annual per capita carbon dioxide emissions (metric tons) for the n = 10 largest nations in population size:
Bangladesh 0.3, Brazil 1.8, China 2.3, India 1.2, Indonesia 1.4, Pakistan 0.7, Russia 9.9, U.S. 20.1, Japan 1.4, Nigeria 0.6
Ordered sample: 0.3, 0.6, 0.7, 1.2, 1.4, 1.4, 1.8, 2.3, 9.9, 20.1
Mode = 1.4
Median = (1.4 + 1.4)/2 = 1.4
Mean = (0.3 + 0.6 + 0.7 + … + 20.1)/10 = 3.97
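The three measures in this example can be checked with a short Python sketch using the standard-library `statistics` module. The variable names are assumptions made for illustration; the scores are the ten emission values listed in the example.

```python
import statistics

# Annual per capita CO2 emissions (metric tons) from the example.
emissions = [0.3, 1.8, 2.3, 1.2, 1.4, 0.7, 9.9, 20.1, 1.4, 0.6]

mode = statistics.mode(emissions)      # most frequent score
median = statistics.median(emissions)  # average of the two middle scores (N is even)
mean = statistics.mean(emissions)      # sum of all scores divided by N

print(mode, median, round(mean, 2))
```

Note how the two large values (Russia and the U.S.) pull the mean well above the median.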
Calculating the mean with grouped scores
\[ \bar{Y} = \frac{\sum fY}{N} \]
where: fY = a score multiplied by its frequency
Example: Mean of Grouped Scores
Grouped Data: the Mean & Median
Calculate the median and mean for the grouped frequency distribution below.
Number of People Age 18 or Older Living in a U.S. Household in 1996 (GSS 1996)

Number of People    Frequency
1                   190
2                   316
3                   54
4                   17
5                   2
6                   2
TOTAL               581
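One way to work this exercise is sketched below in Python. It applies the grouped-mean formula (sum of fY divided by N) and finds the median by accumulating frequencies until the middle case is reached. The variable names are illustrative assumptions, and the median step assumes N is odd, as it is here.

```python
# Frequency table: number of adults in the household -> frequency (GSS 1996).
freq = {1: 190, 2: 316, 3: 54, 4: 17, 5: 2, 6: 2}

n = sum(freq.values())                                  # N, the total number of cases
grouped_mean = sum(y * f for y, f in freq.items()) / n  # sum(fY) / N

# Median: walk the cumulative frequencies until the middle case is reached.
# N is odd here, so the middle case is case number (N + 1) / 2.
middle_case = (n + 1) // 2
cumulative = 0
for score in sorted(freq):
    cumulative += freq[score]
    if cumulative >= middle_case:
        grouped_median = score
        break

print(n, round(grouped_mean, 2), grouped_median)
```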
Shape of the Distribution
• Symmetrical (mean is about equal to median)
• Skewed
  - Negatively (example: years of education): mean < median
  - Positively (example: income): mean > median
• Bimodal (two distinct modes)
• Multi-modal (more than 2 distinct modes)
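The mean-versus-median comparison can be turned into a rough check, sketched here in Python. The function `skew_direction`, its `tolerance` cutoff, and all three data sets are made up for illustration; this is a rule of thumb, not a formal test of skewness.

```python
import statistics

def skew_direction(scores, tolerance=0.05):
    """Compare mean and median as a rough indicator of skew.

    `tolerance` is a hypothetical cutoff (in score units) for treating
    the two as "about equal"; it is not a standard value.
    """
    diff = statistics.mean(scores) - statistics.median(scores)
    if abs(diff) <= tolerance:
        return "symmetrical"
    return "positive" if diff > 0 else "negative"

# Made-up illustrative data, not from the text.
incomes = [20, 22, 25, 27, 30, 35, 150]            # one very high value
education_years = [8, 12, 12, 14, 16, 16, 16, 16]  # a few low values
symmetric = [1, 2, 3, 4, 5]

print(skew_direction(incomes))
print(skew_direction(education_years))
print(skew_direction(symmetric))
```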
Distribution Shape
Considerations for Choosing a Measure of Central Tendency
For a nominal variable, the mode is the only measure that can be used.
For ordinal variables:
- Use the mode to show what is the most common value in the distribution.
- Use the median to show which value is located exactly in the middle of the distribution.
For interval-ratio variables, the mode, median, and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution is skewed.
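These guidelines can be summarized as a small lookup, sketched here in Python. The function name `central_tendency_options` and its return format are hypothetical; the rules themselves are only the ones stated on this slide.

```python
def central_tendency_options(level, skewed=False):
    """Return usable and preferred measures for a level of measurement.

    `level` is "nominal", "ordinal", or "interval-ratio". `skewed` only
    matters for interval-ratio variables.
    """
    if level == "nominal":
        return {"usable": ["mode"], "preferred": "mode"}
    if level == "ordinal":
        # The choice depends on whether you want the most common value
        # (mode) or the middle value (median), so no single preference.
        return {"usable": ["mode", "median"], "preferred": None}
    if level == "interval-ratio":
        preferred = "median" if skewed else "mean"
        return {"usable": ["mode", "median", "mean"], "preferred": preferred}
    raise ValueError(f"unknown level of measurement: {level!r}")

print(central_tendency_options("interval-ratio", skewed=True))
```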
Central Tendency
Chapter 4: Measures of Variability
• The Importance of Measuring Variability
• The Range
• The Inter-Quartile Range
• The Variance and the Standard Deviation
• Considerations for Choosing a Measure of Variation
• Reading the Research Literature: Differences in College Aspirations and Expectations Among Latino Adolescents
The Importance of Measuring Variability
• Central tendency: numbers that describe what is typical or average (central) in a distribution.
• Measures of variability: numbers that describe diversity or variability in the distribution.
These two types of measures together help us to sum up a distribution of scores without looking at each and every score. Measures of central tendency tell you about typical (or central) scores. Measures of variation reveal how far the scores in the distribution tend to vary from the typical or central score.
Notice that both distributions have the same mean, yet they are shaped differently
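A small Python sketch can illustrate the same point with numbers. The two score lists are made up for illustration: both have a mean of 5, but their spread differs. Spread is measured here with the standard deviation, which is defined later in these slides.

```python
import statistics

# Made-up scores: both sets have the same mean but different spread.
clustered = [4, 5, 5, 5, 6]
dispersed = [1, 3, 5, 7, 9]

for name, scores in [("clustered", clustered), ("dispersed", dispersed)]:
    print(name, statistics.mean(scores), round(statistics.stdev(scores), 2))
```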
The Range
Range = highest score - lowest score
Range – A measure of variation in interval-ratio variables. It is the difference between the highest (maximum) and the lowest (minimum) scores in the distribution.
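As a quick illustration in Python, the range of the carbon dioxide emissions from the earlier Chapter 3 example can be computed as below. The variable names are assumptions; the result is rounded only to hide floating-point noise.

```python
# Annual per capita CO2 emissions (metric tons) from the earlier example.
emissions = [0.3, 1.8, 2.3, 1.2, 1.4, 0.7, 9.9, 20.1, 1.4, 0.6]

# Range = highest score - lowest score
score_range = max(emissions) - min(emissions)
print(round(score_range, 1))
```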
Percentiles
A score below which a specific percentage of the distribution falls.
• For example, the 75th percentile is a score that divides the distribution so that 75% of the cases are below it.
• For example, the 25th percentile is a score that divides the distribution so that 25% of the cases are below it.
Inter-Quartile Range
Inter-Quartile Range (IQR) – A measure of variation for interval-ratio data. It indicates the width of the middle 50 percent of the distribution and is defined as the difference between the lower and upper quartiles (Q1 and Q3).
IQR = Q3 – Q1
Q3 = 75th percentile
Q1 = 25th percentile
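The following Python sketch computes Q1, Q3, and the IQR for the carbon dioxide emissions from the earlier Chapter 3 example, using `statistics.quantiles` from the standard library. Quartile conventions differ between textbooks and software, so a hand calculation following the textbook's procedure may give slightly different values; treat this as one possible convention, not the textbook's method.

```python
import statistics

# Annual per capita CO2 emissions (metric tons) from the earlier example.
emissions = [0.3, 1.8, 2.3, 1.2, 1.4, 0.7, 9.9, 20.1, 1.4, 0.6]

# With n=4, quantiles() returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = statistics.quantiles(emissions, n=4)

iqr = q3 - q1  # IQR = Q3 - Q1, the width of the middle 50 percent
print(q1, q3, iqr)
```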
The Difference Between the Range and IQR
[Figure: two distributions. In one, the values fall together closely; the other shows greater variability. Yet the ranges are equal!]
Importance of the IQR
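The contrast can be reproduced with a short Python sketch. Both score lists are made up for illustration and do not come from the slides; the helper names `value_range` and `iqr` are also assumptions, and the quartiles follow the default convention of `statistics.quantiles`.

```python
import statistics

def value_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

def iqr(scores):
    """Q3 - Q1, using the default convention of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return q3 - q1

# Made-up scores with the same minimum and maximum.
tight = [0, 4, 5, 5, 5, 5, 6, 10]    # middle values fall together closely
spread = [0, 1, 2, 4, 6, 8, 9, 10]   # middle values are more variable

print(value_range(tight), value_range(spread))  # equal ranges
print(iqr(tight), iqr(spread))                  # different IQRs
```

Because the range uses only the two extreme scores, it cannot distinguish these two distributions, while the IQR can.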
Variance
• Variance – A measure of variation for interval-ratio variables; it is the average of the squared deviations from the mean.
\[ s_Y^2 = \frac{\sum (Y - \bar{Y})^2}{N - 1} \]
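The variance calculation can be followed step by step in Python. This sketch uses the carbon dioxide emissions from the earlier Chapter 3 example and divides by N - 1; the variable names are illustrative assumptions. The result is compared with `statistics.variance`, which also uses the N - 1 denominator.

```python
import statistics

# Annual per capita CO2 emissions (metric tons) from the earlier example.
emissions = [0.3, 1.8, 2.3, 1.2, 1.4, 0.7, 9.9, 20.1, 1.4, 0.6]

n = len(emissions)
mean = sum(emissions) / n

# Squared deviation of each score from the mean: (Y - mean)^2
squared_deviations = [(y - mean) ** 2 for y in emissions]

# Sum of squared deviations divided by N - 1
variance = sum(squared_deviations) / (n - 1)

print(round(variance, 2), round(statistics.variance(emissions), 2))
```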
Standard Deviation
• Standard Deviation – A measure of variation for interval-ratio variables; it is equal to the square root of the variance.
\[ s_Y = \sqrt{s_Y^2} = \sqrt{\frac{\sum (Y - \bar{Y})^2}{N - 1}} \]
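A brief Python sketch, again using the carbon dioxide emissions from the earlier Chapter 3 example, shows that the standard deviation is the square root of the variance. The variable names are assumptions made for illustration.

```python
import math
import statistics

# Annual per capita CO2 emissions (metric tons) from the earlier example.
emissions = [0.3, 1.8, 2.3, 1.2, 1.4, 0.7, 9.9, 20.1, 1.4, 0.6]

variance = statistics.variance(emissions)  # N - 1 denominator
std_dev = math.sqrt(variance)              # square root of the variance

# statistics.stdev should agree with the square root taken by hand.
print(round(std_dev, 2), round(statistics.stdev(emissions), 2))
```

Unlike the variance, the standard deviation is expressed in the original units of the variable (here, metric tons).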
Finding the Mean
Finding the Standard Deviation
Considerations for Choosing a Measure of Variability
For nominal variables, you can only use the IQV (Index of Qualitative Variation). Not discussed in this class!
For ordinal variables, you can calculate the IQV or the IQR (Inter-Quartile Range). The IQR, however, provides more information about the variable.
For interval-ratio variables, you can use the IQV, the IQR, or the variance/standard deviation. The standard deviation (and likewise the variance) provides the most information, since it uses all of the values in the distribution in its calculation.