Chapter 3: Data Description
Page 1: Chapter 3

Chapter 3: Data Description

Page 2: Chapter 3

Parameter vs. Statistic

• A statistic is a characteristic or measure obtained by using the data values from a sample.

• A parameter is a characteristic or measure obtained by using all the data values for a specific population.

Page 3: Chapter 3

Parameter vs. Statistic

• In statistics, Greek letters are used to denote parameters and Roman letters are used to denote statistics.

• Assume that the data are obtained from samples unless otherwise specified.

Page 4: Chapter 3

Measures of Central Tendency: Mean

• The mean is the sum of the values, divided by the total number of values.

• The symbol $\bar{X}$ represents the sample mean,

$\bar{X} = \frac{\sum X}{n}$,

where n represents the total number of values in the sample.

Page 5: Chapter 3

Measures of Central Tendency: Mean

• For a population, the Greek letter $\mu$ is used for the mean.

$\mu = \frac{\sum X}{N}$, where N represents the total number of values in the population.

Page 6: Chapter 3

Example: Chief Justices

• The lengths of service (in years) of eight of the Chief Justices of the Supreme Court are 7, 1, 5, 35, 28, 10, 15, 22. Find the mean.
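
The arithmetic for this example can be sketched in a few lines of Python. This is only an illustration of the definition above (sum of the values divided by the number of values); the variable names are made up for the sketch.

```python
# Lengths of service (in years) from the Chief Justices example.
service_years = [7, 1, 5, 35, 28, 10, 15, 22]

# Mean = sum of the values / number of values.
mean_service = sum(service_years) / len(service_years)

print(sum(service_years), len(service_years))  # 123 8
print(mean_service)                            # 15.375
print(round(mean_service, 1))                  # 15.4
```

The raw data are whole numbers, so the mean is reported to one decimal place, about 15.4 years.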

Page 7: Chapter 3

What Makes the Mean a Center?

1 5 7 10 15 22 28 35

Page 8: Chapter 3

Measures of Central Tendency: Mean

• The mean should be rounded to one more decimal place than occurs in the raw data.

Page 9: Chapter 3

Measures of Central Tendency: Mean

• To estimate the mean from a frequency distribution, use the class midpoint to represent each class.

$\bar{X} = \frac{\sum f \cdot X_m}{n}$, where f is the class frequency, $X_m$ is the class midpoint, and $n = \sum f$.

Page 10: Chapter 3

Example: Mean Age of 120 Students

• Approximate the mean age for students in MAT 120.

Class     Frequency (f)   Midpoint (X_m)   f · X_m
15 – 19   16
20 – 24   34
25 – 29   12
30 – 34   5
35 – 39   1
40 – 44   0
45 – 49   1
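
A minimal sketch of the grouped-mean calculation for this table, assuming each class is represented by its midpoint (lower limit plus upper limit, divided by 2). The tuple layout and names are illustrative only.

```python
# (lower limit, upper limit, frequency) for each age class in the table.
classes = [
    (15, 19, 16), (20, 24, 34), (25, 29, 12), (30, 34, 5),
    (35, 39, 1), (40, 44, 0), (45, 49, 1),
]

n = sum(f for _, _, f in classes)                         # total frequency
sum_f_xm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # sum of f * midpoint
grouped_mean = sum_f_xm / n

print(n, sum_f_xm)             # 69 1588.0
print(round(grouped_mean, 1))  # 23.0
```

The result is an approximation of the mean age, because the individual ages inside each class are not known.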

Page 11: Chapter 3

Measures of Central Tendency: Median

• The median is the midpoint of the data array. To find the median, the data must be arranged in order.

Page 12: Chapter 3

Example: Supreme Court Justices

• Find the median value for the lengths of service for the sample of Supreme Court Justices 7, 1, 5, 35, 28, 10, 15, 22.

Page 13: Chapter 3

Example: Hospital System

• Example: The number of hospitals for the five largest hospital systems is shown here. Find the median. 340, 75, 123, 259, 151
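
Both median examples can be checked with Python's standard `statistics` module. The sketch below sorts the data first to show the data array; `statistics.median` averages the two middle values when the count is even.

```python
from statistics import median

service_years = [7, 1, 5, 35, 28, 10, 15, 22]  # even number of values
hospitals = [340, 75, 123, 259, 151]           # odd number of values

print(sorted(service_years))  # [1, 5, 7, 10, 15, 22, 28, 35]
print(median(service_years))  # 12.5  (halfway between 10 and 15)

print(sorted(hospitals))      # [75, 123, 151, 259, 340]
print(median(hospitals))      # 151   (the middle value)
```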

Page 14: Chapter 3

Measures of Central Tendency: Mode

• The value that occurs most often in a set of data is called the mode.

• Find the mode for 5, 6, 2, 4, 2, 3, 6, 4, 1, 2

• A set of data that has two modes is called bimodal.

• A data set may also have no mode.
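
The three cases above (one mode, two modes, no mode) can be sketched with a small helper. The function name `modes` is hypothetical, and the sketch assumes the convention that a data set has no mode when no value repeats; the second and third data sets are made up for illustration.

```python
from collections import Counter

def modes(data):
    """Return the list of modes; empty if every value occurs only once."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so there is no mode
    return [value for value, count in counts.items() if count == top]

print(modes([5, 6, 2, 4, 2, 3, 6, 4, 1, 2]))  # [2]
print(modes([1, 1, 2, 2, 3]))                 # [1, 2]  (bimodal)
print(modes([1, 2, 3]))                       # []      (no mode)
```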

Page 15: Chapter 3

Example: Birth Month Data

• Find the mode for the class birth month data.

Birth Month   Frequency
January       4
February      3
March         4
April         5
May           6
June          3
July          11
August        9
September     7
October       5
November      6
December      6

Page 16: Chapter 3

Measures of Central Tendency: Mode

• The mode for grouped data is the modal class. The modal class is the class with the largest frequency.

Age Distribution of MAT 120 Students

Classes   Frequencies
15 – 19   16
20 – 24   34
25 – 29   12
30 – 34   5
35 – 39   1
40 – 44   0
45 – 49   1

Page 17: Chapter 3

Measures of Central Tendency: Midrange

• The midrange is defined as the sum of the lowest and highest values in the data set, divided by 2. The symbol MR is used for the midrange.

• $MR = \frac{\text{lowest value} + \text{highest value}}{2}$

Page 18: Chapter 3

Example: Midrange of Ages

• Find the midrange of the student ages for MAT 120. (Recall that the lowest value was 17 and the highest value was 49.)

Page 19: Chapter 3

Measures of Central Tendency: Weighted Mean

• Find the weighted mean of a variable X by multiplying each value by its corresponding weight and dividing the sum of the products by the sum of the weights.

$\bar{X} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum wX}{\sum w}$

Page 20: Chapter 3

Example: Weighted Mean

• Example: An instructor weights grades as follows: exams, 20%; term paper, 30%; final exam, 50%. A student had grades of 83, 72, and 90, respectively, for exams, term paper, and final exam. Find the student’s final average.

Page 21: Chapter 3

Example: Weighted Mean

• Example: A student has the following grades for the Fall term: MAT 120 (3 hrs), A; BIO 210 (4 hrs), B; HIS 201 (3 hrs), C; SOC 101 (3 hrs), A; CPT 170 (3 hrs), A. Calculate the student’s GPA for the Fall term.

Page 22: Chapter 3

Example: Weighted Mean

• Example: In a dental survey of third grade students, this distribution was obtained for the number of cavities found. Find the average number of cavities.

Number of Students   Number of Cavities
12                   0
8                    1
5                    2
5                    3
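
The three weighted-mean examples can be worked with one small helper. This is a sketch, not a prescribed method: `weighted_mean` is a hypothetical name, and the GPA part assumes a 4-point scale (A = 4, B = 3, C = 2), which the slides do not state.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Final average: grades weighted 20%, 30%, 50%.
final_average = weighted_mean([83, 72, 90], [0.20, 0.30, 0.50])

# GPA: grade points weighted by credit hours (4-point scale is an assumption).
grade_points = {"A": 4, "B": 3, "C": 2}
courses = [("MAT 120", 3, "A"), ("BIO 210", 4, "B"), ("HIS 201", 3, "C"),
           ("SOC 101", 3, "A"), ("CPT 170", 3, "A")]
gpa = weighted_mean([grade_points[g] for _, _, g in courses],
                    [hrs for _, hrs, _ in courses])

# Cavities: each cavity count weighted by the number of students.
mean_cavities = weighted_mean([0, 1, 2, 3], [12, 8, 5, 5])

print(round(final_average, 1))  # 83.2
print(gpa)                      # 3.375
print(round(mean_cavities, 1))  # 1.1
```

In the cavities example the weights are frequencies, so the weighted mean is the same as the ordinary mean over all 30 students.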

Page 23: Chapter 3
Page 24: Chapter 3

Measures of Variation

First Valley Bank 6.5 6.6 6.7 6.8 7.1 7.3 7.4 7.7 7.7 7.7

Bank of the USA 4.2 5.4 5.8 6.2 6.7 7.7 7.7 8.5 9.3 10.0

           First Valley   Bank of the USA
Mean
Median
Mode
Midrange
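
The four measures of central tendency for the two banks can be checked with the standard `statistics` module. The `midrange` helper below is written for this sketch (the module has no such function).

```python
from statistics import mean, median, mode

first_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_usa = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

def midrange(data):
    """(lowest value + highest value) / 2."""
    return (min(data) + max(data)) / 2

for name, data in [("First Valley", first_valley), ("Bank of the USA", bank_of_usa)]:
    print(name,
          round(mean(data), 2),      # mean
          round(median(data), 2),    # median
          mode(data),                # mode
          round(midrange(data), 2))  # midrange
```

Both banks come out at about 7.15 for the mean, 7.2 for the median, 7.7 for the mode, and 7.1 for the midrange, even though the two data sets are spread out very differently. That is the motivation for measures of variation.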

Page 25: Chapter 3

Back-to-Back Stem & Leaf Plot

First Valley Bank 6.5 6.6 6.7 6.8 7.1 7.3 7.4 7.7 7.7 7.7

Bank of the USA 4.2 5.4 5.8 6.2 6.7 7.7 7.7 8.5 9.3 10.0

First Valley Bank Bank of the USA

Page 26: Chapter 3

Range

• The range is the highest value minus the lowest value. The symbol R is used for the range.

$R = \text{highest value} - \text{lowest value}$

• The range is affected by extremely high or low values.

• The range is easy to compute.

• Example: Determine the range for the First Valley Bank and the Bank of the USA.

                  Range
First Valley      _______
Bank of the USA   _______

Page 27: Chapter 3

Measures of Variation

                     Student A   Student B   Student C   Student D
                     75          100         73          60
                     75          100         74          70
                     75          100         76          80
                     75          0           77          90
Mean                 75          75          75          75
Range
Standard Deviation
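
A short sketch for filling in the last two rows of the table. It treats each student's scores as a sample (the deck says to assume sample data unless told otherwise), so it uses `statistics.stdev`, which divides by n − 1.

```python
from statistics import mean, stdev

scores = {
    "Student A": [75, 75, 75, 75],
    "Student B": [100, 100, 100, 0],
    "Student C": [73, 74, 76, 77],
    "Student D": [60, 70, 80, 90],
}

for student, values in scores.items():
    value_range = max(values) - min(values)  # highest minus lowest
    s = stdev(values)                        # sample standard deviation
    print(student, mean(values), value_range, round(s, 2))
```

All four students have a mean of 75, yet the ranges (0, 100, 4, 30) and the standard deviations (about 0, 50, 1.83, 12.91) differ widely.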

Page 28: Chapter 3

Deriving the Variation and Standard Deviation Formulas

Page 29: Chapter 3

Population Variance & Standard Deviation

• The variance is the average of the squares of the distance each value is from the mean. The symbol for the population variance is $\sigma^2$. The formula for the population variance is

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

• The standard deviation is the square root of the variance. The symbol for the population standard deviation is $\sigma$. The formula for the population standard deviation is

$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$.

Page 30: Chapter 3

Sample Variance & Standard Deviation

• The formula for the sample variance, denoted by $s^2$, is

$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

• The standard deviation for a sample is

$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$

Page 31: Chapter 3

Example:

• Use your calculator to determine the standard deviation and variance for the First Valley Bank and the Bank of the USA.

                  Variance   Standard Deviation
First Valley      _______    _________
Bank of the USA   _______    _________
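
In place of a calculator, the sample formulas can be written out directly. The sketch below treats both data sets as samples (dividing by n − 1) and also reports the range; the function names are made up for the illustration.

```python
from math import sqrt

first_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_usa = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def sample_std(data):
    """Square root of the sample variance."""
    return sqrt(sample_variance(data))

for name, data in [("First Valley", first_valley), ("Bank of the USA", bank_of_usa)]:
    print(name,
          round(max(data) - min(data), 1),   # range
          round(sample_variance(data), 3),   # variance
          round(sample_std(data), 3))        # standard deviation
```

First Valley gives a range of 1.2, a variance of about 0.227, and a standard deviation of about 0.477; the Bank of the USA gives 5.8, about 3.318, and about 1.822.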

Page 32: Chapter 3

Finding the Standard Deviation From a Frequency Distribution

• Example: Use your calculator to approximate the variance and standard deviation for the age for students in MAT 120.

Class     Frequency (f)   Midpoint (X_m)   _______
15 – 19   16
20 – 24   34
25 – 29   12
30 – 34   5
35 – 39   1
40 – 44   0
45 – 49   1
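
One way to approximate these values without a calculator is sketched below. It assumes every age in a class sits at the class midpoint and applies the sample formula with each squared deviation weighted by the class frequency; the names are illustrative.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each age class.
classes = [
    (15, 19, 16), (20, 24, 34), (25, 29, 12), (30, 34, 5),
    (35, 39, 1), (40, 44, 0), (45, 49, 1),
]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)
grouped_mean = sum(f * xm for f, xm in zip(freqs, midpoints)) / n

# Sample variance with each squared deviation counted f times.
grouped_variance = sum(f * (xm - grouped_mean) ** 2
                       for f, xm in zip(freqs, midpoints)) / (n - 1)
grouped_std = sqrt(grouped_variance)

print(round(grouped_variance, 1), round(grouped_std, 1))  # 29.1 5.4
```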

Page 33: Chapter 3

Coefficient of Variation

• The coefficient of variation is the standard deviation divided by the mean, usually expressed as a percentage. It allows one to compare standard deviations when the units are different.

• $CVar = \frac{s}{\bar{X}} \cdot 100\%$

Page 34: Chapter 3

Example

• The average score on an English final exam was 85, with a standard deviation of 5. The average score on a history final exam was 110 with a standard deviation of 8. Which class was more variable?
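
A quick check of this comparison, expressing each coefficient of variation as a percentage. The function name is hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation divided by the mean, as a percentage."""
    return std_dev / mean * 100

english_cv = coefficient_of_variation(5, 85)
history_cv = coefficient_of_variation(8, 110)

print(round(english_cv, 1))  # 5.9
print(round(history_cv, 1))  # 7.3
```

The history exam has the larger coefficient of variation, so the history class was more variable relative to its mean.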

Page 35: Chapter 3

Chebyshev’s Theorem

• The proportion of values from a data set that will fall within k standard deviations of the mean will be at least

• $1 - \frac{1}{k^2}$,

• where k is a number greater than 1.
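
The bound in Chebyshev's theorem, 1 − 1/k², is easy to tabulate. The sketch below uses a hypothetical helper name and rejects k ≤ 1, where the theorem says nothing useful.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 3))  # 0.889
```

So at least 75% of the values lie within 2 standard deviations of the mean and at least about 88.9% within 3, whatever the shape of the distribution.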

Page 36: Chapter 3

Empirical Rule

• For data that are bell-shaped, the following statements make up the Empirical Rule.

• Approximately 68% of the data values will fall within 1 standard deviation of the mean.

• Approximately 95% of the data values will fall within 2 standard deviations of the mean.

• Approximately 99.7% of the data values will fall within 3 standard deviations of the mean.

Page 37: Chapter 3
Page 38: Chapter 3

Empirical Rule Example

• A study of the number of paid sick days taken per year by employees results in a mound-shaped distribution with a mean of 8.7 and a standard deviation of 3. According to the empirical rule, what percentage of employees were taking between 2.7 and 14.7 paid sick days per year?

Page 39: Chapter 3

Example

• A bakery makes loaves of rye bread that have an average weight of 28 ounces and a standard deviation of 0.8 ounce. The distribution of weights is mound shaped.

About 95% of the loaves will have weights that lie within what interval?

Page 40: Chapter 3

Example

• A bakery makes loaves of rye bread that have an average weight of 28 ounces and a standard deviation of 0.8 ounce. The distribution of weights is mound shaped.

Nearly all the loaves will have weights that lie within what interval?

Page 41: Chapter 3

Example

• A bakery makes loaves of rye bread that have an average weight of 28 ounces and a standard deviation of 0.8 ounce. The distribution of weights is mound shaped.

Approximately what percentage of loaves will weigh more than 28.8 ounces?
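
The three bakery questions all follow from the Empirical Rule, since the weights are mound shaped. The sketch below takes "nearly all" to mean the 99.7% band (3 standard deviations), which is an interpretation rather than something the slide states.

```python
mean_weight = 28.0  # ounces
sd_weight = 0.8     # ounces

# About 95% of loaves: within 2 standard deviations of the mean.
interval_95 = (mean_weight - 2 * sd_weight, mean_weight + 2 * sd_weight)

# Nearly all (about 99.7%) of loaves: within 3 standard deviations.
interval_997 = (mean_weight - 3 * sd_weight, mean_weight + 3 * sd_weight)

# 28.8 ounces is 1 standard deviation above the mean. About 68% of loaves
# lie within 1 standard deviation, so the remaining 32% is split evenly
# between the two tails of the symmetric distribution.
pct_above_28_8 = (100 - 68) / 2

print([round(x, 1) for x in interval_95])   # [26.4, 29.6]
print([round(x, 1) for x in interval_997])  # [25.6, 30.4]
print(pct_above_28_8)                       # 16.0
```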

Page 42: Chapter 3

Example

• A taxi company has found that its fares average $7.80 with a standard deviation of $1.40. What can we say about the percentage of fares that are between $5.00 and $10.60 if

• A. The distribution of fares is mound shaped?

Page 43: Chapter 3

• A taxi company has found that its fares average $7.80 with a standard deviation of $1.40. What can we say about the percentage of fares that are between $5.00 and $10.60 if

B. The distribution of fares is not mound shaped?

Page 44: Chapter 3

• Example: A pharmaceutical company manufactures capsules that contain an average of 507 grams of vitamin C. The standard deviation is 3 grams. At least 96 percent of the capsules will contain what amount of vitamin C?
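
The taxi and capsule examples can be sketched together. For the fares, the first step is to express the limits in standard deviations; the Empirical Rule then applies if the distribution is mound shaped, and Chebyshev's bound 1 − 1/k² applies otherwise. For the capsules, the bound is solved for k. All names are illustrative.

```python
from math import sqrt

# Taxi fares: how many standard deviations are $5.00 and $10.60 from the mean?
mean_fare, sd_fare = 7.80, 1.40
k_low = (mean_fare - 5.00) / sd_fare
k_high = (10.60 - mean_fare) / sd_fare
print(round(k_low, 2), round(k_high, 2))  # 2.0 2.0

# A. Mound shaped: the Empirical Rule gives approximately 95% within 2 SDs.
# B. Not mound shaped: Chebyshev guarantees at least 1 - 1/k^2.
chebyshev_pct = (1 - 1 / round(k_high) ** 2) * 100
print(chebyshev_pct)  # 75.0

# Capsules: solve 1 - 1/k^2 = 0.96 for k, then build the interval.
mean_c, sd_c = 507, 3
k_capsule = sqrt(1 / (1 - 0.96))
capsule_low = mean_c - k_capsule * sd_c
capsule_high = mean_c + k_capsule * sd_c
print(round(k_capsule, 2), round(capsule_low, 1), round(capsule_high, 1))  # 5.0 492.0 522.0
```

So at least 96% of the capsules contain between 492 and 522 of the stated units of vitamin C.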

Page 45: Chapter 3

Measures of Position

• A z-score or standard score for a value is obtained by subtracting the mean from the value and dividing the result by the standard deviation. The formula is

$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$

Sample: $z = \frac{X - \bar{X}}{s}$        Population: $z = \frac{X - \mu}{\sigma}$

Page 46: Chapter 3

• The z-score represents the number of standard deviations that a data value falls above or below the mean.

Page 47: Chapter 3

Example

• A student scores 60 on a mathematics test that has a mean of 54 and a standard deviation of 3, and she scores 80 on a history test with a mean of 75 and a standard deviation of 2. On which test did she perform better?
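
Comparing the two tests comes down to comparing z-scores, since each one measures how far the score lies above its own class mean in standard deviations. A minimal sketch with a hypothetical helper:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations the value lies from the mean."""
    return (value - mean) / std_dev

math_z = z_score(60, 54, 3)
history_z = z_score(80, 75, 2)

print(math_z)     # 2.0
print(history_z)  # 2.5
```

The history score is 2.5 standard deviations above the mean, compared with 2.0 for mathematics, so her relative performance was better on the history test.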

Page 48: Chapter 3

Percentiles

• Percentiles divide the data set into 100 equal groups.

• The percentile corresponding to a given value X is computed using the following formula

______________________

Page 49: Chapter 3

Find a Data Value Corresponding to a Given Percentile

• Arrange the data in order from lowest to highest.

• Substitute into the formula $c = \frac{n \cdot p}{100}$, where n = total number of values and p = percentile.

• If c is not a whole number, round up to the next whole number. Starting at the lowest value, count over to the number that corresponds to the rounded-up value.

• If c is a whole number, use the value halfway between the cth and (c + 1)th values when counting up from the lowest value.

Page 50: Chapter 3

Example

• The data given are weights in pounds.

78, 82, 86, 88, 92, 97

• Find the percentile rank of each weight in the data set.

• What value corresponds to the 30th percentile?

Page 51: Chapter 3

Example

• a. Find the percentile rank for each test score in the data set.

12, 28, 35, 42, 47, 49, 50

• What value corresponds to the 60th percentile?
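
Both percentile examples can be sketched as follows. Two assumptions are made, because the formulas are left blank in the slides: the percentile rank of X is taken as (number of values below X + 0.5) / (total number of values) × 100, a common textbook convention, and c is taken as n · p / 100 with the rounding rules described earlier. The function names are hypothetical, and p is expected to be a whole number strictly between 0 and 100.

```python
def percentile_rank(data, x):
    """Assumed convention: (values below x + 0.5) / n * 100."""
    below = sum(1 for v in data if v < x)
    return (below + 0.5) / len(data) * 100

def value_at_percentile(data, p):
    """Value at the p-th percentile using c = n * p / 100."""
    ordered = sorted(data)  # lowest to highest
    n = len(ordered)
    whole, remainder = divmod(n * p, 100)
    if remainder:
        # c is not whole: round up to position whole + 1 (index `whole`).
        return ordered[whole]
    # c is whole: halfway between the c-th and (c + 1)-th values.
    return (ordered[whole - 1] + ordered[whole]) / 2

weights = [78, 82, 86, 88, 92, 97]
scores = [12, 28, 35, 42, 47, 49, 50]

print([round(percentile_rank(weights, w)) for w in weights])
print(value_at_percentile(weights, 30))  # 82  (c = 1.8, rounded up to 2)

print([round(percentile_rank(scores, s)) for s in scores])
print(value_at_percentile(scores, 60))   # 47  (c = 4.2, rounded up to 5)
```

Other definitions of percentile rank exist, so results from software or another textbook may differ slightly from this sketch.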