Types of Descriptive Measures
Measures of central tendency
Measures of variation
Measures of position
Measures of shape
The Mean
The mean is simply the average of the data.
Each value in the sample is represented by x.
Thus, to get the sample mean (x̄), add all the values in the sample and divide by the number of values in the sample (n): $\bar{x} = \frac{\sum x}{n}$
Each value in the population is represented by x.
Thus, to get the population mean (μ), add all the values in the population and divide by the number of values in the population (N): $\mu = \frac{\sum x}{N}$
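The calculation is the same for a sample and for a population; only the interpretation of the result (x̄ versus μ) differs. A minimal sketch, assuming the data are held in a plain Python list (the function name and the data are illustrative, not taken from the slides):

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical data set, used for illustration only.
sample = [5, 8, 12, 17, 23]
x_bar = mean(sample)  # (5 + 8 + 12 + 17 + 23) / 5 = 13.0
```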
The Median (Md) of a set of data is the value in the center of the data values when they are arranged from lowest to highest.
The value that has an equal number of items to the right and left is the median.
If n is an odd number, Md is the center data value of the ordered data set:
Md = the $\left(\frac{n + 1}{2}\right)$th ordered value
If n is an even number, Md is the average of the two center values of the ordered data set.
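The odd and even cases can be sketched as follows. The function name and the data are illustrative assumptions; the only subtlety is that the slides count positions from 1 while Python lists count from 0.

```python
def median(values):
    """Median of a data set, following the odd/even rule described above."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ordered value, which is index n // 2.
        return ordered[mid]
    # Even n: average of the two center values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median([23, 5, 12, 8, 17])   # ordered: 5 8 12 17 23 -> 12
even_case = median([5, 8, 12, 17])      # (8 + 12) / 2 -> 10.0
```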
The Mode (Mo) of a data set is the value that occurs more than once and the most often.
The Mode is not always a measure of central tendency; this value need not occur in the center of the data.
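A short sketch of this definition, assuming that ties are all reported and that a data set with no repeated value has no mode (both are choices made for this example, not statements from the slides):

```python
from collections import Counter


def modes(values):
    """Return the value(s) that occur more than once and most often."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top < 2:
        # No value repeats, so by the definition above there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


# Hypothetical data: the mode (9) sits at the edge of the data, not the center.
example = modes([1, 2, 2, 3, 9, 9, 9])  # [9]
```

The example data also illustrates the caveat above: the most frequent value can lie far from the middle of the ordered data.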
Level of Measurement and Measure of Central Tendency
Summary of levels of measurement and appropriate measure of central tendency. A “Y” indicates this measure can be used with the corresponding level of measurement.
Homogeneity refers to the degree of similarity within a set of data.
The more homogeneous a set of data is, the better the mean will represent a typical value.
Variation is the tendency of data values to scatter about the mean, x̄.
Range = H - L = 23 - 5 = 18, where H is the highest data value and L is the lowest.
A rather crude measure, but easy to calculate, and it contains valuable information in some situations.
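As a quick illustration, the range needs only the two extreme values. The data below are hypothetical, chosen so that the highest value is 23 and the lowest is 5, as in the slide's arithmetic:

```python
def data_range(values):
    """Range = highest value (H) minus lowest value (L)."""
    values = list(values)
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


# Hypothetical data with H = 23 and L = 5.
r = data_range([5, 8, 12, 17, 23])  # 23 - 5 = 18
```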
The Coefficient of Variation
The Coefficient of Variation (CV) is used to compare the variation of two or more data sets where the values of the data differ greatly.
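The slides do not show the formula here; the usual definition, assumed in this sketch, is CV = (s / x̄) · 100, the sample standard deviation expressed as a percentage of the mean. The two data sets are hypothetical and differ in scale by a factor of 100:

```python
import statistics


def coefficient_of_variation(values):
    """CV = (s / x_bar) * 100, using the sample standard deviation s."""
    x_bar = statistics.mean(values)
    if x_bar == 0:
        raise ValueError("CV is undefined when the mean is zero")
    s = statistics.stdev(values)
    return s / x_bar * 100


# Hypothetical data sets on very different scales.
small = [10, 12, 14]
large = [1000, 1200, 1400]
cv_small = coefficient_of_variation(small)
cv_large = coefficient_of_variation(large)
```

Although the standard deviations differ greatly (2 versus 200), both data sets have the same CV, so their relative variation is the same.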
Percentile (Quartile)
Most common measure of position.
Quartiles are percentiles with the data divided into quarters.
Z-Score
The relative position of a data value expressed in terms of the number of standard deviations above or below the mean.
Rule 1: If n · P/100 is not a counting number, round it up, and the Pth percentile will be the value in this position of the ordered data.
Rule 2: If n · P/100 is a counting number, the Pth percentile is the average of the number in this location (of the ordered data) and the number in the next largest location.
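The two rules translate fairly directly into code. This is a sketch under the assumption that P is strictly between 0 and 100; the function name and the data are illustrative. Note that library routines (for example `statistics.quantiles`) use other interpolation conventions and may return slightly different values.

```python
def percentile(values, p):
    """Pth percentile using Rule 1 and Rule 2 above (positions are 1-based)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    # whole + remainder/100 equals n * p / 100.
    whole, remainder = divmod(n * p, 100)
    whole = int(whole)
    if remainder != 0:
        # Rule 1: round up to position whole + 1, which is index `whole`.
        return ordered[whole]
    # Rule 2: average the values in positions `whole` and `whole + 1`.
    return (ordered[whole - 1] + ordered[whole]) / 2


# Hypothetical data: the integers 1 through 10.
data = list(range(1, 11))
p25 = percentile(data, 25)  # 10 * 25 / 100 = 2.5 -> position 3 -> 3
p50 = percentile(data, 50)  # 10 * 50 / 100 = 5   -> (5 + 6) / 2 -> 5.5
```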
Aptitude Scores Example
Ms. Jensen received a score of 83 on the aptitude test. What is her percentile value?
83 is the 45th value out of 50 in the ordered data.
A guess of the percentile would be:
$P = \frac{45}{50} \cdot 100 = 90$
Examining the surrounding values clarifies the true percentile.
P    (n · P)/100    Pth Percentile
Quartiles
Quartiles are merely particular percentiles that divide the data into quarters, namely:
Q1 = 25th percentile
Q2 = 50th percentile (the median)
Q3 = 75th percentile
Z-Scores
A Z-score determines the relative position of any particular data value x and is based on the mean and standard deviation of the data set.
The Z-score expresses the number of standard deviations the value x is from the mean.
A negative Z-score implies that x is to the left of the mean, and a positive Z-score implies that x is to the right of the mean.
Standardizing Sample Data
The process of subtracting the mean and dividing by the standard deviation is referred to as standardizing the sample data.
The corresponding z-score is the standardized score.
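Standardizing can be sketched as z = (x - x̄) / s applied to every value. The example assumes the sample standard deviation (divisor n - 1) from Python's `statistics` module; the function name and data are illustrative.

```python
import statistics


def z_scores(values):
    """Standardize sample data: subtract the mean, divide by the std deviation."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    if s == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - x_bar) / s for x in values]


# Hypothetical sample: mean 12, standard deviation 2.
z = z_scores([10, 12, 14])  # [-1.0, 0.0, 1.0]
```

The value 10 lies one standard deviation to the left of the mean (z = -1), and 14 lies one standard deviation to the right (z = 1).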
Kurtosis is a measure of the peakedness of a distribution.
Large values occur when there is a high frequency of data near the mean and in the tails.
The calculation is cumbersome and the measure is used infrequently.
Chebyshev’s Inequality
1. At least 75% of the data values are between x̄ - 2s and x̄ + 2s, or at least 75% of the data values have a z-score value between -2 and 2.
2. At least 89% of the data values are between x̄ - 3s and x̄ + 3s, or at least 89% of the data values have a z-score value between -3 and 3.
3. In general, at least $(1 - 1/k^2) \times 100\%$ of the data values lie between x̄ - ks and x̄ + ks for any k > 1.
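The 75% and 89% figures are just the general bound evaluated at k = 2 and k = 3, which a few lines of code can confirm (the function name is an illustrative choice):

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100


bound_2 = chebyshev_lower_bound(2)  # (1 - 1/4) * 100 = 75.0
bound_3 = chebyshev_lower_bound(3)  # (1 - 1/9) * 100, roughly 88.9
```

Because the bound makes no assumption about the shape of the data, it is conservative; real data sets usually have more than this share of values inside the interval.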
Empirical Rule
Under the assumption of a bell-shaped population:
1. Approximately 68% of the data values lie between x̄ - s and x̄ + s (have z-scores between -1 and 1).
2. Approximately 95% of the data values lie between x̄ - 2s and x̄ + 2s (have z-scores between -2 and 2).
3. Approximately 99.7% of the data values lie between x̄ - 3s and x̄ + 3s (have z-scores between -3 and 3).
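One way to judge whether the Empirical Rule fits a data set is to count the share of values that actually fall inside each interval and compare it with 68%, 95%, and 99.7%. A minimal sketch, with an illustrative function name and a small hypothetical data set that is deliberately not bell shaped:

```python
import statistics


def percent_within(values, k):
    """Percentage of values lying between x_bar - k*s and x_bar + k*s."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    low, high = x_bar - k * s, x_bar + k * s
    inside = sum(1 for x in values if low <= x <= high)
    return 100 * inside / len(values)


# Hypothetical, evenly spread data (1 through 10), i.e. not bell shaped.
data = list(range(1, 11))
within_1s = percent_within(data, 1)  # 6 of 10 values -> 60.0
within_2s = percent_within(data, 2)  # all 10 values  -> 100.0
```

Here the observed 60% and 100% are some way from 68% and 95%, which is what one would expect for data that are flat rather than bell shaped.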
Allied Manufacturing Example
Is the Empirical Rule applicable to this data?
Probably yes.
The histogram is approximately bell shaped.
x̄ - 2s = 10.275 and x̄ + 2s = 10.3284
96 of the 100 data values fall between these limits, closely approximating the 95% called for by the Empirical Rule.
Class Number    Class (Age in years)    Frequency
1               20 and under 30          5
2               30 and under 40         14
3               40 and under 50          9
4               50 and under 60          6
5               60 and under 70          2
                Total                   36
Table 3.4
When raw data are not available:
Estimate x̄ by assuming data values are equal to the midpoint of their class.
Grouped Data
When raw data are not available:
Estimate s² by assuming data values are equal to the midpoint of their class and using the normal method:
$s^2 = \frac{\sum (\text{each data value})^2 - \left(\sum \text{each data value}\right)^2 / n}{n - 1}$
Class Number    Class              f     m     f · m    f · m²
1               20 and under 30     5    25     125      3,125
2               30 and under 40    14    35     490     17,150
3               40 and under 50     9    45     405     18,225
4               50 and under 60     6    55     330     18,150
5               60 and under 70     2    65     130      8,450
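The grouped-data estimates can be reproduced from the class limits and frequencies in the table. This sketch assumes the usual sample-variance form with an n - 1 divisor, s² = (∑f·m² - (∑f·m)²/n) / (n - 1); the function and variable names are illustrative.

```python
def grouped_mean_and_variance(classes):
    """Estimate the mean and sample variance from (lower, upper, frequency) rows.

    Every value in a class is treated as if it equals the class midpoint m.
    """
    n = sum(f for _, _, f in classes)
    if n < 2:
        raise ValueError("need at least two observations")
    sum_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fm2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    x_bar = sum_fm / n
    variance = (sum_fm2 - sum_fm**2 / n) / (n - 1)
    return n, x_bar, variance


# Classes and frequencies from the age table (Table 3.4).
age_classes = [(20, 30, 5), (30, 40, 14), (40, 50, 9), (50, 60, 6), (60, 70, 2)]
n, x_bar, s2 = grouped_mean_and_variance(age_classes)
# n = 36, sum of f*m = 1,480, sum of f*m^2 = 65,100
# x_bar is about 41.11 years and s2 is about 121.59
```

Taking the square root of s2 gives an estimated standard deviation of roughly 11.03 years.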
Box plots are graphical representations of data sets that illustrate the lowest data value (L), the first quartile (Q1), the median (Q2, Md), the third quartile (Q3), the interquartile range (IQR), and the highest data value (H).
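The quantities a box plot displays can be computed before any drawing is done. The sketch below assumes the quartiles are the 25th, 50th, and 75th percentiles found with the round-up/average percentile rules given earlier; the helper, the function names, and the data are illustrative.

```python
def _percentile(ordered, p):
    """Pth percentile of already-sorted data (round-up / average rules)."""
    whole, remainder = divmod(len(ordered) * p, 100)
    if remainder != 0:
        return ordered[whole]          # position rounded up (1-based)
    return (ordered[whole - 1] + ordered[whole]) / 2


def box_plot_summary(values):
    """Return the values a box plot illustrates: L, Q1, Md, Q3, IQR, H."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two data values")
    q1 = _percentile(ordered, 25)
    md = _percentile(ordered, 50)
    q3 = _percentile(ordered, 75)
    return {"L": ordered[0], "Q1": q1, "Md": md, "Q3": q3,
            "IQR": q3 - q1, "H": ordered[-1]}


# Hypothetical data: the integers 1 through 10.
summary = box_plot_summary(range(1, 11))
# {'L': 1, 'Q1': 3, 'Md': 5.5, 'Q3': 8, 'IQR': 5, 'H': 10}
```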