Jan 27, 2015
IBS Statistics Year 1
What are we going to learn?
• Review
• Chapter 3: Dispersion
  • Range
  • Variance (SD²)
  • Standard Deviation (SD)
  • Coefficient of variation (CV)
• Chapter 4: Displaying and exploring data
  • Dotplot
  • Stem-leaf
  • Boxplot
  • Skewness
Review
Discrete (counting) or continuous (measuring)?
• Class size
• Age
• Temperature
• Sales volume
• Salary
• Height
• Weight
• Shoe size (NL)
Review
Constructing Frequency Distribution: Quantitative Data
P46. N.30 Ch.2
45 observations
$2^5 = 32$, $2^6 = 64$, suggests 6 classes
$i \ge \frac{571 - 41}{6} = 88.33$
Use interval of 100
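The class-count rule and the minimum interval width above can be checked with a short sketch. It assumes the rule is "the smallest k with 2^k greater than the number of observations" and reads the slide's figures as a highest value of 571 and a lowest value of 41; the function names are illustrative only.

```python
def number_of_classes(n_observations):
    """Smallest k such that 2**k exceeds the number of observations."""
    k = 1
    while 2 ** k <= n_observations:
        k += 1
    return k


def minimum_interval(highest, lowest, k):
    """Minimum class width: the range divided by the number of classes."""
    return (highest - lowest) / k


n_obs = 45                                 # from the slide
k = number_of_classes(n_obs)               # 2**5 = 32 is too small, 2**6 = 64 is enough
i_min = minimum_interval(571, 41, k)       # assumed high and low values
print(k, round(i_min, 2))
```

Rounding the minimum width up to a convenient number gives the interval of 100 used on the slide.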
Class interval = 100
Review
P87 N.60 Ch.3
SCCoast, an Internet provider in the Southeast, developed the following frequency distribution on the age of Internet users. Describe the central tendency:
Central Tendency: Mean, Mode, Median
Mean: average. Median: midpoint. Mode: most frequent value.
$\bar{X} = 2410 / 60 = 40.17$ (years)
Mode = 45 (years)
Median = ? (years)
Review
P87 N.60 Ch.3
Step 1: Define the location of the median
$L_m = (60 + 1)/2 = 30.5$
Step 2: Calculate the median
Value:    40   M     50
Location: 28   30.5  48
$\frac{30.5 - 28}{48 - 28} = \frac{M - 40}{50 - 40}$
Median = 41.25
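The two steps for the grouped-data median can be sketched in code. This is a minimal illustration using only the numbers on the slide (60 observations, cumulative counts 28 and 48 at the values 40 and 50); the function and parameter names are made up for the example.

```python
def interpolate_median(n, lower_value, upper_value, cum_lower, cum_upper):
    """Locate the median at (n + 1) / 2 and interpolate within its class."""
    location = (n + 1) / 2                                  # Step 1
    fraction = (location - cum_lower) / (cum_upper - cum_lower)
    return lower_value + fraction * (upper_value - lower_value)  # Step 2


median = interpolate_median(n=60, lower_value=40, upper_value=50,
                            cum_lower=28, cum_upper=48)
print(median)  # 41.25
```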
Chapter 3 Dispersion
Dispersion
Range
Interquartile Range
Variance (SD²) and Standard Deviation (SD)
Coefficient of variation (CV)
Dispersion:
– tells us about the spread of the data.
– helps us to compare the spread in two or more distributions.
The mean alone is not reliable: two distributions can share the same mean yet differ widely in spread.
Range
Range: the difference between the largest and the smallest value in a data set.
Example: To find the range in 3, 5, 7, 3, 11
Range = 11 - 3 = 8
Variance
Population Variance:
• is the mean of the squared differences between each value and the mean.
• overcomes the weakness of the range by using all the values in the population.
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Sample Variance:
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Population Variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
EXAMPLE – Variance and Standard Deviation
The number of traffic citations issued during the last five months in Beaufort County, South Carolina, is 38, 26, 13, 41, and 22. What is the population variance?
Step 1: Get the mean
Step 2: Find the difference between each observation and the mean
Step 3: Square the differences and sum them up
Step 4: Divide by N
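The four steps can be traced for the traffic-citation data with a small sketch. The variable names are illustrative; `statistics.pvariance` from the Python standard library is used only as a cross-check.

```python
import statistics

citations = [38, 26, 13, 41, 22]

mu = sum(citations) / len(citations)              # Step 1: mean = 28
differences = [x - mu for x in citations]         # Step 2: 10, -2, -15, 13, -6
sum_of_squares = sum(d ** 2 for d in differences) # Step 3: 534
pop_var = sum_of_squares / len(citations)         # Step 4: divide by N

print(mu, sum_of_squares, pop_var)
print(statistics.pvariance(citations))            # cross-check
```

The population variance works out to 106.8 (citations squared).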
Standard Deviation
Population Standard Deviation: the square root of the population variance.
$\sigma = \sqrt{\sigma^2}$
Sample Standard Deviation: the square root of the sample variance.
$s = \sqrt{s^2}$
Example: The hourly wages earned by a sample of five students are: €7, €5, €11, €8, €6. Find the variance and standard deviation.
Step 1: Get the mean
$\bar{X} = \frac{\sum X}{n} = \frac{37}{5} = 7.40$
Step 2: Sum up the squared differences
Step 3: Divide by n - 1
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30$
Step 4: Take the square root
$s = \sqrt{5.30} = 2.30$
The variance is 5.30 (in euros squared); the standard deviation is €2.30.
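As a check on the arithmetic, the same four steps can be written out for the wage sample. This is only a sketch with illustrative variable names; note the division by n - 1 because the five students are a sample.

```python
import math

wages = [7, 5, 11, 8, 6]          # hourly wages in euros
n = len(wages)

mean = sum(wages) / n                                    # Step 1
sum_of_squares = sum((x - mean) ** 2 for x in wages)     # Step 2
sample_var = sum_of_squares / (n - 1)                    # Step 3
sample_sd = math.sqrt(sample_var)                        # Step 4

print(round(mean, 2), round(sum_of_squares, 2),
      round(sample_var, 2), round(sample_sd, 2))
```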
Standard Deviation
Compare
Schiphol: 20 40 50 60 80
Utrecht:  20 49 50 51 80
• Coffee sales at the Utrecht Starbucks are more closely clustered around the mean of 50 than sales at the Schiphol Starbucks.
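The comparison can be made concrete with a short sketch. It assumes the five figures per store are treated as samples (so `statistics.stdev`, which divides by n - 1, is used); the slide does not say whether they are samples or populations.

```python
import statistics

schiphol = [20, 40, 50, 60, 80]
utrecht = [20, 49, 50, 51, 80]

for name, sales in (("Schiphol", schiphol), ("Utrecht", utrecht)):
    value_range = max(sales) - min(sales)
    print(name,
          "mean:", statistics.mean(sales),
          "range:", value_range,
          "sd:", round(statistics.stdev(sales), 2))

sd_schiphol = statistics.stdev(schiphol)
sd_utrecht = statistics.stdev(utrecht)
```

Both stores have the same mean and the same range, so only the standard deviation picks up the tighter clustering in Utrecht.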
Standard Deviation of Grouped Data
P87 N.60 Ch.3
Step 1: Find the midpoint M of each class
Step 2: Compute $f (M - \bar{X})^2$ for each class
Step 3: Sum up
Step 4: Divide by n - 1: $\frac{7098}{60 - 1}$
Step 5: Take the square root: $s = \sqrt{\frac{7098}{60 - 1}} = 10.97$
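The five steps can be sketched as a small function. The frequency table for the exercise is not reproduced in these slides, so the midpoints and frequencies below are a hypothetical table used purely for illustration; only the last line uses the slide's own figures (7098 and 60).

```python
import math


def grouped_sd(midpoints, frequencies):
    """Sample standard deviation estimated from class midpoints and frequencies."""
    n = sum(frequencies)
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    # Steps 2 and 3: f * (M - mean)^2 for each class, then sum
    sum_of_squares = sum(f * (m - mean) ** 2
                         for m, f in zip(midpoints, frequencies))
    # Steps 4 and 5: divide by n - 1, then take the square root
    return math.sqrt(sum_of_squares / (n - 1))


# Hypothetical table (NOT the SCCoast data): three classes, ten observations
example_midpoints = [15, 25, 35]
example_frequencies = [2, 5, 3]
example_sd = grouped_sd(example_midpoints, example_frequencies)
print(round(example_sd, 2))

# Final two steps with the slide's own totals
slide_sd = math.sqrt(7098 / (60 - 1))
print(round(slide_sd, 2))  # 10.97
```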
Coefficient of Variation
This is the ratio of the standard deviation to the mean:
$CV = \frac{s}{\bar{X}} \times 100\%$
The coefficient of variation expresses the variation within sample values relative to their magnitude.
The following times were recorded by the quarter-mile and mile runners of a university track team (times are in minutes).
Quarter-Mile Times: 0.92 0.98 1.04 0.90 0.99
Mile Times: 4.52 4.35 4.60 4.70 4.50
After viewing this sample of running times, one of the coaches commented that the quarter milers turned in the more consistent times. Calculate the appropriate measure to check this and comment on the coach’s statement.
We can compare the dispersion with the coefficient of variation because the two sets of times have different “magnitudes”.
Coefficient of variation of Quarter-Mile Times: 0.05639 / 0.966 = 0.05837, about 6%
Coefficient of variation of Mile Times: 0.12954 / 4.534 = 0.02857, about 3%
No, the mile runners showed the more consistent times.
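The two coefficients of variation can be reproduced with a few lines. This sketch treats both sets of times as samples, which matches the standard deviations quoted above; the helper name is illustrative.

```python
import statistics

quarter_mile = [0.92, 0.98, 1.04, 0.90, 0.99]
mile = [4.52, 4.35, 4.60, 4.70, 4.50]


def coefficient_of_variation(sample):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample)


cv_quarter = coefficient_of_variation(quarter_mile)
cv_mile = coefficient_of_variation(mile)
print(f"Quarter-mile CV: {cv_quarter:.1%}")
print(f"Mile CV: {cv_mile:.1%}")
```

The smaller ratio belongs to the mile runners, even though their standard deviation in minutes is the larger of the two.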
Chapter 4 Displaying and Exploring Data
Dot plots:
Stem-and-Leaf Displays:
Each numerical value is divided into two parts. The leading digit(s) become the stem and the trailing digit the leaf. The stems are located along the vertical axis, and the leaf values are stacked against each other along the horizontal axis.
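A rough sketch of how a stem-and-leaf display is built, assuming whole-number data where the last digit is the leaf. The six values are borrowed from the quartile example later in these slides; the function name is illustrative, and stems with no leaves are simply omitted here.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's trailing digit (leaf) under its leading digits (stem)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # e.g. 104 -> stem 10, leaf 4
        stems[stem].append(leaf)
    return dict(stems)


data = [43, 61, 91, 75, 101, 104]
display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```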
Quartiles, Deciles, and Percentiles
Alternative ways of describing the spread of data include determining the location of values that divide a set of observations into equal parts.
Raw Score   Frequency   Cumulative Frequency   Percentile Rank
95          1           25                     100
93          1           24                     96
88          2           23                     92
85          3           21                     84
79          1           18                     72
75          4           17                     68
70          6           13                     52
65          2           7                      28
62          1           5                      20
58          1           4                      16
54          2           3                      12
50          1           1                      4
N = 25
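The last two columns of the table follow from the first two: the third column is a running total of the frequencies from the lowest score upward, and the percentile rank is that running total as a percentage of N. A minimal sketch, with the scores listed lowest first:

```python
from itertools import accumulate

scores = [50, 54, 58, 62, 65, 70, 75, 79, 85, 88, 93, 95]
frequencies = [1, 2, 1, 1, 2, 6, 4, 1, 3, 2, 1, 1]

n = sum(frequencies)                               # 25
cumulative = list(accumulate(frequencies))         # running total from the lowest score
percentile_ranks = [100 * c / n for c in cumulative]

for score, f, c, rank in zip(scores, frequencies, cumulative, percentile_ranks):
    print(score, f, c, rank)
```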
Quartiles, Deciles, and Percentiles
Example: 43 61 91 75 101 104
What is the first quartile?
Step 1: Organize the data from lowest to largest value
Position: P1  P2  P3  P4  P5   P6
Value:    43  61  75  91  101  104
Step 2: Locate the first quartile
$L_{25} = (n + 1)\frac{P}{100} = (6 + 1)\frac{25}{100} = 1.75$
Step 3: Draw two lines around position 1.75, which lies between P1 = 43 and P2 = 61
61 - 43 = 18
0.75 * 18 = 13.5
43 + 13.5 = 56.5
The first quartile is 56.5.
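The three steps generalise to any percentile. Below is a minimal sketch of the (n + 1)P/100 location method; the function name is illustrative, and clamping locations that fall outside the data to the smallest or largest value is an assumption the slides do not address.

```python
def percentile(values, p):
    """Percentile by the (n + 1) * P / 100 location method with interpolation."""
    data = sorted(values)                      # Step 1: order the data
    n = len(data)
    location = (n + 1) * p / 100               # Step 2: locate the position
    whole = int(location)
    fraction = location - whole
    if whole < 1:                              # assumption: clamp below the first value
        return data[0]
    if whole >= n:                             # assumption: clamp above the last value
        return data[-1]
    lower, upper = data[whole - 1], data[whole]
    return lower + fraction * (upper - lower)  # Step 3: interpolate


data = [43, 61, 91, 75, 101, 104]
q1 = percentile(data, 25)
q2 = percentile(data, 50)
q3 = percentile(data, 75)
print(q1, q2, q3)
```

Other software may use a different interpolation convention, so results can differ slightly from this textbook method.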
Exercise
P110. N.14 Ch.4
Listed below, ordered from smallest to largest, are the number of visits last week.
a. Determine the median number of calls.
The median is 58.
b. Determine the first and third quartiles.
Q1 = 51.25, Q3 = 66.00
c. Determine the first decile and the ninth decile.
D1 = 45.30, D9 = 76.40
d. Determine the 33rd percentile.
P33 = 53.53
Box Plots
A graphical display, based on quartiles, that helps us visualize a set of data.
It marks five values: minimum, Q1, median, Q3, maximum.
Box Plots & Cumulative Frequency Distribution
Skewness:
Another characteristic of a set of data is the shape. Four shapes are commonly observed:
• symmetric
• positively skewed
• negatively skewed
• bimodal
Zero skewness: mode = median = mean
Positive skewness: mode < median < mean
Negative skewness: mean < median < mode
Exercise
P113. N.18 Ch.4
A sample of 28 time shares in the Orlando, Florida, area revealed the following daily charges for a one-bedroom suite. For convenience the data are ordered from smallest to largest. Construct a box plot to represent the data. Comment on the distribution. Be sure to identify the first and third quartiles and the median.
• The median is $253.
• About 25% of the daily charges are less than $214 and 25% are above $304.
• The distribution is negatively skewed.
What have we learnt today?
• Review
• Chapter 3: Dispersion
  • Range
  • Variance (SD²)
  • Standard Deviation (SD)
  • Coefficient of variation (CV)
• Chapter 4: Displaying and exploring data
  • Dotplot
  • Stem-leaf
  • Boxplot
  • Skewness