1 Business 90: Business Statistics Professor David Mease Sec 03, T R 7:30-8:45AM BBC 204 Lecture 5 = More of Chapter “Presenting Data in Tables and Charts” (PDITAC) Agenda: 1) Reminder about Homework 2 (due Tuesday 2/16)
Dec 17, 2015
1
Business 90: Business Statistics
Professor David Mease
Sec 03, T R 7:30-8:45AM BBC 204
Lecture 5 = More of Chapter “Presenting Data in Tables and Charts”
(PDITAC)
Agenda:
1) Reminder about Homework 2 (due Tuesday 2/16)
2) Lecture over more of Chapter PDITAC
3) Take pictures
2
Homework 2 - Due Tuesday 2/16
1) Read the chapter entitled “Presenting Data in Tables and Charts”
2) The Excel file at http://www.cob.sjsu.edu/mease_d/old-quiz-scores.xls has Quiz 1 scores for a Bus 90 class I taught last semester. Right click this link and select "Save Target As..." to download this file onto your computer. Then open it using Excel.
a) Make the frequency distribution by hand. Begin at 0 and end at 22 using 11 intervals. (Hint: You may use Excel to sort the data first if you like).
b) Graph the frequency histogram by hand.
c) Graph the percentage polygon by hand.
d) Make the cumulative percentage distribution by hand.
e) Graph the ogive by hand.
f) Check your answer for part a using Excel.
3) The data at http://www.cob.sjsu.edu/mease_d/houses.xls has house prices for a sample of 1500 California homes. The prices are in thousands of dollars. Right click this link and select "Save Target As..." to download this file onto your computer. Then open it with Excel and use Excel to do the following. **Be sure to print out your solutions and bring them with you to class for the quiz.**
a) Make the frequency distribution using Excel. Begin at 0 and end at 3.5 million using 7 intervals.
b) Graph the percentage histogram using Excel.
c) Graph the percentage polygon using Excel.
d) Make the cumulative percentage distribution using Excel.
e) Graph the ogive using Excel.
3
Presenting Data in Tables and Charts
Statistics for Managers Using Microsoft® Excel
4th Edition
4
Chapter Goals
After completing this chapter, you should be able to:
Create an ordered array
Construct and interpret a frequency distribution, histogram, and polygon for numerical data
Construct and interpret a cumulative percentage distribution and ogive for numerical data
Create and interpret contingency tables, bar charts, and pie charts for categorical data
Create and interpret a scatter diagram and a least squares regression line (in other chapter p. 387-398)
Describe appropriate and inappropriate ways to display data graphically
5
Frequency Distributions in Excel
In ICE #5 we constructed a frequency distribution for the exam scores.
92 60 83 36 62 65 80 88 50 63
92 64 84 89 83 80 88 91 90 84
71 77 25 92 49 88 54 51 59 41
71 53 69 68 68 57 60 90 66 50
Next we will learn how to make frequency distributions using Microsoft Excel.
This is especially useful for large data sets.
6
Frequency Distributions in Excel
1) First make a column with your desired upper end points for your intervals.
2) Next highlight a column (of the same length) to store the frequency values.
3) Do insert > function > statistical > frequency.
4) “Data_array” is the data and “Bins_array” is your desired upper end points for your intervals.
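5) IMPORTANT: You must hold down the Shift and Control keys and then press Enter, so that the function is entered as an array formula that fills the whole highlighted column.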
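For readers who want to see what the function computes, here is a minimal Python sketch of the same counting rule: each value is counted in the first interval whose upper end point is greater than or equal to it, and one extra count at the end catches values above the last end point. The function name `excel_style_frequency` is a hypothetical helper made up for this illustration; the data are the exam scores from ICE #5.

```python
from bisect import bisect_left

# Exam scores from ICE #5 (40 values).
scores = [
    92, 60, 83, 36, 62, 65, 80, 88, 50, 63,
    92, 64, 84, 89, 83, 80, 88, 91, 90, 84,
    71, 77, 25, 92, 49, 88, 54, 51, 59, 41,
    71, 53, 69, 68, 68, 57, 60, 90, 66, 50,
]

def excel_style_frequency(data, bins):
    """Hypothetical helper: count values in (previous bin, bin] for each bin.

    `bins` must be sorted in increasing order. The returned list has one
    extra entry at the end for values larger than the last upper end point.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        # bisect_left finds the first bin whose upper end point is >= x.
        counts[bisect_left(bins, x)] += 1
    return counts

upper_end_points = [30, 40, 50, 60, 70, 80, 90, 100]
counts = excel_style_frequency(scores, upper_end_points)
print(counts)
```

Note that a score of exactly 50 is counted in the interval ending at 50 here, because the upper end points are treated as inclusive.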
10
In class exercise #10: Construct a frequency distribution for the exam scores using Excel.
11
Frequency Distributions in Excel
Problem: Our intervals are “up to but not including” the upper end point, but Excel counts values up to AND including each upper end point.
Solution: Change 30 to 29.99 and change 40 to 39.99 and so on.
But you don’t want it to say 29.99 when you make your table.
Solution: Paste table somewhere else and then fix the numbers.
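A small Python sketch, under the assumption that the scores are whole numbers, shows why the 29.99 adjustment matters for the exam scores. The helper `count_up_to_and_including` is a name invented for this illustration; it counts the way Excel does, so passing it the adjusted end points reproduces “up to but not including”.

```python
# Exam scores from ICE #5 (40 values).
scores = [
    92, 60, 83, 36, 62, 65, 80, 88, 50, 63,
    92, 64, 84, 89, 83, 80, 88, 91, 90, 84,
    71, 77, 25, 92, 49, 88, 54, 51, 59, 41,
    71, 53, 69, 68, 68, 57, 60, 90, 66, 50,
]

def count_up_to_and_including(data, upper_end_points):
    """Hypothetical helper: count values with previous end point < x <= end point."""
    counts = []
    lower = float("-inf")
    for upper in upper_end_points:
        counts.append(sum(1 for x in data if lower < x <= upper))
        lower = upper
    return counts

original = [30, 40, 50, 60, 70, 80, 90, 100]
adjusted = [u - 0.01 for u in original]  # 29.99, 39.99, and so on

inclusive_counts = count_up_to_and_including(scores, original)
adjusted_counts = count_up_to_and_including(scores, adjusted)

print("Unadjusted end points:", inclusive_counts)
print("Adjusted end points:  ", adjusted_counts)
```

The two lists differ only because some scores (50, 60, 80 and 90) fall exactly on an end point; the adjustment moves them into the next interval up.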
12
In class exercise #10: Construct a frequency distribution for the exam scores using Excel.
ANSWER:
Intervals                          Frequency
20 up to but not including 30      1
30 up to but not including 40      1
40 up to but not including 50      2
50 up to but not including 60      7
60 up to but not including 70      10
70 up to but not including 80      3
80 up to but not including 90      10
90 up to but not including 100     6
13
In class exercise #11: Construct a percentage distribution for the exam scores using Excel.
14
In class exercise #11: Construct a percentage distribution for the exam scores using Excel.
ANSWER:
Intervals                          Percentage
20 up to but not including 30      2.5%
30 up to but not including 40      2.5%
40 up to but not including 50      5.0%
50 up to but not including 60      17.5%
60 up to but not including 70      25.0%
70 up to but not including 80      7.5%
80 up to but not including 90      25.0%
90 up to but not including 100     15.0%
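The percentages are just each frequency divided by the total number of scores. A minimal Python sketch of that arithmetic, starting from the frequencies in the answer to exercise #10 (variable names are illustrative only):

```python
# Frequencies for the intervals 20-30, 30-40, ..., 90-100 (exercise #10).
frequencies = [1, 1, 2, 7, 10, 3, 10, 6]

total = sum(frequencies)  # number of exam scores
percentages = [100 * f / total for f in frequencies]

for lower, pct in zip(range(20, 100, 10), percentages):
    print(f"{lower} up to but not including {lower + 10}: {pct:.1f}%")
```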
15
Histograms in Excel
1) You first need one column with the class MIDPOINTS and another column that has the frequencies.
2) Do insert > chart > column.
16
Histograms in Excel
3) Click “Next” and put your frequencies in the “Data range” and then click the “series” tab at the top and put your midpoints in the “Category (X) axis labels”
4) Add axis labels and a title, remove the legend and then click “Finish”
17
Histograms in Excel
5) IMPORTANT: Histograms do not have gaps. Correct this by double clicking on any bar, going to “Options” and setting the “Gap width” to 0.
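As a rough illustration of the two columns the chart needs, the following Python sketch computes the class midpoints from the lower end points and prints a simple text version of the frequency histogram. It assumes equal-width intervals of width 10 and uses the frequencies from exercise #10; it is not a substitute for the Excel chart.

```python
# Lower end points of the intervals 20-30, 30-40, ..., 90-100.
lower_end_points = [20, 30, 40, 50, 60, 70, 80, 90]
width = 10
frequencies = [1, 1, 2, 7, 10, 3, 10, 6]

# Each midpoint is halfway between the lower and upper end point.
midpoints = [lo + width / 2 for lo in lower_end_points]

# One row per class, one '#' per exam score in that class.
for midpoint, frequency in zip(midpoints, frequencies):
    print(f"{midpoint:>5.0f} | {'#' * frequency}")
```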
18
In class exercise #12: Construct a frequency histogram for the exam scores using Excel.
19
In class exercise #12: Construct a frequency histogram for the exam scores using Excel.
ANSWER:
[Figure: Frequency Histogram. X axis: Exam Scores (class midpoints 25 to 95). Y axis: Frequencies (0 to 10).]
20
In class exercise #13: Construct a percentage histogram for the exam scores using Excel.
21
In class exercise #13: Construct a percentage histogram for the exam scores using Excel.
ANSWER:
[Figure: Percentage Histogram. X axis: Exam Scores (class midpoints 25 to 95). Y axis: Percent (0% to 25%).]
22
Questions for Grouping Data into Classes
1. How wide should each interval be? (How many classes should be used?)
2. How should the endpoints of the intervals be determined?
Often answered by trial and error, subject to user judgment
The goal is to create a distribution that is neither too “jagged” nor too “blocky”
Goal is to appropriately show the pattern of variation in the data
23
How Many Class Intervals?
Too Many (Narrow class intervals)
may yield a very jagged distribution with gaps from empty classes
can give a poor indication of how frequency varies across classes
[Figure: histogram with narrow classes. X axis: Temperature (4, 8, 12, ..., 60, More). Y axis: Frequency (0 to 3.5).]
Too Few (Wide class intervals)
may compress variation too much and yield a blocky distribution
can obscure important patterns of variation
[Figure: histogram with wide classes. X axis: Temperature (0, 30, 60, More). Y axis: Frequency (0 to 12).]
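The trial-and-error process can be sketched in a few lines of Python. The temperature data behind the charts on this slide are not available here, so this illustration uses the exam scores from ICE #5 instead, and `frequency_distribution` is a hypothetical helper written for the example. It counts “up to but not including” each upper end point for three different interval widths.

```python
# Exam scores from ICE #5 (40 values).
scores = [
    92, 60, 83, 36, 62, 65, 80, 88, 50, 63,
    92, 64, 84, 89, 83, 80, 88, 91, 90, 84,
    71, 77, 25, 92, 49, 88, 54, 51, 59, 41,
    71, 53, 69, 68, 68, 57, 60, 90, 66, 50,
]

def frequency_distribution(data, start, width, n_classes):
    """Hypothetical helper: counts for classes [start, start + width), and so on."""
    counts = [0] * n_classes
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < n_classes:  # values outside the range are ignored
            counts[index] += 1
    return counts

narrow = frequency_distribution(scores, start=20, width=2, n_classes=40)
medium = frequency_distribution(scores, start=20, width=10, n_classes=8)
wide = frequency_distribution(scores, start=20, width=40, n_classes=2)

print("width 2: ", narrow)   # jagged, with empty classes
print("width 10:", medium)
print("width 40:", wide)     # blocky, hides most of the pattern
```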
24
The Polygon
The polygon is just like a histogram EXCEPT:
1) It uses points connected by lines instead of bars
2) Midpoints are used
3) Two extra class intervals having frequency of zero are included for completeness
Like histograms, these can be frequency OR percentage
Percentage polygons can show multiple groups on the same plot using colored lines
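A minimal Python sketch of how the points of a percentage polygon can be assembled, assuming equal-width classes of width 10 and using the exam score percentages from exercise #11. The extra class at each end gets a percentage of zero so the line comes back down to the axis.

```python
# Class midpoints and percentages for the exam scores (exercise #11).
midpoints = [25, 35, 45, 55, 65, 75, 85, 95]
percentages = [2.5, 2.5, 5.0, 17.5, 25.0, 7.5, 25.0, 15.0]
width = 10

# Add one extra class with zero percent before the first and after the last.
polygon_x = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
polygon_y = [0.0] + percentages + [0.0]

# These (x, y) pairs are the points to plot and connect with lines.
for x, y in zip(polygon_x, polygon_y):
    print(f"midpoint {x:>3}: {y:.1f}%")
```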
25
In class exercise #14: Construct a percentage polygon for the exam scores by hand.
26
Polygons in Excel
These are done in Excel by “Insert” > “Chart” > “Line” and then selecting “line with markers displayed at each data value” and clicking “Next”
The frequencies or percents should be the “data range” (including zero at beginning and end)
Under the “Series” tab at the top the “Category (X) axis labels” should be the class midpoints (including the two extra ones)
Provide a “chart title” and labels for the X axis and Y axis
If you have just one line, uncheck the “show legend” box under the legend tab
28
In class exercise #15: Construct a percentage polygon for the exam scores using Excel.
29
In class exercise #15: Construct a percentage polygon for the exam scores using Excel.
ANSWER:
[Figure: Exam Score Percentage Polygon. X axis: Exam Scores (class midpoints 15 to 105). Y axis: Percentage (0.0% to 30.0%).]
30
The Cumulative Distribution
The cumulative distribution lists the total percentage LESS THAN each class boundary
It starts at zero and ends at 100%
The corresponding polygon is called an ogive and uses the class boundaries (NOT the midpoints)
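A short Python sketch of the calculation, using the exam score frequencies from exercise #10 as the input: the running total of the frequencies gives the count LESS THAN each class boundary, which is then converted to a percentage. The variable names are illustrative only.

```python
from itertools import accumulate

# Class boundaries and frequencies for the exam scores (exercise #10).
boundaries = [20, 30, 40, 50, 60, 70, 80, 90, 100]
frequencies = [1, 1, 2, 7, 10, 3, 10, 6]

total = sum(frequencies)

# Nothing is below the first boundary, so the running total starts at 0.
cumulative_counts = [0] + list(accumulate(frequencies))
cumulative_percentages = [100 * c / total for c in cumulative_counts]

# These (boundary, percentage) pairs are also the points of the ogive.
for boundary, pct in zip(boundaries, cumulative_percentages):
    print(f"less than {boundary:>3}: {pct:.1f}%")
```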
31
In class exercise #16: Construct a cumulative percentage distribution for the exam scores by hand.