Graphical Displays of Information Chapter 3.1 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U

Jan 01, 2016


Hilda Gardner
Transcript
Page 1: Graphical Displays of Information Chapter 3.1 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U.

Graphical Displays of Information

Chapter 3.1 – Tools for Analyzing Data

Mathematics of Data Management (Nelson)

MDM 4U

Page 2:

Histograms

display continuous data grouped into class intervals, showing how the data are spread over a range

the width of each bar is known as the bin width

different bin widths produce different shaped distributions

bin widths should be equal, and there should be at least five (5) bins
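A minimal sketch of how bin width changes the shape of a histogram. The data set and the helper name `histogram_counts` are invented for illustration; they are not from the textbook.

```python
def histogram_counts(data, start, width, n_bins):
    """Count how many values fall in each half-open bin [lo, lo + width)."""
    counts = [0] * n_bins
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical sample, invented for illustration only.
data = [41, 47, 52, 55, 58, 61, 63, 64, 66, 68,
        70, 71, 73, 75, 78, 82, 85, 91, 97, 104]

wide = histogram_counts(data, start=40, width=40, n_bins=2)    # too few bins
medium = histogram_counts(data, start=40, width=10, n_bins=7)  # at least five bins

# Print a rough text histogram for each choice of bin width.
for label, counts, width in (("width 40", wide, 40), ("width 10", medium, 10)):
    print(label)
    for i, c in enumerate(counts):
        lo = 40 + i * width
        print(f"  {lo:>3}-{lo + width:<3} {'#' * c}")
```

With only two bins the mound in the middle of the data is hidden; with seven equal bins it becomes visible.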

Page 3:

Histogram Example

these histograms represent the same data

however, one shows much less of the structure of the data

too many bins (bin width too small) is also a problem

[Figure: three histograms of the same data set, each titled "Data Histogram"; x-axis: SomeData, y-axis: Count]

Page 4:

Histogram Applet – Old Faithful

http://www.isixsigma.com/offsite.asp?A=Fr&Url=http://www.stat.sc.edu/~west/javahtml/Histogram.html

Page 5:

Bin Width Calculation

the bin width is calculated by dividing the range (max – min) by the number of intervals you desire (5-6)

the bins should not overlap

wrong: 0-10, 10-20, 20-30, 30-40

correct (discrete): 0-10, 11-20, 21-30, 31-40

correct (continuous): 0-9.99, 10-19.99, 20-29.99, 30-39.99
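The calculation above can be sketched in a few lines. The function names and the sample data are assumptions made for illustration. Half-open intervals [a, b) are one common way to keep continuous bins from overlapping; they play the same role as the 0-9.99, 10-19.99 convention.

```python
def bin_width(data, n_bins):
    """Bin width = range / number of intervals, where range = max - min."""
    return (max(data) - min(data)) / n_bins


def bin_edges(data, n_bins):
    """Return n_bins + 1 equally spaced edges from min to max."""
    width = bin_width(data, n_bins)
    low = min(data)
    return [low + i * width for i in range(n_bins + 1)]


def bin_index(x, edges):
    """Place x in exactly one bin [edge_i, edge_i+1); the last bin also includes the maximum."""
    n_bins = len(edges) - 1
    width = (edges[-1] - edges[0]) / n_bins
    i = int((x - edges[0]) // width)
    return min(i, n_bins - 1)


# Hypothetical data, invented for illustration only.
data = [0, 4, 10, 12, 20, 25, 33, 40]
edges = bin_edges(data, 4)

print(bin_width(data, 4))    # 10.0
print(edges)                 # [0.0, 10.0, 20.0, 30.0, 40.0]
print(bin_index(10, edges))  # 1, so 10 belongs to the second bin only
```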

Page 6:

Mound-shaped distribution

The middle interval(s) have the greatest frequency (i.e. the tallest bar)

The bars get smaller as you move out to the edges.

Page 7:

U-shaped distribution

Lowest frequency in the centre, highest towards the outside

E.g. heights of students in a combined grade 1 and grade 6 class

Page 8:

Uniform distribution

All bars are approximately the same height

E.g. roll a die 50 times

Page 9:

Symmetric distribution

A distribution that is the same on either side of the centre

U-Shaped, Uniform and Normal Distributions are symmetric

Page 10:

Skewed distribution (left and right)

Highest frequencies at one end

Left-skewed drops off to the left

E.g. the years on a handful of quarters

Page 11:

Exercises

Define in your notes:

Frequency distribution (p. 146)

Cumulative frequency (p. 146)

Relative frequency (p. 146)

Try page 146 #1, 2, 3, 11 (use Excel or Fathom), 13

Page 12:

Measures of Central Tendency

Chapter 3.2 – Tools for Analyzing Data

Mathematics of Data Management (Nelson)

MDM 4U

Page 13:

Sigma Notation

the sigma notation is used to compactly express a mathematical series

ex: 1 + 2 + 3 + 4 + … + 15

this can be expressed:

$$\sum_{k=1}^{15} k = 1 + 2 + 3 + 4 + \dots + 14 + 15$$

the variable k is called the index of summation

the number 1 is the lower limit and the number 15 is the upper limit

we would say: "the sum of k for k = 1 to k = 15"

Page 14:

Examples:

write in expanded form:

$$\sum_{n=4}^{7} (2n + 1)$$

= [2(4) + 1] + [2(5) + 1] + [2(6) + 1] + [2(7) + 1]

= 9 + 11 + 13 + 15

= 48

note that any letter can be used for the index of summation, though k, a, n, i, j & x are often used
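As a quick check, the two sums from the sigma notation slides (k from 1 to 15, and 2n + 1 for n from 4 to 7) can be evaluated with a loop over the index of summation. This is only a sketch; the variable names are chosen for illustration.

```python
# Sum of k for k = 1 to k = 15 (range stops before its upper argument, so use 16).
total_k = sum(k for k in range(1, 16))

# Sum of (2n + 1) for n = 4 to n = 7, keeping the expanded terms visible.
terms = [2 * n + 1 for n in range(4, 8)]
total_n = sum(terms)

print(total_k)  # 120
print(terms)    # [9, 11, 13, 15]
print(total_n)  # 48
```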

Page 15:

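Example: write the following in sigma notation

$$3 + \frac{3}{2} + \frac{3}{4} + \frac{3}{8}$$

$$= \frac{3}{2^0} + \frac{3}{2^1} + \frac{3}{2^2} + \frac{3}{2^3}$$

$$= \sum_{n=0}^{3} \frac{3}{2^n}$$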

Page 16:

The Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

found by dividing the sum of all the data points by the number of elements of data

Deviation

the distance of a data point from the mean

calculated by subtracting the mean from the value
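A short sketch of the mean and the deviations, using a made-up list of marks (not from the textbook). Note that the deviations always add up to zero.

```python
def mean(values):
    """Sum of all the data points divided by the number of data points."""
    return sum(values) / len(values)


def deviations(values):
    """Deviation of each value = value - mean."""
    m = mean(values)
    return [x - m for x in values]


# Hypothetical marks, invented for illustration only.
marks = [60, 70, 80, 90]

print(mean(marks))             # 75.0
print(deviations(marks))       # [-15.0, -5.0, 5.0, 15.0]
print(sum(deviations(marks)))  # 0.0
```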

Page 17:

The Weighted Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i}$$

where x_i represents each data point and w_i represents its weight or frequency

see examples on pages 153 and 154

example: 7 students have a mark of 70 and 10 students have a mark of 80

mean = (70 * 7 + 80 * 10) / (7 + 10) = 1290 / 17 ≈ 75.9
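The marks example above (7 students at 70, 10 students at 80) can be checked with a small function. The name `weighted_mean` is chosen for illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    total = sum(x * w for x, w in zip(values, weights))
    return total / sum(weights)


marks = [70, 80]
students = [7, 10]  # the frequencies act as the weights

result = weighted_mean(marks, students)
print(round(result, 1))  # 75.9
```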

Page 18:

Means with grouped data

for data that is already grouped into class intervals (assuming you do not have the original data), you must use the midpoint of each class to estimate the weighted mean

see the example on pages 154-155
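A sketch of the grouped-data estimate, using the height table from Example 2 later in these slides (classes 141-150, 151-160, 161-170 with 3, 7 and 4 students). The midpoints 145, 155 and 165 are taken as that example uses them. The function name is an assumption.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean by treating every value in a class as its midpoint."""
    total = sum(m * f for m, f in zip(midpoints, frequencies))
    return total / sum(frequencies)


# Midpoints as used in the document's Example 2 for 141-150, 151-160, 161-170.
midpoints = [145, 155, 165]
frequencies = [3, 7, 4]

estimate = grouped_mean(midpoints, frequencies)
print(round(estimate, 1))  # 155.7
```

The result is only an estimate, because the original heights inside each class are unknown.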

Page 19:

Median

the midpoint of the data

calculated by placing all the values in order

if there is an even number of values, the median is the mean of the middle two numbers

1 4 6 8 9 12 median = 7

if there is an odd number of values, the median is the middle number

1 4 6 8 9 median = 6
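The two cases (even and odd number of values) can be sketched as follows, using the two lists from this slide. The function is written out by hand for illustration; Python's standard library also offers `statistics.median`.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)  # the values must be placed in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd count: the middle number
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of middle two


print(median([1, 4, 6, 8, 9, 12]))  # 7.0
print(median([1, 4, 6, 8, 9]))      # 6
```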

Page 20:

Mode

Simply chosen by finding the number that occurs most often

There may be no mode, one mode, two modes (bimodal), etc.

Which distributions from yesterday have one mode? Mound-shaped, Left/Right-Skewed

Two modes? U-Shaped, some Symmetric

Multiple modes? Uniform

Modes are appropriate for discrete data or non-numerical data

shoe sizes

shoe colors
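A sketch of finding the mode(s) for numerical and non-numerical data. The shoe data is invented for illustration, and treating "every value occurs once" as "no mode" is one common convention, assumed here.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often (may be zero, one or several)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []  # assumed convention: no value repeats, so there is no mode
    return [v for v, c in counts.items() if c == highest]


# Hypothetical data, invented for illustration only.
shoe_sizes = [7, 8, 8, 9, 9, 10]
shoe_colors = ["black", "white", "black", "red"]

print(modes(shoe_sizes))   # [8, 9]  (bimodal)
print(modes(shoe_colors))  # ['black']
print(modes([1, 2, 3]))    # []  (no mode)
```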

Page 21:

Distributions and Central Tendency

the relationship between the three measures changes depending on the spread of the data

symmetric (mound shaped) mean = median = mode

right skewed mean > median > mode

left skewed mean < median < mode

[Figure: three histograms, each titled "Data Histogram"; x-axis: data (0 to 7), y-axis: Count]

Page 22:

What Method is Most Appropriate?

Outliers are data points that are quite different from the other points

Outliers have the greatest effect on the mean

Median is least affected by outliers

Skewed data is best represented by the median

If symmetric either median or mean

If not numeric or if the frequency is the most critical, use the mode
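A small sketch of why the median is preferred when there are outliers. The numbers are invented for illustration; `statistics` is part of Python's standard library.

```python
import statistics

# Hypothetical data, invented for illustration only.
without_outlier = [10, 12, 13, 14, 15]
with_outlier = without_outlier + [100]  # one value far from the rest

mean_before = statistics.mean(without_outlier)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(without_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, round(mean_after, 1))  # 12.8 27.3
print(median_before, median_after)        # 13 13.5
```

One outlier moves the mean by more than 14, while the median moves by only 0.5.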

Page 23:

Example 1

find the mean, median and mode

mean = [(1 × 2) + (2 × 8) + (3 × 14) + (4 × 3)] / 27 = 2.7

median = 3

mode = 3

which way is it skewed? Left

Survey responses 1 2 3 4

Frequency 2 8 14 3
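The answers to Example 1 can be checked by expanding the frequency table into the 27 individual responses. This is a sketch using Python's standard `statistics` module; the variable names are chosen for illustration.

```python
import statistics

responses = [1, 2, 3, 4]
frequencies = [2, 8, 14, 3]

# Expand the table: two 1s, eight 2s, fourteen 3s, three 4s.
values = [r for r, f in zip(responses, frequencies) for _ in range(f)]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

print(len(values))     # 27
print(round(mean, 1))  # 2.7
print(median)          # 3
print(mode)            # 3
```

Here the mean is below the median, which is consistent with the left skew noted in the example.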

Page 24:

Example 2

Find the mean, median and mode

mean = [(145 × 3) + (155 × 7) + (165 × 4)] / 14 = 155.7 (using the class midpoints 145, 155 and 165)

median = 155

mode = 151-160

which way is it skewed? Mound-shaped (not skewed)

Height 141-150 151-160 161-170

No. of Students 3 7 4

Page 25:

Exercises

try page 159 #4, 5, 6, 8

Remembrance Day by the Numbers

http://www42.statcan.ca/smr08/smr08_064_e.htm
