Lecture 3 Data Descriptive Measurement Distribution of Frequency Source: Copyright @2005 Brooks/Cole, a division of Thomson Learning, Inc, UoW Lecture handout, text book Managerial Statistics

Lecture 03 week 2 distribution of frequency

Jan 21, 2015

Page 1: Lecture 03 week 2 distribution of frequency

Lecture 3

Data Descriptive Measurement Distribution of Frequency

Source: Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc.; UoW lecture handout; textbook Managerial Statistics

Page 2: Lecture 03 week 2 distribution of frequency

Introduction: Data measurement

Page 3: Lecture 03 week 2 distribution of frequency

Definitions…

• A population is the set of all objects under observation.

• A sample is a subset of the population.

• A variable is some characteristic of a population or sample. It is typically called a “random” variable because we do not know its value until we observe it.

E.g. student grades.

Typically denoted with a capital letter: X, Y, Z…

• The values of the variable are the range of possible values for a variable.

E.g. student marks (0..100)

• Data are the observed values of a variable.

E.g. student marks: {67, 74, 71, 83, 93, 55, 48}

2.3

Page 4: Lecture 03 week 2 distribution of frequency

Example: student marks in Quantitative Methods (MAT 210)

• Population: students of INTI College Indonesia

• Sample: BSP students who take MAT 210

• Variable: student marks

• Values: (0..100)

• Data: {79, 85, 90, 87, 75, 68}

2.4

Page 5: Lecture 03 week 2 distribution of frequency

Types of Data

Categorical (qualitative) data

1. Ordinal Data

The values appear to be categorical in nature, but they have an order (a ranking):

E.g. college course rating system: poor = 1, fair = 2, good = 3, very good = 4, excellent = 5

Arithmetic operations on the codes are not meaningful; only their order is.

2. Nominal Data

The values of nominal data are categories. E.g. responses to questions about marital status, coded as:

Single = 1, Married = 2, Divorced = 3, Widowed = 4

Arithmetic operations on the codes are not meaningful.

Nominal data have no natural order to the values.

Numerical (quantitative) data

1. Interval Data

Real numbers

Arithmetic operations such as differences are meaningful for interval data

Has no natural 0, so ratios are not meaningful

Ex: Temperature

100 degrees is 50 degrees hotter than 50 degrees BUT not twice as hot.

2. Ratio Data

Real numbers

Has a natural 0, so ratios are meaningful

Ex: height, weight, price, etc.

100 pounds is 50 pounds heavier than 50 pounds AND is twice as heavy.
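The difference between interval and ratio data can be checked numerically. The sketch below is only an illustration: it assumes the "100 degrees" in the temperature example is in Celsius, and the helper names `c_to_f` and `lb_to_kg` are made up for this example. A ratio that is meaningful should not change when the unit of measurement changes.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit (the zero point moves, so this is an interval scale)."""
    return celsius * 9 / 5 + 32


def lb_to_kg(pounds):
    """Convert pounds to kilograms (zero stays zero, so this is a ratio scale)."""
    return pounds * 0.45359237


# Temperature: the ratio depends on the unit, so "twice as hot" is not meaningful.
temp_ratio_c = 100 / 50
temp_ratio_f = c_to_f(100) / c_to_f(50)  # 212 / 122

# Weight: the ratio is the same in any unit, so "twice as heavy" is meaningful.
weight_ratio_lb = 100 / 50
weight_ratio_kg = lb_to_kg(100) / lb_to_kg(50)

print("temperature ratio in C:", temp_ratio_c)
print("temperature ratio in F:", round(temp_ratio_f, 3))
print("weight ratio in lb:", weight_ratio_lb)
print("weight ratio in kg:", round(weight_ratio_kg, 3))
```

The temperature ratio changes with the unit while the weight ratio stays at 2, which is the distinction the two examples above draw.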

Page 6: Lecture 03 week 2 distribution of frequency

Making Sense of Data

2.6

A shoe seller sets up on campus & collects some data about what size shoes students wear.

What do you see in these data?

Page 7: Lecture 03 week 2 distribution of frequency

2.7

What might we do to make sense of the shoe size data?

Page 8: Lecture 03 week 2 distribution of frequency

Data Measurement

Distribution of frequency
  Tabulate, scaling (range)
  Sketch the graph

Measures of central location
  Mean
  Median
  Mode
  Quartiles (measure of relative standing)

Measures of spread or variability
  Variance
  Standard deviation
  Range
  Coefficient of variation

Measures of linear relationship
  Coefficient of correlation, covariance, least squares line
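As a rough illustration of the central location and variability measures listed above, the sketch below applies Python's standard `statistics` module to the student marks from the Definitions slide. Treating the marks as a sample (so the variance divides by n − 1) is an assumption, and quartile values depend on the interpolation convention; the module's default is used here.

```python
import statistics

# Student marks from the Definitions slide.
marks = [67, 74, 71, 83, 93, 55, 48]

# Measures of central location
mean = statistics.mean(marks)
median = statistics.median(marks)
modes = statistics.multimode(marks)  # every value is returned when no mark repeats

# Relative standing: quartiles (default "exclusive" method; other conventions differ slightly)
q1, q2, q3 = statistics.quantiles(marks, n=4)

# Measures of spread, treating the marks as a sample (divide by n - 1)
variance = statistics.variance(marks)
std_dev = statistics.stdev(marks)
data_range = max(marks) - min(marks)
coefficient_of_variation = std_dev / mean

print(f"mean = {mean:.2f}, median = {median}, range = {data_range}")
print(f"quartiles = {q1}, {q2}, {q3}")
print(f"variance = {variance:.2f}, std dev = {std_dev:.2f}")
print(f"coefficient of variation = {coefficient_of_variation:.3f}")
```

Because no mark occurs more than once in this data set, the mode is not informative here; it becomes useful once values repeat.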

Page 9: Lecture 03 week 2 distribution of frequency

Sketch the distribution of frequency

Page 10: Lecture 03 week 2 distribution of frequency

Nominal Data (Tabular Summary)

2.10

Page 11: Lecture 03 week 2 distribution of frequency

Nominal Data (Frequency)

2.11

Bar Charts are often used to display frequencies…

Page 12: Lecture 03 week 2 distribution of frequency

Nominal Data (Relative Frequency)

2.12

Pie Charts show relative frequencies…
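The tables behind the bar and pie charts are simple counts and proportions. The sketch below shows one way to build them, using the marital status coding from the Type of Data slide; the list of responses is hypothetical and is not the data set shown on the slides.

```python
from collections import Counter

# Coding taken from the nominal data example.
labels = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}

# Hypothetical coded responses, made up for illustration only.
responses = [1, 2, 2, 1, 3, 2, 4, 1, 2, 2]

counts = Counter(responses)
total = len(responses)

# Relative frequency = category count / total number of observations.
relative = {labels[code]: counts[code] / total for code in labels}

print(f"{'Category':<10}{'Freq':>6}{'Rel. freq':>12}")
for code in sorted(labels):
    name = labels[code]
    print(f"{name:<10}{counts[code]:>6}{relative[name]:>12.2f}")
```

The frequency column is what a bar chart displays; the relative frequency column is what a pie chart displays.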

Page 13: Lecture 03 week 2 distribution of frequency

Graphical Techniques for Interval Data

• There are several graphical methods that are used when the data are interval (i.e. numeric, non-categorical).

• The most important of these graphical methods is the histogram.

• The histogram is not only a powerful graphical technique used to summarize interval data, but it is also used to help explain probabilities.

2.13

Page 14: Lecture 03 week 2 distribution of frequency

Building a Histogram…

1) Collect the data: 200 long-distance telephone bills (MS p. 31)

2) Create a frequency distribution for the data…

How?

a) Determine the number of classes to use.

Number of class intervals = 1 + 3.3 log10(n)

With n = 200 this gives 1 + 3.3 × 2.30 ≈ 8.6; the example uses 8 classes.

b) Determine how large to make each class…

How?

Look at the range of the data, that is,

Range = Largest Observation – Smallest Observation

Range = $119.63 – $0 = $119.63

Then each class width becomes:

Range ÷ (# classes) = 119.63 ÷ 8 ≈ 15
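The class-count rule and the class-width arithmetic can be reproduced in a few lines. This is a minimal sketch: the function name `sturges_classes` is made up, the logarithm is assumed to be base 10, and rounding the width up to a whole dollar is one reasonable choice rather than a rule stated on the slide.

```python
import math


def sturges_classes(n):
    """Suggested number of class intervals: 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


n_bills = 200
largest, smallest = 119.63, 0.0

suggested = sturges_classes(n_bills)  # roughly 8.6
num_classes = 8                       # the value used in the example

data_range = largest - smallest
# Round the width up so that 8 classes cover the whole range.
class_width = math.ceil(data_range / num_classes)

print(f"suggested classes = {suggested:.2f}")
print(f"range = {data_range:.2f}, class width = {class_width}")
```

With a width of 15, the eight classes run from $0 to $120, which covers the largest bill of $119.63.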

2.14

Page 15: Lecture 03 week 2 distribution of frequency

Histogram for interval data

1) Collect the data

2) Create a frequency distribution for the data.

3) Draw the Histogram.

2.15

Page 16: Lecture 03 week 2 distribution of frequency

Interpret…

2.16

About half (71+37=108) of the bills are “small”, i.e. less than $30.

There are only a few telephone bills in the middle range.

(18+28+14=60)÷200 = 30%, i.e. nearly a third of the phone bills are greater than $75.

Page 17: Lecture 03 week 2 distribution of frequency

Shapes of Histograms…

• Symmetry

• A histogram is said to be symmetric if, when we draw a vertical line down the center of the histogram, the two sides are identical in shape and size:

2.17

[Figure: three examples of symmetric histograms; vertical axis: Frequency, horizontal axis: Variable]

Page 18: Lecture 03 week 2 distribution of frequency

Shapes of Histograms…

• Skewness

• A skewed histogram is one with a long tail extending to either the right (positively skewed) or the left (negatively skewed):

2.18

[Figure: two histograms, one positively skewed and one negatively skewed; vertical axis: Frequency, horizontal axis: Variable]

Page 19: Lecture 03 week 2 distribution of frequency

Shapes of Histograms…

• Bell Shape

• A special type of symmetric unimodal histogram is one that is bell shaped:

2.19

[Figure: bell-shaped histogram; vertical axis: Frequency, horizontal axis: Variable]

Many statistical techniques require that the population be bell shaped. Drawing the histogram helps verify the shape of the population in question.

Page 20: Lecture 03 week 2 distribution of frequency

Relative Frequencies…

• For example, we had 71 observations in our first class (telephone bills from $0.00 to $15.00). Thus, the relative frequency for this class is 71 ÷ 200 (the total # of phone bills) = 0.355 (or 35.5%).

2.20

Page 21: Lecture 03 week 2 distribution of frequency

Cumulative Relative Frequencies…

2.21

first class: .355

next class: .355 + .185 = .540

:

last class: .930 + .070 = 1.00
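Relative and cumulative relative frequencies follow mechanically from the class counts. In the sketch below, the counts 71, 37, 18, 28 and 14 come from the slides; the three middle counts (13, 9, 10) do not appear in this transcript and are assumed here so that the total is 200 and the cumulative value before the last class is .930.

```python
from itertools import accumulate

class_width = 15
# Counts for the classes $0-15, $15-30, ..., $105-120.
# The middle three values (13, 9, 10) are an assumption; see the note above.
frequencies = [71, 37, 13, 9, 10, 18, 28, 14]

total = sum(frequencies)
relative = [f / total for f in frequencies]
cumulative = list(accumulate(relative))  # running sum of relative frequencies

print(f"{'Class':<12}{'Freq':>6}{'Rel.':>8}{'Cum.':>8}")
for i, (freq, rel, cum) in enumerate(zip(frequencies, relative, cumulative)):
    lower = i * class_width
    upper = lower + class_width
    print(f"{lower:>3} to {upper:<5}{freq:>6}{rel:>8.3f}{cum:>8.3f}")
```

The second row of the cumulative column reproduces .355 + .185 = .540, and the last row reaches 1.00.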

Page 22: Lecture 03 week 2 distribution of frequency

Cross Tabulation for comparing two nominal variables

• In Example 2.10, a sample of newspaper readers was asked to report which newspaper they read: Globe and Mail (1), Post (2), Star (3), or Sun (4), and to indicate whether they were a blue-collar worker (1), white-collar worker (2), or professional (3).

2.22

This reader’s response is captured as part of the total number on the contingency table…

Page 23: Lecture 03 week 2 distribution of frequency

Contingency Table…

• Interpretation: The relative frequencies in columns 2 and 3 are similar, but there are large differences between columns 1 and 2 and between columns 1 and 3.

• This tells us that blue-collar workers tend to read different newspapers from both white-collar workers and professionals, and that white-collar workers and professionals are quite similar in their newspaper choice.

2.23

[Table annotations: column 1 versus columns 2 and 3: dissimilar; column 2 versus column 3: similar]
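A contingency table is a count of each (newspaper, occupation) pair, and the comparison described above uses relative frequencies within each occupation column. The sketch below uses the coding from Example 2.10, but the list of responses is hypothetical and much smaller than the real sample.

```python
from collections import Counter

newspapers = {1: "Globe and Mail", 2: "Post", 3: "Star", 4: "Sun"}
occupations = {1: "Blue collar", 2: "White collar", 3: "Professional"}

# Hypothetical (newspaper, occupation) pairs, made up for illustration only.
responses = [
    (1, 3), (2, 2), (3, 1), (4, 1), (1, 3), (2, 3),
    (3, 2), (4, 1), (1, 2), (3, 1), (2, 2), (4, 1),
]

# Cell counts of the contingency table and the total for each occupation column.
table = Counter(responses)
column_totals = Counter(occupation for _, occupation in responses)

# Column relative frequency = cell count / column total.
column_relative = {
    (paper, occ): table[(paper, occ)] / column_totals[occ]
    for paper in newspapers
    for occ in occupations
}

for paper, paper_name in newspapers.items():
    cells = "  ".join(f"{column_relative[(paper, occ)]:.2f}" for occ in occupations)
    print(f"{paper_name:<15}{cells}")
```

Each column of the printed table sums to 1, so columns can be compared directly even when the occupations have different numbers of respondents.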

Page 24: Lecture 03 week 2 distribution of frequency

Graphing the Relationship Between Two Nominal Variables…

Use the data from the contingency table to create bar charts…

2.24

Professionals tend to read the Globe & Mail more than twice as often as the Star or Sun…

Page 25: Lecture 03 week 2 distribution of frequency

Scatter Diagram (describing two interval variables)…

• Example 2.12: A real estate agent wanted to know to what extent the selling price of a home is related to its size… (raw data)

1) Collect the data

2) Determine the independent variable (X – house size) and the dependent variable (Y – selling price)

3) Use Excel to create a “scatter diagram”…

2.25

Page 26: Lecture 03 week 2 distribution of frequency

Scatter Diagram…

• It appears that in fact there is a relationship, that is, the greater the house size the greater the selling price…

2.26

Page 27: Lecture 03 week 2 distribution of frequency

Patterns of Scatter Diagrams…

• Linearity and direction are two concepts we are interested in

2.27

[Figure: three scatter diagrams: Positive Linear Relationship, Negative Linear Relationship, Weak or Non-Linear Relationship]
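Linearity and direction can also be summarized numerically with the covariance, the coefficient of correlation, and the least squares line. The sketch below implements the standard sample formulas by hand; the house sizes and prices are hypothetical values, not the data of Example 2.12, and the function names are made up for this illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Coefficient of correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return intercept, slope


# Hypothetical data: house size (square feet) and selling price ($1000s).
sizes = [1200, 1500, 1800, 2100, 2400]
prices = [150, 180, 200, 240, 260]

r = correlation(sizes, prices)
intercept, slope = least_squares_line(sizes, prices)

print(f"r = {r:.3f}")
print(f"price = {intercept:.1f} + {slope:.4f} * size")
```

A value of r near +1 corresponds to a positive linear relationship, near −1 to a negative linear relationship, and near 0 to a weak or non-linear one.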