Understanding Basic Statistics Chapter One Organizing Data.

Post on 04-Jan-2016


Understanding Basic Statistics

Chapter One

Organizing Data

Statistics is

The study of how to:

• collect

• organize

• analyze

• interpret

numerical information from data

Types of Data

• Quantitative data are numerical measurements. Example: number of siblings

• Qualitative data involve non-numerical observations. Example: brand of computer

Population

all measurements or observations of interest

Example: incomes of all residents of a county

Sample

part of a population used to represent the population

Example: incomes of selected residents

Methods of Producing Data

• Sampling: drawing subsets from the population

• Experimentation: imposing a change and measuring the result

• Simulation: numerical facsimile of real-world phenomena

• Census: using measurements from entire population

Potential Problems

• Strong opinions may be overrepresented if responses are voluntary.

• A hidden bias may exist in the way data is collected.

• There may be hidden effects of other variables.

• There is no guarantee that results can be generalized.

Levels of Measurement

• Nominal

• Ordinal

• Interval

• Ratio

Nominal Measurement

Data is put into categories only.

Example: eye color

Ordinal Measurement

Data can be ordered. Differences cannot be calculated or interpreted.

Example: class rank

Interval Measurement

Data can be ordered. Differences between data values can be compared.

Example: temperature in degrees Celsius or Fahrenheit

Ratio Measurement

Data can be ordered. Differences and ratios between data values can be compared.

Example: time

Simple Random Sample of n measurements:

• every sample of size n has equal chance of being selected

• every item in the population has equal chance of being included

Not random sampling:

• asking for volunteers to respond to a survey

• choosing the first five customers in a store

Random sampling:

• drawing names “from a hat”

• using a random number table to select sample

• using a random number generator
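As a rough illustration of drawing a simple random sample with a random number generator, the sketch below uses Python's standard `random` module. The population of 200 account numbers, the sample size, and the seed are assumptions made up for the example.

```python
import random

# Hypothetical population: account numbers 1 through 200.
population = list(range(1, 201))

# A seeded generator makes the sketch repeatable; the seed is arbitrary.
rng = random.Random(2016)

# sample() draws without replacement, so every subset of size n
# has the same chance of being selected.
n = 10
simple_random_sample = rng.sample(population, n)

print(sorted(simple_random_sample))
```

Asking for volunteers or taking the first five customers would not fit this pattern, because some subsets of the population could never be chosen.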

Sampling techniques

• Simple Random Sampling

• Stratified Sampling

• Systematic Sampling

• Cluster Sampling

• Convenience Sampling

Stratified Sampling

• Population is divided into groups

• Random samples are drawn from each group
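The two steps above can be sketched in Python as follows. The homeroom names, student IDs, and the number drawn per group are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical population already divided into groups (strata).
strata = {
    "homeroom_A": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "homeroom_B": ["B1", "B2", "B3", "B4", "B5"],
    "homeroom_C": ["C1", "C2", "C3", "C4", "C5", "C6", "C7"],
}

rng = random.Random(1)  # arbitrary seed for repeatability
per_stratum = 2         # assumed number drawn from each group

# Draw a separate random sample from every group.
stratified_sample = {
    name: rng.sample(members, per_stratum)
    for name, members in strata.items()
}

print(stratified_sample)
```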

Systematic Sampling

• Population is arranged in sequential order.

• Select a random starting point.

• Select every “kth” item.
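A minimal sketch of these three steps, assuming a hypothetical line of 40 people and k = 8:

```python
import random

# Hypothetical population arranged in sequential order
# (positions 1 through 40 in a registration line).
population = list(range(1, 41))

k = 8                   # take every kth item
rng = random.Random(7)  # arbitrary seed for repeatability

# Random starting point among the first k positions (0-based index).
start = rng.randrange(k)

# From the starting point, step through the list k items at a time.
systematic_sample = population[start::k]

print(systematic_sample)
```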

Cluster Sampling

• Population is divided into sections

• Some sections are randomly selected

• Every item in selected sections is included in sample
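The sketch below walks through the same three steps in Python. The majors, student names, and the number of sections selected are hypothetical.

```python
import random

# Hypothetical population divided into sections (clusters).
clusters = {
    "Biology": ["Ana", "Ben", "Cho"],
    "History": ["Dev", "Eli"],
    "Math": ["Fay", "Gus", "Hal", "Ida"],
    "Music": ["Jon", "Kim"],
    "Physics": ["Lee", "Mia", "Ned"],
}

rng = random.Random(5)  # arbitrary seed for repeatability
clusters_to_pick = 2    # assumed number of sections to select

# Randomly select whole sections, not individual students.
chosen = rng.sample(sorted(clusters), clusters_to_pick)

# Every item in each selected section goes into the sample.
cluster_sample = [s for major in chosen for s in clusters[major]]

print(chosen, cluster_sample)
```

The contrast with stratified sampling is that here only some groups are used, but each selected group is used in full.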

Convenience Sampling

• Use whatever data is readily available.

• Risk severe bias.

Which sampling technique is described?

College students are waiting in line for registration. Every eighth person in line is surveyed.

Systematic sampling

Which sampling technique is described?

College students are waiting in line for registration. Students are asked to volunteer to respond to a survey.

Convenience sampling

Which sampling technique is described?

In a large high school, students from every homeroom are randomly selected to participate in a survey.

Stratified sampling

Which sampling technique is described?

An accountant uses a random number generator to select ten accounts for audit.

Simple random sampling

Which sampling technique is described?

To determine students’ opinions of a new registration method, a college randomly selects five majors. All students in the selected majors are surveyed.

Cluster sampling

Bar Graph

• bars of uniform width

• uniformly spaced

• may be vertical or horizontal

• lengths represent quantities being compared

[Bar graph: Reasons for Returns. Categories: Color, Size, Didn't like, Quality. Vertical scale: 0 to 60.]

Pareto Chart

• tool of quality control

• start with a bar chart

• arrange bars in decreasing order of frequency

• frequently used to investigate causes of problems

[Pareto chart: Reasons for returns. Bar heights in decreasing order: 50, 20, 10, 5. Vertical scale: 0 to 60.]
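To make the Pareto ordering concrete, here is a small Python sketch that sorts categories by decreasing frequency and prints a text version of the bars. The category names and the counts 50, 20, 10, 5 appear on the slides, but which count belongs to which category is an assumption made only for this illustration.

```python
# Category names and counts are taken from the slides; the pairing
# of counts with categories is assumed for the sake of the example.
returns = {"Color": 10, "Size": 50, "Didn't like": 20, "Quality": 5}

# A Pareto chart orders the bars by decreasing frequency.
pareto_order = sorted(returns.items(), key=lambda item: item[1], reverse=True)

# Print one text bar per category, one '#' for every 5 returns.
for reason, count in pareto_order:
    print(f"{reason:<12} {'#' * (count // 5)} {count}")
```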

Circle Graph (Pie Chart)

• shows division of whole into component parts

• label parts with appropriate percentages of the whole

[Circle graph: Conventions held. Legend: Florida, California, Virginia, Texas. Slice labels: 49%, 27%, 19%, 5%.]

Time Plot

• Shows data values in chronological order

• time on horizontal scale

• other variable on vertical scale

• connect data points with line segments

[Time plot: Sales (in thousands of dollars). Vertical scale: 0 to 300.]

Histogram

Differences from a bar chart:

• bars touch

• width of bars represents quantity

To construct a histogram from raw data:

• Decide on the number of classes (5 to 15 is customary) and find a convenient class width.

• Organize the data into a frequency table.

• Find the class boundaries and the class midpoints.

• Tally the data and determine the frequency of each class.

• Sketch the histogram.

Computing the class width

1. Compute:

(largest data value - smallest data value) / desired number of classes

2. Increase the value computed to the next highest whole number

Class Width

Raw Data:

10.2 18.7 22.3 20.0

6.3 17.8 17.1 5.0

2.4 7.9 0.3 2.5

8.5 12.5 21.4 16.5

0.4 5.2 4.1 14.3

19.5 22.5 0.0 24.7

11.4

Use 5 classes.

(24.7 - 0.0) / 5 = 4.94

Round class width up to 5.
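The same calculation can be checked with a short Python sketch. It uses the 25 raw data values listed above. Reading "next highest whole number" as `floor + 1` (so that a quotient that is already whole is still increased) is an interpretation, not something the slides state explicitly.

```python
import math

# The 25 raw data values from the slide.
raw_data = [
    10.2, 18.7, 22.3, 20.0,
    6.3, 17.8, 17.1, 5.0,
    2.4, 7.9, 0.3, 2.5,
    8.5, 12.5, 21.4, 16.5,
    0.4, 5.2, 4.1, 14.3,
    19.5, 22.5, 0.0, 24.7,
    11.4,
]

number_of_classes = 5

# Step 1: (largest data value - smallest data value) / number of classes.
quotient = (max(raw_data) - min(raw_data)) / number_of_classes

# Step 2: increase to the next highest whole number.
class_width = math.floor(quotient) + 1

print(round(quotient, 2), class_width)
```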

Computing Class Width

difference between the lower class limit of one class and the lower class limit of the next class

Finding Class Widths

# of miles     Class Width
0.0 - 4.9      5
5.0 - 9.9      5
10.0 - 14.9    5
15.0 - 19.9    5
20.0 - 24.9    5

Class Boundaries

(Upper limit of one class + lower limit of next class) divided by two

Finding Class Boundaries

# of miles     f    class boundaries
0.0 - 4.9      6    -0.05 - 4.95
5.0 - 9.9      5    4.95 - 9.95
10.0 - 14.9    4    9.95 - 14.95
15.0 - 19.9    5    14.95 - 19.95
20.0 - 24.9    5    19.95 - 24.95

The lower boundary of the first class is found by extending the same pattern: 0.0 - 0.05 = -0.05.
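A hedged Python sketch of the boundary rule, applied to the class limits of the miles example. Extending the first and last classes outward by half the gap between adjacent classes is an assumption that follows the pattern of the interior boundaries.

```python
# Class limits (lower, upper) from the miles example.
class_limits = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]

# Interior boundaries: (upper limit of one class + lower limit of next) / 2.
interior = [
    (class_limits[i][1] + class_limits[i + 1][0]) / 2
    for i in range(len(class_limits) - 1)
]

# Assumed rule for the two outer boundaries: extend by half the gap
# between the upper limit of one class and the lower limit of the next.
half_gap = (class_limits[1][0] - class_limits[0][1]) / 2
edges = [class_limits[0][0] - half_gap] + interior + [class_limits[-1][1] + half_gap]

# Round away floating-point noise so the values print cleanly.
edges = [round(e, 2) for e in edges]

# Pair consecutive edges into (lower boundary, upper boundary).
class_boundaries = list(zip(edges[:-1], edges[1:]))

for (lo, hi), (b_lo, b_hi) in zip(class_limits, class_boundaries):
    print(f"{lo} - {hi}: {b_lo} - {b_hi}")
```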

Computing Class Midpoints

class midpoint = (lower class limit + upper class limit) / 2

Finding Class Midpoints

# of miles     f    class midpoints
0.0 - 4.9      6    2.45
5.0 - 9.9      5    7.45
10.0 - 14.9    4    12.45
15.0 - 19.9    5    17.45
20.0 - 24.9    5    22.45
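The midpoint formula can be applied to every class at once, as in this brief Python sketch. The variable names are illustrative only.

```python
# Class limits (lower, upper) from the miles example.
class_limits = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]

# Midpoint = (lower class limit + upper class limit) / 2,
# rounded to two decimals to hide floating-point noise.
class_midpoints = [round((lo + hi) / 2, 2) for lo, hi in class_limits]

print(class_midpoints)
```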

Tallying the Data

# of miles     tally      frequency
0.0 - 4.9      ||||| |    6
5.0 - 9.9      |||||      5
10.0 - 14.9    ||||       4
15.0 - 19.9    |||||      5
20.0 - 24.9    |||||      5
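Tallying amounts to counting how many raw data values fall between the limits of each class. The sketch below, a minimal illustration rather than a prescribed method, counts the 25 raw values from the class width example.

```python
# The 25 raw data values from the class width example.
raw_data = [
    10.2, 18.7, 22.3, 20.0, 6.3, 17.8, 17.1, 5.0,
    2.4, 7.9, 0.3, 2.5, 8.5, 12.5, 21.4, 16.5,
    0.4, 5.2, 4.1, 14.3, 19.5, 22.5, 0.0, 24.7, 11.4,
]

# Class limits (lower, upper) with class width 5.
class_limits = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]

# Count the values lying within each pair of class limits.
frequencies = [
    sum(1 for x in raw_data if lo <= x <= hi)
    for lo, hi in class_limits
]

for (lo, hi), f in zip(class_limits, frequencies):
    print(f"{lo} - {hi}: {'|' * f} {f}")
```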

Constructing the Histogram

[Histogram: frequency f on the vertical scale (0 to 6); miles on the horizontal scale, marked at the class boundaries -0.05, 4.95, 9.95, 14.95, 19.95, 24.95.]

Grouped Frequency Table

# of miles     f
0.0 - 4.9      6
5.0 - 9.9      5
10.0 - 14.9    4
15.0 - 19.9    5
20.0 - 24.9    5

Class limits:

lower - upper

Relative Frequency

Relative frequency = f / n = class frequency / total of all frequencies

For the class 0.0 - 4.9: f / n = 6 / 25 = 0.24

For the class 5.0 - 9.9: f / n = 5 / 25 = 0.20

# of miles f relative frequency

0.0 - 4.9 6 0.24

5.0 - 9.9 5 0.20

10.0 - 14.9 4 0.16

15.0 - 19.9 5 0.20

20.0 - 24.9 5 0.20
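The relative frequency column can be reproduced by dividing each class frequency by the total, as in this short Python sketch. The variable names are illustrative.

```python
# Class frequencies from the grouped frequency table.
frequencies = [6, 5, 4, 5, 5]

# n is the total of all frequencies.
n = sum(frequencies)

# Relative frequency of each class = f / n.
relative_frequencies = [f / n for f in frequencies]

print([round(r, 2) for r in relative_frequencies])
```

The relative frequencies sum to 1, which is a quick check that no class was missed.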

Relative Frequency Histogram

[Histogram: relative frequency f/n on the vertical scale (0 to .24 in steps of .04); miles on the horizontal scale, marked at the class boundaries -0.05, 4.95, 9.95, 14.95, 19.95, 24.95.]

Common Shapes of Histograms

• Symmetric: when folded vertically, both sides are (more or less) the same. [Figures: two symmetric histograms.]

• Uniform [Figure: uniform histogram.]

• Skewed: non-symmetric histograms are skewed, either skewed left or skewed right. [Figures: one histogram skewed left, one skewed right.]

• Bimodal: the two largest rectangles are separated by at least one class. [Figure: bimodal histogram.]

Stem and Leaf Display

Raw Data:

35, 45, 42, 45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31

Each data value is split into a stem (the tens digit) and a leaf (the units digit). Stems are listed in order, and each leaf is written next to its stem in the order the data values appear.

First data value = 35: stem 3, leaf 5

Second data value = 45: stem 4, leaf 5

Third data value = 42: stem 4, leaf 2

The remaining data values are entered the same way.

Finished Stem and Leaf Display

stem | leaves
2 | 5
3 | 5 2 2 4 1
4 | 5 2 5 1 7 3
5 | 6 3 3
6 | 7 5
7 | 6
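As a sketch of how the display is built, the Python below splits each raw data value into a tens-digit stem and a units-digit leaf, then appends the leaf to its stem in the order the data appear. Using `divmod` and a `defaultdict` is just one convenient way to do this.

```python
from collections import defaultdict

# Raw data from the stem and leaf example.
data = [35, 45, 42, 45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31]

# Map each stem (tens digit) to its list of leaves (units digits).
display = defaultdict(list)
for value in data:
    stem, leaf = divmod(value, 10)
    display[stem].append(leaf)

# Print the stems in increasing order, leaves in order of appearance.
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```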
