Probability & Statistics Chapter 02 Summarizing and Graphing Data.

Post on 04-Jan-2016

Probability & Statistics

Chapter 02 Summarizing and Graphing Data

Describing Data - 1. Frequency distribution

• Frequency distribution: A grouping of data into categories showing the number of observations in each mutually exclusive category.

The steps for organizing data into a frequency distribution

1. Set up grouping called classes. Decide on the size of the class interval.

2. Tally the raw data or ungrouped data into classes.

3. Count the number of tallies in each class. The number of observations in each class is called the class frequency.

• Class: A class is one of the categories into which data can be classified.

• Class interval: The class interval for a frequency distribution is obtained by subtracting the lower limit of a class from the lower limit of the next class, or by subtracting the upper limit of a class from the upper limit of the next class.

• Class limits: The smallest and largest data values that can be included in a class.

• Class boundaries: The values used to separate the classes so that there are no gaps in the frequency distribution.

• Class mark (midpoint): A point that divides a class into two equal parts. This is the average of the upper and lower class limits.
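As a small illustration of these definitions, the sketch below computes the class interval, a class boundary, and a class mark for two adjacent classes, 10.3 – 15.2 and 15.3 – 20.2. The function names are hypothetical, and placing the boundary halfway between the upper limit of one class and the lower limit of the next is the usual convention, assumed here rather than stated in the slides.

```python
def class_interval(lower_limit, next_lower_limit):
    """Class interval: difference between consecutive lower limits."""
    return next_lower_limit - lower_limit


def class_boundary(upper_limit, next_lower_limit):
    """Boundary between adjacent classes (assumed: halfway between them)."""
    return (upper_limit + next_lower_limit) / 2


def class_mark(lower_limit, upper_limit):
    """Class mark (midpoint): average of the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2


first_class = (10.3, 15.2)
second_class = (15.3, 20.2)

# Round to two decimals to hide floating-point noise.
interval = round(class_interval(first_class[0], second_class[0]), 2)
boundary = round(class_boundary(first_class[1], second_class[0]), 2)
midpoint = round(class_mark(*first_class), 2)

print(interval, boundary, midpoint)
```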

EXAMPLE 1

• Prof. Munasinghe is the head of the Department of Industrial Management (IM) and wishes to determine the amount of studying IM students do. He selects a random sample of 30 students and determines the number of hours each student studies per week: 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6.

• Organize the data into a frequency distribution.

Guidelines for Constructing a Frequency Distribution

• To determine the number of classes, the “2 to the k rule” can be used: select the smallest integer k such that

2^k > n

where n is the number of observations.

• There should be between 5 and 20 classes.
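The rule can be sketched in a few lines of code. This is a minimal illustration, assuming the rule is read as the strict inequality 2^k > n; the function name is hypothetical.

```python
def number_of_classes(n):
    """Return the smallest integer k such that 2**k > n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# For the 30 observations of Example 1: 2**4 = 16 is too small, 2**5 = 32 is enough.
k = number_of_classes(30)
print(k)
```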

• Determine the class interval based on the suggested number of classes by using the formula:

i = (highest value - lowest value) / number of classes

• The class intervals used in the frequency distribution should be equal.

• Use the computed suggested class interval to construct the frequency distribution. Note: this is a suggested class interval; if the computed class interval is 97, it may be better to use 100.
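The formula and the rounding advice can be sketched as follows. Both function names are hypothetical, and rounding up to a multiple of a chosen step is only one possible way to pick a convenient interval; the slides do not prescribe a rounding method.

```python
import math


def suggested_class_interval(highest, lowest, num_classes):
    """i = (highest value - lowest value) / number of classes."""
    return (highest - lowest) / num_classes


def round_up_to_multiple(value, step):
    """Round value up to the next multiple of step (assumed rounding choice)."""
    return math.ceil(value / step) * step


# Study-hours data of Example 1: highest 33.8, lowest 10.3, 5 classes.
suggested = round(suggested_class_interval(33.8, 10.3, 5), 2)
convenient = round_up_to_multiple(suggested, 1)

print(suggested, convenient)
print(round_up_to_multiple(97, 10))  # the 97 -> 100 case from the note
```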

• Avoid open-ended classes.

• Do not have overlapping classes.

• Count the number of values in each class.

EXAMPLE 1

• 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6

• Number of possible classes: we need the smallest k for which 2^k > n, that is, 2^k > 30. Since 2^4 = 16 and 2^5 = 32, k = 5, so the number of possible classes = 5.

• Class interval:
i = (highest value – lowest value) / number of classes
i = (33.8 – 10.3) / 5 = 4.7
It is better to use i = 5.

EXAMPLE 1

Classes        Tally          Frequency
10.3 – 15.2    |||| |||       8
15.3 – 20.2    |||| |||| |    11
20.3 – 25.2    |||| ||        7
25.3 – 30.2    |||            3
30.3 – 35.2    |              1
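The tallying step for Example 1 can be sketched as below. The data and class limits are taken from the example; the function name `tally` is hypothetical, and the sketch assumes every value falls inside exactly one class.

```python
hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    10.3, 26.1, 15.7, 14.0, 17.8, 33.8, 23.2, 12.9, 27.1, 16.6,
]

# (lower limit, upper limit) of each class, as in the table.
classes = [(10.3, 15.2), (15.3, 20.2), (20.3, 25.2), (25.3, 30.2), (30.3, 35.2)]


def tally(values, class_limits):
    """Count how many values fall within each class (limits inclusive)."""
    counts = [0] * len(class_limits)
    for value in values:
        for index, (lower, upper) in enumerate(class_limits):
            if lower <= value <= upper:
                counts[index] += 1
                break
    return counts


frequencies = tally(hours, classes)
for (lower, upper), frequency in zip(classes, frequencies):
    print(f"{lower:4.1f} - {upper:4.1f}  {frequency}")
```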

Relative Frequency Distribution

• Shows the percentage of observations in each class.

• The relative frequency of a class is obtained by dividing the class frequency by the total frequency.

• This allows us to make statements regarding the number of observations in a single class relative to the entire sample.

Relative Frequency Distribution for Exercise 1

Classes        Frequency    Relative Frequency
10.3 – 15.2    8            8/30 = 26.7%
15.3 – 20.2    11           11/30 = 36.7%
20.3 – 25.2    7            7/30 = 23.3%
25.3 – 30.2    3            3/30 = 10.0%
30.3 – 35.2    1            1/30 = 3.3%
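The relative frequencies in the table can be checked with a short sketch. The class frequencies are copied from Example 1; the variable names are illustrative only.

```python
frequencies = [8, 11, 7, 3, 1]
total = sum(frequencies)

# Relative frequency = class frequency / total frequency, shown as a percentage.
relative = [round(100 * frequency / total, 1) for frequency in frequencies]

print(relative)
```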

Cumulative Relative Frequency Distribution

• The cumulative relative frequency distribution expresses the cumulative frequency of each class relative to the entire sample.

Cumulative Relative Frequency Distribution for Exercise 1

Classes        Frequency    Cumulative Frequency    Cumulative Relative Frequency
10.3 – 15.2    8            8                       8/30 = 26.7%
15.3 – 20.2    11           19                      19/30 = 63.3%
20.3 – 25.2    7            26                      26/30 = 86.7%
25.3 – 30.2    3            29                      29/30 = 96.7%
30.3 – 35.2    1            30                      30/30 = 100%
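A running total gives the cumulative columns directly. The sketch below uses `itertools.accumulate` from the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [8, 11, 7, 3, 1]
total = sum(frequencies)

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency, as a percentage of the whole sample.
cumulative_relative = [round(100 * c / total, 1) for c in cumulative]

print(cumulative)
print(cumulative_relative)
```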

Contingency Table

• A contingency table indicates the number of observations for both variables that fall jointly in each category.

• Example: Distribution of product complaints (each cell is a percentage of all complaints)

                           Reason for complaint
Complaint origin           Electrical    Mechanical    Appearance
During guarantee period    18%           13%           32%
After guarantee period     12%           22%           3%
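One way to read the complaint table is to sum its joint percentages along each variable. The row and column totals below are not part of the slides; they are computed here only to illustrate how the two variables relate, and the variable names are hypothetical.

```python
# Joint percentages from the complaint table.
table = {
    "During guarantee period": {"Electrical": 18, "Mechanical": 13, "Appearance": 32},
    "After guarantee period": {"Electrical": 12, "Mechanical": 22, "Appearance": 3},
}

# Total per complaint origin (row totals).
row_totals = {origin: sum(cells.values()) for origin, cells in table.items()}

# Total per reason for complaint (column totals).
column_totals = {}
for cells in table.values():
    for reason, percentage in cells.items():
        column_totals[reason] = column_totals.get(reason, 0) + percentage

grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```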

Graphic Presentation

• There are three commonly used graphic forms of a frequency distribution:
– histograms,
– frequency polygons,
– cumulative frequency distribution (ogive).

Histogram

• A graph in which
– the classes are marked on the horizontal axis,
– the class frequencies are marked on the vertical axis,
– the class frequencies are represented by the heights of the bars,
– the bars are drawn adjacent to each other.

Histogram for Exercise 1

Class Midpoint    Frequency
12.75             8
17.75             11
22.75             7
27.75             3
32.75             1
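A rough text rendering of this histogram can be produced from the midpoint and frequency columns. This is only a sketch: the bars are drawn horizontally, so the picture is rotated relative to the definition above (classes on the horizontal axis), and the function name is hypothetical.

```python
midpoints = [12.75, 17.75, 22.75, 27.75, 32.75]
frequencies = [8, 11, 7, 3, 1]


def text_histogram(class_midpoints, class_frequencies, symbol="#"):
    """Return one line per class: midpoint, a bar of symbols, and the count."""
    lines = []
    for midpoint, frequency in zip(class_midpoints, class_frequencies):
        lines.append(f"{midpoint:6.2f} | {symbol * frequency} ({frequency})")
    return lines


histogram_lines = text_histogram(midpoints, frequencies)
for line in histogram_lines:
    print(line)
```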

[Figure: Histogram for study hours of IM students. Horizontal axis: Number of hours (class midpoints 12.75 to 32.75); vertical axis: Frequency (0 to 12).]

Frequency Polygon

• A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.

Frequency Polygon for Exercise 1

Class Midpoint    Frequency
7.75              0
12.75             8
17.75             11
22.75             7
27.75             3
32.75             1
37.75             0
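The points of the frequency polygon can be derived from the class limits. The sketch below computes each class midpoint and, as in the table, adds a zero-frequency midpoint one class interval before the first class and one after the last so the polygon closes on the horizontal axis. Variable names are illustrative.

```python
classes = [(10.3, 15.2), (15.3, 20.2), (20.3, 25.2), (25.3, 30.2), (30.3, 35.2)]
frequencies = [8, 11, 7, 3, 1]
interval = 5.0  # class interval used in Example 1

# Class midpoint: average of the lower and upper class limits.
midpoints = [round((lower + upper) / 2, 2) for lower, upper in classes]

# Close the polygon with an empty class on each side.
points = (
    [(round(midpoints[0] - interval, 2), 0)]
    + list(zip(midpoints, frequencies))
    + [(round(midpoints[-1] + interval, 2), 0)]
)

for midpoint, frequency in points:
    print(midpoint, frequency)
```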

[Figure: Frequency polygon for study hours of IM students. Horizontal axis: Number of hours (class midpoints 7.75 to 37.75); vertical axis: Frequency (0 to 12).]

Cumulative Frequency distribution

• A cumulative frequency distribution (ogive) is used to determine how many or what proportion of the data values are below or above a certain value.

Ogive for Exercise 1

Class Midpoint    Frequency    Less than cumulative frequency    More than cumulative frequency
7.75              0            0                                 30
12.75             8            8                                 30
17.75             11           19                                22
22.75             7            26                                11
27.75             3            29                                4
32.75             1            30                                1
37.75             0            30                                0
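Both cumulative columns of the ogive table follow from the frequency column. The sketch below assumes the convention the table appears to use: each class is counted in its own "less than" total and also in its own "more than" total.

```python
from itertools import accumulate

# Frequencies including the two empty closing classes (midpoints 7.75 and 37.75).
frequencies = [0, 8, 11, 7, 3, 1, 0]
total = sum(frequencies)

# "Less than" cumulative frequency: running total up to and including each class.
less_than = list(accumulate(frequencies))

# "More than" cumulative frequency: observations in this class or any later class.
more_than = [total - cumulative + frequency
             for cumulative, frequency in zip(less_than, frequencies)]

print(less_than)
print(more_than)
```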

[Figure: Ogive for study hours of IM students. Horizontal axis: Number of hours (class midpoints 7.75 to 37.75); vertical axis: Frequency (0 to 35). Two curves: less than cumulative frequency and more than cumulative frequency.]

Bar Chart

• A bar chart can be used to depict any of the levels of measurement (nominal, ordinal, interval, or ratio).

EXAMPLE: Construct a bar chart for the number of unemployed people per 100,000 population for selected cities in 1995.

City       Number of unemployed (in ‘000)
Gampaha    730
Matale     540
Chilaw     670
Colombo    890
Galle      820
Kandy      890

[Figure: Bar chart of the unemployment data, showing the number of unemployed people per 100,000 population for selected cities in 1995. Horizontal axis: City (Gampaha, Matale, Chilaw, Colombo, Galle, Kandy); vertical axis: Number of persons (0 to 1000).]

Pie Chart

• A pie chart is especially useful in displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency and portions of the circle are allocated for the different groups.

• Example: Draw a pie chart based on the following information.

Type of shoe    # of runners
Nike            92
Adidas          49
Reebok          37
Asics           13
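Dividing the circle proportionally means each group receives a share of the 360 degrees equal to its relative frequency. The sketch below uses only the four brands listed in the table. The accompanying chart also shows an "Other" slice whose count is not given, so the percentages computed here are based on the 191 listed runners and differ from the chart labels.

```python
# Counts for the four listed brands only (the "Other" count is not given).
runners = {"Nike": 92, "Adidas": 49, "Reebok": 37, "Asics": 13}
total = sum(runners.values())

# Relative frequency as a percentage, and the matching slice angle in degrees.
shares = {brand: 100 * count / total for brand, count in runners.items()}
angles = {brand: 360 * count / total for brand, count in runners.items()}

for brand in runners:
    print(f"{brand:7s} {shares[brand]:5.1f}%  {angles[brand]:6.1f} degrees")
```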

[Figure: Pie chart for running shoes. Slices: Nike 45%, Adidas 24%, Reebok 19%, Asics 7%, Other 5%.]

Line Chart

• A graph of data that is mapped by a series of lines. Line charts show changes in data or categories of data over time and can be used to document trends.

EXAMPLE: Construct a line chart for the unemployment data used in the bar chart example (number of unemployed people per 100,000 population for selected cities in 1995).

[Figure: Line chart of the unemployment data. Horizontal axis: City (Gampaha, Matale, Chilaw, Colombo, Galle, Kandy); vertical axis: Number of persons (0 to 1000).]

Stem-and-Leaf Displays

• Stem-and-Leaf Display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits become the leaf.

Note: An advantage of the stem-and-leaf display over a frequency distribution is we do not lose the identity of each observation.

EXAMPLE

• Saman achieved the following scores on his twelve accounting quizzes this semester: 86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85. Construct a stem-and-leaf chart for the data.

stem | leaf
6    | 9
7    | 8 9
8    | 2 3 4 5 6 8
9    | 1 2 6
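The display for Saman's scores can be reproduced with a short sketch. It assumes two-digit scores, so the tens digit is the stem and the units digit is the leaf; the function name is hypothetical.

```python
from collections import defaultdict

scores = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]


def stem_and_leaf(values):
    """Group sorted values by leading digit (stem) with trailing digits (leaves)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return dict(stems)


display = stem_and_leaf(scores)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, the original scores can be read back from the display, which is the advantage noted above.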
