BUSINESS STATISTICS Bijay Lal Pradhan, Ph.D. MBA, Pokhara University
Data Collection and Analysis · PDF file5/2/2017 · Sampling Distribution and Estimation Hypothesis Testing ... Divide the data set into five classes of equal width and construct

Mar 13, 2018


BUSINESS STATISTICS

Bijay Lal Pradhan, Ph.D.

MBA, Pokhara University

WHY BUSINESS STATISTICS

Most successful managers and decision makers understand the information and know how to use it effectively.

COURSE CONTENT

Introduction and Data Collection

Summarization of Data

Grouping and Displaying Data

Basic Probability: Concepts and Applications.

Probability Distributions

Sampling Distribution and Estimation

Hypothesis Testing

Chi-Square Test and Analysis of Variance

Correlation and Regression Analysis

DATA

Data Type

Qualitative data

Quantitative data

Discrete data

Continuous data

Presenting data

Individual form

Discrete frequency form

Continuous frequency form

Upper limit included form

Upper limit excluded form

NUMERICAL DESCRIPTIVE MEASURE

Arithmetic Mean

Geometric Mean

Harmonic Mean

Median

Mode

SOME OTHER NUMERICAL DESCRIPTIVE MEASURES

Midhinge: average of the first and third quartiles

Midrange: average of the largest and smallest values

Quartiles: first and third quartiles

Range: difference between the largest and smallest items

Standard Deviation: positive square root of the mean of the squared deviations from the arithmetic mean (AM)

Variance: square of the SD

Coefficient of Variation: $CV = \frac{\sigma}{\bar{x}} \times 100$

Shape (Symmetric and Skewed): Skewness (difference between mean and mode)
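The measures listed above can be computed with Python's standard library. The following is only a minimal sketch: the sample `values` is hypothetical (it does not come from the slides), the SD is taken as the population SD to match the definition given here, and the "inclusive" quartile rule is an assumed convention, since several quartile conventions exist.

```python
import statistics

# Hypothetical sample for illustration only; it is not taken from the slides.
values = [12, 15, 18, 20, 22, 25, 30]


def describe(data):
    """Return the descriptive measures listed above for a list of numbers."""
    ordered = sorted(data)
    # quantiles(n=4) returns [Q1, Q2, Q3]; the "inclusive" method is one of
    # several quartile conventions and is an assumption made here.
    q1, _median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    mean = statistics.fmean(ordered)
    sd = statistics.pstdev(ordered)  # population SD: root of the mean squared deviation
    return {
        "midhinge": (q1 + q3) / 2,
        "midrange": (ordered[0] + ordered[-1]) / 2,
        "range": ordered[-1] - ordered[0],
        "sd": sd,
        "variance": sd ** 2,
        "cv_percent": sd / mean * 100,
    }


summary = describe(values)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Using `statistics.stdev` instead of `statistics.pstdev` would give the sample SD (divisor n - 1), which is slightly larger.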

DETECTING OUTLIERS

If the relative measure $z = \frac{x - \mu}{\sigma}$ of any value is less than -3 or more than 3, the value is said to be an outlier and is taken out of the study.
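The z-score rule above can be sketched in a few lines. The function name `z_score_outliers` and the data in `sample` are hypothetical, and the population SD is assumed for σ.

```python
import statistics


def z_score_outliers(values, threshold=3.0):
    """Return the values whose z-score lies below -threshold or above +threshold."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population SD, assumed for sigma
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in values if abs((x - mu) / sigma) > threshold]


# Hypothetical data: twenty values near 50 and one extreme value.
sample = [48, 49, 50, 51, 52] * 4 + [120]
print(z_score_outliers(sample))
```

With very small data sets no value can reach |z| > 3 under the population SD, so the rule is only informative for reasonably large samples.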

EXPLORATORY DATA ANALYSIS

Five Number Summary: Three quartiles together with the low and high data values give us a very useful look at the data and their spread.

Box and Whisker Plot: uses a Five-Number Summary to create a graphic sketch of the data.

[Figure: box and whisker diagram marking Xsmallest, Q1, Median, Q3 and Xlargest; each whisker covers 25% of the data and the box covers the middle 50%.]
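For ungrouped data, a five-number summary can be sketched as below. The data set `observations` is hypothetical, and the "inclusive" quartile method of `statistics.quantiles` is an assumed convention; other textbooks and packages may give slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for ungrouped data."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical observations, not taken from the slides.
observations = [2, 4, 4, 5, 7, 8, 9, 10, 12]
print(five_number_summary(observations))
```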

BOX AND WHISKER PLOT

Box and Whisker Plot uses a Five-Number Summary to create a graphic sketch of the data.

Box and Whisker Plot gives a graphical representation of the data set contained between the upper and lower limits. This plot shows the degree of symmetry (or skewness) based on the distances that separate the five numbers.

THE DISTRIBUTION IS POSITIVELY SKEWED IF

The distance from the median (md) to the third quartile (Q3) is greater than the distance from the median (md) to the first quartile (Q1).

The distance from the median (md) to the largest value is greater than the distance from the median (md) to the smallest value of the data.

The distance from the third quartile (Q3) to the largest value is greater than the distance from the first quartile (Q1) to the smallest value.

THE DISTRIBUTION IS NEGATIVELY SKEWED IF

The distance from the median (md) to the third quartile (Q3) is less than the distance from the median (md) to the first quartile (Q1).

The distance from the median (md) to the largest value is less than the distance from the median (md) to the smallest value of the data.

The distance from the third quartile (Q3) to the largest value is less than the distance from the first quartile (Q1) to the smallest value.

THE DISTRIBUTION IS PERFECTLY SYMMETRICAL IF

The distance from the median (md) to the third quartile (Q3) is equal to the distance from the median (md) to the first quartile (Q1).

The distance from the median (md) to the largest value is equal to the distance from the median (md) to the smallest value of the data.

The distance from the third quartile (Q3) to the largest value is equal to the distance from the first quartile (Q1) to the smallest value.
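The three distance comparisons for positive, negative and perfectly symmetrical distributions can be checked mechanically. The sketch below uses a hypothetical function name, `box_plot_shape`, and adds a "mixed" label for the case where the three comparisons disagree, which the slides do not cover.

```python
import math


def box_plot_shape(x_small, q1, md, q3, x_large):
    """Classify the shape from a five-number summary using the three comparisons."""
    pairs = [
        (q3 - md, md - q1),            # median to Q3 versus median to Q1
        (x_large - md, md - x_small),  # median to largest versus median to smallest
        (x_large - q3, q1 - x_small),  # right whisker versus left whisker
    ]
    signs = set()
    for right, left in pairs:
        if math.isclose(right, left, abs_tol=1e-9):
            signs.add(0)
        else:
            signs.add(1 if right > left else -1)
    if signs == {1}:
        return "positively skewed"
    if signs == {-1}:
        return "negatively skewed"
    if signs == {0}:
        return "symmetrical"
    return "mixed"  # the comparisons disagree; not covered by the slides


print(box_plot_shape(0, 25, 50, 75, 100))
```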

EXAMPLE: 1

List the five number summary and prepare a box and whisker plot from the following information. Are the data skewed?

Size       0-10   10-20   20-30   30-40   40-50   50-60
Frequency  10     12      25      35      40      50

Solution

Size     f     c.f.
0-10     10    10
10-20    12    22
20-30    25    47
30-40    35    82
40-50    40    122
50-60    50    172
N = 172

Largest value XL = 60

Smallest value Xs = 0

Size of Md = $\frac{N}{2} = \frac{172}{2}$ = 86th item

The c.f. just > 86 is 122. Hence class 40-50 is the median class.

$$Md = L + \frac{\frac{N}{2} - c.f.}{f} \times h = 40 + \frac{86 - 82}{40} \times 10 = 40 + 1 = 41$$

Size of Q1 = $\frac{N}{4} = \frac{172}{4} = 43$

The c.f. just > 43 is 47. Hence, Q1 lies in class 20-30.

$$Q_1 = L + \frac{\frac{N}{4} - c.f.}{f} \times h = 20 + \frac{43 - 22}{25} \times 10 = 20 + 8.4 = 28.4$$

Size of Q3 = $\frac{3N}{4} = \frac{3 \times 172}{4} = 129$

The c.f. just > 129 is 172. Hence, Q3 lies in class 50-60.

$$Q_3 = L + \frac{\frac{3N}{4} - c.f.}{f} \times h = 50 + \frac{129 - 122}{50} \times 10 = 50 + 1.4 = 51.4$$
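The interpolation used for Md, Q1 and Q3 in this example can be written as one small function. This is a sketch only: the name `grouped_quantile` and its parameters are hypothetical, and classes are assumed to be given as (lower, upper) limits with their frequencies.

```python
def grouped_quantile(classes, freqs, fraction):
    """Interpolated quantile L + ((fraction * N - c.f.) / f) * h for grouped data."""
    n = sum(freqs)
    target = fraction * n
    cum_before = 0  # c.f. of the classes preceding the current one
    for (lower, upper), f in zip(classes, freqs):
        # The quantile class is the first one whose c.f. is just greater than the target.
        if cum_before + f > target:
            return lower + (target - cum_before) / f * (upper - lower)
        cum_before += f
    return classes[-1][1]  # the target equals N: return the last upper limit


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [10, 12, 25, 35, 40, 50]

q1 = grouped_quantile(classes, freqs, 0.25)
md = grouped_quantile(classes, freqs, 0.50)
q3 = grouped_quantile(classes, freqs, 0.75)
print(round(q1, 2), round(md, 2), round(q3, 2))
```

Together with the smallest value 0 and the largest value 60, these three values make up the five number summary of the example.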

EXAMPLE: 1

Solution

The five number summary {Xs, Q1, Md, Q3, Xl} is {0, 28.4, 41, 51.4, 60}

[Figure: box and whisker plot marking Xs, Q1, Md, Q3 and Xl on an axis from 0 to 60.]

Since the left whisker is longer than the right whisker, the given distribution is negatively (left) skewed.

PU2016

To help determine the need for more golf courses, a survey was undertaken. A sample of 50 self-declared golfers was asked how many rounds of golf they played last year. These data are as follows:

18 15 25 18 29 28 23 20 13 26
29 24 13 13 28 26 18 30 24 23
26 17 18 30 28 18 19 16 15 35
21 15 17 24 26 22 32 38 20 15
35 18 14 25 19 28 28 12 30 36

a) Summarize these data using a stem and leaf display.

b) Divide the data set into five classes of equal width and construct a frequency distribution and relative frequency distribution.

c) Construct a frequency histogram and frequency polygon.

d) Compute the mean and standard deviation of the grouped data set constructed in (b).

e) Construct a box and whisker plot of the grouped data set prepared in (b). Then, describe the shape of the data.
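A stem and leaf display for these 50 values can be produced with a short script. This is only a sketch: the function name `stem_and_leaf` is hypothetical, and it assumes two-digit values split into a tens stem and a units leaf.

```python
from collections import defaultdict

golf_rounds = [
    18, 15, 25, 18, 29, 28, 23, 20, 13, 26,
    29, 24, 13, 13, 28, 26, 18, 30, 24, 23,
    26, 17, 18, 30, 28, 18, 19, 16, 15, 35,
    21, 15, 17, 24, 26, 22, 32, 38, 20, 15,
    35, 18, 14, 25, 19, 28, 28, 12, 30, 36,
]


def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its sorted leaves (units digits)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}


display = stem_and_leaf(golf_rounds)
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Reading the leaves row by row also gives the data in ascending order.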

Stem   Leaf
1      23334555567788888899
2      0012334445566668888899
3      00025568

Data in ascending order (first 29 of the 50 values): 12 13 13 13 14 15 15 15 15 16 17 17 18 18 18 18 18 18 19 19 20 20 21 22 23 23 24 24 24

Class interval = (L - S)/n = (38 - 12)/5 = 5.2

Bin   Class   CF   FREQ   RF     m    fm     fm2
17    12-18   12   12     0.24   15   180    2700
23    18-24   26   14     0.28   21   294    6174
29    24-30   42   16     0.32   27   432    11664
35    30-36   48   6      0.12   33   198    6534
42    36-42   50   2      0.04   39   78     3042
Total                                 1182   30114

mean = 23.64, sd = 6.59

smallest = 12, largest = 38

Q1 = 15 + ((12.5 - 5)/15) × 5 = 17.5
Q2 = 20 + ((25 - 20)/9) × 5 = 22.78
Q3 = 25 + ((37.5 - 29)/13) × 5 = 28.27

Five number summary: 12, 17.5, 22.8, 28.27, 38
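The grouped mean and standard deviation in this worksheet follow from the class midpoints m and frequencies f. The sketch below reproduces the fm and fm2 column totals; it assumes the population form of the SD, which is the form that matches the value 6.59 reported in the worksheet.

```python
import math

# Midpoints and frequencies of the classes 12-18, 18-24, 24-30, 30-36, 36-42.
midpoints = [15, 21, 27, 33, 39]
freqs = [12, 14, 16, 6, 2]

n = sum(freqs)
sum_fm = sum(f * m for f, m in zip(freqs, midpoints))       # total of the fm column
sum_fm2 = sum(f * m * m for f, m in zip(freqs, midpoints))  # total of the fm2 column

mean = sum_fm / n
# Population SD for grouped data: sqrt(sum(f * m^2) / N - mean^2)
sd = math.sqrt(sum_fm2 / n - mean ** 2)

print(f"N = {n}, sum fm = {sum_fm}, sum fm2 = {sum_fm2}")
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```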

lower       upper   frequency   percent
10     <    15      5           10.0
15     <    20      15          30.0
20     <    25      9           18.0
25     <    30      13          26.0
30     <    35      4           8.0
35     <    40      4           8.0
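A frequency table of this kind can be rebuilt from the 50 golf values. The sketch below assumes the upper limit excluded form (lower <= x < upper) with classes of width 5 starting at 10; the function name `frequency_table` is hypothetical.

```python
golf_rounds = [
    18, 15, 25, 18, 29, 28, 23, 20, 13, 26,
    29, 24, 13, 13, 28, 26, 18, 30, 24, 23,
    26, 17, 18, 30, 28, 18, 19, 16, 15, 35,
    21, 15, 17, 24, 26, 22, 32, 38, 20, 15,
    35, 18, 14, 25, 19, 28, 28, 12, 30, 36,
]


def frequency_table(values, start, width, n_classes):
    """Return rows of (lower, upper, frequency, percent), upper limit excluded."""
    counts = [0] * n_classes
    for v in values:
        index = (v - start) // width
        if 0 <= index < n_classes:
            counts[index] += 1
    total = len(values)
    return [(start + i * width, start + (i + 1) * width, c, 100 * c / total)
            for i, c in enumerate(counts)]


table = frequency_table(golf_rounds, start=10, width=5, n_classes=6)
for lower, upper, freq, percent in table:
    print(f"{lower} < {upper}  {freq:>3}  {percent:.1f}")
```

The percent column is what the histogram and the frequency polygon plot against the class limits.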

[Figure: Histogram. Vertical axis: Percent (0 to 35); horizontal axis: Data.]

[Figure: Frequency Polygon. Vertical axis: Percent (0.0 to 35.0); horizontal axis: Data (10 to 35).]

BOX WHISKER PLOT

[Figure: Box Whisker Plot marking 12, 17.5, 22.8, 28.3 and 38 on an axis from 10 to 40.]

Since the right whisker is longer than the left whisker, the given distribution is positively (right) skewed.

2014 PU

The following data represent the cost of electricity during June 2014 for a random sample of 50 one-bedroom apartments in a large city.

Raw Data on Utility Charges (Rs)

96 121 202 178 147 102 153 197 127 82
157 185 90 116 172 111 148 213 130 165
141 149 206 175 123 128 144 168 109 167
95 163 150 154 130 143 187 166 139 149
108 119 183 151 114 135 191 137 129 158

a) Construct a frequency distribution using intervals of 80-100, 100-120 and so on.

b) Construct a frequency histogram and frequency polygon. Around what amount does the monthly electricity cost seem to be concentrated?

c) Construct an ogive to find the value of the median.

d) Construct a box and whisker plot of the grouped data set prepared in (a). Then, describe the shape of the data.

e) Compare the median value of (c) and (d). Are they equal? If so, what can be concluded?
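Parts (a) and (c) can be approached numerically as sketched below. The sketch assumes the upper limit excluded form for the classes 80-100, 100-120 and so on, and reads the median off a less-than ogive by linear interpolation at N/2; the function name `ogive_median` is hypothetical.

```python
from itertools import accumulate

charges = [
    96, 121, 202, 178, 147, 102, 153, 197, 127, 82,
    157, 185, 90, 116, 172, 111, 148, 213, 130, 165,
    141, 149, 206, 175, 123, 128, 144, 168, 109, 167,
    95, 163, 150, 154, 130, 143, 187, 166, 139, 149,
    108, 119, 183, 151, 114, 135, 191, 137, 129, 158,
]

width = 20
lower_limits = list(range(80, 220, width))  # 80, 100, ..., 200

# Frequency of each class, counting lower <= x < lower + width.
freqs = [sum(1 for c in charges if lo <= c < lo + width) for lo in lower_limits]
# Less-than cumulative frequencies: the heights of the ogive at the upper limits.
cum_freqs = list(accumulate(freqs))


def ogive_median(lower_limits, freqs, width):
    """Interpolate along the less-than ogive to the point where c.f. = N / 2."""
    target = sum(freqs) / 2
    cum_before = 0
    for lo, f in zip(lower_limits, freqs):
        if cum_before + f > target:
            return lo + (target - cum_before) / f * width
        cum_before += f
    return lower_limits[-1] + width


for lo, f, cf in zip(lower_limits, freqs, cum_freqs):
    print(f"{lo}-{lo + width}: f = {f}, c.f. = {cf}")
print(f"median from ogive = {ogive_median(lower_limits, freqs, width):.2f}")
```

The interpolated value is the same one the grouped median formula gives, which is relevant to the comparison asked for in part (e).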

2016 SPRING PU

The president of Ocean Airlines is trying to estimate when the Federal Aviation Administration (FAA) is most likely to rule on the company's application for a new flight between Charlotte and Nashville. Assistants to the president have assembled the following waiting times for applications filed during the past year. The data are given in days from the date of application until an FAA ruling.

14 40 13 48 31 40 25 33 62 12
44 34 68 11 33 42 26 55 47 11
29 40 41 30 34 31 64 35 57 63
44 44 17 52 32 36 34 53 41 39
29 22 28 44 51 31 44 28 56 53

(i) Arrange the above data in ascending order by preparing a stem and leaf display.

(ii) Construct a frequency distribution using 6 equally spaced intervals. Also construct a histogram and comment on the shape of the distribution.

(iii) Prepare a box and whisker plot from the grouped data set prepared in (ii) and then describe the nature of the distribution of the data points.

(iv) Compute the mean and coefficient of variation from the grouped data prepared in (ii).

(v) Detect the outliers, if any, using exploratory data analysis.
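Parts (ii) and (iv) can be sketched as follows. The choice of six classes 10-20, 20-30, ..., 60-70 (upper limit excluded) is an assumption made here, not something the question prescribes, and the population SD is assumed for the coefficient of variation.

```python
import math

waiting_days = [
    14, 40, 13, 48, 31, 40, 25, 33, 62, 12,
    44, 34, 68, 11, 33, 42, 26, 55, 47, 11,
    29, 40, 41, 30, 34, 31, 64, 35, 57, 63,
    44, 44, 17, 52, 32, 36, 34, 53, 41, 39,
    29, 22, 28, 44, 51, 31, 44, 28, 56, 53,
]

# Assumed classes: six intervals of width 10 starting at 10.
width = 10
lower_limits = [10, 20, 30, 40, 50, 60]

freqs = [sum(1 for d in waiting_days if lo <= d < lo + width) for lo in lower_limits]
midpoints = [lo + width / 2 for lo in lower_limits]

n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
# Population SD for grouped data: sqrt(sum(f * m^2) / N - mean^2)
sd = math.sqrt(sum(f * m * m for f, m in zip(freqs, midpoints)) / n - mean ** 2)
cv = sd / mean * 100  # coefficient of variation in percent

for lo, f in zip(lower_limits, freqs):
    print(f"{lo}-{lo + width}: {f}")
print(f"mean = {mean:.2f}, sd = {sd:.2f}, CV = {cv:.2f}%")
```

A different choice of class limits would change the frequencies and therefore the grouped mean and CV slightly.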
