Statistics

Polytechnic University of the Philippines

College of Engineering

Department of Mechanical Engineering

TOTAL QUALITY MANAGEMENT FOR ME

Basic Statistics

Submitted by:

Aromin, Albert S.

Marcelo, Christian John B.

BSME IV-1

Submitted to:

Prof. Rhodora Nicolas Buluran

Instructor

Introduction

A. Definition/Importance

Statistics is a collection of quantitative data pertaining to any subject or group, especially when the data are systematically gathered and collated.

It is also the science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.

It helps in understanding the quality control charts that are used in business and manufacturing processes.

Suppose that we want to control the diameter of the piston rings that we produce. The center line in the X-bar chart would represent the desired standard size of the rings (for example, the diameter in millimeters), and the center line in the R chart would represent the acceptable range of the rings within samples and within specifications.
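As a rough illustration of the two center lines, the sketch below estimates them from sample data. The diameters are hypothetical, and estimating the center lines from the data (grand mean of the sample means, and mean of the sample ranges) is an assumption that applies when no standard value is specified in advance.

```python
from statistics import mean

# Hypothetical piston-ring diameters in millimeters: four samples of five rings each.
samples = [
    [74.01, 73.99, 74.00, 74.02, 73.98],
    [74.00, 74.01, 73.99, 74.00, 74.00],
    [73.98, 74.02, 74.01, 73.99, 74.00],
    [74.01, 74.00, 74.00, 73.99, 74.02],
]

# One mean and one range per sample.
sample_means = [mean(s) for s in samples]
sample_ranges = [max(s) - min(s) for s in samples]

# Center lines estimated from the data (assumption: no standard value is given).
x_bar_center = mean(sample_means)
r_center = mean(sample_ranges)

print(f"X-bar chart center line: {x_bar_center:.3f} mm")
print(f"R chart center line:     {r_center:.4f} mm")
```

If a standard diameter is specified, that value would be used as the X-bar center line instead of the estimate.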

B. Two Phases of Statistics

A. Descriptive/Deductive

Endeavors to describe a subject or group.

B. Inductive

Endeavors to determine from a limited amount of data (the sample) an important conclusion about a much larger amount of data (the population).

C. Collecting the data

A. Variables

Those quality characteristics that are measurable.

B. Attributes

Those quality characteristics that are classified as either conforming or not conforming to specifications.

C. Continuous

A variable that is capable of any degree of subdivision.

D. Discrete

A variable that exhibits gaps.

Difference between Precision and Accuracy

Accuracy

Degree of conformity of a measure to a standard or a true value

Precision

The degree of refinement with which an operation is performed or a measurement

stated

D. Describing the data

Frequency distribution

A summarization of how the data points occur within each subdivision of observed

values or groups of observed values.

Measures of Central Tendency

It is a numerical value that describes the central position of the data or how the data

tend to build up in the center.

Measures of Dispersion

It describes how the data are spread out or scattered on each side of the central

value.

Frequency Distribution

Ungrouped Data

Comprise a listing of the observed values

Array

It is the arrangement of raw numerical data in ascending or descending order of magnitude.

Frequency

It is the numerical value for the number of tallies.

Grouped Data

It represents a lumping together of the observed values

Steps in creating Frequency Distribution Table for Grouped Data

1. Collect data and construct a tally sheet

2. Determine the range

The range is the difference between the highest observed value and the lowest observed value.

$R = X_h - X_l$

Where:

R = range

$X_h$ = highest number

$X_l$ = lowest number

Example:

$R = 2.575 - 2.531 = 0.044$

3. Determine the cell interval

The cell interval is the distance between adjacent cell midpoints.

For the example problem, the answer is

Where:

h = no. of cells

Example:

4. Determine the cell midpoints

= midpoint for lowest cell

Example:

5. Determine the cell boundaries

Cell boundaries are the extreme or limit values of a cell, referred to as the upper

boundary and the lower boundary.

6. Post the cell frequency
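The six steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements are hypothetical (only the extremes 2.531 and 2.575 come from the example), the cell interval of 0.005 is assumed, and the lowest cell is assumed to start at the lowest observed value.

```python
from collections import Counter

# Step 1: hypothetical measurements (only 2.531 and 2.575 appear in the text).
data = [2.531, 2.538, 2.542, 2.545, 2.547, 2.550, 2.552, 2.553,
        2.555, 2.556, 2.558, 2.561, 2.563, 2.566, 2.570, 2.575]

SCALE = 1000  # work in integer thousandths to avoid floating-point rounding
values = [round(x * SCALE) for x in data]
lowest = min(values)

# Step 2: range R = Xh - Xl, here in thousandths.
data_range = max(values) - lowest

# Step 3: assumed cell interval of 0.005 (5 thousandths).
interval = 5
n_cells = data_range // interval + 1

# Step 6 (tally): count how many values fall in each cell.
counts = Counter((v - lowest) // interval for v in values)

table = []
for k in range(n_cells):
    low = lowest + k * interval
    table.append((
        (low - 0.5) / SCALE,             # step 5: lower cell boundary
        (low + interval - 0.5) / SCALE,  # step 5: upper cell boundary
        (low + interval // 2) / SCALE,   # step 4: cell midpoint
        counts.get(k, 0),                # step 6: cell frequency
    ))

for lower, upper, mid, freq in table:
    print(f"{lower:.4f} - {upper:.4f}   midpoint {mid:.3f}   frequency {freq}")
```

The boundaries carry one more decimal place than the data, so no observation can fall exactly on a boundary.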

Measures of Central Tendency

In general terms, central tendency is a statistical measure that determines a

single value that accurately describes the center of the distribution.

Measures of central tendency are statistical measures which describe the

position of a distribution.

The value or the figure which represents the whole series is neither the lowest

value in the series nor the highest. It lies somewhere between these two

extremes.

AVERAGE

The statistical mean of a set of observations is the average measurement in a set of data.

Ungrouped

$\bar{X} = \frac{\sum X}{N}$

Example:

1.4 1.5 2.6 3.9 2.4 1.9 3.6 2.5 2.4 3.2

It can be arranged in ascending order

1.4 1.5 1.9 2.4 2.4 2.5 2.6 3.2 3.6 3.9

Dividing the sum of the items, 25.4, by the total number of items, 10, gives a mean of 2.54.
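The same arithmetic can be checked with a short Python sketch; the variable names are illustrative only.

```python
# Ungrouped data from the example.
data = [1.4, 1.5, 2.6, 3.9, 2.4, 1.9, 3.6, 2.5, 2.4, 3.2]

total = sum(data)            # sum of all observations
average = total / len(data)  # mean = sum divided by the number of items

print(f"sum = {total:.1f}, n = {len(data)}, mean = {average:.2f}")
```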

GROUPED

Example:

Median

The median of a set of observations is the value that, when the observations are

arranged in ascending or descending order, satisfies the following conditions

a. If the number of observations is odd, the median is the middle value

b. If the number of observations is even, the median is the average of the two

middle numbers

Example:

1.4 1.5 2.6 3.9 2.4 1.9 3.6 2.5 2.4 3.2

It can be arranged in ascending order

1.4 1.5 1.9 2.4 2.4 2.5 2.6 3.2 3.6 3.9

Since there are ten items, the two middle values are 2.4 and 2.5. Thus the median is (2.4 + 2.5) / 2 = 2.45.
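A minimal sketch of the odd/even rule follows. The function name `median_of` is a hypothetical helper for this illustration; Python's standard library offers `statistics.median` for the same purpose.

```python
def median_of(values):
    """Return the median using the odd/even rule described above."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the median is the middle value.
        return ordered[middle]
    # Even count: the median is the average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


data = [1.4, 1.5, 2.6, 3.9, 2.4, 1.9, 3.6, 2.5, 2.4, 3.2]
print(f"median = {median_of(data):.2f}")
```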


Mode

A. UNGROUPED

The mode of a set of observations is the specific value that occurs with the greatest

frequency

1.4 1.5 2.6 3.9 2.4 1.9 3.6 2.5 2.4 3.2

MODE = 2.4
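For ungrouped data, the mode can be found by counting occurrences. The sketch below uses `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

data = [1.4, 1.5, 2.6, 3.9, 2.4, 1.9, 3.6, 2.5, 2.4, 3.2]

# Count how often each value occurs, then take the most frequent one.
counts = Counter(data)
mode_value, mode_count = counts.most_common(1)[0]

print(f"mode = {mode_value} (occurs {mode_count} times)")
```

If several values tie for the highest frequency, this sketch reports only one of them.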

B. GROUPED

$\text{Mode} = L + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) C$

Where:

L = exact lower limit of the modal class

$\Delta_1$ = numerical difference between the frequency of the modal class and the frequency of the adjacent lower class

$\Delta_2$ = numerical difference between the frequency of the modal class and the frequency of the adjacent higher class

C = class interval
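As a sketch of how these quantities combine, the function below assumes the usual interpolation form, Mode = L + [Δ1 / (Δ1 + Δ2)] × C, where Δ1 and Δ2 are the two frequency differences defined above. The function name and the class frequencies are hypothetical.

```python
def grouped_mode(lower_limit, f_modal, f_lower, f_higher, class_interval):
    """Estimate the mode of grouped data by interpolating within the modal class."""
    d1 = f_modal - f_lower    # difference from the adjacent lower class
    d2 = f_modal - f_higher   # difference from the adjacent higher class
    return lower_limit + d1 / (d1 + d2) * class_interval


# Hypothetical modal class: exact lower limit 19.5, frequency 12,
# neighbors with frequencies 8 (lower) and 6 (higher), class interval 5.
mode_estimate = grouped_mode(19.5, 12, 8, 6, 5)
print(f"grouped mode = {mode_estimate:.2f}")
```

The estimate falls closer to the neighbor with the larger frequency, which is the intent of the interpolation.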

Measures of Dispersion

Measures of dispersion are descriptive statistics that describe how similar the values are to each other.

They describe how the data are spread out or scattered on each side of the central value.

The more similar the scores are to each other, the lower the measure of dispersion will be.

The less similar the scores are to each other, the higher the measure of dispersion will be.

In general, the more spread out a distribution is, the larger the measure of dispersion will be.

Which of the distributions of scores has the larger dispersion?

[Figure: two bar charts of score distributions, each with a horizontal axis from 1 to 10 and a vertical axis from 0 to 125]

The left-hand distribution has more dispersion because the scores are more spread out. That is, they are less similar to each other.

Measures of dispersion:

– The range

– Variance / standard deviation

Other Measures

– Coefficient of Variation

– Skewness and Kurtosis

The Range

The range is defined as the difference between the largest score in the set of

data and the smallest score in the set of data, Xh - Xl

What is the range of the following data:

4 8 1 6 6 2 9 3 6 9

The largest score (Xh) is 9; the smallest score (Xl) is 1; the range is Xh – Xl = 9 – 1 = 8

The range is used when:

you have ordinal data or

you are presenting your results to people with little or no knowledge of

statistics

The range is rarely used in scientific work as it is fairly insensitive

It depends on only two scores in the set of data, Xh and Xl

Two very different sets of data can have the same range:

1 1 1 1 9 vs 1 3 5 7 9
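A short Python sketch makes the point concrete: all three data sets used in this section share the same range even though their values are distributed very differently. The function name `data_range` is illustrative.

```python
def data_range(values):
    """Range = largest score minus smallest score (Xh - Xl)."""
    return max(values) - min(values)


scores = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]
clustered = [1, 1, 1, 1, 9]
even_spread = [1, 3, 5, 7, 9]

print(data_range(scores))       # 8
print(data_range(clustered))    # 8
print(data_range(even_spread))  # 8
```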

Variance

Variance is defined as the average of the squared deviations:

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

What Does the Variance Formula Mean?

First, it says to subtract the mean from each of the scores.

This difference is called a deviate or a deviation score.

The deviate tells us how far a given score is from the typical, or average,

score

Thus, the deviate is a measure of dispersion for a given score

X     d = X - μ           d² = (X - μ)²
11    (11 - 19) = -8      64
13    (13 - 19) = -6      36
17    (17 - 19) = -2      4
18    (18 - 19) = -1      1
21    (21 - 19) = 2       4
24    (24 - 19) = 5       25
29    (29 - 19) = 10      100

N = 7
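The table can be reproduced, and the variance completed, with the sketch below. It assumes the population (divide by N) definition of variance given earlier in this section; the variable names are illustrative.

```python
scores = [11, 13, 17, 18, 21, 24, 29]
n = len(scores)

mu = sum(scores) / n                        # mean of the scores
deviations = [x - mu for x in scores]       # deviate for each score
squared = [d ** 2 for d in deviations]      # squared deviates

sum_of_squares = sum(squared)
variance = sum_of_squares / n               # mean of the squared deviates

for x, d, d2 in zip(scores, deviations, squared):
    print(f"{x:>3}  {d:>6.1f}  {d2:>6.1f}")
print(f"N = {n}, mean = {mu:.0f}, sum of squares = {sum_of_squares:.0f}, "
      f"variance = {variance:.2f}")
```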

Computational Formula Example

What Does the Variance Formula Mean?

• Variance is the mean of the squared deviation scores

• The larger the variance is, the more the scores deviate, on average, away from

the mean

• The smaller the variance is, the less the scores deviate, on average, from the

mean

Standard Deviation

• When the deviate scores are squared in variance, their unit of measure is

squared as well

– E.g., if people’s weights are measured in pounds, then the variance of the weights would be expressed in pounds² (or squared pounds)

• Since squared units of measure are often awkward to deal with, the square root of variance is often used instead

– The standard deviation is the square root of variance

• Standard deviation = √variance

• Variance = (standard deviation)²

Computational Formula

When calculating variance, it is often easier to use a computational formula, which is algebraically equivalent to the definitional formula:

$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{\sum (X - \mu)^2}{N}$

σ² is the population variance, X is a score, μ is the population mean, and N is the number of scores.
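The equivalence of the two forms can be checked numerically. The sketch below assumes the computational form σ² = [ΣX² − (ΣX)²/N] / N and reuses the seven scores from the deviation table; the variable names are illustrative.

```python
scores = [11, 13, 17, 18, 21, 24, 29]
n = len(scores)

sum_x = sum(scores)                    # sum of the scores
sum_x2 = sum(x * x for x in scores)    # sum of the squared scores

# Computational formula: no deviations need to be calculated.
computational = (sum_x2 - sum_x ** 2 / n) / n

# Definitional formula: mean of the squared deviations.
mu = sum_x / n
definitional = sum((x - mu) ** 2 for x in scores) / n

print(f"sum X = {sum_x}, sum X^2 = {sum_x2}")
print(f"computational = {computational:.4f}, definitional = {definitional:.4f}")
```

The computational form needs only two running totals, which is why it was convenient for hand calculation.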

Variance of a Sample

$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$

s² is the sample variance, X is a score, X̄ is the sample mean, and N is the number of scores.

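Standard Deviation

• $s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$

Standard Deviation

Ungrouped – SAMPLE

• $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}}$

Standard Deviation

Grouped – SAMPLE

• $s = \sqrt{\frac{\sum f X^2 - \frac{(\sum f X)^2}{N}}{N - 1}}$, where f is the frequency of a cell, X is the cell midpoint, and N = ∑f

Standard Deviation

Grouped – EXAMPLE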
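A sketch of the sample standard deviation for ungrouped and grouped data follows. It assumes the usual sample formulas with an N − 1 divisor; the grouped version treats every observation in a cell as equal to the cell midpoint. The grouped midpoints and frequencies are hypothetical, and `statistics.stdev` from the standard library is used only as a cross-check.

```python
from math import sqrt
from statistics import stdev

# Ungrouped sample: definitional form with N - 1 in the denominator.
scores = [11, 13, 17, 18, 21, 24, 29]
n = len(scores)
mean_x = sum(scores) / n
s_ungrouped = sqrt(sum((x - mean_x) ** 2 for x in scores) / (n - 1))

# Grouped sample: hypothetical cell midpoints and frequencies.
midpoints = [10, 20, 30]
freqs = [2, 5, 3]
n_grouped = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
s_grouped = sqrt((sum_fx2 - sum_fx ** 2 / n_grouped) / (n_grouped - 1))

print(f"ungrouped s = {s_ungrouped:.3f} (library check: {stdev(scores):.3f})")
print(f"grouped s   = {s_grouped:.3f}")
```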

Coefficient of variation

A measure that allows statisticians to compare the variation of two or more different variables.

It is used to compare distributions with different means.

The distribution with the largest coefficient of variation has the greatest relative variation.

$CV = \frac{s}{\bar{X}} \times 100\%$

Example 1

The mean number of parking tickets issued over a 4-month period is 90, with a standard deviation of 5. The average revenue is 5,400, with a standard deviation of 775. Compare the variations of the two variables.

CV(tickets) = 5 / 90 × 100 = 5.56%

CV(revenue) = 775 / 5400 × 100 = 14.35%

Since CV(revenue) > CV(tickets), the recorded revenue shows more relative variability.
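The comparison in Example 1 can be reproduced with a small helper. The function name `coefficient_of_variation` is illustrative; it simply divides the standard deviation by the mean and expresses the result as a percentage.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    return std_dev / mean * 100


cv_tickets = coefficient_of_variation(5, 90)
cv_revenue = coefficient_of_variation(775, 5400)

print(f"CV(tickets) = {cv_tickets:.2f}%")
print(f"CV(revenue) = {cv_revenue:.2f}%")
print("More relative variability in:",
      "revenue" if cv_revenue > cv_tickets else "tickets")
```

Because the result is unitless, it allows ticket counts and revenue amounts to be compared directly.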

Skewness and Kurtosis

Measure of Skew

• If s³ < 0, then the distribution has a negative skew

• If s³ > 0, then the distribution has a positive skew

• If s³ = 0, then the distribution is symmetrical

• The more different s³ is from 0, the greater the skew in the distribution

Kurtosis

• Kurtosis measures whether the scores are spread out more or less than they

would be in a normal distribution

• When the distribution is normal, its kurtosis equals 3 and it is said to be mesokurtic

• When the distribution is less spread out than normal, its kurtosis is greater than 3

and it is said to be leptokurtic

• When the distribution is more spread out than normal, its kurtosis is less than 3 and

it is said to be platykurtic

Measure of Kurtosis

• The measure of kurtosis is given by:

$s^4 = \frac{\sum (X - \bar{X})^4 / N}{\left( \sum (X - \bar{X})^2 / N \right)^2}$

Difference between Skewness and Kurtosis

Skewness: the extent and direction of the asymmetry of the distribution

Kurtosis: the degree of peakedness of the distribution

Collectively, the variance (s²), skew (s³), and kurtosis (s⁴) describe the shape of the distribution.
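Both shape measures can be sketched from central moments. The kurtosis function divides the fourth central moment by the square of the second, which is the form that equals 3 for a normal distribution. The skewness function assumes the analogous moment ratio (third central moment over the second raised to the 3/2 power). The function names and data sets are hypothetical.

```python
def central_moment(values, k):
    """Average of the k-th powers of the deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** k for x in values) / len(values)


def skewness(values):
    # Assumed moment-ratio form: third moment over second moment to the 3/2.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    # Fourth moment over the squared second moment (equals 3 for a normal curve).
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(f"symmetric:    skew = {skewness(symmetric):.3f}, "
      f"kurtosis = {kurtosis(symmetric):.3f}")
print(f"right-tailed: skew = {skewness(right_tailed):.3f}, "
      f"kurtosis = {kurtosis(right_tailed):.3f}")
```

The symmetric set gives a skew of zero and a kurtosis below 3, while the set with a long right tail gives a positive skew.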

