STAT 12049 Week 7: Discrete random variables and discrete distributions

Lecture TQM

May 27, 2017

kkarthik101

Page 1: Lecture TQM

1

STAT 12049 Week 7

Discrete random variables and discrete distributions

Objectives

On completion of this week's material you should be able to:

define a random variable, a discrete random variable and a probability distribution

describe conditions under which the binomial distribution should be used

describe conditions under which the Poisson distribution should be used

calculate probabilities using the binomial and Poisson distributions

Discrete random variables and discrete distributions

It is often argued that a significant problem with management is their failure to understand uncertainty and variation in a system or process.

This uncertainty needs to be incorporated into any analysis where decisions regarding people and processes are made.

Over the next two weeks we will discuss how to model such uncertainty.

Deming’s Red Bead Experiment

This experiment is a useful way of illustrating how causes are often associated with certain occurrences when no association is justified.

You are given a jar with a large number of beads in it: 90% are white and 10% are red.

This jar represents a production process where a white bead is an acceptable item and a red bead is a defect.

The results of the process are simulated by each participant taking repeated samples of 50 beads (using a paddle).

The first participant, Joe, samples 3 red beads out of his 50 – he gets praised for a job well done, since this is better than the 10% average.

Mary now has her turn and finds 2 red beads among the 50 – her defect rate is considerably smaller and so she is promoted to line supervisor.

John finds 8 defectives in the 50 and is scolded for his poor performance.

This illustrates how easy it is to blame workers for problems with the system.

Although this experiment seems a little contrived, this practice occurs frequently in the “real world”.

Management in this experiment has missed the fact that the defects are inherent in the process and that the outcomes are generated by random draws of beads. The workers have no influence!

There are, of course, instances when workers are to blame for defects (errors, sickness, inattentiveness etc.), but these cases usually account for only a small share of the total number of defects.

Most variations and defects are a result of the system, so it is the system that needs improvement – the workers do not need reprimanding.

Deming believes that 94% of problems belong to the system and thus are the responsibility of management and only 6% are due to the operator.

We often need some way of measuring how likely a certain event is – this week we begin to do this.

For example, in the Red Bead Experiment, it is highly unlikely that 15 or more defects will occur in one sample, but not unlikely for 7 or more defects to occur.
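
The claim about 7 and 15 defects can be checked numerically. The sketch below is only an illustration: it assumes that each of the 50 beads is red with probability 0.10 independently of the others (reasonable when the jar is large) and uses the binomial model covered later in this week's material. The function names are hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials, each with probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_at_least(k, n, p):
    """P(X >= k), found by adding the individual probabilities."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))


# Red Bead Experiment: 50 beads per sample, 10% red.
# Assumption: draws are treated as independent because the jar is large.
n_beads, p_red = 50, 0.10

p_7_or_more = prob_at_least(7, n_beads, p_red)
p_15_or_more = prob_at_least(15, n_beads, p_red)

print(f"P(7 or more red beads)  = {p_7_or_more:.4f}")
print(f"P(15 or more red beads) = {p_15_or_more:.6f}")
```

Under these assumptions, 7 or more red beads turns up in roughly one sample in four or five, while 15 or more is rarer than one sample in a thousand.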

A random variable is a variable whose outcome is uncertain.

We can often predict the value that a random variable will take, before the activity that leads to this variable occurs.

For example we are uncertain about the weight of a box of cereal until we actually measure it, but past experience may allow us to predict the weight with a degree of certainty.

A random variable that can only take on distinct values (often whole numbers) is a discrete random variable.

We denote a random variable with a capital letter such as X.

The possible outcomes of this random variable are indicated by lower case letters.

If, for example, a random variable represents the number of flaws in a product, then its possible outcomes will be: x=0 (no flaws), x=1 (exactly 1 flaw), x=2 (exactly 2 flaws) etc.

If a random variable is the outcome of a quality inspection, then its possible values could be: x=0 (good) and x=1 (defective).

Probabilities are always numbers between 0 and 1 (inclusive), so they are never negative.

The sum of the probabilities for all possible outcomes must always be 1.

Often these values come from past experience.

For example, if we know that in the past 98% of items have been acceptable and 2% defective, then the probabilities are: P(item is good)=P(X=0)=0.98 and P(item is defective)=P(X=1)=0.02.

Example

Let X be the outcome of a roll of a fair die.

Then we know that P(X=1)=P(X=2)=…=P(X=6)=1/6

Example

If we toss a fair coin three times and let X be the number of heads, then we can create a table of probabilities for the four possible outcomes: 0, 1, 2 and 3.

x         0     1     2     3
P(X=x)    1/8   3/8   3/8   1/8

Let us see how these probabilities were calculated by first listing all the possibilities:

TTT (x=0)
TTH, THT, HTT (x=1)
THH, HTH, HHT (x=2)
HHH (x=3)

Since there are eight possibilities, occurring 1, 3, 3 and 1 times respectively, the probabilities are 1/8, 3/8, 3/8 and 1/8 respectively.

Note that these probabilities are all non-negative and they sum to one.
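
The listing above can also be produced by enumeration. The following is a minimal sketch, assuming the eight sequences of heads and tails are equally likely (a fair coin); the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# List all 2 x 2 x 2 = 8 equally likely sequences of three tosses.
sequences = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Count how many sequences give each number of heads.
counts = Counter(seq.count("H") for seq in sequences)

# Each sequence has probability 1/8, so P(X = x) = (number of sequences) / 8.
distribution = {x: Fraction(counts[x], len(sequences)) for x in sorted(counts)}

for x, prob in distribution.items():
    print(f"P(X={x}) = {prob}")
print("Sum of probabilities:", sum(distribution.values()))
```

Exact fractions are used so that the probabilities match the table without rounding.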

Example

Let X be the number of defects found in the inspection of a microwave oven component. From previous years' inspections, we know that the following distribution describes these defects:

x         0     1     2      3
P(X=x)    0.7   0.2   0.06   0.04

We can calculate probabilities of related events, such as the probability of at most one defect:

$$P(X \le 1) = P(X=0 \text{ or } X=1) = P(X=0) + P(X=1) = 0.7 + 0.2 = 0.9$$

The values x=0, 1, … are basic outcomes as they cannot occur at the same time.

The probability of the related event (X ≤ 1) is the sum of the probabilities of the basic outcomes it contains.

Another example is the probability of at least one defect:

$$P(X \ge 1) = P(X=1 \text{ or } X=2 \text{ or } X=3) = P(X=1) + P(X=2) + P(X=3) = 0.2 + 0.06 + 0.04 = 0.3$$
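
Both event probabilities follow the same rule: add up the probabilities of the basic outcomes that make up the event. A small sketch of that rule, using the defect distribution from the example (the helper `event_probability` is a hypothetical name, not part of any library):

```python
# Distribution of the number of defects, taken from the example table.
defect_probs = {0: 0.7, 1: 0.2, 2: 0.06, 3: 0.04}


def event_probability(probs, condition):
    """Add the probabilities of the basic outcomes x for which condition(x) is true."""
    return sum(p for x, p in probs.items() if condition(x))


p_at_most_one = event_probability(defect_probs, lambda x: x <= 1)
p_at_least_one = event_probability(defect_probs, lambda x: x >= 1)

print(f"P(X <= 1) = {p_at_most_one:.2f}")
print(f"P(X >= 1) = {p_at_least_one:.2f}")
```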

The mean and standard deviation are often used to describe a distribution.

The mean measures the central location and the standard deviation the variability or dispersion.

In general, for a discrete random variable, X, with possible outcomes x1, x2, … and associated probabilities P(X=x1), P(X=x2), …, the mean is given by

$$\mu = \sum_i x_i \, P(X = x_i) = x_1 P(X = x_1) + x_2 P(X = x_2) + \cdots$$

Similarly, the standard deviation is given by

$$\sigma = \sqrt{\sum_i (x_i - \mu)^2 \, P(X = x_i)} = \sqrt{(x_1 - \mu)^2 P(X = x_1) + (x_2 - \mu)^2 P(X = x_2) + \cdots}$$

For the distribution given in the example, the mean is

$$\mu = 0(0.7) + 1(0.2) + 2(0.06) + 3(0.04) = 0.44$$

and the standard deviation is

$$\sigma = \sqrt{(0 - 0.44)^2 (0.7) + \cdots + (3 - 0.44)^2 (0.04)} = \sqrt{0.6064} = 0.7787$$

(to 4 decimal places).
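
The same arithmetic can be written as a short program. This is a sketch of the general formulas for the mean and standard deviation of a discrete distribution, applied to the defect example; the function names are illustrative.

```python
from math import sqrt

# Distribution of the number of defects, taken from the example table.
defect_probs = {0: 0.7, 1: 0.2, 2: 0.06, 3: 0.04}


def distribution_mean(probs):
    """Mean: each outcome weighted by its probability, then added up."""
    return sum(x * p for x, p in probs.items())


def distribution_sd(probs):
    """Standard deviation: square root of the probability-weighted squared deviations."""
    mu = distribution_mean(probs)
    variance = sum((x - mu) ** 2 * p for x, p in probs.items())
    return sqrt(variance)


mu = distribution_mean(defect_probs)
sigma = distribution_sd(defect_probs)

print(f"mean = {mu:.2f}")
print(f"standard deviation = {sigma:.4f}")
```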

The mean and standard deviation of a distribution are different from the mean and standard deviation of a sample.

With a distribution there is no data involved so we call these parameters of the distribution and use Greek letters to denote them.

The probability distribution is used to describe the population.

With a sample we use Roman letters for the mean and standard deviation of this sample (x̄ and s).

x̄ and s are estimates of the parameters of the distribution, μ and σ.

For larger sample sizes (relative to the population), the values x̄ and s will converge to μ and σ.
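
A simulation can illustrate this convergence. The sketch below assumes the defect distribution from the earlier example (0.7, 0.2, 0.06, 0.04) as the population and draws independent samples of increasing size; the seed and the sample sizes are arbitrary choices made for illustration.

```python
import random
from statistics import mean, stdev

# Population distribution (assumed): the defect example, with mean 0.44 and sd 0.7787.
outcomes = [0, 1, 2, 3]
probabilities = [0.7, 0.2, 0.06, 0.04]

random.seed(12049)  # arbitrary seed so that the run is repeatable

estimates = {}
for n in (10, 100, 1000, 100000):
    # Draw n independent observations from the distribution.
    sample = random.choices(outcomes, weights=probabilities, k=n)
    estimates[n] = (mean(sample), stdev(sample))
    print(f"n={n:>6}: sample mean = {estimates[n][0]:.4f}, sample sd = {estimates[n][1]:.4f}")
```

The small samples can land well away from 0.44 and 0.7787, while the largest sample should be close to both.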

The Binomial Distribution

The following three conditions must be met in order for a binomial distribution to be appropriate:

1. An experiment (a trial) is conducted that can result in only one of two possibilities. These are usually called success (S) and failure (F). The probability of success is given by P(S)=π and the probability of failure by P(F)=1−π.

2. You conduct n such trials. These trials must be independent (the result of one will not affect the result of another).

3. The random variable, X, is the number of successes among these n trials.

“Success” is whatever characteristic is being studied.

It could be either a positive (an item coming off a production line is defect free) or a negative (a road accident involves fatalities).

The possible values that X can take are 0, 1, 2,… n.

The probabilities of these n+1 outcomes are given by the formula

$$P(X = x) = \frac{n!}{x!\,(n-x)!}\, \pi^x (1-\pi)^{n-x}$$

where x! (read x factorial) is defined as x!=x×(x-1)×…×2×1, with 0!=1.

We refer to this as the binomial distribution.

The mean of the binomial distribution is given by

$$\mu = n\pi$$

and the standard deviation by

$$\sigma = \sqrt{n\pi(1-\pi)}$$

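Example

A production process is known to result in 5% of the items being defective. Thus the probability of success (here, a defective item) is π = 0.05 and 1−π = 0.95. If we sample 5 items, what is the probability of getting exactly 1 defective item?

Using the binomial formula

$$P(X = x) = \frac{n!}{x!\,(n-x)!}\, \pi^x (1-\pi)^{n-x}$$

with n = 5 and x = 1 gives

$$P(X = 1) = \frac{5!}{1!\,4!}\, (0.05)^1 (0.95)^4 = 0.2036$$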
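
The binomial formula translates directly into code. The sketch below is one possible way to write it, using the example's values (5 items, 5% defective); `binomial_pmf` and `p_success` are illustrative names.

```python
from math import factorial


def binomial_pmf(x, n, p_success):
    """P(X = x) = n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)."""
    coefficient = factorial(n) // (factorial(x) * factorial(n - x))
    return coefficient * p_success**x * (1 - p_success) ** (n - x)


n, p_success = 5, 0.05  # sample of 5 items, 5% of items defective

# Table of all n + 1 = 6 probabilities.
table = {x: binomial_pmf(x, n, p_success) for x in range(n + 1)}
for x, prob in table.items():
    print(f"P(X={x}) = {prob:.4f}")
```

The entry for x = 1 reproduces the 0.2036 found by hand, and the six probabilities add to one.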

What is the probability of getting exactly 2 defective items?

$$P(X = 2) = \frac{5!}{2!\,3!}\, (0.05)^2 (0.95)^3 = 0.0214$$

What is the probability that we get at least one defective item?

$$P(X \ge 1) = P(X = 1) + P(X = 2) + \cdots + P(X = 5)$$

We could calculate each of the individual probabilities and add them together, or we could use the fact that all the possible probabilities must add to one:

$$P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{5!}{0!\,5!}\, (0.05)^0 (0.95)^5 = 0.2262$$
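
The two routes to P(X ≥ 1) can be compared side by side. This sketch assumes the same example (n = 5, 5% defective); the variable names are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p_success):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p_success**x * (1 - p_success) ** (n - x)


n, p_success = 5, 0.05

# Long way: add P(X=1), P(X=2), ..., P(X=5).
long_way = sum(binomial_pmf(x, n, p_success) for x in range(1, n + 1))

# Short way: all probabilities add to one, so subtract P(X=0) from 1.
short_way = 1 - binomial_pmf(0, n, p_success)

print(f"Sum of P(X=1)..P(X=5) = {long_way:.4f}")
print(f"1 - P(X=0)            = {short_way:.4f}")
```

The complement needs one evaluation of the formula instead of five, which matters more as n grows.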

The mean and standard deviation of this distribution would be:

$$\mu = n\pi = 5 \times 0.05 = 0.25$$

$$\sigma = \sqrt{n\pi(1-\pi)} = \sqrt{5 \times 0.05 \times 0.95} = 0.4873$$
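
These shortcut formulas agree with the general definitions of the mean and standard deviation of a discrete distribution. A brief check, written as a sketch for the example's values (names are illustrative):

```python
from math import comb, sqrt

n, p_success = 5, 0.05

# Binomial probabilities for x = 0, 1, ..., n.
probs = {x: comb(n, x) * p_success**x * (1 - p_success) ** (n - x) for x in range(n + 1)}

# General definitions: probability-weighted sums over all outcomes.
mean_general = sum(x * p for x, p in probs.items())
sd_general = sqrt(sum((x - mean_general) ** 2 * p for x, p in probs.items()))

# Binomial shortcuts.
mean_shortcut = n * p_success
sd_shortcut = sqrt(n * p_success * (1 - p_success))

print(f"mean: general = {mean_general:.4f}, shortcut = {mean_shortcut:.4f}")
print(f"sd:   general = {sd_general:.4f}, shortcut = {sd_shortcut:.4f}")
```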

Using the binomial distribution formula works well for a small number of trials.

In our example we had only 5 trials and could easily produce a table of all 6 probabilities (for x=0,1,…5).

Imagine if n was 20 (or even more) – we would not want to evaluate the binomial formula so many times!

An alternative is to use tabulated values of the binomial distribution probabilities. (See the table on page 198 of Ledolter).

Given n and π (the probability of success), these tables give the cumulative probability P(X ≤ x).

Returning to our example we could use the tables to find the probability (as before!):

$$P(X \ge 1) = 1 - P(X \le 0) = 1 - 0.7738 = 0.2262$$

or other probabilities such as

$$P(X \le 1) = 0.9774$$

$$P(X > 2) = P(X \ge 3) = 1 - P(X \le 2) = 1.0000 - 0.9988 = 0.0012$$
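
The tabulated values are cumulative sums of the binomial probabilities, so they can be regenerated when a table is not to hand. The sketch below does this for the example (n = 5, 5% defective); it is not a copy of the table in Ledolter, and `binomial_cdf` is an illustrative name.

```python
from math import comb


def binomial_cdf(x, n, p_success):
    """P(X <= x): add the binomial probabilities for 0, 1, ..., x."""
    return sum(
        comb(n, k) * p_success**k * (1 - p_success) ** (n - k)
        for k in range(x + 1)
    )


n, p_success = 5, 0.05

# One row of a cumulative table.
for x in range(n + 1):
    print(f"P(X <= {x}) = {binomial_cdf(x, n, p_success):.4f}")

p_at_least_one = 1 - binomial_cdf(0, n, p_success)
p_more_than_two = 1 - binomial_cdf(2, n, p_success)
print(f"P(X >= 1) = {p_at_least_one:.4f}")
print(f"P(X > 2)  = {p_more_than_two:.4f}")
```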

The Poisson Distribution

We use the Poisson distribution when we have information about the average rate at which something is occurring.

Examples include the average number of calls to a switchboard per hour, or the average number of defective items coming off a production line each day.

We define our variable, X, to be the number of successes in a certain interval.

Success is whatever characteristic we are interested in examining, so it could be either a positive (average number of babies born per day) or a negative (the average number of deaths from cancer each month).

The possible outcomes for X are x=0,1,2,… (all the non-negative whole numbers).

Poisson probabilities are found by

$$P(X = x) = \frac{\lambda^x}{x!}\, e^{-\lambda}$$

where λ is the rate (the Poisson parameter) and e is a known constant (e = 2.718282 to 6 decimal places).

The mean and standard deviation of a Poisson distribution are λ and √λ.
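
The Poisson formula can be written as a short function. The sketch below also checks numerically that the mean and standard deviation come out as the rate and its square root; the rate of 2 is an arbitrary value chosen for illustration, and the sums are cut off at 60 because X has no upper limit.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, rate):
    """P(X = x) = rate^x * e^(-rate) / x!"""
    return rate**x * exp(-rate) / factorial(x)


rate = 2.0  # illustrative value, not taken from the lecture

# Terms beyond x = 59 are negligible for this rate, so the sums are truncated there.
xs = range(60)
total = sum(poisson_pmf(x, rate) for x in xs)
mean = sum(x * poisson_pmf(x, rate) for x in xs)
sd = sqrt(sum((x - mean) ** 2 * poisson_pmf(x, rate) for x in xs))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f} (rate = {rate})")
print(f"sd   = {sd:.6f} (square root of rate = {sqrt(rate):.6f})")
```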

Example

A production process is known to generate pocket calculator components, with an average of 5 defective components per hour (λ=5). The probability of getting eight or more defective components in an hour is:

$$P(X \ge 8) = 1 - P(X \le 7) = 1 - 0.867 = 0.133$$

(using the tables on pages 202 and 203 of Ledolter).

The probability of getting no defective components would be

$$P(X = 0) = \frac{5^0}{0!}\, e^{-5} = 0.0067$$

using the Poisson formula.
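
Both results for the pocket calculator example can be reproduced without tables. This is a minimal sketch with a rate of 5 defective components per hour; the function name is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, rate):
    """P(X = x) = rate^x * e^(-rate) / x!"""
    return rate**x * exp(-rate) / factorial(x)


rate = 5  # average of 5 defective components per hour

# No defective components in an hour.
p_zero = poisson_pmf(0, rate)

# Eight or more: use the complement of "seven or fewer".
p_at_most_7 = sum(poisson_pmf(x, rate) for x in range(8))
p_8_or_more = 1 - p_at_most_7

print(f"P(X = 0)  = {p_zero:.4f}")
print(f"P(X <= 7) = {p_at_most_7:.3f}")
print(f"P(X >= 8) = {p_8_or_more:.3f}")
```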

Many statistical software packages will calculate binomial and Poisson probabilities for you.

Tables exist in other textbooks, which cover more parameter values. (For example see the STAT 11048 text “Introduction to Business Statistics” by Weiers).

We have examined the binomial and Poisson distributions since we will need these to produce control charts and to discuss acceptance sampling later in the term.

After covering this week's material, ensure that you are able to identify which distribution is appropriate for the situation you are investigating, because the type of distribution you have will determine the type of control chart you will use.

Complete this week's recommended exercises from the study guide.