Lecture TQM

1

STAT 12049 Week 7

Discrete random variables and

discrete distributions

2

Objectives

On completion of this week's material you should be able to:

define a random variable, a discrete random variable and a probability distribution

describe conditions under which the binomial distribution should be used

describe conditions under which the Poisson distribution should be used

calculate probabilities using the binomial and Poisson distributions

3

Discrete random variables and discrete distributions

It is often argued that a significant problem with management is their failure to understand uncertainty and variation in a system or process.

This uncertainty needs to be incorporated into any analysis where decisions regarding people and processes are made.

Over the next two weeks we will discuss how to model such uncertainty.

4

Deming’s Red Bead Experiment

This experiment is a useful way of illustrating how causes are often associated with certain occurrences when no association is justified.

You are given a jar with a large number of beads in it: 90% are white and 10% are red.

This jar represents a production process where a white bead is an acceptable item and a red bead is a defect.

5

The results of the process are simulated by each participant taking repeated samples of 50 beads (using a paddle).

The first participant, Joe, samples 3 red beads out of his 50 – he gets praised for a job well done, since this is better than the 10% average.

Mary now has her turn and finds 2 red beads among the 50 – her defect rate is considerably smaller and so she is promoted to line supervisor.

6

John finds 8 defectives in the 50 and is scolded for his poor performance.

This illustrates how easy it is to blame workers for problems with the system.

Although this experiment seems a little contrived, this practice occurs frequently in the “real world”.

7

Management in this experiment has missed the fact that the defects are inherent in the process and the outcomes generated by the random draws from the beads – the workers have no influence!
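The point can be illustrated with a short simulation. This is only a sketch under stated assumptions: each of the 50 beads on a paddle is treated as red independently with probability 0.10 (reasonable when the jar is large), the seed is arbitrary, and the worker names beyond Joe, Mary and John are hypothetical.

```python
import random

def draw_paddle(rng, n_beads=50, p_red=0.10):
    """Count the red beads in one paddle of n_beads.

    Assumption: each bead is red independently with probability p_red,
    which approximates sampling from a very large jar.
    """
    return sum(rng.random() < p_red for _ in range(n_beads))

rng = random.Random(12049)  # arbitrary fixed seed so the run is repeatable
workers = ["Joe", "Mary", "John", "Ann", "Raj", "Kim"]  # last three are hypothetical

# Every worker uses exactly the same process; only chance differs.
red_counts = {name: draw_paddle(rng) for name in workers}

for name, reds in red_counts.items():
    print(f"{name}: {reds} red beads out of 50")
```

The counts differ from worker to worker even though nobody did anything differently, which is exactly the variation that management in the experiment mistakes for performance.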

There are, of course, instances when workers are to blame for defects (errors, sickness, inattentiveness etc.), but these cases usually account for only a small share of all defects.

8

Most variations and defects are a result of the system, so it is the system that needs improvement – the workers do not need reprimanding.

Deming believes that 94% of problems belong to the system and thus are the responsibility of management and only 6% are due to the operator.

9

We often need some way of measuring how likely a certain event is – this week we begin to do this.

For example in the Red Bead Experiment, it is highly unlikely that 15 or more defects will occur in one sample but not unlikely for 7 or more defects to occur.
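These two statements can be checked numerically. The sketch below assumes the number of red beads in a paddle of 50 follows the binomial distribution covered later in this lecture, with 50 trials and a 10% chance of a red bead on each; the function names are illustrative only.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_tail(k, n, p):
    """P(X >= k): add up the probabilities of k, k+1, ..., n successes."""
    return sum(binom_pmf(x, n, p) for x in range(k, n + 1))

# Red Bead Experiment: 50 beads per paddle, 10% red (assumed independent draws)
p_7_or_more = binom_tail(7, 50, 0.10)
p_15_or_more = binom_tail(15, 50, 0.10)

print(f"P(7 or more red beads)  = {p_7_or_more:.4f}")
print(f"P(15 or more red beads) = {p_15_or_more:.6f}")
```

Under these assumptions the first probability is sizeable while the second is tiny, which is what "not unlikely" and "highly unlikely" mean here.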

10

A random variable is a variable whose outcome is uncertain.

We can often predict the value that a random variable will take, before the activity that leads to this variable occurs.

For example we are uncertain about the weight of a box of cereal until we actually measure it, but past experience may allow us to predict the weight with a degree of certainty.

11

A random variable that can only take on distinct values (often whole numbers) is a discrete random variable.

We denote a random variable with a capital letter such as X.

The possible outcomes of this random variable are indicated by lower case letters.

12

If, for example, a random variable represents the number of flaws in a product, then its possible outcomes will be: x=0 (no flaws), x=1 (exactly 1 flaw), x=2 (exactly 2 flaws) etc.

If a random variable is the outcome of a quality inspection, then its possible values could be: x=0 (good) and x=1 (defective).

13

Probabilities are always non-negative numbers between 0 and 1.

The sum of the probabilities for all possible outcomes must always be 1.

Often these values come from past experience.

For example, if we know that in the past 98% of items have been acceptable and 2% defective, then the probabilities are: P(item is good)=P(X=0)=0.98 and P(item is defective)=P(X=1)=0.02.

14

Example

Let X be the outcome of a roll of a fair die.

Then we know that P(X=1)=P(X=2)=…=P(X=6)=1/6

15

Example

If we toss a fair coin three times and let X be the number of heads then we can create a table of probabilities for the four possible outcomes: 0, 1, 2 and 3.

x         0     1     2     3
P(X=x)    1/8   3/8   3/8   1/8

16

Let us see how these probabilities were calculated by first listing all the possibilities:

TTT (x=0)
TTH, THT, HTT (x=1)
THH, HTH, HHT (x=2)
HHH (x=3)

Since there are eight possibilities, occurring 1, 3, 3 and 1 times respectively, the probabilities are 1/8, 3/8, 3/8 and 1/8 respectively.

Note that these probabilities are all non-zero and they sum to one.
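The same counting argument can be carried out mechanically. The following sketch lists all eight equally likely sequences of three tosses and counts the heads in each; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of three tosses: HHH, HHT, ..., TTT (8 in total)
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# How many sequences give 0, 1, 2 or 3 heads
head_counts = Counter(seq.count("H") for seq in outcomes)

# Each sequence is equally likely for a fair coin, so P(X = x) = count / 8
distribution = {
    x: Fraction(count, len(outcomes)) for x, count in sorted(head_counts.items())
}

for x, prob in distribution.items():
    print(f"P(X={x}) = {prob}")

print("Sum of probabilities:", sum(distribution.values()))
```

Using `Fraction` keeps the probabilities exact, so the results can be compared directly with the table of eighths.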

17

Example

Let X be the number of defects found in the inspection of a microwave oven component. From previous years' inspections, we know that the following distribution describes these defects:

x         0     1     2     3
P(X=x)    0.7   0.2   0.06  0.04

18

We can calculate probabilities of related events such as the probability of at most one defect as:

\[ P(X \le 1) = P(X = 0 \text{ or } 1) = P(X = 0) + P(X = 1) = 0.7 + 0.2 = 0.9 \]

The values x=0, 1, … are basic outcomes as they cannot occur at the same time.

The probability of the related event (X ≤ 1) is the sum of the probabilities.

19

Another example is the probability of at least one defect:

\[ P(X \ge 1) = P(X = 1 \text{ or } 2 \text{ or } 3) = P(X = 1) + P(X = 2) + P(X = 3) = 0.2 + 0.06 + 0.04 = 0.3 \]
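As a small sketch of the same idea, the probability of any event can be found by adding the probabilities of the basic outcomes it contains. The dictionary holds the microwave component distribution from the example; the helper name `prob` is illustrative.

```python
# Distribution of the number of defects in a microwave oven component
defect_pmf = {0: 0.7, 1: 0.2, 2: 0.06, 3: 0.04}

def prob(event, pmf):
    """Sum P(X = x) over every basic outcome x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

p_at_most_one = prob(lambda x: x <= 1, defect_pmf)
p_at_least_one = prob(lambda x: x >= 1, defect_pmf)

# Rounded for display because floating-point sums carry tiny rounding error
print("P(X <= 1) =", round(p_at_most_one, 4))
print("P(X >= 1) =", round(p_at_least_one, 4))
```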

20

The mean and standard deviation are often used to describe a distribution.

The mean measures the central location and the standard deviation the variability or dispersion.

In general for a discrete random variable, X, with possible outcomes x1, x2,… and associated probabilities P(X=x1), P(X=x2),… the mean is given by

\[ \mu = \sum_i x_i \, P(X = x_i) = x_1 P(X = x_1) + x_2 P(X = x_2) + \cdots \]

21

Similarly, the standard deviation is given by

\[ \sigma = \sqrt{\sum_i (x_i - \mu)^2 \, P(X = x_i)} = \sqrt{(x_1 - \mu)^2 P(X = x_1) + (x_2 - \mu)^2 P(X = x_2) + \cdots} \]

22

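For the distribution given in the example, the mean is

\[ \mu = 0 \times 0.7 + 1 \times 0.2 + 2 \times 0.06 + 3 \times 0.04 = 0.44 \]

and the standard deviation is

\[ \sigma = \sqrt{(0 - 0.44)^2 \times 0.7 + \cdots + (3 - 0.44)^2 \times 0.04} = \sqrt{0.6064} = 0.7787 \]

(to 4 decimal places).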
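The arithmetic for this example can be reproduced in a few lines. This is a minimal sketch that applies the definitions of the mean and standard deviation of a discrete distribution directly; the variable names are illustrative.

```python
from math import sqrt

# Distribution of the number of defects from the microwave component example
defect_pmf = {0: 0.7, 1: 0.2, 2: 0.06, 3: 0.04}

# Mean: each outcome weighted by its probability
mean = sum(x * p for x, p in defect_pmf.items())

# Variance: squared distance from the mean, weighted by probability
variance = sum((x - mean) ** 2 * p for x, p in defect_pmf.items())

# Standard deviation: square root of the variance
sd = sqrt(variance)

print("mean     =", round(mean, 4))      # 0.44
print("variance =", round(variance, 4))  # 0.6064
print("sd       =", round(sd, 4))        # 0.7787
```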

23

The mean and standard deviation of a distribution are different from the mean and standard deviation of a sample.

With a distribution there is no data involved so we call these parameters of the distribution and use Greek letters to denote them (μ for the mean and σ for the standard deviation).

The probability distribution is used to describe the population.

With a sample we use Roman letters for the mean and standard deviation of this sample (x̄ and s).

24

x̄ and s are estimates of the parameters of the distribution, μ and σ.

For larger sample sizes (relative to the population), the values x̄ and s will converge to μ and σ.

25

The Binomial Distribution

The following three conditions must be met in order for a binomial distribution to be appropriate:

1. An experiment (a trial) is conducted that can result in only one of two possibilities. These are usually called success (S) and failure (F). The probability of success is given by P(S)=π and the probability of failure by P(F)=1-π.

26

2. You conduct n such trials. These trials must be independent (the result of one will not affect the result of another).

3. The random variable X is the number of successes among these n trials.

27

“Success” is whatever characteristic is being studied.

It could be either a positive (an item coming off a production line is defect free) or a negative (a road accident involves fatalities).

28

The possible values that X can take are 0, 1, 2,… n.

The probabilities of these n+1 outcomes are given by the formula

\[ P(X = x) = \frac{n!}{x!\,(n - x)!} \, \pi^x (1 - \pi)^{n - x} \]

where x! (read x factorial) is defined as x!=x×(x-1)×…×2×1, with 0!=1.

We refer to this as the binomial distribution.

29

The mean of the binomial distribution is given by

\[ \mu = n\pi \]

and the standard deviation by

\[ \sigma = \sqrt{n\pi(1 - \pi)} \]

30

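Example

A production process is known to result in 5% of the items being defective. Thus π = 0.05 and 1-π = 0.95. If we sample 5 items, what is the probability of getting exactly 1 defective item?

Using the binomial formula

\[ P(X = x) = \frac{n!}{x!\,(n - x)!} \, \pi^x (1 - \pi)^{n - x} \]

with n = 5 and x = 1:

\[ P(X = 1) = \frac{5!}{1!\,4!} \, (0.05)^1 (0.95)^4 = 0.2036 \]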

31

What is the probability of getting exactly 2 defective items?

\[ P(X = 2) = \frac{5!}{2!\,3!} \, (0.05)^2 (0.95)^3 = 0.0214 \]

What is the probability that we get at least one defective item?

\[ P(X \ge 1) = P(X = 1) + P(X = 2) + \cdots + P(X = 5) \]

32

We could calculate each of the individual probabilities and add them together or we could use the fact that all the possible probabilities must add to one:

\[ P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{5!}{0!\,5!} \, (0.05)^0 (0.95)^5 = 0.2262 \]
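The three probabilities in this example can be reproduced with a short sketch of the binomial formula. The function name `binom_pmf` is illustrative, and `math.comb(n, x)` computes n!/(x!(n-x)!).

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.05  # sample of 5 items, 5% defective

p_exactly_one = binom_pmf(1, n, p)
p_exactly_two = binom_pmf(2, n, p)

# Complement rule: "at least one" is everything except "none"
p_at_least_one = 1 - binom_pmf(0, n, p)

print("P(X = 1)  =", round(p_exactly_one, 4))   # 0.2036
print("P(X = 2)  =", round(p_exactly_two, 4))   # 0.0214
print("P(X >= 1) =", round(p_at_least_one, 4))  # 0.2262
```

The complement rule needs one evaluation of the formula instead of five, which is why it is preferred here.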

33

The mean and standard deviation of this distribution would be:

\[ \mu = n\pi = 5 \times 0.05 = 0.25 \]

\[ \sigma = \sqrt{n\pi(1 - \pi)} = \sqrt{5 \times 0.05 \times 0.95} = 0.4873 \]
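As a check, the sketch below computes the mean and standard deviation twice for n = 5 trials with success probability 0.05: once from the general definitions for a discrete distribution, and once from the binomial shortcut formulas. The variable names are illustrative.

```python
from math import comb, sqrt

n, p = 5, 0.05

# Full binomial distribution: P(X = x) for x = 0, 1, ..., n
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# General definitions for any discrete distribution
mean_from_pmf = sum(x * q for x, q in pmf.items())
sd_from_pmf = sqrt(sum((x - mean_from_pmf) ** 2 * q for x, q in pmf.items()))

# Binomial shortcut formulas
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

print("mean:", round(mean_from_pmf, 4), "and", round(mean_formula, 4))
print("sd:  ", round(sd_from_pmf, 4), "and", round(sd_formula, 4))
```

Both routes agree, so for a binomial distribution the shortcut formulas save building the whole table.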

34

Using the binomial distribution formula works well for a small number of trials.

In our example we had only 5 trials and could easily produce a table of all 6 probabilities (for x=0,1,…5).

Imagine if n was 20 (or even more) – we would not want to evaluate the binomial formula so many times!

An alternative is to use tabulated values of the binomial distribution probabilities. (See the table on page 198 of Ledolter).

35

Given n and π, these tables give the cumulative probability P(X ≤ x).

Returning to our example we could use the tables to find the probability (as before!):

\[ P(X \ge 1) = 1 - P(X \le 0) = 1 - 0.7738 = 0.2262 \]

or other probabilities such as

\[ P(X \le 1) = 0.9774 \]

\[ P(X > 2) = P(X \ge 3) = 1 - P(X \le 2) = 1.0000 - 0.9988 = 0.0012 \]
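A cumulative table of this kind can be generated rather than looked up. The sketch below builds P(X ≤ k) for n = 5 trials with success probability 0.05 and then reads off the same probabilities; it is an illustration with hypothetical function names, not a reproduction of the textbook table.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(k, n, p):
    """Cumulative probability P(X <= k): add P(X = 0) up to P(X = k)."""
    return sum(binom_pmf(x, n, p) for x in range(k + 1))

n, p = 5, 0.05

# Cumulative table rounded to 4 decimal places, as printed tables usually are
cumulative = {k: round(binom_cdf(k, n, p), 4) for k in range(n + 1)}
for k, value in cumulative.items():
    print(f"P(X <= {k}) = {value:.4f}")

p_at_least_one = 1 - binom_cdf(0, n, p)   # P(X >= 1)
p_more_than_two = 1 - binom_cdf(2, n, p)  # P(X > 2) = P(X >= 3)

print("P(X >= 1) =", round(p_at_least_one, 4))   # 0.2262
print("P(X > 2)  =", round(p_more_than_two, 4))  # 0.0012
```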

36

The Poisson Distribution

We use the Poisson distribution when we have information about the average rate at which something is occurring.

For example the average number of calls to a switchboard per hour or the average number of defective items coming off a production line each day.

We define our variable, X, to be the number of successes in a certain interval.

37

Success is whatever characteristic we are interested in examining, so it could be either a positive (average number of babies born per day) or a negative (the average number of deaths from cancer each month).

The possible outcomes for X are x=0,1,2,… (all the non-negative whole numbers).

38

Poisson probabilities are found by

\[ P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} \]

where λ is the rate (the Poisson parameter) and e is a known constant (e = 2.718282 to 6 decimal places).

The mean and standard deviation of a Poisson distribution are λ and √λ.

39

Example

A production process is known to generate pocket calculator components, with an average of 5 defective components per hour (λ=5). The probability of getting eight or more defective components is:

\[ P(X \ge 8) = 1 - P(X \le 7) = 1 - 0.867 = 0.133 \]

(using the tables on pages 202 and 203 of Ledolter).

40

The probability of getting no defective components would be

\[ P(X = 0) = \frac{5^0 e^{-5}}{0!} = 0.0067 \]

using the Poisson formula.
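Both Poisson results can be reproduced with a short sketch of the Poisson formula for a rate of 5 defective components per hour. The function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability P(X = x) for rate lam."""
    return lam**x * exp(-lam) / factorial(x)

def poisson_cdf(k, lam):
    """Cumulative probability P(X <= k)."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

lam = 5  # average of 5 defective components per hour

p_none = poisson_pmf(0, lam)

# X has no upper limit, so "eight or more" is found from the complement
p_eight_or_more = 1 - poisson_cdf(7, lam)

print("P(X = 0)  =", round(p_none, 4))           # 0.0067
print("P(X >= 8) =", round(p_eight_or_more, 3))  # 0.133
```

Because a Poisson variable can take any non-negative whole number, the complement is the only practical way to get an upper-tail probability.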

41

Many statistical software packages will calculate binomial and Poisson probabilities for you.

Tables exist in other textbooks, which cover more parameter values. (For example see the STAT 11048 text “Introduction to Business Statistics” by Weiers).

42

We have examined the Binomial and Poisson distributions since we will need these to produce control charts and to discuss acceptance sampling later in the term.

After covering this week's material, ensure that you are able to identify which distribution is appropriate for the situation you are investigating – the type of distribution you have will determine the type of control chart you will use.

43

Complete this week's recommended exercises from the study guide.
