Probability distributions: part 1
BSAD 30
Dave Novak
Source: Anderson et al., 2013 Quantitative Methods for Business 12th edition – some slides are directly from J. Loucks © 2013 Cengage Learning
Covered so far…
Chapter 1: Introduction
• What is modeling
• Types of models
• Basic problem formulation
• Review of basic linear (algebraic) problems
Chapter 2: Introduction to probability
• Review of probability concepts (complement, union, intersection, conditional probability, joint probability table, independence, mutually exclusive)
Overview
• Random variables
• Discrete probability distributions
  • Uniform probability distribution
  • Binomial probability distribution
  • Poisson probability distribution
• Link to examples of types of discrete distributions
  • http://www.epixanalytics.com/modelassist/AtRisk/Model_Assist.htm#Distributions/Discrete_distributions/Discrete_distributions.htm
Overview
We will briefly look at three “common” discrete probability distributions:
• Uniform
• Binomial
• Poisson
In business applications, we often find instances of random variables that follow a discrete uniform, binomial, or Poisson probability distribution
What is a random variable?
A random variable (RV) is a numerical description of the outcome of an experiment
Keep in mind that there is a difference between numeric variables and categorical variables
• Numeric: temperature, speed, age, monetized data, etc.
• Categorical: state of residence, gender, blood type, etc.
What is a random variable?
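Two types of random variables:
• Discrete: may assume either a finite number of values or an infinite sequence of values (0, 1, 2, . . .)
• Continuous: may assume any numerical value in an interval or collection of intervals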
Random variables
Question | Random variable x | Type
Family size | x = number of dependents in family reported on tax return | Discrete
Distance from home to store | x = distance in miles from home to the store site | Continuous
Own dog or cat | x = 1 if own no pet; = 2 if own dog(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s) | Discrete
Example
Discrete random variable (RV) with a finite number of possible values
There is a readily identifiable upper bound to the number of TVs sold on any given day
In this case, no more than 4 TVs sold
Let x = number of TVs sold at the store in one day,
where x can take on 5 values (0, 1, 2, 3, 4)
Example
Discrete random variable (RV) with an infinite number of possible values
There is no readily identifiable upper bound on the number of customers coming into the store on any given day
There cannot be an infinite # of customers, but we are not setting an upper bound (could be 75, 500, or 2,000)
Let x = number of customers arriving in one day,
where x can take on the values 0, 1, 2, . . .
Discrete probability distributions
The probability distribution for a random variable describes how probabilities are distributed (or allocated) over all possible values of the random variable
We can describe a discrete probability distribution with a table, graph, or equation
• In the TV sales example, we would want a mathematical and/or visual representation of the probability of selling 0, 1, 2, 3, or 4 TVs on any given day
Discrete probability distributions
The probability distribution is defined by a probability function, denoted by f(x), which provides the probability for each value of the random variable
• The function f(x) is a mathematical representation of the probability distribution
The following conditions are required:
$f(x) \ge 0$
$\sum f(x) = 1$
Discrete distribution: DiCarlo motors example
Using historical data on car sales, a tabular representation of sales is created
Units sold (x) | Number of days | f(x)
0 | 54 | .18
1 | 117 | .39
2 | 72 | .24
3 | 42 | .14
4 | 12 | .04
5 | 3 | .01
Total | 300 | 1.00
For example, f(0) = 54/300 = .18 and f(4) = 12/300 = .04
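As a rough sketch of how the f(x) column can be derived, the snippet below divides each day count by the total number of days and then checks the two required conditions. The variable names are illustrative and not part of the original slides.

```python
# Days on which 0, 1, ..., 5 cars were sold (DiCarlo Motors historical data).
days_by_units_sold = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total_days = sum(days_by_units_sold.values())

# The relative frequency of each outcome serves as the probability function f(x).
f = {x: days / total_days for x, days in days_by_units_sold.items()}

# Required conditions: f(x) >= 0 for every x, and the f(x) values sum to 1.
assert all(p >= 0 for p in f.values())
assert abs(sum(f.values()) - 1.0) < 1e-9

for x, p in f.items():
    print(f"f({x}) = {p:.2f}")
```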
Discrete distribution: DiCarlo motors example
Graphical representation
[Figure: bar chart of probability (vertical axis, .10 to .50) against values of random variable x (car sales), 0 to 5 (horizontal axis)]
Discrete distribution: DiCarlo motors example
The probability distribution provides the following information:
• There is a 0.18 probability that no cars will be sold during a day: f(0) = 18%
• The most probable sales volume is 1, with f(1) = 0.39 = 39%
• There is a 0.05 probability of either four or five cars being sold: f(4) + f(5) = 5%
Summary
Up to this point, we have not discussed the specific TYPE of discrete probability distribution (i.e., uniform, binomial, Poisson, etc.)
We have only discussed probability distributions in terms of being discrete as opposed to continuous
A review of basic statistical concepts is next
Expected value and variance
The expected value, or mean, of a random variable is a measure of its central location
• Mean, median, and mode are measures of central tendency because they identify a single value as “typical” or representative of all values in a probability distribution
$E(x) = \mu = \sum x f(x)$
Expected value and variance
The variance, $\sigma^2$, summarizes the variability in the values of a random variable
The standard deviation, $\sigma$, is defined as the positive square root of the variance
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
$\mathrm{StdDev}(x) = \sigma = \sqrt{\sigma^2}$
Expected value and variance
Both the StdDev and variance provide a measure of how much the values in the probability distribution differ from the mean
The higher the standard deviation, the more the observations differ from one another and from the mean
When a probability distribution has a high standard deviation, the mean alone may not be a good measure of central tendency
Expected value and variance
Scores = 1, 4, 3, 4, 2, 7, 18, 3, 7, 2, 4, 3
Mean = 4.83 (about 5)
Median = 3.5
Standard deviation = 4.53
The standard deviation indicates that the typical difference between each score and the mean is around 4.5 points. However, only one score (18) is 4.5 or more points away from the mean. The one extreme score (18) overly influences the mean. The median (3.5) is a better measure of central tendency in this case because extreme scores do not influence the median
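The summary statistics for these scores can be reproduced with Python's standard `statistics` module, as in the sketch below. It assumes the quoted standard deviation is the sample standard deviation (n - 1 divisor), which is what `statistics.stdev` computes. The comparison with the extreme score removed is an added illustration, not part of the original slide.

```python
import statistics

scores = [1, 4, 3, 4, 2, 7, 18, 3, 7, 2, 4, 3]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle of the sorted scores
stdev = statistics.stdev(scores)    # sample standard deviation (n - 1 divisor)

print(f"mean = {mean:.2f}, median = {median}, stdev = {stdev:.2f}")

# Dropping the single extreme score shows how strongly it pulls the mean,
# while the median barely moves.
trimmed = [s for s in scores if s != 18]
print(f"mean without 18 = {statistics.mean(trimmed):.2f}")
print(f"median without 18 = {statistics.median(trimmed)}")
```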
DiCarlo motors example
Calculate expected value of discrete RV: the expected number of cars sold in a day
x | f(x) | x f(x)
0 | .18 | .00
1 | .39 | .39
2 | .24 | .48
3 | .14 | .42
4 | .04 | .16
5 | .01 | .05
E(x) = 1.50
For example, 0 x 0.18 = 0 and 1 x 0.39 = 0.39
DiCarlo motors example
Calculate variance and StdDev
x | x - $\mu$ | (x - $\mu$)^2 | f(x) | (x - $\mu$)^2 f(x)
0 | -1.5 | 2.25 | .18 | .4050
1 | -0.5 | 0.25 | .39 | .0975
2 | 0.5 | 0.25 | .24 | .0600
3 | 1.5 | 2.25 | .14 | .3150
4 | 2.5 | 6.25 | .04 | .2500
5 | 3.5 | 12.25 | .01 | .1225
Variance of daily sales = $\sigma^2$ = 1.2500 cars squared
Standard deviation of daily sales = $\sigma = \sqrt{1.25}$ = 1.118 cars
DiCarlo motors example
Calculate variance and StdDev
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x) = 0.4050 + 0.0975 + 0.0600 + 0.3150 + 0.2500 + 0.1225$
$\mathrm{Var}(x) = \sigma^2 = 1.25$
Standard deviation of daily sales = $\sigma = \sqrt{1.25}$ = 1.118 cars
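The DiCarlo Motors calculations can be checked with a few lines of Python. This is a minimal sketch that applies the expected value and variance formulas directly to the f(x) values from the slides. The variable names are illustrative.

```python
import math

# DiCarlo Motors probability function: cars sold per day -> probability.
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

# Expected value: probability-weighted sum of the values of x.
mu = sum(x * p for x, p in f.items())

# Variance: probability-weighted sum of squared deviations from the mean.
variance = sum((x - mu) ** 2 * p for x, p in f.items())

# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(f"E(x) = {mu:.2f}, Var(x) = {variance:.4f}, StdDev(x) = {std_dev:.3f}")
```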
Expected value and variance
From a decision-making or analyst perspective, what are some of the practical implications of this discussion?
• If the data you are analyzing have a high variance, making decisions based on the mean, or even stressing the importance of the average, can be misleading
Expected value and variance
What should you do?
• Generate a visual representation of the data!
• You need to better characterize the data to see if they fit into any well-known families of probability distributions – this would be the first step in analysis
  • Knowing what the data “aren’t” is also useful
Expected value and variance
What should you do?
Knowing that data do not follow a particular distribution is important in terms of analysis
There are particular characteristics associated with different types of distributions that can guide you in your analysis
Discrete Distributions we will examine
1) Uniform
2) Binomial (built from Bernoulli trials)
3) Poisson
Discrete uniform probability distribution
The discrete uniform probability distribution is the simplest example of a discrete probability distribution given by a formula
Example: getting a 1, 2, 3, 4, 5, or 6 when rolling a single die – f(x) = 1/6
f(x) = 1/n
where:
n = the number of values the random variable may assume
The values of the random variable are equally likely
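The die example can be written out as a short sketch. It uses exact fractions so that the probabilities sum to exactly 1. The expected value at the end is an added illustration that reuses the E(x) formula from earlier in the deck.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]  # faces of a fair six-sided die
n = len(values)

# Every value gets the same probability, f(x) = 1/n.
f = {x: Fraction(1, n) for x in values}

# The probabilities must sum to 1.
assert sum(f.values()) == 1

# Expected value: probability-weighted sum of the faces.
expected_value = sum(x * p for x, p in f.items())
print(f"f(x) = {f[1]} for every face; E(x) = {float(expected_value)}")
```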
Binomial probability distribution
Describes the number of successes in a sequence of Bernoulli trials (a single trial follows the Bernoulli distribution)
Has four properties:
1) Experiment consists of a sequence of n identical trials
2) Only TWO outcomes are possible for each trial (success/failure, good/bad, on/off, yes/no, etc.)
3) The probability of success stays the same for all trials
4) All trials are independent
Binomial probability distribution
We are interested in the number of successes, or positive outcomes, occurring in the n trials
• x denotes the number of successes, or positive outcomes, occurring in the n trials
$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$
where:
f(x) = the probability of x successes in n trials
n = the number of trials
p = the probability of success on any one trial
Binomial probability distribution
$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$
• $\frac{n!}{x!(n-x)!}$: number of experimental outcomes providing exactly x successes in n trials
• $p^x (1-p)^{n-x}$: probability of a particular sequence of trial outcomes with x successes in n trials
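The two factors of the binomial probability function map directly onto code. The sketch below keeps them as separate variables to mirror the annotation. The function name `binomial_pmf` and the demonstration values (n = 4, p = 0.5) are illustrative assumptions, not taken from the slides.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    # Number of outcomes with exactly x successes: n! / (x! (n - x)!)
    n_outcomes = comb(n, x)
    # Probability of one particular sequence with x successes
    sequence_prob = p ** x * (1 - p) ** (n - x)
    return n_outcomes * sequence_prob


# Illustrative values only (not from the slides): 4 trials with p = 0.5.
n, p = 4, 0.5
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(distribution)
```

Summing `distribution` over x = 0, ..., n should give 1, which is a quick sanity check on any probability function.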
Binomial probability distribution
Assume the probability that any customer who comes into a store actually makes a purchase is 0.3 (30% chance of success)
What is the probability that 2 of the next 3 customers who enter the store make a purchase?
Identify: n, x, p
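One way to work this example, reading the question as "exactly 2 of 3" so that n = 3, x = 2, and p = 0.3, is to substitute into the binomial probability function:

```latex
\[
f(2) = \frac{3!}{2!\,(3-2)!}\,(0.3)^2\,(1-0.3)^{3-2}
     = 3 \times 0.09 \times 0.7
     = 0.189
\]
```

Under that reading, there is about an 18.9% chance that exactly 2 of the next 3 customers make a purchase.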
Binomial probability distribution (decision tree)
P = purchases (.3), DNP = does not purchase (.7)
1st customer | 2nd customer | 3rd customer | x | Prob.
P | P | P | 3 | .027
P | P | DNP | 2 | .063
P | DNP | P | 2 | .063
P | DNP | DNP | 1 | .147
DNP | P | P | 2 | .063
DNP | P | DNP | 1 | .147
DNP | DNP | P | 1 | .147
DNP | DNP | DNP | 0 | .343
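The decision tree can also be enumerated programmatically. The sketch below walks all eight purchase/no-purchase paths for three customers, multiplies the branch probabilities along each path, and adds the path probabilities that share the same number of purchases x. The variable names are illustrative.

```python
from itertools import product

p_purchase = 0.3
n_customers = 3

# Probability of each possible number of purchases, built path by path.
prob_by_x = {x: 0.0 for x in range(n_customers + 1)}

# Each path is a tuple such as (True, False, True), where True = purchase.
for path in product([True, False], repeat=n_customers):
    path_prob = 1.0
    for purchased in path:
        path_prob *= p_purchase if purchased else 1 - p_purchase
    # sum(path) counts the purchases (True values) on this path.
    prob_by_x[sum(path)] += path_prob

for x, prob in prob_by_x.items():
    print(f"P(x = {x}) = {prob:.3f}")
```

Three different paths lead to x = 2, each with probability .063, which is why the binomial formula multiplies the sequence probability by the number of qualifying outcomes.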
Binomial probability distribution
If a six-sided die is rolled three times, what is the probability that the number 5 comes up twice?
Identify: n, x, p
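A possible way to check this example, taking n = 3, x = 2, and p = 1/6 for a fair die, is sketched below. It also evaluates the formula with p rounded to .17, since a hand calculation with rounded branch probabilities gives a slightly different answer. The helper `binomial_pmf` is an illustrative name.

```python
from math import comb

n, x = 3, 2        # three rolls, exactly two fives

p_exact = 1 / 6    # probability of a five on one roll of a fair die
p_rounded = 0.17   # the same probability rounded to two decimals


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


exact = binomial_pmf(x, n, p_exact)
rounded = binomial_pmf(x, n, p_rounded)
print(f"with p = 1/6: {exact:.4f}; with p = .17: {rounded:.4f}")
```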
Binomial probability distribution
S = success, a “5” (.17); F = failure, a 1, 2, 3, 4, or 6 (.83)
1st roll | 2nd roll | 3rd roll | x | Prob.
S | S | S | 3 | .005
S | S | F | 2 | .024
S | F | S | 2 | .024
S | F | F | 1 | .117
F | S | S | 2 | .024
F | S | F | 1 | .117
F | F | S | 1 | .117
F | F | F | 0 | .572
Binomial probability distribution
If I roll a die 10 times, what is the probability that the number 5 comes up exactly four times?
Identify: n, x, p
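A worked substitution for this question, assuming a fair die so that n = 10, x = 4, and p = 1/6:

```latex
\[
f(4) = \frac{10!}{4!\,6!}\left(\frac{1}{6}\right)^{4}\left(\frac{5}{6}\right)^{6}
     = 210 \times \frac{1}{1296} \times \frac{15625}{46656}
     \approx 0.0543
\]
```

A decision tree for this case would have 2^10 = 1024 paths, which is why the formula is preferred over enumeration as n grows.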
Binomial probability distribution
Expected value: $E(x) = \mu = np$
Variance: $\mathrm{Var}(x) = \sigma^2 = np(1 - p)$
Standard deviation: $\sigma = \sqrt{np(1 - p)}$
Binomial probability distribution
In the clothing store example, calculate:
• Expected value
• Variance
• Standard deviation
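Assuming "the clothing store example" refers to the three-customer purchase example (n = 3, p = 0.3), the binomial formulas for expected value, variance, and standard deviation give:

```latex
\[
E(x) = \mu = np = 3(0.3) = 0.9
\]
\[
\mathrm{Var}(x) = \sigma^2 = np(1-p) = 3(0.3)(0.7) = 0.63
\]
\[
\sigma = \sqrt{0.63} \approx 0.794
\]
```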
Poisson probability distribution
A Poisson distributed random variable is often useful in estimating the number of occurrences over a specified interval of time or space, where occurrences can be counted in whole numbers
• Very useful in RISK analysis
It is a discrete random variable that may assume an infinite sequence of values (x = 0, 1, 2, . . .)
Poisson probability distribution
How is an RV that follows a Poisson distribution different from an RV that follows a binomial distribution?
• It is possible to count how many events have occurred, but meaningless to ask how many events have NOT occurred
• In the binomial situation, we know the probabilities of two mutually exclusive outcomes (p, q); in the Poisson situation, we have no q (the distribution has only one parameter, the average frequency at which an event occurs)
Poisson probability distribution
Examples
Number of customers arriving at a supermarket checkout between 5 PM and 6 PM
Number of text messages you receive over the course of a week
Number of car accidents over the course of a year
Poisson probability distribution
Two properties of Poisson distributions
1) The probability of occurrence is the same over any two time intervals of equal length
2) The occurrence or nonoccurrence in any time interval is independent of occurrence or nonoccurrence in any other time interval
Poisson probability distribution
$f(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
where:
f(x) = probability of x occurrences in an interval
$\lambda$ = mean number of occurrences in an interval
e = 2.71828
For more info: https://en.wikipedia.org/wiki/E_(mathematical_constant)
Drive-up teller window example
Suppose that we are interested in the number of cars arriving at the drive-up teller window of a bank during a 15-minute period on weekday mornings
We assume that the probability of a car arriving is the same for any two time periods of equal length (i.e., the probability of a car arriving in the first minute is exactly the same as the probability of a car arriving in the last minute), and the arrival or non-arrival of a car in any time period is independent of the arrival or non-arrival in any other time period
An analysis of historical data shows that the average number of cars arriving during a 15-minute interval of time is 10, so the Poisson probability function with $\lambda$ = 10 applies
Drive-up teller window example
$\lambda$ = 10 arrivals / 15 minutes, x = 5
We want to know the probability that exactly 5 cars will arrive over the 15-minute time interval
Identify: x and $\lambda$
x = 5
$\lambda$ = 10: we are given that there are 10 arrivals every 15 minutes, so the average # of arrivals over the time period is 10
Drive-up teller window example
$\lambda$ = 10 arrivals / 15 minutes, x = 5
$f(5) = \frac{10^5 (2.71828)^{-10}}{5!} = .0378$
So, there is a 3.78% chance that exactly 5 cars will arrive over the 15-minute time period
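The teller-window calculation can be reproduced with a short function. This is a minimal sketch using only the standard library. The name `poisson_pmf` and the parameter name `lam` are illustrative choices (`lambda` is a reserved word in Python).

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the interval mean is lam."""
    return lam ** x * exp(-lam) / factorial(x)


lam = 10  # average number of arrivals per 15-minute interval
prob_5 = poisson_pmf(5, lam)
print(f"f(5) = {prob_5:.4f}")

# Although x has no upper bound, the probabilities shrink quickly, so
# summing f(x) over the first 100 values of x already comes very close to 1.
total = sum(poisson_pmf(x, lam) for x in range(100))
print(f"sum of f(x) for x = 0..99: {total:.6f}")
```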
Highway defect example
• Suppose that we are concerned with the occurrence of major defects in a section of highway one month after that section was resurfaced
• We assume that the probability of a defect is the same for any two highway intervals of equal length (i.e. the probability of a defect between mile markers 1 and 2 is the same as the probability of a defect between mile markers 4 and 5, etc.) and that the occurrence of a defect in any one mile interval is independent of the occurrence or nonoccurrence of a defect in any other interval
• Thus, the Poisson probability distribution applies
Highway defect example
Find the probability that no major defects occur in a specific 3-mile stretch of highway assuming that major defects occur at the average rate of two defects per mile
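A sketch of the calculation, assuming the mean number of defects scales with the length of the stretch, so that the 3-mile interval has $\lambda$ = (2 defects per mile)(3 miles) = 6:

```latex
\[
f(0) = \frac{\lambda^{0} e^{-\lambda}}{0!}
     = \frac{6^{0}\, e^{-6}}{0!}
     = e^{-6}
     \approx 0.0025
\]
```

Under that assumption, there is roughly a 0.25% chance of finding no major defects in the 3-mile stretch.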
Poisson probability distribution
Expected value: $E(x) = \mu = \lambda$ = the rate or frequency of an event
Variance: $\mathrm{Var}(x) = \sigma^2 = \lambda$
Standard deviation: $\sigma = \sqrt{\lambda}$
Highway defect example
In the highway defect example, calculate:
• Expected value
• Variance
• Standard deviation
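Assuming the question refers to the 3-mile stretch at two defects per mile, so that $\lambda$ = 6, the Poisson expected value, variance, and standard deviation work out as follows:

```latex
\[
E(x) = \mu = \lambda = 6 \text{ defects}
\]
\[
\mathrm{Var}(x) = \sigma^2 = \lambda = 6
\]
\[
\sigma = \sqrt{6} \approx 2.449 \text{ defects}
\]
```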
Summary
• Discussion of random variables
  • Discrete
  • Continuous
• Examples of discrete probability distributions
  • Uniform
  • Binomial
  • Poisson