Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Dec 17, 2015

Rosemary Melton
Page 1: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Probability & Statistical Inference Lecture 3

MSc in Computing (Data Analytics)

Page 2: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Lecture Outline

• A quick recap
• Solutions to last week's question
• Continuous distributions

Page 3: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

A Quick Recap

Page 4: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Probability & Statistics

We want to make decisions based on evidence from a sample, i.e. extrapolate from sample evidence to a general population.

To make such decisions we need to be able to quantify our (un)certainty about how good or bad our sample information is.

[Diagram: Population, Representative Sample and Sample Statistic, with arrows labelled "Describe" and "Make Inference"]

Page 5: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Some Definitions

An experiment that can result in different outcomes, even though it is repeated in the same manner every time, is called a random experiment.

The set of all possible outcomes of a random experiment is called the sample space of the experiment and is denoted by S.

A sample space is discrete if it consists of a finite or countably infinite set of outcomes.

A sample space is continuous if it contains an interval of real numbers.

An event is a subset of the sample space of a random experiment.

Page 7: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Probability

Whenever a sample space consists of n possible outcomes that are equally likely, the probability of each outcome is 1/n.

For a discrete sample space, the probability of an event E, denoted by P(E), equals the sum of the probabilities of the outcomes in E.

Some rules for probabilities: for a given sample space containing n simple events E1, E2, E3, ..., En:

1. All simple event probabilities must lie between 0 and 1: 0 <= P(Ei) <= 1 for i = 1, 2, ..., n

2. The sum of the probabilities of all the simple events within a sample space must be equal to 1:

\[ \sum_{i=1}^{n} P(E_i) = 1 \]

Page 8: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Discrete Random Variable

A Random Variable (RV) is obtained by assigning a numerical value to each outcome of a particular experiment.

Probability Distribution: A table or formula that specifies the probability of each possible value for the Discrete Random Variable (DRV)

DRV: a RV that takes a whole number value only

Page 9: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Summary Continued…

For a discrete RV we often have a mathematical formula which is used to calculate probabilities, i.e. P(x) = some formula.

This formula is called the Probability Mass Function (PMF).

Given the PMF you can calculate the mean and variance by:

\[ \mu = \sum_x x\,P(x) \qquad \sigma^2 = \sum_x (x-\mu)^2 P(x) = \sum_x x^2 P(x) - \mu^2 \]

where the summation is over all possible values of x.
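As a rough illustration (not part of the original slides), the mean and variance of a discrete RV can be computed directly from its PMF. The sketch below assumes a fair six-sided die as the example; the function name `pmf_mean_variance` is hypothetical.

```python
from fractions import Fraction


def pmf_mean_variance(pmf):
    """Return (mean, variance) for a PMF given as {value: probability}."""
    total = sum(pmf.values())
    if abs(total - 1) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # mean = sum of x * P(x) over all possible values x
    mean = sum(x * p for x, p in pmf.items())
    # variance = sum of x^2 * P(x), minus the squared mean
    variance = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2
    return mean, variance


# Assumed example: a fair die, each face has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance = pmf_mean_variance(die)
print(mean, variance)  # 7/2 35/12
```

Exact fractions are used here only to avoid floating-point rounding; ordinary floats work with the same function.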

Page 10: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Binomial Distribution – General Formula

This all leads to a very general rule for calculating binomial probabilities.

In general, for Binomial(n, p):

n = no. of trials
p = probability of a success
x = value of the RV (no. of successes)

\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \]

where P(X = x) is read as the probability of seeing x successes.
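The binomial formula translates almost line for line into code. This is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the example values (10 trials, p = 0.5) are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) is the number of ways of choosing x successes from n trials
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Assumed example: exactly 3 successes in 10 trials with p = 0.5
print(binomial_pmf(3, 10, 0.5))  # 0.1171875

# The probabilities over all possible x should add up to 1
print(sum(binomial_pmf(x, 10, 0.5) for x in range(11)))
```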

Page 11: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Binomial Distribution

If X is a binomial random variable with the parameters p and n then

\[ \mu = np \qquad \sigma^2 = np(1-p) \qquad \sigma = \sqrt{np(1-p)} \]

Page 12: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Poisson Probability Distribution

Probability distribution for the Poisson:

\[ P(X = x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!} \]

where λ is the known mean, x is the value of the RV with possible values 0, 1, 2, 3, …, and e is an irrational constant (like π) with value 2.71828…

The standard deviation, σ, is given by the simple relationship:

\[ \sigma = \sqrt{\lambda} \]
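A short sketch of the Poisson formula in code, with a numerical check that the standard deviation matches the square root of the mean. The mean of 3 and the cut-off at x = 59 are assumptions for the example (the tail beyond that is negligible for this mean); `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, lam):
    """Probability of observing exactly x events when the mean is lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 3.0  # assumed example mean
values = range(60)  # truncate the infinite sum where the tail is negligible
probs = [poisson_pmf(x, lam) for x in values]

total = sum(probs)
mean = sum(x * p for x, p in zip(values, probs))
variance = sum(x ** 2 * p for x, p in zip(values, probs)) - mean ** 2

print(total)                     # approximately 1
print(mean)                      # approximately 3
print(sqrt(variance), sqrt(lam)) # both approximately 1.732
```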

Page 13: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Question Time

Page 14: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Questions

4. A factory has two assembly lines, each of which is shut down (S), at partial capacity (P), or at full capacity (F). The following table gives the sample space, where (S,P) denotes that the first assembly line is shut down and the second one is operating at partial capacity. What is the probability that:

a) Both assembly lines are shut down?
b) Neither assembly line is shut down?
c) At least one assembly line is at full capacity?
d) Exactly one assembly line is at full capacity?

Event A   P(A)    Event A   P(A)    Event A   P(A)
(S,S)     0.02    (S,P)     0.06    (S,F)     0.05
(P,S)     0.07    (P,P)     0.14    (P,F)     0.20
(F,S)     0.06    (F,P)     0.21    (F,F)     0.19

Page 15: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Answer A

4. a) The probability that both assembly lines are shut down is P(S,S) = 0.02.

Page 16: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Answer B

4. b) The probability that neither assembly line is shut down is

P(P,P) + P(F,P) + P(P,F) + P(F,F) = 0.14 + 0.21 + 0.2 + 0.19 = 0.74

Page 17: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Answer C

4. c) The probability that at least one assembly line is at full capacity is

P(F,S) + P(F,P) + P(S,F) + P(P,F) + P(F,F) = 0.06 + 0.21 + 0.05 + 0.2 + 0.19 = 0.71

Page 18: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Answer D

4. d) The probability that exactly one assembly line is at full capacity is

P(F,S) + P(F,P) + P(S,F) + P(P,F) = 0.06 + 0.21 + 0.05 + 0.2 = 0.52
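The four answers can be cross-checked by storing the sample space as a lookup table and summing the probabilities of the outcomes in each event. This is only a sketch: the dictionary layout and the helper name `prob` are assumptions, and results are rounded to two decimal places to hide floating-point noise.

```python
# Sample space from the table: (line 1 state, line 2 state) -> probability
sample_space = {
    ("S", "S"): 0.02, ("S", "P"): 0.06, ("S", "F"): 0.05,
    ("P", "S"): 0.07, ("P", "P"): 0.14, ("P", "F"): 0.20,
    ("F", "S"): 0.06, ("F", "P"): 0.21, ("F", "F"): 0.19,
}


def prob(event):
    """Sum the probabilities of all outcomes for which event(outcome) is True."""
    return round(sum(p for outcome, p in sample_space.items() if event(outcome)), 2)


answers = {
    "a": prob(lambda o: o == ("S", "S")),    # both lines shut down
    "b": prob(lambda o: "S" not in o),       # neither line shut down
    "c": prob(lambda o: "F" in o),           # at least one line at full capacity
    "d": prob(lambda o: o.count("F") == 1),  # exactly one line at full capacity
}
print(answers)  # {'a': 0.02, 'b': 0.74, 'c': 0.71, 'd': 0.52}
```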

Page 19: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Exercise: There is more than one way to skin a cat!

1. If two fair dice are thrown, what is the probability that at least one score is a prime number (2, 3, 5)?
2. What is the complement of the event?
3. What is its probability?

There are three ways (at least) that we can approach this problem.

Page 22: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 1: Brute Force Approach

Enumerate the sample space and select those outcomes that satisfy the desired conditions:

• 36 possible combinations of 2 dice
• 27 combinations include a prime number
• Probability of at least one prime is 27/36 = 0.75
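The brute force enumeration is easy to reproduce in code. The sketch below lists all ordered pairs of scores and counts those with at least one prime; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

PRIMES = {2, 3, 5}

# Every ordered pair (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))

# Keep the outcomes in which at least one score is prime
favourable = [o for o in outcomes if any(score in PRIMES for score in o)]

probability = Fraction(len(favourable), len(outcomes))
print(len(outcomes), len(favourable), probability)  # 36 27 3/4
```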

Page 23: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 1: The Complement

What is the complement of the event? That neither score is a prime number (2, 3, 5) when two fair dice are thrown.

What is its probability? Let the event be E and its probability be P(E). Then the complement of E is E' and the probability of E', P(E'), is equal to 1 – P(E).

In our case P(E) = 0.75 => P(E') = 1 – 0.75 = 0.25

Page 29: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 2: Find the Probability of the Complement

The brute force approach is fine for two dice, but cumbersome as the number of dice increases, i.e. 3 dice, 4 dice ... 12 dice ... 1,247 dice!

Our question can be slightly rearranged to reveal a possible solution:

• If two fair dice are thrown, what is the probability that at least one score is a prime number (2, 3, 5)?
• What is the probability of one or more primes from two dice throws?
• What is the probability of one or more of outcome O from X trials?

If questions are of this form we can work out the answer by working out the complement first.

Page 30: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 2: Find the Probability of the Complement

What is the probability that neither score is a prime number (2, 3, 5) when two fair dice are thrown? This is an easier probability to calculate, as we can consider throwing each die as an independent event and multiply the probabilities that neither results in a prime.

It is the "one or more" in the original question that makes things tricky: it covers several distinct outcomes (a prime on the first die only, on the second die only, or on both), so it cannot be found with a single multiplication of per-die probabilities.

Page 31: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 2: Find the Probability of the Complement

To start, let's work out: if we throw a single die, what is the probability of not getting a prime number?

• Sample space: {1, 2, 3, 4, 5, 6}
• Primes: {2, 3, 5}
• Non-primes: {1, 4, 6}
• So, the probability is 3/6

Page 33: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 2: Find the Probability of the Complement

If the probability of getting no prime if we throw one die is 3/6, what is the probability of getting no primes if we throw two dice in a row?

• Dice rolls are independent events
• Remember our intersection rule for independent events:

\[ P(A \cap B) = P(A)\,P(B) \]

So, the probability of getting no primes if we throw two dice in a row is:

\[ P(E) = \frac{3}{6} \times \frac{3}{6} = \frac{9}{36} = \frac{1}{4} \]

Page 34: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 2: Find the Probability of the Complement

Our event, E, was that neither score is a prime number (2, 3, 5) when two fair dice are thrown. So the complement of this event, E', is that at least one score is a prime number (2, 3, 5) when two fair dice are thrown.

We know that given the probability of event E, P(E), we can work out the probability of the complement of this event, P(E'), as 1 – P(E).

So for our dice example:

\[ P(E) = \frac{1}{4} \qquad P(E') = 1 - \frac{1}{4} = \frac{3}{4} \]

Page 35: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 2: Find the Probability of the Complement

The great thing is that this works for any number of dice. The probability, P(E), of getting no primes if we throw n dice in a row is:

\[ P(E) = \left(\frac{3}{6}\right)^{n} \]

So, for three dice the probability of getting no primes is:

\[ \left(\frac{3}{6}\right)^{3} = \frac{3}{6} \times \frac{3}{6} \times \frac{3}{6} = \frac{27}{216} = \frac{1}{8} \]

This means that the probability of getting at least one prime from 3 dice rolls is 1 – 1/8 = 7/8.
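The complement approach scales to any number of dice with one line of arithmetic. A minimal sketch, with the hypothetical function name `prob_at_least_one_prime`:

```python
from fractions import Fraction


def prob_at_least_one_prime(n_dice):
    """P(at least one prime in n_dice throws) via the complement rule."""
    p_no_prime_single = Fraction(3, 6)           # non-primes on one die: {1, 4, 6}
    p_no_primes = p_no_prime_single ** n_dice    # independent throws multiply
    return 1 - p_no_primes


for n in (2, 3, 12):
    print(n, prob_at_least_one_prime(n))
# 2 3/4
# 3 7/8
# 12 4095/4096
```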

Page 36: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 3: Use the Binomial Distribution

Problems that can be stated as "what is the probability of seeing x successes in n independent binary trials?" can be solved using the Binomial distribution.

For example: what is the probability of seeing exactly 1 prime in 2 dice throws?

Page 37: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 3: Use the Binomial Distribution

The Binomial probability, P(X = x) (read as the probability of seeing x successes), is given by:

\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \]

where n is the number of trials, p is the probability of a success and $\binom{n}{x}$, known as a combination, is the number of ways of getting x successes from n trials:

\[ \binom{n}{x} = \frac{n!}{(n-x)!\,x!} \]

Page 38: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

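Solution 3: Use the Binomial Distribution

So, what is the probability of seeing 1 prime in 2 dice throws? Here n = 2, p = 1/2, x = 1.

\[ \binom{n}{x} = \frac{n!}{(n-x)!\,x!} \quad\Rightarrow\quad \binom{2}{1} = \frac{2!}{(2-1)!\,1!} = 2 \]

\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \]

\[ P(X = 1) = \binom{2}{1} \left(\frac{1}{2}\right)^{1} \left(1-\frac{1}{2}\right)^{2-1} = 2 \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{2} \]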

Page 39: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

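Solution 3: Use the Binomial Distribution

Exercise: What is the probability of seeing 2 primes in 2 dice throws?

\[ \binom{n}{x} = \frac{n!}{(n-x)!\,x!} \quad\Rightarrow\quad \binom{2}{2} = \frac{2!}{(2-2)!\,2!} = 1 \]

\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \]

\[ P(X = 2) = \binom{2}{2} \left(\frac{1}{2}\right)^{2} \left(1-\frac{1}{2}\right)^{2-2} = 1 \times \frac{1}{4} \times 1 = \frac{1}{4} \]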

Page 41: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Solution 3: Use the Binomial Distribution

So, what is the probability of seeing one or more primes in 2 dice throws?

P(1 ≤ X ≤ 2) = P(X = 1) + P(X = 2) = ½ + ¼ = ¾

More generally, we can say that the probability of seeing one or more primes in n dice throws is:

\[ P(1 \le X \le n) = \sum_{x=1}^{n} P(X = x) \]
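As a sanity check, the binomial sum and the complement rule should give the same answer for any number of dice. The sketch below compares the two using exact fractions; the helper names are hypothetical, and p = 1/2 is the probability of a prime on one die.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def prob_one_or_more(n, p):
    """P(1 <= X <= n): add up the binomial probabilities for x = 1..n."""
    return sum(binomial_pmf(x, n, p) for x in range(1, n + 1))


p = Fraction(1, 2)  # probability of a prime on a single die
for n in (2, 3, 5):
    via_sum = prob_one_or_more(n, p)
    via_complement = 1 - (1 - p) ** n
    print(n, via_sum, via_complement)
# 2 3/4 3/4
# 3 7/8 7/8
# 5 31/32 31/32
```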

Page 42: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Continuous Probability Distributions

Page 43: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Continuous Probability Distributions

Experiments can lead to continuous responses, i.e. values that do not have to be whole numbers. For example: height could be 1.54 meters.

In such cases the sample space is best viewed as a histogram of responses.

The shape of the histogram of such responses tells us what continuous distribution is appropriate; there are many.

[Figure: two example density histograms, "Lifetime of Component" and "Waiting Time", each with Density on the vertical axis]

Page 44: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Normal Distribution (AKA Gaussian)

• The histogram below is symmetric & 'bell shaped'
• This is characteristic of the Normal Distribution
• We can model the shape of such a distribution (i.e. the histogram) by a curve

Page 45: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Normal Distribution

The curve may not fit the histogram 'perfectly', but should be very close.

The Normal Distribution has two parameters: µ = mean, σ = standard deviation.

The mathematical formula that gives a bell shaped symmetric curve is:

\[ f(x) = \text{height of curve at } x = \frac{1}{\sqrt{2\pi\sigma^{2}}}\; e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}} \]
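The density formula can be evaluated directly to see the bell shape. This is a small sketch with a hypothetical function name `normal_pdf`; the default parameters (µ = 0, σ = 1) are an assumption for the example.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)


# The curve is highest at the mean and symmetric on either side of it
for x in (-2, -1, 0, 1, 2):
    print(x, round(normal_pdf(x), 4))
```

Note that these are heights of the curve, not probabilities; probabilities come from areas under it.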

Page 46: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Normal Distribution

Why not P(x) as before?

=> because the response is continuous

What is the probability that a person sampled at random is 6 foot?

Equivalent question: what proportion of people are 6 foot?

=> we really mean what proportion are 'around 6 foot' (as good as the measurement device allows), so not really one value, but many values close together.

Page 47: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Example: What proportion of graduates earn €35,000?

Would we exclude €35,000.01 or €34,999.99?

Round to the nearest €, €10, €100, €1000?

Continuous measure => more useful to get proportion from €35,000 - €40,000

Some Mathematical Jargon:

The formula for the normal distribution is formally called the normal probability density function (pdf)

Page 48: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

The Shaded portion of the Histogram is the Proportion of interest

Can visualise this using the histogram of salaries.

Page 49: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Since the histogram of salaries is symmetric and bell shaped, we model this in statistics with a Normal distribution curve.

Proportion = the proportion of the area of the curve that is shaded

So proportions = proportional area under the curve = a probability of interest

Need:
• To know µ, σ
• To be able to find the area under the curve

Page 50: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Area under a curve is found using integration in mathematics.

In this case we would need a technique called numerical integration.

The total area under the curve is 1.

However, the values we need are in Normal Probability Tables.
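To give a feel for what numerical integration involves, here is a rough sketch using the trapezoidal rule on the standard normal curve. The integration limits of ±8 and the step count are assumptions (the curve is negligibly small beyond those limits), and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def standard_normal_pdf(z):
    """Height of the standard normal curve (mean 0, standard deviation 1) at z."""
    return exp(-z * z / 2) / sqrt(2 * pi)


def area_under_curve(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


total_area = area_under_curve(standard_normal_pdf, -8, 8)
left_half = area_under_curve(standard_normal_pdf, -8, 0)
print(total_area)  # approximately 1
print(left_half)   # approximately 0.5, by symmetry
```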

Page 51: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

The tables are for a Normal Distribution with µ = 0 and σ = 1.

• This is called the Standard Normal
• Can 'convert' a value from any normal to the standard normal using standard scores (Z scores)

[Diagram: Value from any Normal Distribution → Standardize → Corresponding value from the Standard Normal (µ = 0, σ = 1)]

Standardised score:

\[ Z = \frac{x - \mu}{\sigma} \]

Page 52: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Z-Score Example

Z scores are a unit-less quantity, measuring how far above/below the mean a certain score (x) is, in standard deviation units.

Example: a score of 35, from a normal distribution with µ = 25 and σ = 5.

Z = (35 − 25) / 5 = 10 / 5 = 2

So 35 is 2 standard deviation units above the mean.

What about a score of 20?

Z = (20 − 25) / 5 = −5 / 5 = −1

So 20 is 1 standard deviation unit below the mean.
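The two standardisations above take one line each in code. A minimal sketch, where `z_score` is a hypothetical helper name:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


print(z_score(35, 25, 5))  # 2.0
print(z_score(20, 25, 5))  # -1.0
```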

Page 53: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Z-Score Example

Positive Z score => score is above the mean
Negative Z score => score is below the mean

By subtracting µ and dividing by σ we convert any normal to µ = 0, σ = 1, so we only need one set of tables!

Page 54: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Example:

From looking at the histogram of people's weekly receipts, a supermarket knows that the amount people spend on shopping per week is normally distributed with:

µ = €58, σ = €15.

Page 55: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

What is the probability that a customer sampled at random will spend less than €83.50 ?

Z = (x − µ) / σ = (€83.50 − €58) / €15 = 1.7

The area from Z = 1.7 to the left can be read in tables.

From tables, the area less than Z = 1.7 is 0.9554.

So the probability is 0.9554, or 95.54%.
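Instead of tables, the same area can be obtained in software. The sketch below uses the error function from Python's standard library, which is one common way to write the normal cumulative probability; `normal_cdf` is a hypothetical helper name, not something from the slides.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma                   # standardise first
    return 0.5 * (1 + erf(z / sqrt(2)))    # area to the left of z


p_less = normal_cdf(83.50, 58, 15)
print(round(p_less, 4))      # 0.9554
print(round(1 - p_less, 4))  # 0.0446, the area to the right of Z = 1.7
```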

Page 56: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)
Page 57: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

What is the probability that a customer sampled at random will spend more than €83.50 ?

Z = (x − µ) / σ = (€83.50 − €58) / €15 = 1.7

The area from Z = 1.7 to the right can be read in tables.

From tables, the area greater than Z = 1.7 is 1 − 0.9554 = 0.0446.

So the probability is 0.0446, or 4.46%.

Page 58: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Exercise

• Find the proportion of people who spend more than €76.75
• Find the proportion of people who spend less than €63.50

Note: The tables can also be used to find other areas (less than a particular value, or the area between two points).

Page 59: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Characteristics of Normal Distributions

• Standard deviation has particular relevance to the Normal distribution
• Normal Distribution => Empirical Rule

Between Z (lower, upper)    % Area
(-1, 1)                     68%
(-2, 2)                     95%
(-3, 3)                     99.7%
(-∞, +∞)                    100%
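The empirical rule percentages can be reproduced numerically. This sketch relies on the standard identity that, for a standard normal Z, P(−k < Z < k) = erf(k/√2); the function name `prob_within` is hypothetical.

```python
from math import erf, sqrt


def prob_within(k):
    """P(-k < Z < k) for a standard normal Z."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * prob_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The slide values of 68% and 95% are these figures rounded for easy recall.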

Page 60: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

The normal distribution is just one of the known continuous probability distributions.

Each has its own probability density function, giving differently shaped curves.

In each case, we find probabilities by calculating areas under these curves using integration.

However, the Normal is the most important – as it plays a major role in Sampling Theory.

Page 61: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Other important continuous probability distributions include:

• Exponential distribution – especially positively skewed lifetime data.

• Uniform distribution.

• Weibull – especially for ‘time to event’ analysis.

• Gamma distribution – waiting times between Poisson events in time etc.

• Many others…..

Page 62: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Summary – Random Variables

There are two types: discrete RVs and continuous RVs.

For both cases we can calculate a mean (μ) and standard deviation (σ).

μ can be interpreted as the average value of the RV.

σ can be interpreted as the typical spread of the RV's values around μ.

Page 63: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Summary Continued…

For a discrete RV we often have a mathematical formula which is used to calculate probabilities, i.e. P(x) = some formula.

This formula is called the Probability Mass Function (PMF).

Given the PMF you can calculate the mean and variance by:

\[ \mu = \sum_x x\,P(x) \qquad \sigma^2 = \sum_x (x-\mu)^2 P(x) = \sum_x x^2 P(x) - \mu^2 \]

where the summation is over all possible values of x.

Page 64: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Summary Continued…

For continuous RVs, we use a Probability Density Function (PDF) to define a curve over the histogram of the values of the random variable.

We integrate this PDF to find areas which are equal to probabilities of interest.

Given the PDF you can calculate the mean and variance by:

\[ \mu = \int_{-\infty}^{\infty} x\,f(x)\,dx \qquad \sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx \]

where f(x) is the usual mathematical notation for the PDF.
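As an illustration of these integrals, the sketch below recovers the mean and variance of a normal PDF by numerical integration. It assumes the supermarket example's parameters (µ = 58, σ = 15), replaces the infinite limits with µ ± 10σ, and uses a simple trapezoidal rule; all function names are hypothetical.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)


def integrate(g, a, b, steps=20_000):
    """Trapezoidal approximation of the integral of g from a to b."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * h)
    return total * h


mu_true, sigma_true = 58.0, 15.0  # assumed example parameters
a, b = mu_true - 10 * sigma_true, mu_true + 10 * sigma_true


def f(x):
    return normal_pdf(x, mu_true, sigma_true)


mean = integrate(lambda x: x * f(x), a, b)
variance = integrate(lambda x: (x - mean) ** 2 * f(x), a, b)
print(mean)      # approximately 58
print(variance)  # approximately 225, i.e. 15 squared
```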

Page 65: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Question Time

Page 66: Probability & Statistical Inference Lecture 3 MSc in Computing (Data Analytics)

Next Week

Next week we will start with the practical part of the course. We will move to Lab