Biostat. 200 Review slides Week 1-3
Biostat. 200 Review slides Week 1-3. Recap: Probability.

Dec 31, 2015

Ira Campbell
Biostat. 200 Review slides

Week 1-3

Recap: Probability

Basic Probability

1) Complement: P(A) = 1 - P(Ā)

2) Intersection: P(A ∩ B)

3) Union: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
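The complement and union rules can be checked by counting outcomes. The sketch below assumes a single roll of a fair six-sided die; the events A and B are invented for illustration and are not from the slides.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "roll is even"
B = {4, 5, 6}   # "roll is greater than 3"

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# Complement: P(A) = 1 - P(not A)
complement_ok = prob(A) == 1 - prob(outcomes - A)

# Union: count A-or-B directly, then compare with P(A) + P(B) - P(A and B)
union_direct = prob(A | B)
union_by_rule = prob(A) + prob(B) - prob(A & B)

print(complement_ok, union_direct, union_by_rule)
```

Subtracting P(A ∩ B) corrects for the outcomes 4 and 6, which would otherwise be counted twice.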

Basic Probability

4) Mutual exclusivity (mutually exclusive events are still dependent)

• Mutual exclusivity = Additive Rule
– P(A ∩ B) = 0
– P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = P(A) + P(B)
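A small counting sketch of why mutually exclusive events are still dependent, again assuming a fair die with made-up events: the two events cannot occur together, so P(A ∩ B) is 0, which differs from P(A)P(B).

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # hypothetical fair die
A = {1, 2}                      # "roll is 1 or 2"
B = {5, 6}                      # "roll is 5 or 6"; shares no outcome with A

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

p_both = prob(A & B)            # mutually exclusive, so this is 0
p_either = prob(A | B)          # counted directly
additive = prob(A) + prob(B)    # additive rule
product = prob(A) * prob(B)     # what P(A and B) would be under independence

print(p_both, p_either, additive, product)
```

Knowing that A occurred tells you B did not, which is exactly what dependence means.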

Basic Probability

5) Conditional Probability

The probability that an event B will occur given that event A has occurred.

• Use the multiplicative rule
– P(A ∩ B) = P(A) P(B|A)
– P(B|A) = P(A ∩ B) / P(A)

• Applies to
– Relative risks
– Odds ratios
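As a rough illustration of how conditional probability feeds a relative risk, the sketch below uses hypothetical cohort counts (invented here, not taken from the slides). A is "exposed" and B is "diseased".

```python
from fractions import Fraction

# Hypothetical cohort counts, invented for illustration only.
exposed_diseased, exposed_total = 30, 100
unexposed_diseased, unexposed_total = 10, 100
n = exposed_total + unexposed_total

# P(B|A) = P(A and B) / P(A), with A = exposed and B = diseased
p_exposed = Fraction(exposed_total, n)
p_exposed_and_diseased = Fraction(exposed_diseased, n)
p_disease_given_exposed = p_exposed_and_diseased / p_exposed

# Same rule applied to the unexposed group
p_unexposed = Fraction(unexposed_total, n)
p_unexposed_and_diseased = Fraction(unexposed_diseased, n)
p_disease_given_unexposed = p_unexposed_and_diseased / p_unexposed

# Relative risk is the ratio of the two conditional probabilities
relative_risk = p_disease_given_exposed / p_disease_given_unexposed

print(p_disease_given_exposed, p_disease_given_unexposed, relative_risk)
```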

Basic Probability

6) Independence

Note that independence ≠ mutual exclusivity!

• If A and B are independent:
– P(B | A) = P(B | Ā) = P(B)
– P(A | B) = P(A | B̄) = P(A)
– P(A ∩ B) = P(A)P(B) (Multiplicative rule)
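The independence conditions can be verified by counting. This sketch assumes a fair die and two events chosen (hypothetically) so that they happen to be independent.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # hypothetical fair die
A = {2, 4, 6}                   # "roll is even"
B = {1, 2}                      # "roll is 1 or 2"
not_A = outcomes - A

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

def cond(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)

p_B_given_A = cond(B, A)
p_B_given_not_A = cond(B, not_A)
multiplicative_ok = prob(A & B) == prob(A) * prob(B)

print(p_B_given_A, p_B_given_not_A, prob(B), multiplicative_ok)
```

Here A and B overlap (both contain 2), so they are not mutually exclusive, yet learning that A occurred does not change the probability of B.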

Probability Distributions

• Discrete distributions
• Continuous distributions

Discrete Variables

• For discrete variables the probability distribution describes the probability of each possible value

Discrete distributions

• Bernoulli distribution
– If a variable can take on one of two values with a constant probability p, then it is a Bernoulli random variable
– Outcomes are either 0 or 1
– Theoretical building block for describing the distribution of more than one trial

Discrete Distributions

Binomial Distribution:

$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$$

With:
• p is the probability of “success” in each “trial”
• n is the number of “trials”
• n and p are the parameters of the binomial distribution (they summarize the distribution)
• x is the number of “successes” (outcomes)
• Note that Stata and Table A.1 use the symbol k for x
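The binomial formula translates almost directly into code. The sketch below uses Python's standard `math.comb` for the binomial coefficient; the function name `binomial_pmf` is made up for this illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example: exactly 2 successes in n=5 trials with p=0.15
prob_two = binomial_pmf(2, 5, 0.15)
print(round(prob_two, 4))
```

The probabilities over x = 0, ..., n add up to 1, which is a quick sanity check on any implementation.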

Binomial Distribution

• Assumes
– Fixed number of trials n, each with one of two mutually exclusive outcomes
– Independent outcomes of the n trials
– Constant probability of success p for each trial
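One way to see how these assumptions lead to the binomial distribution is to enumerate every possible sequence of n success/failure outcomes and add up the sequence probabilities. The sketch below does this for n=5 and p=0.15 and compares the result with the standard binomial formula C(n,x) p^x (1-p)^(n-x); it is an illustration, not an efficient method.

```python
from itertools import product
from math import comb, isclose

n, p = 5, 0.15

# Enumerate every sequence of n trial outcomes (1 = success, 0 = failure).
pmf = [0.0] * (n + 1)
for trial_outcomes in product([0, 1], repeat=n):
    successes = sum(trial_outcomes)
    # Independent trials with constant p: multiply the per-trial probabilities.
    pmf[successes] += p**successes * (1 - p)**(n - successes)

# Binomial formula for comparison
formula = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

matches = all(isclose(a, b) for a, b in zip(pmf, formula))
print(matches)
```

The binomial coefficient simply counts how many of the 2^n sequences contain exactly x successes.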

Binomial Distribution

• What is the probability of exactly 2 cases of disease in a sample of n=5 where p=0.15?

• How to calculate the probability?
1) Use the binomial formula

$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$$

In Stata: display comb(n,k)
display comb(5,2)
10

– (10)(0.15)^2 (1-0.15)^(5-2)
– (10)(0.0225)(0.614) = 0.138

Binomial Distribution

• What is the probability of exactly 2 cases of disease in a sample of n=5 where p=0.15?

• How to calculate the probability?
2) Use Table A.1
– Table A.1 gives you P(X=k)
– Look up p=.15, n=5, k=2, answer=.1382

Binomial Distribution

• What is the probability of exactly 2 cases of disease in a sample of n=5 where p=0.15?

• How to calculate the probability?
3) Use Stata
• binomialp(n,k,p)
display binomialp(5,2,.15)
.13817813

Binomial Distribution

What is the probability of 1 or more cases of disease in a sample of n=5 where p=0.15?

1) Use the binomial formula
• P(X≥1) = 1 - P(X=0)

$$P(X = 0) = \binom{5}{0} (0.15)^0 (0.85)^5 = 1 \times 0.85^5$$

di comb(5,0)*0.15^0*.85^5
.44370531

• So 1 - P(X=0) = 1 - 0.4437 = 0.5563

Binomial Distribution

What is the probability of 1 or more cases of disease in a sample of n=5 where p=0.15?

2) Use Table A.1
• P(X≥1) = 1 - P(X=0)
• Looking up P(X=0) we get 0.4437
– So 1 - P(X=0) = 1 - 0.4437 = 0.5563

Binomial Distribution

• What is the probability of 1 or more cases of disease in a sample of n=5 where p=0.15?

• How to calculate the probability?
3) Use Stata
display binomialtail(5,1,.15)
.55629469
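For readers without Stata, the same upper-tail probability can be reproduced in two ways: summing P(X=x) from x=1 to n, or taking the complement of P(X=0). The helper names below are hypothetical, and the sketch uses only Python's standard library.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_upper_tail(k, n, p):
    """P(X >= k), obtained by summing the pmf from k to n."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

via_sum = binomial_upper_tail(1, 5, 0.15)
via_complement = 1 - binomial_pmf(0, 5, 0.15)

print(round(via_sum, 4), round(via_complement, 4))
```

The complement route needs only one term here, which is why it is the natural choice for "1 or more" questions.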

Binomial Distribution

• Binomial mean = np
• Binomial variance = np(1-p)
– Variance is largest when p=0.5, smaller when p is closer to 0 or 1
– The distribution is symmetric when p=0.5
– The distribution is a mirror image for 1-p (i.e. the distribution for p=0.05 is the mirror image of the one for p=0.95)
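These properties can be checked numerically for n=20. The sketch below computes the mean and variance directly from the probabilities and tests the mirror-image claim for p=0.05 versus p=0.95; helper names are made up for illustration.

```python
from math import comb, isclose

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def mean_and_variance(n, p):
    """Mean and variance computed from the pmf itself, not from np and np(1-p)."""
    pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(pmf))
    variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
    return mean, variance

n = 20
mean_half, var_half = mean_and_variance(n, 0.5)     # expect np = 10, np(1-p) = 5
_, var_low = mean_and_variance(n, 0.05)             # expect 20 * 0.05 * 0.95 = 0.95

# Mirror image: P(X = x | p = 0.05) equals P(X = n - x | p = 0.95)
low = [binomial_pmf(x, n, 0.05) for x in range(n + 1)]
high = [binomial_pmf(x, n, 0.95) for x in range(n + 1)]
mirror = all(isclose(low[x], high[n - x]) for x in range(n + 1))

print(mean_half, var_half, var_low, mirror)
```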

[Figure: three panels plotting binomial probability against number of successes (0 to 20): Binomial distribution n=20 p=.05; Binomial distribution n=20 p=.95; Binomial distribution n=20 p=.5]

Continuous distributions

Normal distribution

Continuous Distribution

• For continuous variables, the distribution describes the probability of a range of values

Normal distribution

• The probability density function is

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad \text{where } -\infty < x < \infty$$

• μ is the mean and σ is the standard deviation of a normally distributed random variable
– They are the parameters of the normal distribution
– π is the constant that is approximately 3.14159
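The density formula can be written out as a short function and compared against Python's standard `statistics.NormalDist`. The function name `normal_pdf` and the evaluation points are chosen here for illustration only.

```python
from math import exp, pi, sqrt, isclose
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-0.5 * ((x - mu) / sigma)^2) / (sqrt(2*pi) * sigma)."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sqrt(2 * pi) * sigma)

# Compare the hand-written formula with the library for a few parameter choices.
checks = [
    isclose(normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))
    for (mu, sigma) in [(0, 1), (0, 3), (4, 1)]
    for x in (-2.0, 0.0, 1.5, 5.0)
]

peak_standard = normal_pdf(0, 0, 1)   # height of the standard normal at its mean
print(all(checks), round(peak_standard, 4))
```

A larger σ spreads the curve out and lowers its peak, while changing μ only shifts the curve along the x axis.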

[Figure: Several normal distributions. Density curves over x from -10 to 10 for mean 0, SD 1; mean 0, SD 3; and mean 4, SD 1]

The Standard Normal Distribution

• μ and σ can take on an infinite number of values

• Standard curve with
– μ = 0
– σ = 1 (and variance σ² = 1)

• Denoted N(0,1)

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}, \quad \text{where } -\infty < x < \infty$$

The Standard Normal Distribution

• If X is a normally distributed random variable with mean μ and standard deviation σ then

Z= (X – μ)/σ

is a standard normal random variable

• That is, a normally distributed random variable with its mean subtracted off, divided by its standard deviation, is a normal random variable with mean=0 and standard deviation=1
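A quick numerical check of standardization: the probability that X falls below some value x equals the probability that Z falls below the corresponding z. The mean, standard deviation, and x below are hypothetical values picked so that z comes out near 1.96.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical mean and standard deviation for X
x = 129.4                 # hypothetical observed value

z = (x - mu) / sigma      # standardize: subtract the mean, divide by the SD

p_x = NormalDist(mu, sigma).cdf(x)   # P(X <= x) on the original scale
p_z = NormalDist().cdf(z)            # P(Z <= z) on the standard normal scale

print(round(z, 2), round(p_x, 4), round(p_z, 4))
```

This is why a single standard normal table is enough for any normal distribution.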

For Z ~ N(0,1), P(Z≥0) = 0.50

Zero is the mean and median for a standard normal distribution

[Figure: Standard normal distribution, Z from -5 to 5]

For Z ~ N(0,1), P(Z≥1.96) = 0.025

Probability of observing a value of 1.96 or greater is 0.025

[Figure: Standard normal distribution, Z from -5 to 5]

P(µ-2σ ≤ Z ≤ µ+2σ)

Remember µ=0 and σ=1, so this is

P(-2 ≤ Z ≤ 2) = 0.954

Therefore, approximately 95.4% of the area of the standard normal is within 2 SD of the mean.
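The 95.4% figure can be reproduced from the standard normal cumulative distribution function. This sketch uses `statistics.NormalDist` from Python's standard library.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

within_2_sd = Z.cdf(2) - Z.cdf(-2)   # area between -2 and 2
each_tail = Z.cdf(-2)                # area below -2 (same as area above 2)

print(round(within_2_sd, 3), round(each_tail, 3))
```

The central area and the two tails add up to 1, since together they cover the whole curve.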

[Figure: Standard normal distribution, Z from -5 to 5; area 0.954 between -2 and 2, and 0.023 in each tail]

• Stata will calculate standard normal probabilities for you

• In Stata, normal(z) gives the left portion of the curve, P(Z<z)
display normal(1.96)
.9750021

• If you want the right-hand portion of the curve, P(Z>z), subtract your answer from 1
display 1-normal(1.96)
.0249979

• If you want the middle:
display normal(1.96) - normal(-1.96)
.95000421
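The three Stata calculations have close counterparts in Python's standard library, where `NormalDist().cdf(z)` plays the role of Stata's `normal(z)`. This is offered as a cross-check, not as part of the course material.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal

left = Z.cdf(1.96)                    # P(Z < 1.96), like normal(1.96)
right = 1 - Z.cdf(1.96)               # P(Z > 1.96), like 1-normal(1.96)
middle = Z.cdf(1.96) - Z.cdf(-1.96)   # P(-1.96 < Z < 1.96)

print(round(left, 7), round(right, 7), round(middle, 7))
```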

[Figure: Standard normal distribution, Z from -5 to 5, with the area for Z<1.96 highlighted]

Finding z values for probabilities in Stata

• To get the z value for P(Z<z) = p use
display invnormal(p)

• To get the z value for P(Z>z) = p use
display invnormal(1-p)

E.g. what is the z value for P(Z≤z) = 0.025?
display invnormal(0.025)
-1.959964

E.g. what is the z value for P(Z>z) = 0.025?
display invnormal(1-.025)
1.959964
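Outside Stata, the inverse lookup is available as `NormalDist().inv_cdf(p)` in Python's standard library, which corresponds to `invnormal(p)`. The sketch below repeats both examples.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal
p = 0.025

z_lower = Z.inv_cdf(p)       # z with P(Z <= z) = p, like invnormal(0.025)
z_upper = Z.inv_cdf(1 - p)   # z with P(Z > z) = p, like invnormal(1-.025)

print(round(z_lower, 6), round(z_upper, 6))
```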

Finding z values for probabilities using Table A.3

• To get the z value for P(Z>z) = p – find p in the table and read the corresponding z

• To get the z value for P(Z<z) = p – find p and use -1 × the corresponding z

E.g. what is the z value for P(Z≤z) = 0.025?
For p=0.025 the table value is 1.96, so the answer is -1.96

E.g. what is the z value for P(Z>z) = 0.025?
For p=0.025 the table value is 1.96
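The sign flip works because the standard normal curve is symmetric about zero. The sketch below mimics the table lookup with `NormalDist().inv_cdf` (an assumption standing in for Table A.3, which lists upper-tail areas) and then confirms that the negated value leaves area p to its left.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal
p = 0.025

# Stand-in for the table lookup: the z with upper-tail area P(Z > z) = p
table_z = Z.inv_cdf(1 - p)

# By symmetry, the z with lower-tail area P(Z < z) = p is the negative of it
lower_z = -table_z
lower_area = Z.cdf(lower_z)

print(round(table_z, 2), round(lower_z, 2), round(lower_area, 3))
```

The same reasoning applies to any p below 0.5; only the table value changes.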
