7/29/2019 Lecture 3 - Probability Distributions
Statistical inference: probability
distributions and confidence intervals
We are now familiar with descriptive statistics, but the main use of statistical methods is not description but prediction
o i.e. we collect samples mostly to predict characteristics of the whole population
The key instrument of extrapolation from sample to population is the analysis of probability distributions:
o by assuming that our variables have a certain distribution (normal, uniform, etc.), we can use samples to infer population properties
In the following we examine the concept and uses of statistical distributions
The most widely used statistical distribution is the normal distribution (the bell curve)
o also the most infamous due to certain misuses
o http://crab.rutgers.edu/~goertzel/normalcurve.htm
However, there is nothing intrinsically wrong with using probability distributions
o well, anything in the wrong hands (from a bread knife to a fundamental law of nature proposed by a pacifist) may become a weapon
The first reason for the popularity of the normal curve is descriptive, i.e. we use it to model the distribution of certain traits that look bell-shaped
What traits are bell-shaped? Typically, traits that are optimised or established by biological or social processes, and thus have a tendency to occur at an expected value
o classic example: biological traits under natural selection
o one reason Darwin applied the principle of optimisation to natural processes is that optimisation was a current concept in Victorian society (especially in Economics)
The normal distribution is just a modified version of our exponential
The curve
N(0,1): $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^{2}/2}$
is the standard normal distribution with
o mean = 0
o sd = 1
o sum of frequencies (area under the curve) = 1
The distribution N(0, 1) is possibly the most used in statistical analyses
It says that, for example:
o the probability of being well above average (more than 3 standard deviations above the mean) is only about 0.1%
o the probability of being more than one standard deviation below average (below -1 sd) is 0.1 + 2.1 + 13.6 = 15.8% (i.e. everything below -1)
[Figure: horizontal axis in standard deviations, from -3 to +3]
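The percentages quoted for N(0, 1) can be checked numerically. The following is a minimal sketch in R, assuming the base function pnorm (cumulative probability of the normal distribution, which defaults to mean = 0 and sd = 1 when neither is given); the variable names are illustrative only.

```r
# Cumulative probabilities of N(0, 1) at -3, -2 and -1 standard deviations
cum <- pnorm(c(-3, -2, -1))

# Area of each band: below -3, from -3 to -2, from -2 to -1
bands <- c(cum[1], diff(cum))
band_percent <- round(100 * bands, 1)
print(band_percent)   # 0.1 2.1 13.6

# Everything below -1 sd, and everything above +3 sd
below_minus1 <- pnorm(-1)
above_plus3  <- 1 - pnorm(3)
```

The three bands add up to about 15.9% when unrounded; the 15.8% above comes from adding the rounded band values.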
However, real traits (body height, income, schooling years, number of social media accounts) may have a normal distribution (bell shape), but rarely with mean = 0 and standard deviation = 1
That is not a problem: we can standardise variables, i.e. transform them so that everything you measure has mean = 0 and sd = 1
How is this done? With z-scores
1) We take variable x and subtract the mean from each case
o if mean height is 180 cm, someone 170 cm tall now measures 170 - 180 = -10
2) We take all residuals (case minus mean) and divide by the standard deviation
o if sd = 10 and the mean is 180 cm, someone measuring 170 cm deviates -10 cm / 10 cm = -1, i.e. one standard deviation below the mean
In summary, standardisation or calculation of z-scores is simply converting any measurements into standard deviation units:
$z = \frac{x - \text{mean}}{\text{sd}}$
[Figure: horizontal axis in standard deviation units, from -3 to +3]
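The two standardisation steps can be sketched in R as follows. The height values are hypothetical and chosen only for illustration (their sd is not 10); mean, sd and scale are base R functions.

```r
# Hypothetical heights in cm (illustrative values only)
height <- c(160, 170, 180, 190, 200)

# Step 1: subtract the mean from each case
deviation <- height - mean(height)

# Step 2: divide each deviation by the standard deviation
z <- deviation / sd(height)

# After standardising, the mean is 0 and the sd is 1
z_mean <- mean(z)
z_sd   <- sd(z)

# scale() performs the same two steps; it returns a one-column matrix,
# so as.vector() turns the result back into a plain vector
z_scale <- as.vector(scale(height))
```

Both routes give the same z-scores; doing it by hand simply makes the two steps visible.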
So: if in a population
o mean height = 180 cm
o standard deviation = 10 cm
and you are 170 cm, then
o you measure 10 cm below the average
o you measure z = (170 - 180)/10 = -1
This means that the probability of being shorter than 170 cm in this population is
o 0.1 + 2.1 + 13.6 = 15.8%
The reason for standardising is clear: it is the theoretical step that allows the application of the normal distribution to many quantifiable aspects of reality
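The height example can be reproduced in two steps: compute the z-score, then look up its cumulative probability under N(0, 1). This is a minimal sketch assuming R's base function pnorm; the variable names are illustrative.

```r
# Values from the height example
x      <- 170   # your height in cm
mean_h <- 180   # population mean in cm
sd_h   <- 10    # population standard deviation in cm

# z-score: distance from the mean in standard deviation units
z <- (x - mean_h) / sd_h

# Probability of being below z under the standard normal N(0, 1)
p_shorter <- pnorm(z)
print(z)          # -1
print(p_shorter)  # about 0.1587
```

The unrounded result is about 15.9%; the 15.8% figure is the sum of the rounded bands 0.1, 2.1 and 13.6.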
We are interested in intervals of the normal curve, not points
Why? What does it mean to ask what the probability is of being a millionaire in the UK (or how frequent millionaires are)?
o it does not mean the probability of having exactly 1 million (that's a single point on the curve)
o it means everyone having over 1 million (and that's an interval of the curve)
Cumulative probability is the probability of an interval of values
[Figure: a lower interval and an upper interval of the normal curve]
It is easy to estimate the cumulative probability of being smaller than a value in RStudio
o you provide the individual (test) value, mean, and sd, and R calculates the z-score and the probability of the interval defined by that value
The command pnorm(test value, mean, sd) calculates cumulative probability from left to right, i.e. from minus infinity to a value x (that's the blue area)
Example: if your height is 170 cm, the average is 180 cm, and sd = 10 cm, then the probability of being shorter than 170 cm is
o > pnorm(170,180,10)
o [1] 0.1586553
[Figure: normal curve with a lower interval shaded in blue]
pnorm can estimate upper intervals too (i.e. the probability of being over a given value)
Example:
o what is the probability of being at least (i.e. taller than) 190 cm in the same population?
1) The probability of being smaller than 190 cm (the WHITE area) is
> pnorm(190,180,10)
[1] 0.8413447
i.e. 0.841 = 84.1%
2) Thus the probability of being over 190 cm is the rest of the curve
> 1-pnorm(190,180,10)
[1] 0.1586553
i.e. the probability of being taller than 190 cm is 1 (100%) minus the probability of being smaller than 190 cm
[Figure: normal curve with an upper interval shaded]
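As an alternative to subtracting from 1, pnorm has a lower.tail argument that returns the upper interval directly. The sketch below compares the two routes for the 190 cm example; the variable names are illustrative.

```r
# Upper interval as "1 minus the lower interval"
p_upper_complement <- 1 - pnorm(190, mean = 180, sd = 10)

# Upper interval requested directly with lower.tail = FALSE
p_upper_direct <- pnorm(190, mean = 180, sd = 10, lower.tail = FALSE)

print(p_upper_complement)  # about 0.1587
print(p_upper_direct)      # same value
```

Either form works; the 1 - pnorm(...) version keeps the "rest of the curve" reasoning explicit.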
Important: we can combine the two things to calculate the probability of extreme values (i.e. too large or too small)
So what is the probability of being shorter than 170 cm OR taller than 190 cm, with N(mean = 180, sd = 10)?
> 1 - pnorm(190, 180, 10) + pnorm(170, 180, 10)
(check why)
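One way to check the combined expression is to compute the two tails separately and add them, since the two events cannot happen at the same time. A minimal sketch with illustrative variable names:

```r
# Lower tail: shorter than 170 cm
p_low <- pnorm(170, 180, 10)

# Upper tail: taller than 190 cm
p_high <- 1 - pnorm(190, 180, 10)

# Probability of being extreme in either direction
p_extreme <- p_low + p_high
print(p_extreme)  # about 0.317
```

Because 170 and 190 are equally far from the mean and the normal curve is symmetric, the two tails are equal, so the result is also 2 * pnorm(170, 180, 10).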
Now the most important case (we'll see why):
What about the probability of not being extreme, i.e. of being between 170 cm and 190 cm? (This means less than 10 cm off the average of 180 cm)
o > pnorm(190, 180, 10) - pnorm(170, 180, 10)
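The interval between two values is the cumulative probability up to the larger value minus the cumulative probability up to the smaller one. A minimal sketch, with illustrative variable names, that also checks the result against the two tails:

```r
# Probability of being between 170 cm and 190 cm
p_between <- pnorm(190, 180, 10) - pnorm(170, 180, 10)
print(p_between)  # about 0.683

# The two tails outside the interval (below 170 or above 190)
p_outside <- pnorm(170, 180, 10) + (1 - pnorm(190, 180, 10))

# Inside and outside together cover the whole curve
total <- p_between + p_outside
```

So roughly 68% of the population lies within one standard deviation of the mean.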
Take the estimates of years at school by country (from the HDR2011 database); this is the variable schoolingyears:
How can we estimate the proportion of countries with children having
a) less than 3 years of schooling?
b) less than 5 years of schooling?
c) at least 7 years of schooling?
Hints:
- You need to use function pnorm
- To use pnorm you need the test value, the mean and the standard deviation of variable schoolingyears
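A possible outline of the calculation is sketched below. The numbers in schoolingyears are hypothetical placeholders, NOT the HDR2011 data, so the printed proportions are not the answers; replace the vector with the real variable. The sketch also assumes the variable is roughly normally distributed, which is what using pnorm implies.

```r
# Hypothetical placeholder values (NOT the HDR2011 data); NA marks a missing country
schoolingyears <- c(2.5, 4.0, 5.5, 6.0, 7.5, 8.0, 9.5, 10.0, 11.5, 12.0, NA)

# Mean and standard deviation, ignoring missing values
m <- mean(schoolingyears, na.rm = TRUE)
s <- sd(schoolingyears, na.rm = TRUE)

# a) less than 3 years: lower interval
p_less_3 <- pnorm(3, m, s)

# b) less than 5 years: lower interval
p_less_5 <- pnorm(5, m, s)

# c) at least 7 years: upper interval
p_atleast_7 <- 1 - pnorm(7, m, s)

print(c(p_less_3, p_less_5, p_atleast_7))
```

The na.rm = TRUE argument matters with real data: without it, a single missing value makes mean and sd return NA.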