Basic Probability Concepts - Statpower


Basic Probability Concepts

James H. Steiger

Department of Psychology and Human Development
Vanderbilt University

Multilevel Regression Modeling, 2009




An Introduction to R

1 Random Variables
    Informal Definition
    Manifest and Latent Random Variables
    Continuous and Discrete Random Variables

2 Probability Distributions
    Probability Models
    The Normal Distribution
    The Multivariate Normal Distribution
    The Lognormal Distribution
    The Binomial Distribution
    The Poisson Distribution

3 Sampling Distributions

4 Confidence Intervals
    The Classic Normal Theory Approach
    Confidence Intervals on Linear Transformations
    Confidence Intervals Via Simulation

5 Hypothesis Testing





Random Variables

The term random variable has a technical definition that we discussed in Psychology 310.
For our purposes, it will suffice to consider a random variable to be a random process with numerical outcomes that occur according to a distribution law.

Example (Uniform (0,1) Random Variable)

A random process that generates numbers so that all values between 0 and 1, inclusive, are equally likely to occur is said to have a U(0,1) distribution.
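The U(0,1) example can be illustrated with a short simulation. This is a minimal sketch in Python rather than the R used in the course; the function name `simulate_uniform`, the seed, and the number of draws are illustrative assumptions. If all values are equally likely, the sample mean should be near 0.5 and the share of draws in an interval should be near the interval's length.

```python
import random

def simulate_uniform(n_draws, seed=0):
    """Draw n_draws values from U(0,1) with a fixed seed (hypothetical helper)."""
    rng = random.Random(seed)
    # random() returns values in [0, 1); excluding the endpoint 1 has
    # probability zero and does not matter for this illustration.
    return [rng.random() for _ in range(n_draws)]

draws = simulate_uniform(100_000)

# The mean of a U(0,1) variable is 0.5.
sample_mean = sum(draws) / len(draws)

# The interval [0.2, 0.5] has length 0.3, so about 30% of draws should fall in it.
share_in_interval = sum(0.2 <= u <= 0.5 for u in draws) / len(draws)

print(round(sample_mean, 3), round(share_in_interval, 3))
```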






Manifest and Latent Variables

In advanced applications, we will refer to manifest and latent random variables.
A variable is manifest if it can be measured directly.
A variable is latent if it is an assumed quantity that cannot be measured directly.
The dividing line between manifest and latent variables is often rather imprecise.

Example (Manifest Variable)

Your grade on an exam is a manifest random variable.













































Continuous and Discrete Random Variables

A continuous random variable has an uncountably infinite number of possible outcomes because it can take on all values over some range of the number line.
A discrete random variable takes on only a countable number of discrete outcomes.
As we saw in Psychology 310, a discrete random variable can assign a nonzero probability to a particular numerical outcome, while a continuous random variable cannot: any single value has probability zero, so probabilities are assigned to intervals instead.

Example (Discrete Random Variable)

Suppose you assign the number 1 to all people born male, and 2 to all people born female. This random variable is discrete, because it takes on only the values 1 and 2.





































Using Probability Distributions

Probability distributions are frequently used to provide succinct models for quantities of scientific interest.
We observe distributions of data, and assess how well the distributions conform to the specified model.
While observing the distribution of the data, we may hypothesize the general family of the distribution, but leave open the question of the values of the parameters.
In that case, we talk of free parameters to be estimated.





























Using Probability Distributions: More Complex Applications

In more complex applications, such as multilevel modeling, we may model data emanating from a particular distribution family at one level (say kids within a school).
At another level, we might model the parameters for the schools as having a distribution across schools.
For example, we might hypothesize that the parameters across schools have a normal distribution.
In that case, the size of the variance of that distribution would indicate how much the schools show variation on a particular characteristic.
In the slides that follow, we shall examine some of the more useful distributions we will encounter early in the course.
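The two-level idea can be sketched as a simulation. This is a minimal Python sketch, not the course's R code; the function `simulate_two_level` and all numerical values (200 schools, 50 kids per school, a grand mean of 100, a between-school standard deviation of 5, and a within-school standard deviation of 15) are illustrative assumptions. School means are drawn from a normal distribution across schools, and kids' scores are then drawn from a normal distribution around their own school's mean.

```python
import random
import statistics

def simulate_two_level(n_schools, n_kids, grand_mean, between_sd, within_sd, seed=0):
    """Simulate scores for kids nested within schools (hypothetical helper)."""
    rng = random.Random(seed)
    # Level 2: each school's true mean is itself a draw from a normal distribution.
    school_means = [rng.gauss(grand_mean, between_sd) for _ in range(n_schools)]
    # Level 1: each kid's score is a draw around that kid's school mean.
    scores = [[rng.gauss(m, within_sd) for _ in range(n_kids)] for m in school_means]
    return school_means, scores

school_means, scores = simulate_two_level(200, 50, 100.0, 5.0, 15.0)

# The spread of the true school means reflects the between-school SD (5 here).
true_mean_sd = statistics.stdev(school_means)

# Observed school averages vary a little more, because each average also
# carries sampling noise from the kids within the school.
observed_mean_sd = statistics.stdev([statistics.fmean(s) for s in scores])

print(round(true_mean_sd, 2), round(observed_mean_sd, 2))
```

A larger `between_sd` would correspond to schools that differ more from one another on the characteristic being modeled.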



































The Normal Distribution

The normal distribution is a widely used continuous distribution.
The normal distribution family is a two-parameter family.
Each normal distribution is characterized by two parameters, the mean µ and the standard deviation σ.
Shaped like a bell, the normal pdf is sometimes referred to as the bell curve.
The central limit theorem, discussed on pages 13–14 of Gelman & Hill, explains why many quantities have a distribution that is approximately normal.
The normal distribution family is closed under linear transformations, i.e., a linear transformation of a normal variable is again normal, and any normal distribution may be transformed into any other normal distribution by a linear transformation.
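Closure under linear transformations can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the parameter values (mean 100, standard deviation 15) and the helper `standardize` are illustrative assumptions. Subtracting µ and dividing by σ turns any normal variable into a standard normal one, so probabilities agree before and after the transformation.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Linear transformation z = (x - mu) / sigma (hypothetical helper)."""
    return (x - mu) / sigma

source = NormalDist(mu=100, sigma=15)   # an arbitrary normal distribution
standard = NormalDist(mu=0, sigma=1)    # the standard normal distribution

x = 115
p_source = source.cdf(x)                                # P(X <= 115) under N(100, 15)
p_standard = standard.cdf(standardize(x, 100, 15))      # P(Z <= 1) under N(0, 1)

# NormalDist also supports arithmetic with constants, which applies the
# same linear transformation to the distribution itself.
z_dist = (source - 100) / 15

print(round(p_source, 4), round(p_standard, 4), z_dist)
```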









































The Multivariate Normal Distribution

The multivariate normal distribution is a continuous multivariate distribution having two matrix parameters, the vector of means µ and the covariance matrix Σ.

Any linear combination of multi-normal variables has a normal distribution.
As we saw in Psychology 310, the mean and variance of the linear combination are determined by µ, Σ, and the linear weights.
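For a linear combination with weight vector w, the standard results are that the mean is w'µ and the variance is w'Σw. The following Python sketch computes both with plain lists; the function name `linear_combination_moments` and the numerical values are illustrative assumptions, not taken from the course materials.

```python
def linear_combination_moments(weights, means, cov):
    """Return (mean, variance) of sum_i w_i * X_i given means and covariance matrix."""
    n = len(weights)
    # Mean of the combination: w' mu
    mean = sum(w * m for w, m in zip(weights, means))
    # Variance of the combination: w' Sigma w
    variance = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    return mean, variance

# Difference X1 - X2 for two correlated variables (hypothetical values).
weights = [1, -1]
means = [10, 7]
cov = [[4, 1],
       [1, 9]]

mean, variance = linear_combination_moments(weights, means, cov)
print(mean, variance)  # 3 and 4 - 2*1 + 9 = 11
```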


























The Lognormal Distribution

If X is normally distributed, then Y = e^X is said to have a lognormal distribution. Equivalently, if Y is lognormally distributed, the logarithm of Y has a normal distribution.
In R, dlnorm gives the density, plnorm gives the distribution function, qlnorm gives the quantile function, and rlnorm generates random deviates.





The Lognormal Distribution: Some Basic Facts

It is common, when referring to a normal distribution, to use the abbreviations N(µ, σ) or N(µ, σ²).
It is important to realize that, when referring to a lognormal distribution for a variable Y, the convention is to refer to the parameters µ and σ from the corresponding normal variable X = ln(Y).
In this case, the actual mean and variance of Y are not µ and σ², but rather are

$$E(Y) = e^{\mu + \frac{1}{2}\sigma^2}, \qquad \mathrm{Var}(Y) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$$
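The distinction between the parameters µ and σ and the actual mean and variance of Y can be checked by simulation. This is a minimal Python sketch; the helper `lognormal_mean_var`, the values µ = 0 and σ = 0.5, the seed, and the number of draws are illustrative assumptions. Python's `random.lognormvariate(mu, sigma)` follows the same convention of taking the parameters of the underlying normal variable.

```python
import math
import random
import statistics

def lognormal_mean_var(mu, sigma):
    """Mean and variance of Y = exp(X), where X is normal with mean mu and SD sigma."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    var = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)
    return mean, var

mu, sigma = 0.0, 0.5
theory_mean, theory_var = lognormal_mean_var(mu, sigma)

# Simulate lognormal deviates and compare sample moments with the formulas.
rng = random.Random(1)
draws = [rng.lognormvariate(mu, sigma) for _ in range(200_000)]
sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

print(round(theory_mean, 4), round(sample_mean, 4))
print(round(theory_var, 4), round(sample_var, 4))
```

Note that the mean of Y is above e^µ = 1 even though µ = 0, which is the point of the warning above.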












The Lognormal Distribution

Example (The Lognormal Distribution)

Here is a picture comparing the lognormal and corresponding normal distribution.

[Figure: density f(x) plotted against x, with x running from −3 to 3 and f(x) from 0.0 to 1.0]





The Lognormal Distribution: Applications

Applications of the Lognormal
When independent processes combine multiplicatively, the result can be lognormally distributed.
For a detailed and entertaining discussion of the lognormal distribution, see the article by Limpert, Stahel, and Abbt (2001) in the reading list.
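The multiplicative mechanism can be sketched with a simulation. In this Python sketch, each outcome is the product of 50 independent positive factors drawn from U(0.5, 1.5); the factor distribution, the number of factors, the seed, and the helper name `product_of_factors` are all illustrative assumptions. Because the log of a product is a sum of logs, the central limit theorem suggests the logs should look roughly normal, while the products themselves are strongly right-skewed.

```python
import math
import random
import statistics

def product_of_factors(n_factors, rng):
    """Multiply n_factors independent positive random factors (hypothetical helper)."""
    product = 1.0
    for _ in range(n_factors):
        product *= rng.uniform(0.5, 1.5)
    return product

rng = random.Random(2)
products = [product_of_factors(50, rng) for _ in range(20_000)]
logs = [math.log(p) for p in products]

# Right skew on the original scale: the mean sits well above the median.
print(round(statistics.fmean(products), 3), round(statistics.median(products), 3))

# Approximate symmetry on the log scale: the mean and median nearly coincide.
print(round(statistics.fmean(logs), 3), round(statistics.median(logs), 3))
```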

















The Binomial Distribution

This discrete distribution is one of the foundations of modern categorical data analysis.
The binomial random variable X represents the number of “successes” in N outcomes of a binomial process.
A binomial process is characterized by

    N independent trials
    Only two outcomes, arbitrarily designated “success” and “failure”
    Probabilities of success and failure remain constant over trials

Many interesting real world processes only approximately meet the above specifications.
Nevertheless, the binomial is often an excellent approximation.





































































Characteristics of the Binomial Distribution

The binomial distribution is a two-parameter family: N is the number of trials, and p is the probability of success.
The binomial has probability mass function

$$\Pr(X = r) = \binom{N}{r} p^r (1 - p)^{N - r}$$

The mean and variance of the binomial are

$$E(X) = Np, \qquad \mathrm{Var}(X) = Np(1 - p)$$
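The binomial formula and its moments can be verified directly. This Python sketch evaluates the probabilities for every possible r and computes the mean and variance from them; the function name `binomial_pmf` and the values N = 10 and p = 0.3 are illustrative assumptions.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Pr(X = r) for a binomial variable with n trials and success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.3
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

total = sum(pmf)                                             # should be 1
mean = sum(r * pr for r, pr in enumerate(pmf))               # should equal N p = 3
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))  # N p (1 - p) = 2.1

print(round(total, 6), round(mean, 6), round(variance, 6))
```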













Normal Approximation to the Binomial

The B(N, p) distribution is well approximated by a N(Np, Np(1 − p)) distribution as long as p is not too far removed from .5 and N is reasonably large.
A good rule of thumb is that both Np and N(1 − p) must be greater than 5.
The approximation can be further improved by correcting for continuity.
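The effect of the continuity correction can be seen by comparing the exact binomial distribution function with its normal approximation. This is a minimal Python sketch; the function names and the values N = 100, p = 0.5, and k = 45 are illustrative assumptions. The correction evaluates the normal distribution function at k + 0.5 rather than at k.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact Pr(X <= k) for a binomial variable."""
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(k + 1))

def normal_approx_cdf(k, n, p, continuity=True):
    """Normal approximation to Pr(X <= k), optionally corrected for continuity."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5 if continuity else k)

n, p, k = 100, 0.5, 45   # N p = 50 and N (1 - p) = 50, both well above 5
exact = binomial_cdf(k, n, p)
corrected = normal_approx_cdf(k, n, p, continuity=True)
uncorrected = normal_approx_cdf(k, n, p, continuity=False)

print(round(exact, 4), round(corrected, 4), round(uncorrected, 4))
```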


























The Poisson Distribution

When events arrive without any systematic “clustering,” i.e., they arrive with a known average rate in a fixed time period but each event arrives at a time independent of the time since the last event, the exact integer number of events can be modeled with the Poisson distribution.
The Poisson is a single parameter family, the parameter being λ, the expected number of events in the interval of interest.
For a Poisson random variable X, the probability of exactly r events is

$$\Pr(X = r) = \frac{\lambda^r e^{-\lambda}}{r!}$$













Characteristics of the Poisson Distribution

The Poisson is used widely to model occurrences of low probability events.
A random variable X having a Poisson distribution with parameter λ has mean and variance given by

E (X ) = λ

Var(X ) = λ
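That the Poisson mean and variance both equal λ can be confirmed numerically from the probability formula. In this Python sketch, the function name `poisson_pmf`, the value λ = 4, and the truncation of the sum at r = 59 are illustrative assumptions; the truncation is harmless here because the omitted tail probability is negligible for λ = 4.

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """Pr(X = r) for a Poisson variable with expected count lam."""
    return lam ** r * exp(-lam) / factorial(r)

lam = 4.0
# Sum over r = 0..59; probabilities beyond that are vanishingly small for lam = 4.
pmf = [poisson_pmf(r, lam) for r in range(60)]

total = sum(pmf)
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))

print(round(total, 6), round(mean, 6), round(variance, 6))
```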




Sampling Distributions

As discussed in your introductory course, we frequently sample from a population and obtain a statistic as an estimate of some key quantity.
Over repeated samples, these estimates show variability.
This variability is like noise, degrading the signal that is the parameter.
The known or hypothetical sampling distribution of the statistic allows us to gauge how accurate our parameter estimate is (at least in the long run).
























Sampling Distributions: An Example

Suppose we take an opinion poll of N = 100 people at random, and 47% of them favor some position.
The question is, what does that tell us about the proportion of people in the population favoring the position?















In your introductory course, you learned as a simple consequence of the binomial distribution that if the population proportion is p, the sample proportion p̂ has a sampling distribution that is approximately normal, with mean p and variance p(1 − p)/N.
For any hypothesized value of p, this tells us, through our knowledge of the normal distribution, how likely we would be to observe a value of .47.
We can use this, in turn, to evaluate which values of p are “reasonable” in some sense.
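The reasoning for the poll can be sketched numerically. This Python sketch takes the hypothesized value p = .50 as an illustrative assumption, and it interprets "how likely" as the two-sided probability of a sample proportion at least as far from p as .47, which is also an assumption; the function name `sampling_distribution` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(p, n):
    """Approximate normal sampling distribution of the sample proportion."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

n = 100
p_hat = 0.47
p_hypothesized = 0.50            # illustrative hypothesized value

dist = sampling_distribution(p_hypothesized, n)

# Standard error is sqrt(.5 * .5 / 100) = .05, so .47 lies 0.6 SE below p.
z = (p_hat - dist.mean) / dist.stdev

# Two-sided probability of a sample proportion at least this far from p.
two_sided = 2 * dist.cdf(p_hat)

print(round(z, 2), round(two_sided, 4))
```

Under this hypothesized value an observation of .47 would be unremarkable, so p = .50 would count as a "reasonable" value in the sense described above.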




















Confidence Intervals

A confidence interval is a numerical interval constructed on the basis of data.
Such an interval is called a 95% (or .95) confidence interval if it is constructed so that it contains the true parameter value at least 95% of the time in the long run.
There are a variety of methods available for constructing confidence intervals.























Simple Normal Theory Confidence Intervals

In Psychology 310 we learned about simple symmetric confidence intervals based on the normal distribution.
If a statistic θ̂ used to estimate a parameter θ has a normal sampling distribution with mean θ and sampling variance Var(θ̂), then we may construct a 95% confidence interval for θ as

$$\hat{\theta} \pm 1.96 \sqrt{\mathrm{Var}(\hat{\theta})}$$

In general, a consistent estimator V̂ar(θ̂) may be substituted for Var(θ̂) in the above.
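As a worked instance of the interval above, the Python sketch below applies it to the earlier opinion poll (N = 100, sample proportion .47), substituting the estimated variance p̂(1 − p̂)/N for the true sampling variance. The function name `normal_theory_ci` is a hypothetical helper.

```python
from math import sqrt

def normal_theory_ci(estimate, variance, z=1.96):
    """Symmetric normal-theory interval: estimate +/- z * sqrt(variance)."""
    half_width = z * sqrt(variance)
    return estimate - half_width, estimate + half_width

p_hat, n = 0.47, 100

# Estimated sampling variance of the sample proportion.
estimated_variance = p_hat * (1 - p_hat) / n

lower, upper = normal_theory_ci(p_hat, estimated_variance)
print(round(lower, 3), round(upper, 3))
```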













Confidence Intervals on Linear Combinations

As we saw in Psychology 310, frequently linear combinations of parameters are of interest.
In that case, we can construct appropriate point estimates, standard errors, test statistics, and confidence intervals.
Methods are discussed in detail in the Psychology 310 handout, A Unified Approach to Some Common Statistical Tests.























Confidence Intervals Via Simulation

In some cases, we are interested in a function of parameters.
We know the distribution of individual parameter estimates, but we don’t have a convenient expression for the distribution of the function of the parameter estimates.
In this case, we can simulate the distribution of the function of parameter estimates using random number generation.
To generate the 95% confidence interval, we extract the .025 and .975 quantiles of the resulting simulated data.





























Confidence Intervals Via Simulation: An Example

Example (Confidence Intervals Via Simulation)

An example of the simulation approach can be found on page 20 of Gelman & Hill.
They assume that, with N = 500 per group, the distribution of the sample proportion can be approximated very accurately with a normal distribution.
In the problem of interest, the experimenter has observed sample proportions p̂1 and p̂2, each based on samples of 500.
However, the experimenter wishes to construct a confidence interval on p1/p2.



































The experimenter proceeds by constructing 10000 independent replications of p̂1 and 10000 replications of p̂2.

For each pair, the ratio p̂1/p̂2 is computed.
This creates a set of 10000 replications of the ratio of proportions.
The 95% confidence interval is then constructed from the .025 and .975 quantiles of this set of 10000 ratios.
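The procedure just described can be sketched as follows. This is a minimal Python sketch rather than the R code from Gelman & Hill; the observed proportions .75 and .50, the seed, and the function name `simulate_ratio_ci` are illustrative assumptions. Each replication draws p̂1 and p̂2 from their approximate normal sampling distributions and forms the ratio.

```python
import random
import statistics
from math import sqrt

def simulate_ratio_ci(p1_hat, p2_hat, n1, n2, n_sims=10_000, seed=0):
    """Simulation-based 95% interval for p1/p2 (hypothetical helper)."""
    rng = random.Random(seed)
    # Standard errors from the normal approximation to each sample proportion.
    se1 = sqrt(p1_hat * (1 - p1_hat) / n1)
    se2 = sqrt(p2_hat * (1 - p2_hat) / n2)
    # One replication of each proportion per simulated pair, then the ratio.
    ratios = [rng.gauss(p1_hat, se1) / rng.gauss(p2_hat, se2)
              for _ in range(n_sims)]
    # With n=40 the cut points fall at multiples of .025; the first and
    # last are the .025 and .975 quantiles.
    cuts = statistics.quantiles(ratios, n=40)
    return cuts[0], cuts[-1]

# Illustrative observed proportions, each based on a sample of 500.
lower, upper = simulate_ratio_ci(0.75, 0.50, 500, 500)
print(round(lower, 2), round(upper, 2))
```

The simulated interval is not forced to be symmetric around the observed ratio of 1.5, which is one reason to prefer it over a symmetric normal-theory interval for a ratio.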




































Hypothesis Testing

Gelman and Hill make a number of interesting points in their brief discussion.
They suggest viewing a hypothesis as a model about the data.
Testing the hypothesis involves comparing the behavior of the data with the data predicted by the model.
For example, if proportions are showing their standard random variation, this implies something about the size of that variation.
They examine this notion in an extensive example.



