Variance Reduction Techniques

Content
1 Antithetic Variables
2 Control Variates
3 Conditioning Sampling
4 Stratified Sampling (optional)
5 Importance Sampling
Introduction

Recall that we estimate the unknown quantity θ = E(X) by generating random variables X1, . . . , Xn and using the sample mean X̄ = (X1 + · · · + Xn)/n to estimate θ.

The mean square error is

MSE(X̄) = E[(X̄ − θ)^2] = Var(X̄) = Var(X)/n.

Hence, if we can obtain a different unbiased estimator of θ having a smaller variance than X̄, we obtain an improved estimator of θ = E(X).
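As a quick numerical check of the MSE formula above (not from the lecture; a Python sketch assuming X ~ U(0,1), so that θ = 0.5 and Var(X) = 1/12):

```python
import numpy as np

rng = np.random.default_rng(0)

n, runs = 100, 20000
# X ~ U(0,1): theta = E(X) = 0.5 and Var(X) = 1/12
xbars = rng.random((runs, n)).mean(axis=1)   # one sample mean per run

mse = np.mean((xbars - 0.5) ** 2)
print(mse)   # close to Var(X)/n = 1/1200 ≈ 0.000833
```

The empirical MSE of the sample mean matches Var(X)/n, which is the baseline that the techniques in this chapter try to beat.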
Antithetic Variables
The Use of Antithetic Variables
Suppose X1 and X2 are identically distributed random variables with mean θ. Then

Var((X1 + X2)/2) = (1/4)[Var(X1) + Var(X2) + 2 Cov(X1, X2)].

If X1 and X2, rather than being independent, were negatively correlated, the variance of the average would be reduced.
The Use of Antithetic Variables
Suppose we simulate U1, U2, . . . , Um, which are uniform random numbers. Then V1 = 1 − U1, . . . , Vm = 1 − Um would also be uniform random numbers. Therefore, each pair (Ui, Vi) is negatively correlated.
Actually, it can be proven that if X1 = h(U1, . . . , Um) and X2 = h(V1, . . . , Vm), where h is a monotone function (either increasing or decreasing) of each coordinate, then X1 and X2 have the same distribution and are negatively correlated. (Proof: Appendix 8.10, Page 210)
The Use of Antithetic Variables

How do we arrange for X1 and X2 to be negatively correlated?

Step 1: Set X1 = h(U1, . . . , Um), where U1, . . . , Um are i.i.d. ~ U(0,1) and h is a monotone function of each of its coordinates.

Step 2: Set X2 = h(1 − U1, . . . , 1 − Um), which has the same distribution as X1.
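The two steps can be sketched in Python (an illustrative translation, since the lecture's own code is Matlab; the helper name is ours):

```python
import numpy as np

def antithetic_estimate(h, m, n, rng):
    """Estimate E[h(U1,...,Um)] by averaging h(U) with h(1-U).

    h should be monotone in each coordinate for the pair to be
    negatively correlated.
    """
    u = rng.random((n, m))
    x1 = h(u)           # Step 1: X1 = h(U1,...,Um)
    x2 = h(1.0 - u)     # Step 2: X2 = h(1-U1,...,1-Um), same distribution as X1
    return ((x1 + x2) / 2.0).mean()

rng = np.random.default_rng(1)
# e.g. h(u) = e^u with m = 1; the true value is e - 1 ≈ 1.7183
est = antithetic_estimate(lambda u: np.exp(u[:, 0]), 1, 10000, rng)
print(est)
```

Each pair (X1, X2) is averaged before the overall mean is taken, so one pair counts as a single observation.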
What Does “Antithetic” Mean?
“Antithetic” means: opposed to, the opposite of, or negatively correlated with.
The idea is to determine the value of an output variable at random,
then determine its antithetic value (which comes from the opposite part of its distribution),
then form the average of these two values, using the average as a single observation.
Advantages:
- The estimator has smaller variance (at least when h is a monotone function).
- We save the time of generating a second set of random numbers.
Example 1

Suppose we were interested in using simulation to estimate

θ = ∫_0^1 e^x dx = E[e^U], where U ~ U(0, 1).

If X1 = e^(U1) and X2 = e^(U2), where U1, U2 are i.i.d. ~ U(0, 1), we have

Var((X1 + X2)/2) = Var(e^U)/2 ≈ 0.1210,

where Var(e^U) = E[e^(2U)] − (E[e^U])^2 = (e^2 − 1)/2 − (e − 1)^2 ≈ 0.2420.

h(u) = e^u is clearly a monotone function. Taking instead X1 = e^U and X2 = e^(1−U), where U ~ U(0, 1):

Cov(X1, X2) = E[e^U · e^(1−U)] − E[e^U] E[e^(1−U)] = e − (e − 1)^2 ≈ −0.2342,

Var((X1 + X2)/2) = [Var(e^U) + Cov(X1, X2)]/2 ≈ (0.2420 − 0.2342)/2 ≈ 0.0039.

The variance reduction is 1 − 0.0039/0.1210 ≈ 96.7 percent.
Example 1: Matlab code

n=1000;
u=rand(n,1);
x=exp([u;1-u]);         % antithetic variables
theta=sum(x)/(2*n)      % estimator using antithetic variables
u0=rand(n,1);
x0=exp([u;u0]);         % independent variables
theta0=sum(x0)/(2*n)    % estimator using independent variables
Example 1: Matlab code

n=1000; m=1000;
u=rand(n,m);
x=exp([u;1-u]);                 % antithetic variables
theta=sum(x)/(2*n);
true=exp(1)-1;                  % the true value is e-1=1.7183
mseav=sum((theta-true).^2)/m    % mean square error, antithetic
u0=rand(n,m);
x0=exp([u;u0]);                 % independent variables
theta0=sum(x0)/(2*n);
mse0=sum((theta0-true).^2)/m    % mean square error, independent
reduction=1-mseav/mse0
Example 2

Estimate the value of the definite integral

θ = ∫_0^∞ x^0.9 e^(−x) dx.

Solution: First, we generate values from the probability density function f(x) = e^(−x). This is done by setting Xi = −ln Ui, where Ui, i = 1, . . . , n, are random numbers with Ui ~ U(0, 1).

An antithetic variable is [−ln(1 − Ui)]^0.9, so an unbiased combined estimator is

θ̂ = (1/n) Σ_{i=1}^n (1/2){[−ln(Ui)]^0.9 + [−ln(1 − Ui)]^0.9}.
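This estimator can be sketched in Python (an illustrative translation; the lecture's other examples use Matlab). The true value of the integral is Γ(1.9) ≈ 0.9618:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100000
u = rng.random(n)

# X = -ln(U) ~ Exp(1), so the target integral is E[X^0.9]
x  = (-np.log(u)) ** 0.9          # [-ln(U_i)]^0.9
xa = (-np.log(1.0 - u)) ** 0.9    # antithetic partner [-ln(1-U_i)]^0.9
theta_hat = ((x + xa) / 2.0).mean()
print(theta_hat)
```

Since −ln(u) is monotone decreasing in u and t^0.9 is increasing, the composite map is monotone, so the antithetic pair is negatively correlated.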
Additional Example

Consider the case where we want to estimate E(X^2), where X ~ Normal(2, 1). How can we use antithetic variables to estimate it and reduce the variance?

n_sim=5000;                     % set the number of simulations
out1=(2+randn(1,n_sim)).^2;     % generate Normal(2,1) variables and square them
mean(out1)
var(out1)

out2_1=2+randn(1,n_sim/2);      % now use antithetic variables
out2_2=4-out2_1;                % reflection about the mean: 4-X ~ Normal(2,1)
out2=0.5*(out2_1.^2 + out2_2.^2);
mean(out2)
var(out2)

% how much variance is reduced?
var_reduction=(var(out1)/n_sim - var(out2)/(n_sim/2))/(var(out1)/n_sim)
Control Variates
The Use of Control Variates

• Assume the desired simulation quantity is θ = E[X], and there is another simulation r.v. Y with known mean μy = E[Y].
• Then for any given constant c, the quantity

X + c(Y − μy)

is also an unbiased estimator of θ.
• Consider its variance:

Var(X + c(Y − μy)) = Var(X) + c^2 Var(Y) + 2c Cov(X, Y).

• It can be shown that this variance is minimized when c is equal to

c* = −Cov(X, Y)/Var(Y).
The Use of Control Variates

• The variance of the new estimator is:

Var(X + c*(Y − μy)) = Var(X) − [Cov(X, Y)]^2/Var(Y).

• Y is called the control variate for the simulation estimator X.
• We can re-express this by dividing both sides by Var(X):

Var(X + c*(Y − μy))/Var(X) = 1 − [Corr(X, Y)]^2,

where Corr(X, Y) = Cov(X, Y)/sqrt(Var(X) Var(Y)) is the correlation between X and Y.

The variance is therefore reduced by 100[Corr(X, Y)]^2 percent.
The Use of Control Variates

Over n runs, the controlled estimator is

X̄ + c*(Ȳ − μy),

and its variance is given by

Var(X̄ + c*(Ȳ − μy)) = (1/n)[Var(X) − [Cov(X, Y)]^2/Var(Y)].

Note: the goal is to choose Y so that Y ≈ X, with Y easy to simulate and μY easy to find.
Estimation

If Cov(X, Y) and Var(Y) are unknown in advance, use the estimators

Ĉov(X, Y) = (1/(n − 1)) Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ) and V̂ar(Y) = (1/(n − 1)) Σ_{i=1}^n (Yi − Ȳ)^2,

giving the approximation of c*:

ĉ* = −Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ) / Σ_{i=1}^n (Yi − Ȳ)^2.
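Estimating c* from the simulated data can be sketched in Python (illustrative; it reuses the setting of Example 3 below, X = e^U with control Y = U, whose exact c* is −1.6903):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100000
u = rng.random(n)

x = np.exp(u)      # simulation output; target is E[e^U] = e - 1
y = u              # control variate with known mean mu_y = 1/2
mu_y = 0.5

# sample-based approximation of c* = -Cov(X, Y) / Var(Y)
c_hat = -np.cov(x, y, ddof=1)[0, 1] / np.var(y, ddof=1)
controlled = x + c_hat * (y - mu_y)
print(c_hat)               # close to the exact value -1.6903
print(controlled.mean())   # close to e - 1 ≈ 1.7183
```

Estimating c* from the same sample introduces a small bias, which vanishes as n grows.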
Several Variables as a Control

We can use more than a single variable as a control. For example, if a simulation results in output variables Yi, i = 1, . . . , k, with known means E[Yi] = μi, then for any constants ci, i = 1, . . . , k, we may use

X + Σ_{i=1}^k ci(Yi − μi)

as an unbiased estimator of E[X].
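A Python sketch with k = 2 controls (our choice of example, not from the lecture): X = e^U with Y1 = U and Y2 = U^2, whose means 1/2 and 1/3 are known. The optimal coefficients solve the linear system Cov(Y)c = −Cov(Y, X), estimated here from the sample:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100000
u = rng.random(n)

x   = np.exp(u)                         # target E[e^U] = e - 1
ys  = np.vstack([u, u ** 2])            # k = 2 controls, shape (2, n)
mus = np.array([1.0 / 2.0, 1.0 / 3.0])  # known means E[U], E[U^2]

# optimal coefficients solve Cov(Y) c = -Cov(Y, X)
cov = np.cov(np.vstack([ys, x]))        # 3x3 sample covariance matrix
c = -np.linalg.solve(cov[:2, :2], cov[:2, 2])
controlled = x + c @ (ys - mus[:, None])
print(controlled.mean())                # close to e - 1 ≈ 1.7183
```

With several controls, finding the optimal coefficients amounts to a linear regression of X on the Yi.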
Example 3

Estimate θ = ∫_0^1 e^x dx = E[e^U].

A natural control variate is the random number U itself: X = e^U, Y = U, where U ~ U(0, 1).

Cov(X, Y) = E[U e^U] − E[U] E[e^U] = 1 − (1/2)(e − 1) ≈ 0.14086,

Var(Y) = Var(U) = 1/12, then

c* = −Cov(X, Y)/Var(Y) = −12 × 0.14086 = −1.6903.

Thus, the controlled estimator is:

θ̂ = (1/n) Σ_{i=1}^n [e^(ui) − 1.6903(ui − 0.5)].

Its variance is

Var(X + c*(Y − μy)) = Var(e^U) − [Cov(X, Y)]^2/Var(Y) ≈ 0.2420 − 0.2381 = 0.0039.

From Example 1, Var(e^U) = 0.2420. The variance reduction is 1 − 0.0039/0.2420 = 98.4 percent.
Example 3
n=1000;m=1000;y=rand(n,m); %control variatex=exp(y);c=-1.6903;z=x+c*(y-0.5); % X + c (Y - μy)theta=sum(z)/(n); true=exp(1)-1;msecv=sum((theta-true).^2)/m; % mean square
error theta0=sum(x)/(n);mse0=sum((theta0-true).^2)/m;reduction=1-msecv/mse0
Example 3: Matlab code
Example 4:

Suppose we wanted to estimate θ = E[e^(U^2)], where U ~ U(0, 1).

a) Explain how control variables may be used to estimate θ.
b) Do 100 simulation runs, using the control given in (a), to estimate first c* and then the variance of the estimator.
c) Explain how to use antithetic variables to estimate θ. Using the same data as in (b), determine the variance of the antithetic variable estimator.
d) Which of the two types of variance reduction techniques worked better in this example?
Example 4:

a) Let X = e^(U^2), so that θ = E(e^(U^2)). One possible choice of control variable is Y = U^2. The expected value of Y is E[Y] = E[U^2] = 1/3.

So we can use the unbiased estimator of θ given by

θ̂ = (1/n) Σ_{i=1}^n [e^(Ui^2) + c*(Ui^2 − 1/3)],

where c* = −Cov(X, Y)/Var(Y).
Example 4:

b) The following Matlab program can be used to answer the question:

m = 100;
U = rand(1,m);
Y = U.^2; Ybar = 1/3;
X = exp(Y); Xbar = sum(X)/m;
A = sum((X-Xbar).*(Y-Ybar));
B = sum((Y-Ybar).^2);
C = sum((X-Xbar).^2);
CovXY = A/(m-1);
VarY = B/(m-1);
VarX = C/(m-1);
c = -A/B;
% Estimator:
Xc = X + c*(Y-Ybar);
Xcbar = sum(Xc)/m;
% Variance of estimator:
VarXc = (VarX - CovXY^2/VarY)/m

One run of the above program gave the following: the estimated value of c* was −1.5950, and the variance of the estimator Xc was Var(Xc) = 4.5860 × 10^−5.
Example 4:

c) The antithetic variable estimator can be:

Xa = (1/2)[e^(U^2) + e^((1−U)^2)].

Matlab code:

% Antithetic estimator
Xa = (exp(U.^2)+exp((1-U).^2))/2;
Xabar = sum(Xa)/m;
VarXa = var(Xa)/m

The variance of the antithetic variable estimator (using the same U) was Var(Xa) = 2.7120 × 10^−4.

d) Comparing the variances in parts (b) and (c), it is clear that the control variable method is better in this example.
Conditioning sampling
Variance Reduction by Conditioning

Review of conditional expectation: E[X|Y] denotes the function of the random variable Y whose value at Y = y is E[X|Y = y].

If X and Y are jointly discrete random variables,

E[X|Y = y] = Σ_x x P{X = x | Y = y}.

If X and Y are jointly continuous with joint p.d.f. f(x, y),

E[X|Y = y] = ∫ x f(x, y) dx / ∫ f(x, y) dx.
Variance Reduction by Conditioning

• Recall the law of conditional expectations (textbook Page 34):

E[X] = E[E[X|Y]].

This implies that the estimator E[X|Y] is also an unbiased estimator of θ.

• Now, recall the conditional variance formula (textbook Page 34):

Var(X) = E[Var(X|Y)] + Var(E[X|Y]).

Clearly, both terms on the right are non-negative, so we have

Var(E[X|Y]) ≤ Var(X).

This implies that the conditioning estimator has a smaller (or at worst equal) variance.
Variance Reduction by Conditioning
Procedure:
Step 1: Generate the r.v. Y = yi, i = 1, . . . , n.
Step 2: Compute the (conditional) expected value of X given Y: E[X | yi].
Step 3: An unbiased estimate of θ is (1/n) Σ_{i=1}^n E[X | yi].
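The three steps can be sketched in Python on a hypothetical example (our choice, not from the lecture): Y ~ Exp(1) and, given Y = y, X ~ Poisson(y), so E[X|Y] = Y and θ = E[X] = E[Y] = 1:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100000

y = rng.exponential(1.0, n)        # Step 1: generate Y = y_i
cond_est = y.mean()                # Steps 2-3: average of E[X | Y=y_i] = y_i
raw_est = rng.poisson(y).mean()    # raw estimator: average of the X_i themselves

print(raw_est, cond_est)
# Var(X) = E[Var(X|Y)] + Var(E[X|Y]) = 1 + 1 = 2, while
# Var(E[X|Y]) = Var(Y) = 1: conditioning halves the variance here.
```

The conditioning estimator skips simulating X entirely, which is the typical pattern: replace the noisy output by its exact conditional mean.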
Example 5: Estimate π

Recall the simulation introduced in Chapter 1: Vi = 2Ui − 1, i = 1, 2, where Ui ~ U(0, 1). Set

I = 1 if V1^2 + V2^2 ≤ 1, and I = 0 otherwise.

Then E[I] = π/4. Use E[I|V1] rather than I to estimate π/4.

[Figure: the unit circle inscribed in the square [−1, 1] × [−1, 1].]

Given V1 = v, the point falls in the circle when |V2| ≤ (1 − v^2)^(1/2), which has probability (1 − v^2)^(1/2). Hence E[I|V1] = (1 − V1^2)^(1/2), and we use (1 − V1^2)^(1/2) as the estimator.
Example 5: Estimate π

The variance: I is a Bernoulli r.v. with mean π/4, so

Var(I) = (π/4)(1 − π/4) ≈ 0.1686,

while

Var((1 − V1^2)^(1/2)) = E[1 − V1^2] − (π/4)^2 = 2/3 − (π/4)^2 ≈ 0.0498.

The conditioning results in a 70.44 percent reduction in variance (1 − 0.0498/0.1686 ≈ 0.7044).
Example 5: Estimate π

Procedure 2:
Step 1: Generate Vi = 2Ui − 1, i = 1, . . . , n, where Ui i.i.d. ~ U(0, 1).
Step 2: Evaluate each (1 − Vi^2)^(1/2) and take the average of all these values to estimate π/4.

Matlab program for comparison of the two simulation procedures:

n=1000; m=1000;
u1=rand(n,m);
v1=2*u1-1;
% ------------ raw simulation ------------
v2=2*rand(n,m)-1;
s=v1.^2+v2.^2<=1;
theta0=4*sum(s)/(n);
mse0=sum((theta0-pi).^2)/m;
% ------------ conditioning sampling ------------
v=(1-v1.^2).^0.5;
theta=4*sum(v)/n;
msecv=sum((theta-pi).^2)/m;
% reduction in variance
reduction=1-msecv/mse0
Example 6:

Suppose that Y ~ Exp(1) and that, conditional on Y = y, X ~ N(y, 4). How do we estimate θ = P{X > 1}?

Raw simulation:
Step 1: Generate Y = −log(U), where U ~ U(0, 1).
Step 2: If Y = y, generate X ~ N(y, 4).
Step 3: Set I = 1 if X > 1, and I = 0 otherwise; then E[I] = θ.

Example 6: Can we express the exact value of E[I | Y = y] in terms of y?

Improvement: If Y = y, (X − y)/2 is a standard normal r.v. Then

E[I | Y = y] = P{X > 1 | Y = y} = P{(X − y)/2 > (1 − y)/2} = 1 − Φ((1 − y)/2),

where Φ(x) denotes the standard normal c.d.f. Therefore, the average value of 1 − Φ((1 − Yi)/2) obtained over many runs is superior to the raw simulation estimator.
Example 6: Procedure 2:
Step 1: Generate Yi = −ln(Ui), i = 1, . . . , n, where Ui i.i.d. ~ U(0, 1).
Step 2: Evaluate each 1 − Φ((1 − Yi)/2) and take the average of all these values to estimate θ.

Matlab code:

n=1000;
EIy=zeros(1,n);
for i=1:n
    y=exprnd(1);
    EIy(i)=1-normcdf((1-y)/2);
end
theta=mean(EIy)
Further Improvement: Using Antithetic Variables

Example 6: Can we use antithetic variables to improve the simulation?

Because the conditional expectation estimator 1 − Φ((1 − y)/2) is monotone in y, the simulation can be improved by using antithetic variables.

Example 6: Procedure 3:
Step 1: Generate U1, U2, . . . , Um and 1 − U1, 1 − U2, . . . , 1 − Um.
Step 2: Evaluate

(1/2){[1 − Φ((1 + log ui)/2)] + [1 − Φ((1 + log(1 − ui))/2)]}, i = 1, . . . , m,

and average all these values to estimate θ.
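Procedure 3 can be sketched in Python (an illustrative translation; Φ is written with math.erf to stay in the standard library):

```python
import numpy as np
from math import erf

# standard normal c.d.f.: Phi(t) = (1 + erf(t / sqrt(2))) / 2
Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / np.sqrt(2.0))))

rng = np.random.default_rng(6)
m = 50000
u = rng.random(m)

# conditional estimator E[I | Y=y] = 1 - Phi((1-y)/2) with y = -log(u),
# averaged with its antithetic partner at 1-u
g = lambda v: 1.0 - Phi((1.0 + np.log(v)) / 2.0)
theta_hat = ((g(u) + g(1.0 - u)) / 2.0).mean()
print(theta_hat)   # estimate of P{X > 1}
```

Conditioning and antithetic variables stack cleanly here because the conditional estimator is a deterministic monotone function of the single uniform U.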
Example 7:

The random variable

S = Σ_{i=1}^N Xi

is said to be a compound random variable if N is a nonnegative integer-valued random variable and X1, X2, . . . is a sequence of i.i.d. positive r.v.s that are independent of N.
In an insurance application, Xi could represent the amount of the ith claim made to an insurance company, and N could represent the number of claims made by some specified time t; S would then be the total claim amount made by time t. In such applications, N is often assumed to be a Poisson random variable (in which case S is called a compound Poisson random variable).
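The compound sum S is easy to simulate directly; a Python sketch (illustrative, using insurance-style parameters in the spirit of the example that follows: Poisson N with mean 300 and exponential claims with mean 1000):

```python
import numpy as np

def compound_poisson(lam, claim_mean, runs, rng):
    """Simulate S = X_1 + ... + X_N with N ~ Poisson(lam), X_i ~ Exp(mean=claim_mean)."""
    counts = rng.poisson(lam, runs)
    return np.array([rng.exponential(claim_mean, k).sum() for k in counts])

rng = np.random.default_rng(8)
s = compound_poisson(lam=300, claim_mean=1000.0, runs=2000, rng=rng)
print(s.mean())   # E[S] = E[N] E[X] = 300 * 1000 = 300000
```

An empty claim sequence (N = 0) contributes S = 0, which the empty-array sum handles automatically.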
Example 7:

Suppose that we want to use simulation to estimate p = P{S > c} for a given constant c.

Raw simulation:
Step 1: Generate N, say N = n.
Step 2: Generate the values of X1, . . . , Xn.
Step 3: Set I = 1 if X1 + · · · + Xn > c, and I = 0 otherwise; then E[I] = p.
Example 7:

Improvement by conditioning: Introduce the random variable

M = min{m : X1 + · · · + Xm > c}.

What is E[I | M = m]? We can prove:

E[I | M = m] = P{S > c | M = m} = P{N ≥ m},

since N and M are independent. Thus, given M = m, the value of E[I | M = m] is P{N ≥ m}. Since the distribution of N is known (specifically, a Poisson distribution), the probability P{N ≥ m} is easy to find.

Procedure of simulation improved by conditioning:
Step 1: Generate the Xi in sequence, stopping at the first m for which Sm = X1 + · · · + Xm > c.
Step 2: Calculate P{N ≥ m} as the estimate of p from this run.
Example 7:

Suppose that N is a Poisson r.v. with a rate of 10 claims per day, the amount of a claim X is an exponential r.v. with mean $1000, and c = $325000. Estimate by simulation the probability that the total amount claimed within 30 days exceeds c.

Matlab code:

n=100;
c=325000;
I=zeros(1,n);
for i=1:n
    s=0; m=0;
    while s<c
        x=exprnd(1000);        % exponential with mean 1000
        s=s+x;
        m=m+1;
    end
    p=1-poisscdf(m-1,300);     % Poisson: rate 10 per day over 30 days
    I(i)=p;
end
p_bar=sum(I)/n