K. H. Ko School of Mechatronics Gwangju Institute of Science and Technology
Photo-realistic Rendering and Global Illumination in Computer Graphics
Spring 2012
Monte Carlo Method
K. H. Ko
School of Mechatronics, Gwangju Institute of Science and Technology
2
Brief History Comte de Buffon in 1777
He conducted an experiment in which a needle of length L was thrown at random on a horizontal plane with lines drawn at a distance d apart (d > L).
He repeated the experiment many times to estimate the probability P that the needle would intersect one of these lines.
Laplace suggested that this technique of repeated experimentation could be used to compute an estimated value of pi.
The term “Monte Carlo” was coined in the 1940s, at the advent of electronic computing, to describe mathematical techniques that use statistical sampling to simulate phenomena or evaluate values of functions. These techniques were originally devised to simulate neutron
transport by a group of scientists working on nuclear weapons.
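Buffon's needle experiment translates directly into a simulation. The sketch below is not from the slides (the function name and parameters are illustrative); it uses the known hit probability P = 2L/(πd) to recover an estimate of π from the hit count.

```python
import math
import random

def estimate_pi_buffon(n_throws, needle_len=1.0, line_gap=2.0, seed=0):
    """Estimate pi via Buffon's needle (requires needle_len <= line_gap)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_throws):
        # Distance from the needle centre to the nearest line, and needle angle.
        x = rng.uniform(0.0, line_gap / 2.0)
        theta = rng.uniform(0.0, math.pi / 2.0)
        # The needle crosses a line when x <= (L/2) * sin(theta).
        if x <= (needle_len / 2.0) * math.sin(theta):
            hits += 1
    if hits == 0:
        return float("nan")
    # P(hit) = 2L / (pi * d), so pi ~ 2L * N / (d * hits).
    return (2.0 * needle_len * n_throws) / (line_gap * hits)
```

Repeating the throw many times and counting intersections is exactly the repeated experimentation Laplace suggested.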
3
Why are Monte Carlo Techniques Useful? Overall steps of Monte Carlo Techniques
Given the problem of computing the integral of a function with respect to an appropriately defined measure over a domain.
The Monte Carlo approach would be to define a random variable such that the expected value of that random variable would be the solution to the problem.
Samples of this random variable are then drawn and averaged to compute an estimate of the expected value of the random variable.
This estimated expected value is an approximation to the solution of the given problem.
4
Why are Monte Carlo Techniques Useful? Advantages
The conceptual simplicity. Given an appropriate random variable, the computation
consists of sampling the random variable and averaging the estimates obtained from the sample.
Can be applied to a wide range of problems:
Problems that are stochastic in nature: transport problems in nuclear physics.
Problems that require the higher-dimensional integration of complicated functions.
Often Monte Carlo techniques are the only feasible solution.
5
Why are Monte Carlo Techniques Useful? Disadvantage
Relatively slow convergence rate: the error decreases as 1/sqrt(N), where N is the number of samples.
Several variance reduction techniques have been proposed to accelerate the convergence.
Even so, Monte Carlo methods are typically not used when viable alternatives exist.
BUT!!! There are problems for which Monte Carlo methods are the only feasible solution technique: higher-dimensional integrals and integrals with nonsmooth integrands.
6
Review of Probability Theory A Monte Carlo process is a sequence of random events.
A numerical outcome can be associated with each possible event.
When a fair die is thrown, the outcome could be any value from 1 to 6.
A random variable describes the possible outcomes of an experiment.
7
Review of Probability Theory Discrete Random Variables
When a random variable can take a finite number of possible values, it is called a discrete random variable.
A probability pi can be associated with any event with outcome xi.
A random variable x_die might be said to have a value of 1 to 6 associated with each of the possible outcomes of the throw of the die.
The probability pi associated with each outcome for a fair die is 1/6.
8
Review of Probability Theory Discrete Random Variables
Some properties of the probabilities pi are:
The probability of an event lies between 0 and 1: 0 ≤ pi ≤ 1. If an outcome never occurs, its probability is 0. If an event always occurs, its probability is 1.
The probability that either of two events occurs satisfies: Pr(Event1 or Event2) ≤ Pr(Event1) + Pr(Event2).
Two events are mutually exclusive if and only if the occurrence of one of the events implies the other event cannot possibly occur. For mutually exclusive events: Pr(Event1 or Event2) = Pr(Event1) + Pr(Event2).
A set of all the possible events/outcomes of an experiment such that the events are mutually exclusive and collectively exhaustive satisfies the normalization property: Σi pi = 1.
9
Review of Probability Theory Expected Value
For a discrete random variable with n possible outcomes, the expected value, or mean, of the random variable is
E[x] = Σ_{i=1}^{n} pi xi
10
Review of Probability Theory Variance and Standard Deviation
The variance is a measure of the deviation of the outcomes from the expected value of the random variable.
The standard deviation is the square root of the variance.
σ² = E[(x − E[x])²] = Σ_i (xi − E[x])² pi
or
σ² = E[(x − E[x])²] = E[x²] − (E[x])² = Σ_i xi² pi − (Σ_i xi pi)²
11
Review of Probability Theory Functions of Random Variables
Consider a function f(x), where x takes values xi with probabilities pi.
x is a random variable. f(x) is also a random variable whose expected value or mean is defined as
E[f(x)] = Σ_{i=1}^{n} pi f(xi)
12
Review of Probability Theory Functions of Random Variables
The variance of the function f(x) is defined similarly as
σ² = E[(f(x) − E[f(x)])²]
13
Review of Probability Theory Continuous Random Variables
Probability Density Function
For a real-valued (continuous) random variable x, a probability density function (PDF) p(x) is defined such that the probability that the variable takes a value in the interval [x, x+dx] equals p(x)dx.
Cumulative Distribution Function (CDF)
It provides a more intuitive definition of probabilities for continuous variables:
P(y) = Pr(x ≤ y) = ∫_{−∞}^{y} p(x) dx
14
Review of Probability Theory Continuous Random Variables
The CDF gives the probability with which an event occurs with an outcome whose value is less than or equal to the value y.
The CDF P(y) is a nondecreasing function. The CDF P(y) is non-negative over the domain
of the random variable.
15
Review of Probability Theory The PDF p(x) has the following properties:
p(x) ≥ 0 over the domain of the random variable; ∫ p(x) dx = 1; and Pr(a ≤ x ≤ b) = ∫_a^b p(x) dx.
16
Review of Probability Theory Expected Value
Similar to the discrete-valued case, the expected value of a continuous random variable x is given as:
E[x] = ∫ x p(x) dx
Consider some function f(x), where p(x) is the probability density function of the random variable x.
Since f(x) is also a random variable, its expected value is
E[f(x)] = ∫ f(x) p(x) dx
17
Review of Probability Theory Variance and Standard Deviation
σ² = E[(x − E[x])²] = ∫ (x − E[x])² p(x) dx
σ² = E[x²] − (E[x])² = ∫ x² p(x) dx − (∫ x p(x) dx)²
18
Review of Probability Theory Conditional and Marginal Probabilities
Consider a pair of random variables x and y. For discrete random variables, pij specifies
the probability that x takes a value of xi and y takes a value of yj.
Similarly, a joint probability distribution function p(x,y) is defined for continuous random variables.
19
Review of Probability Theory Conditional and Marginal Probabilities
The marginal density function of x is defined as
p(x) = ∫ p(x, y) dy
The conditional density function p(y|x) is the probability density of y given some x:
p(y|x) = p(x, y) / p(x) = p(x, y) / ∫ p(x, y) dy
20
Review of Probability Theory Conditional and Marginal Probabilities
The conditional expectation of a random function g(x, y) is computed as:
E[g|x] = ∫ g(x, y) p(y|x) dy = ∫ g(x, y) p(x, y) dy / ∫ p(x, y) dy
21
Monte Carlo Integration Assume that we have some function f(x) defined over the domain x ∈ [a, b].
We would like to evaluate the integral
I = ∫_a^b f(x) dx
For one-dimensional integration, Monte Carlo is typically not used.
22
Monte Carlo Integration Weighted Sum of Random Variables
Consider a function G that is the weighted sum of N random variables g(x1), ..., g(xN).
Each of the xi has the same probability distribution function p(x).
xi: independent identically distributed variables.
Let gj denote the function g(xj):
G = Σ_{j=1}^{N} wj gj
23
Monte Carlo Integration Weighted Sum of Random Variables
The linearity property holds:
E[G(x)] = Σ_j wj E[gj(x)]
Consider the case where the weights wj are the same and all add to 1. When N functions are added together, wj = 1/N.
24
Monte Carlo Integration Weighted Sum of Random Variables
The expected value of G(x) is
E[G(x)] = E[(1/N) Σ_{i=1}^{N} g(xi)] = (1/N) Σ_{i=1}^{N} E[g(xi)] = E[g(x)]
The expected value of G is the same as the expected value of g(x).
G can be used to estimate the expected value of g(x).
G is called an estimator of the expected value of the function g(x).
25
Monte Carlo Integration Weighted Sum of Random Variables
The variance of G is
σ²[G(x)] = σ²[ Σ_{i=1}^{N} g(xi)/N ]
Variance, in general, satisfies the following equation, with the covariance Cov[x, y] given as
σ²[x + y] = σ²[x] + σ²[y] + 2 Cov[x, y]
Cov[x, y] = E[xy] − E[x]E[y]
For independent random variables, Cov[x, y] = 0.
26
Monte Carlo Integration Weighted Sum of Random Variables
The following property holds for any constant a:
σ²[a x] = a² σ²[x]
Using the fact that the xi in G are independent identically distributed variables, the variance for G is
σ²[G(x)] = Σ_{i=1}^{N} σ²[ g(xi)/N ]
27
Monte Carlo Integration Weighted Sum of Random Variables
So,
σ²[G(x)] = Σ_{i=1}^{N} (1/N²) σ²[g(xi)] = (1/N) σ²[g(x)]
28
Monte Carlo Integration Weighted Sum of Random Variables
As N increases, the variance of G decreases with N, making G an increasingly good estimator of E[g(x)].
The standard deviation σ decreases as 1/sqrt(N).
29
Monte Carlo Integration Estimator
The Monte Carlo approach to computing the integral is to consider N samples to estimate the value of the integral.
The samples are selected randomly over the domain of the integral with probability distribution function p(x).
The estimator is denoted as <I> and is
N
i i
i
xpxf
NI
1 )()(1
30
Monte Carlo Integration Estimator
The expected value of the estimator is computed as follows:
E[<I>] = (1/N) Σ_{i=1}^{N} E[f(xi)/p(xi)] = ∫ (f(x)/p(x)) p(x) dx = ∫ f(x) dx = I
31
Monte Carlo Integration Estimator
The variance of this estimator is
σ² = (1/N) ∫ ( f(x)/p(x) − I )² p(x) dx
As N increases, the variance decreases linearly with N.
The error in the estimator is proportional to the standard deviation σ; the standard deviation decreases as 1/sqrt(N).
One problem with Monte Carlo is the slow convergence of the estimator to the right solution: four times more samples are required to decrease the error of the Monte Carlo computation by half.
32
Monte Carlo Integration Example of Simple Monte Carlo Integration
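The example itself is not reproduced in this transcript; a minimal sketch of simple Monte Carlo integration with uniform sampling (the function name is illustrative) follows the estimator definition above:

```python
import random

def mc_integrate(f, a, b, n, seed=0):
    """Basic Monte Carlo estimator for I = integral of f over [a, b].

    Uniform sampling on [a, b] has p(x) = 1/(b - a), so each sample
    contributes f(x_i) / p(x_i) = (b - a) * f(x_i).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += f(x)
    return (b - a) * total / n
```

For instance, `mc_integrate(lambda x: x * x, 0.0, 1.0, 100000)` estimates ∫_0^1 x² dx = 1/3, with an error on the order of 1/sqrt(N).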
33
Monte Carlo Integration Bias
When the expected value of the estimator is exactly the value of the integral I, the estimator is said to be unbiased.
An estimator that does not satisfy this property is said to be biased.
The difference between the expected value of the estimator and the actual value of the integral is called bias: B[<I>] = E[<I>] – I.
The total error on the estimate is typically represented as the sum of the standard deviation and the bias.
34
Monte Carlo Integration Bias
A biased estimator is called consistent if the bias vanishes as the number of samples increases. limN->∞ B[<I>] = 0.
35
Monte Carlo Integration Accuracy
There exist two theorems which explain how the error of the Monte Carlo estimator reduces as the number of samples increases.
These error bounds are probabilistic in nature.
Chebyshev’s Inequality
The probability that a sample deviates from the expected value by more than sqrt(σ²/δ), where δ is an arbitrary positive number, is smaller than δ: Pr[ |x − E[x]| ≥ sqrt(σ²/δ) ] ≤ δ.
36
Monte Carlo Integration Accuracy
Assuming an estimator that averages N samples and has a well-defined variance σ², the variance of the estimator is σ²/N.
37
Monte Carlo Integration Accuracy
The Central Limit Theorem gives an even stronger statement about the accuracy of the estimator.
As N->∞, the Central Limit Theorem states that the values of the estimator have a normal distribution.
Therefore, as N->∞, the computed estimate lies in a narrower region around the expected value of the integral with higher probability.
It only applies when N is large enough. How large N should be is not clear.
38
Monte Carlo Integration Estimating the Variance
The variance of the Monte Carlo estimator can itself be estimated from the same N samples, e.g. as
σ²[<I>] ≈ (1/N) [ (1/N) Σ_{i=1}^{N} (f(xi)/p(xi))² − ( (1/N) Σ_{i=1}^{N} f(xi)/p(xi) )² ]
39
Monte Carlo Integration Deterministic Quadrature versus Monte Carlo
A deterministic quadrature rule to compute a one-dimensional integral could be to compute the sum of the areas of regions over the domain.
Extending these deterministic quadrature rules to a d-dimensional integral would require N^d samples.
40
Monte Carlo Integration Multidimensional Monte Carlo Integration
The Monte Carlo integration technique can be extended to multiple dimensions in a straightforward manner: the estimator keeps the form <I> = (1/N) Σ_{i=1}^{N} f(xi)/p(xi), with the samples xi now drawn from a multidimensional PDF p(x).
41
Monte Carlo Integration Multidimensional Monte Carlo Integration
One of the main strengths of Monte Carlo integration is that it can be extended seamlessly to multiple dimensions.
Monte Carlo techniques permit an arbitrary choice of N, as opposed to N^d samples for deterministic quadrature techniques.
Example. Integration over a Hemisphere.
42
Monte Carlo Integration Sampling Random Variables
The Monte Carlo technique is about computing samples from a probability distribution p(x).
Samples should be found such that the distribution of the samples matches p(x).
Inverse Cumulative Distribution Function
Rejection Sampling
Look-Up Table
43
Monte Carlo Integration Inverse Cumulative Distribution Function
Discrete Random Variables
Given a set of probabilities pi, we want to pick xi with probability pi.
The discrete cumulative probability distribution (CDF) corresponding to the pi is:
F_i = Σ_{j=1}^{i} pj
44
Monte Carlo Integration Inverse Cumulative Distribution Function
Discrete Random Variables
The selection of samples is done as follows:
Compute a sample u that is uniformly distributed over the domain [0,1).
Output the k that satisfies the property: F_{k−1} ≤ u < F_k (with F_0 = 0).
45
Monte Carlo Integration Inverse Cumulative Distribution Function
Discrete Random Variables
For a uniformly distributed u, Pr(a ≤ u ≤ b) = b − a.
The probability that the value of u lies between F_{k−1} and F_k is F_k − F_{k−1} = pk.
But this is the probability that k is selected. Therefore, k is selected with probability pk.
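The discrete inverse-CDF procedure above can be sketched as follows (not from the slides; names are illustrative, and a binary search over the CDF finds the first k with u < F_k):

```python
import bisect
import random
from collections import Counter

def make_discrete_sampler(probs, seed=0):
    """Return a sampler that picks index k with probability probs[k].

    Builds the discrete CDF F_i = p_1 + ... + p_i, then for a uniform
    u in [0, 1) outputs the first k with u < F_k.
    """
    cdf = []
    running = 0.0
    for p in probs:
        running += p
        cdf.append(running)
    rng = random.Random(seed)

    def sample():
        u = rng.random()
        # bisect_right finds the first k with cdf[k] > u, i.e. F_{k-1} <= u < F_k;
        # the min() guards against float round-off in the last CDF entry.
        return min(bisect.bisect_right(cdf, u), len(cdf) - 1)

    return sample

# A fair die: each face 0..5 is drawn with probability 1/6.
roll = make_discrete_sampler([1.0 / 6.0] * 6)
```

Drawing many samples and tallying them with `Counter` should reproduce the pi to within sampling noise.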
46
Monte Carlo Integration Inverse Cumulative Distribution Function
Continuous Random Variables A sample can be generated according to a given
distribution p(x) by applying the inverse cumulative distribution function of p(x) to a uniformly generated random variable u over the interval [0,1).
The resulting sample is F-1(u). This method of sampling requires the ability to
compute and analytically invert the cumulative probability distribution.
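As a concrete continuous case (not from the slides; the distribution choice and names are mine), the exponential density has an analytically invertible CDF, so it can be sampled exactly as described:

```python
import math
import random

def sample_exponential(lam, n, seed=0):
    """Draw n samples from p(x) = lam * exp(-lam * x) via the inverse CDF.

    The CDF is F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -ln(1 - u) / lam
    maps a uniform u in [0, 1) to an exponentially distributed sample.
    """
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]
```

The sample mean should approach 1/lam as the number of samples grows.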
47
Monte Carlo Integration Rejection Sampling
It is often not possible to derive an analytical formula for the inverse of the cumulative distribution function.
Rejection sampling is an alternative.
In rejection sampling, samples are tentatively proposed and tested to determine acceptance or rejection of the sample.
This method raises the dimension of the function being sampled by one and then uniformly samples the bounding box that includes the entire PDF.
This sampling technique yields samples with the appropriate distribution.
48
Monte Carlo Integration Rejection Sampling
For a one-dimensional PDF case. The maximum value over the domain [a,b] to be
sampled is M. Rejection sampling raises the dimension of the function
by one and creates a two-dimensional function over [a,b]x[0,M].
This function is then sampled uniformly to compute samples (x,y).
Rejection sampling rejects all samples (x,y) such that p(x) < y.
All other samples are accepted. The distribution of the accepted samples is exactly the
PDF p(x) we want to sample.
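The one-dimensional procedure above can be sketched directly (not from the slides; names are illustrative): propose (x, y) uniformly in the bounding box and keep x only when y falls under the PDF.

```python
import random

def rejection_sample(pdf, a, b, m, n, seed=0):
    """Draw n samples from pdf on [a, b] by rejection, where pdf(x) <= m.

    Proposes (x, y) uniformly in the box [a, b] x [0, m] and accepts x
    whenever y < pdf(x); rejected proposals are simply retried.
    """
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        x = rng.uniform(a, b)
        y = rng.uniform(0.0, m)
        if y < pdf(x):
            out.append(x)
    return out
```

For p(x) = 2x on [0, 1] with M = 2, the accepted samples should have mean E[x] = 2/3; the acceptance rate here is the area under the PDF (1) over the box area (2).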
49
Monte Carlo Integration Rejection Sampling
For a one-dimensional PDF case.
50
Monte Carlo Integration Rejection Sampling
For a one-dimensional PDF case. One criticism of rejection sampling is that rejecting
samples could be inefficient. The efficiency of this technique is proportional to
the probability of accepting a proposed sample. This probability is proportional to the ratio of the
area under the function to the area of the box. If this ratio is small, a lot of samples are rejected.
51
Monte Carlo Integration Look-Up Table
It approximates the PDF to be sampled using piecewise linear approximations.
It is not commonly used, though it is very useful when the sampled PDF is obtained from measured data.
52
Monte Carlo Integration Variance Reduction
Monte Carlo integration techniques can be roughly subdivided into two categories:
Blind Monte Carlo: those that have no information about the function to be integrated
Informed Monte Carlo: those that do have some kind of information
Informed Monte Carlo methods are able to produce more accurate results as compared to blind Monte Carlo methods.
Designing efficient estimators is a major area of research in Monte Carlo literature.
Reduction of variance is a critical one.
53
Monte Carlo Integration Importance Sampling
It is a technique that uses a nonuniform probability distribution function to generate samples.
The variance of the computation can be reduced by choosing the probability distribution wisely, based on information about the function to be integrated.
Given a PDF p(x) defined over the integration domain D, and samples xi generated according to the PDF, the value of the integral I can be estimated by generating N sample points and computing the weighted mean:
<I> = (1/N) Σ_{i=1}^{N} f(xi) / p(xi)
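As a sketch of how importance sampling plays out in code (not from the slides; all names are illustrative), the example below integrates f(x) = x over [0, 1] with p(x) = 2x, sampled by inverting its CDF F(x) = x². Because p is proportional to f, every sample contributes f(x)/p(x) = 1/2:

```python
import random

def importance_estimate(f, pdf, draw, n, seed=0):
    """Importance-sampled estimate: average f(x_i) / p(x_i) over n draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = draw(rng)
        total += f(x) / pdf(x)
    return total / n

# Integrate f(x) = x over [0, 1] (I = 1/2) with p(x) = 2x, sampled by
# inverting F(x) = x^2: x = sqrt(u) with u uniform in (0, 1].
est = importance_estimate(
    lambda x: x,
    lambda x: 2.0 * x,
    lambda rng: (1.0 - rng.random()) ** 0.5,
    1000)
# Since f(x)/p(x) = 1/2 for every sample, the estimator has zero variance.
```

This previews the result derived below: the variance vanishes exactly when p(x) is proportional to f(x).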
54
Monte Carlo Integration Importance Sampling
The expected value of the estimator
<I> = (1/N) Σ_{i=1}^{N} f(xi) / p(xi)
is I, so the estimator is unbiased.
55
Monte Carlo Integration Importance Sampling
To determine if the variance of this estimator is better than an estimator using uniform sampling, the variance is estimated.
Clearly, the choice of p(x) affects the value of the variance.
The difficulty of importance sampling is to choose a p(x) such that the variance is minimized.
A perfect estimator would have the variance be zero.
56
Monte Carlo Integration Importance Sampling
The optimal p(x) for the perfect estimator can be found by minimizing the equation of the variance using variational techniques and Lagrange multipliers.
We have to find a scalar λ for which the expression L, a functional of p(x), reaches a minimum:
L(p) = ∫ ( f(x)² / p(x) ) dx + λ ∫ p(x) dx
57
Monte Carlo Integration Importance Sampling
The boundary condition is that the integral of p(x) over the integration domain equals 1.
This kind of minimization problem can be solved using the Euler-Lagrange differential equation.
58
Monte Carlo Integration Importance Sampling
To minimize the function, differentiate L(p) with respect to p(x) and solve for the value of p(x) that makes this quantity zero.
59
Monte Carlo Integration Importance Sampling
The constant is a scaling factor, such that p(x) fulfills the boundary condition.
The optimal p(x) is then given by:
p(x) = |f(x)| / ∫ |f(x)| dx
60
Monte Carlo Integration Importance Sampling
If we use this p(x), the variance will be exactly 0 (assuming f(x) does not change sign).
This optimal p(x) requires us to know the value of the integral of f(x).
This value is what we want to compute!!! Clearly, finding the optimal p(x) is not possible. A good importance sampling function matches the
shape of the original function as closely as possible.
61
Monte Carlo Integration Stratified Sampling
One problem with the sampling techniques Samples can be badly distributed over the domain of
integration resulting in a poor approximation of the integral.
This clumping of samples can happen irrespective of the PDF used, because the PDF only tells us something about the expected number of samples in parts of the domain.
Increasing the number of samples collected will eventually address this problem of uneven sample distribution.
Stratified sampling is an alternative of increasing the number of samples to avoid the clumping of samples.
62
Monte Carlo Integration Stratified Sampling
The basic idea is to split the integration domain into m disjoint subdomains (called strata).
Then evaluate the integral in each of the subdomains separately with one or more samples.
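The stratified scheme above can be sketched as follows (not from the slides; names are illustrative), using m equal strata with one uniform sample each:

```python
import random

def stratified_estimate(f, a, b, m, seed=0):
    """Stratified MC estimate of the integral of f over [a, b].

    Splits [a, b] into m equal strata and draws one uniform sample in
    each, so no region of the domain is left unsampled.
    """
    rng = random.Random(seed)
    width = (b - a) / m
    total = 0.0
    for j in range(m):
        lo = a + j * width
        # One jittered sample inside stratum j.
        x = rng.uniform(lo, lo + width)
        total += f(x)
    return width * total
```

For a smooth integrand such as x², the clumping-free samples make the error far smaller than plain Monte Carlo with the same sample count.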
63
Monte Carlo Integration Stratified Sampling
This method often leads to a smaller variance as compared to a blind Monte Carlo integration method.
The variance of a stratified sampling method, where each stratum receives a number of samples nj, which are in turn distributed uniformly over their respective intervals, is equal to
64
Monte Carlo Integration Stratified Sampling
If all the strata are of equal size (αj − αj−1 = 1/m) and each stratum contains one uniformly generated sample (nj = 1; N = m), the variance becomes σ² = (1/m²) Σ_{j=1}^{m} σj², where σj² is the variance of f within stratum j.
65
Monte Carlo Integration Stratified Sampling
The variance obtained using stratified sampling is always smaller than the variance obtained by a pure Monte Carlo sampling scheme.
As a consequence, there is no advantage in generating more than one sample within a single stratum, since a simple equal subdivision of the stratum such that each sample is attributed to a single stratum always yields a better result.
This does not mean that the above sampling scheme always gives us the smallest possible variance.
We did not take into account the size of the strata relative to each other and the number of samples per stratum.
It is not an easy problem to determine how these degrees of freedom can be chosen optimally.
66
Monte Carlo Integration Stratified Sampling
It can be proved that the optimal number of samples in one subdomain is proportional to the variance of the function values relative to the average function value in that subdomain.
Applied to the principle of one sample per stratum, this implies that the size of the strata should be chosen such that the function variance is equal in all strata.
Such a sampling strategy assumes prior knowledge of the function in question, which is often not available.
However, such a sampling strategy might be used in an adaptive sampling algorithm.
67
Monte Carlo Integration Stratified Sampling
This works well when the number of samples required is known in advance and the dimensionality of the problem is relatively low.
Typically less than 20.
For a d-dimensional function, the number of samples required is N^d.
The number of strata required does not scale well with an increase in the number of dimensions.
68
Monte Carlo Integration N-Rooks or Latin Hypercube Algorithm
The N-Rooks algorithm can keep the number of samples fixed (irrespective of dimensionality).
Consider a two-dimensional function. Stratification of both dimensions would require N² strata with one sample per stratum.
The N-rooks algorithm addresses this by distributing N samples evenly among the strata.
Each dimension is still subdivided into N subintervals. However, only N samples are needed!!!!!
These samples are distributed such that one sample lies in each subinterval.
69
Monte Carlo Integration N-Rooks or Latin Hypercube Algorithm
Such a distribution is achieved by computing d independent random permutations q1, ..., qd of 1..N and letting the ith d-dimensional sample be
xi = ( (q1(i) − u1)/N, ..., (qd(i) − ud)/N ), where each uj is uniform in [0, 1).
In two dimensions, this means that no row or column has more than one sample.
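A minimal Latin hypercube sketch (not from the slides; names are illustrative): one random permutation per dimension assigns each of the N samples to a distinct subinterval, jittered within it.

```python
import random

def latin_hypercube(n, d, seed=0):
    """n samples in [0, 1)^d with one sample per subinterval per dimension.

    Each dimension is cut into n cells; a random permutation per
    dimension places exactly one sample in each cell of that dimension.
    """
    rng = random.Random(seed)
    samples = [[0.0] * d for _ in range(n)]
    for k in range(d):
        perm = list(range(n))
        rng.shuffle(perm)
        for i in range(n):
            # Sample i lands in cell perm[i] of dimension k, jittered.
            samples[i][k] = (perm[i] + rng.random()) / n
    return samples
```

In two dimensions this is the N-rooks property: projecting the points onto either axis occupies every one of the N subintervals exactly once.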
70
Monte Carlo Integration Combining Stratified Sampling and Importance Sampling
Stratified sampling can easily be combined with importance sampling.
The samples computed from a uniform probability distribution can be stratified.
Then these stratified samples are transformed using the inverse cumulative distribution function.
This strategy avoids the clumping of samples while at the same time distributing the samples according to the appropriate probability distribution function.
71
Monte Carlo Integration Combining Estimators of Different
Distributions It is useful to combine different sampling
techniques so as to obtain robust solutions that have low variance over a wide range of parameter settings.
The rendering equation consists of the BRDF, the geometry term, the incoming radiance, etc. Each one of these different terms could be used for importance sampling.
However, depending on the material properties or the distribution of objects in a scene, one of these techniques could be more effective than the other.
72
Monte Carlo Integration Combining Estimators of Different Distributions Using Variance
Consider combining two estimators, <I1> and <I2>, to compute an integral I.
Any linear combination w1<I1> + w2<I2> with constant weights w1 + w2 = 1 will be an estimator for I.
The variance of the linear combination, however, depends on the weights:
σ²[w1<I1> + w2<I2>] = w1² σ²[<I1>] + w2² σ²[<I2>] + 2 w1 w2 Cov[<I1>, <I2>]
73
Monte Carlo Integration Combining Estimators of Different Distributions Using Variance
If <I1> and <I2> are independent, the covariance is zero.
Minimization of the variance expression allows us to fix the optimal combination weights:
w1 = σ²[<I2>] / (σ²[<I1>] + σ²[<I2>]), w2 = σ²[<I1>] / (σ²[<I1>] + σ²[<I2>])
74
Monte Carlo Integration Combining Estimators of Different
Distributions Using Variance
The weights can be calculated in two different ways
Using analytical expressions for the variance of the involved estimators.
Using a posteriori estimates for the variances based on the samples in an experiment themselves.
By doing so, a slight bias is introduced. As the number of samples is increased, the bias
vanishes. The combination is asymptotically unbiased or
consistent.
75
Monte Carlo Integration Combining Estimators of Different Distributions
Multiple Importance Sampling Combine different estimators using potentially different
weights for each individual sample, even for samples from the same estimator.
Samples from one estimator could have different weights assigned to them, unlike the approach where the weight depends only on the variance.
The balance heuristic is used to determine the weights that combine these samples from different PDFs provided the weights sum to 1.
The balance heuristic results in an unbiased estimator that provably has variance that differs from the variance of the optimal estimator by an additive error term.
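A sketch of the balance heuristic (not from the slides; names are illustrative): with sample counts n_t and densities p_t, the weight n_t p_t(x) / Σ_s n_s p_s(x) combined with the usual f(x)/p_t(x)/n_t term folds into a single division by Σ_s n_s p_s(x).

```python
import random

def mis_balance(f, pdfs, samplers, counts, seed=0):
    """Multiple importance sampling with the balance heuristic.

    pdfs[t](x) is the density of technique t, samplers[t](rng) draws
    from it, and counts[t] is its sample budget.  Each sample's
    balance-heuristic contribution is f(x) / sum_s n_s * p_s(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for draw, n_t in zip(samplers, counts):
        for _ in range(n_t):
            x = draw(rng)
            denom = sum(n_s * p_s(x) for n_s, p_s in zip(counts, pdfs))
            total += f(x) / denom
    return total
```

For example, combining a uniform technique p1(x) = 1 with p2(x) = 2x (sampled as sqrt(u)) still yields an unbiased estimate of ∫_0^1 x dx = 1/2, without needing to know in advance which technique fits f better.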
76
Monte Carlo Integration Combining Estimators of Different
Distributions Multiple Importance Sampling
77
Monte Carlo Integration Control Variates
Another technique to reduce variance uses control variates.
Variance could be reduced by computing a function g that can be integrated analytically and subtracted from the original function to be integrated.
78
Monte Carlo Integration Control Variates
Since the integral of the function ∫g(x)dx has been computed analytically, the original integral is estimated by computing an estimator for ∫(f(x) − g(x)) dx.
If f(x)-g(x) is almost constant, this technique is very effective at decreasing variance.
If f/g is nearly constant, g should be used for importance sampling.
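The control-variate idea can be sketched as follows (not from the slides; names and the example functions are mine): Monte Carlo is applied only to the residual f − g, and the known integral of g is added back.

```python
import math
import random

def control_variate_estimate(f, g, g_integral, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] using g as a control variate.

    g must have a known analytic integral g_integral over [a, b]; only
    the (hopefully nearly constant) residual f - g is sampled.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += f(x) - g(x)
    return g_integral + (b - a) * total / n
```

For f(x) = e^x on [0, 1], taking g(x) = 1 + x (its first Taylor terms, with ∫g = 3/2) leaves a small residual, so the variance is much lower than sampling e^x directly.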
79
Monte Carlo Integration Quasi-Monte Carlo
These techniques decrease the effects of clumping in samples by eliminating randomness completely.
Samples are deterministically distributed as uniformly as possible.
They try to minimize clumping with respect to a measure called the discrepancy.
The most commonly used measure of discrepancy is the star discrepancy measure.
80
Monte Carlo Integration Quasi-Monte Carlo
Consider a set P of N points, and each possible axis-aligned box B with one corner at the origin.
Given a box of volume Bsize, the ideal distribution of points would have N·Bsize points in the box.
The star discrepancy measure computes how much the point distribution P deviates from this ideal situation:
D*_N(P) = sup_B | NumPoints(P∩B)/N − Bsize |
NumPoints(P∩B) is the number of points from the set P that lie in the box B.
81
Monte Carlo Integration Quasi-Monte Carlo
The star discrepancy is significant: it is closely related to the error bounds for quasi-Monte Carlo integration.
The Koksma-Hlawka inequality states that the difference between the estimator and the integral to be computed satisfies the condition:
| (1/N) Σ_{i=1}^{N} f(xi) − ∫ f(x) dx | ≤ V_HK(f) · D*_N(P)
Here the V_HK term is the variation of the function f(x) (in the sense of Hardy and Krause); it measures how fast the function can change.
82
Monte Carlo Integration Quasi-Monte Carlo
The important point to take from this inequality is that the error in the MC estimate is directly proportional to the discrepancy of the sample set.
Therefore, much effort has been expended in designing sequences that have low discrepancy.
These sequences are called low-discrepancy sequences (LDS).
Examples of low-discrepancy sequences: Hammersley, Halton, Sobol, etc.
83
Monte Carlo Integration Why Quasi-Monte Carlo?
The error bound for low-discrepancy sequences when applied to MC integration is O((log N)^d / N) or O((log N)^{d−1} / N) for large N and dimension d.
This bound can be a substantial improvement over the 1/sqrt(N) error bound of pure Monte Carlo techniques.
Low-discrepancy sequences work best for low dimensions (about 10-20).
At higher dimensions, their performance is similar to pseudorandom sampling.
However, as compared to pseudorandom sampling, low-discrepancy sequences are highly correlated.
The difference between successive samples in the van der Corput sequence (a base-2 Halton sequence) is 0.5 half of the time.
The upshot is that low-discrepancy sampling gives up randomness in return for uniformity in the sample distribution.
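The van der Corput sequence mentioned above is easy to generate: the ith element is i with its base-b digits mirrored around the radix point (sketch below; the function name is mine).

```python
def van_der_corput(i, base=2):
    """i-th van der Corput element: reverse the base-`base` digits of i
    about the radix point, e.g. i=3 (binary 11) -> 0.11 binary = 0.75."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x
```

The first base-2 elements are 0.5, 0.25, 0.75, 0.125, ..., which makes the correlation noted above visible: consecutive elements alternate between the halves of [0, 1).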
84
Monte Carlo Integration Comparison of Quasi-MC and MC
QMC is based on low-discrepancy sequences. MC is based on sequences of pseudorandom numbers.
The accuracy of the quasi-MC method increases faster than that of the MC method.
The advantage of quasi-MC is greater if the integrand is smooth and the number of dimensions of the integral is small.
85
Monte Carlo Integration Note on Low-discrepancy Sequences
A low-discrepancy sequence is a sequence with the property that for all values of N, its subsequence has a low discrepancy.
For numerical integration:
If the points are chosen as xi = i/N: rectangle rule.
If the points are chosen to be randomly (or pseudorandomly) distributed: MC.
If the points are chosen as elements of a low-discrepancy sequence: qMC.