Random-Number Generation Andy Wang CIS 5930-03 Computer Systems Performance Analysis

Dec 31, 2015


Random-Number Generation

Andy Wang
CIS 5930-03
Computer Systems Performance Analysis


Generate Random Values

• Two steps
  – Random-number generation
    • Get a sequence of random numbers distributed uniformly between 0 and 1
  – Random-variate generation
    • Transform the sequence to produce random values satisfying the desired distribution
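The two steps can be sketched as follows. This is only an illustration under stated assumptions: the slides do not name a transformation method, so the inverse-transform method and the exponential target distribution are assumptions, and the names `uniform_stream` and `to_exponential` are hypothetical.

```python
import math
import random


def uniform_stream(seed, count):
    """Step 1: return `count` pseudo-random numbers uniform on [0, 1)."""
    rng = random.Random(seed)  # seeded, so the sequence is repeatable
    return [rng.random() for _ in range(count)]


def to_exponential(u, rate):
    """Step 2: map a uniform value u in [0, 1) to an exponential variate."""
    # Inverse-transform method (an assumption, not named on the slide):
    # solve F(x) = 1 - exp(-rate * x) = u for x.
    return -math.log(1.0 - u) / rate


uniforms = uniform_stream(seed=42, count=5)
variates = [to_exponential(u, rate=2.0) for u in uniforms]
```

Keeping the two steps separate means the same uniform stream can feed any target distribution.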


Background

• The most common method
  – Use a recursive function
      x_n = f(x_{n-1}, x_{n-2}, …)


Example

• x_n = (5 x_{n-1} + 1) % 16
  – Suppose x_0 = 5
  – The first 32 numbers are between 0 and 15
• Divide x_n by 15 to get numbers between 0 and 1
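The example can be reproduced with a few lines of Python. This is a minimal sketch; the function name `lcg` and its parameter names are hypothetical.

```python
def lcg(a, b, m, seed):
    """Yield x_1, x_2, ... for x_n = (a * x_{n-1} + b) % m."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


gen = lcg(a=5, b=1, m=16, seed=5)
first_32 = [next(gen) for _ in range(32)]
scaled = [x / 15 for x in first_32]  # values between 0 and 1, as on the slide
print(first_32[:16])  # [10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5]
```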

[Two plots of random number against Nth number: x_n on a 0 to 16 axis, and the scaled values on a 0 to 1 axis]

Basic Terms

• x_0 = seed
  – Given a function, the entire sequence can be regenerated with x_0
• Generated numbers are pseudo random
  – Deterministic
  – Can pass statistical tests for randomness
  – Preferred to fully random numbers so that simulated results can be repeated

Cycle Length

• Note that starting with the 17th number, the sequence repeats
  – Cycle length of 16

[Plot: random number (0 to 1) against Nth number]

More Terms

• Some generators do not repeat the initial part (tail) of the sequence

• Period of a generator = tail + cycle length
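Tail and cycle length can be measured by recording where each value first appears. The sketch below assumes a generator whose next value depends only on the previous one; `tail_and_cycle` is a hypothetical helper, and the second generator, (2x) % 12, is a made-up example chosen because it has a tail.

```python
def tail_and_cycle(f, seed):
    """Return (tail, cycle_length) of the sequence seed, f(seed), f(f(seed)), ..."""
    first_seen = {}  # value -> index at which it first appeared
    x, n = seed, 0
    while x not in first_seen:
        first_seen[x] = n
        x = f(x)
        n += 1
    tail = first_seen[x]      # numbers produced before the cycle starts
    cycle_length = n - tail   # numbers in the repeating part
    return tail, cycle_length


print(tail_and_cycle(lambda x: (5 * x + 1) % 16, 5))  # (0, 16)
print(tail_and_cycle(lambda x: (2 * x) % 12, 1))      # (2, 2): 1, 2, then 4, 8 repeats
```

With the slide's definition, the period of the second generator is 2 + 2 = 4.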

[Diagram: the tail followed by the cycle length; together they make up the period]

Question

• How to choose seeds and random-number generation functions?
  1. Efficiently computable
     • Heavily used in simulations
  2. The period should be large
  3. Successive values should be independent and uniformly distributed


Types of Random-Number Generators

• Linear-congruential generators
• Tausworthe generators
• Extended Fibonacci generators
• Combined generators
• Others


Linear-Congruential Generators

• In 1951, Lehmer found that residues of successive powers of a number have good randomness properties
    x_n = a^n % m = (a · a^{n-1}) % m = (a x_{n-1}) % m
• Lehmer's choices of a and m
    a = 23 (multiplier)
    m = 10^8 + 1 (modulus)
• Implemented on ENIAC


(Mixed) Linear-Congruential Generators (LCG)

• x_n = (a x_{n-1} + b) % m
• x_n is between 0 and m - 1
• a and b are non-negative integers
• “Mixed”: uses both multiplication by a and addition of b


The Choice of a, b, and m

• m should be large
  – Period is never longer than m
• To compute % m efficiently
  – Make m = 2^k
  – Then % m just truncates the result to its low-order k bits
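When m = 2^k, taking the remainder is the same as keeping the low-order k bits, which a bitwise AND does directly. A small sketch, with illustrative constants chosen only for the demonstration:

```python
k = 16
m = 1 << k            # m = 2^k
mask = m - 1          # k low-order one bits

a, b = 25173, 13849   # illustrative multiplier and increment
x = 12345
via_modulo = (a * x + b) % m
via_mask = (a * x + b) & mask   # keep only the low-order k bits
print(via_modulo == via_mask)   # True
```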


The Choice of a, b, and m

• If b > 0, the maximum period m is obtained when
  – m = 2^k
  – a = 4c + 1
  – b is odd
  – c, b, and k are positive integers
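These conditions can be checked by brute force for a small modulus. The sketch below uses m = 2^6 and a handful of (a, b) pairs picked for illustration; `lcg_period` and `meets_conditions` are hypothetical helper names.

```python
def lcg_period(a, b, m, seed=0):
    """Count steps until x_n = (a * x_{n-1} + b) % m first returns to the seed."""
    x, steps = seed, 0
    while True:
        x = (a * x + b) % m
        steps += 1
        if x == seed:
            return steps
        if steps > m:        # the seed lies on a tail and is never revisited
            return None


def meets_conditions(a, b, k):
    """Conditions from the slide for m = 2^k: a = 4c + 1 with c >= 1, and b odd."""
    return k >= 1 and a > 1 and a % 4 == 1 and b % 2 == 1


k = 6
m = 2 ** k
results = {(a, b): lcg_period(a, b, m) for a in (5, 13, 7) for b in (1, 3, 2)}
for (a, b), period in sorted(results.items()):
    print(a, b, period, meets_conditions(a, b, k))
```

For these pairs, the period equals m exactly when the conditions hold.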


Full-Period Generators

• Generators with maximum possible periods

• Not equally good
  – Look for low autocorrelations between successive numbers
  – x_n = ((2^{34} + 1) x_{n-1} + 1) % 2^{35} has an autocorrelation of 0.25
  – x_n = ((2^{18} + 1) x_{n-1} + 1) % 2^{35} has an autocorrelation of 2^{-18}


Multiplicative LCG

• x_n = a x_{n-1} % m (that is, b = 0)
• Can compute more efficiently when m = 2^k
• However, maximum period is only 2^{k-2}
• Problem: cyclic patterns in the lower bits


Multiplicative LCG with m = 2^k

• When a = 8i ± 3
  – E.g., x_n = 5 x_{n-1} % 2^5
    • Period is only 8
    • Which is ¼ of 2^5
• When a ≠ 8i ± 3
  – E.g., x_n = 7 x_{n-1} % 2^5
    • Period is only 4
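Both periods can be confirmed by iterating until the seed comes back. A minimal sketch, assuming an odd seed of 1; `multiplicative_period` is a hypothetical helper name.

```python
def multiplicative_period(a, m, seed):
    """Count steps until x_n = (a * x_{n-1}) % m first returns to the seed."""
    x, steps = seed, 0
    while True:
        x = (a * x) % m
        steps += 1
        if x == seed:
            return steps
        if steps > m:        # safety net: the seed is not on a cycle
            return None


m = 2 ** 5
print(multiplicative_period(5, m, seed=1))  # 8  (a = 8*1 - 3)
print(multiplicative_period(7, m, seed=1))  # 4  (a = 8*1 - 1)
```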

[Two plots of random number (0 to 30) against Nth number, one for each example]

Multiplicative LCG with m ≠ 2^k

• To get a longer period, use m = prime number
  – With proper choice of a, it is possible to get a period of m - 1
  – a needs to be a primitive root of m
    • If and only if a^n % m ≠ 1 for n = 1..m - 2


Multiplicative LCG with m ≠ 2^k

• x_n = 3 x_{n-1} % 31
  – x_0 = 1
  – Period is 30
  – 3 is a primitive root of 31
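The condition a^n % m ≠ 1 for n = 1..m - 2 translates directly into a loop, and the example with a = 3 and m = 31 can be checked the same way. The helper name `is_primitive_root` is hypothetical, and the brute-force test is only practical for small m.

```python
def is_primitive_root(a, m):
    """True if a^n % m != 1 for every n = 1 .. m - 2 (m assumed prime)."""
    if a % m == 0:
        return False
    power = 1
    for _ in range(1, m - 1):        # n = 1, 2, ..., m - 2
        power = (power * a) % m
        if power == 1:
            return False
    return True


print(is_primitive_root(3, 31))  # True
print(is_primitive_root(5, 31))  # False, because 5^3 = 125 = 4 * 31 + 1

# The generator x_n = 3 * x_{n-1} % 31 with x_0 = 1 visits 1..30 once each.
x, sequence = 1, []
for _ in range(30):
    x = (3 * x) % 31
    sequence.append(x)
```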

[Plot: random number (0 to 30) against Nth number]

Multiplicative LCG with m ≠ 2^k

• x_n = 7^5 x_{n-1} % (2^{31} - 1)
  – 7^5 is a primitive root of 2^{31} - 1
  – But watch out for computational errors
    • Multiplication overflow
      – Need to apply tricks mentioned in p. 442
    • Truncation due to the number of digits available
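One widely used way to avoid the overflow is Schrage's method, which rewrites the product so that no intermediate value exceeds the modulus. Whether this is exactly the trick on p. 442 is an assumption. Python integers do not overflow, so the sketch only demonstrates the arithmetic that a 32-bit implementation would use.

```python
A = 7 ** 5           # 16807
M = 2 ** 31 - 1      # 2147483647
Q = M // A           # 127773
R = M % A            # 2836


def next_schrage(x):
    """Return (A * x) % M for 0 < x < M without forming a product above M."""
    hi, lo = divmod(x, Q)
    t = A * lo - R * hi          # each term stays below M because R < Q
    return t if t > 0 else t + M


def next_direct(x):
    """Reference version that relies on arbitrary-precision integers."""
    return (A * x) % M


x = 1
for _ in range(10000):
    assert next_schrage(x) == next_direct(x)
    x = next_schrage(x)
```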


Tausworthe Generators

• How to generate large random numbers?

• The Tausworthe generator produces a random sequence of binary digits
  – The generator then divides the sequence into strings of desired lengths
  – Based on a characteristic polynomial


Tausworthe Example

• Suppose we use the following characteristic polynomial
    x^7 + x^3 + 1
  – The corresponding generation function is
    • b_{n+7} ⊕ b_{n+3} ⊕ b_n = 0
      or
    • b_n = b_{n-4} ⊕ b_{n-7}
  – Need a 7-bit seed


Tausworthe Example

• The bit stream sequence
    1111111000011101111001011001…
• Convert to random numbers between 0 and 1, with 8-bit numbers
    x_0 = 0.11111110_2 = 0.99219_10
    x_1 = 0.00011101_2 = 0.11328_10
    x_2 = 0.11100101_2 = 0.89453_10
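The bit stream and the three 8-bit numbers can be regenerated from the recurrence b_n = b_{n-4} ⊕ b_{n-7}, assuming an all-ones 7-bit seed (which is what the stream above starts with). The function names are hypothetical.

```python
def tausworthe_bits(seed_bits, count):
    """Bits of b_n = b_{n-4} XOR b_{n-7}, starting from a 7-bit seed."""
    bits = list(seed_bits)
    while len(bits) < count:
        bits.append(bits[-4] ^ bits[-7])
    return bits[:count]


def to_fractions(bits, width):
    """Read successive `width`-bit groups as binary fractions 0.b1 b2 ..."""
    out = []
    for i in range(0, len(bits) - width + 1, width):
        word = int("".join(map(str, bits[i:i + width])), 2)
        out.append(word / 2 ** width)
    return out


bits = tausworthe_bits([1] * 7, 28)
stream = "".join(map(str, bits))
numbers = to_fractions(bits[:24], 8)
print(stream)                          # 1111111000011101111001011001
print([round(x, 5) for x in numbers])  # [0.99219, 0.11328, 0.89453]
```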


Tausworthe Generator Characteristics

• For the L-bit numbers generated
  + E[x_n] = ½
  + V[x_n] = 1/12
  + The serial correlation is zero
  + Good results over the complete cycle
  - Poor local behavior within a sequence


Tausworthe Example

• If a characteristic polynomial of order q has a period of 2^q - 1, it is a primitive polynomial
• For x^7 + x^3 + 1
  • q = 7
  • Sequence repeats after 127 bits = 2^7 - 1
  • A primitive polynomial
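The 127-bit period can be checked by stepping the 7-bit state until it returns to where it started. A minimal sketch; `bit_period` is a hypothetical helper, and the lags (4, 7) are those of the recurrence b_n = b_{n-4} ⊕ b_{n-7} for x^7 + x^3 + 1.

```python
def bit_period(seed_bits, lags):
    """Steps until the shift-register state first returns to the seed state."""
    start = tuple(seed_bits)
    state, steps = start, 0
    while True:
        new_bit = 0
        for lag in lags:             # XOR of the bits at the given lags
            new_bit ^= state[-lag]
        state = state[1:] + (new_bit,)
        steps += 1
        if state == start:
            return steps


print(bit_period([1] * 7, (4, 7)))  # 127
```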


Tausworthe Implementation

• Can be easily generated via linear-feedback shift-registers

• For x^5 + x^3 + 1

[Diagram: shift register holding b_n, b_{n-1}, b_{n-2}, b_{n-3}, b_{n-4}, b_{n-5}]
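A shift register for x^5 + x^3 + 1 can be sketched with integer bit operations. Following the same pattern as the x^7 + x^3 + 1 example, the recurrence is taken to be b_n = b_{n-2} ⊕ b_{n-5}. The register layout (newest bit in the lowest position) and the name `lfsr5` are assumptions of this sketch, not the layout of the original diagram.

```python
def lfsr5(state):
    """One shift of a 5-bit register for x^5 + x^3 + 1.

    Bit 0 of `state` holds b_{n-1} (newest) and bit 4 holds b_{n-5} (oldest).
    Returns the new state and the bit that was shifted in.
    """
    new_bit = ((state >> 1) ^ (state >> 4)) & 1    # b_{n-2} XOR b_{n-5}
    return ((state << 1) | new_bit) & 0b11111, new_bit


state, output = 0b11111, []
steps = 0
while True:
    state, bit = lfsr5(state)
    output.append(bit)
    steps += 1
    if state == 0b11111:
        break
print(steps)        # 31 = 2^5 - 1
print(output[:10])  # [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
```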


Extended Fibonacci Generators

• x_n = (x_{n-1} + x_{n-2}) % m
  – Does not have good randomness properties
  – High serial correlation
• An extension
  – x_n = (x_{n-5} + x_{n-17}) % 2^k
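The extended generator needs the 17 most recent values, which a fixed-length queue keeps conveniently. A minimal sketch; the function name and the seed values 1..17 are hypothetical choices for illustration only.

```python
from collections import deque


def extended_fibonacci(seed_values, k):
    """Yield x_n = (x_{n-5} + x_{n-17}) % 2^k from 17 seed values."""
    if len(seed_values) != 17:
        raise ValueError("need exactly 17 seed values")
    mask = (1 << k) - 1
    history = deque(seed_values, maxlen=17)   # history[-1] is x_{n-1}
    while True:
        x = (history[-5] + history[-17]) & mask
        history.append(x)                     # the oldest value drops out
        yield x


seeds = list(range(1, 18))                    # illustrative seed values
gen = extended_fibonacci(seeds, k=16)
values = [next(gen) for _ in range(100)]
```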


Combined Generators

• Add random numbers produced by two or more generators
  – Can considerably increase the period and randomness
      x_n = 40014 x_{n-1} % 2147483563
      y_n = 40692 y_{n-1} % 2147483399
      w_n = (x_n - y_n) % 2147483562
  – This generator has a period of 2.3 × 10^{18}
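The combined generator above can be written as a single Python generator. This is a sketch; the function name and the seed values of 1 are hypothetical, and Python's `%` already returns a non-negative result when x_n < y_n.

```python
def combined_generator(x_seed, y_seed):
    """Yield w_n = (x_n - y_n) % 2147483562 from the two multiplicative LCGs."""
    x, y = x_seed, y_seed
    while True:
        x = (40014 * x) % 2147483563
        y = (40692 * y) % 2147483399
        yield (x - y) % 2147483562   # non-negative even when x < y


gen = combined_generator(1, 1)
first = [next(gen) for _ in range(5)]
print(first[0])  # 2147482884, i.e. (40014 - 40692) % 2147483562
```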


Combined Generators

    w_n = 157 w_{n-1} % 32363
    x_n = 146 x_{n-1} % 31727
    y_n = 142 y_{n-1} % 31657
    v_n = (w_n - x_n + y_n) % 32362
  – This generator has a period of 8.1 × 10^{12}
  – Can avoid the multiplication overflow problem
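The overflow claim for the three-generator combination (multipliers 157, 146, 142 with moduli 32363, 31727, 31657) can be checked by bounding the largest product each component can form. The sketch assumes a signed 32-bit integer as the target word size; the function name and the seeds of 1 are hypothetical.

```python
COMPONENTS = [(157, 32363), (146, 31727), (142, 31657)]   # (multiplier, modulus)
INT32_MAX = 2 ** 31 - 1

# Largest product a * x that each component can form, with x at most m - 1.
largest_products = [a * (m - 1) for a, m in COMPONENTS]
fits_in_int32 = all(p <= INT32_MAX for p in largest_products)


def combined16(w, x, y):
    """Yield v_n = (w_n - x_n + y_n) % 32362 from the three small LCGs."""
    while True:
        w = (157 * w) % 32363
        x = (146 * x) % 31727
        y = (142 * y) % 31657
        yield (w - x + y) % 32362


gen = combined16(1, 1, 1)
print(fits_in_int32, next(gen))  # True 153
```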


Combined Generators

• XOR random numbers produced by two or more generators


Combined Generators

• Shuffle
  – One sequence as an index
    • To an array filled with random numbers generated by the second sequence
  – The chosen number in the second sequence is replaced by a new random number
  – Problem
    • Cannot skip to the nth random number
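The shuffle technique can be sketched with two small generators. Everything concrete here is an assumption for illustration: the table size of 8, the reduction of the index with `% table_size`, and the two toy LCGs standing in for the index and value sequences.

```python
def lcg(a, b, m, seed):
    """Yield x_n = (a * x_{n-1} + b) % m."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


def shuffled(index_gen, value_gen, table_size):
    """Use one sequence to pick entries from a table filled by the other."""
    table = [next(value_gen) for _ in range(table_size)]
    while True:
        i = next(index_gen) % table_size   # the first sequence chooses a slot
        yield table[i]                     # output the number stored there
        table[i] = next(value_gen)         # refill the slot from the second


gen = shuffled(lcg(3, 0, 31, 1), lcg(5, 1, 16, 5), table_size=8)
print([next(gen) for _ in range(3)])  # [1, 3, 2]
```

Because each output depends on the current table contents, the nth number cannot be computed without producing the first n - 1.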


A Survey of Random-number Generators

• Some published generator functions
    x_n = 7^5 x_{n-1} % (2^{31} - 1)
  – Full period of 2^{31} - 2
  – Low-order bits are randomly distributed
• Many others (see textbook)
  – All have problems
• General lessons: use established ones; do not invent your own


Seed Selection

• If the generator has a full period and only one random variable is required
  – Any seed value is good
• However, with more than one random variable, the story is different for multistream simulations
  – E.g., random arrival and service times
  – Should use two streams of random numbers


Seed Selection Guidelines

• Do not use zero
  – Not good for multiplicative LCGs and Tausworthe generators
• Avoid even values
  – Not good if a generator does not have a full period
• Do not use one stream for all variables
  – May yield strong correlations among variables


Seed Selection Guidelines

• Use nonoverlapping streams
  – Each stream requires a separate seed
  – Otherwise…
    • A long interarrival time may correlate with a long service time
  – Suppose we need 10,000 random numbers for interarrival times and 10,000 for service times: seed the two streams with the 1st and the 10,001st numbers of the sequence
  – x_n = [a^n x_0 + c (a^n - 1)/(a - 1)] % m gives the nth number directly, where c is the additive constant (written b on the earlier slides)
    • For multiplicative LCGs, c = 0
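The closed form lets the seed of a second stream be computed without generating the 10,000 numbers in between. In this sketch the function name `lcg_jump` is hypothetical, and reducing a^n modulo (a - 1)·m is one way (an implementation choice, not from the slides) to keep the division by a - 1 exact in modular arithmetic.

```python
def lcg_jump(a, c, m, x0, n):
    """Return x_n for x_i = (a * x_{i-1} + c) % m without iterating n times."""
    if a == 1:
        return (x0 + c * n) % m
    # (a^n - 1) / (a - 1) is an integer; reducing a^n modulo (a - 1) * m keeps
    # the division exact while still giving the right residue modulo m.
    geometric = (pow(a, n, (a - 1) * m) - 1) // (a - 1)
    return (pow(a, n, m) * x0 + c * geometric) % m


# Seeds for two non-overlapping streams of a multiplicative LCG (c = 0).
A, M = 7 ** 5, 2 ** 31 - 1
seed_interarrival = 1
seed_service = lcg_jump(A, 0, M, seed_interarrival, 10000)
```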


Seed Selection Guidelines

• Do not reuse seeds in successive simulation runs
  – No point in running a simulation again with the same seed
  – Just continue with the last random number as the seed for the successive runs


Seed Selection Guidelines

• Do not use random seeds
  – E.g., do not use the time of day or /dev/random to seed simulations
  – Simulations should be repeatable
  – Cannot guarantee that multiple streams will not overlap
• Do not use numbers generated by random-number generators as seeds


Myths About Random-number Generation

• A complex set of operations leads to random results
  – Hard to guess does not mean random
• Random numbers are not predictable
  – Given a few successive numbers from an LCG
  – Can solve for a, c, and m
  – Not suitable for cryptographic applications
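A simplified illustration of that predictability: if the modulus is known and prime, three successive outputs are enough to solve for a and c. The known-modulus assumption, the helper name, and the parameter values (a = 16807, c = 12345, seed 2024) are all hypothetical; recovering m as well takes more outputs and is not shown.

```python
def recover_a_c(x0, x1, x2, m):
    """Solve x1 = a*x0 + c and x2 = a*x1 + c (mod m) for a and c."""
    # Subtracting the congruences gives x2 - x1 = a * (x1 - x0) (mod m).
    inverse = pow((x1 - x0) % m, -1, m)   # modular inverse, Python 3.8+
    a = ((x2 - x1) * inverse) % m
    c = (x1 - a * x0) % m
    return a, c


m = 2 ** 31 - 1
true_a, true_c = 16807, 12345             # illustrative secret parameters
x0 = 2024
x1 = (true_a * x0 + true_c) % m
x2 = (true_a * x1 + true_c) % m
print(recover_a_c(x0, x1, x2, m))         # (16807, 12345)
```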


Myths About Random-number Generation

• Some seeds are better than others
  – True
  – Avoid generators whose period and randomness depend on the seed
• Accurate implementation is not important
  – Watch out for overflows and truncations


Myths About Random-number Generation

• Bits of successive words generated by a random-number generator are equally randomly distributed
  – Nope


Myths About Random-number Generation

• x_n = 25173 x_{n-1} % 2^{16}
  – x_0 = 1
  – Least significant bit is always 1
  – Bit 2 is always 0
  – Bit 3 has a cycle of 2
  – Bit 4 has a cycle of 4
  – Bit 5 has a cycle of 8

  n   decimal   binary
  1   25173     01100010 01010101
  2   12345     00110000 00111001
  3   54509     11010100 11101101
  4   27825     01101100 10110001
  5   55493     11011000 11000101
  6   25449     01100011 01101001
  7   13277     00110011 11011101
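The bit cycles can be measured directly. The sketch assumes the multiplicative form x_n = 25173 x_{n-1} % 2^16 with x_0 = 1, which is the form that reproduces the tabulated values 25173, 12345, 54509, …; the helper name `bit_periods` is hypothetical.

```python
def bit_periods(a, k, seed, bits):
    """Period of each of the lowest `bits` bits over one full cycle."""
    m = 1 << k
    seq, x = [], seed
    while True:                      # collect one full cycle of the generator
        x = (a * x) % m
        seq.append(x)
        if x == seed:
            break
    n = len(seq)
    periods = []
    for bit in range(bits):
        column = [(v >> bit) & 1 for v in seq]
        # Smallest divisor p of n for which the bit column repeats every p steps.
        p = next(p for p in range(1, n + 1)
                 if n % p == 0 and column[p:] == column[:n - p])
        periods.append(p)
    return seq, periods


seq, periods = bit_periods(25173, 16, seed=1, bits=5)
print(seq[:3])   # [25173, 12345, 54509]
print(periods)   # [1, 1, 2, 4, 8]
```

A period of 1 means the bit never changes, which matches the first two bullets.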


Myths About Random-number Generation

• For all multiplicative LCGs
  • The Lth bit has a period that is at most 2^L
• For LCGs of the form x_n = a x_{n-1} % 2^k
  – The least significant bit is always 0 or always 1
• High-order bits are more random


More on Random-Number Generation

• Mersenne twister
  – Period = 2^{19937} - 1
• /dev/random
  – Extracts randomness from physical devices
  – Truly random
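Both kinds of source are available from Python's standard library, which makes the contrast easy to see. The `random` module is documented as using the Mersenne Twister; `os.urandom` asks the operating system for random bytes, which on Linux comes from the kernel's randomness source and is related to, but not necessarily the same device as, /dev/random. The seed 12345 is an arbitrary choice.

```python
import os
import random

# Mersenne Twister: the same seed gives the same sequence, so runs are repeatable.
rng_a = random.Random(12345)
rng_b = random.Random(12345)
repeatable = [rng_a.random() for _ in range(3)] == [rng_b.random() for _ in range(3)]

# Operating-system randomness: cannot be seeded, so runs are not repeatable.
os_bytes = os.urandom(8)
os_value = int.from_bytes(os_bytes, "big") / 2 ** 64   # scaled into [0, 1)

print(repeatable)  # True
```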
