rand() Considered Harmful

Post on 22-Feb-2016

Version 1.1 - September 5, 2013

Stephan T. Lavavej ("Steh-fin Lah-wah-wade")
Senior Developer - Visual C++ Libraries
stl@microsoft.com

What's Wrong With This Code?
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    int main() {
        srand(time(NULL));
        for (int i = 0; i < 16; ++i) {
            printf("%d ", rand() % 100);
        }
        printf("\n");
    }

What's Right With This Code?
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    int main() {
        srand(time(NULL));
        for (int i = 0; i < 16; ++i) {
            printf("%d ", rand() % 100);
        }
        printf("\n");
    }
• All required headers are included!
• All included headers are required!
• Headers are sorted!
• One True Brace Style!
• Unnecessary argc, argv, return 0; omitted!
• %d is correct for int!

What's Wrong With This Code?
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    int main() {
        srand(time(NULL));
        for (int i = 0; i < 16; ++i) {
            printf("%d ", rand() % 100);
        }
        printf("\n");
    }
• ABOMINATION!
• Frequency: 1 Hz!
• warning C4244: 'argument' : conversion from 'time_t' to 'unsigned int', possible loss of data
• 32-bit seed!
• Range: [0, 32767]
• Linear congruential low quality!
• Non-uniform distribution!

Modulo Non-Uniform Distribution
    int src = rand();    // Assume uniform [0, 32767]
    int dst = src % 100; // Non-uniform [0, 99]
    // [0, 99] src -> [0, 99] dst
    // [100, 199] src -> [0, 99] dst
    // ...
    // [32700, 32767] src -> [0, 67] dst
• This is modulo's fault, not rand()'s
• Trigger: input range isn't exact multiple of output range

Floating-Point Treachery
    int src = rand(); // Assume uniform [0, 32767]
    int dst = static_cast<int>(      // As seen on
        (src * 1.0 / RAND_MAX) * 99  // StackOverflow
    ); // Hilariously non-uniform [0, 99]
• Only one input produces the output 99:
    static_cast<int>((32765 * 1.0 / 32767) * 99) == 98
    static_cast<int>((32766 * 1.0 / 32767) * 99) == 98
    static_cast<int>((32767 * 1.0 / 32767) * 99) == 99

Floating-Point Double Treachery
    int src = rand(); // Assume uniform [0, 32767]
    int dst = static_cast<int>(
        (src * 1.0 / (RAND_MAX + 1)) * 100
    ); // Subtly non-uniform [0, 99]
• Less likely outputs (327/32768 vs. 328/32768):
  3, 6, 9, 12, 15, 18, 21, 24, 28, 31, 34, 37, 40, 43, 46, 49,
  53, 56, 59, 62, 65, 68, 71, 74, 78, 81, 84, 87, 90, 93, 96, 99
• Same problem as src % 100
• Nothing can uniformly map 32768 inputs to 100 outputs

Floating-Point Triple Treachery
• What if the input is [0, 2^32) or [0, 2^64)?
    • Non-uniformity is reduced, but not eliminated, when the input is much larger than the output
• What if IEEE runs out of bits?
    • Example: [0, 2^64) input -> [0, 10^18 ≈ 2^59.8) output
    • double has only 53 bits of significand precision
• Say you have a problem, so you use floating-point
    • Now you have 2.000001 problems
• DO NOT MESS WITH FLOATING-POINT

<random> URNGs (Uniform Random Number Generators)
• Engine templates:
    • linear_congruential_engine
    • mersenne_twister_engine
    • subtract_with_carry_engine
• Engine adaptor templates:
    • discard_block_engine
    • independent_bits_engine
    • shuffle_order_engine
• Non-deterministic:
    • random_device
• Engine (adaptor) typedefs:
    • minstd_rand0
    • minstd_rand
    • mt19937
    • mt19937_64
    • ranlux24_base
    • ranlux48_base
    • ranlux24
    • ranlux48
    • knuth_b
    • default_random_engine

<random> Distributions
• Uniform distributions
    • uniform_int_distribution
    • uniform_real_distribution
• Poisson distributions
    • poisson_distribution
    • exponential_distribution
    • gamma_distribution
    • weibull_distribution
    • extreme_value_distribution
• Sampling distributions
    • discrete_distribution
    • piecewise_constant_distribution
    • piecewise_linear_distribution
• Bernoulli distributions
    • bernoulli_distribution
    • binomial_distribution
    • geometric_distribution
    • negative_binomial_distribution
• Normal distributions
    • normal_distribution
    • lognormal_distribution
    • chi_squared_distribution
    • cauchy_distribution
    • fisher_f_distribution
    • student_t_distribution

Hello, "Random" World!
    #include <iostream>
    #include <random>
    int main() {
        std::mt19937 mt(1729);
        std::uniform_int_distribution<int> dist(0, 99);
        for (int i = 0; i < 16; ++i) {
            std::cout << dist(mt) << " ";
        }
        std::cout << std::endl;
    }
• Deterministic 32-bit seed
• Engine: [0, 2^32)
• Distribution: [0, 99]
• Note: [inclusive, inclusive]
• Run engine, viewed through distribution

Hello, Random World!
    #include <iostream>
    #include <random>
    int main() {
        std::random_device rd;
        std::mt19937 mt(rd());
        std::uniform_int_distribution<int> dist(0, 99);
        for (int i = 0; i < 16; ++i) {
            std::cout << dist(mt) << " ";
        }
        std::cout << std::endl;
    }
• Non-deterministic 32-bit seed

mt19937 vs. random_device
• mt19937 is:
    • Fast (499 MB/s = 6.5 cycles/byte for me)
    • Extremely high quality, but not cryptographically secure
    • Seedable (with more than 32 bits if you want)
    • Reproducible (Standard-mandated algorithm)
• random_device is:
    • Possibly slow (1.93 MB/s = 1683 cycles/byte for me)
    • Strongly platform-dependent (GCC 4.8 can use IVB RDRAND)
    • Possibly crypto-secure (check documentation, true for VC)
    • Non-seedable, non-reproducible

uniform_int_distribution
• Takes any Uniform Random Number Generator
    • Usually [0, 2^32) or [0, 2^64) but [1701, 1729] works
    • If your URNG does that, you are bad and you should feel bad
• Emits any desired range of integers [low, high]
    • signed/unsigned short/int/long/long long
    • Why not char/signed char/unsigned char? Standard Says So™
• Preserves perfect uniformity
    • Requires obsessive implementers
    • Uses bitwise/etc. magic, invokes URNG repeatedly (rare)
    • Runs fairly quickly (34% raw speed for me)
• Deterministic, but not invariant
    • Will vary across platforms, may vary across versions

random_shuffle() Considered Harmful
    template <typename RanIt>
    void random_shuffle(RanIt f, RanIt l);
• May call rand()
• C++ Standard Library, I trusted you!
    template <typename RanIt, typename RNG>
    void random_shuffle(RanIt f, RanIt l, RNG&& r);
• Not evil, but highly inconvenient
• Knuth shuffle needs r(n) to return [0, n)

shuffle() Considered Awesome
    template <typename RanIt, typename URNG>
    void shuffle(RanIt f, RanIt l, URNG&& g);
• Takes URNGs directly (e.g. mt19937)
• Shuffles perfectly
    • All permutations are equally likely
• Invokes the URNG in-place (can't copy)
    • Other algorithms can copy functors, like generate()
    • Special exception: for_each() moves functors

Random <random> Notes
• Running mt19937 is fast, constructing/copying isn't
    • Constructing/copying engines often is already undesirable
• URNG/distribution function call ops are non-const
    • Multiple threads cannot simultaneously call a single object
• When is it safe to skip uniform_int_distribution?
    • mt19937's [0, 2^32) or mt19937_64's [0, 2^64) -> [0, 2^N)
    • In this case, masking is safe, simple, and efficient
    • In all other cases, use uniform_int_distribution
