rand() Considered Harmful
Post on 22-Feb-2016
Version 1.1 - September 5, 2013
1
rand() Considered Harmful
Stephan T. Lavavej ("Steh-fin Lah-wah-wade")
Senior Developer - Visual C++ Libraries
stl@microsoft.com
2
What's Wrong With This Code?

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    for (int i = 0; i < 16; ++i) {
        printf("%d ", rand() % 100);
    }
    printf("\n");
}
3
What's Right With This Code?

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    for (int i = 0; i < 16; ++i) {
        printf("%d ", rand() % 100);
    }
    printf("\n");
}

• All required headers are included!
• All included headers are required!
• Headers are sorted!
• One True Brace Style!
• Unnecessary argc, argv, return 0; omitted!
• %d is correct for int!
9
What's Wrong With This Code?

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    for (int i = 0; i < 16; ++i) {
        printf("%d ", rand() % 100);
    }
    printf("\n");
}

• ABOMINATION!
• Frequency: 1 Hz!
• warning C4244: 'argument' : conversion from 'time_t' to 'unsigned int', possible loss of data
• 32-bit seed!
• Range: [0, 32767]
• Linear congruential low quality!
• Non-uniform distribution!
10
Modulo Non-Uniform Distribution

int src = rand();    // Assume uniform [0, 32767]
int dst = src % 100; // Non-uniform [0, 99]
// [0, 99] src → [0, 99] dst
// [100, 199] src → [0, 99] dst
// ...
// [32700, 32767] src → [0, 67] dst

• This is modulo's fault, not rand()'s
• Trigger: input range isn't exact multiple of output range
11
Floating-Point Treachery

int src = rand(); // Assume uniform [0, 32767]
int dst = static_cast<int>(       // As seen on
    (src * 1.0 / RAND_MAX) * 99   // StackOverflow
); // Hilariously non-uniform [0, 99]

• Only one input produces the output 99:
static_cast<int>((32765 * 1.0 / 32767) * 99) == 98
static_cast<int>((32766 * 1.0 / 32767) * 99) == 98
static_cast<int>((32767 * 1.0 / 32767) * 99) == 99
12
Floating-Point Double Treachery

int src = rand(); // Assume uniform [0, 32767]
int dst = static_cast<int>(
    (src * 1.0 / (RAND_MAX + 1)) * 100
); // Subtly non-uniform [0, 99]

• Less likely outputs (327/32768 vs. 328/32768):
  3, 6, 9, 12, 15, 18, 21, 24, 28, 31, 34, 37, 40, 43, 46, 49,
  53, 56, 59, 62, 65, 68, 71, 74, 78, 81, 84, 87, 90, 93, 96, 99
• Same problem as src % 100
• Nothing can uniformly map 32768 inputs to 100 outputs
13
Floating-Point Triple Treachery
• What if the input is [0, 2^32) or [0, 2^64)?
  • Non-uniformity is reduced, but not eliminated, when the input is much larger than the output
• What if IEEE runs out of bits?
  • Example: [0, 2^64) input → [0, 10^18 ≈ 2^59.8) output
  • double has only 53 bits of significand precision
• Say you have a problem, so you use floating-point
  • Now you have 2.000001 problems
• DO NOT MESS WITH FLOATING-POINT
14
<random> URNGs (Uniform Random Number Generators)
• Engine templates:
  • linear_congruential_engine
  • mersenne_twister_engine
  • subtract_with_carry_engine
• Engine adaptor templates:
  • discard_block_engine
  • independent_bits_engine
  • shuffle_order_engine
• Non-deterministic:
  • random_device
• Engine (adaptor) typedefs:
  • minstd_rand0
  • minstd_rand
  • mt19937
  • mt19937_64
  • ranlux24_base
  • ranlux48_base
  • ranlux24
  • ranlux48
  • knuth_b
  • default_random_engine
15
<random> Distributions
• Uniform distributions
  • uniform_int_distribution
  • uniform_real_distribution
• Poisson distributions
  • poisson_distribution
  • exponential_distribution
  • gamma_distribution
  • weibull_distribution
  • extreme_value_distribution
• Sampling distributions
  • discrete_distribution
  • piecewise_constant_distribution
  • piecewise_linear_distribution
• Bernoulli distributions
  • bernoulli_distribution
  • binomial_distribution
  • geometric_distribution
  • negative_binomial_distribution
• Normal distributions
  • normal_distribution
  • lognormal_distribution
  • chi_squared_distribution
  • cauchy_distribution
  • fisher_f_distribution
  • student_t_distribution
21
Hello, "Random" World!

#include <iostream>
#include <random>

int main() {
    std::mt19937 mt(1729);
    std::uniform_int_distribution<int> dist(0, 99);
    for (int i = 0; i < 16; ++i) {
        std::cout << dist(mt) << " ";
    }
    std::cout << std::endl;
}

• Deterministic 32-bit seed
• Engine: [0, 2^32)
• Distribution: [0, 99]
• Note: [inclusive, inclusive]
• Run engine, viewed through distribution
22
Hello, Random World!

#include <iostream>
#include <random>

int main() {
    std::random_device rd;
    std::mt19937 mt(rd());
    std::uniform_int_distribution<int> dist(0, 99);
    for (int i = 0; i < 16; ++i) {
        std::cout << dist(mt) << " ";
    }
    std::cout << std::endl;
}

• Non-deterministic 32-bit seed
23
mt19937 vs. random_device
• mt19937 is:
  • Fast (499 MB/s = 6.5 cycles/byte for me)
  • Extremely high quality, but not cryptographically secure
  • Seedable (with more than 32 bits if you want)
  • Reproducible (Standard-mandated algorithm)
• random_device is:
  • Possibly slow (1.93 MB/s = 1683 cycles/byte for me)
  • Strongly platform-dependent (GCC 4.8 can use IVB RDRAND)
  • Possibly crypto-secure (check documentation, true for VC)
  • Non-seedable, non-reproducible
24
uniform_int_distribution
• Takes any Uniform Random Number Generator
  • Usually [0, 2^32) or [0, 2^64) but [1701, 1729] works
  • If your URNG does that, you are bad and you should feel bad
• Emits any desired range of integers [low, high]
  • signed/unsigned short/int/long/long long
  • Why not char/signed char/unsigned char? Standard Says So(TM)
• Preserves perfect uniformity
  • Requires obsessive implementers
  • Uses bitwise/etc. magic, invokes URNG repeatedly (rare)
  • Runs fairly quickly (34% raw speed for me)
• Deterministic, but not invariant
  • Will vary across platforms, may vary across versions
25
random_shuffle() Considered Harmful

template <typename RanIt>
void random_shuffle(RanIt f, RanIt l);
• May call rand()
• C++ Standard Library, I trusted you!

template <typename RanIt, typename RNG>
void random_shuffle(RanIt f, RanIt l, RNG&& r);
• Not evil, but highly inconvenient
• Knuth shuffle needs r(n) to return [0, n)
26
shuffle() Considered Awesome

template <typename RanIt, typename URNG>
void shuffle(RanIt f, RanIt l, URNG&& g);
• Takes URNGs directly (e.g. mt19937)
• Shuffles perfectly
  • All permutations are equally likely
• Invokes the URNG in-place (can't copy)
  • Other algorithms can copy functors, like generate()
  • Special exception: for_each() moves functors
27
Random <random> Notes
• Running mt19937 is fast, constructing/copying isn't
  • Constructing/copying engines often is already undesirable
• URNG/distribution function call ops are non-const
  • Multiple threads cannot simultaneously call a single object
• When is it safe to skip uniform_int_distribution?
  • mt19937's [0, 2^32) or mt19937_64's [0, 2^64) → [0, 2^N)
  • In this case, masking is safe, simple, and efficient
  • In all other cases, use uniform_int_distribution