RANDOMNESS AND PSEUDORANDOMNESS Omer Reingold, Microsoft Research and Weizmann
RANDOMNESS AND PSEUDORANDOMNESSOmer Reingold, Microsoft Research and Weizmann
Randomness and Pseudorandomness
When Randomness is Useful When Randomness can be reduced or
eliminated – derandomization Basic Tool: Pseudorandomness
An object is pseudorandom if it “looks random” (indistinguishable from uniform), though it is not.
Expander Graphs
Randomness In Computation (1)
Distributed computing (breaking symmetry)
Cryptography: Secrets, Semantic Security, …
Sampling, Simulations, …
Can’t live without you
Randomness In Computation (2)
Communication Complexity (e.g., equality)
Routing (on the cube [Valiant]) - drastically reduces congestion
You change my world
Randomness In Computation (3)
In algorithms – useful design tool, but many times can derandomize (e.g., PRIMES in P). Is it always the case?
BPP=P means that every randomized algorithm can be derandomized with only polynomial increase in time
RL=L means that every randomized algorithm can be derandomized with only a constant factor increase in memory
Do I really need you?
In Distributed Computing
Byzantine Agreement
Deterministic
Randomized
Synchronoust failures
t+1 rounds O(1)
Asynchronous
impossible possible
Dining Philosophers: breaking symmetry
Attack Now
Don’t Attack
Randomness Saves Communication
Original File
Copy
=?
Deterministic: need to send the entire file! Randomness in the Sky: O(1) bits (or log in 1/error) Private Randomness: Logarithmic number of bits (derandomization).
In Cryptography
Private Keys: no randomness - no secrets and no identitiesEncryption: two encryptions of same message with same key need to be differentRandomized (interactive) Proofs: Give rise to wonderful new notions: Zero-Knowledge, PCPs, …
Random Walks and Markov Chains
When in doubt, flip a coin: Explore graph: minimal memory Page Rank: stationary
distribution of Markov Chains Sampling vs. Approx counting.
Estimating size of Web Simulations of Physical Systems …
Shake Your Input Communication network (n-dimensional cube)
Every deterministic routing scheme will incur exponentially busy links (in worse case)Valiant: To send a message from x y, select node z at random, send x z y.Now: O(1) expected load for every edge
Another example – randomized quicksort Smoothed Analysis: small perturbations, big
impact
In Private Data AnalysisHide Presence/Absence of Any IndividualHow many people in the database have the BC1 gene? Add random noise to true answer distributed as Lap(/) More questions? More privacy? Need more noise.
0 2 3 4--2-3-4
ratio bounded
Randomness and Pseudorandomness
When Randomness is Useful When Randomness can be reduced or
eliminated – derandomization Basic Tool: Pseudorandomness
An object is pseudorandom if it “looks random” (indistinguishable from uniform), though it is not.
Expander Graphs
Cryptography: Good Pseudorandom Generators are Crucial
With them, we have one-time pad (and more):
ciphertext = plaintext K = 01101011
E Dplaintext data:00001111
plaintext data: (ciphertext K) = 00001111
short key K0: 110
Without, keys are bad, algorithms are worthless (theoretical & practical)
derived key K:01100100
Data Structures & Hash Functions
Linear Probing:
Alice
Mike
Bob
Mary
Bob
F(Bob)
If F is random then insertion time and query time are O(1) (in expectation).
But where do you store a random function ?!? Derandomize!
Heuristic: use SHA1, MD4, … Recently (2007): 5-wise independent
functions are sufficient* Similar considerations all over: bloom
filters, cuckoo hashing, bit-vectors, …
Weak Sources & Randomness Extractors
Available random bits are biased and correlated Von Neumann sources:
Randomness Extractors produce randomness from general weak sources, many other applications
b1 b2 … bi … are i.i.d. 0/1 variables and bi =1 with some probability p < 1 then translate
01 1
10 0
Algorithms: Can Randomness Save Time or Memory?
Conjecture - No* (*moderate overheads may still apply)
Examples of derandomization:
Holdouts: Identity testing, approximation algorithms, …
Primality Testing in Polynomial Time
Graph Connectivity logarithmic Memory
(Bipartite) Expander Graphs
|(S)| A |S|
(A > 1)
S, |S| K
Important: every (not too large) set expands.
D
N N
(Bipartite) Expander Graphs
|(S)| A |S|
(A > 1)
S, |S| K
Main goal: minimize D (i.e. constant D)
• Degree 3 random graphs are expanders! [Pin73]
D
N N
(Bipartite) Expander Graphs
|(S)| A |S|
(A > 1)
S, |S| K
Also: maximize A. Trivial upper bound: A D
even A ≲ D-1 Random graphs: AD-1
D
N N
Applications of Expanders
These “innocent” looking objects are intimately related to various fundamental problems:
Network design (fault tolerance), Sorting networks, Complexity and proof theory, Derandomization, Error correcting codes, Cryptography, Ramsey theory And more ...
Non-blocking Network with On-line Path Selection [ALM]
N (Inputs)
N (Outputs)
Depth O(log N), size O(N log N), bounded degree.
Allows connection between input nodes and output nodes using vertex disjoint paths.
Non-blocking Network with On-line Path Selection [ALM]
N (Inputs)
N (Outputs)
Every request for connection (or disconnection) is satisfied in O(log N) bit steps:
• On line. Handles many requests in parallel.
The Network
“Lossless”
Expander
N (Inputs) N (outputs)
Slightly Unbalanced, “Lossless” Expanders
|(S)| 0.9 D |S|
S, |S| K
D
N M= N
0< 1 is an arbitrary constant D is constant & K= (M/D) = (N/D). [CRVW 02]: such expanders (with D = polylog(1/))
Property 1: A Very Strong Unique Neighbor Property
S, |S| K, |(S)| 0.9 D |S|
S
Non Unique neighbor
S has 0.8 D |S| unique neighbors !
Unique neighbor of S
Using Unique Neighbors for Distributed Routing
Task: match S to its neighbors (|S| K)
S
Step I: match S to its unique neighbors.
S`
Continue recursively with unmatched vertices S’.
Reminder: The Network
Adding new paths: think of vertices used by previous paths as faulty.
Property 2: Incredibly Fault Tolerant
S, |S| K, |(S)| 0.9 D |S|
Remains a lossless expander even if adversary removes (0.7 D) edges from each vertex.
Simple Expander Codes [G63,Z71,ZP76,T81,SS96]
M= N (Parity Checks)
Linear code; Rate 1 – M/N = (1 - ).Minimum distance K. Relative distance K/N= ( / D) = / polylog (1/). For small beats the Zyablov bound and is quite
close to the Gilbert-Varshamov bound of / log (1/).
N (Variables)
1
100
1
++
+
+
0
Error set B, |B| K/2
Algorithm: At each phase, flip every variable that “sees” a majority of 1’s (i.e, unsatisfied constraints).
Simple Decoding Algorithm in Linear Time (& log n parallel phases)
[SS 96]
M= N (Constraints)N (Variables)
++
+
+1
100
1|Flip\B| |B|/4|B\Flip| |B|/4 |Bnew| |B|/2
|(B)| > .9 D |B|
|(B)Sat|< .2 D|B|0
10
0
11
0
Random Walk on Expanders [AKS 87]
...x0
x1
x2 xi
xi converges to uniform fast (for arbitrary x0).
For a random x0: the sequence x0, x1, x2 . . . has interesting “random-like” properties.
Thanks