Chapter 1: Events and Probability
Thursday, June 9, 2010, 7:00 (GMT+7)

1. Background
Inclusion-exclusion principle: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$
Applications:
- 2 independent events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$
- 2 disjoint events: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$

2. Theory summary
Main topics
Content | Take note
1. Verifying Polynomial Identities
Suppose we have a program that multiplies polynomials. Example: is $(x + 1)(x - 2)(x + 3)(x - 4)(x + 5)(x - 6) \equiv x^6 - 7x^3 + 25$? Our program outputs $x^6 - 7x^3 + 25$, and we want to check that this result is correct. We could multiply the factors out one by one, but that is expensive, and in practice it only reruns the same polynomial multiplication algorithm along the same path: if that algorithm is wrong, the check is wrong too. Instead we use a randomized algorithm to answer the question: $F(x) \equiv G(x)$?
ALGORITHM 1.1: Choose an arbitrary value $x = a$. If $F(a) \neq G(a)$, conclude immediately that $F(x) \not\equiv G(x)$.
PROBABILISTIC ANALYSIS: When $F(a) = G(a)$ we cannot yet conclude that $F(x) \equiv G(x)$, because $a$ may be a root of the equation $F(x) - G(x) = 0$. Suppose $F(x)$ and $G(x)$ have degree at most $d$. Then, if $F(x) \not\equiv G(x)$, the polynomial $F(x) - G(x)$ has at most $d$ roots, so among all the integers we could choose there are at most $d$ values of $a$ with $F(a) = G(a)$.

Take note:
This algorithm serves as an introduction to solving problems with probability. Do not dwell on:
- whether the roots of the polynomial are real or integer
- whether $a$ is chosen from $\{1, \dots, 100d\}$ or from $\{1, \dots, 1000000d\}$
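ALGORITHM 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `F` is given as the constants of its linear factors, `G` as a coefficient list, and the helper names (`eval_factored`, `eval_coeffs`, `probably_identical`) as well as the sampling range $\{1, \dots, 100d\}$ are choices made for this example.

```python
import random

def eval_factored(constants, a):
    """Evaluate F(a) = product of (a + c) over the given constants."""
    result = 1
    for c in constants:
        result *= (a + c)
    return result

def eval_coeffs(coeffs, a):
    """Evaluate G(a) by Horner's rule; coeffs run from the highest degree down."""
    result = 0
    for c in coeffs:
        result = result * a + c
    return result

def probably_identical(constants, coeffs, trials=10, rng=None):
    """Return False as soon as a witness a with F(a) != G(a) is found."""
    rng = rng or random.Random()
    d = max(len(constants), len(coeffs) - 1)   # degree bound (assumption of the sketch)
    for _ in range(trials):
        a = rng.randint(1, 100 * d)            # uniform in {1, ..., 100d}
        if eval_factored(constants, a) != eval_coeffs(coeffs, a):
            return False                       # certainly not identical
    return True                                # identical, up to a small error probability

# (x+1)(x-2)(x+3)(x-4)(x+5)(x-6) versus the claimed output x^6 - 7x^3 + 25
F = [1, -2, 3, -4, 5, -6]
G = [1, 0, 0, -7, 0, 0, 25]
print(probably_identical(F, G))
```

A `False` answer is always correct; only a `True` answer can be wrong, which is the one-sided error the analysis bounds.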
If we choose $a$ uniformly from $\{1, \dots, 100d\}$, the sample space has $100d$ outcomes, so the probability of hitting a root of $F(x) - G(x)$ is at most $d/100d = 1/100$. This is also the probability that ALGORITHM 1.1 gives a wrong answer. Now run the algorithm $n$ times independently. By the product rule for independent events, the probability that ALGORITHM 1.1 fails is at most $(1/100)^n$.
2. Axioms of Probability
The sample space is the set of all possible outcomes of an experiment. The probability function is a mapping from the set of events into the real numbers $\mathbb{R}$. We write this function as $\Pr(E)$ = the probability of event $E$.
Conditional Probability: $\Pr(E \mid F) = \frac{\Pr(E \cap F)}{\Pr(F)}$
Law of Total Probability: let $E_1, E_2$ be disjoint events ($E_1 \cap E_2 = \emptyset$) that together cover the sample space. Then for any event $B$: $\Pr(B) = \Pr(B \cap E_1) + \Pr(B \cap E_2)$
=> Bayes' Law: $\Pr(E_1 \mid B) = \frac{\Pr(B \cap E_1)}{\Pr(B)} = \frac{\Pr(B \cap E_1)}{\Pr(B \cap E_1) + \Pr(B \cap E_2)}$

Take note:
- An easy way to remember conditional probability: the event $F$ is a condition that restricts the sample space to a smaller sample space.
- Example:
+ The sample space of the natural numbers is $\{0, 1, 2, 3, \dots\}$.
+ The sample space of the natural numbers under the condition "less than 5" is $\{0, 1, 2, 3, 4\}$.
+ The event "even number" is the set of even natural numbers. (An event is a subset of the sample space.)
+ Two disjoint events that cover the sample space are $E_1$ = "even natural numbers" and $E_2$ = "odd natural numbers".
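3. Verifying Matrix Multiplication
Given three $n \times n$ matrices $A$, $B$ and $C$, we need to check whether $AB = C$. Here $A$ and $B$ are matrices whose entries are only 0 and 1. Classical algorithm: compute $AB$ and compare it with $C$. Time: $\Theta(n^3)$.
ALGORITHM 1.3: Choose a random $n$-dimensional vector $r = (r_1, r_2, \dots, r_n)$ with each $r_i$ equal to 0 or 1 ($1 \le i \le n$). Compute $ABr = A(Br)$ and compare it with $Cr$.
Case 1: $ABr \neq Cr$, which implies $AB \neq C$.
Case 2: $ABr = Cr$. It is still possible that $AB \neq C$.
We compute the probability that $AB \neq C$ and $ABr = Cr$, i.e. the probability that the algorithm fails. Let $D = AB - C$. Then $D \neq 0$ and $Dr = 0$. Since $D \neq 0$, there is an entry $d_{ik}$ of $D$ with $d_{ik} \neq 0$. In addition $Dr = 0$, so
$\sum_{j=1}^{n} d_{ij} r_j = 0 \iff r_k = -\frac{\sum_{j=1, j \neq k}^{n} d_{ij} r_j}{d_{ik}}$ (3.1)
where all the $r_j$ are chosen at random. Suppose we have already chosen every $r_j$ ($j = 1, \dots, n$, $j \neq k$), so that only $r_k$ remains. The right-hand side of (3.1) then takes one specific value, which may be 0, 1 or something else. Since $r_k$ takes only the values 0 and 1, each with probability 1/2, the probability that $r_k$ satisfies (3.1) is at most 1/2. So the failure probability of one run of ALGORITHM 1.3 is at most 1/2. Running it $t$ times independently gives a failure probability of at most $(1/2)^t$.

Take note:
All the $r_j$ are chosen at random ($j = 1, 2, \dots, n$); only $r_k$ is left to be considered last. This method is called deferred decisions. The random values chosen first are treated as already fixed; randomness is brought in only at the deciding step.
Example: let $x_1, x_2, x_3, x_4, x_5, x_6$ be 6 independent random natural numbers, each uniform modulo 6 (for example, fair die rolls). Find the probability that $x_1 + x_2 + x_3 + x_4 + x_5 + x_6$ is divisible by 6. Applying deferred decisions, this probability is 1/6.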
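The check $A(Br) = Cr$ costs only three matrix-vector products, i.e. $\Theta(n^2)$ time per run. A minimal sketch, assuming matrices are stored as lists of rows; the function names `mat_vec`, `check_with_vector` and `verify_product` are illustrative and not from the notes:

```python
import random

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def check_with_vector(A, B, C, r):
    """One run of the test for a given 0/1 vector r: is A(Br) == Cr?"""
    return mat_vec(A, mat_vec(B, r)) == mat_vec(C, r)

def verify_product(A, B, C, trials=20, rng=None):
    """Return False if some random r exposes AB != C, True otherwise."""
    rng = rng or random.Random()
    n = len(A)
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n)]
        if not check_with_vector(A, B, C, r):
            return False          # certainly AB != C
    return True                   # AB == C, or every run was unlucky

A = [[1, 0], [1, 1]]
B = [[0, 1], [1, 1]]
C_bad = [[0, 1], [1, 3]]          # the true product is [[0, 1], [1, 2]]
print(check_with_vector(A, B, C_bad, [0, 1]))   # this r exposes the error
print(check_with_vector(A, B, C_bad, [1, 0]))   # this r misses it
```

In this small example $D = AB - C$ has a single nonzero entry in column 2, so the test fails exactly when $r_2 = 0$, which happens with probability 1/2, matching the bound.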
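4. A Randomized Min-Cut Algorithm (Karger's Algorithm)
Let $G = (V, E)$ be a graph. A cut-set is a set of edges whose removal increases the number of connected components of the graph. The min-cut of $G$ is its smallest cut-set. The problem is to find a min-cut of the graph and count the edges in it. The classical algorithm has complexity $O(n^3)$.
ALGORITHM 1.4: In each iteration perform one edge contraction (explained in the take note). After $n - 2$ iterations only 2 vertices remain.
- Return Min-Cut = the number of edges joining these 2 vertices.

Take note:
The edge contraction of two vertices $A$ and $B$ merges $A$ and $B$ into a single vertex while preserving all their connections to the other vertices of the graph (i.e. keeping their incoming and outgoing edges).

PROBABILISTIC ANALYSIS: Let $S$ and $V - S$ be the two vertex sets separated by the min-cut. If we only ever contract edges inside $S$ or inside $V - S$, the algorithm returns the correct result. Any contraction that removes an edge of the min-cut may make the result incorrect. Let $E_i$ be the event that in iteration $i$ we do not contract an edge of the min-cut. Let $F_i$ be the event that none of the first $i$ iterations contracts an edge of the min-cut. We have $F_i = \bigcap_{j=1}^{i} E_j$, and we need $\Pr(F_{n-2})$.
Initially: let $n$ and $m$ be the number of vertices and edges of $G$. Let MC be a min-cut of $G$ and $c$ the number of edges in MC. Suppose vertex $A$ has the smallest degree in the graph, $\deg(A) = k$. Then $c \le k$, so every vertex has degree at least $c$ and $m \ge nc/2$.
Iteration 1: the contracted edge lies in MC with probability $c/m \le 2/n$, so $\Pr(E_1) = \Pr(F_1) \ge 1 - \frac{2}{n}$.
Iteration 2: after the first iteration $n - 1$ vertices remain. Hence $\Pr(E_2 \mid F_1) \ge 1 - \frac{2}{n-1}$.
Similarly, at iteration $i$, $n - i + 1$ vertices remain and $\Pr(E_i \mid F_{i-1}) \ge 1 - \frac{2}{n-i+1}$.
Putting this together:
$\Pr(F_{n-2}) = \Pr(E_{n-2} \cap F_{n-3}) = \Pr(E_{n-2} \mid F_{n-3})\Pr(F_{n-3}) = \Pr(E_{n-2} \mid F_{n-3})\Pr(E_{n-3} \cap F_{n-4}) = \dots = \Pr(E_{n-2} \mid F_{n-3})\Pr(E_{n-3} \mid F_{n-4}) \cdots \Pr(E_2 \mid F_1)\Pr(F_1)$
$\ge \prod_{i=1}^{n-2} \left(1 - \frac{2}{n-i+1}\right) = \prod_{i=1}^{n-2} \frac{n-i-1}{n-i+1} = \frac{2}{n(n-1)}$
We run the program using ALGORITHM 1.4 $n(n-1)\ln n$ times and take the smallest result:
$\Pr(\text{fail}) \le \left(1 - \frac{2}{n(n-1)}\right)^{n(n-1)\ln n} \le e^{-2\ln n} = \frac{1}{n^2}$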
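The contraction procedure can be sketched as follows. This is one possible implementation, not the one from the notes: it assumes a connected multigraph with vertices `0..n-1` given as an edge list, and it simulates contractions with a union-find structure by processing the edges in random order (edges whose endpoints are already merged are skipped, which corresponds to picking a uniformly random remaining edge). The names `contract_once` and `randomized_min_cut` are illustrative.

```python
import math
import random

def contract_once(n, edges, rng):
    """One run of random contractions; returns the size of the cut it finds."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    order = edges[:]
    rng.shuffle(order)                      # random order of contractions
    components = n
    for u, v in order:
        if components == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:                        # skip edges inside a merged vertex
            parent[ru] = rv                 # contract the edge (u, v)
            components -= 1
    # Edges joining the two remaining super-vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))

def randomized_min_cut(n, edges, rng=None):
    """Repeat n(n-1) ln n times and keep the smallest cut found."""
    rng = rng or random.Random()
    runs = math.ceil(n * (n - 1) * math.log(n))
    return min(contract_once(n, edges, rng) for _ in range(runs))

# Two triangles joined by a single bridge edge: the min-cut has 1 edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(randomized_min_cut(6, edges))
```

Every run returns the size of some valid cut, so the minimum over the runs can only overestimate the min-cut; the analysis bounds the probability that it does.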
3. Exercises:
http://docs.google.com/View?id=dgmqjfk5_188cq53p6ft

Chapter 2: Discrete Random Variables and Expectation
Thursday, June 9, 2010, 12:00 (GMT+7)

1. Background
1.1. The inclusion-exclusion principle: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$
Applications:
- 2 independent events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$
- 2 disjoint events: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$
1.2. Bayes' Law: $\Pr(E_1 \mid B) = \frac{\Pr(B \cap E_1)}{\Pr(B)} = \frac{\Pr(B \cap E_1)}{\Pr(B \cap E_1) + \Pr(B \cap E_2)}$

2. Theory summary
Main topics
Content | Take note
1. Random Variables and Expectation
Random Variable: a random variable $X$ is a mapping from the sample space into the real numbers $\mathbb{R}$.
Discrete Random Variable: a discrete random variable $X$ is a random variable whose set of values is not all of $\mathbb{R}$ but a countable set.
The Expectation of a Random Variable: $E[X] = \sum_{x} x \Pr(X = x)$
Linearity of Expectation: if $X$ and $Y$ are discrete random variables, then $E[X + Y] = E[X] + E[Y]$.

Take note:
A set $S$ is countable if it is finite or there is a bijection between $S$ and the natural numbers. Keep this definition in mind in order to follow the next part.
This theorem is applied constantly because it only requires $X$ and $Y$ to be discrete; they need not be independent.
Hint for the proof: the events $((X = x) \cap (Y = y_1))$ and $((X = x) \cap (Y = y_2))$ are disjoint. Hence:
$\Pr((X = x) \cap (Y = y_1)) + \Pr((X = x) \cap (Y = y_2)) = \Pr((X = x) \cap ((Y = y_1) \cup (Y = y_2)))$
Therefore: $\sum_{y} \Pr((X = x) \cap (Y = y)) = \Pr(X = x)$
2. The Bernoulli and Binomial Random Variables
Bernoulli Random Variable (or indicator random variable): consider the outcome of an experiment: $Y = 1$ if the outcome is a success, $Y = 0$ otherwise, with $\Pr(Y = 1) = p$. Then $E[Y] = 1 \cdot p + 0 \cdot (1 - p) = p$.
Binomial Random Variable: we call $X$ a binomial random variable with parameters $n$ and $p$ if:
$\Pr(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$
More explicitly, $X$ is the number of successes in $n$ trials $T_1, T_2, \dots, T_n$, where $\Pr(T_1 = 1) = \Pr(T_2 = 1) = \dots = p$. $E[X] = np$.

Take note:
There are 2 ways to define a random variable:
1. A definition based on logic (a description of the experiment).
2. A definition based on probability, i.e. stating the set of values and the probability of each event in that set.
The second way gives a more rigorous definition and is used more often. With a definition of the second kind we obtain the distribution of the variable.

Binomial random variable with parameters $n$ and $p$. Proof of $E[X] = np$:
Let $T_1, T_2, \dots, T_n$ be the $n$ trials. Each $T_i$ is a Bernoulli random variable with parameter $p$ => $E[T_i] = p$; $i = 1, 2, 3, \dots, n$. Applying linearity of expectation:
$E[X] = E\left[\sum_{i=1}^{n} T_i\right] = \sum_{i=1}^{n} E[T_i] = np$
3. Conditional Expectation
Conditional Expectation: consider the part of the sample space on which $Z = z$ holds. $E[Y \mid Z = z] = \sum_{y} y \Pr(Y = y \mid Z = z)$ is called the expectation of the random variable $Y$ conditioned on $Z = z$.
Decomposition Law: $E[X] = \sum_{y} \Pr(Y = y) E[X \mid Y = y]$. The proof of this formula is similar to the proof of linearity of expectation.
Theorem on the expectation of an expectation: $E[Y] = E[E[Y \mid Z]]$

Example: consider 2 standard dice (standard meaning 6 faces, each with probability 1/6, labeled with the distinct numbers 1 to 6). Roll them once to obtain the two numbers $X_1$ and $X_2$. Let $X = X_1 + X_2$.
$E[X \mid X_1 = 2] = \sum_{x} x \Pr(X = x \mid X_1 = 2)$
Observe that $1 \le X_1, X_2 \le 6$; here $X_1 = 2$, so $3 \le X \le 8$.
$E[X \mid X_1 = 2] = \sum_{x=3}^{8} x \Pr(X = x \mid X_1 = 2) = \sum_{x=3}^{8} x \cdot \frac{1}{6} = \frac{11}{2}$

Take note:
Compare (law of total probability): let $E_1, E_2$ be disjoint events ($E_1 \cap E_2 = \emptyset$) that together cover the sample space. Then for any event $B$: $\Pr(B) = \Pr(B \cap E_1) + \Pr(B \cap E_2)$
Proof (of $E[Y] = E[E[Y \mid Z]]$): let $f(Z) = E[Y \mid Z]$. Then:
$E[E[Y \mid Z]] = E[f(Z)] = \sum_{z} \Pr(Z = z) f(z) = \sum_{z} \Pr(Z = z) E[Y \mid Z = z] = E[Y]$
(the last equality follows from the decomposition law)
4. The Geometric Distribution
Geometric Distribution: $X$ is a geometric random variable with parameter $p$ if $\Pr(X = n) = (1 - p)^{n-1} p$. From this we compute:
$\Pr(X \ge n) = \sum_{i \ge n} \Pr(X = i) = \sum_{i \ge n} (1 - p)^{i-1} p = p(1 - p)^{n-1} \sum_{i \ge 0} (1 - p)^i = p(1 - p)^{n-1} \frac{1}{1 - (1 - p)} = (1 - p)^{n-1}$
Expectation formula for positive integer variables: let $X$ be a discrete random variable that takes only positive integer values. Then $E[X] = \sum_{i \ge 1} \Pr(X \ge i)$.
Applying this formula gives the expectation of a geometric random variable:
$E[X] = \sum_{i \ge 1} \Pr(X \ge i) = \sum_{i \ge 1} (1 - p)^{i-1} = \frac{1}{1 - (1 - p)} = \frac{1}{p}$

Take note:
The definition of a geometric random variable is given in the form of a probability distribution (see Chapter 1). In terms of meaning, "$X$ is a geometric random variable with parameter $p$" says that $X$ is the number of trials needed to obtain the first success, given that each trial succeeds with probability $p$.
Compare:
1. Binomial Random Variable: the number of trials is fixed = $n$; the number of successes is $X$.
2. Geometric Random Variable: the number of trials is $X$; the number of successes is fixed = 1.

Proof (of the expectation formula for positive integer variables): use the definition of expectation.
$E[X] = \sum_{j \ge 1} j \Pr(X = j) = \sum_{j \ge 1} \sum_{i=1}^{j} \Pr(X = j)$
Exchanging the order of summation:
$\sum_{j \ge 1} \sum_{i=1}^{j} \Pr(X = j) = \sum_{i \ge 1} \sum_{j \ge i} \Pr(X = j) = \sum_{i \ge 1} \Pr(X \ge i)$
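The identity $E[X] = \sum_{i \ge 1} \Pr(X \ge i)$ can be checked exactly on a small distribution. The sketch below uses a fair die as an example distribution (an assumption of this illustration) and exact fractions; the helper names are hypothetical.

```python
from fractions import Fraction

def expectation(pmf):
    """E[X] from the definition: sum of x * Pr(X = x)."""
    return sum(x * p for x, p in pmf.items())

def tail_sum(pmf):
    """Sum over i >= 1 of Pr(X >= i), for X with positive integer values."""
    top = max(pmf)
    return sum(
        sum(p for x, p in pmf.items() if x >= i)
        for i in range(1, top + 1)
    )

die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expectation(die), tail_sum(die))   # both equal 7/2
```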
Extra: Coupon Collector's Problem
Problem: There are $n$ types of coupons in a box, with a very large supply of each type. Each time we draw 1 coupon. How many draws do we need in order to collect all $n$ types?
Problem Analysis: The problem asks for the number of draws needed to obtain all $n$ types. So what is the difference between having $n - 1$ types and having $n$ types? Once we hold 1 type, getting one more type is easy: the next draw yields a new type with probability $(n-1)/n$. But once we hold $n - 1$ types, the probability of drawing the $n$-th type is only $1/n$. So obtaining a new type does not depend on what we did before, only on the number of types collected so far. That is, the number of draws needed to go from $i - 1$ types to $i$ types depends only on the value of $i$.
Proof: Let $X_i$ be the number of draws made from the moment we hold $i - 1$ types until we hold $i$ types. Each $X_i$ ($i = 1, 2, \dots, n$) is a geometric random variable with parameter $p_i = 1 - \frac{i-1}{n} = \frac{n-i+1}{n}$. Hence $E[X_i] = \frac{1}{p_i} = \frac{n}{n-i+1}$, and therefore:
$E[X] = E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} \frac{n}{n-i+1}$
Setting $k = n - i + 1$ we get: $E[X] = n \sum_{k=1}^{n} \frac{1}{k} = nH(n)$

Take note:
This problem has many practical applications, so study both the method of analysis and the solution carefully. It uses the technique of a branching process with 0 generations in memory, i.e. memoryless.
$H(n)$ is called the harmonic number. $H(n) = \ln(n) + \Theta(1)$. To prove this, it suffices to integrate $f(x) = 1/x$ from 1 to $n$ using the inequality $f(\lceil x \rceil) \le f(x) \le f(\lfloor x \rfloor)$.
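The result $E[X] = nH(n)$ can be compared with a direct simulation. The sketch below computes the exact value with fractions and estimates it empirically; the function names and the seed are choices made for this illustration.

```python
import random
from fractions import Fraction

def expected_draws(n):
    """Exact E[X] = sum over i of n / (n - i + 1) = n * H(n)."""
    return sum(Fraction(n, n - i + 1) for i in range(1, n + 1))

def simulate_draws(n, rng):
    """Draw uniformly random coupon types until all n types have been seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

rng = random.Random(2010)
n, runs = 10, 2000
average = sum(simulate_draws(n, rng) for _ in range(runs)) / runs
print(float(expected_draws(n)), average)   # the two values should be close
```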
5. Application: The Expected Run-Time of Quicksort
Quicksort is a relatively simple and efficient algorithm. The key issue in Quicksort is choosing the pivot sensibly, so as to avoid the algorithm's $n^2$ worst case. If we choose the pivot at random, does the algorithm become better? To answer this question we analyze the running time of Quicksort with a randomly chosen pivot.
Probabilistic Analysis: In Quicksort, once the pivot has been chosen, our job is to compare the pivot with each number in the subsequence. On closer inspection this comparison is the characteristic statement of the loop in Quicksort. So counting these comparisons gives the running time of the algorithm. Call the number of comparisons $X$. Let $y_1, y_2, \dots, y_n$ be the sequence in sorted order. Let $X_{ij}$ be the Bernoulli random variable defined by: $X_{ij} = 1$ if $y_i$ and $y_j$ are compared during the sort ($i, j = 1, 2, 3, \dots, n$; $i < j$); $X_{ij} = 0$ otherwise. We have:
$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$
Hence:
$E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$ (5.1)
The next task is to compute $E[X_{ij}]$. Consider the numbers from position $i$ to position $j$: $y_i, y_{i+1}, \dots, y_j$ ($i < j$). $y_i$ and $y_j$ are compared if and only if the first pivot chosen from among these numbers is $y_i$ or $y_j$ (this can be proved by contradiction). That pivot must be one of these $j - i + 1$ numbers, and exactly 2 of those cases lead to a comparison of $y_i$ and $y_j$. Therefore: $\Pr(X_{ij} = 1) = 2/(j - i + 1)$
Hence: $E[X_{ij}] = 2/(j - i + 1)$ (because $X_{ij}$ is a Bernoulli random variable)
Substituting into (5.1):
$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$
Setting $k = j - i + 1$:
$E[X] = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k}$
Exchanging the order of summation:
$E[X] = \sum_{k=2}^{n} \sum_{i=1}^{n+1-k} \frac{2}{k} = 2\sum_{k=2}^{n} \frac{n+1-k}{k} = (2n+2) \sum_{k=2}^{n} \frac{1}{k} - 2(n - 2 + 1)$
Simplifying: $E[X] = (2n+2)H(n) - 4n$, i.e. $E[X] = \Theta(n \ln n)$
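The closed form can be checked against the double sum $\sum_{i<j} 2/(j-i+1)$, and a small randomized Quicksort can count its own comparisons. This is a sketch under the assumption that all keys are distinct; the function names are illustrative.

```python
import random
from fractions import Fraction

def expected_comparisons(n):
    """Exact value of sum over i < j of 2 / (j - i + 1)."""
    return sum(
        Fraction(2, j - i + 1)
        for i in range(1, n)
        for j in range(i + 1, n + 1)
    )

def closed_form(n):
    """(2n + 2) H(n) - 4n with H(n) the n-th harmonic number."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return (2 * n + 2) * harmonic - 4 * n

def quicksort_comparisons(items, rng):
    """Number of pivot comparisons made by randomized Quicksort (distinct keys)."""
    if len(items) <= 1:
        return 0
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    # The pivot is compared with each of the other len(items) - 1 keys.
    return (len(items) - 1
            + quicksort_comparisons(smaller, rng)
            + quicksort_comparisons(larger, rng))

print(expected_comparisons(10) == closed_form(10))
print(quicksort_comparisons(list(range(100)), random.Random(1)))
```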
Exercise
Exercise 2.3:
1. Let $f(x)$ be a convex function ($f''(x) \ge 0$). Prove that: $E[f(X)] \ge f(E[X])$
2. Prove that: $E[X^k] \ge (E[X])^k$
Exercise 2.6: There are 2 standard dice (standard meaning 6 faces, each with probability 1/6, labeled with the distinct numbers 1 to 6). Rolling the dice gives two numbers $X_1$ and $X_2$; let $X = X_1 + X_2$.
(a) Compute $E[X \mid X_1 \text{ even}]$
(b) Compute $E[X \mid X_1 = X_2]$
(c) Compute $E[X_1 \mid X = 9]$
(d) Compute $E[X_1 - X_2 \mid X = k]$
Exercise 2.7: Let $X$ and $Y$ be 2 independent geometric random variables with parameters $p$ and $q$ respectively.
(a) Compute $\Pr(X = Y)$
(b) Compute $E[\max(X, Y)]$
(c) Compute $\Pr(\min(X, Y) = k)$
(d) Compute $E[X \mid X \le Y]$
Hint: You will find the memoryless property of the geometric random variable very useful.
Exercise 2.12: We draw cards from a box containing $n$ types of cards.
(a) Compute the expectation of the number of cards drawn until all $n$ types have been obtained.
(b) If we draw exactly $2n$ cards, what is the expected number of card types that are not chosen?
(c) If we draw exactly $2n$ cards, what is the expected number of card types that are chosen exactly once?
Exercise 2.22: The input is a random sequence of $n$ numbers: $a_1, a_2, \dots, a_n$. Each number $a_i$ is swapped with its neighbor until it reaches its sorted position. Compute the expected number of swaps of Bubble Sort.
Hint: We say $a_i$ and $a_j$ are inverted if $(i < j)$ AND $(a_i > a_j)$. Each swap in Bubble Sort removes exactly one inverted pair.
Exercise 2.23: The input is a random sequence of $n$ numbers: $a_1, a_2, \dots, a_n$. Compute the expected running time of Linear Insertion Sort.
Hint: We say $a_i$ is out of order if there exists $a_j$ with $(i < j)$ AND $(a_i > a_j)$. After iteration $k$ of Linear Insertion Sort, elements $1, 2, \dots, k$ are all in order.
--------------------------------------------------------------------------------
Answers:
____________________________________________________
Exercise 2.3:
1. Let $f(x)$ be a convex function ($f''(x) \ge 0$). Prove that: $E[f(X)] \ge f(E[X])$
Apply the Taylor expansion around a point $\mu$:
$f(x) = f(\mu) + \frac{f'(\mu)(x - \mu)}{1!} + \frac{f''(\xi)(x - \mu)^2}{2!}$, where $\xi$ is some point between $\mu$ and $x$.
Since $f''(\xi) \ge 0$, we have $f(x) \ge f(\mu) + f'(\mu)(x - \mu)$, and taking expectations of both sides:
$E[f(X)] \ge E[f(\mu) + f'(\mu)(X - \mu)]$ (1)
Because $f(\mu)$ and $f'(\mu)$ are constants:
$E[f(\mu) + f'(\mu)(X - \mu)] = f(\mu) + f'(\mu)(E[X] - \mu)$ (2)
Choosing $\mu = E[X]$ and combining (1) and (2) gives: $E[f(X)] \ge f(E[X])$
2. Prove that: $E[X^k] \ge (E[X])^k$
Let $f(x) = x^k$. We have $f''(x) = k(k-1)x^{k-2} \ge 0$ (this holds when $k$ is even, or when $X$ is nonnegative). Applying the inequality from part 1 gives the claim.
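____________________________________________________
Exercise 2.6: There are 2 standard dice (standard meaning 6 faces, each with probability 1/6, labeled with the distinct numbers 1 to 6). Rolling the dice gives two numbers $X_1$ and $X_2$; $X = X_1 + X_2$.
(a) Compute $E[X \mid X_1 \text{ even}]$
First compute $E[X \mid X_1] = E[(X_1 + X_2) \mid X_1] = E[X_1 \mid X_1] + E[X_2 \mid X_1]$ (linearity of conditional expectation) $= X_1 + E[X_2]$ (because $X_1$ and $X_2$ are independent) $= \frac{7}{2} + X_1$.
Here we used the result: $E[X_2] = \sum_{x=1}^{6} x \Pr(X_2 = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{1}{6} \sum_{x=1}^{6} x = \frac{7}{2}$
$E[X \mid X_1 \text{ even}] = \Pr(X_1 = 2 \mid X_1 \text{ even}) E[X \mid X_1 = 2] + \Pr(X_1 = 4 \mid X_1 \text{ even}) E[X \mid X_1 = 4] + \Pr(X_1 = 6 \mid X_1 \text{ even}) E[X \mid X_1 = 6]$ (decomposition law)
$= \frac{1}{3}\left(\frac{7}{2} + 2\right) + \frac{1}{3}\left(\frac{7}{2} + 4\right) + \frac{1}{3}\left(\frac{7}{2} + 6\right) = \frac{15}{2}$
(b) Compute $E[X \mid X_1 = X_2]$
$E[X \mid X_1 = X_2] = \sum_{x=1}^{6} \Pr(X_2 = x \mid X_1 = X_2) E[X \mid (X_1 = X_2) \cap (X_2 = x)] = \sum_{x=1}^{6} \frac{1}{6} \cdot 2x = \frac{1}{3} \sum_{x=1}^{6} x = \frac{1}{3} \cdot 21 = 7$
(c) Compute $E[X_1 \mid X = 9]$
Since $1 \le X_1, X_2 \le 6$ and $X = 9$, we have $X_1 \in \{3, 4, 5, 6\}$.
$E[X_1 \mid X = 9] = \sum_{x=3}^{6} x \Pr(X_1 = x \mid X = 9)$
By the definition of conditional probability: $\Pr(X_1 = x \mid X = 9) = \frac{\Pr(X_1 = x \cap X = 9)}{\Pr(X = 9)} = \frac{1/36}{4/36} = \frac{1}{4}$
Substituting: $E[X_1 \mid X = 9] = \sum_{x=3}^{6} x \cdot \frac{1}{4} = \frac{18}{4} = \frac{9}{2}$
(d) Compute $E[X_1 - X_2 \mid X = k]$
$E[X_1 - X_2 \mid X = k] = E[X_1 \mid X = k] - E[X_2 \mid X = k]$ (linearity of expectation) $= 0$, because $X_1$ and $X_2$ are independent, identically distributed, and play the same role in the expression above, so the two terms must be equal. (A proof by contradiction is another good option.)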
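Since two dice have only 36 equally likely outcomes, every conditional expectation in Exercise 2.6 can be checked by direct enumeration. A small sketch with exact fractions; `cond_expectation` is a helper name introduced for this example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (X1, X2) of two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))

def cond_expectation(value, condition):
    """E[value | condition], averaging over the outcomes that satisfy the condition."""
    selected = [o for o in OUTCOMES if condition(*o)]
    return Fraction(sum(value(*o) for o in selected), len(selected))

total = lambda x1, x2: x1 + x2

print(cond_expectation(total, lambda x1, x2: x1 % 2 == 0))               # (a) 15/2
print(cond_expectation(total, lambda x1, x2: x1 == x2))                  # (b) 7
print(cond_expectation(lambda x1, x2: x1, lambda x1, x2: x1 + x2 == 9))  # (c) 9/2
```

Enumeration works here because conditioning on an event simply restricts the sample space to the outcomes in that event, each still equally likely.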
____________________________________________________
Exercise 2.7: Let $X$ and $Y$ be 2 independent geometric random variables with parameters $p$ and $q$ respectively.
(a) Compute $\Pr(X = Y)$
$\Pr(X = Y) = \sum_{n \ge 1} \Pr(X = n \cap Y = n) = \sum_{n \ge 1} \Pr(X = n)\Pr(Y = n) = \sum_{n \ge 1} (1-p)^{n-1} p (1-q)^{n-1} q = pq \sum_{n \ge 1} ((1-p)(1-q))^{n-1} = pq \cdot \frac{1}{1 - (1-p)(1-q)} = \frac{pq}{p + q - pq}$
(b) Compute $E[\max(X, Y)]$
Since $X$ and $Y$ are geometric with parameters $p$ and $q$, $E[X] = 1/p$ and $E[Y] = 1/q$.
Let $X_1$ be a Bernoulli random variable with $X_1$ = TRUE if and only if $X = 1$, i.e. the first trial succeeds. $\Pr(X_1 = \text{TRUE}) = p$; $X_1$ = FALSE otherwise.
$E[\max(X, Y)] = \Pr(X_1 = \text{TRUE}) E[\max(X, Y) \mid X_1 = \text{TRUE}] + \Pr(X_1 = \text{FALSE}) E[\max(X, Y) \mid X_1 = \text{FALSE}] = p E[Y] + (1-p) E[\max(X, Y) \mid X_1 = \text{FALSE}]$ (b-1)
(because $X_1$ = TRUE if and only if $X = 1$, in which case $\max(X, Y) = Y$)
When $X > 1$, let $X^*$ be the number of further trials needed until the first success. Then $E[X \mid X_1 = \text{FALSE}] = E[X^* + 1]$, and
$E[\max(X, Y)] = p E[Y] + (1-p) E[\max(X^* + 1, Y)]$
Let $Y_1$ and $Y^*$ be defined like $X_1$ and $X^*$, with $X$ replaced by $Y$. Proceeding in the same way we obtain:
$E[\max(X, Y)] = p E[Y] + (1-p)\left(q E[\max(X^* + 1, Y) \mid Y_1 = \text{TRUE}] + (1-q) E[\max(X^* + 1, Y^* + 1)]\right) = p E[Y] + (1-p)\left(q E[X^* + 1] + (1-q) E[\max(X^*, Y^*) + 1]\right)$
By the memoryless property of the geometric distribution, $E[X^*] = E[X]$, $E[Y^*] = E[Y]$ and $E[\max(X^*, Y^*)] = E[\max(X, Y)]$, with $E[X] = 1/p$ and $E[Y] = 1/q$. Substituting:
$E[\max(X, Y)] = \frac{p}{q} + (1-p)\left(q\left(\frac{1}{p} + 1\right) + (1-q)(E[\max(X, Y)] + 1)\right)$
Hence: $E[\max(X, Y)] = \frac{1 + \frac{p}{q} + \frac{q}{p} - p - q}{p + q - pq}$, which equals $\frac{1}{p} + \frac{1}{q} - \frac{1}{p + q - pq}$.
(c) Compute $\Pr(\min(X, Y) = k)$
$\Pr(\min(X, Y) = k) = \Pr(X = k \cap Y \ge k+1) + \Pr(X = k \cap Y = k) + \Pr(X \ge k+1 \cap Y = k)$
$= \Pr(X = k)\Pr(Y \ge k+1) + \Pr(X = k)\Pr(Y = k) + \Pr(X \ge k+1)\Pr(Y = k)$
$= (1-p)^{k-1} p (1-q)^{k} + (1-p)^{k-1} p (1-q)^{k-1} q + (1-p)^{k} (1-q)^{k-1} q$ (see Section 4 of Chapter 2: $\Pr(Y \ge n) = (1-q)^{n-1}$)
$= (1-p)^{k-1} (1-q)^{k-1} (p + q - pq)$
(d) Compute $E[X \mid X \le Y]$
The solution is similar to (b): $E[X \mid X \le Y] = \frac{1}{p + q - pq}$
____________________________________________________
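Exercise 2.12: We draw cards from a box containing $n$ types of cards.
(a) Compute the expectation of the number of cards drawn until all $n$ types have been obtained.
This is the same as the Coupon Collector's Problem; see the Extra section of Chapter 2. Result: $E[X] = nH(n)$
(b) If we draw exactly $2n$ cards, what is the expected number of card types that are not chosen?
Let $X_i$ be the number of card types obtained right after drawing the $i$-th card ($i = 1, 2, \dots, 2n$). Clearly $X_1 = 1$. For $i \ge 2$ there are 2 cases:
1. The next card belongs to a type we already have. Then $X_i = X_{i-1}$. This happens with probability $\Pr(X_i = X_{i-1}) = \frac{X_{i-1}}{n}$.
2. The next card belongs to a completely new type. Then $X_i = X_{i-1} + 1$. This happens with probability $\Pr(X_i = X_{i-1} + 1) = 1 - \frac{X_{i-1}}{n}$.
Hence: $E[X_i \mid X_{i-1}] = X_{i-1} \cdot \frac{X_{i-1}}{n} + (X_{i-1} + 1)\left(1 - \frac{X_{i-1}}{n}\right) = 1 + X_{i-1}\left(1 - \frac{1}{n}\right)$
Taking the expectation of both sides: $E[X_i] = E[E[X_i \mid X_{i-1}]] = 1 + a E[X_{i-1}]$, with $a = 1 - 1/n$.
Unrolling this recurrence: $E[X_{2n}] = a^{2n-1} X_1 + a^{2n-2} + \dots + a + 1$, with $a = 1 - 1/n$.
Substituting $X_1 = 1$ and simplifying: $E[X_{2n}] = a^{2n-1} + a^{2n-2} + \dots + a + 1 = \frac{1 - a^{2n}}{1 - a}$
For large $n$ we may use $\left(1 - \frac{1}{n}\right)^n \approx e^{-1}$, which gives $E[X_{2n}] \approx n(1 - e^{-2})$.
This is the expected number of types obtained, so the expected number of types not chosen is $n - E[X_{2n}] = n\left(1 - \frac{1}{n}\right)^{2n} \approx n e^{-2}$.
(c) If we draw exactly $2n$ cards, what is the expected number of card types that are chosen exactly once?
The recurrence of part (b) does not carry over directly, because the change in the number of types seen exactly once depends on more than its current value. Use indicator variables instead: a fixed type is chosen exactly once in $2n$ draws with probability $\binom{2n}{1} \frac{1}{n} \left(1 - \frac{1}{n}\right)^{2n-1}$. By linearity of expectation over the $n$ types, the expected number of types chosen exactly once is
$n \cdot 2n \cdot \frac{1}{n} \left(1 - \frac{1}{n}\right)^{2n-1} = 2n\left(1 - \frac{1}{n}\right)^{2n-1} \approx 2n e^{-2}$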
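____________________________________________________
Exercise 2.22: The input is a random sequence of $n$ numbers: $a_1, a_2, \dots, a_n$. Each number $a_i$ is swapped with its neighbor until it reaches its sorted position. Compute the expected number of swaps of Bubble Sort.
Proof: We say $a_i$ and $a_j$ are inverted if $(i < j)$ AND $(a_i > a_j)$. Let $X_{ij}$ be a Bernoulli random variable with $X_{ij} = 1$ if $a_i$ and $a_j$ form an inverted pair, and $X_{ij} = 0$ otherwise. Assuming, as the computation below does, that each $a_i$ is drawn independently and uniformly from $\{1, \dots, n\}$:
$\Pr(X_{ij} = 1) = \sum_{k=1}^{n} \Pr(a_i = k \cap a_j < k) = \sum_{k=1}^{n} \frac{1}{n} \cdot \frac{k-1}{n} = \frac{1}{n^2} \sum_{k=1}^{n} (k - 1) = \frac{1}{2} - \frac{1}{2n}$
Let $X$ be the number of swaps in Bubble Sort. In Bubble Sort the number of swaps equals the number of inverted pairs. Therefore: $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$. Taking the expectation of both sides:
$E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}]$ (linearity of expectation)
$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left(\frac{1}{2} - \frac{1}{2n}\right) = \frac{n(n-1)}{2} \cdot \frac{n-1}{2n} = \frac{(n-1)^2}{4}$
____________________________________________________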
Exercise 2.23: The input is a random sequence of $n$ numbers: $a_1, a_2, \dots, a_n$. Compute the expected running time of Linear Insertion Sort.
Proof: After sorting, the numbers appear in the order $1, 2, \dots, n$. Suppose that before sorting, the numbers $1, 2, \dots, n$ are at positions $x_1, x_2, \dots, x_n$ respectively. Here $(x_1, x_2, \dots, x_n)$ is a permutation of $(1, 2, \dots, n)$. Moving number $i$ from position $x_i$ to its final position takes $|x_i - i|$ swaps. The total number of swaps is $X = \sum_{i=1}^{n} |x_i - i|$.
$E[X] = E\left[\sum_{i=1}^{n} |x_i - i|\right] = \sum_{i=1}^{n} E[|x_i - i|]$ (linearity of expectation)
$E[|x_i - i|] = \sum_{k=1}^{i-1} \Pr(x_i = k)(i - k) + \sum_{k=i+1}^{n} \Pr(x_i = k)(k - i) = \frac{1}{n}\left(\sum_{k=1}^{i-1} (i - k) + \sum_{k=i+1}^{n} (k - i)\right) = \frac{1}{n}\left(\sum_{j=1}^{i-1} j + \sum_{j=1}^{n-i} j\right)$
Hence: $E[X] = \sum_{i=1}^{n} E[|x_i - i|] = \frac{1}{n}\left(\sum_{i=1}^{n} \sum_{j=1}^{i-1} j + \sum_{i=1}^{n} \sum_{j=1}^{n-i} j\right)$
Exchanging the order of summation:
$E[X] = \frac{1}{n}\left(\sum_{j=1}^{n-1} (n-j) j + \sum_{j=1}^{n-1} (n-j) j\right) = \frac{2}{n} \sum_{j=1}^{n-1} (n-j) j = 2\sum_{j=1}^{n-1} j - \frac{2}{n} \sum_{j=1}^{n-1} j^2 = n(n-1) - \frac{2}{n} \cdot \frac{(n-1)n(2n-1)}{6}$
$E[X] = \frac{n^2 - 1}{3}$
so the expected running time is $\Theta(n^2)$.
Chapter 5: Balls and Bins
Thursday, June 10, 2010, 12:30 (GMT+7)

1. Background
1.1. The inclusion-exclusion principle: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$
Applications:
- 2 independent events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$
- 2 disjoint events: $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$
1.2. Bayes' Law: $\Pr(E_1 \mid B) = \frac{\Pr(B \cap E_1)}{\Pr(B)} = \frac{\Pr(B \cap E_1)}{\Pr(B \cap E_1) + \Pr(B \cap E_2)}$
1.3. Expectation
1.4. Binomial Distribution: $n$ trials, each succeeding with probability $p$
1.5. Geometric Distribution: trials repeated until 1 success

2. Theory summary
Main topics
Content | Take note
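1. The Birthday Paradox
Problem: There are 30 people in a room. What is the probability that some 2 of them share a birthday?
Problem Analysis: A birthday has 365 possible values. Does it matter whether the number of people is 30, 29, or only 1 or 2? With 1 person there is certainly no shared birthday. With 2 people, the chance of no shared birthday is the chance that person 2 was born on a different day from person 1. That covers 364 of the 365 possible cases, so the probability is 364/365. In the same way, given that the earlier birthdays are all distinct, the probability that the $i$-th person does not share a birthday with any earlier person does not depend on which days the earlier people were born, only on the value of $i$. This probability is: $1 - (i-1)/365$
Proof: Following this reasoning, the probability that 30 people all have different birthdays is:
$\prod_{i=1}^{30} \left(1 - \frac{i-1}{365}\right) \approx 0.2937$
so the probability that some 2 people share a birthday is about $0.7063$.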
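The product is easy to evaluate numerically. A short sketch (the function names are illustrative) that computes it and also searches for the smallest group size at which a shared birthday becomes more likely than not:

```python
def prob_all_distinct(m, n=365):
    """Probability that m people all have different birthdays among n days."""
    p = 1.0
    for i in range(m):
        p *= 1 - i / n
    return p

def smallest_group_for_half(n=365):
    """Smallest m with Pr(some shared birthday) >= 1/2."""
    m = 1
    while 1 - prob_all_distinct(m, n) < 0.5:
        m += 1
    return m

print(round(prob_all_distinct(30), 4))   # about 0.2937
print(smallest_group_for_half())         # 23
```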
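Generalizing the problem to $n$ possible birthdays and $m$ people, the probability that all birthdays are different is:
$\Pr = \prod_{i=0}^{m-1} \left(1 - \frac{i}{n}\right) \approx \prod_{i=0}^{m-1} e^{-i/n} = e^{-\sum_{i=0}^{m-1} i/n} = e^{-m(m-1)/2n}$ (1)
Here we used the approximation $1 - x \approx e^{-x}$ for $x$ close to 0.

Take note:
- Leap years and twins are ignored.
- In addition: find the number of people needed in the room for the probability that 2 people share a birthday to equal 1/2. From formula (1), the value of $m$ at which this probability equals 1/2 satisfies $m^2/2n = \ln 2$, i.e. $m = \sqrt{2n \ln 2}$ (about 22.5 for $n = 365$). This value grows only like $\sqrt{n}$, which is very small compared with the initial intuition. That is why it is called a paradox.
2. Balls into Bins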
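Generalizing the Birthday Paradox we obtain a mathematical model called balls into bins. The number of people corresponds to the number of balls and the number of birthdays to the number of bins. If we throw $m$ balls into $n$ bins (assuming no throw misses), each bin ends up with some number of balls. The maximum load is the number of balls in the fullest bin.
Theorem: When $n$ balls are thrown into $n$ bins, the probability that the maximum load exceeds $3\ln n / \ln\ln n$ is at most $1/n$ (for $n$ sufficiently large).
Proof: Consider bin 1 and let $M = 3\ln n / \ln\ln n$. The probability that bin 1 receives at least $M$ balls is at most
$\binom{n}{M} \left(\frac{1}{n}\right)^M$
(choose $M$ of the $n$ balls; each ball lands in bin 1 with probability $1/n$). Moreover
$\binom{n}{M} \left(\frac{1}{n}\right)^M \le \frac{1}{M!} \le \left(\frac{e}{M}\right)^M$
In the second inequality we used: $\frac{k^k}{k!} < \sum_{i \ge 0} \frac{k^i}{i!} = e^k$
Therefore the probability that some bin holds at least $M$ balls is at most:
$n \binom{n}{M} \left(\frac{1}{n}\right)^M \le n \left(\frac{e}{M}\right)^M$
Substituting $M = 3\ln n / \ln\ln n$ and rewriting everything in exponential form shows that this probability is at most $1/n$.

Take note:
Use the Taylor expansion of $e^k$.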
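A quick experiment shows how loose or tight the bound is in practice. The sketch below throws $n$ balls into $n$ bins uniformly at random and reports the maximum load next to $3\ln n / \ln\ln n$; the names `max_load` and `load_bound` and the chosen $n$ are assumptions of this illustration.

```python
import math
import random
from collections import Counter

def max_load(n, rng):
    """Throw n balls into n bins uniformly at random; return the fullest bin's load."""
    loads = Counter(rng.randrange(n) for _ in range(n))
    return max(loads.values())

def load_bound(n):
    """The threshold 3 ln n / ln ln n from the theorem."""
    return 3 * math.log(n) / math.log(math.log(n))

rng = random.Random(5)
n = 1000
observed = [max_load(n, rng) for _ in range(20)]
print(max(observed), round(load_bound(n), 2))
```

For moderate $n$ the observed maximum load is usually well below the threshold, which is consistent with the theorem being an upper bound.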
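3. The Poisson Distribution
4. Application: Hashing
Problem: Set Membership
4.1. Chain Hashing Method
4.2. Fingerprint Method
4.3. Bloom Filter Method

Problem: Let $S = \{s_1, s_2, \dots, s_m\}$ be a subset of a very large universe $U$. For an arbitrary element $x$ chosen from $U$ we must answer the question: "Is $x$ an element of $S$ or not?" This question is called the Set Membership Problem.

Take note:
To get a feel for "very large", think of $S$ as the set of songs on your computer and $U$ as the set of all music in the world.

4.1. Chain Hashing
The most classical method is to build a hash table for lookups. You can use a random hash function. This method always gives the correct answer, and the lookup time is fairly small. By the analysis in Section 2, the maximum load exceeds $3\ln n / \ln\ln n$ with probability at most $1/n$. So the lookup time exceeds $3\ln n / \ln\ln n$ with probability at most $1/n$. The drawback of this method is its large memory footprint: the $m$ elements of $S$ may not fit in RAM.

4.2. Fingerprint Method
We define a fingerprint function as follows: $f: U \to B$, where $B$ is the set of $b$-bit binary numbers. Clearly $B$ has $2^b$ elements. We only need to store $m$ fingerprints of $b$ bits each in RAM, i.e. $m \cdot b$ bits. Computing $f(x)$ is exactly finding the fingerprint of $x$.
ALGORITHM: Compute $f(x)$. Compare $f(x)$ with all the $f(s_i)$, $s_i \in S$. There are 2 cases:
Case 1: If $f(x) \neq f(s_i)$ for all $i = 1, 2, \dots, m$ => $x \notin S$. (Suppose on the contrary that $x \in S$; then there must be an $i$ with $f(x) = f(s_i)$, $1 \le i \le m$.)
Case 2: If there is an $i$, $1 \le i \le m$, with $f(x) = f(s_i)$, we answer $x \in S$. This answer may be wrong: a false positive occurs when $x \notin S$ but $f(x) = f(s_i)$ for some $i$.
$\Pr(\text{false positive}) = \Pr(\exists i: f(x) = f(s_i)) - \Pr(x \in S) = 1 - \Pr(\forall i: f(x) \neq f(s_i)) - \Pr(x \in S)$ (4.2)
$\Pr(x \in S) \approx 0$ because $S$ is an $m$-element subset of the very large set $U$.
$\Pr(\forall i: f(x) \neq f(s_i)) = \prod_{i=1}^{m} \Pr(f(x) \neq f(s_i)) = \prod_{i=1}^{m} \left(1 - \Pr(f(x) = f(s_i))\right)$
Each $f(s_i)$ is a fingerprint of length $b$ bits, so the probability that $f(x) = f(s_i)$ is only $1/2^b$, for every $i = 1, 2, \dots, m$. Hence:
$\Pr(\forall i: f(x) \neq f(s_i)) = \left(1 - \frac{1}{2^b}\right)^m$
Substituting into (4.2):
$\Pr(\text{false positive}) = 1 - \left(1 - \frac{1}{2^b}\right)^m \approx 1 - e^{-m/2^b} \approx \frac{m}{2^b}$
In this approximation we used the formula $1 - x \approx e^{-x}$ for small $x$ twice.
Choose $b = 32$, i.e. 32-bit fingerprints, and suppose the dictionary contains $2^{16}$ bad passwords, i.e. passwords that users are not allowed to use. Then in RAM we must store $2^{16} \times 4$ bytes = 256KB, and
$\Pr(\text{false positive}) \approx \frac{2^{16}}{2^{32}} = \frac{1}{65536}$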
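The numbers in the 32-bit example can be reproduced directly. The sketch below evaluates the exact expression $1 - (1 - 2^{-b})^m$ in a numerically careful way and compares it with the approximation $m/2^b$; the function names are hypothetical.

```python
import math

def false_positive_exact(m, b):
    """1 - (1 - 2^-b)^m, computed with log1p/expm1 to avoid rounding loss."""
    return -math.expm1(m * math.log1p(-(2.0 ** -b)))

def false_positive_approx(m, b):
    """The approximation m / 2^b."""
    return m / 2 ** b

def storage_bytes(m, b):
    """Memory for m fingerprints of b bits each."""
    return m * b // 8

m, b = 2 ** 16, 32
print(storage_bytes(m, b) // 1024, "KB")
print(false_positive_exact(m, b), false_positive_approx(m, b))
```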
4.3. Bloom Filter Method
As in the fingerprint method, we use a mapping $f$ from the set $S$ into the set of $n$-bit values. But instead of producing one fingerprint per element, we keep only a single string of $n$ bits, which we call the Bloom, to record all the $f(s_i)$. If $f(s_i)$ returns a value that has a 1 at some bit position, we set that bit in the Bloom.
Example: the Bloom has $n = 4$ bits, initially 0000. After computing $f(s_1) = 2 = 0010$ we obtain the Bloom 0010. After computing $f(s_2) = 10 = 1010$ we obtain the Bloom 1010.
The Bloom filter method uses one Bloom and the functions $h_i$, $i = 1, 2, \dots, k$. We say that $y$ is in the Bloom if every bit position that is 1 in $y$ is also set in the Bloom.
Example: $y = 8 = 1000$ is in the Bloom 1010; $y = 1001$ is not in the Bloom 1010.
ALGORITHM: Compute $h_i(x)$, $i = 1, 2, \dots, k$. Answer $x \in S$ if and only if $h_i(x)$ is in the Bloom for every $i = 1, 2, \dots, k$.

Take note:
Here we treat $S$ as a very important set: "better a false alarm than a miss".
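The bit operations described above map directly onto integer masks. The sketch below follows the "all 1 bits of $y$ are set in the Bloom" rule; the class name, the use of SHA-256 with a per-function prefix to derive the $k$ hash functions, and the one-bit-per-hash masks are assumptions of this example, not details from the notes.

```python
import hashlib

def in_bloom(bloom, y):
    """True when every bit that is 1 in y is also set in bloom."""
    return (bloom & y) == y

class BloomFilter:
    def __init__(self, n_bits, k):
        self.n_bits = n_bits      # length of the Bloom in bits
        self.k = k                # number of hash functions h_1..h_k
        self.bloom = 0            # all bits start at 0

    def _masks(self, item):
        """Yield h_i(item) for i = 0..k-1, each as a mask with a single 1 bit."""
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield 1 << (int.from_bytes(digest[:8], "big") % self.n_bits)

    def add(self, item):
        for mask in self._masks(item):
            self.bloom |= mask    # set the bits of h_i(item)

    def __contains__(self, item):
        return all(in_bloom(self.bloom, mask) for mask in self._masks(item))

print(in_bloom(0b1010, 0b1000), in_bloom(0b1010, 0b1001))   # True False

songs = BloomFilter(n_bits=1024, k=3)
for title in ["song-a", "song-b", "song-c"]:
    songs.add(title)
print("song-a" in songs)   # True: an added element is never reported missing
```

An element that was added always tests positive; an element that was never added may still test positive, which is the false-positive case.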
PROBABILISTIC ANALYSIS:
Case 1: Some $h_i(x)$ is not in the Bloom => $x \notin S$. Suppose on the contrary that $x \in S$; then there must exist $i$, $1 \le \dots$
3. Exercises:
http://docs.google.com/View?id=dgmqjfk5_184c7sdskcv
4. This formula belongs somewhere in the text; if you find the right place, please move it there: $= 1 - \Pr(\forall A[j], 1 \le j \le n: A[f(x)] = A[j])$

Exercise
Exercise 5.21: In open addressing, the hash table is implemented as an array and uses no linked lists at all. Each entry of the table is either empty or holds 1 element. You can follow the links below to learn more.
http://en.wikipedia.org/wiki/Open_addressing
http://courseweb.xu.edu.ph/courses/ics20/supplements/holte/open-addr.htm
For each key $k$ in the table we define a probe sequence $h(k,0), h(k,1), \dots, h(k,n)$, where $n$ is the number of entries in the table. To insert key $k$ we compute $h(k,0), h(k,1), \dots$ in turn until we find an empty slot in which to insert $k$; after $n$ failures we conclude that the table is full and nothing more can be inserted. Searching works the same way: compute $h(k,0), h(k,1), \dots$ in turn until key $k$ is found, or until an empty slot is found, which shows that key $k$ is not in the table.
Assume that $h(k,j)$ can take any of the $n$ entries of the table at random and that all the $h(k,j)$ are independent. The table is used to store $m = n/2$ elements, and we are then asked to look up a key $k$ in the table. Let $X_i$ be the number of probes needed to insert the $i$-th key. Let $X = \max\{X_i\}$, $1 \le i \le m$, be the largest number of probes needed to insert any of the $m$ keys.
(a) Prove that $\Pr(X > 2\log n) \le 1/n$.
(b) Prove that the expectation of the length of the longest probe sequence is $E[X] = O(\log m)$. Note: $n = 2m$.
The method above is also called Double Hashing. Example: $h(k,i) = a \cdot h(k) + b \cdot h(i) \pmod{n}$, i.e. it uses 2 hash functions.
(c) Open addressing with Linear Probing is a special case of this method: $h(k,i) = h(k) + i \pmod{n}$, where $h(i) = i$. But not quite, because $h(k,i)$ and $h(k,i+1)$ are no longer independent of each other. Describe the effect of this difference and find a way to apply Double Hashing to compute the expectation of the length of the longest probe sequence for open addressing with Linear Probing. (Exam question, K52 - CNTT - ĐHBK HN)
Exercise 5.22: Suppose the list of songs you like is $X$ and the list of songs I like is $Y$, with $|X| = |Y| = n$. We build Bloom filters for the sets $X$ and $Y$ using $m$ bits and $k$ hash functions.
(a) Compute the expectation of the number of bit positions that differ between the Bloom filters of $X$ and $Y$.
(b) Compute $E[|X \cap Y|]$.
(c) Explain why this method can be used to find people with the same taste in music instead of comparing the full lists directly.
Answer: (a) $n(2p - p^2)$, where $p = \left(1 - \frac{1}{n}\right)^{mk}$;
--------------------------------------------------------------------------------
Answer
Exercise 5.21 (open addressing with a probe sequence $h(k,0), h(k,1), \dots, h(k,n)$ in a table of $n$ entries, as stated above): each $h(k,j)$ is uniform over the $n$ entries and all the $h(k,j)$ are independent; $m = n/2$ keys are stored; $X_i$ is the number of probes needed to insert the $i$-th key and $X = \max\{X_i\}$, $1 \le i \le m$.
Proof:
(a) Prove that $\Pr(X > 2\log n) \le 1/n$.
At the $i$-th insertion the table contains $i - 1$ entries. We must compute $h(i,j)$ until we find an empty entry. So $X_i$ is a geometric random variable with parameter $p_i = 1 - \frac{i-1}{n}$. By this distribution: $\Pr(X_i = j) = (1 - p_i)^{j-1} p_i$
Hence: $\Pr(X_i \ge j) = \sum_{l \ge j} \Pr(X_i = l) = \sum_{l \ge j} (1 - p_i)^{l-1} p_i = p_i (1 - p_i)^{j-1} \sum_{l \ge 0} (1 - p_i)^l = p_i (1 - p_i)^{j-1} \frac{1}{1 - (1 - p_i)} = (1 - p_i)^{j-1}$
Hence: $\Pr(X_i > 2\log n) = \Pr(X_i \ge 2\log n + 1) = (1 - p_i)^{2\log n}$. Substituting $p_i = 1 - \frac{i-1}{n}$ and $n = 2m$ (with logarithms to base 2):
$\Pr(X_i > 2\log n) = \left(\frac{i-1}{2m}\right)^{2\log n} \le \left(\frac{m}{2m}\right)^{2\log n} = \frac{1}{n^2}$
By the union bound over the $m = n/2$ keys, $\Pr(X > 2\log n) \le \frac{m}{n^2} \le \frac{1}{n}$.
(b) Prove that the expectation of the length of the longest probe sequence $h(i,j)$ is $E[X] = O(\log m)$. Note: $n = 2m$.
$E[X] = \sum_{x \le 2\log n} x \Pr(X = x) + \sum_{x > 2\log n}^{n} x \Pr(X = x) \le 2\log n \sum_{x \le 2\log n} \Pr(X = x) + n \sum_{x > 2\log n}^{n} \Pr(X = x)$
$E[X] \le 2\log n \Pr(X \le 2\log n) + n \Pr(X > 2\log n) \le 2\log n + n \cdot \frac{1}{n} = 2\log n + 1 = O(\log m)$
Here we used $\Pr(X \le 2\log n) \le 1$ and the result $\Pr(X > 2\log n) \le \frac{1}{n}$ from part (a).
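The model used in parts (a) and (b), where every probe is an independent uniform slot, is easy to simulate. The sketch below inserts $m = n/2$ keys and records the longest probe sequence, to be compared with $2\log_2 n$; the function name and the chosen table size are assumptions of this illustration.

```python
import math
import random

def max_probes(n, rng):
    """Insert n/2 keys using independent uniform probes; return the longest probe count."""
    occupied = [False] * n
    longest = 0
    for _ in range(n // 2):
        probes = 1
        slot = rng.randrange(n)          # h(k, 0)
        while occupied[slot]:
            slot = rng.randrange(n)      # h(k, 1), h(k, 2), ... independent of earlier probes
            probes += 1
        occupied[slot] = True
        longest = max(longest, probes)
    return longest

rng = random.Random(52)
n = 1024
print(max_probes(n, rng), 2 * math.log2(n))
```

Because the table never holds more than $n/2$ keys, each probe hits an occupied slot with probability below 1/2, which is what drives the $2\log n$ bound.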
Take note:
Above we observed that $X_i$ is a geometric random variable with parameter $p_i = 1 - \frac{i-1}{n}$. Therefore: $E[X_{i+1}] = \frac{1}{p_{i+1}} = \frac{n}{n-i} = \frac{1}{1 - \alpha}$, where $\alpha = \frac{i}{n}$.
This formula tells us: "If the table contains $i$ keys, the expected number of probes needed is $1/(1-\alpha)$, where $\alpha$ is the ratio between the number of elements placed in the table and the number of entries the table can hold."

(c) Open addressing with Linear Probing is a special case of this method: $h(k,i) = h(k) + i \pmod{n}$, where $h(i) = i$. But not quite, because $h(k,i)$ and $h(k,i+1)$ are no longer independent of each other. Describe the effect of this difference and find a way to apply Double Hashing to compute the expectation of the length of the longest probe sequence for open addressing with Linear Probing. (Exam question, K52 - CNTT - ĐHBK HN)
Proof: Suppose that instead of computing $h(k,1), h(k,2), \dots, h(k,n)$ in order, we compute $h(k,i_1), h(k,i_2), \dots, h(k,i_n)$, where $(i_1, i_2, \dots, i_n)$ is a permutation of $(1, 2, \dots, n)$. Each permutation is called a case, so there are $n!$ cases. Among the $n!$ possible orders, one and only one permutation, $(i_1, i_2, \dots, i_n) = (1, 2, \dots, n)$, leads to Linear Probing. We call Linear Probing $case_1$.
A further observation is that these permutations all play equivalent roles: if we can do Linear Probing, we can equally well use $(3, 1, 2, \dots, n-1, n)$ or any other permutation, and each gives the same result when computing the expectation.
Returning to Double Hashing, we define the random variable $X = \max\{X_i\}$, $1 \le i \le m$, the largest number of probes needed to find any of the $m$ keys.
$E[X] = \sum_{i=1}^{n!} \Pr(CASE = case_i) E[X \mid CASE = case_i]$, where $CASE$ is a permutation of $(1, 2, \dots, n)$.
Because the $case_i$ are equivalent: $\Pr(case_i) = 1/n!$ for every $i = 1, 2, \dots, n!$, and $E[X \mid CASE = case_i] = E[X \mid CASE = case_1] = E[X \mid CASE = \text{Linear Probing}]$ for every $i = 1, 2, \dots, n!$.
Therefore: $E[X \mid CASE = \text{Linear Probing}] = E[X] = O(\log n)$

Note: Let $X_i$ be the number of probes needed to insert the $i$-th key. Above we observed that $X_i$ is a geometric random variable with parameter $p_i = 1 - \frac{i-1}{n}$, so that $E[X_{i+1}] = \frac{1}{p_{i+1}} = \frac{n}{n-i} = \frac{1}{1 - \alpha}$, where $\alpha = \frac{i}{n}$. This observation holds only for Double Hashing; nothing guarantees that it also holds for Linear Probing.