EECS 4101/5101 Prof. Andy Mirzaian
CSE 4101/5101

Feb 19, 2016

CSE 4101/5101. Prof. Andy Mirzaian. Move-to-Front: Self-Adjusting Linear Lists.
Page 1: CSE 4101/5101

EECS 4101/5101

Prof. Andy Mirzaian

Page 2: CSE 4101/5101

DICTIONARIES

Lists: Linear Lists, Multi-Lists

Search Trees: Binary Search Trees, Multi-Way Search Trees

Hash Tables

SELF ADJUSTING: Move-to-Front (competitive), Splay Trees (competitive?)

WORST-CASE EFFICIENT: Red-Black Trees, 2-3-4 Trees, B-trees

Page 3: CSE 4101/5101

TOPICS in this slide

Linear List with Move-to-Front

Competitiveness of Move-to-Front:
- Static Dictionary
- Dynamic Dictionary
- Expected Case

Page 4: CSE 4101/5101

References: • Lecture Note 2

Page 5: CSE 4101/5101

Move-to-Front

Page 6: CSE 4101/5101

Introduction

DICTIONARY: Maintain a set D of items (each with a unique identifier called key) that supports the following three operations:

Search (x, D): Locate item x in D (also called access)

Insert (x, D): Insert item x in D (no duplicate keys)

Delete (x, D): Delete item x from D

s = a sequence of m dictionary operations. We usually assume D is initially empty (except for static dictionaries, i.e., search only). We often assume there is a linear ordering defined on keys (e.g., integers) so that we can make key comparisons.

D as a SEQUENTIAL LINEAR LIST: We need i probes to sequentially access the item at position i on list D.

SELF-ADJUSTING FEATURE:
- Exchange: swap the positions of a pair of adjacent items on D.
- Free exchange: exchanging the accessed/inserted item with its preceding item on D.
- Paid exchange: all other types of exchanges.

Costs:
- 1 for each item probe
- 1 for each paid exchange
- 0 for each free exchange

Page 7: CSE 4101/5101

Some on-line heuristics

Move-to-Front (MF): After each dictionary operation perform the maximum number of free exchanges (no paid exchanges). This moves the accessed/inserted item to the front of the list without affecting the relative order of the other items on the list.

Transpose (T): After each dictionary operation perform one free exchange if possible (no paid exchanges). This moves the accessed/inserted item one position closer to the front (if not at the front already), without affecting the relative order of the other items on the list.

Frequency Count (FC): Maintain a frequency count for each item, initially zero. (This requires extra space.) Increase the count of an item each time it is accessed/inserted; reduce its count to zero when deleted. Perform exchanges as necessary to maintain the list in non-increasing order of frequency counts.
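The three rearrangement rules can be sketched as small Python functions acting on a plain list. This is only an illustration, not code from the lecture: the function names, the 0-based index `pos` of the accessed item, and the `counts` table are assumptions made for the sketch, and resetting a count on deletion is left out.

```python
from collections import Counter

def move_to_front(lst, pos):
    """MF: move the item at 0-based index pos to the front
    (the maximum number of free exchanges)."""
    lst.insert(0, lst.pop(pos))

def transpose(lst, pos):
    """T: swap the accessed item with its predecessor
    (one free exchange), if it has one."""
    if pos > 0:
        lst[pos - 1], lst[pos] = lst[pos], lst[pos - 1]

def frequency_count(lst, pos, counts):
    """FC: increment the accessed item's count, then exchange it forward
    past every predecessor with a strictly smaller count, which keeps the
    list in non-increasing order of counts."""
    item = lst[pos]
    counts[item] += 1
    while pos > 0 and counts[lst[pos - 1]] < counts[item]:
        lst[pos - 1], lst[pos] = lst[pos], lst[pos - 1]
        pos -= 1

demo = list("ABCD")
move_to_front(demo, 2)      # access C
print(demo)                 # ['C', 'A', 'B', 'D']
```

In all three functions only the accessed item moves, and only towards the front, so every exchange performed is a free exchange in the cost model above.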

Page 8: CSE 4101/5101

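Move-to-Front Example

List (before operation)    Operation    Cost
A C D                      Insert E     4
E A C D                    Insert C     3
C E A D                    Search B     4
C E A D                    Search A     3
A C E D                    Delete C     2
A E D                      Insert C     4
C A E D

A newly inserted item is first placed at the end of the list (A C D E, and later A E D C) and then moved to the front.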
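The example can be replayed with a short Python sketch. The class name `MoveToFrontList` and its methods are hypothetical. The cost rules are read off the example itself and are an assumption of the sketch: a successful access at position i costs i, an unsuccessful search costs the list length L, and inserting a new item costs L+1.

```python
class MoveToFrontList:
    """Sequential list with the Move-to-Front rule and per-operation costs."""

    def __init__(self, items=()):
        self.items = list(items)

    def _probe(self, x):
        """Return (index, cost); index is None and cost is the list
        length when x is not on the list."""
        for idx, item in enumerate(self.items):
            if item == x:
                return idx, idx + 1
        return None, len(self.items)

    def search(self, x):
        idx, cost = self._probe(x)
        if idx is not None:
            self.items.insert(0, self.items.pop(idx))  # move to front
        return cost

    def insert(self, x):
        idx, cost = self._probe(x)
        if idx is None:
            # New item: placed at position L+1, charged L+1 (assumed rule).
            self.items.append(x)
            idx, cost = len(self.items) - 1, len(self.items)
        self.items.insert(0, self.items.pop(idx))      # move to front
        return cost

    def delete(self, x):
        idx, cost = self._probe(x)
        if idx is not None:
            self.items.pop(idx)
        return cost


d = MoveToFrontList("ACD")
ops = [("insert", "E"), ("insert", "C"), ("search", "B"),
       ("search", "A"), ("delete", "C"), ("insert", "C")]
costs = [getattr(d, op)(x) for op, x in ops]
print(costs, d.items)   # [4, 3, 4, 3, 2, 4] ['C', 'A', 'E', 'D']
```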

Page 9: CSE 4101/5101

Move-to-Front Applications

Cache based Memory Management:

- Small fast Cache versus large slower main memory.

- Cache I/O page-in-page-out rules.

- The Least Recently Used (LRU) paging rule corresponds to Move-to-Front.

[Sleator, Tarjan, “Amortized efficiency of list update and paging rules,” Communications of ACM, 28(2), pp: 202-208, 1985.]

Adaptive Data Compression:

- Adaptive coding based on MF compares favorably with Huffman coding.

[Bentley, Sleator, Tarjan, Wei, “A locally adaptive data compression scheme,” Communications of ACM, 29(4), pp: 320-330, 1986.]

Page 10: CSE 4101/5101

Static Dictionary

Page 11: CSE 4101/5101

Static Dictionary: Search only

Initial list $D_0 = [x_1, x_2, x_3, \dots, x_n]$. s = a sequence of m successful searches.

$k_i$ = the frequency count (number of search requests in s) for item $x_i$, i = 1..n.

$m = k_1 + k_2 + \dots + k_n$ = total number of search requests in s.

Decreasing Frequency (DF): This off-line strategy initially arranges the list items in non-increasing order of access frequencies, and does not perform any exchanges. We do not charge DF the cost of the initial rearrangement! Without loss of generality assume:

$$k_1 \ge k_2 \ge \dots \ge k_n .$$

$$C_{DF}(s) = \sum_{i=1}^{n} \sum_{j=1}^{k_i} i = \sum_{i=1}^{n} i\,k_i .$$

$$C_{MF}(s) = \sum_{i=1}^{n} \sum_{j=1}^{k_i} t_{ij}, \qquad t_{ij} = \text{position of } x_i \text{ during its } j\text{th access.}$$

FACT: Among off-line strategies that perform no exchanges, DF is optimum.

Challenging Question: What is the optimum off-line strategy with exchanges?

Page 12: CSE 4101/5101

Static Dictionary: Example sequence

Initial list $D_0 = [A, B, C]$. s = a sequence of m = 9 searches to A, B, C, with $k_A = 4 > k_B = 3 > k_C = 2$.

$C_{DF}(s) = 1 \cdot k_A + 2 \cdot k_B + 3 \cdot k_C = 1 \cdot 4 + 2 \cdot 3 + 3 \cdot 2 = 16.$

$C_{MF}(s)$ = ? Depends on s.

s1 = A A A A B B B C C: $C_{MF}(s_1) = 1+1+1+1 + 2+1+1 + 3+1 = 12 < C_{DF}(s_1)$

s2 = C C B B B A A A A: $C_{MF}(s_2) = 3+1 + 3+1+1 + 3+1+1+1 = 15 < C_{DF}(s_2)$

s3 = C B A C B A B A A: $C_{MF}(s_3) = 3+3+3 + 3+3+3 + 2+2+1 = 23 > C_{DF}(s_3)$
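The three totals can be recomputed with a few lines of Python. This is a sketch for search-only sequences; `mf_cost` and `df_cost` are illustrative names, not part of the lecture.

```python
from collections import Counter

def mf_cost(initial, seq):
    """Total Move-to-Front cost of the searches in seq, starting from initial."""
    lst = list(initial)
    total = 0
    for x in seq:
        pos = lst.index(x)             # 0-based position of x
        total += pos + 1               # pos + 1 probes
        lst.insert(0, lst.pop(pos))    # move x to the front
    return total

def df_cost(seq):
    """Decreasing Frequency cost: the i-th most frequent item sits at position i."""
    counts = sorted(Counter(seq).values(), reverse=True)
    return sum(rank * k for rank, k in enumerate(counts, start=1))

for name, s in [("s1", "AAAABBBCC"), ("s2", "CCBBBAAAA"), ("s3", "CBACBABAA")]:
    print(name, mf_cost("ABC", s), df_cost(s))
# s1 12 16
# s2 15 16
# s3 23 16
```

Note that s3 reaches 23 = 2·16 − 9, which is exactly 2·C_DF(s) − m for this example.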

Page 13: CSE 4101/5101

Static Dictionary: MF Efficiency

THEOREM 1: $C_{MF}(s) \le 2C_{DF}(s) - m$.

Proof:

s = . . . xi . . . xi . . . xi . . . xi . . . . . . . . . xi . . . xi . . . xi . . .

$s_{ij}$ = the subsequence of s between the (j-1)st access and the jth access to $x_i$.

$t_{ij}$ = cost of the jth access to $x_i$ = position of $x_i$ on MF's list at that time.

$A_{ij} = |\{ x_h \mid h > i,\ x_h \text{ accessed during } s_{ij} \}|$

$B_{ij} = |\{ x_l \mid l < i,\ x_l \text{ accessed during } s_{ij} \}| \le i - 1$

$t_{ij} = A_{ij} + B_{ij} + 1 \le A_{ij} + i$

$$\sum_{j=1}^{k_i} A_{ij} \le k_{i+1} + k_{i+2} + \dots + k_n$$

continued

Page 14: CSE 4101/5101

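Static Dictionary: MF Efficiency

THEOREM 1: $C_{MF}(s) \le 2C_{DF}(s) - m$.

Proof Cont'd: We have

$$t_{ij} \le A_{ij} + i, \qquad \sum_{j=1}^{k_i} A_{ij} \le k_{i+1} + k_{i+2} + \dots + k_n .$$

$$
\begin{aligned}
C_{MF}(s) &= \sum_{i=1}^{n} \sum_{j=1}^{k_i} t_{ij}
  \le \sum_{i=1}^{n} \sum_{j=1}^{k_i} (A_{ij} + i)
  = \sum_{i=1}^{n} \sum_{j=1}^{k_i} A_{ij} + \sum_{i=1}^{n} \sum_{j=1}^{k_i} i \\
 &= \sum_{i=1}^{n} \sum_{j=1}^{k_i} A_{ij} + C_{DF}(s) \\
 &\le \sum_{i=1}^{n} (k_{i+1} + k_{i+2} + \dots + k_n) + C_{DF}(s) \\
 &= \sum_{i=1}^{n} (i-1)\,k_i + C_{DF}(s) \\
 &= \sum_{i=1}^{n} i\,k_i - \sum_{i=1}^{n} k_i + C_{DF}(s)
  = C_{DF}(s) - m + C_{DF}(s) \\
 &= 2C_{DF}(s) - m .
\end{aligned}
$$

The step from the third to the fourth line counts how often each $k_i$ appears in the triangular sum below: $k_i$ appears in the first i-1 rows, hence i-1 times.

$$
\begin{array}{l}
k_2 + k_3 + k_4 + \dots + k_n \\
\phantom{k_2 + {}} k_3 + k_4 + \dots + k_n \\
\phantom{k_2 + k_3 + {}} k_4 + \dots + k_n \\
\phantom{k_2 + k_3 + k_4 + {}} \ddots \\
\phantom{k_2 + k_3 + k_4 + \dots + {}} k_n
\end{array}
$$

QED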
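As a sanity check (not a substitute for the proof), the bound of Theorem 1 can be tested exhaustively on small inputs. The sketch below enumerates every search sequence of length up to 6 over three items; the function name `check_theorem1` and the choice of three items are assumptions of the sketch. The initial list is taken in DF order, as the theorem assumes.

```python
from collections import Counter
from itertools import product

def check_theorem1(items="ABC", max_len=6):
    """Assert C_MF(s) <= 2*C_DF(s) - m for all short sequences.
    Returns the smallest slack (2*C_DF - m - C_MF) that was observed."""
    min_slack = None
    for m in range(1, max_len + 1):
        for seq in product(items, repeat=m):
            counts = Counter(seq)
            # DF order: non-increasing frequency (stable sort breaks ties).
            order = sorted(items, key=lambda x: -counts[x])
            c_df = sum(rank * counts[x] for rank, x in enumerate(order, start=1))
            lst, c_mf = list(order), 0
            for x in seq:
                pos = lst.index(x)
                c_mf += pos + 1
                lst.insert(0, lst.pop(pos))
            slack = 2 * c_df - m - c_mf
            assert slack >= 0, (seq, c_mf, c_df)
            min_slack = slack if min_slack is None else min(min_slack, slack)
    return min_slack

print(check_theorem1())   # 0
```

A minimum slack of 0 means that some sequences meet the bound with equality, so the inequality cannot be improved in general.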

Page 15: CSE 4101/5101

Accounting Interpretation

An amortized interpretation of $C_{MF}(s) \le 2C_{DF}(s) - m$ is $\hat{c}_{MF} \le 2c_{DF} - 1$, where $\hat{c}_{MF}$ = the amortized cost of a search operation by MF, $c_{DF}$ = the actual cost of the same operation by DF.

MF's List: ………… $x_h$ …………….. $x_i$ …………….. (h > i; the access cost to $x_i$ is its position on this list)

Excess access cost to $x_i$ is due to items $x_h$, h > i, having jumped in front of $x_i$. These jumps also cause inversions on MF's list compared to DF's fixed list.

Transfer excess charge from $x_i$ to $x_h$ (back in time to) when $x_h$ was last accessed and jumped ahead of $x_i$. So, $x_h$ gets charged its normal access cost h, and a transferred charge of at most h-1 (for lower indexed items $x_i$, i = 1..h-1, that were jumped over).

Amortized access cost to $x_h$: $\hat{c}_{MF} \le h + (h-1) = 2h - 1 = 2c_{DF} - 1$.

We will generalize this idea to dynamic dictionaries.

Page 16: CSE 4101/5101

Dynamic Dictionary

Page 17: CSE 4101/5101

The Setup

s = a sequence of m search, insert, delete operations on an initially empty list.

A = an arbitrary algorithm that executes s.

$F_A(s)$ = total # of free exchanges made by A on s.

$X_A(s)$ = total # of paid exchanges made by A on s.

$C^-_A(s)$ = total cost by A on s, excluding paid exchanges.

$C_A(s)$ = total cost by A on s, including paid exchanges.

FACT: The following hold:

1. $C_A(s) = C^-_A(s) + X_A(s)$.

2. $F_A(s) \le C^-_A(s) - m$ (an operation that accesses position i costs i and allows at most i-1 free exchanges).

3. $X_{MF}(s) = X_T(s) = X_{FC}(s) = 0$.

4. $C_{MF}(s) = C^-_{MF}(s)$.

Page 18: CSE 4101/5101

MF is 2-Competitive

THEOREM 2: $C_{MF}(s) \le 2C_A(s) - m$, for all A and s.

Proof: We will prove $C_{MF}(s) \le 2C^-_A(s) + X_A(s) - F_A(s) - m$. (This implies the theorem, since $C_A(s) = C^-_A(s) + X_A(s)$ and $X_A(s), F_A(s) \ge 0$.)

$\hat{c}_{MF}$ = amortized cost charged to MF by a single operation, excluding A's exchanges.

$c_A$ = actual cost of A on the same operation, excluding A's exchanges.

CLAIM:

$$
\hat{c}_{MF} \le
\begin{cases}
2c_A - 1 & \text{for search or insert} \\
c_A \le 2c_A - 1 & \text{for delete} \\
1 & \text{for a paid exchange by A} \\
-1 & \text{for a free exchange by A}
\end{cases}
$$

Proof by the potential function method. Potential:

$\Phi(MF, A)$ = Number of inversions in MF's list with respect to A's list.

Inversion = any (not necessarily adjacent) pair of items (x,y), where x appears before y in MF's list, but x appears after y in A's list.

Example: $\Phi([3,6,2,5,4,1], [1,2,3,4,5,6]) = 10$.
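The potential is easy to compute directly from its definition. The following Python sketch (the function name `potential` is an illustrative choice) counts the pairs that appear in opposite orders in the two lists and reproduces the example value.

```python
def potential(mf_list, a_list):
    """Number of inversions in mf_list with respect to a_list:
    pairs (x, y) with x before y in mf_list but x after y in a_list."""
    rank_in_a = {item: r for r, item in enumerate(a_list)}
    ranks = [rank_in_a[item] for item in mf_list]
    return sum(1
               for i in range(len(ranks))
               for j in range(i + 1, len(ranks))
               if ranks[i] > ranks[j])

print(potential([3, 6, 2, 5, 4, 1], [1, 2, 3, 4, 5, 6]))   # 10
```

The double loop takes quadratic time, which is fine for illustration; the potential is only an analysis tool and is never computed by MF itself.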

continued

Page 19: CSE 4101/5101

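MF is 2-Competitive

THEOREM 2: $C_{MF}(s) \le 2C_A(s) - m$, for all A and s.

Proof of CLAIM Cont'd:

Consider an operation on item $x_i$, which is at position i on A's list and at position k on MF's list:

A's List: x1 x2 ……………… xi xi+1 ………………………………..

MF's List: ………………….……………… xi …………………...…..

Of the k-1 items ahead of $x_i$ on MF's list, p also precede $x_i$ on A's list (so $p \le i-1$), and the other k-1-p follow $x_i$ on A's list.

Not yet accounting for exchanges made by A:

Search: $\hat{c}_{MF} = c_{MF} + \Delta\Phi = k + [\, p - (k-1-p) \,] = 2p + 1 \le 2(i-1) + 1 = 2i - 1 = 2c_A - 1$.

Insert: The same as search with i = k = L+1 (L = length of the list before insertion).

Delete: $\hat{c}_{MF} = c_{MF} + \Delta\Phi = k + [\, -(i-1-p) - (k-1-p) \,] = 2(p+1) - i \le i = c_A \le 2c_A - 1$.

continued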

Page 20: CSE 4101/5101

MF is 2-Competitive

THEOREM 2: $C_{MF}(s) \le 2C_A(s) - m$, for all A and s.

Proof of CLAIM Cont'd:

Accounting for exchanges made by A (MF's list does not change, so $c_{MF} = 0$):

Paid exchange by A: $\hat{c}_{MF} = c_{MF} + \Delta\Phi \le 0 + 1 = 1$.

Free exchange by A: $\hat{c}_{MF} = c_{MF} + \Delta\Phi = 0 - 1 = -1$.

QED
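The claim can be checked numerically for one particular choice of A. In the sketch below, A is assumed to be the simplest possible algorithm: it appends new items at the end and never performs an exchange (so $X_A = F_A = 0$). The function names, the random operation mix, and the restriction to successful searches, deletions of present items, and insertions of new keys are all assumptions of the sketch.

```python
import random

def inversions(mf, a):
    """Potential: inversions in MF's list with respect to A's list."""
    rank = {x: r for r, x in enumerate(a)}
    r = [rank[x] for x in mf]
    return sum(r[i] > r[j] for i in range(len(r)) for j in range(i + 1, len(r)))

def simulate(num_ops=200, seed=0):
    """Run MF and a never-exchanging A side by side; check the amortized
    bound of the CLAIM on every operation. Returns (C_MF, C_A)."""
    rng = random.Random(seed)
    mf, a = [], []
    total_mf = total_a = 0
    next_key = 0
    for _ in range(num_ops):
        op = rng.choice(["insert", "search", "delete"]) if a else "insert"
        phi_before = inversions(mf, a)
        if op == "insert":
            x, next_key = next_key, next_key + 1
            mf.append(x)
            a.append(x)
            c_mf, c_a = len(mf), len(a)       # both cost L + 1
            mf.insert(0, mf.pop())            # MF moves the new item to the front
        else:
            x = rng.choice(a)
            k, i = mf.index(x), a.index(x)
            c_mf, c_a = k + 1, i + 1
            if op == "search":
                mf.insert(0, mf.pop(k))
            else:
                mf.pop(k)
                a.pop(i)
        amortized = c_mf + inversions(mf, a) - phi_before
        bound = c_a if op == "delete" else 2 * c_a - 1
        assert amortized <= bound, (op, amortized, bound)
        total_mf += c_mf
        total_a += c_a
    return total_mf, total_a

c_mf, c_a = simulate()
print(c_mf <= 2 * c_a - 200)   # True
```

Because the potential starts at 0 and is never negative, summing the per-operation bounds gives the total bound of Theorem 2 for this A.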

Page 21: CSE 4101/5101

Expected Case

Page 22: CSE 4101/5101

Static Dictionary: Expected Case

Initial list D0 = [ x1, x2, x3, …, xn ]. Search only.

pi = search probability for item xi, i = 1..n.

pi > 0, p1 + p2 + … + pn = 1.

As in Decreasing Frequency (DF), assume initial list is arranged in non-increasing order of access probability:

$p_1 \ge p_2 \ge \dots \ge p_n > 0$.

EA = expected cost of a single search operation by algorithm A.

Page 23: CSE 4101/5101

Static Dictionary: Expected Case

THEOREM 3: The following hold (when $p_1 \ge p_2 \ge \dots \ge p_n > 0$):

(a) $E_{DF} = \sum_{i=1}^{n} i\,p_i$

(b) $E_{MF} = 1 + 2 \sum_{1 \le i < j \le n} \dfrac{p_i\,p_j}{p_i + p_j}$

(c) $E_{MF} \le 2E_{DF} - 1$.

Proof:

(a) This follows directly from the definition of expectation.

(b) See next page.

(c) This follows from (a) and (b), using the fact that $p_i / (p_i + p_j) \le 1$.

continued

Page 24: CSE 4101/5101

Static Dictionary: Expected Case

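Proof of (b):

$$E_{MF} = \sum_{i=1}^{n} p_i L_i \qquad (L_i = \text{expected position of } x_i \text{ on the list})$$

$Y_{ij}$ = 0/1 random variable: $Y_{ij} = 1$ if $x_j$ appears before $x_i$, and 0 otherwise.

$$L_i = 1 + E\Big[\sum_{j=1,\, j \ne i}^{n} Y_{ij}\Big] = 1 + \sum_{j=1,\, j \ne i}^{n} E[Y_{ij}] \qquad \text{(by the linearity of expectation)}$$

$$p_{ij} = \mathrm{Prob}[Y_{ij} = 1], \qquad E[Y_{ij}] = 1 \cdot p_{ij} + 0 \cdot (1 - p_{ij}) = p_{ij}, \qquad \text{so } L_i = 1 + \sum_{j=1,\, j \ne i}^{n} p_{ij} .$$

$$p_{ij} = \mathrm{Prob}[\, x_j \text{ is accessed} \mid x_i \text{ or } x_j \text{ is accessed} \,] = \frac{p_j}{p_i + p_j} \qquad \text{(conditional probability)}$$

$$
E_{MF} = \sum_{i=1}^{n} p_i \Big( 1 + \sum_{j=1,\, j \ne i}^{n} \frac{p_j}{p_i + p_j} \Big)
 = \sum_{i=1}^{n} p_i + \sum_{i=1}^{n} \sum_{j=1,\, j \ne i}^{n} \frac{p_i\,p_j}{p_i + p_j}
 = 1 + 2 \sum_{1 \le i < j \le n} \frac{p_i\,p_j}{p_i + p_j} .
$$

QED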
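The formulas of Theorem 3 can be evaluated exactly with rational arithmetic. The sketch below uses illustrative function names and, as an assumed example, the probabilities 4/9, 3/9, 2/9 (the relative frequencies of A, B, C in the earlier static example).

```python
from fractions import Fraction

def expected_df(p):
    """E_DF = sum of i * p_i, with p given in non-increasing order."""
    return sum(i * pi for i, pi in enumerate(p, start=1))

def expected_mf(p):
    """E_MF = 1 + 2 * sum over i < j of p_i * p_j / (p_i + p_j)."""
    n = len(p)
    return 1 + 2 * sum(p[i] * p[j] / (p[i] + p[j])
                       for i in range(n) for j in range(i + 1, n))

p = [Fraction(4, 9), Fraction(3, 9), Fraction(2, 9)]
e_df, e_mf = expected_df(p), expected_mf(p)
print(e_df, e_mf, e_mf <= 2 * e_df - 1)   # 16/9 1837/945 True
```

Using `Fraction` avoids floating-point rounding, so the comparison in part (c) is exact.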

Page 25: CSE 4101/5101

Exercises

Page 26: CSE 4101/5101

1. Consider the Move-to-Front (MF) and the Transpose (T) heuristics on linear-lists. Show a sequence s of n dictionary operations search/insert/delete that starts with the empty set, such that the ratio CT(s)/CMF(s) is asymptotically as high as possible. What is that asymptotic ratio as a function of n?

2. Show that the Transpose (T) and the Frequency Count (FC) on-line heuristics are not competitive for:

(a) Static dictionaries. (b) Dynamic dictionaries.

3. Theorem 2 shows that the total cost of the Move-to-Front (MF) heuristic is at most twice the cost of the best off-line strategy (over any sequence s of m dictionary operations on an initially empty list). Show this ratio bound is tight in the worst case. That is, for any sufficiently large m, give an adversarial sequence s of m dictionary operations on an initially empty list such that if algorithm A is the optimal off-line strategy for sequence s, then the asymptotic cost ratio is CMF(s) /CA(s) = 2 - o(1) (i.e., the ratio approaches 2).

4. Theorem 2 shows that the amortized running time of the Move-to-Front (MF) heuristic on linear lists is within a constant (i.e., 2) multiplicative factor of the best off-line strategy. Does the same result hold (possibly with a somewhat larger multiplicative constant factor) for the Move-Half-Way-to-Front (MHWF) strategy? Prove your claim. (The latter strategy moves the accessed/inserted item half-way towards the front. That is, if the item was at position i, then it is moved to position ⌈i/2⌉, without affecting the relative order of the other elements.) [Hint: First compare MHWF with MF, then apply transitivity.]

Page 27: CSE 4101/5101

5. In the no-free-exchange model we assume that both “free” and “paid” exchanges have unit cost each. Prove that in this model the move-to-front algorithm has a competitive ratio of 4. [Hint: You'll need to adjust the potential function.]

6. Randomized Move-to-Front: This algorithm flips a coin and moves the accessed or inserted item to front with probability ½. Now to be competitive, the expected cost (taken over all its random choices) of the algorithm running on the sequence should be within a constant factor of the cost of the optimum off-line algorithm on the same sequence. Prove that in this sense, the randomized move-to-front algorithm is competitive.

7. This question concerns Theorem 3 on the Move-to-Front heuristic. Consider the following (non-increasing) access probability distributions (for a suitable normalization parameter a that depends on n but not on i).

(a) Uniform distribution: $p_i = a$, i = 1..n.
(b) Exponential distribution: $p_i = a\,2^{-i}$, i = 1..n.
(c) Harmonic distribution: $p_i = a/i$, i = 1..n.

For each of these access probability distributions answer the following two questions:

(i) What should a be so that the given formula becomes a valid probability distribution?

(ii) Evaluate the ratio $E_{MF} / E_{DF}$ and derive its exact limit as n goes to infinity. How does it compare with the upper bound 2 given in Theorem 3(c)?

Page 28: CSE 4101/5101

END