
Recursive Design of Hardware Priority Queues

Feb 16, 2016

Page 1: Recursive Design of Hardware Priority Queues

Recursive Design of Hardware Priority Queues

Liron Schiff* (TAU)
Joint work with Yehuda Afek (TAU) and Anat Bremler-Barr (IDC)

*Supported by European Research Council (ERC) Starting Grant no. 259085

Page 2: Recursive Design of Hardware Priority Queues

Priority Queue (PQ)

• Interface:
– PQ.Insert(x)
  • The higher the priority of x, the smaller x is
– PQ.GetMin(): remove and return the minimum
– PQ.Delete(): just remove
– PQ.Peek(): just return the minimum

[Figure: a Priority Queue with Insert and GetMin arrows]
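As a point of reference, the interface above can be sketched in software with a binary heap. This is only an illustration of the operations, not the hardware design from the talk: the class name `HeapPQ` and the snake_case method names are assumptions, and Delete is left out because the slide does not show its argument.

```python
import heapq


class HeapPQ:
    """Software stand-in for the PQ interface (hypothetical names)."""

    def __init__(self):
        self._heap = []

    def insert(self, x):
        # Smaller x means higher priority.
        heapq.heappush(self._heap, x)

    def peek(self):
        # Return the minimum without removing it.
        return self._heap[0]

    def get_min(self):
        # Remove and return the minimum.
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)
```

With a binary heap, each insert and get_min takes O(log n) steps, which is the kind of cost the hardware designs discussed next try to avoid.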

Page 3: Recursive Design of Hardware Priority Queues

Priority Queue Applications

• Networking: Scheduling Packets
– Many flows (1M)
– High rate (100Mpps)
• More applications: Scientific Simulators, Databases

[Figure: numbered packets passing through a Priority Queue (scheduler)]

Page 4: Recursive Design of Hardware Priority Queues

Two Existing Approaches

• Dedicated Hardware Solutions: Fast, Non-Scalable
• Common Software Solutions: Slow, Scalable

Page 5: Recursive Design of Hardware Priority Queues

Our Approach: The Powering Technique

• Merge-Sort concept: Sort, then Merge
• 3 x Base Priority Queue (BPQ, a HW PQ of size √N) + RAM of size N = PQ of size N

Page 6: Recursive Design of Hardware Priority Queues

The Powering Technique

• Insert(x) uses the Input BPQ

[Figure: Input BPQ and Exit BPQ; items shown: 3]

Page 7: Recursive Design of Hardware Priority Queues

The Powering Technique

• Insert(x) uses the Input BPQ

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3]

Page 8: Recursive Design of Hardware Priority Queues

The Powering Technique

• Insert(x) uses the Input BPQ

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3, 5]

Page 9: Recursive Design of Hardware Priority Queues

The Powering Technique

• When Input gets full, move its items to Exit.

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3, 5; label √N]

Page 10: Recursive Design of Hardware Priority Queues

The Powering Technique

• When Input gets full, move its items to Exit.

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3, 5, 4, 7, 8]

Page 11: Recursive Design of Hardware Priority Queues

The Powering Technique

• When Input gets full, move its items to Exit.

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3, 5, 4, 7, 8, 1, 2, 6; label √N]

Page 12: Recursive Design of Hardware Priority Queues

The Powering Technique

• Get_min() extracts the min of Exit or Input

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3, 5, 4, 7, 8, 1, 2, 6, 9; the min is marked]

Page 13: Recursive Design of Hardware Priority Queues

The Powering Technique

• Get_min() extracts the min of Exit or Input, and we update the Exit (if needed).

[Figure: Input BPQ and Exit BPQ; items shown: 0, 3, 5, 4, 7, 8, 1, 2, 6, 9; the min is marked]
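The simple idea walked through on the slides above can be sketched in software as follows. This is a toy model under stated assumptions, not the authors' hardware design: binary heaps stand in for the Input and Exit BPQs, a dictionary of sorted lists stands in for the RAM, the class and attribute names are hypothetical, and no size limit is enforced on the Exit.

```python
import heapq
import math
from collections import deque


class SimplePoweredPQ:
    """Toy model of the simple idea: small queues plus sorted lists in RAM."""

    def __init__(self, capacity):
        self.block = max(1, math.isqrt(capacity))  # Input size, about sqrt(N)
        self.input = []   # stand-in for the Input BPQ
        self.exit = []    # stand-in for the Exit BPQ: (head value, list id)
        self.ram = {}     # list id -> remaining items of that sorted list
        self.next_id = 0

    def insert(self, x):
        heapq.heappush(self.input, x)
        if len(self.input) == self.block:
            self._move_input_to_exit()

    def _move_input_to_exit(self):
        # Drain Input in sorted order; the head goes to Exit, the rest to RAM.
        run = deque(heapq.heappop(self.input) for _ in range(self.block))
        heapq.heappush(self.exit, (run.popleft(), self.next_id))
        self.ram[self.next_id] = run
        self.next_id += 1

    def get_min(self):
        if not self.input and not self.exit:
            raise IndexError("get_min from an empty queue")
        if not self.exit or (self.input and self.input[0] < self.exit[0][0]):
            return heapq.heappop(self.input)
        value, list_id = heapq.heappop(self.exit)
        rest = self.ram[list_id]
        if rest:
            # Update Exit with the next item of the same list.
            heapq.heappush(self.exit, (rest.popleft(), list_id))
        else:
            del self.ram[list_id]
        return value
```

Inserting the items from the figures (3, 0, 5, 4, 7, 8, 1, 2, 6, 9) with capacity 9 produces three sorted lists of three items plus one item left in the Input, and repeated get_min calls return the items in increasing order.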

Page 14: Recursive Design of Hardware Priority Queues

Outline

• Difficulties with the simple idea
• Applying the construction recursively
• Exemplifying on TCAM base units
• Evaluation

Page 15: Recursive Design of Hardware Priority Queues

Two difficulties with the simple idea

1. More than √N lists in the Exit module (as lists are emptied and capacity N is maintained)
2. Move a list from Input to Exit in O(1) operations

[Figure: Input and Exit, with √N size labels]

Page 16: Recursive Design of Hardware Priority Queues

Difficulty 1

• Maintaining capacity N while lists are shrinking

[Figure: Input BPQ and Exit BPQ; items shown: 3, 5, 4, 7, 8, 1, 2, 6, 9]

Page 17: Recursive Design of Hardware Priority Queues

Difficulty 1

• Maintaining capacity N while lists are shrinking
• We continually merge inactive lists during Insert

[Figure: Input BPQ and Exit BPQ; items shown: 3, 5, 4, 7, 8, 1, 2, 6, 9]

Page 18: Recursive Design of Hardware Priority Queues

Difficulty 1

• Maintaining capacity N while lists are shrinking
• We continually merge inactive lists during Insert

[Figure: Input BPQ and Exit BPQ; items shown: 3, 5, 4, 7, 8, 1, 2, 6, 9, 10]

Page 19: Recursive Design of Hardware Priority Queues

Difficulty 1

• Maintaining capacity N while lists are shrinking
• We continually merge inactive lists during Insert

[Figure: Input BPQ and Exit BPQ; items shown: 3, 5, 4, 7, 8, 1, 2, 6, 9, 10, 11]

Page 20: Recursive Design of Hardware Priority Queues

Difficulty 1

• Maintaining capacity N while lists are shrinking
• We continually merge inactive lists during Insert

[Figure: Input BPQ and Exit BPQ; items shown: 3, 5, 4, 7, 8, 1, 2, 6, 9, 10, 11]
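One way to read "continually merge inactive lists during Insert" is that each Insert performs a bounded number of merge steps, so two short sorted lists are gradually combined into one without any single operation paying for the whole merge. The sketch below illustrates only that general idea. The transcript does not say how the lists are chosen or how many steps are taken per Insert, so the class name `IncrementalMerge` and the `budget` parameter are assumptions.

```python
from collections import deque


class IncrementalMerge:
    """Merge two sorted lists a few items at a time (hypothetical helper)."""

    def __init__(self, a, b):
        self.a = deque(a)
        self.b = deque(b)
        self.out = []

    def step(self, budget=2):
        """Move up to `budget` items to the output; return True when done."""
        for _ in range(budget):
            if self.a and self.b:
                src = self.a if self.a[0] <= self.b[0] else self.b
            elif self.a:
                src = self.a
            elif self.b:
                src = self.b
            else:
                return True
            self.out.append(src.popleft())
        return not self.a and not self.b
```

Calling `step()` once per Insert keeps the work per operation constant while the number of separate lists goes down.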

Page 21: Recursive Design of Hardware Priority Queues

Difficulty 2

• Moving all items from Input to RAM in O(1) time

[Figure: Exit BPQ and Input BPQ]

Page 22: Recursive Design of Hardware Priority Queues

Difficulty 2

• Moving all items from Input to RAM in O(1) time
– Use two Input BPQs and switch between them

[Figure: Exit BPQ, two Input BPQs, Buffers]

Page 23: Recursive Design of Hardware Priority Queues

Difficulty 2

• Moving all items from Input to RAM in O(1) time
– Use two Input BPQs and switch between them

[Figure: Exit BPQ, two Input BPQs, Buffers]

Page 24: Recursive Design of Hardware Priority Queues

Difficulty 2

• Moving all items from Input to RAM in O(1) time
– Use two Input BPQs and switch between them

[Figure: Exit BPQ, two Input BPQs, Buffers]

Page 25: Recursive Design of Hardware Priority Queues

Difficulty 2

• Moving all items from Input to RAM in O(1) time
– Use two Input BPQs and switch between them

[Figure: Exit BPQ, two Input BPQs, Buffers]
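The switching idea can be illustrated with a small double-buffering sketch: inserts go to the active Input queue, and when it fills up the two queues swap roles in O(1) while the full one is emptied into a RAM buffer one item per later Insert. This covers only the insert side and is a sketch under assumptions: heaps stand in for the Input BPQs, and the names `DoubleInput`, `active`, `draining`, `buffer` and `runs` are hypothetical.

```python
import heapq


class DoubleInput:
    """Two Input queues that swap roles when the active one is full."""

    def __init__(self, block):
        self.block = block
        self.active = []    # receives new items
        self.draining = []  # full queue being emptied into RAM
        self.buffer = []    # RAM buffer collecting the current sorted list
        self.runs = []      # completed sorted lists in RAM

    def _drain_step(self):
        # Move one item per call, so no single operation pays for a full move.
        if self.draining:
            self.buffer.append(heapq.heappop(self.draining))
            if not self.draining:
                self.runs.append(self.buffer)
                self.buffer = []

    def insert(self, x):
        self._drain_step()
        heapq.heappush(self.active, x)
        if len(self.active) == self.block:
            # The other queue was emptied during the last `block` inserts.
            assert not self.draining
            self.active, self.draining = self.draining, self.active
```

Because filling the active queue takes as many inserts as draining the other one takes steps, the drained queue is always empty by the time the next swap is needed.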

Page 26: Recursive Design of Hardware Priority Queues

Block Size – Time Tradeoff

• Apply the construction recursively
– We used an Exit of size √N and an Input of size √N

[Figure: Exit BPQ and two Input BPQs, each labeled √N]

Page 27: Recursive Design of Hardware Priority Queues

Block Size – Time Tradeoff

• Apply the construction recursively
– We used an Exit of size √N and an Input of size √N
– We can use an Exit of size ∛N and an Input of size ∛(N²)

[Figure: Exit BPQ and two Input BPQs; size labels ∛N and ∛(N²)]

Page 28: Recursive Design of Hardware Priority Queues

Block Size – Time Tradeoff

• Apply the construction recursively
– We used an Exit of size √N and an Input of size √N
– We can use an Exit of size ∛N and an Input of size ∛(N²)
– We can build each Input recursively

[Figure: Exit BPQ (∛N) and two Input BPQs (∛(N²)); one Input is expanded into an Exit BPQ and two Input BPQs, each of size ∛N]

Page 29: Recursive Design of Hardware Priority Queues

Block Size – Time Tradeoff

[Figure: Exit BPQ (∛N) and two Input BPQs (∛(N²)); each Input is expanded into an Exit BPQ and two Input BPQs, each of size ∛N]

Page 30: Recursive Design of Hardware Priority Queues

Block Size – Time Tradeoff

[Figure: Exit BPQ (∛N) and two Input BPQs (∛(N²)); each Input is expanded into an Exit BPQ and two Input BPQs, with Insert arrows into the inner Input BPQs]
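The arithmetic behind the recursive figures can be checked with a short calculation. It assumes the naive recursion drawn on these slides, where every Input is itself built from one Exit and two Inputs; the parameter name `depth` (number of times the construction is applied) and both function names are hypothetical.

```python
def count_bpqs(depth):
    """Number of base queues when the construction is applied `depth` times."""
    if depth == 0:
        return 1  # a single base queue, no powering
    # One Exit BPQ plus two Inputs, each built one level lower.
    return 1 + 2 * count_bpqs(depth - 1)


def base_size(n, depth):
    """Size of each base queue needed for total capacity n."""
    return round(n ** (1 / (depth + 1)))


for depth in (1, 2):
    print(depth, count_bpqs(depth), base_size(10**6, depth))
```

For N = 1,000,000 this gives three base queues of size 1,000 with one application, and seven base queues of size 100 with two, matching the three-queue and seven-queue figures.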

Page 31: Recursive Design of Hardware Priority Queues

Block Size – Time Tradeoff

• A systolic-array-like design:

[Figure: a chain of stages, each with an Exit BPQ, RAM and buffers (Buf), ending in two Input BPQs; stage sizes are labeled in terms of x and N]

Page 32: Recursive Design of Hardware Priority Queues

Resulting Tradeoffs

[Table: columns Recursion Levels, #Queues * Size, #BPQ Ops. (per op.), Parallel op. Time (Latency); cell values are missing from the transcript]

Page 33: Recursive Design of Hardware Priority Queues

TCAM example

Page 34: Recursive Design of Hardware Priority Queues

Ternary CAMs (TCAMs)

• Associative Memory chips
• Properties:
– Ternary values ('0', '1' and '*')
– Already used in routers (IP lookup, classification)
– High throughput (300M ops per sec for 1Mb TCAM)
– Latency and costs increase dramatically with size

[Figure: a TCAM with entries 0 to m; an input word (in) is compared with the ternary entry data, and a matching entry index is output (out)]
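The lookup a TCAM performs can be mimicked in a few lines. This sketch assumes the usual convention that the lowest-index matching entry is reported; the function names and the sample entries are made up for illustration, and a real TCAM compares all entries in parallel rather than in a loop.

```python
def matches(pattern, key):
    """True if every pattern symbol is '*' or equals the key bit."""
    return len(pattern) == len(key) and all(
        p in ("*", k) for p, k in zip(pattern, key)
    )


def tcam_lookup(entries, key):
    """Return the index of the first entry matching key, or None."""
    for index, pattern in enumerate(entries):
        if matches(pattern, key):
            return index
    return None


entries = ["0*10", "11**", "0101"]
print(tcam_lookup(entries, "0110"))  # entry 0 matches
print(tcam_lookup(entries, "1101"))  # entry 1 matches
```

The '*' symbol is what lets one entry cover a whole range of keys, which is the property the TCAM-based priority queue constructions build on.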

Page 35: Recursive Design of Hardware Priority Queues

TCAM based Priority Queue

• Implied by Panigrahy & Sharma (2003)
• Three versions:
A. O(1) time but O(w) entries per item (where w is the width of a priority value in bits)
B. O(log w) time
C. "Empirical O(1)" time but O(w) in the worst case

[Figure: the TCAM-based PQ, labeled BPQ]

Page 36: Recursive Design of Hardware Priority Queues

TCAM based Priority Queue

• Implied by Panigrahy & Sharma (2003)
• Our results:

[Table: rows original, Powering, Powering; columns Space (TCAM bits), Time (TCAM ops.), Latency (TCAM ops.); cell values are missing from the transcript]

Page 37: Recursive Design of Hardware Priority Queues

Powering the TCAM BPQ

• Using small TCAM-based PQs
– Faster TCAM access
– Feasible even when N is large
• Well suited to backbone routers
– TCAMs are already used for IP-lookup

Page 38: Recursive Design of Hardware Priority Queues

Results for TCAM-based PQ

[Figure: two plots against N (thousands of items): TCAM Space (Kb) and Throughput (Mpps); legend entries A, B, C, k=1, k=2; a "Size limit" marker]

Page 39: Recursive Design of Hardware Priority Queues

Applying to Shift-Registers

• Considering a HW PQ implementation of R. Chandra and O. Sinnen.

[Figure: Throughput (Mpps) against N (thousands of items); series SR-BPQ, SR-PPQ(2), SR-PPQ(3); labels Original, K=1, K=2; a "Size limit" marker]

Page 40: Recursive Design of Hardware Priority Queues

Summary

• The Powering Technique
– Combines small HW queues and RAM
– Allows space-time tradeoffs
• Powering TCAMs
– Smaller TCAMs mean shorter operation time
– Matches lower bound for sorting with TCAM
– Also works for Shift Registers

Page 41: Recursive Design of Hardware Priority Queues