Minimizing Expected Lookup Times on Binary Search Trees April 29, 2002 Pankaj Gupta Principal Architect, Cypress Semiconductor pankaj@cs.stanford.edu pcg@cypress.com.
Minimizing Expected Lookup Times on Binary Search Trees
April 29, 2002
High Performance Switching and Routing, Telecom Center Workshop: Sept 4, 1997.
Pankaj Gupta
Principal Architect, Cypress Semiconductor
pankaj@cs.stanford.edu, pcg@cypress.com
http://klamath.stanford.edu/~pankaj
2
Binary Search on Prefix Intervals [1]

Prefix        Interval
P1  */0       0000…1111
P2  00/2      0000…0011
P3  1/1       1000…1111
P4  1101/4    1101…1101
P5  001/3     0010…0011

[Figure: prefixes P1-P5 drawn over the 4-bit number line 0000…1111, which their end points split into six disjoint intervals I1-I6; the address 1001 is marked.]

1. [Lampson et al., Proc. Infocom, 1998]
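The interval idea can be sketched in a few lines. This is a minimal illustration, not the scheme from the cited paper: the function names and the 4-bit width are assumptions taken from the example, and each interval is simply labelled with its longest matching prefix before a standard binary search is run.

```python
from bisect import bisect_right

W = 4  # address width in bits, as in the 4-bit example

# (name, prefix bits); "" stands for the zero-length default prefix
PREFIXES = [("P1", ""), ("P2", "00"), ("P3", "1"), ("P4", "1101"), ("P5", "001")]

def prefix_range(bits):
    """Return the inclusive [low, high] address range covered by a prefix."""
    free = W - len(bits)
    low = (int(bits, 2) << free) if bits else 0
    return low, low + (1 << free) - 1

def build_intervals(prefixes):
    """Split the address space at prefix end points and label each
    interval with its longest matching prefix."""
    starts = {0}
    for _, bits in prefixes:
        low, high = prefix_range(bits)
        starts.add(low)
        if high + 1 < (1 << W):
            starts.add(high + 1)
    starts = sorted(starts)
    labels = []
    for s in starts:
        # All addresses in an interval match the same prefixes,
        # so testing the interval's first address is enough.
        matching = [(len(b), n) for n, b in prefixes
                    if prefix_range(b)[0] <= s <= prefix_range(b)[1]]
        labels.append(max(matching)[1])
    return starts, labels

def lookup(starts, labels, addr):
    """Binary search for the interval containing addr."""
    return labels[bisect_right(starts, addr) - 1]

STARTS, LABELS = build_intervals(PREFIXES)
print(lookup(STARTS, LABELS, 0b1001))
```

For the five prefixes above this produces six intervals, and the address 1001 falls in an interval whose longest matching prefix is P3.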
3
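Alphabetic Tree
[Figure: a binary search tree with internal nodes 0111, 0011, 1101, 1100 and 0001 and leaves I1-I6. Leaf access probabilities: 1/2, 1/4, 1/8, 1/16, 1/32, 1/32. Below the tree: the number line 0000…1111 with prefixes P1-P5, intervals I1-I6 and the address 1001.]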
4
Another Alphabetic Tree
[Figure: a binary search tree with internal nodes 0001, 0011, 0111, 1100 and 1101 and leaves I1-I6. Leaf access probabilities: 1/2, 1/4, 1/8, 1/16, 1/32, 1/32.]
5
Yet Another Alphabetic Tree
[Figure: a binary search tree with internal nodes 0001, 0011, 0111, 1100 and 1101 and leaves I1-I6. Leaf access probabilities: 1/2, 1/4, 1/8, 1/16, 1/32, 1/32.]
6
Original Alphabetic Tree
[Figure: the binary search tree with internal nodes 0111, 0011, 1101, 1100 and 0001 and leaves I1-I6, drawn above the number line 0000…1111 with prefixes P1-P5 and the address 1001. Leaf access probabilities: 1/2, 1/4, 1/8, 1/16, 1/32, 1/32.]
Avgtime = 2.85
Maxtime = 3
7
Optimal Alphabetic Tree
[Figure: a binary search tree with internal nodes 0001, 0011, 0111, 1100 and 1101 and leaves I1-I6. Leaf access probabilities: 1/2, 1/4, 1/8, 1/16, 1/32, 1/32.]
Avgtime = 1.94
Maxtime = 5
Optimal = Minimum average lookup time
8
Optimal Depth-constrained Alphabetic Tree
[Figure: a binary search tree with internal nodes 0001, 0011, 0111, 1100 and 1101 and leaves I1-I6. Leaf access probabilities: 1/2, 1/4, 1/8, 1/16, 1/32, 1/32.]
Avgtime = 2
Maxtime = 4
Depth Constraint = 4
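The average and maximum lookup times quoted for the three trees can be checked with a short calculation. The leaf probabilities are those printed on the slides; the leaf depths are an assumption, read off the tree figures, and the dictionary keys are only labels for this sketch.

```python
from fractions import Fraction

# Access probabilities of leaves I1..I6, as printed on the slides.
PROBS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8),
         Fraction(1, 16), Fraction(1, 32), Fraction(1, 32)]

# Depth of leaves I1..I6 in each tree (assumption: read off the figures).
TREES = {
    "original": [3, 3, 2, 3, 3, 2],
    "optimal": [1, 2, 3, 4, 5, 5],
    "depth-constrained (D = 4)": [1, 2, 4, 4, 4, 4],
}

def avg_time(depths, probs):
    """Expected lookup time: sum over leaves of depth times probability."""
    return sum(l * p for l, p in zip(depths, probs))

for name, depths in TREES.items():
    print(name, float(avg_time(depths, PROBS)), max(depths))
```

Under these assumed depths the averages come out as 2.84375, 1.9375 and 2, which agree with the slide values to within rounding, and the maxima are 3, 5 and 4.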
9
Desired Behavior of Algorithm
[Figure: plot of Average Lookup Time (vertical axis) against Maximum Lookup Time (horizontal axis), with logN marked on the axes.]
10
Problem Statement
• Depth-constrained Huffman trees
• Optimal solutions

Minimize Average Lookup Time $= \sum_i l_i p_i$ s.t. $l_i \le D \;\; \forall i$
where
$l_i$ = access time to reach leaf i
$p_i$ = probability of accessing leaf i
$D$ = depth constraint

Related Work:
[Larmore and Przytycka94] $O(nD \log n)$ with large constant factors.
11
Goal: Near-optimal Depth-constrained Alphabetic Tree
Why near-optimal?
• Simpler to find than an optimal solution.
• Probabilities are approximate.
12
Algorithm MinDPQ
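Fact [Yeung91]: Given $\{p_k\}$, can choose $\{l_k\}$ such that:
$$H(p) \le C < H(p) + 2$$
$$l_k = \begin{cases} \lceil \log_2 (1/p_k) \rceil, & k = 1, n \\ \lceil \log_2 (1/p_k) \rceil + 1, & 2 \le k \le n-1 \end{cases}$$
$$C = \text{avgLookupTime}, \qquad H(p) = -\sum_i p_i \log p_i$$
But: Depth constraint (D) violated whenever $p_k < 2^{-D}$, since then $l_k > D$.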
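As a quick numerical check of the bound, the sketch below evaluates it on the six-leaf example distribution. It assumes the reading of the [Yeung91] lengths in which the two end leaves get $\lceil \log_2(1/p_k) \rceil$ and every interior leaf gets one more; the function names are illustrative, and the code only computes the lengths, it does not build the tree.

```python
from math import ceil, log2

def entropy(probs):
    """H(p) = -sum p_i log2 p_i."""
    return -sum(p * log2(p) for p in probs if p > 0)

def yeung_lengths(probs):
    """Assumed length rule: ceil(log2(1/p_k)) for the first and last
    leaf, and one more than that for every interior leaf."""
    n = len(probs)
    return [ceil(log2(1 / p)) + (0 if k in (0, n - 1) else 1)
            for k, p in enumerate(probs)]

probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/32]
lengths = yeung_lengths(probs)
C = sum(p * l for p, l in zip(probs, lengths))
H = entropy(probs)
print(lengths, H, C)
```

With a depth constraint of D = 4, two of these lengths exceed D, which is the violation the slide points out.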
13
Algorithm MinDPQ (contd.)
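Original distribution $\{p_k\}$, possibly $p_{min} < 2^{-D}$
Transform Probabilities
Transformed distribution $\{q_k\}$, $q_{min} \ge 2^{-D}$
Explicit Solution:
$$q_k^* = \max\left(p_k/\mu,\; 2^{-D}\right), \text{ where } \mu \text{ is s.t. } \sum_k q_k^* = 1$$
$\mu$ can be found in $O(n \log n)$ time and $O(n)$ space
$$C = \sum_k p_k l_k^* < D(p\|q^*) + H(p) + 2 \le C_{opt} + 2$$
Within 2 memory accesses of optimal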
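The probability transform can be illustrated with a small sketch. It assumes the transform has a clamp-and-rescale form, $q_k = \max(p_k/\mu, 2^{-D})$ with $\mu$ chosen so that the $q_k$ sum to 1, and it finds $\mu$ by plain bisection rather than the $O(n \log n)$ method mentioned on the slide. The function name and the iteration count are assumptions of this example.

```python
def transform(probs, depth, iterations=100):
    """Return q with q_k = max(p_k / mu, 2**-depth) and sum(q) == 1.
    mu is located by bisection (a simplification for illustration)."""
    floor = 2.0 ** -depth
    if len(probs) * floor > 1.0:
        raise ValueError("a tree of this depth cannot hold that many leaves")

    def total(mu):
        return sum(max(p / mu, floor) for p in probs)

    # total() never increases as mu grows; total(lo) >= 1 >= total(hi).
    lo, hi = 1.0, max(1.0, max(probs) / floor)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return [max(p / hi, floor) for p in probs]

q = transform([1/2, 1/4, 1/8, 1/16, 1/32, 1/32], depth=4)
print(q)
```

For the six-leaf example and D = 4, the three smallest probabilities are raised to 1/16 and the three largest are scaled down to make room.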
14
Algorithm MinDPQ: Experimental Results
[Figure: plot of average mem-accesses (vertical axis) against maximum mem-accesses (horizontal axis).]
15
Summary of Algorithm MinDPQ
• A practical algorithm to minimize average lookup time while simultaneously keeping maximum lookup time bounded.
• Provably within two memory accesses of the optimal algorithm.
16
Lookup: What’s Used Out There?
• Overwhelming majority:
  – Modifications of multi-bit tries (h/w optimized trie algorithms)
  – DRAM (sometimes SRAM) based, large number of routes (>0.25M)
• Others mostly Ternary-CAM based
  – Smaller number of routes
Packet Classification
April 29 and May 1, 2002
18
Packet Classification: Outline
• Motivation
• Background and Problem Definition
• Classification Schemes
19
RFC 1812: Requirements for IPv4 Routers
• Must perform an IP datagram forwarding decision (called forwarding)
• Must send the datagram out the appropriate interface (called switching)
Optionally: a router MAY choose to perform special processing on incoming packets
20
Background
[Figure: the Internet core, showing IP core routers and IP edge routers.]
Traditional Internet provides a “best-effort” service, and treats all packets going to the same destination identically
21
Motivation: Desire for Additional Services
[Figure: network diagram showing ISP1, ISP2, ISP3, NAP and E1, with interfaces X, Y and Z labeled.]

Service                  Example
Differentiated Service   Ensure that traffic from ISP2 is given higher priority over traffic from ISP3.
Packet Filtering         Deny all web traffic from ISP3 at interface X.
Policy-based routing     Ensure that all web traffic from ISP2 is sent via interface Z.
22
More Value added Services
• Accounting and Billing
  – Treat all video traffic as highest priority and perform accounting for this type of traffic
• Committed Access Rate (rate limiting)
  – Rate limit WWW traffic from interface#7 to 10Mbps
23
Special Processing Requires Identification of Flows
• All packets of a flow obey a pre-defined rule and are processed similarly by the router
• E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol) etc.
• Router needs to identify the flow of every incoming packet and then perform appropriate special processing
24
Flow-aware vs Flow-unaware Routers
• Flow-aware router: keeps track of flows and performs similar processing on packets in a flow
• Flow-unaware router (packet-by-packet router): treats each incoming packet individually
25
Flow-aware Routers Need:
• Additional mechanisms to negotiate, set up, manage and execute on service agreements (rules or policies)
• Capability to distinguish and isolate traffic belonging to different flows based on negotiated service agreements (classification)
26
Flow-aware Router: Basic Architectural Components
Control: Routing, resource reservation, admission control, SLAs
Datapath (per-packet processing): Routing lookup, Packet classification, Special processing, Switching, Scheduling
27
Packet Classification: Outline
• Motivation
• Background and Problem Definition
• Classification Schemes
28
Packet Classification
[Figure: the header of an incoming packet is passed to the packet classification engine, which consults the classifier (policy database), a table of predicate/action entries, and outputs an action.]
29
Header Fields used for Classification
Transport layer header: L4-SP, L4-DP, L4-PROT
Network layer header: L3-SA, L3-DA, L3-PROT
MAC header: L2-SA, L2-DA

DA = Destination address
SA = Source address
PROT = Protocol
SP = Source port
DP = Destination port

L2 = layer 2 (e.g., Ethernet)
L3 = layer 3 (e.g., IP)
L4 = layer 4 (e.g., TCP)
30
Multi-field Packet Classification
Packet Classification: Find the action associated with the highest priority rule matching an incoming packet header.
Rule     Field 1         Field 2           …   Field k   Action
Rule 1   5.3.40.0/21     2.13.8.11/32      …   UDP       A1
Rule 2   5.168.3.0/24    152.133.0.0/16    …   TCP       A2
…        …               …                 …   …         …
Rule N   5.168.0.0/16    152.0.0.0/8       …   ANY       AN
Example: packet (5.168.3.32, 152.133.171.71, …, TCP)
31
Formal Problem Definition
Given a classifier C with N rules, $R_j$, $1 \le j \le N$, where $R_j$ consists of three entities:
1) A regular expression $R_j[i]$, $1 \le i \le d$, on each of the d header fields,
2) A number, $\mathrm{pri}(R_j)$, indicating the priority of the rule in the classifier, and
3) An action, referred to as $\mathrm{action}(R_j)$.
For an incoming packet P with the header considered as a d-tuple of points $(P_1, P_2, \ldots, P_d)$, the d-dimensional packet classification problem is to find the rule $R_m$ with the highest priority among all the rules $R_j$ matching the d-tuple; i.e., $\mathrm{pri}(R_m) > \mathrm{pri}(R_j)$, $\forall j \ne m$, $1 \le j \le N$, such that $P_i$ matches $R_j[i]$, $1 \le i \le d$. We call rule $R_m$ the best matching rule for packet P.
32
Routing Lookup: Instance of 1D Classification
• One-dimension (destination address)
• Forwarding table → classifier
• Routing table entry → rule
• Outgoing interface → action
• Prefix-length → priority
33
Example 4D Classifier
Rule  L3-DA                            L3-SA                             L4-DP         L4-PROT  Action
R1    152.163.190.69/255.255.255.255   152.163.80.11/255.255.255.255     *             *        Deny
R2    152.168.3/255.255.255            152.163.200.157/255.255.255.255   eq www        udp      Deny
R3    152.168.3/255.255.255            152.163.200.157/255.255.255.255   range 20-21   udp      Permit
R4    152.168.3/255.255.255            152.163.200.157/255.255.255.255   eq www        tcp      Deny
R5    *                                *                                 *             *        Deny
34
Example Classification Results
Pkt Hdr  L3-DA            L3-SA             L4-DP  L4-PROT  Rule, Action
P1       152.163.190.69   152.163.80.11     www    tcp      R1, Deny
P2       152.168.3.21     152.163.200.157   www    udp      R2, Deny
35
Geometric Interpretation
[Figure: rules R1-R7 drawn as rectangles in a plane with axes Dimension 1 and Dimension 2, e.g. (128.16.46.23, *) and (144.24/24, 64/16); packets P1 and P2 are points in the plane.]
Packet classification problem: Find the highest priority rectangle containing an incoming point
36
Packet Classification: Outline
• Motivation
• Background and Problem Definition
• Classification Schemes
37
Metrics for Classification Algorithms
• Speed
• Storage requirements
• Ability to handle large classifiers
• Flexibility in implementation
• Low preprocessing time
• Update time
• Scalability in the number of header fields
• Flexibility in rule specification
38
Size of Classifier?
• Microflow recognition: 128K-1M flows in a metro/edge router
• Firewall applications, <2K
• Wildcarded filters, 16-128K
• Depends heavily on the type of router
39
Linear Search
• Keep rules in a linked list
• O(N) storage, O(N) lookup time, O(1) update complexity
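A minimal sketch of linear search over a classifier, using the three rules of the multi-field example. It makes several assumptions: only three fields are modeled (the elided middle fields are left out), a Python list stands in for the linked list, rules are stored in decreasing priority order, and the function name is illustrative.

```python
from ipaddress import ip_address, ip_network

# (field 1 prefix, field 2 prefix, protocol, action), highest priority first.
RULES = [
    ("5.3.40.0/21", "2.13.8.11/32", "UDP", "A1"),
    ("5.168.3.0/24", "152.133.0.0/16", "TCP", "A2"),
    ("5.168.0.0/16", "152.0.0.0/8", "ANY", "AN"),
]

def classify(field1, field2, proto, rules=RULES):
    """Walk the rules in priority order; return the first matching action."""
    for net1, net2, rule_proto, action in rules:
        if (ip_address(field1) in ip_network(net1)
                and ip_address(field2) in ip_network(net2)
                and rule_proto in ("ANY", proto)):
            return action
    return None  # no rule matched

print(classify("5.168.3.32", "152.133.171.71", "TCP"))
```

The example packet matches both the second and the last rule; because the scan stops at the first match, the higher-priority action A2 is returned.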
40
Lookups/Classification with Ternary CAM
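[Figure: the packet header is compared against every location (0 to M) of the TCAM memory array, with rows labeled P32, P31, …, P8 and example entries "1.23.11.3, tcp" and "1.23.x.x, x". The per-location match bits feed a priority encoder, whose output indexes the action memory in RAM to produce the action.]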
Ternary CAMs
Advantages
• Suitable for multiple fields
• Fast: 10-16 ns (66-100 Mpps)
• Simple to understand

Disadvantages
• Inflexible: range-to-prefix blowup
• Density: largest available in 2001 was 8Mb, i.e., 64K x 128 (can be cascaded)
• Power: 8-12W @ 100MHz
• Tough soft-error problem
• Cost: $100-$150 for 8Mb
42
Range-to-prefix Blowup

Rule   Range    Maximal Prefixes
R1     [3,11]   0011, 01**, 10**
R2     [2,7]    001*, 01**
R3     [4,11]   01**, 10**
R4     [4,7]    01**
R5     [1,14]   0001, 001*, 01**, 10**, 110*, 1110

Maximum memory blowup = factor of $(2W-2)^d$
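The prefix lists in the table can be reproduced with a greedy split: starting from the low end of the range, repeatedly take the largest aligned power-of-two block that still fits. The sketch below is one common way to do this and is not taken from the slides; the function name is illustrative.

```python
def range_to_prefixes(low, high, width):
    """Cover the inclusive range [low, high] with a minimal set of
    prefixes over width-bit values, written with '*' for free bits."""
    prefixes = []
    while low <= high:
        # Largest block size allowed by the alignment of `low` ...
        size = (low & -low) if low else (1 << width)
        # ... shrunk until the block fits inside the remaining range.
        while size > high - low + 1:
            size >>= 1
        fixed = width - (size.bit_length() - 1)  # number of fixed leading bits
        head = format(low >> (width - fixed), "0{}b".format(fixed)) if fixed else ""
        prefixes.append(head + "*" * (width - fixed))
        low += size
    return prefixes

for lo, hi in [(3, 11), (2, 7), (4, 11), (4, 7), (1, 14)]:
    print((lo, hi), range_to_prefixes(lo, hi, 4))
```

For W = 4 the range [1,14] needs six prefixes, which is the 2W-2 worst case for a single field.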
43
Radix Trie (Recap)
P1   111*    H1
P2   10*     H2
P3   1010*   H3
P4   10101   H4

[Figure: binary trie with nodes A-H, edges labeled 0 and 1, and prefixes P1-P4 stored at their nodes; adding P5 creates a new node I.]

Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr
Lookup 10111
Add P5=1110*
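A minimal sketch of the trie described above, with the same three fields per node. Addresses and prefixes are given as bit strings; the next hop "H5" for the added prefix P5 is a made-up value, since the slide does not name one.

```python
class TrieNode:
    def __init__(self):
        self.next_hop = None  # set only if this node is a prefix
        self.left = None      # child for bit 0
        self.right = None     # child for bit 1

def insert(root, prefix, next_hop):
    """Walk (and create) the path spelled by the prefix bits."""
    node = root
    for bit in prefix:
        if bit == "0":
            node.left = node.left or TrieNode()
            node = node.left
        else:
            node.right = node.right or TrieNode()
            node = node.right
    node.next_hop = next_hop

def lookup(root, address):
    """Longest prefix match: remember the last prefix seen on the path."""
    node, best = root, root.next_hop
    for bit in address:
        node = node.left if bit == "0" else node.right
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best

root = TrieNode()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    insert(root, prefix, hop)
print(lookup(root, "10111"))

insert(root, "1110", "H5")  # Add P5 = 1110* (hypothetical next hop)
```

The lookup of 10111 passes the node for 10* and then falls off the trie, so the answer is the next hop of P2.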
44
Example Classifier
Rule   Destination Address   Source Address
R1 0* 10*
R2 0* 01*
R3 0* 1*
R4 00* 1*
R5 00* 11*
R6 10* 1*
R7 * 00*
45
Hierarchical Tries
[Figure: a trie on dimension DA whose prefix nodes point to tries on dimension SA; rules R1-R7 are stored at the SA-trie nodes.]
Rules: R1-R7 of the Example Classifier.
$O(NW)$ memory, $O(W^2)$ lookup
Search (000,010)
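The two-level search can be sketched as below, using the seven rules of the example classifier. Tries are represented as nested dictionaries, and the helper names are assumptions of this example. The search visits every DA-trie node on the path of the destination key and, from each one that has an SA trie, walks the source key again, which is where the $O(W^2)$ lookup cost comes from.

```python
# (rule, DA prefix, SA prefix); "" is the wildcard *
RULES = [("R1", "0", "10"), ("R2", "0", "01"), ("R3", "0", "1"),
         ("R4", "00", "1"), ("R5", "00", "11"), ("R6", "10", "1"),
         ("R7", "", "00")]

def new_node():
    return {"children": {}, "rules": [], "next_trie": None}

def descend(root, prefix):
    """Follow the prefix bits from root, creating nodes as needed."""
    node = root
    for bit in prefix:
        node = node["children"].setdefault(bit, new_node())
    return node

def build(rules):
    da_root = new_node()
    for name, da, sa in rules:
        da_node = descend(da_root, da)
        if da_node["next_trie"] is None:
            da_node["next_trie"] = new_node()
        descend(da_node["next_trie"], sa)["rules"].append(name)
    return da_root

def path(root, key):
    """Yield every existing node on the path that key traces from root."""
    node = root
    yield node
    for bit in key:
        node = node["children"].get(bit)
        if node is None:
            return
        yield node

def search(da_root, da, sa):
    """Return all rules matching (da, sa); the caller picks by priority."""
    matches = []
    for da_node in path(da_root, da):
        if da_node["next_trie"] is not None:
            for sa_node in path(da_node["next_trie"], sa):
                matches.extend(sa_node["rules"])
    return matches

TRIE = build(RULES)
print(search(TRIE, "000", "010"))
```

For the search key (000,010) only R2 matches: the SA tries reached from the DA nodes * and 00* hold no prefix of 010.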
46
Set-pruning Tries [Tsuchiya, Sri98]
[Figure: a trie on dimension DA whose prefix nodes point to tries on dimension SA; rules are replicated across the SA tries (R7, R2 and R1 each appear more than once).]
Rules: R1-R7 of the Example Classifier.
$O(N^2)$ memory, $O(2W)$ lookup
Search (000,010)
47
Grid-of-Tries [Sri98]
[Figure: a trie on dimension DA whose prefix nodes point to tries on dimension SA; rules R1-R7 are stored at the SA-trie nodes.]
Rules: R1-R7 of the Example Classifier.
$O(NW)$ memory, $O(2W)$ lookup
Search (000,010)
48
Grid-of-Tries
Advantages
• Good solution for two dimensions

Disadvantages
• Static solution
• Not easily extensible to more than two dimensions

20K 2D rules: 2MB, 9 memory accesses (with expansion)
49
Bitmap-intersection [Lak98]
[Figure: rules R1-R4 drawn as rectangles in two dimensions, with 4-bit bitmaps (one bit per rule) and a query point P1.]
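The bitmap idea can be sketched as follows: each dimension is cut into intervals at the rule end points, every interval carries a bitmap with one bit per rule, and a lookup ANDs the bitmaps of the intervals that contain the packet. The rule coordinates here are hypothetical (the slide gives none), coordinates are assumed non-negative, and Python integers stand in for the hardware bit vectors.

```python
from bisect import bisect_right

# Hypothetical 2-D rules ((x_low, x_high), (y_low, y_high)), highest priority first.
RULES = [((0, 3), (0, 3)), ((2, 7), (1, 5)), ((4, 9), (4, 9)), ((0, 9), (0, 9))]

def build_dimension(rules, dim):
    """Return interval start points and one rule bitmap per interval."""
    points = sorted({0} | {r[dim][0] for r in rules} | {r[dim][1] + 1 for r in rules})
    bitmaps = []
    for start in points:
        bitmap = 0
        for i, rule in enumerate(rules):
            low, high = rule[dim]
            if low <= start <= high:
                bitmap |= 1 << i  # bit i set: rule i covers this interval
        bitmaps.append(bitmap)
    return points, bitmaps

TABLES = [build_dimension(RULES, d) for d in range(2)]

def classify(point):
    """AND the per-dimension bitmaps; the lowest set bit is the best rule."""
    result = (1 << len(RULES)) - 1
    for (points, bitmaps), coord in zip(TABLES, point):
        result &= bitmaps[bisect_right(points, coord) - 1]
    if result == 0:
        return None
    return (result & -result).bit_length() - 1

print(classify((5, 5)))
```

Each lookup reads one N-bit bitmap per dimension, which is why the memory bandwidth of this scheme grows linearly with the number of rules.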
50
Bitmap-intersection
Advantages
• Good solution for multiple dimensions, for small classifiers

Disadvantages
• Static solution
• Large memory bandwidth (scales linearly in N)
• Large amount of memory (scales quadratically in N)
• Hardware-optimized

512 rules: 1Mpps with single FPGA (33MHz) and five 1Mb SRAM chips
51
Crossproducting [Sri98]
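[Figure: rules R1-R4 drawn as rectangles on a grid whose axes are divided into numbered ranges (1-9 and 1-6); point P1 and the crossproduct entries (1,3) and (8,4) are marked.]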
52
Crossproducting
Advantages
• Fast accesses
• Suitable for multiple fields

Disadvantages
• Large amount of memory
• Need caching for bigger classifiers (> 50 rules)

50 rules: 1.5MB, need caching (on-demand crossproducting) for bigger classifiers
Need: d 1-D lookups + 1 memory access, $O(N^d)$ space
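A minimal sketch of crossproducting in two dimensions: each field is resolved independently to an interval number, and the tuple of interval numbers indexes a precomputed table holding the best matching rule. The rule coordinates and helper names are hypothetical, coordinates are assumed non-negative, and a dictionary stands in for the crossproduct memory.

```python
from bisect import bisect_right
from itertools import product

# Hypothetical 2-D rules ((x_low, x_high), (y_low, y_high)), highest priority first.
RULES = [((0, 3), (0, 3)), ((2, 7), (1, 5)), ((4, 9), (4, 9))]

def starts(rules, dim):
    """Start points of the intervals that the rules induce on one axis."""
    return sorted({0} | {r[dim][0] for r in rules} | {r[dim][1] + 1 for r in rules})

STARTS = [starts(RULES, d) for d in range(2)]

def best_rule(point):
    """Reference answer by linear search, used only while precomputing."""
    for i, rule in enumerate(RULES):
        if all(low <= c <= high for (low, high), c in zip(rule, point)):
            return i
    return None

# One table entry per combination of 1-D intervals (the crossproduct).
TABLE = {combo: best_rule([STARTS[d][i] for d, i in enumerate(combo)])
         for combo in product(*(range(len(s)) for s in STARTS))}

def classify(point):
    # d independent 1-D lookups ...
    combo = tuple(bisect_right(STARTS[d], c) - 1 for d, c in enumerate(point))
    # ... followed by a single access into the crossproduct table.
    return TABLE[combo]

print(len(TABLE), classify((5, 5)))
```

Even with three rules the table already has 25 entries, which hints at why the space grows as $O(N^d)$.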
53
Classification Algorithms: Speed vs Storage Tradeoff
Point Location: Lower bounds for N regions in d dimensions.
$O(\log N)$ time with $O(N^d)$ storage, or $O(\log^{d-1} N)$ time with $O(N)$ storage
N = 100, d = 4: $N^d$ = 100 MBytes and $\log^{d-1} N$ = 350 memory accesses
54
Classification Tradeoff in Hardware Switches/Routers
• Power consumption of classification subsystem
• Cost
• Speed
• Density (Storage)
55
Algorithms so far: Summary
• Good for two fields, but do not scale to more than two fields, OR
• Good for very small classifiers (< 50 rules) only, OR
• Have non-deterministic classification time, OR
• Either too slow or consume too much storage
56
One Solution: Heuristics that “seem to work well in real-life”
• Recursive Flow Classification [Gupta, McKeown 1999]
• Hierarchical Intelligent Cuttings [Gupta, McKeown 1999]
• Aggregated Bit-vector [Baboescu, Varghese 2001]
• Good heuristics do better than worst-case bounds for real-life datasets.
57
Properties of real-life classifier datasets
• Hierarchy (to at least some level)
• Structure
58
Classification: What’s Used Out There?
• Majority: Ternary CAMs
  – High performance, cost, power, deterministic worst-case
• Some others: Modifications of RFC
  – Low speed, low cost DRAM-based, heuristic
• Some others: nothing/linear search etc.
59
Packet Classification: References
• [Lak98] T.V. Lakshman, D. Stiliadis, “High speed policy based packet forwarding using efficient multi-dimensional range matching”, Sigcomm 1998, pp 191-202
• [Sri98] V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel. “Fast and scalable layer 4 switching”, Sigcomm 1998, pp 203-214
• V. Srinivasan, G. Varghese, S. Suri. “Fast packet classification using tuple space search”, Sigcomm 1999, pp 135-146
• P. Gupta, N. McKeown, “Packet classification using hierarchical intelligent cuttings,” Hot Interconnects VII, 1999
• [Gupta99] P. Gupta, N. McKeown, “Packet classification on multiple fields,” Sigcomm 1999, pp 147-160
60
Packet Classification: References (contd.)
• M. M. Buddhikot, S. Suri, and M. Waldvogel, “Space decomposition techniques for fast layer-4 switching,” Protocols for High Speed Networks, vol. 66, no. 6, pp 277-83, 1999
• A. Feldmann and S. Muthukrishnan, “Tradeoffs for packet classification,” Proc. Infocom 2000
• T. Woo, “A modular approach to packet classification: algorithms and results,” Proc. Infocom 2000
• S. Iyer, R.R. Kompella, and A. Shelat, “ClassiPI: An architecture for fast and flexible packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 33-41
• P. Gupta and N. McKeown, “Algorithms for packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 24-32
• F. Baboescu and G. Varghese, “Scalable packet classification,” Proc. Sigcomm 2001