Address Lookup and Classification
EE384Y, May 25, 2006
Pankaj Gupta, Principal Architect and Member of Technical Staff, Netlogic Microsystems
[email protected]
http://klamath.stanford.edu/~pankaj
2
Outline
I. Routing Lookups
II. Packet Classification
• Motivation and problem definition
• Classification algorithms
  – Linear search
  – Associative search (TCAM)
  – Trie-based techniques
  – Crossproducting
  – Tradeoffs in classification
  – Heuristic algorithms
• References
3
Motivation: Desire for Additional Services
[Figure: network diagram with labels ISP1, ISP2, ISP3, NAP, E1 and interfaces X, Y, Z]
Service                 Example
Differentiated Service  Ensure that traffic from ISP2 is given higher priority over traffic from ISP3.
Packet Filtering        Deny all web traffic from ISP3 at interface X.
Policy-based routing    Ensure that all web traffic from ISP2 is sent via interface Z.
Other examples: Accounting & billing, rate-limiting, etc.
4
Special Processing Requires Identification of Flows
• All packets of a flow obey a pre-defined rule and are processed similarly by the router
• E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol) etc.
• Router needs to identify the flow of every incoming packet and then perform appropriate special processing based on negotiated service agreements
[Figure labels: Classification; Rules or policies (aka ACL entries, filters)]
5
Flow-aware Router: Basic Architectural Components
Control: Routing, resource reservation, admission control, SLAs
Datapath (per-packet processing): Routing lookup, Packet classification, Special processing, Switching, Scheduling
6
Multi-field Packet Classification
Packet Classification: Find the action associated with the highest priority rule matching an incoming packet header.
Rule    Field 1       Field 2         …  Field k  Action
Rule 1  5.3.40.0/21   2.13.8.11/32    …  UDP      A1
Rule 2  5.168.3.0/24  152.133.0.0/16  …  TCP      A2
…       …             …               …  …        …
Rule N  5.168.0.0/16  152.0.0.0/8     …  ANY      AN
Example: packet (5.168.3.32, 152.133.171.71, …, TCP), with fields (L3-DA, L3-SA, …, L4-PROT)
7
Formal Problem Definition
Given a classifier $C$ with $N$ rules, $R_j$, $1 \le j \le N$, where $R_j$ consists of three entities:
1) A regular expression $R_j[i]$, $1 \le i \le d$, on each of the $d$ header fields,
2) A number, $\mathrm{pri}(R_j)$, indicating the priority of the rule in the classifier, and
3) An action, referred to as $\mathrm{action}(R_j)$.
For an incoming packet $P$ with the header considered as a $d$-tuple of points $(P_1, P_2, \ldots, P_d)$, the $d$-dimensional packet classification problem is to find the rule $R_m$ with the highest priority among all the rules $R_j$ matching the $d$-tuple; i.e., $\mathrm{pri}(R_m) > \mathrm{pri}(R_j)$ for all $j \ne m$, $1 \le j \le N$, such that $P_i$ matches $R_j[i]$, $1 \le i \le d$. We call rule $R_m$ the best matching rule for packet $P$.
8
Routing Lookup: Instance of 1D Classification
• One-dimension (destination address)
• Forwarding table = classifier
• Routing table entry = rule
• Outgoing interface = action
• Prefix-length = priority
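The correspondence above can be sketched in a few lines of Python: each table entry is a rule on one field, the outgoing interface is the action, and the prefix length plays the role of the priority. The table contents and the names `FORWARDING_TABLE` and `route_lookup` are hypothetical, chosen only for illustration. The linear scan is for clarity and is not how a router would implement the lookup.

```python
import ipaddress

# Hypothetical forwarding table: (destination prefix, outgoing interface).
FORWARDING_TABLE = [
    ("0.0.0.0/0", "if0"),
    ("152.0.0.0/8", "if1"),
    ("152.133.0.0/16", "if2"),
]

def route_lookup(dst, table=FORWARDING_TABLE):
    """1-D classification: return the action of the matching rule
    whose priority (prefix length) is highest."""
    addr = ipaddress.ip_address(dst)
    best_len, best_action = -1, None
    for prefix, action in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_action = net.prefixlen, action
    return best_action
```

For example, `route_lookup("152.133.171.71")` matches all three entries and returns the action of the longest one, `"if2"`.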
9
Example 4D Classifier
Rule  L3-DA                           L3-SA                            L4-DP        L4-PROT  Action
R1    152.163.190.69/255.255.255.255  152.163.80.11/255.255.255.255    *            *        Deny
R2    152.168.3/255.255.255           152.163.200.157/255.255.255.255  eq www       udp      Deny
R3    152.168.3/255.255.255           152.163.200.157/255.255.255.255  range 20-21  udp      Permit
R4    152.168.3/255.255.255           152.163.200.157/255.255.255.255  eq www       tcp      Deny
R5    *                               *                                *            *        Deny
10
Example Classification Results
Pkt Hdr  L3-DA           L3-SA            L4-DP  L4-PROT  Rule, Action
P1       152.163.190.69  152.163.80.11    www    tcp      R1, Deny
P2       152.168.3.21    152.163.200.157  www    udp      R2, Deny
11
Geometric Interpretation
[Figure: rules R1-R7 drawn as rectangles in the plane spanned by Dimension 1 and Dimension 2, e.g. (128.16.46.23, *) and (144.24/24, 64/16); packets P1 and P2 are points.]
Packet classification problem: Find the highest priority rectangle containing an incoming point
13
Metrics for Classification Algorithms
• Speed
• Storage requirements
• Ability to handle large classifiers
• Low preprocessing time
• Update time
• Scalability in the number of header fields
• Flexibility in rule specification
14
Size/Update-rate of Classifier?
• Micro-flow recognition
  – 128K-1M flows in a metro/edge router
  – Also requires a high update rate (but rules have few wildcards)
• Firewall applications
  – <2K rules per interface
  – Requires a low update rate (usually configured at start-up/boot-up time)
• Depends heavily on the type of router
15
Linear Search
• Keep rules in a linked list
• O(N) storage, O(N) lookup time, O(1) update complexity
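A minimal sketch of linear search, applied to the rules of the Example 4D Classifier slide. Several things here are assumptions made for illustration: a Python list stands in for the linked list, rules are kept in decreasing priority order so the first match wins, "eq www" is taken to mean port 80, and 152.168.3/255.255.255 is read as 152.168.3.0/24. The names `RULES`, `net` and `classify` are hypothetical.

```python
import ipaddress

ANY_PORT = (0, 65535)
WWW = (80, 80)  # assumption: "eq www" means destination port 80

def net(prefix):
    return ipaddress.ip_network(prefix)

# (name, L3-DA, L3-SA, L4-DP range, L4-PROT or None for "*", action)
# Rules are listed in decreasing priority order.
RULES = [
    ("R1", net("152.163.190.69/32"), net("152.163.80.11/32"), ANY_PORT, None, "Deny"),
    ("R2", net("152.168.3.0/24"), net("152.163.200.157/32"), WWW, "udp", "Deny"),
    ("R3", net("152.168.3.0/24"), net("152.163.200.157/32"), (20, 21), "udp", "Permit"),
    ("R4", net("152.168.3.0/24"), net("152.163.200.157/32"), WWW, "tcp", "Deny"),
    ("R5", net("0.0.0.0/0"), net("0.0.0.0/0"), ANY_PORT, None, "Deny"),
]

def classify(dst, src, dport, proto, rules=RULES):
    """Scan the rules in priority order and return the first match: O(N) time."""
    da, sa = ipaddress.ip_address(dst), ipaddress.ip_address(src)
    for name, dnet, snet, (lo, hi), rproto, action in rules:
        if da in dnet and sa in snet and lo <= dport <= hi and rproto in (None, proto):
            return name, action
    return None
```

With these assumptions, packets P1 and P2 from the Example Classification Results slide classify as `("R1", "Deny")` and `("R2", "Deny")`.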
16
Ternary Match Operation
• Each TCAM entry stores a value, V, and mask, M
• Hence, two bits (Vi and Mi) for each bit position i (i=1..W)
• For an incoming packet header, H = {Hi}, the TCAM entry outputs a match if Hi matches Vi in each bit position for which Mi equals ‘1’.
Vi  Mi  Match in bit position i?
X   0   Yes
0   1   Iff (Hi==0)
1   1   Iff (Hi==1)
Optional Exercise: What is the logic equation for Z (boolean variable denoting whether a TCAM entry matched)?
Optional Exercise: What is the logic equation for Z (boolean variable denoting whether a TCAM entry matched), if instead of (Vi, Mi) we store (Ai,Bi) where (0,0) = always match, (1,1) = always mismatch, (0,1) = match0, and (1,0) = match1?
17
Lookups/Classification with Ternary CAM
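[Figure: the packet header (e.g. 1.23.11.3, tcp) is compared against entries 0..M of the TCAM memory array (e.g. 1.23.x.x, x); the per-entry match bits feed a priority encoder, whose output indexes the action memory (RAM) to produce the action. For LPM, the entries are ordered by prefix length: P32, P31, …, P8.]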
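The ternary match and the priority encoder can be modelled in software as follows. This is only a behavioural sketch: real hardware compares all entries in parallel, whereas the loop here visits them in order and returns the lowest matching index. The helper names (`ternary_match`, `prefix_entry`, `tcam_lookup`, `ip_to_int`) and the example entries are hypothetical. Entries are stored longest prefix first, which is what makes the lowest matching index the longest-prefix match.

```python
import ipaddress

def ip_to_int(addr):
    return int(ipaddress.ip_address(addr))

def ternary_match(header, value, mask):
    """Match iff header equals value in every bit position where mask is 1."""
    return ((header ^ value) & mask) == 0

def prefix_entry(prefix, action):
    """Encode an IPv4 prefix as a (value, mask, action) TCAM entry."""
    net = ipaddress.ip_network(prefix)
    return int(net.network_address), int(net.netmask), action

def tcam_lookup(header, entries):
    """Model of the priority encoder: the lowest matching index wins."""
    for index, (value, mask, action) in enumerate(entries):
        if ternary_match(header, value, mask):
            return index, action
    return None

# Hypothetical entries for LPM, ordered by decreasing prefix length.
ENTRIES = [
    prefix_entry("1.23.11.0/24", "A"),
    prefix_entry("1.23.0.0/16", "B"),
    prefix_entry("0.0.0.0/0", "default"),
]
```

Here `tcam_lookup(ip_to_int("1.23.11.3"), ENTRIES)` matches all three entries and returns `(0, "A")`.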
18
Range-to-prefix Blowup
Rule  Range   Maximal Prefixes
R1    [3,11]  0011, 01**, 10**
R2    [2,7]   001*, 01**
R3    [4,11]  01**, 10**
R4    [4,7]   01**
R5    [1,14]  0001, 001*, 01**, 10**, 110*, 1110
Maximum memory blowup = factor of $(2W-2)^d$
Luckily, real-life does not see too many arbitrary ranges.
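The table above can be reproduced with a short routine that repeatedly peels off the largest aligned power-of-two block from the low end of the range. This is a sketch of one common way to do the conversion; the function name `range_to_prefixes` is hypothetical and the field width W is passed explicitly.

```python
def range_to_prefixes(lo, hi, width):
    """Split the integer range [lo, hi] of a width-bit field into maximal prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... shrunk until it fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        k = size.bit_length() - 1  # the block covers 2**k values, so k wildcard bits
        head = format(lo >> k, "b").zfill(width - k) if k < width else ""
        prefixes.append(head + "*" * k)
        lo += size
    return prefixes
```

For W = 4, the range [1,14] yields six prefixes, which equals the worst case of 2W-2 per field.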
19
TCAMs
Advantages
Extensible to multiple fields
Fast: 6-8 ns today (133-150 million searches per second), going to 250 Msps
Simple to understand and use
Disadvantages
Inflexible: range-to-prefix blowup
Power: ~15-20W @ 100Msps
Cost: $200-$250 for ~2MByte
Density: largest available in 2006 is ~2MB, i.e., 128K x 128 (can be cascaded)
Tough memory soft-error problem
20
Example Classifier
Rule  Destination Address  Source Address
R1    0*                   10*
R2    0*                   01*
R3    0*                   1*
R4    00*                  1*
R5    00*                  11*
R6    10*                  1*
R7    *                    00*
21
Hierarchical Tries
O(NW) memory, $O(W^2)$ lookup
Rules R1-R7 as in the Example Classifier above.
Search (000,010)
[Figure: a trie on Dimension DA whose nodes point into tries on Dimension SA; the SA tries store rules R1-R7.]
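A minimal sketch of a two-dimensional hierarchical trie, built from the rules of the Example Classifier slide. Tries are represented as nested dicts keyed by "0" and "1"; the keys "sa" and "best" and the function names are hypothetical. The sketch also assumes that rules are listed in decreasing priority, so the lowest rule index is the best match. The search visits every DA node on the path and walks that node's SA trie, which is where the $O(W^2)$ lookup cost comes from.

```python
# (DA prefix, SA prefix) for R1..R7, with the trailing "*" dropped.
RULES = [("0", "10"), ("0", "01"), ("0", "1"), ("00", "1"),
         ("00", "11"), ("10", "1"), ("", "00")]

def insert(root, prefix):
    """Walk/create the path for prefix and return its final node."""
    node = root
    for bit in prefix:
        node = node.setdefault(bit, {})
    return node

def build_hierarchical(rules):
    da_root = {}
    for index, (da, sa) in enumerate(rules):
        sa_root = insert(da_root, da).setdefault("sa", {})
        # Rules arrive in priority order, so the first index stored is the best.
        insert(sa_root, sa).setdefault("best", index)
    return da_root

def path(root, key):
    """Yield every trie node whose prefix matches key, root first."""
    node = root
    yield node
    for bit in key:
        node = node.get(bit)
        if node is None:
            return
        yield node

def search(da_root, da, sa):
    """Return the index of the best matching rule, or None."""
    best = None
    for da_node in path(da_root, da):
        if "sa" not in da_node:
            continue
        for sa_node in path(da_node["sa"], sa):
            index = sa_node.get("best")
            if index is not None and (best is None or index < best):
                best = index
    return best
```

For the search (000,010) shown on the slide, `search(build_hierarchical(RULES), "000", "010")` returns index 1, i.e. R2.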
22
Set-pruning Tries [Tsuchiya, Sri98]
$O(N^2)$ memory, O(2W) lookup
Rules R1-R7 as in the Example Classifier above.
Search (000,010)
[Figure: a trie on Dimension DA whose nodes point into tries on Dimension SA; rules are replicated into the SA tries of more specific DA prefixes, so R7 appears four times.]
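A sketch of set-pruning tries for the same seven rules. As assumptions for illustration, tries are nested dicts, the keys "sa" and "best" and the function names are hypothetical, and rules are taken to be in decreasing priority order. Each DA prefix gets an SA trie holding its own rules plus copies of the rules of every shorter matching DA prefix. A lookup therefore needs one DA walk and one SA walk, at the price of replicated rules.

```python
# (DA prefix, SA prefix) for R1..R7, with the trailing "*" dropped.
RULES = [("0", "10"), ("0", "01"), ("0", "1"), ("00", "1"),
         ("00", "11"), ("10", "1"), ("", "00")]

def insert(root, prefix):
    node = root
    for bit in prefix:
        node = node.setdefault(bit, {})
    return node

def build_set_pruning(rules):
    da_root = {}
    for da in sorted({da for da, _ in rules}):
        sa_root = insert(da_root, da).setdefault("sa", {})
        for index, (rule_da, rule_sa) in enumerate(rules):
            # Replicate every rule whose DA prefix covers this DA node.
            if da.startswith(rule_da):
                node = insert(sa_root, rule_sa)
                node["best"] = min(node.get("best", index), index)
    return da_root

def search(da_root, da, sa):
    """One DA walk to find the deepest SA trie, then one SA walk."""
    node, sa_root = da_root, da_root.get("sa")
    for bit in da:
        node = node.get(bit)
        if node is None:
            break
        sa_root = node.get("sa", sa_root)
    if sa_root is None:
        return None
    node, best = sa_root, sa_root.get("best")
    for bit in sa:
        node = node.get(bit)
        if node is None:
            break
        if "best" in node and (best is None or node["best"] < best):
            best = node["best"]
    return best
```

The search (000,010) again returns index 1 (R2), but only the SA trie attached to DA prefix 00 is consulted.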
23
Grid-of-Tries [Sri98]
O(NW) memory, O(2W) lookup
Rules R1-R7 as in the Example Classifier above.
Search (000,010)
[Figure: a trie on Dimension DA whose nodes point into tries on Dimension SA; the SA tries store rules R1-R7.]
24
Grid-of-Tries
Advantages
Good solution for two dimensions
Disadvantages
Difficult to carry out updates
Not easily extensible to more than two dimensions
20K 2D rules: 2MB, 9 memory accesses (with prefix-expansion)
25
Crossproducting [Sri98]
[Figure: rules R1-R4 drawn as rectangles on a grid with axis ticks 1-9 and 1-6; labels P1, (1,3) and (8,4).]
26
Crossproducting
Advantages
Fast accesses
Suitable for multiple fields
Disadvantages
Large amount of memory
Need caching for bigger classifiers (> 50 rules)
50 rules: 1.5MB, need caching (on-demand crossproducting) for bigger classifiers
Need: d 1-D lookups + 1 memory access, $O(N^d)$ space
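A sketch of crossproducting on two prefix fields, using the seven-rule Example Classifier rather than the range example in the figure. Everything named here (`RULES`, `longest_match`, `build_table`, `classify`) is hypothetical, rules are assumed to be in decreasing priority order, and the 1-D lookups are done by a plain scan where a real implementation would use any fast longest-prefix-match structure. The table is precomputed for every combination of per-field prefixes, which is the source of the $O(N^d)$ space.

```python
from itertools import product

# (DA prefix, SA prefix) for R1..R7, with the trailing "*" dropped.
RULES = [("0", "10"), ("0", "01"), ("0", "1"), ("00", "1"),
         ("00", "11"), ("10", "1"), ("", "00")]

def longest_match(prefixes, key):
    """1-D lookup: the longest prefix in the set that matches key, or None."""
    matches = [p for p in prefixes if key.startswith(p)]
    return max(matches, key=len) if matches else None

def build_table(rules):
    """Precompute the best rule index for every (DA prefix, SA prefix) pair."""
    das = {da for da, _ in rules}
    sas = {sa for _, sa in rules}
    table = {}
    for da_p, sa_p in product(das, sas):
        hits = [i for i, (da, sa) in enumerate(rules)
                if da_p.startswith(da) and sa_p.startswith(sa)]
        table[(da_p, sa_p)] = min(hits) if hits else None
    return das, sas, table

def classify(das, sas, table, da, sa):
    """d independent 1-D lookups, then a single table access."""
    key = (longest_match(das, da), longest_match(sas, sa))
    return table.get(key)
```

With 4 distinct DA prefixes and 5 distinct SA prefixes, the table has 20 entries for only 7 rules.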
28
Classification Algorithms: Speed vs. Storage Tradeoff
$O(\log N)$ time with $O(N^d)$ storage, or
$O(\log^{d-1} N)$ time with $O(N)$ storage
Lower bounds for Point Location in N regions with d dimensions from Computational Geometry
N = 100, d = 4: $N^d$ = 100 MBytes and $\log^{d-1} N$ = 350 memory accesses
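The figures for N = 100 and d = 4 can be checked with a few lines of arithmetic. Two assumptions are made here that the slide does not state: one byte per table entry and base-2 logarithms. Under those assumptions the storage comes to 10^8 bytes, and the access count comes to about 293, or 343 if log N is first rounded up to 7. The slide's figure of 350 is in line with the rounded value.

```python
import math

N, d = 100, 4

# Storage for the O(log N)-time bound, assuming one byte per entry.
storage_bytes = N ** d

# Memory accesses for the O(N)-storage bound, assuming base-2 logarithms.
accesses_exact = math.log2(N) ** (d - 1)
accesses_rounded = math.ceil(math.log2(N)) ** (d - 1)

print(storage_bytes, round(accesses_exact), accesses_rounded)
```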
29
One Solution: Heuristics that “seem to work well in real-life”
• Recursive Flow Classification [Gupta, McKeown 1999]
  – Generalization of crossproducting to conserve storage
• Hierarchical Intelligent Cuttings [Gupta, McKeown 1999]
• Aggregated Bit-vector [Baboescu, Varghese 2001]
• HyperCuts [Singh, Baboescu, Varghese 2003]
• Good heuristics do better than worst-case bounds for real-life datasets.
Properties of real-life classifiers:
• Hierarchy (to at least some level)
• Structure
30
How Well Do Heuristics Do?
• Very well at low speeds
  – E.g., HyperCuts can process ~20K rules in five dimensions using about 9Mb of memory in ~20 memory accesses (i.e., ~15 million searches per second)
• At high speeds, occupy too much (and classifier-dependent) storage
  – E.g., RFC can process ~1K rules in five dimensions using ~16Mb memory in ~6 memory accesses (i.e., ~50 million searches per second)
31
Classification: What’s Used Out There?
• Majority of hardware platforms: TCAMs
  – High performance, cost, power, deterministic worst-case
• Some others: Modifications of RFC
  – Low speed, low cost DRAM-based, heuristic
  – Works well in software platforms
• Some others: HyperCuts/HiCuts
• Others: nothing/linear search/simulated-parallel-search etc.
32
Lookup: What’s Used Out There?
• Overwhelming majority of routers:
  – Modifications of multi-bit tries (h/w optimized trie algorithms)
  – DRAM (sometimes SRAM) based, large number of routes (>0.25M)
  – Parallelism required for speed/storage becomes an issue
• Others mostly TCAM based
  – Allows sharing the same TCAM for both lookup and classification
33
Packet Classification: References
• F. Baboescu and G. Varghese, “Scalable packet classification,” Proc. Sigcomm 2001
• [Lak98] T.V. Lakshman and D. Stiliadis. “High speed policy based packet forwarding using efficient multi-dimensional range matching”, Sigcomm 1998, pp 191-202
• K. Lakshminarayanan, A. Rangarajan and S. Venkatachary. “Algorithms for advanced packet classification with Ternary CAMs”, Sigcomm 2005.
• [Sri98] V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel. “Fast and scalable layer 4 switching”, Sigcomm 1998, pp 203-214 [Grid-of-tries, crossproducting]
• V. Srinivasan, G. Varghese, S. Suri. “Fast packet classification using tuple space search”, Sigcomm 1999, pp 135-146
• P. Gupta, N. McKeown, “Packet classification using hierarchical intelligent cuttings,” Hot Interconnects VII, 1999
• [Gupta99] P. Gupta, N. McKeown, “Packet classification on multiple fields,” Sigcomm 1999, pp 147-160 [RFC]
34
Packet Classification: References (contd.)
• P. Gupta, “Algorithms for routing lookups and packet classification”, PhD Thesis, Ch 1 and 4, Dec 2000, available at http://yuba.stanford.edu/~pankaj/phd.html [Background and introduction to Classification]
• P. Gupta and N. McKeown, “Algorithms for packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 24-32
• S. Singh, F. Baboescu, G. Varghese and J. Wang, “Packet classification using multidimensional cutting,” Proc. ACM Sigcomm 2003. [HyperCuts]
• S. Iyer, R.R. Kompella, and A. Shelat, “ClassiPI: An architecture for fast and flexible packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 33-41
• TCAM vendors: netlogicmicro.com, idt.com