EE384Y: Packet Switch Architectures, Part II: Address Lookup and Classification. Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University. [email protected] http://www.stanford.edu/~nickm

Mar 26, 2015


Adrian Clayton
Page 1

High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.

EE384Y: Packet Switch Architectures, Part II

Address Lookup and Classification

Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University

[email protected]
http://www.stanford.edu/~nickm

Page 2

Generic Router Architecture (Review from EE384x)

[Figure: packet path through a router. Header Processing (Lookup IP Address, Update Header) consults an Address Table that maps IP Address to Next Hop (~1M prefixes, off-chip DRAM). The packet (Data, Hdr) is then queued in Buffer Memory (~1M packets, off-chip DRAM).]

Page 3

Lookups Must be Fast

Year    Line        40B packets (Mpkt/s)
1997    622 Mb/s    1.94
1999    2.5 Gb/s    7.81
2001    10 Gb/s     31.25
2003    40 Gb/s     125

1. Lookup mechanism must be simple and easy to implement
2. (Surprise?) Memory access time is the long-term bottleneck
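The packet rates on this slide follow from dividing the line rate by the size of a minimum-length (40-byte) packet. A minimal sketch of that arithmetic; the function names are illustrative and not part of the course material:

```python
def packets_per_second(line_rate_bps, packet_bytes=40):
    """Worst-case arrival rate for back-to-back packets of the given size."""
    return line_rate_bps / (packet_bytes * 8)


def lookup_budget_ns(line_rate_bps, packet_bytes=40):
    """Time available per lookup, in nanoseconds, at the worst-case rate."""
    return 1e9 / packets_per_second(line_rate_bps, packet_bytes)


if __name__ == "__main__":
    for year, rate in [(1997, 622e6), (1999, 2.5e9), (2001, 10e9), (2003, 40e9)]:
        mpps = packets_per_second(rate) / 1e6
        print(f"{year}: {mpps:.2f} Mpkt/s, {lookup_budget_ns(rate):.1f} ns per lookup")
```

At 40 Gb/s the budget works out to 8 ns per packet, so a lookup can afford very few memory accesses.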

Page 4

Memory Technology (2003-04)

Technology         Single chip density   $/chip ($/MByte)          Access speed   Watts/chip
Networking DRAM    64 MB                 $30-$50 ($0.50-$0.75)     40-80ns        0.5-2W
SRAM               4 MB                  $20-$30 ($5-$8)           4-8ns          1-3W
TCAM               1 MB                  $200-$250 ($200-$250)     4-8ns          15-30W

Note: Price, speed and power are manufacturer and market dependent.

Page 5

Lookup Mechanism is Protocol Dependent

Networking Protocol    Lookup Mechanism               Techniques
MPLS, ATM, Ethernet    Exact match search             – Direct lookup
                                                      – Associative lookup
                                                      – Hashing
                                                      – Binary/Multi-way Search Trie/Tree
IPv4, IPv6             Longest-prefix match search    – Radix trie and variants
                                                      – Compressed trie
                                                      – Binary search on prefix intervals

Page 6

Outline

I. Routing Lookups
• Overview
• Exact matching
  – Direct lookup
  – Associative lookup
  – Hashing
  – Trees and tries
• Longest prefix matching
  – Why LPM?
  – Tries and compressed tries
  – Binary search on prefix intervals
• References

II. Packet Classification

Page 7

Exact Matches in ATM/MPLS

Direct Memory Lookup

[Figure: the VCI/MPLS label is applied directly as the memory address; the data read out is (Outgoing Port, new VCI/label).]

• VCI/Label space is 24 bits
  - Maximum 16M addresses. With 64b data, this is 1Gb of memory.
• VCI/Label space is private to one link
• Therefore, table size can be “negotiated”
• Alternately, use a level of indirection
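Direct lookup amounts to indexing an array with the label. A minimal sketch under stated assumptions: the class name, the method names and the small default label width are hypothetical, chosen so the example stays small, while `table_bits` reproduces the slide's 1Gb figure for a 24-bit label and 64-bit entries.

```python
class DirectLookupTable:
    """Label-indexed table: one memory read per lookup."""

    def __init__(self, label_bits=12):
        # One slot per possible label; None marks an unused label.
        self.entries = [None] * (1 << label_bits)

    def install(self, label, out_port, new_label):
        self.entries[label] = (out_port, new_label)

    def lookup(self, label):
        return self.entries[label]


def table_bits(label_bits, data_bits):
    """Total memory, in bits, needed to cover the whole label space."""
    return (1 << label_bits) * data_bits


if __name__ == "__main__":
    table = DirectLookupTable()
    table.install(label=5, out_port=3, new_label=77)
    print(table.lookup(5))
    print(table_bits(24, 64) == 2**30)  # 16M entries x 64b = 1Gb
```

Negotiating a smaller label space on the link shrinks the array in direct proportion.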

Page 8

Exact Matches in Ethernet Switches

• Layer-2 addresses are usually 48 bits long,
• The address is global, not just local to the link,
• The range/size of the address is not “negotiable” (like it is with ATM/MPLS)
• 2^48 > 10^12, therefore cannot hold all addresses in table and use direct lookup.

Page 9

Exact Matches in Ethernet Switches (Associative Lookup)

• Associative memory (aka Content Addressable Memory, CAM) compares all entries in parallel against incoming data.

[Figure: the 48-bit network address is presented as data to the associative memory (“CAM”); the match location it returns is used as the address into a “normal” memory, whose data output is the port.]

Page 10

Exact Matches in Ethernet Switches: Hashing

• Use a pseudo-random hash function (relatively insensitive to actual function)
• Bucket linearly searched (or could be binary search, etc.)
• Leads to unpredictable number of memory references

[Figure: the 48-bit network address passes through a hashing function to give a shorter index (16 bits, say). The index addresses a memory whose data is a pointer; the pointer addresses a second memory holding the list/bucket of network addresses that hash to this index.]
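The hashed table with linearly searched buckets can be sketched as follows. This is an illustration under assumptions, not the slide's implementation: `HashedMacTable` and its methods are hypothetical names, CRC-32 from the standard library stands in for the unspecified pseudo-random hash, and the lookup also reports how many memory references it made so the unpredictability is visible.

```python
import zlib


class HashedMacTable:
    """48-bit address -> port, using a hash index and per-index buckets."""

    def __init__(self, index_bits=16):
        self.mask = (1 << index_bits) - 1
        self.buckets = [[] for _ in range(1 << index_bits)]

    def _index(self, mac):
        # CRC-32 is an assumed stand-in for the pseudo-random hash function.
        return zlib.crc32(mac.to_bytes(6, "big")) & self.mask

    def learn(self, mac, port):
        bucket = self.buckets[self._index(mac)]
        for i, (known, _) in enumerate(bucket):
            if known == mac:
                bucket[i] = (mac, port)  # refresh an existing entry
                return
        bucket.append((mac, port))

    def lookup(self, mac):
        """Return (port, memory_references); port is None on a miss."""
        refs = 1  # reading the bucket pointer
        for known, port in self.buckets[self._index(mac)]:
            refs += 1  # one reference per bucket entry examined
            if known == mac:
                return port, refs
        return None, refs


if __name__ == "__main__":
    table = HashedMacTable(index_bits=4)
    for i in range(100):
        table.learn(0x001122334400 + i, i % 8)
    print(table.lookup(0x001122334400 + 42))
```

With only 16 buckets for 100 addresses, the reference count varies from lookup to lookup, which is the non-determinism the slide warns about.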

Page 11

Exact Matches Using Hashing: Number of memory references

Expected number of memory references:

$$ER = \frac{1}{2}\left(1 + E[\text{length of list} \mid \text{list is not empty}]\right) = \frac{1}{2}\left(1 + \frac{1}{1 - \left(1 - \frac{1}{N}\right)^{M}}\right)$$

Where:

ER = Expected number of memory references
M = Number of memory addresses in table
N = Number of linked lists
M = N
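The expected number of memory references can be evaluated numerically. A minimal sketch, assuming each of the M addresses hashes independently and uniformly into one of N lists; under that assumption the expected length of a non-empty list is (M/N) / (1 - (1 - 1/N)^M), which reduces to the slide's expression when M = N. The function names are illustrative.

```python
def expected_nonempty_list_length(m, n):
    """E[list length | list is not empty] for m keys hashed into n lists."""
    p_nonempty = 1.0 - (1.0 - 1.0 / n) ** m
    return (m / n) / p_nonempty


def expected_memory_references(m, n):
    """ER = 1/2 * (1 + expected length of a non-empty list)."""
    return 0.5 * (1.0 + expected_nonempty_list_length(m, n))


if __name__ == "__main__":
    for size in (2**8, 2**12, 2**16):
        print(size, round(expected_memory_references(size, size), 4))
```

For large tables with M = N the value settles near 1.29, since (1 - 1/N)^N approaches 1/e.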

Page 12

Exact Matches in Ethernet Switches: Perfect Hashing

[Figure: the 48-bit network address passes through a hashing function to give an index (16 bits, say) that addresses a memory whose data output is the port.]

There always exists a perfect hash function.

Goal: With a perfect hash function, memory lookup always takes O(1) memory references.

Problem:
- Finding perfect hash functions (particularly minimal perfect hashings) is very complex.
- Updates?

Page 13

Exact Matches in Ethernet Switches: Hashing

• Advantages:
  – Simple
  – Expected lookup time is small
• Disadvantages:
  – Inefficient use of memory
  – Non-deterministic lookup time

Attractive for software-based switches, but decreasing use in hardware platforms

Page 14

Exact Matches in Ethernet Switches: Trees and Tries

Binary Search Tree

[Figure: a binary search tree over N entries, branching on < and > at each node, with depth log2 N.]

Lookup time dependent on table size, but independent of address length, storage is O(N)

Binary Search Trie

[Figure: a binary trie branching on 0 and 1 at each level, with example leaves labelled 111 and 010.]

Lookup time bounded and independent of table size, storage is O(NW)

Page 15

Exact Matches in Ethernet Switches: Multiway tries

16-ary Search Trie

[Figure: a 16-ary search trie. Each node has entries indexed 0000 to 1111, each holding either a pointer to a child (ptr) or 0. The example keys shown are 000011110000 and 111111111111.]

Ptr=0 means no children

Q: Why can’t we just make it a 2^48-ary trie?

Page 16

Exact Matches in Ethernet Switches: Multiway tries

Degree of Tree   # Mem References   # Nodes (x10^6)   Total Memory (Mbytes)   Fraction Wasted (%)
2                48                 1.09              4.3                     49
4                24                 0.53              4.3                     73
8                16                 0.35              5.6                     86
16               12                 0.25              8.3                     93
64               8                  0.17              21                      98
256              6                  0.12              64                      99.5

Table produced from 2^15 randomly generated 48-bit addresses

As degree increases, more and more pointers are “0”
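The "# Mem References" column is the trie depth: a trie of degree 2^k consumes k address bits per level, so a 48-bit address needs 48/k levels. A minimal sketch of that arithmetic (the function name is illustrative); it does not attempt to reproduce the node counts or memory figures, which depend on the random address set.

```python
import math

ADDRESS_BITS = 48  # Ethernet address width used on this slide


def memory_references(degree, address_bits=ADDRESS_BITS):
    """Worst-case trie depth for a power-of-two degree."""
    stride = degree.bit_length() - 1  # bits consumed per level, log2(degree)
    return math.ceil(address_bits / stride)


if __name__ == "__main__":
    for degree in (2, 4, 8, 16, 64, 256):
        print(degree, memory_references(degree))
```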

Page 17

Exact Matches in Ethernet Switches: Trees and Tries

• Advantages:
  – Fixed lookup time
  – Simple to implement and update
• Disadvantages:
  – Inefficient use of memory and/or requires large number of memory references

Page 18

Outline

I. Routing Lookups
• Overview
• Exact matching
  – Direct lookup
  – Associative lookup
  – Hashing
  – Trees and tries
• Longest prefix matching
  – Why LPM?
  – Tries and compressed tries
  – Binary search on prefix intervals
• References

II. Packet Classification

Page 19

Longest Prefix Matching: IPv4 Addresses

• 32-bit addresses
• Dotted quad notation: e.g. 12.33.32.1
• Can be represented as integers on the IP number line [0, 2^32-1]: a.b.c.d denotes the integer (a*2^24 + b*2^16 + c*2^8 + d)

[Figure: the IP number line, running from 0.0.0.0 to 255.255.255.255.]
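The mapping between dotted quad notation and the IP number line can be sketched in a few lines. The helper names are illustrative:

```python
def ip_to_int(dotted):
    """a.b.c.d -> a*2^24 + b*2^16 + c*2^8 + d."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return a * 2**24 + b * 2**16 + c * 2**8 + d


def int_to_ip(value):
    """Inverse mapping: a point on the number line back to dotted quad."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    point = ip_to_int("12.33.32.1")
    print(point, int_to_ip(point))
```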

Page 20

Class-based Addressing

[Figure: the IP number line divided into classes A, B, C, D and E, with boundaries marked at 0.0.0.0, 128.0.0.0 and 192.0.0.0.]

Class           Range                          MS bits   netid        hostid
A               0.0.0.0 - 127.255.255.255      0         bits 1-7     bits 8-31
B               128.0.0.0 - 191.255.255.255    10        bits 2-15    bits 16-31
C               192.0.0.0 - 223.255.255.255    110       bits 3-23    bits 24-31
D (multicast)   224.0.0.0 - 239.255.255.255    1110      -            -
E (reserved)    240.0.0.0 - 255.255.255.255    1111      -            -

Page 21

Lookups with Class-based Addresses

[Figure: three exact-match tables, one each for Class A, Class B and Class C, mapping netid to port#. Example entries: 23 to Port 1, 186.21 to Port 2, 192.33.32 to Port 3. The address 192.33.32.1 is looked up by exact match on its netid 192.33.32 in the Class C table, giving Port 3.]
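With class-based addresses the leading bits fix the netid length, so the lookup reduces to one exact match in the table for that class. A minimal sketch using the three example entries from the figure; the function names and the use of Python dictionaries for the exact-match tables are assumptions made for illustration.

```python
# One exact-match table per class, keyed by netid octets (entries from the slide).
TABLES = {
    "A": {(23,): "Port 1"},
    "B": {(186, 21): "Port 2"},
    "C": {(192, 33, 32): "Port 3"},
}


def classful_netid(address):
    """Return (class, netid) for a dotted-quad address, or (None, None)."""
    octets = tuple(int(part) for part in address.split("."))
    first = octets[0]
    if first < 128:          # leading bit 0
        return "A", octets[:1]
    if first < 192:          # leading bits 10
        return "B", octets[:2]
    if first < 224:          # leading bits 110
        return "C", octets[:3]
    return None, None        # class D/E: no netid/hostid split


def lookup(address):
    cls, netid = classful_netid(address)
    if cls is None:
        return None
    return TABLES[cls].get(netid)


if __name__ == "__main__":
    print(lookup("192.33.32.1"))
```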

Page 22

Problems with Class-based Addressing

• Fixed netid-hostid boundaries too inflexible
  – Caused rapid depletion of address space
• Exponential growth in size of routing tables

Page 23

Early Exponential Growth in Routing Table Sizes

[Figure: plot with y-axis “Number of BGP routes advertised”.]

Page 24

Classless Addressing (and CIDR)

• Eliminated class boundaries
• Introduced the notion of a variable length prefix between 0 and 32 bits long
• Prefixes represented by P/l: e.g., 122/8, 212.128/13, 34.43.32/22, 10.32.32.2/32 etc.
• An l-bit prefix represents an aggregation of 2^(32-l) IP addresses
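The aggregation size of each example prefix can be checked with Python's standard `ipaddress` module. This is only a sketch: the slide's shorthand (e.g. 212.128/13) is written out as a full dotted quad because `ip_network` expects one, and `addresses_in_prefix` is an illustrative helper name.

```python
import ipaddress


def addresses_in_prefix(length):
    """An l-bit IPv4 prefix aggregates 2^(32-l) addresses."""
    return 2 ** (32 - length)


# The slide's examples, written out in full dotted-quad form.
EXAMPLES = ["122.0.0.0/8", "212.128.0.0/13", "34.43.32.0/22", "10.32.32.2/32"]

if __name__ == "__main__":
    for text in EXAMPLES:
        net = ipaddress.ip_network(text)
        print(text, net.num_addresses, addresses_in_prefix(net.prefixlen))
```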

Page 25

CIDR: Hierarchical Route Aggregation

[Figure: a backbone with routers R1, R2, R3 and R4 serving two ISPs: ISP P with prefix 192.2.0/22 and ISP Q with prefix 200.11.0/22. Site S (192.2.1/24) and Site T (192.2.2/24) are shown with ISP P. On the IP number line, 192.2.1/24 and 192.2.2/24 lie inside 192.2.0/22, and 200.11.0/22 is a separate range. The backbone routing table holds the aggregate entry 192.2.0/22, R2.]

Page 26

Post-CIDR Routing Table Sizes

Source: http://www.cidr-report.org/

Page 27

Routing Lookups with CIDR

[Figure: the IP number line with three routing table entries: 192.2.0/22 (next hop R2), 192.2.2/24 (next hop R3, lying inside 192.2.0/22) and 200.11.0/22 (next hop R4). Three example destination addresses are marked: 192.2.0.1, 192.2.2.100 and 200.11.0.33.]

LPM: Find the most specific route, or the longest matching prefix among all the prefixes matching the destination address of an incoming packet
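The definition can be made concrete with a deliberately naive linear scan over the three entries from the figure. This sketch uses Python's standard `ipaddress` module and writes the prefixes in full dotted-quad form; it illustrates the LPM rule only and is not one of the lookup schemes covered later.

```python
import ipaddress

# Routing table from the slide: (prefix, next hop).
ROUTES = [
    (ipaddress.ip_network("192.2.0.0/22"), "R2"),
    (ipaddress.ip_network("192.2.2.0/24"), "R3"),
    (ipaddress.ip_network("200.11.0.0/22"), "R4"),
]


def longest_prefix_match(destination, routes=ROUTES):
    """Return the next hop of the longest prefix containing the destination."""
    addr = ipaddress.ip_address(destination)
    best = None
    for net, hop in routes:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


if __name__ == "__main__":
    for dst in ("192.2.0.1", "192.2.2.100", "200.11.0.33"):
        print(dst, longest_prefix_match(dst))
```

The address 192.2.2.100 matches both 192.2.0/22 and 192.2.2/24; the /24 wins because it is the more specific route.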

Page 28

Longest Prefix Match is Harder than Exact Match

• The destination address of an arriving packet does not carry with it the information to determine the length of the longest matching prefix
• Hence, one needs to search among the space of all prefix lengths, as well as the space of all prefixes of a given length

Page 29

LPM in IPv4

Use 32 exact match algorithms for LPM!

[Figure: the network address is fed in parallel to 32 exact-match units: one against prefixes of length 1, one against prefixes of length 2, and so on up to length 32. A priority encoder picks the longest match and outputs the port.]
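One exact-match table per prefix length can be sketched with one dictionary per length. Hardware would probe all lengths in parallel and priority-encode the result; this sequential sketch simply probes from the longest length down. The class name is hypothetical, and the test prefixes are the deck's own 5-bit examples (111*, 10*, 1010*, 10101).

```python
class PerLengthLpm:
    """Longest prefix match built from one exact-match table per length."""

    def __init__(self, width=32):
        self.width = width
        self.tables = {length: {} for length in range(1, width + 1)}

    def add(self, prefix_bits, port):
        """prefix_bits is a string of '0'/'1', e.g. '1010' for 1010*."""
        self.tables[len(prefix_bits)][int(prefix_bits, 2)] = port

    def lookup(self, address):
        """address is an integer of self.width bits."""
        for length in range(self.width, 0, -1):  # longest first = priority
            key = address >> (self.width - length)
            port = self.tables[length].get(key)
            if port is not None:
                return port
        return None


if __name__ == "__main__":
    lpm = PerLengthLpm(width=5)
    for bits, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
        lpm.add(bits, hop)
    print(lpm.lookup(0b10111))
```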

Page 30

Metrics for Lookup Algorithms

• Speed (= number of memory accesses)
• Storage requirements (= amount of memory)
• Low update time (support ~5K updates/s)
• Scalability
  – With length of prefix: IPv4 unicast (32b), Ethernet (48b), IPv4 multicast (64b), IPv6 unicast (128b)
  – With size of routing table: (sweetspot for today’s designs = 1 million)
• Flexibility in implementation
• Low preprocessing time

Page 31

Radix Trie

Prefix   Bits     Next hop
P1       111*     H1
P2       10*      H2
P3       1010*    H3
P4       10101    H4

[Figure: binary radix trie with nodes A to H holding prefixes P1 to P4; edges are labelled 0 and 1. Two operations are illustrated: Lookup 10111, and Add P5=1110*, which adds a new node I on a 0 edge.]

Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr
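A binary radix trie with the node layout named on the slide (next hop, left pointer, right pointer) can be sketched as follows. The class and method names are illustrative, and addresses and prefixes are passed as bit strings to keep the example short.

```python
class TrieNode:
    __slots__ = ("next_hop", "left", "right")

    def __init__(self):
        self.next_hop = None  # set only if a prefix ends at this node
        self.left = None      # child on bit 0
        self.right = None     # child on bit 1


class RadixTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, prefix, next_hop):
        node = self.root
        for bit in prefix:
            if bit == "0":
                if node.left is None:
                    node.left = TrieNode()
                node = node.left
            else:
                if node.right is None:
                    node.right = TrieNode()
                node = node.right
        node.next_hop = next_hop

    def lookup(self, address):
        """Walk the address bits, remembering the last prefix seen."""
        node, best = self.root, self.root.next_hop
        for bit in address:
            node = node.left if bit == "0" else node.right
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best


if __name__ == "__main__":
    trie = RadixTrie()
    for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
        trie.add(prefix, hop)
    print(trie.lookup("10111"))
    trie.add("1110", "H5")  # the slide's "Add P5=1110*"; H5 is an assumed next-hop name
    print(trie.lookup("11100"))
```

Looking up 10111 passes the node for 10* (P2) and then falls off the trie at the fourth bit, so the remembered match P2 is returned.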

Page 32

Radix Trie

• W-bit prefixes: O(W) lookup, O(NW) storage and O(W) update complexity

Advantages
• Simplicity
• Extensible to wider fields

Disadvantages
• Worst case lookup slow
• Wastage of storage space in chains

Page 33

Leaf-pushed Binary Trie

Prefix   Bits     Next hop
P1       111*     H1
P2       10*      H2
P3       1010*    H3
P4       10101    H4

[Figure: leaf-pushed binary trie for P1 to P4 with nodes A, B, C, D, E and G; edges are labelled 0 and 1. Prefixes appear only at the leaves, and P2 appears at two of them.]

Trie node: left-ptr or next-hop, right-ptr or next-hop

Page 34

PATRICIA

Prefix   Bits     Next hop
P1       111*     H1
P2       10*      H2
P3       1010*    H3
P4       10101    H4

Bitpos 12345

[Figure: Patricia tree for P1 to P4 with nodes A to G; edges are labelled 0 and 1. Internal nodes are labelled with the bit position to test (2, 3 and 5 in the example) and the prefixes P1 to P4 sit at the leaves. The example operation is Lookup 10111.]

Patricia tree internal node: bit-position, left-ptr, right-ptr

Page 35

PATRICIA

• W-bit prefixes: O(W^2) lookup, O(N) storage and O(W) update complexity

Advantages
• Decreased storage
• Extensible to wider fields

Disadvantages
• Worst case lookup slow
• Backtracking makes implementation complex

Page 36

Path-compressed Tree

Prefix   Bits     Next hop
P1       111*     H1
P2       10*      H2
P3       1010*    H3
P4       10101    H4

[Figure: path-compressed tree for P1 to P4 with nodes A to E; edges are labelled 0 and 1. Internal nodes are annotated with (bitstring, prefix, bit-position): A = (1, , 2), B = (10, P2, 4), D = (1010, P3, 5); P1 and P4 sit at the leaves. The example operation is Lookup 10111.]

Path-compressed tree node structure: variable-length bitstring, next-hop (if prefix present), bit-position, left-ptr, right-ptr

Page 37

Path-compressed Tree

• W-bit prefixes: O(W) lookup, O(N) storage and O(W) update complexity

Advantages
• Decreased storage

Disadvantages
• Worst case lookup slow

Page 38

Multi-bit Tries

Binary trie: Depth = W, Degree = 2, Stride = 1 bit

Multi-ary trie: Depth = W/k, Degree = 2^k, Stride = k bits

Page 39

Prefix Expansion with Multi-bit Tries

If stride = k bits, prefix lengths that are not a multiple of k need to be expanded

E.g., k = 2:

Prefix   Expanded prefixes
0*       00*, 01*
11*      11*

Maximum number of expanded prefixes corresponding to one non-expanded prefix = 2^(k-1)
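Prefix expansion pads a prefix with every combination of bits needed to reach the next multiple of the stride. A minimal sketch; `expand_prefix` is an illustrative name and prefixes are written as bit strings without the trailing `*`.

```python
from itertools import product


def expand_prefix(prefix, k):
    """Expand a prefix to the next length that is a multiple of k."""
    pad = (-len(prefix)) % k  # bits missing to reach a stride boundary
    return [prefix + "".join(bits) for bits in product("01", repeat=pad)]


if __name__ == "__main__":
    print(expand_prefix("0", 2))
    print(expand_prefix("11", 2))
    print(len(expand_prefix("1", 3)))  # worst case for k = 3
```

The worst case is a prefix one bit past a stride boundary, which needs k-1 padding bits and therefore expands into 2^(k-1) prefixes.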

Page 40

Four-ary Trie (k=2)

Prefix   Bits     Next hop
P1       111*     H1
P2       10*      H2
P3       1010*    H3
P4       10101    H4

[Figure: four-ary trie with nodes A to H built from the expanded prefixes; edges are labelled with 2-bit values (10, 11). P1 appears as the expanded prefixes P11 and P12, and P4 as P41 and P42; P2 and P3 need no expansion. The example operation is Lookup 10111.]

A four-ary trie node: next-hop-ptr (if prefix), ptr00, ptr01, ptr10, ptr11
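A fixed-stride multi-bit trie can be sketched as below. This is an illustration under assumptions rather than the slide's exact node layout: next hops are stored per entry inside each node (so expansion happens at insert time), the original prefix length is kept alongside so that a longer prefix wins when expansions collide, and addresses are bit strings whose length is a multiple of the stride.

```python
class MultibitNode:
    """One node of a fixed-stride trie with 2**k entries."""

    def __init__(self, k):
        size = 1 << k
        self.hops = [None] * size      # next hop stored at each stride value
        self.lens = [-1] * size        # length of the prefix that set the hop
        self.children = [None] * size  # child node for each stride value


class FixedStrideTrie:
    def __init__(self, k):
        self.k = k
        self.root = MultibitNode(k)

    def add(self, prefix, hop):
        k, node, rest = self.k, self.root, prefix
        while len(rest) > k:           # consume whole strides
            idx = int(rest[:k], 2)
            if node.children[idx] is None:
                node.children[idx] = MultibitNode(k)
            node = node.children[idx]
            rest = rest[k:]
        pad = k - len(rest)            # bits to expand in the last stride
        base = (int(rest, 2) << pad) if rest else 0
        for idx in range(base, base + (1 << pad)):
            if len(prefix) >= node.lens[idx]:  # longer original prefix wins
                node.hops[idx] = hop
                node.lens[idx] = len(prefix)

    def lookup(self, address):
        assert len(address) % self.k == 0, "address must be whole strides"
        node, best = self.root, None
        for i in range(0, len(address), self.k):
            idx = int(address[i:i + self.k], 2)
            if node.hops[idx] is not None:
                best = node.hops[idx]
            node = node.children[idx]
            if node is None:
                break
        return best


if __name__ == "__main__":
    trie = FixedStrideTrie(k=2)
    for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
        trie.add(prefix, hop)
    print(trie.lookup("101110"))
```

With k = 2 a 6-bit address is resolved in at most three node visits, compared with six in the binary trie.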

Page 41

Prefix Expansion Increases Storage Consumption

• Replication of next-hop ptr
• Greater number of unused (null) pointers in a node

Time ~ W/k
Storage ~ NW/k * 2^(k-1)
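The time and storage estimates can be tabulated for a few strides to see the trade-off. A minimal sketch of the arithmetic only; the function names are illustrative and the results are the slide's order-of-magnitude estimates, not measured figures.

```python
def lookup_time(width, k):
    """Time ~ W/k memory accesses."""
    return width / k


def storage_estimate(n_prefixes, width, k):
    """Storage ~ N * (W/k) * 2^(k-1)."""
    return n_prefixes * (width / k) * 2 ** (k - 1)


if __name__ == "__main__":
    W, N = 32, 1_000_000
    for k in (1, 2, 4, 8):
        print(k, lookup_time(W, k), storage_estimate(N, W, k))
```

Going from k = 1 to k = 8 for W = 32 cuts the access count from 32 to 4 while the storage estimate grows by a factor of 16.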

Page 42

Generalization: Different Strides at Each Trie Level

• 16-8-8 split
• 4-10-10-8 split
• 24-8 split
• 21-3-8 split

Optional Exercise: Why does this not work well for IPv6?

Page 43

Choice of Strides: Controlled Prefix Expansion [Sri98]

Given a forwarding table and a desired number of memory accesses in the worst case (i.e., maximum tree depth, D)

A dynamic programming algorithm to compute the optimal sequence of strides that minimizes the storage requirements: runs in O(W^2 D) time

Advantages
• Optimal storage under these constraints

Disadvantages
• Updates lead to sub-optimality anyway
• Hardware implementation difficult

Page 44

Binary Search on Prefix Intervals [Lampson98]

Prefix       Interval
P1  /0       0000…1111
P2  00/2     0000…0011
P3  1/1      1000…1111
P4  1101/4   1101…1101
P5  001/3    0010…0011

[Figure: number line from 0000 to 1111 with ticks at 0000, 0010, 0100, 0110, 1000, 1010, 1100, 1110 and 1111. The ranges covered by P1 to P5 are drawn above it, dividing the line into six intervals I1 to I6. The example address 1001 is marked.]
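The five prefixes can be turned into disjoint intervals and searched with a binary search. A minimal sketch using Python's standard `bisect` module: the interval owners are precomputed by a naive scan, which is an assumption made for brevity and not the construction used in [Lampson98].

```python
from bisect import bisect_right

W = 4  # address width used on the slide
PREFIXES = {"P1": "", "P2": "00", "P3": "1", "P4": "1101", "P5": "001"}


def prefix_range(bits, width=W):
    """Lowest and highest address covered by a prefix."""
    pad = width - len(bits)
    low = (int(bits, 2) << pad) if bits else 0
    return low, low + (1 << pad) - 1


def build_intervals(prefixes, width=W):
    """Return (starts, owners): interval start points and their longest prefix."""
    starts = {0}
    for bits in prefixes.values():
        low, high = prefix_range(bits, width)
        starts.add(low)
        if high + 1 < (1 << width):
            starts.add(high + 1)
    starts = sorted(starts)
    owners = []
    for point in starts:
        best = None
        for name, bits in prefixes.items():
            low, high = prefix_range(bits, width)
            if low <= point <= high and (best is None or len(bits) > len(prefixes[best])):
                best = name
        owners.append(best)
    return starts, owners


STARTS, OWNERS = build_intervals(PREFIXES)


def lookup(address):
    """Binary search for the interval containing the address."""
    return OWNERS[bisect_right(STARTS, address) - 1]


if __name__ == "__main__":
    print(STARTS, OWNERS)
    print(lookup(0b1001))
```

The six start points correspond to intervals I1 to I6, and the example address 1001 falls in the interval owned by P3.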

Page 45

Alphabetic Tree

[Figure: an alphabetic tree whose leaves are the intervals I1 to I6. Internal nodes compare the address against the keys 0111, 0011, 1101, 0001 and 1100, branching on “>”. The probabilities shown are 1/2, 1/4, 1/8, 1/16, 1/32 and 1/32. Below the tree is the number line from 0000 to 1111 with prefixes P1 to P5, intervals I1 to I6 and the example address 1001.]

Page 46

Another Alphabetic Tree

[Figure: a second alphabetic tree over the intervals I1 to I6, using the same keys 0001, 0011, 0111, 1100 and 1101 and the same probabilities 1/2, 1/4, 1/8, 1/16, 1/32 and 1/32.]

Page 47

Multiway Search on Intervals

• W-bit N prefixes: O(logN) lookup, O(N) storage

Advantages
• Storage is linear
• Can be ‘balanced’
• Lookup time independent of W

Disadvantages
• But, lookup time is dependent on N
• Incremental updates complex
• Each node is big in size: requires higher memory bandwidth

Page 48

Routing Lookups: References

• [lulea98] A. Brodnik, S. Carlsson, M. Degermark, S. Pink. “Small Forwarding Tables for Fast Routing Lookups”, Sigcomm 1997, pp 3-14. [Example of techniques for decreasing storage consumption]

• [gupta98] P. Gupta, S. Lin, N. McKeown. “Routing lookups in hardware at memory access speeds”, Infocom 1998, pp 1241-1248, vol. 3. [Example of hardware-optimized trie with increased storage consumption]

• P. Gupta, B. Prabhakar, S. Boyd. “Near-optimal routing lookups with bounded worst case performance”, Proc. Infocom, March 2000. [Example of deliberately skewing alphabetic trees]

• P. Gupta. “Algorithms for routing lookups and packet classification”, PhD Thesis, Ch 1 and 2, Dec 2000, available at http://yuba.stanford.edu/~pankaj/phd.html [Background and introduction to LPM]

Page 49

Routing Lookups: References (contd)

• [lampson98] B. Lampson, V. Srinivasan, G. Varghese. “IP lookups using multiway and multicolumn search”, Infocom 1998, pp 1248-56, vol. 3.

• [LC-trie] S. Nilsson, G. Karlsson. “Fast address lookup for Internet routers”, IFIP Intl Conf on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.

• [sri98] V. Srinivasan, G. Varghese. “Fast IP lookups using controlled prefix expansion”, Sigmetrics, June 1998.

• [wald98] M. Waldvogel, G. Varghese, J. Turner, B. Plattner. “Scalable high speed IP routing lookups”, Sigcomm 1997, pp 25-36.