Scalable high-throughput SRAM-based architecture for IP-lookup using FPGA
Authors: Hoang Le; Weirong Jiang; V. K. Prasanna
Publisher: FPL 2008. Field Programmable Logic and Applications, 2008.
Presenter: Yu-Ping Chiang
Date: 2008/12/03
Dec 20, 2015
2
Outline
Binary-tree-based IP Lookup
  Mapping
  Searching
Architecture
  Cache based
Performance
  Throughput
  Comparison
3
Binary-tree-based IP Lookup
Based on the binary search tree property
  Each node has a value.
  Left subtree nodes contain only smaller values.
  Right subtree nodes contain only greater values.
  An element can be found in (1 + log N) operations.
Pre-computation
  Pad prefixes to 32 bits with 1s (the padded bits).
  Sort by the concatenation of prefix and padded bits.
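The pre-computation step can be sketched as follows. This is a minimal illustration, not the authors' code: it uses the 8-bit width of the slide examples instead of 32 bits, the names `pad_prefix`, `make_entry` and `sort_entries` are hypothetical, and both the ascending order and the tie-break between equal padded values (longer prefix first) are assumptions the slides leave open.

```python
WIDTH = 8  # assumption: 8-bit addresses as in the slide examples (IPv4 would use 32)


def pad_prefix(bits, width=WIDTH):
    """Pad a prefix given as a bit string (e.g. "011") with 1s up to `width` bits."""
    return bits + "1" * (width - len(bits))


def make_entry(bits, width=WIDTH):
    """Return (padded value as int, prefix length) for one prefix."""
    return (int(pad_prefix(bits, width), 2), len(bits))


def sort_entries(prefixes, width=WIDTH):
    """Sort entries by padded value; ties put the longer prefix first (assumption)."""
    entries = [make_entry(p, width) for p in prefixes]
    return sorted(entries, key=lambda e: (e[0], -e[1]))


if __name__ == "__main__":
    for value, length in sort_entries(["011", "000", "01011", "01001"]):
        print(format(value, "08b"), length)
```

Storing the prefix length next to the padded value is what later lets a lookup tell which leading bits of the padded value belong to the prefix.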
4
Binary-tree-based IP Lookup
Build the binary search tree
  Full binary tree except for the last level.
  Last level is left-aligned.
  => complete tree
# of nodes: $N$
# of levels (height): $n = \lfloor \log_2 N \rfloor + 1$
# of nodes in last level: $\Delta = N - (2^{n-1} - 1)$
5
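Binary-tree-based IP Lookup
[Figure: complete tree annotated with sub-tree sizes $2^{n-2} - 1$ and the $\Delta$ nodes of the last level]
# of nodes: $N$
# of levels (height): $n = \lfloor \log_2 N \rfloor + 1$
# of nodes in last level: $\Delta = N - (2^{n-1} - 1)$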
6
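Binary-tree-based IP Lookup
[Figure: complete tree annotated with sub-tree sizes $2^{n-2} - 1$ and $2^{n-1} - 1$ and the $\Delta$ nodes of the last level]
# of nodes: $N$
# of levels (height): $n = \lfloor \log_2 N \rfloor + 1$
# of nodes in last level: $\Delta = N - (2^{n-1} - 1)$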
7
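Binary-tree-based IP Lookup
Recursively find the root
[Figure: complete tree annotated with sub-tree sizes $2^{n-2} - 1$ and $2^{n-1} - 1$]
# of nodes: $N$
# of levels (height): $n = \lfloor \log_2 N \rfloor + 1$
# of nodes in last level: $\Delta = N - (2^{n-1} - 1)$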
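One way to turn the recursive root-finding into code is sketched below. It is an illustration under stated assumptions rather than the authors' procedure: the list is sorted in ascending order, positions are zero-based, and the tree is stored in heap order (children of slot `i` at `2*i + 1` and `2*i + 2`). The indexing convention used on the slides is not recoverable from the figures. The names `root_index` and `build_complete_bst` are hypothetical.

```python
def root_index(count):
    """Zero-based position of the root in an ascending list of `count` items.

    The position equals the size of the left subtree of a complete tree.
    """
    n = count.bit_length()                  # floor(log2(count)) + 1 levels
    delta = count - ((1 << (n - 1)) - 1)    # nodes in the last level
    half = (1 << (n - 1)) // 2              # last-level slots under the left subtree
    if delta <= half:
        return half - 1 + delta             # last level sits entirely on the left
    return (1 << (n - 1)) - 1               # left subtree is completely full


def build_complete_bst(sorted_items):
    """Place an ascending list into a heap-ordered array forming a complete BST."""
    tree = [None] * len(sorted_items)

    def place(lo, hi, slot):
        if lo >= hi:
            return
        x = lo + root_index(hi - lo)
        tree[slot] = sorted_items[x]
        place(lo, x, 2 * slot + 1)          # smaller values go left
        place(x + 1, hi, 2 * slot + 2)      # greater values go right

    place(0, len(sorted_items), 0)
    return tree


if __name__ == "__main__":
    print(root_index(8), root_index(4))
    print(build_complete_bst(list(range(1, 9))))
```

Under these assumptions a list of 8 items has its root at position 4, and the 4-item left sub-list has its root at position 2. Heap order is convenient because every tree level becomes one contiguous slice of the array.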
8
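Binary-tree-based IP Lookup
Step 1: $N = 8$
$n = \lfloor \log_2 8 \rfloor + 1 = 4$
$\Delta = 8 - (2^{4-1} - 1) = 1$
$2^{n-2} = 2^{4-2} = 4$
x = 4 - 1 + 1 = 4
Node 4: 01111111/011 (padded prefix / prefix length)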
9
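Binary-tree-based IP Lookup
Step 2: $N = 4$
$n = \lfloor \log_2 4 \rfloor + 1 = 3$
$\Delta = 4 - (2^{3-1} - 1) = 1$
$2^{n-2} = 2^{3-2} = 2$
x = 2 - 1 + 1 = 2
Root of the sub-list: 01011111/101
[Figure: partial tree with nodes 4 and 6]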
10
Binary-tree-based IP Lookup
[Figure: the resulting complete binary search tree over nodes 1 to 8, with node 4 at the root]
11
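Binary-tree-based IP Lookup: Search
[Figure: search tree over nodes 1 to 8; branches labeled ≤ (left) and > (right)]
Step 1: IP = 01001010, node = 01111111/011
IP ≤ node value; first 3 bits differ (010 vs 011). Not a match, go left.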
12
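Binary-tree-based IP Lookup: Search
[Figure: search tree over nodes 1 to 8; branches labeled ≤ (left) and > (right)]
Step 2: IP = 01001010, node = 01011111/101
IP ≤ node value; first 5 bits differ (01001 vs 01011). Not a match, go left.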
13
Binary-tree-based IP Lookup: Search
[Figure: search tree over nodes 1 to 8]
Step 3: IP = 01001010, node = 01001111/101
First 5 bits agree (01001): match.
IP ≤ node value, so continue to the left to search for a longer match.
14
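Binary-tree-based IP Lookup: Search
[Figure: search tree over nodes 1 to 8; the node from step 3 is marked as a match]
Step 4: IP = 01001010, node = 00011111/011
First 3 bits differ (010 vs 000). Not a match.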
15
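Binary-tree-based IP Lookup: Search
[Figure: search tree over nodes 1 to 8 with the final match marked]
Result: match (the node found in step 3).
Property: when one prefix extends another, the longer prefix pads to a value no greater than that of the shorter one, so it lies in the left branch.
ex: 1011* and 101* → 10111111 and 10111111 (=)
    1010* and 101* → 10101111 and 10111111 (<), left branch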
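The four search steps can be replayed with a short sketch. Everything here is illustrative: `TREE` is a hypothetical 8-entry table in heap order that contains the four padded prefixes seen in the steps, the other four entries are made up, and `lookup` mirrors the compare-and-descend procedure of the slides while keeping the longest match seen on the path. It has not been checked against arbitrary overlapping prefix tables.

```python
WIDTH = 8  # assumption: 8-bit addresses as in the slide examples

# Hypothetical heap-ordered tree: (padded prefix, prefix length) per node.
TREE = [
    (0b01111111, 3),                                                     # level 1
    (0b01011111, 5), (0b11011111, 3),                                    # level 2
    (0b01001111, 5), (0b01101111, 4), (0b10111111, 2), (0b11111111, 3),  # level 3
    (0b00011111, 3),                                                     # level 4
]


def matches(ip, padded, length):
    """True when the first `length` bits of `ip` equal those of the padded prefix."""
    shift = WIDTH - length
    return (ip >> shift) == (padded >> shift)


def lookup(tree, ip):
    """Walk from the root: go left when ip <= node value, otherwise go right."""
    best = None
    slot = 0
    while slot < len(tree):
        padded, length = tree[slot]
        if matches(ip, padded, length) and (best is None or length > best[1]):
            best = (padded, length)
        slot = 2 * slot + 1 if ip <= padded else 2 * slot + 2
    return best


if __name__ == "__main__":
    padded, length = lookup(TREE, 0b01001010)
    print(format(padded, "08b"), length)
```

For IP = 01001010 the walk visits 01111111, 01011111, 01001111 and 00011111 in that order and reports 01001111 with length 5.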
16
Outline
Binary-tree-based IP Lookup
  Mapping
  Searching
Architecture
  Cache based
Performance
  Throughput
  Comparison
17
Architecture: Pipelining
Memory of each stage contains the nodes of one binary search tree level.
Dual read/write port.
Content of each entry:
  Padded prefix
  Prefix length
Data forwarded to the next stage:
  IP address
  Memory address
  Information on the longest prefix matched so far
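A software model of the stage-per-level idea might look like the sketch below. It is only a functional model under assumptions: `TREE` is a hypothetical heap-ordered table of (padded prefix, prefix length) entries, `split_levels`, `stage` and `pipeline_lookup` are illustrative names, and the dual-port memories and clocking of the real design are not modeled.

```python
WIDTH = 8  # assumption: 8-bit addresses as in the slide examples

# Hypothetical heap-ordered tree: (padded prefix, prefix length) per node.
TREE = [
    (0b01111111, 3),
    (0b01011111, 5), (0b11011111, 3),
    (0b01001111, 5), (0b01101111, 4), (0b10111111, 2), (0b11111111, 3),
    (0b00011111, 3),
]


def split_levels(tree):
    """Cut a heap-ordered tree into one memory (list) per tree level."""
    levels, start, width = [], 0, 1
    while start < len(tree):
        levels.append(tree[start:start + width])
        start += width
        width *= 2
    return levels


def stage(memory, ip, addr, best):
    """One pipeline stage: read one entry, update the best match, emit next address."""
    if addr is None or addr >= len(memory):
        return None, best                   # no node at this level for this path
    padded, length = memory[addr]
    shift = WIDTH - length
    if (ip >> shift) == (padded >> shift) and (best is None or length > best[1]):
        best = (padded, length)
    next_addr = 2 * addr if ip <= padded else 2 * addr + 1
    return next_addr, best


def pipeline_lookup(levels, ip):
    """Push one IP address through all stages in order."""
    addr, best = 0, None
    for memory in levels:
        addr, best = stage(memory, ip, addr, best)
    return best


if __name__ == "__main__":
    levels = split_levels(TREE)
    print([len(m) for m in levels])
    print(pipeline_lookup(levels, 0b01001010))
```

Each stage only needs the three forwarded items (IP address, memory address, best match so far), which is why a new lookup can enter the first stage while earlier ones are still in flight.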
19
Architecture: Cache based
Cache holds the most recently searched packets.
Update when:
  A route update relates to a cached entry.
  A cache miss occurs.
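The two update conditions can be illustrated with a small software cache. This is a sketch under assumptions, not the hardware design: the least-recently-used eviction policy, the `LookupCache` class and its method names are all hypothetical, and `slow_lookup` stands in for the pipelined tree search.

```python
from collections import OrderedDict


class LookupCache:
    """Small cache of recent lookups in front of a slower lookup function."""

    def __init__(self, capacity, slow_lookup, width=8):
        self.capacity = capacity
        self.slow_lookup = slow_lookup
        self.width = width                  # address width in bits (8 in the slide examples)
        self.entries = OrderedDict()        # ip -> result, least recently searched first

    def lookup(self, ip):
        """Return (result, hit flag)."""
        if ip in self.entries:
            self.entries.move_to_end(ip)    # mark as most recently searched
            return self.entries[ip], True
        result = self.slow_lookup(ip)       # cache miss: search, then update the cache
        self.entries[ip] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently searched entry
        return result, False

    def route_update(self, prefix_bits, length):
        """Drop cached addresses covered by an updated prefix; return how many."""
        shift = self.width - length
        stale = [ip for ip in self.entries if (ip >> shift) == prefix_bits]
        for ip in stale:
            del self.entries[ip]
        return len(stale)


if __name__ == "__main__":
    cache = LookupCache(2, lambda ip: "hop-for-%d" % ip)
    print(cache.lookup(0b01001010))
    print(cache.lookup(0b01001010))
    print(cache.route_update(0b010, 3))
```

Invalidating on a route update keeps a cached answer from outliving the routing entry it was derived from.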
20
Outline
Binary-tree-based IP Lookup
  Mapping
  Searching
Architecture
  Cache based
Performance
  Throughput
  Comparison
21
Performance: Throughput
Without caching:
  324 MLPS (million lookups per second), 100 Gbps
  162 MHz clock
  Minimum packet size of 40 bytes
With 1% of routing entries cached:
  4 packets processed per clock
  => 4 * 324 MLPS = 1.3 GLPS, 416 Gbps
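The throughput figures can be re-derived with a few lines of arithmetic. The sketch assumes that 324 MLPS comes from two lookups per clock at 162 MHz (one per memory port, an inference from the dual-port memories), and it applies the slide's own multiplier, 4 * 324 MLPS, for the cached case. The function name `throughput` is hypothetical.

```python
CLOCK_MHZ = 162
PACKET_BITS = 40 * 8          # minimum packet size of 40 bytes


def throughput(lookups_per_clock, clock_mhz=CLOCK_MHZ, packet_bits=PACKET_BITS):
    """Return (million lookups per second, Gbps at minimum packet size)."""
    mlps = lookups_per_clock * clock_mhz
    gbps = mlps * packet_bits / 1000
    return mlps, gbps


if __name__ == "__main__":
    base_mlps, base_gbps = throughput(2)   # assumption: one lookup per memory port
    print(base_mlps, base_gbps)
    print(4 * base_mlps, 4 * base_gbps)    # the slide's 4 * 324 MLPS for the cached case
```

The exact values are 324 MLPS and 103.68 Gbps without caching, and 1296 MLPS and 414.72 Gbps with caching; the slide quotes these rounded as 100 Gbps, 1.3 GLPS and 416 Gbps.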
22
Performance: Comparison
Architecture                # slices        BRAM   # prefixes   Throughput
Ring architecture           1405 (2.3%)     530    80K          125 MLPS
State of the art on FPGA    14274 (22.7%)   254    80K          263 MLPS
Non-cache-based             2009 (3.2%)     539    228K         324 MLPS
Cache-based                 7982 (12.7%)    539    228K         1.3 GLPS
Non-cache-based with SRAM   1813 (2.9%)     311    2M           324 MLPS
Cache-based with SRAM       7713 (12.3%)    311    2M           1.3 GLPS