Netconf 2005 Robert Olsson Experiments & Experiences with FIB lookup and route cache

Robert Olsson, Experiments & Experiences with FIB lookup ...vger.kernel.org/papers/netconf-RO-2005.pdf

May 25, 2019


Netconf 2005

Robert Olsson

Experiments & Experiences with FIB lookup and route cache


What we hear/got

dst cache overflow reports: RCU related, mistuned, misunderstood, etc.

fib_lookup complaints: what to expect?

BSD comparisons. Radix-tree ToS/semantic questionable

fib_hash considered bad


Getting forward :)

“Infrastructure” for test & development

stats to understand what happens

tools and setups to study

Preroute patches w. Jamal (2004); pktgen DoS; scripts w. routing table

Steady Linux API work to prepare for plugging in new algos. Most from DaveM.

So much research. Still so little usable for Linux.


FIB overview

FIB vs. dst hash performance

fib_hash

fib_hlist

fib_hash2

fib_trie

classifier lookup?

unified lookup?


fib_hash (current)

Fast - Yes

General purpose

Very integrated


fib_hlist

Tutorial

KISS

hlist with semantic_match

Very fast with small tables

For embedded system etc?
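To make the "KISS hlist" idea concrete, here is a minimal sketch in Python rather than kernel C. It assumes the simplest possible layout: one list of routes kept sorted longest prefix first, scanned linearly so the first hit is the longest match. The class name `HlistFib` and the next-hop strings are hypothetical, and the sketch does not model the kernel's semantic_match step.

```python
import ipaddress


class HlistFib:
    """Toy FIB: a single list of routes, sorted longest prefix first."""

    def __init__(self):
        self.routes = []  # list of (network, nexthop)

    def insert(self, prefix, nexthop):
        net = ipaddress.ip_network(prefix)
        self.routes.append((net, nexthop))
        # Keep longest prefixes first so the first match is the best match.
        self.routes.sort(key=lambda r: r[0].prefixlen, reverse=True)

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        # Linear scan: cost grows with table size, cheap for a handful of routes.
        for net, nexthop in self.routes:
            if ip in net:
                return nexthop
        return None


fib = HlistFib()
fib.insert("0.0.0.0/0", "gw-default")
fib.insert("10.0.0.0/8", "gw-a")
fib.insert("10.1.0.0/16", "gw-b")
print(fib.lookup("10.1.2.3"))
```

The linear scan is why such a structure can only be fast with small tables: every lookup may touch every route.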


fib_hlist performance

[Bar chart: fib_hlist vs. fib_hash; y-axis 0-750 (unit not given on the slide); labels: dst cache, /24, rDoS 6 r, rDoS 123kr]

Note! Zero for fib_hlist :) Still decent for many apps.


fib_hash2

Varghese inspired; use what we've got

2^24 hash lookup w. sorted hlist

Makes /24 entries of plens 1-23

/0 special case. Huge... TABLE_LOCAL with a few entries

Idea was to test performance with the fastest algo we could think of.

Not for embedded system etc? :-)

Reduced, it became fib_hlist
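The expansion step described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the fib_hash2 code: a Python dict stands in for the 2^24-slot table, every prefix shorter than /24 is copied into each /24 bucket it covers, each bucket holds a list sorted longest prefix first, and /0 is kept aside as a special case. `Hash24Fib` and the next-hop names are hypothetical.

```python
import ipaddress


class Hash24Fib:
    """Toy FIB keyed on the top 24 address bits, with a sorted list per bucket."""

    def __init__(self):
        self.buckets = {}    # top 24 bits -> [(plen, network, nexthop), ...]
        self.default = None  # /0 is handled as a special case

    def insert(self, prefix, nexthop):
        net = ipaddress.ip_network(prefix)
        if net.prefixlen == 0:
            self.default = nexthop
            return
        first = int(net.network_address) >> 8
        # A prefix of length plen < 24 covers 2^(24 - plen) buckets;
        # /24 and longer prefixes live in exactly one bucket.
        count = 1 << max(0, 24 - net.prefixlen)
        for key in range(first, first + count):
            chain = self.buckets.setdefault(key, [])
            chain.append((net.prefixlen, net, nexthop))
            chain.sort(key=lambda e: e[0], reverse=True)

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        # One hash probe, then a short walk of the sorted chain.
        for _plen, net, nexthop in self.buckets.get(int(ip) >> 8, ()):
            if ip in net:
                return nexthop
        return self.default


fib = Hash24Fib()
fib.insert("0.0.0.0/0", "gw-default")
fib.insert("192.168.0.0/16", "gw-a")     # expands into 256 /24 buckets
fib.insert("192.168.5.0/24", "gw-b")
fib.insert("192.168.5.128/25", "gw-c")
print(len(fib.buckets), fib.lookup("192.168.5.200"))
```

The memory cost is visible even in this toy: a single /16 already occupies 256 buckets, which is the trade the slide makes for lookup speed.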


fib_hash2/route cache compare

[Bar chart: route cache vs. FIB lookup vs. FIB lookup DoS; y-axis 0-750 (unit not given on the slide)]


fib_trie

First trie. In theory variable key length, 32, 128 bits etc

Algo for dynamic trie written in Java. Memory leaks and stack handling were problems.

Also prefix matching based on fib_semantic_match

Cisco CEF has fixed 256 children: 8-8-8-8 or 16-8-8 (GSR)

LC-trie: child size is dynamic, 2-12 bits seen

Needs to be verified. New netlink call to do fib_lookup

Can be improved...
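For contrast with the LC-trie's dynamic child sizes, the fixed 8-8-8-8 layout mentioned above can be sketched as a multibit trie with 256 slots per node. This is a generic textbook-style illustration in Python, assuming IPv4 and prefix expansion inside each node; it is neither Cisco's implementation nor fib_trie, and `Node` and `StrideTrie` are hypothetical names.

```python
import ipaddress

STRIDE = 8  # bits consumed per trie level (the 8-8-8-8 layout)


class Node:
    def __init__(self):
        self.child = [None] * 256  # next-level nodes
        self.entry = [None] * 256  # (plen, nexthop) after prefix expansion


class StrideTrie:
    def __init__(self):
        self.root = Node()
        self.default = None  # /0 kept outside the trie

    def insert(self, prefix, nexthop):
        net = ipaddress.ip_network(prefix)
        plen = net.prefixlen
        if plen == 0:
            self.default = nexthop
            return
        octets = net.network_address.packed
        node, level = self.root, 0
        # Descend through every level that the prefix covers completely.
        while plen > (level + 1) * STRIDE:
            idx = octets[level]
            if node.child[idx] is None:
                node.child[idx] = Node()
            node = node.child[idx]
            level += 1
        # Expand the remaining 1..8 bits over all slots they match.
        bits = plen - level * STRIDE
        base = octets[level]
        for idx in range(base, base + (1 << (STRIDE - bits))):
            old = node.entry[idx]
            if old is None or old[0] <= plen:  # longer prefixes win a slot
                node.entry[idx] = (plen, nexthop)

    def lookup(self, addr):
        node, best = self.root, self.default
        # At most four array indexings for IPv4, one per octet.
        for idx in ipaddress.ip_address(addr).packed:
            if node.entry[idx] is not None:
                best = node.entry[idx][1]
            node = node.child[idx]
            if node is None:
                break
        return best


trie = StrideTrie()
trie.insert("0.0.0.0/0", "gw-default")
trie.insert("10.0.0.0/8", "gw-a")
trie.insert("10.16.0.0/12", "gw-b")
trie.insert("10.16.5.0/24", "gw-c")
print(trie.lookup("10.17.1.1"))
```

A fixed stride bounds the lookup depth but allocates 256 slots for every node, however sparse. Letting the child size vary per node, as the slide describes for the LC-trie, is a way to trade between depth and memory.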


fib_trie performance comparison

[Bar chart: fib_hash vs. fib_trie; y-axis: forwarding kpps, 0-700; categories: dst hash, 5 r single flow, 5 r rDoS, 123kr rDoS]

Linux 2.6.16, 1 CPU used (SMP), Opteron 1.6 GHz, e1000

Preroute patches to disable route hash


LOCAL/MAIN tables

fib_lookup() in ip_fib.h

Always looks up LOCAL table before MAIN

Extra lookup costs performance when not to localhost.

We discussed this with Alexey...
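The cost described above can be illustrated with a small counting sketch. It is written in Python, not taken from ip_fib.h, and assumes only what the slide states: the LOCAL table is consulted first and MAIN second. `Table`, `fib_lookup_sketch`, and the example routes are hypothetical.

```python
import ipaddress


class Table:
    """Toy routing table that counts how often it is consulted."""

    def __init__(self, name, routes):
        self.name = name
        self.routes = sorted(
            ((ipaddress.ip_network(p), nh) for p, nh in routes.items()),
            key=lambda r: r[0].prefixlen,
            reverse=True,
        )
        self.lookups = 0

    def lookup(self, ip):
        self.lookups += 1
        for net, nexthop in self.routes:
            if ip in net:
                return nexthop
        return None


def fib_lookup_sketch(tables, addr):
    """Try each table in order; the first table with a match wins."""
    ip = ipaddress.ip_address(addr)
    for table in tables:
        result = table.lookup(ip)
        if result is not None:
            return table.name, result
    return None


local = Table("LOCAL", {"127.0.0.0/8": "local", "192.0.2.1/32": "local"})
main = Table("MAIN", {"0.0.0.0/0": "gw", "198.51.100.0/24": "eth1"})

forwarded = fib_lookup_sketch([local, main], "198.51.100.7")
counts_after_forward = (local.lookups, main.lookups)
delivered = fib_lookup_sketch([local, main], "192.0.2.1")
counts_after_local = (local.lookups, main.lookups)
print(forwarded, counts_after_forward, delivered, counts_after_local)
```

A forwarded packet pays for a LOCAL miss before reaching MAIN, while a locally delivered packet never touches MAIN. On a router, where most traffic is forwarded, the first lookup is mostly overhead.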


LOCAL/MAIN tables

Aver depth: 4.48
Max depth: 6
Leaves: 25
Internal nodes: 18

Aver depth: 3.22
Max depth: 7
Leaves: 158936
Internal nodes: 39440


Route hash/GC

Strategies for GC run. Better work!!

Timer based vs on demand

/proc/sys/net/ipv4/route/gc_interval

/proc/sys/net/ipv4/route/gc_min_interval_ms

GC without GC run. Very robust...

rt_intern_hash(): rt_free() of cand when chain length is too long.

ip_rt_gc_elasticity can be dynamic.... ????

total flush for fib insert/delete....
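The "GC without GC run" idea can be sketched as a cache that trims a hash chain at insert time once it exceeds an elasticity bound, instead of waiting for a separate collector pass. This Python sketch makes assumptions beyond the slide: entries are moved to the front on a hit and the chain tail is the one freed, which is a simplification and not how the kernel picks its candidate. `RouteCache`, `intern`, and the integer keys are hypothetical.

```python
class RouteCache:
    """Toy route cache: hash chains trimmed on insert, no separate GC pass."""

    def __init__(self, buckets=4, elasticity=2):
        self.chains = [[] for _ in range(buckets)]
        self.elasticity = elasticity  # maximum tolerated chain length
        self.evicted = 0

    def intern(self, key, route):
        chain = self.chains[hash(key) % len(self.chains)]
        for i, (k, _cached) in enumerate(chain):
            if k == key:
                # Cache hit: move the entry to the front of its chain.
                chain.insert(0, chain.pop(i))
                return chain[0][1]
        chain.insert(0, (key, route))
        if len(chain) > self.elasticity:
            # Chain too long: free one entry right away (assumption: the tail).
            chain.pop()
            self.evicted += 1
        return route


cache = RouteCache(buckets=4, elasticity=2)
for key in (0, 4, 8):  # small ints that all hash to bucket 0
    cache.intern(key, f"rt{key}")
print([k for k, _ in cache.chains[0]], cache.evicted)
```

Because the bound is enforced on every insert, chain length stays capped even under a flood of new flows. A dynamic elasticity, as the slide asks about, would amount to adjusting that bound at run time.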


32/64 bit || sizeof(sk_buff)

[Bar chart: sizeof(struct sk_buff), 32 vs. 64 bit; y-axis: size, 0-275]

[Bar chart: relative forwarding T-put, 64 bit vs. 32 bit; y-axis 0-0.55]

Gcc 3.4 x86_64 vs i686 on same HW


Per device hash

Per device input route hash

isolate devs

less locking

same performance

output used shared hash

given up for the moment


Preroute patches

Started hacking with Jamal a year ago

Do full fib_lookup() for every packet

Lots of interest from Paul's and people doing “hi-risk” hosting.

Very useful for FIB testing.

Works only with gatewayed hosts.


Skb recycling/reuse


TCP performance

[Chart: NAPI vs. Non_NAPI; x-axis: 4, 512, 1024, 2048, 4096, 8192, 16384, 32768; y-axis 0-1000 (units not given on the slide)]

2.6.11.7 SMP kernel using one CPU; e1000 driver, NAPI vs. no-NAPI. Opteron 1.6 GHz, e1000 w. 82546GB.


TCP performance when receiving DoS on other NIC

[Chart: NAPI vs. Non_NAPI; x-axis: 4, 512, 1024, 2048, 4096, 8192, 16384, 32768; y-axis 0-850 (units not given on the slide)]

2.6.11.7 SMP kernel using one CPU; e1000 driver, NAPI vs. no-NAPI. Opteron 1.6 GHz, e1000 w. 82546GB.


ipv6 performance

[Bar chart: T-put; y-axis: forwarding kpps, 76 byte pkt., 0-650; categories: Single flow small, Single flow 543 r, rDoS 543 r]

Linux 2.5.12, 1 CPU (SMP), Opteron 1.6 GHz, e1000

How does rDoS work on a sparse routing table?


Goodbye to old friends?

FASTROUTE

HW-FLOWCONTROL


10 GbE early days

[Chart: TX performance IXGB in pps; x-axis: 64, 128, 256, 512, 1024, 1500; y-axis 0-1000000; series: Op 1.6 NAPI, Op 1.6 noNAPI, XEON]


Hi-perf filtering

Need for hi-perf stateless filtering

netfilter API

hi-pac?

tc-stuff?

netfilter API

share fib_semantic_match()