Page 1: Embracing the BSD Routing Table

Embracing the BSD Routing Table

Martin Pieuchot [email protected]

EuroBSDcon, Belgrade

September 2016

Page 2: Embracing the BSD Routing Table

Embracing the BSD Routing Table

How many global data structures do you need?

2 of 20

Page 3: Embracing the BSD Routing Table

Agenda

BSD Routing Table

Refined Interface

New data structures

Conclusion

3 of 20

Page 4: Embracing the BSD Routing Table

Agenda

BSD Routing Table

Refined Interface

New data structures

Conclusion

4 of 20

Page 5: Embracing the BSD Routing Table

Forwarding table
sys/net/radix.c

[Diagram: packet flow. Input → For me? (yes: Deliver; no: Forward?) → Forward? (yes: Select interface) → Output. Additional node: Send.]

Since 4.3 Reno

• replaces hash-based lookup
• PATRICIA trie
• radix tree with r = 2

5 of 20
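
To make "radix tree with r = 2" concrete, here is a minimal sketch of longest-prefix matching over a binary trie. It is deliberately an uncompressed trie, not the path-compressed PATRICIA structure of sys/net/radix.c, and every name in it (`bt_node`, `bt_insert`, `bt_lookup`, the `ifidx` payload) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node of an uncompressed binary (r = 2) trie. */
struct bt_node {
	struct bt_node *child[2];
	int has_route;	/* non-zero if a prefix ends at this node */
	int ifidx;	/* illustrative payload: outgoing interface index */
};

struct bt_node *
bt_new(void)
{
	return calloc(1, sizeof(struct bt_node));
}

/* Insert prefix addr/plen (addr in host byte order). Returns 0 on success. */
int
bt_insert(struct bt_node *root, uint32_t addr, int plen, int ifidx)
{
	struct bt_node *n = root;

	assert(plen >= 0 && plen <= 32);
	for (int i = 0; i < plen; i++) {
		int bit = (addr >> (31 - i)) & 1;

		if (n->child[bit] == NULL &&
		    (n->child[bit] = bt_new()) == NULL)
			return -1;
		n = n->child[bit];
	}
	n->has_route = 1;
	n->ifidx = ifidx;
	return 0;
}

/* Longest-prefix match: ifidx of the most specific route, or -1. */
int
bt_lookup(const struct bt_node *root, uint32_t addr)
{
	const struct bt_node *n = root;
	int best = -1;

	for (int i = 0; n != NULL; i++) {
		if (n->has_route)
			best = n->ifidx;	/* remember the deepest match so far */
		if (i == 32)
			break;
		n = n->child[(addr >> (31 - i)) & 1];
	}
	return best;
}
```

With a default route, 192.168.0/24 and a host route for 192.168.0.1 inserted, a lookup of 192.168.0.42 walks 24 levels and returns the /24 entry. A PATRICIA trie reaches the same answer while skipping the one-child nodes this sketch still visits.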

Page 10: Embracing the BSD Routing Table

Link layer address translation
sys/net/if_ethersubr.c

RTF_CLONING: for each connected route
RTF_CLONED: for every host in the subnet

[Diagram: interface iwm0 with connected route 192.168.0/24 and its host entries]

Host            Link layer address
192.168.0.1     00:05:43:11:3e:26
192.168.0.6     00:bc:24:bd:af:7c
192.168.0.42    link#1

6 of 20
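
The cloning relationship can be sketched as follows: a connected route flagged RTF_CLONING produces one RTF_CLONED host entry the first time a host of the subnet is looked up, and that entry stays unresolved until a link layer address is learned. This is an illustration only. The flag values, the fixed-size host array and the names `connected_route`, `host_entry`, `host_lookup` and `host_resolve` are assumptions, not the kernel's data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RTF_CLONING	0x1	/* illustrative values, not those of net/route.h */
#define RTF_CLONED	0x2

#define MAX_HOSTS	8

struct host_entry {
	uint32_t addr;
	int	 flags;
	int	 resolved;	/* 0 until a link layer address is known */
	uint8_t	 lladdr[6];
};

struct connected_route {
	uint32_t net, mask;
	int	 flags;		/* RTF_CLONING for a connected route */
	struct host_entry hosts[MAX_HOSTS];
	int	 nhosts;
};

/* Return the host entry for addr, cloning one on first use. */
struct host_entry *
host_lookup(struct connected_route *rt, uint32_t addr)
{
	if ((addr & rt->mask) != rt->net)
		return NULL;			/* not in this subnet */
	for (int i = 0; i < rt->nhosts; i++)
		if (rt->hosts[i].addr == addr)
			return &rt->hosts[i];
	if (!(rt->flags & RTF_CLONING) || rt->nhosts == MAX_HOSTS)
		return NULL;

	struct host_entry *h = &rt->hosts[rt->nhosts++];
	memset(h, 0, sizeof(*h));
	h->addr = addr;
	h->flags = RTF_CLONED;
	return h;
}

/* Record a link layer address, e.g. once an ARP reply arrives. */
void
host_resolve(struct host_entry *h, const uint8_t lladdr[6])
{
	assert(h != NULL);
	memcpy(h->lladdr, lladdr, 6);
	h->resolved = 1;
}
```

In this sketch an entry with `resolved == 0` plays the role of the 192.168.0.42 row above, which still shows link#1 instead of a MAC address.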

Page 11: Embracing the BSD Routing Table

Message oriented IPC
sys/net/rtsock.c

Routing messages

• RTM_ADD
• RTM_DELETE
• RTM_CHANGE
• RTM_GET
• None
• RTM_NEWADDR
• RTM_DELADDR
• RTM_IFINFO
• ...

Native speakers

route(8), dhclient(8), bgpd(8), dvmrpd(8), eigrpd(8), ldpd(8), ospfd(8), ospf6d(8), ripd(8), snmpd(8), ...

7 of 20

Page 12: Embracing the BSD Routing Table

Agenda

BSD Routing Table

Refined Interface

New data structures

Conclusion

8 of 20

Page 18: Embracing the BSD Routing Table

Single lookup
sys/netinet/ip_input.c

[Diagram: Input → For me? (yes: Deliver; no: Output)]

Forwarding?
• RTF_LOCAL
• RTF_BROADCAST

Where?
• rt_ifidx

Which Source?
• rt_ifa

Link layer address?
• rt_gateway

9 of 20
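
The idea that one route entry answers all four questions can be sketched like this. The field names follow the slide, but the struct is a simplification: here `rt_ifa` and `rt_gateway` are plain IPv4 addresses, the flag values are illustrative, and `rt_sketch`, `decision` and `classify` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTF_LOCAL	0x1	/* illustrative values, not those of net/route.h */
#define RTF_BROADCAST	0x2

enum verdict { DELIVER, OUTPUT };

struct rt_sketch {		/* hypothetical, loosely modelled on the slide */
	int	     rt_flags;
	unsigned int rt_ifidx;	 /* where: outgoing interface index */
	uint32_t     rt_ifa;	 /* which source: address to send from */
	uint32_t     rt_gateway; /* next hop used for link layer resolution */
};

struct decision {
	enum verdict what;
	unsigned int ifidx;
	uint32_t     src;
	uint32_t     nexthop;
};

/* A single looked-up entry yields the verdict and all output parameters. */
struct decision
classify(const struct rt_sketch *rt)
{
	assert(rt != NULL);

	struct decision d = { OUTPUT, rt->rt_ifidx, rt->rt_ifa, rt->rt_gateway };

	if (rt->rt_flags & (RTF_LOCAL | RTF_BROADCAST))
		d.what = DELIVER;	/* the packet is for this host */
	return d;
}
```

The point of the sketch is that no second table is consulted: the local/broadcast check, the interface, the source address and the next hop all come from the same entry.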

Page 19: Embracing the BSD Routing Table

Gateway route
sys/net/route.c

[Diagram: localhost, wifi, 192.168.0.1, internet, eurobsdcon.org]

$ netstat -rnf inet
Routing tables

Internet:
Destination    Gateway            Flags  Refs  Use  Mtu  Prio  Iface
default        192.168.0.1        UGS      20  420    -     8  iwm0
192.168.0/24   192.168.0.6        UC        2   10    -     4  iwm0
192.168.0.1    00:05:43:11:3e:26  UHLch     1  241    -     4  iwm0
192.168.0.6    00:bc:24:bd:af:7c  UHLl      1    4    -     4  iwm0

10 of 20

Page 21: Embracing the BSD Routing Table

Link layer address of the gateway
sys/net/if_ethersubr.c

Single shared cache

• Proxy reference count
• Immutable pointer
• Flag it RTF_CACHED
• Checks during insertion
• No second route lookup
• No atomic operations

[Diagram labels: default route (rt_ifidx, rt_gwroute), host route 192.168.0.1 (rt_ifidx, 00:05:43:11:3e:26), if_get(9), iwm0]

11 of 20
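
One way to read these bullets is sketched below: when a gateway route is inserted, the host entry of its gateway is resolved once, referenced, flagged and stored in `rt_gwroute`, so the output path only follows a pointer. This is an interpretation of the slide, not the code in sys/net/route.c. The flag values and the names `route`, `route_set_gwroute` and `route_lladdr` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTF_GATEWAY	0x1	/* illustrative values, not those of net/route.h */
#define RTF_CACHED	0x2

struct route {
	uint32_t      dst;
	int	      flags;
	int	      refcnt;
	uint8_t	      lladdr[6];   /* meaningful for host (link layer) entries */
	struct route *rt_gwroute;  /* set once, at insertion, for gateway routes */
};

/* Insertion-time step: attach the gateway's host entry to the gateway route. */
int
route_set_gwroute(struct route *rt, struct route *gwhost)
{
	assert(rt != NULL);
	if (!(rt->flags & RTF_GATEWAY) || gwhost == NULL ||
	    (gwhost->flags & RTF_GATEWAY))
		return -1;		/* checks happen here, not per packet */
	gwhost->refcnt++;		/* one reference held on behalf of rt */
	gwhost->flags |= RTF_CACHED;
	rt->rt_gwroute = gwhost;
	return 0;
}

/* Output path: follow the cached pointer, no second lookup. */
const uint8_t *
route_lladdr(const struct route *rt)
{
	assert(rt != NULL);
	if (rt->flags & RTF_GATEWAY)
		rt = rt->rt_gwroute;
	return rt != NULL ? rt->lladdr : NULL;
}
```

Because the pointer is written once during insertion and never changed afterwards in this sketch, readers on the output path need neither a lock nor an atomic operation to use it.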

Page 31: Embracing the BSD Routing Table

Multipath
sys/net/radix_mpath.c

[Diagram: routing tree with two default entries, 192.168.0/24, and the hosts 192.168.0.101 (00:05:43:11:3e:26) and 192.168.0.102 (0e:ff:4e:17:3f:06)]

• Introduced by KAME
  • for sending/forwarding
• Identical keys in the tree
  • different priority, or
  • different gateway
• Extended to
  • Connected routes
  • ARP proxy entries
  • (Multicast groups)

12 of 20

Page 32: Embracing the BSD Routing Table

Agenda

BSD Routing Table

Refined Interface

New data structures

Conclusion

13 of 20

Page 33: Embracing the BSD Routing Table

Why?
sys/net/radix_mpath.c

/*
 * Stolen from radix.c rn_addroute().
 * This is nasty code with a certain amount of magic and dragons.
 * [...]
 */

14 of 20

Page 37: Embracing the BSD Routing Table

Everything is multipath
sys/net/rtable.c

[Diagram: tree keys 0x0, 0xc0a80000, 0xc0a80065 and 0xc0a80066, each pointing to a list of route entries (labels: default, default, 192.168.0/24, 192.168.0/24, 192.168.0.102)]

• Data structure separation
  • network agnostic
  • value is a pointer
• List of entries
  • value points to a list
  • ordered by priority
  • generic multipath
• MP ready
  • different lifetimes
  • separated refcount
  • no backpointer

15 of 20
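
A minimal sketch of "value points to a list, ordered by priority": the lookup structure stores one opaque value per key, and that value heads a list of route entries, so multipath needs no support from the tree itself. The names `tbl_node`, `rt_entry`, `node_insert`, `node_npaths` and `node_select` are hypothetical, and the rule that a lower priority value is preferred is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rt_entry {			/* hypothetical route entry */
	uint32_t	 gateway;
	int		 priority;	/* assumption: lower value is preferred */
	struct rt_entry	*next;
};

struct tbl_node {			/* value stored under one key */
	struct rt_entry	*head;		/* list in ascending priority order */
};

/* Insert keeping the list ordered; equal priorities keep insertion order. */
void
node_insert(struct tbl_node *n, struct rt_entry *rt)
{
	struct rt_entry **pp = &n->head;

	assert(rt != NULL);
	while (*pp != NULL && (*pp)->priority <= rt->priority)
		pp = &(*pp)->next;
	rt->next = *pp;
	*pp = rt;
}

/* Count the entries sharing the best priority: the multipath candidates. */
int
node_npaths(const struct tbl_node *n)
{
	int count = 0;

	for (const struct rt_entry *rt = n->head;
	    rt != NULL && rt->priority == n->head->priority; rt = rt->next)
		count++;
	return count;
}

/* Pick one of the best entries from a per-flow hash. */
struct rt_entry *
node_select(const struct tbl_node *n, uint32_t hash)
{
	int np = node_npaths(n);

	if (np == 0)
		return NULL;

	struct rt_entry *rt = n->head;
	for (uint32_t i = hash % (uint32_t)np; i > 0; i--)
		rt = rt->next;
	return rt;
}
```

Two entries with the same key and priority but different gateways share traffic in this sketch, while an entry with a better priority inserted later takes over alone. The tree code never needs to know about either case.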

Page 38: Embracing the BSD Routing Table

Allotment Routing Table
sys/net/art.c

[Figure: number of packets received while sending 800 Kpps]

Shared code & knowledge
Beautiful free software story

• Algorithm from Donald Knuth
  • patent free
• C version by Yoichi Hariguchi
  • documented in a paper
  • variable stride length
  • BSD licensed
• Integrated by Martin Pieuchot
• Lock free lookup by Jonathan Matthew & David Gwynne

16 of 20
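
The allotment idea can be sketched with a single table over a toy 4-bit address space. Prefixes are mapped to base indexes in a heap-shaped array, an insertion "allots" the new route to every descendant still inheriting the old one, and a lookup is one array access at the fringe. This is a simplified single-level illustration, not the multi-level, variable-stride code of sys/net/art.c. The names `art_table`, `art_route`, `art_baseidx`, `art_allot`, `art_insert` and `art_lookup` and the stride of 4 are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SL	4		/* stride length in bits (toy size) */
#define FRINGE	(1u << SL)	/* first fringe index */

struct art_route {
	unsigned int base;	/* base index of the prefix in the heap */
	int	     ifidx;	/* illustrative payload */
};

struct art_table {
	struct art_route *heap[2 * FRINGE];	/* 0 unused, 1 is the default */
};

/* Base index of prefix addr/plen, with addr an SL-bit value. */
unsigned int
art_baseidx(unsigned int addr, unsigned int plen)
{
	assert(plen <= SL);
	return ((addr & (FRINGE - 1)) >> (SL - plen)) + (1u << plen);
}

/* Replace 'old' by 'nrt' at b and in every descendant still inheriting 'old'. */
static void
art_allot(struct art_table *t, unsigned int b, struct art_route *old,
    struct art_route *nrt)
{
	if (t->heap[b] != old)
		return;		/* a more specific route owns this subtree */
	t->heap[b] = nrt;
	if (b >= FRINGE)
		return;
	art_allot(t, 2 * b, old, nrt);
	art_allot(t, 2 * b + 1, old, nrt);
}

/* Insert a route for addr/plen; -1 if that exact prefix already exists. */
int
art_insert(struct art_table *t, struct art_route *rt, unsigned int addr,
    unsigned int plen)
{
	unsigned int b = art_baseidx(addr, plen);

	if (t->heap[b] != NULL && t->heap[b]->base == b)
		return -1;
	rt->base = b;
	art_allot(t, b, t->heap[b], rt);
	return 0;
}

/* Longest-prefix match is a single array access at the fringe. */
struct art_route *
art_lookup(const struct art_table *t, unsigned int addr)
{
	return t->heap[FRINGE + (addr & (FRINGE - 1))];
}
```

The cost moves from lookup to insertion: inserting the default route touches every slot, while each lookup stays a constant-time index computation whatever the number of routes.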

Page 43: Embracing the BSD Routing Table

Agenda

BSD Routing Table

Refined Interface

New data structures

Conclusion

17 of 20

Page 44: Embracing the BSD Routing Table

Conclusion
sys/net/rtable.c

• Routing table as single global data structure
  • Used for forwarding, sending and receiving
  • Consulted once per packet
  • Lock free lookup
• No secondary lookup for link layer address translation
• No atomic primitive to get the gateway link layer address
• Generic, multi-use multipath implementation
• Faster route lookup via ART
• Interface didn’t change

18 of 20

Page 53: Embracing the BSD Routing Table

Questions?

Slides on http://www.openbsd.org/papers/

More stories on http://www.grenadille.net

19 of 20

Page 54: Embracing the BSD Routing Table

Coming soon!
sys/net/pf.c

[Diagram: packet path with nodes Input, pf(4), Output, Deliver]

20 of 20