Takuya ASADA @syuu1228

Implementing a layer 2 framework on linux network

Dec 04, 2014

Page 2: Implementing a layer 2 framework on linux network

I worked at an embedded software company on SMP support for router firmware

Ph.D. student at Tokyo University of Technology, researching improvements to network I/O architecture on modern x86 servers

Interested in: SMP, Network, Virtualization

GSoC ’11(FreeBSD) Multithread support for BPF

GSoC ’12(FreeBSD) BIOS support for BHyVe

Research assistant at IIJ research laboratory, implementing BCube for Linux

Today’s topic!

Page 3: Implementing a layer 2 framework on linux network

BCube is a new network architecture

Designed for shipping-container based modular data centers

Server-centric network structure
◦ Servers act as both end hosts and relay nodes for each other

The paper was published at ACM SIGCOMM ’09 by Microsoft Research Asia

Page 4: Implementing a layer 2 framework on linux network

Each server has one connection to each layer

Switches never connect to other switches

Servers relay traffic for each other

[Figure: example BCube topology with servers 000 to 111 and their switches, grouped into BCube0, BCube1, and BCube2. Legend: switch, server.]

Page 5: Implementing a layer 2 framework on linux network

BCube_k has k + 1 layers

BCube_x contains n BCube_(x-1)

BCube_0 contains n servers

Total servers = n^(k+1)
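The counting rule above can be checked with a small sketch. The function names are hypothetical, and writing each address as a string of k + 1 decimal digits (so n is at most 10) is an assumption made to match the three-digit labels used in the example topology.

```python
from itertools import product


def server_count(n: int, k: int) -> int:
    """Number of servers in a BCube_k built from n-port switches."""
    return n ** (k + 1)


def server_addresses(n: int, k: int) -> list[str]:
    """All BCube addresses: k + 1 digits, each digit in the range 0..n-1."""
    if not 1 <= n <= 10:
        raise ValueError("this sketch writes one decimal digit per layer")
    return ["".join(str(d) for d in digits)
            for digits in product(range(n), repeat=k + 1)]


if __name__ == "__main__":
    print(server_count(2, 2))       # 8
    print(server_addresses(2, 2))   # ['000', '001', ..., '111']
```

With n = 2 and k = 2 this yields the eight servers 000 to 111 of the example.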

[Figure: example BCube topology with servers 000 to 111 and their switches, grouped into BCube0, BCube1, and BCube2. Legend: switch, server.]

Page 6: Implementing a layer 2 framework on linux network

High network capacity for various traffic patterns
◦ one-to-one
◦ one-to-all
◦ one-to-several
◦ all-to-all

Performance degrades gracefully as server/switch failures increase

Needs no special hardware; uses only commodity switches

Page 7: Implementing a layer 2 framework on linux network

Each server has a unique BCube address

Each digit is the port number the server uses on its switch in the corresponding layer
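A rough sketch of how the digits can be read. It assumes the rightmost digit belongs to layer 0 and the leftmost to the top layer, and that two servers share a switch in a layer exactly when their addresses differ only in that layer's digit. The function names are hypothetical.

```python
def switch_port(addr: str, layer: int) -> int:
    """Port number this server occupies on its switch in the given layer."""
    pos = len(addr) - 1 - layer  # assumption: layer 0 is the rightmost digit
    return int(addr[pos])


def layer_neighbors(addr: str, layer: int, n: int) -> list[str]:
    """Servers reachable through the same switch in the given layer."""
    pos = len(addr) - 1 - layer
    return [addr[:pos] + str(d) + addr[pos + 1:]
            for d in range(n) if str(d) != addr[pos]]


if __name__ == "__main__":
    print(switch_port("101", 2))          # 1
    print(layer_neighbors("000", 2, 2))   # ['100']
    print(layer_neighbors("000", 0, 2))   # ['001']
```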

[Figure: example BCube topology with servers 000 to 111 and their switches, grouped into BCube0, BCube1, and BCube2. Legend: switch, server.]

Page 8: Implementing a layer 2 framework on linux network

Default routing rule
◦ Top layer → Bottom layer
◦ Ex: Route from 000 to 111: 000 → 100 → 110 → 111
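The default rule can be sketched as correcting one address digit per hop, starting from the top layer. This is a minimal illustration, not the bcube.ko code; the function name is hypothetical, and the leftmost digit is taken to be the top layer, as the example route suggests.

```python
def default_route(src: str, dst: str) -> list[str]:
    """Hops from src to dst, fixing digits from the top layer downwards."""
    if len(src) != len(dst):
        raise ValueError("addresses must have the same number of digits")
    path = [src]
    current = list(src)
    for pos in range(len(src)):          # leftmost digit = top layer
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]      # one hop through that layer's switch
            path.append("".join(current))
    return path


if __name__ == "__main__":
    print(" -> ".join(default_route("000", "111")))  # 000 -> 100 -> 110 -> 111
```

Layers whose digits already match are skipped, so a route has at most k + 1 hops.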

[Figure: example BCube topology with servers 000 to 111 and their switches, grouped into BCube0, BCube1, and BCube2.]

Page 9: Implementing a layer 2 framework on linux network

There are alternate routes between any pair of nodes

Can bypass failed servers and switches

Can also increase throughput by sending traffic over parallel routes
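One simple way to see where alternate routes come from is to correct the address digits in a different order. The sketch below rotates the starting layer. It is an illustration under that assumption only: it does not attempt detours through layers whose digits already match, and the function names are hypothetical.

```python
def route_in_order(src: str, dst: str, positions: list[int]) -> list[str]:
    """Correct the differing digits in the given order of digit positions."""
    path = [src]
    current = list(src)
    for pos in positions:
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]
            path.append("".join(current))
    return path


def rotated_routes(src: str, dst: str) -> list[list[str]]:
    """One candidate route per starting layer, duplicates removed."""
    width = len(src)
    routes: list[list[str]] = []
    for start in range(width):
        order = [(start + i) % width for i in range(width)]
        route = route_in_order(src, dst, order)
        if route not in routes:
            routes.append(route)
    return routes


if __name__ == "__main__":
    for route in rotated_routes("000", "111"):
        print(" -> ".join(route))
```

For 000 to 111 the three routes pass through different intermediate servers, so a flow could avoid a failed relay or be spread across several routes.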

[Figure: example BCube topology with servers 000 to 111 and their switches, grouped into BCube0, BCube1, and BCube2.]

Page 10: Implementing a layer 2 framework on linux network

The source server decides the best path for each flow

Bypasses failed paths

To propagate the chosen path, the source server writes the routing path information into the packet header

Page 11: Implementing a layer 2 framework on linux network

Add a BCube header between the Ethernet header and the IP header

It carries the source and destination addresses, plus routing path information in the “Next Hop Index Array”

Packet layout
◦ Ethernet Header
◦ BCube Header
◦ IP Header

BCube Header fields
◦ BCube dest address
◦ BCube source address
◦ Protocol type
◦ Next Hop Index Array

Page 12: Implementing a layer 2 framework on linux network

Evaluating various "Data Center Network" technologies, especially for container-based modular data center architectures. BCube is one of the candidates.

Page 13: Implementing a layer 2 framework on linux network

Try to use existing code as much as possible

Minimum implementation at first

BCube binds multiple interfaces and assigns them a BCube address and an IP address

What is the most similar function that already exists on Linux? → Bridge!
◦ Forked bridge.ko and the brctl command, and named them bcube.ko and bcctl

Page 14: Implementing a layer 2 framework on linux network

brctl addbr <bridge>
brctl delbr <bridge>
↓
bcctl addbc <bcube> <bcaddr> <N> <K>
bcctl delbc <bcube>

Modified addbr/delbr; addbc takes 3 additional arguments
◦ BCube address
◦ n and k parameters

Use the MAC address format/size for the BCube address

Use the BCube address as the HW address of the BCube device
◦ It works like a fake MAC address in the Linux network stack

101 → 00:00:00:01:00:01
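A sketch of the address mapping, assuming one BCube digit per octet, right-aligned in a six-octet MAC-style address. That layout is a guess based on the example above, not taken from the bcube.ko source, and the function name is hypothetical.

```python
def bcube_to_hwaddr(bcaddr: str) -> str:
    """Render a BCube address in MAC address notation (assumed layout)."""
    if not (bcaddr.isascii() and bcaddr.isdigit()) or len(bcaddr) > 6:
        raise ValueError("expected at most six decimal digits")
    # Assumption: pad with zero octets on the left, one digit per octet.
    octets = [0] * (6 - len(bcaddr)) + [int(d) for d in bcaddr]
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    print(bcube_to_hwaddr("101"))  # 00:00:00:01:00:01
```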

Page 15: Implementing a layer 2 framework on linux network

brctl addif <bridge> <device>
brctl delif <bridge> <device>
↓
bcctl assignif <bcube> <layer> <device>
bcctl unassignif <bcube> <layer> <device>

Modified addif/delif into assignif/unassignif, adding the layer number as an argument

Page 16: Implementing a layer 2 framework on linux network

Need to reconsider address resolution

Normal Ethernet
◦ IP Address → MAC Address (ARP)

BCube network
◦ IP Address → BCube Address → ARP?
◦ (Neighbor) BCube address → MAC Address → Need an additional neighbor discovery protocol

Page 17: Implementing a layer 2 framework on linux network

Once broadcast works in the BCube implementation, ARP should work over it

But I haven’t implemented it yet, so I decided to configure it manually with the following command:
arp -i bc0 -s 10.0.0.6 00:00:00:01:00:10

Page 18: Implementing a layer 2 framework on linux network

Need an ARP-like protocol

Decided to configure this manually too, and implemented the following commands:
bcctl addneighbour <bcube> <layer> <bcaddr> <macaddr>
bcctl delneighbour <bcube> <layer> <bcaddr>

bcube.ko maintains a neighbor table and uses it when transmitting and forwarding packets
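Conceptually, the neighbor table maps a (layer, BCube address) pair to the neighbor's real MAC address. The following user-space sketch mirrors the addneighbour/delneighbour commands. The class and method names are hypothetical, the MAC address in the usage example is made up, and the kernel module's actual data structure is not shown in these slides.

```python
from typing import Optional


class NeighborTable:
    """Maps (layer, BCube address) to the neighbor's MAC address."""

    def __init__(self) -> None:
        self._entries = {}  # key: (layer, bcaddr), value: macaddr

    def add(self, layer: int, bcaddr: str, macaddr: str) -> None:
        """Counterpart of: bcctl addneighbour <bcube> <layer> <bcaddr> <macaddr>"""
        self._entries[(layer, bcaddr)] = macaddr.lower()

    def delete(self, layer: int, bcaddr: str) -> None:
        """Counterpart of: bcctl delneighbour <bcube> <layer> <bcaddr>"""
        self._entries.pop((layer, bcaddr), None)

    def lookup(self, layer: int, bcaddr: str) -> Optional[str]:
        """MAC address to put in the Ethernet header, or None if unknown."""
        return self._entries.get((layer, bcaddr))


if __name__ == "__main__":
    table = NeighborTable()
    table.add(2, "100", "02:00:00:00:00:04")  # made-up MAC address
    print(table.lookup(2, "100"))
    table.delete(2, "100")
    print(table.lookup(2, "100"))
```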

Page 19: Implementing a layer 2 framework on linux network

bridge.ko maintains an FDB (forwarding database), a hash table used to look up the output port for a destination MAC address

Deleted the FDB and implemented a function that decides the next-hop BCube address, the output port, and the MAC address of the next hop

Haven’t implemented source routing – just default routing for now
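A user-space sketch of such a next-hop decision under default routing. It assumes the leftmost differing digit selects the layer, that the layer number identifies the output port (the device assigned to that layer with assignif), and that the neighbor table is a plain dictionary keyed by (layer, BCube address). All names and the MAC address are hypothetical, and this is not the bcube.ko code.

```python
from typing import Optional


def next_hop(self_addr: str, dst_addr: str, neighbors: dict
             ) -> Optional[tuple]:
    """Return (next-hop BCube address, layer, next-hop MAC or None).

    Returns None when the packet is already at its destination.
    """
    width = len(self_addr)
    if len(dst_addr) != width:
        raise ValueError("addresses must have the same number of digits")
    for pos in range(width):                  # leftmost digit = top layer
        if self_addr[pos] != dst_addr[pos]:
            layer = width - 1 - pos           # selects the output port
            hop = self_addr[:pos] + dst_addr[pos] + self_addr[pos + 1:]
            return hop, layer, neighbors.get((layer, hop))
    return None


if __name__ == "__main__":
    table = {(2, "100"): "02:00:00:00:00:04"}  # made-up neighbor entry
    print(next_hop("000", "111", table))
```

Each relay server can run the same decision on the destination address, which reproduces the route 000 → 100 → 110 → 111 hop by hop.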

Page 20: Implementing a layer 2 framework on linux network

Top layer → Bottom layer

Ex: Route from 000 to 111: 000 → 100 → 110 → 111

[Figure: example BCube topology with servers 000 to 111 and their switches, grouped into BCube0, BCube1, and BCube2.]

Page 21: Implementing a layer 2 framework on linux network

To add the BCube header between the Ethernet header and the IP header, I forked net/ethernet/eth.c

ETH_HLEN (14 bytes) → BCUBE_HLEN (24 bytes)

struct ethhdr (MAC header) → struct bcubehdr (MAC & BCube header)

eth_header_ops → bc_header_ops, to handle the BCube header

Unfortunately, GRO accesses the Ethernet header directly and runs before BCube handles a packet, so it needs to be disabled

Page 22: Implementing a layer 2 framework on linux network

Found a way to implement a new L2 framework using the existing bridge implementation
◦ Much easier than implementing it from scratch

Development Status
◦ Implemented basic features, debugging now
◦ Will consider adding more features:
  broadcast / multicast
  intermediate node/switch failure detection and route changes
  source routing
  address resolution protocol

Planning a more detailed evaluation in our data center testbed

Any comments and suggestions are welcome

Page 23: Implementing a layer 2 framework on linux network

This work was done as part of research assistance work at IIJ research laboratory.