Transcript

Page 1:

Network Virtualization in Multi-tenant Datacenters

Author: VMware, UC Berkeley and ICSI

Publisher: 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14)

Presenter: Yi-Tsung Huang

Date: 2015/09/30

Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C.

Page 2:

Introduction

Multi-tenant datacenter (MTD)
• Ideally, the networking layer would support similar properties as the compute layer, in which arbitrary network topologies and addressing architectures could be overlaid onto the same physical network.

Network virtualization
• Allows the creation of virtual networks, each with independent service models, topologies, and addressing architectures, over the same physical network.

Page 3:

Introduction

In this paper we present NVP, a network virtualization platform that has been deployed in dozens of production environments over the last few years and has hosted tens of thousands of virtual networks and virtual machines.

Page 4:

System Design-Abstraction

The network hypervisor is a software layer interposed between the provider’s physical forwarding infrastructure and the tenant control planes.

Page 5:

System Design-Abstraction

The control abstraction must allow tenants to define a set of logical network elements (logical datapaths) that they can configure as they would physical network elements.

Page 6:

System Design-Abstraction

The packet abstraction must enable packets sent by endpoints in the MTD to be given the same switching, routing, and filtering service they would have received in the tenant’s home network.

Page 7:

System Design-Virtualization Architecture

In our NVP design, we implement the logical datapaths in the software virtual switches on each host, leveraging a set of tunnels between every pair of host-hypervisors.
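As a rough illustration of this edge-based design (a minimal sketch, not NVP code; the placement table and helper names below are invented), the sending host’s virtual switch makes the logical forwarding decision and then tunnels the packet straight to the host running the destination VM:

```python
# Minimal sketch, not NVP code: vm_to_host and forward() are invented for
# illustration. The sending host's virtual switch applies the logical
# datapath, then tunnels the packet directly to the destination host.

vm_to_host = {
    "vm-a": "host-1",
    "vm-b": "host-2",
    "vm-c": "host-3",
}

def forward(src_vm, dst_vm):
    """Apply the tenant's logical datapath at the source host, then pick
    the point-to-point tunnel toward the destination host."""
    src_host = vm_to_host[src_vm]
    dst_host = vm_to_host[dst_vm]
    if src_host == dst_host:
        return f"deliver locally on {src_host}"        # no tunnel needed
    return f"encapsulate on {src_host}, send over tunnel {src_host}->{dst_host}"

print(forward("vm-a", "vm-b"))

hosts = set(vm_to_host.values())
n = len(hosts)
# With n host-hypervisors, a full mesh needs n*(n-1)/2 point-to-point tunnels.
print(n * (n - 1) // 2, "tunnels for", n, "hosts")
```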

Page 8:

System Design-Virtualization Architecture

For packet replication, NVP constructs a simple multicast overlay using additional physical forwarding elements called service nodes.

Some tenants want to interconnect their logical network with their existing physical one. This is done via gateway appliances.
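The service-node replication idea can be sketched as follows (assumed behavior for illustration, not NVP code): the source hypervisor sends a single copy to a service node, which fans the packet out to the other hypervisors on that logical network.

```python
# Assumed behavior, sketched for illustration (not NVP code): the source
# hypervisor sends one copy of a broadcast/multicast packet to a service
# node, which replicates it to the other hypervisors on that logical network.

def replicate_via_service_node(src_host, member_hosts, packet):
    """Return the per-tunnel copies needed when a service node does the fan-out."""
    copies = [(src_host, "service-node", packet)]          # one copy uplink
    for host in member_hosts:
        if host != src_host:
            copies.append(("service-node", host, packet))  # fan-out downlinks
    return copies

for hop in replicate_via_service_node("host-1",
                                       ["host-1", "host-2", "host-3"],
                                       "ARP broadcast"):
    print(hop)
```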

Page 9:

System Design-Virtualization Architecture

Page 10:

System Design-Design Challenges

• Datapath design and acceleration
• Declarative programming
• Scaling the computation

Page 11:

Virtualization Support at the Edge-Implementing the Logical Datapath

NVP uses Open vSwitch (OVS) in all transport nodes to forward packets.

OVS is remotely configurable by the NVP controller cluster via two protocols:
• OpenFlow
• the OVSDB management protocol

Each logical datapath consists of a series (pipeline) of logical flow tables, each with its own globally-unique identifier.
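The pipeline can be pictured as a chain of match-action tables evaluated inside one virtual switch. The sketch below is illustrative Python, not OVS flow syntax; the table and field names are invented.

```python
# Illustrative Python, not OVS flow syntax: a logical datapath modeled as a
# pipeline of flow tables. Each table matches on packet metadata and either
# jumps to the next table or returns a final action.

pipeline = {
    "acl-table": lambda pkt: ("drop", None) if pkt.get("blocked") else ("goto", "l2-table"),
    "l2-table":  lambda pkt: ("output", pkt["dst_logical_port"]),
}

def process(pkt, table="acl-table"):
    while True:
        verdict, arg = pipeline[table](pkt)
        if verdict == "goto":
            table = arg             # continue down the logical pipeline
        else:
            return verdict, arg     # e.g. ("output", <logical egress port>)

print(process({"dst_logical_port": "lport-7", "blocked": False}))
```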

Page 12:

Virtualization Support at the Edge-Implementing the Logical Datapath

Page 13:

Virtualization Support at the Edge-Forwarding Performance

To achieve efficient flow lookups on x86, OVS exploits traffic locality.
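One common way to exploit traffic locality is a flow cache: the first packet of a flow traverses the full pipeline and the result is cached under an exact-match key. The sketch below shows the idea only; the real OVS datapath cache is considerably more sophisticated.

```python
# Flow-cache sketch only; not the real OVS datapath. The first packet of a
# flow takes the slow path through the full pipeline; its result is cached
# under an exact-match key so later packets need a single hash lookup.

flow_cache = {}

def slow_path(key):
    # Stand-in for a full traversal of the logical pipeline.
    return ("output", "tunnel-to-host-2")

def fast_path(pkt):
    key = (pkt["src_ip"], pkt["dst_ip"], pkt["proto"], pkt["src_port"], pkt["dst_port"])
    if key not in flow_cache:              # miss: consult the pipeline once
        flow_cache[key] = slow_path(key)
    return flow_cache[key]                 # hit: one dictionary lookup

pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
       "proto": 6, "src_port": 1234, "dst_port": 80}
print(fast_path(pkt))   # first packet populates the cache
print(fast_path(pkt))   # subsequent packets hit the cache
```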

To re-enable hardware offloading for encapsulated traffic with existing NICs, NVP uses an encapsulation method called STT.
• STT places a standard, but fake, TCP header after the physical IP header.

Page 14:

Virtualization Support at the Edge-Fast Failovers

NVP deployments have multiple service nodes to ensure that any one service node failure does not disrupt logical broadcast and multicast traffic.

NVP deployments typically involve multiple gateway nodes for each bridged physical network.

NVP must ensure that no loops between the logical and physical networks are possible.

Page 15:

Forwarding State Computation

Page 16:

Forwarding State Computation


We implemented a domain-specific, declarative language called nlog for computing the network forwarding state.

The logic is written in a declarative manner that specifies a function mapping the controller input to output.

Page 17:

Forwarding State Computation

nlog declarations are Datalog queries: a single declaration is a join over a number of tables that produces immutable tuples for a head table.
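The sketch below mimics one such declaration in plain Python (table names and columns are invented, and this is not nlog syntax): a join over two input tables yields the immutable tuples of a head table.

```python
# Not nlog syntax; tables and columns are invented. The rule
#   head(port, switch, host) :- port_on_switch(port, switch),
#                               port_on_host(port, host).
# is a join over two input tables producing immutable tuples for a head table.

port_on_switch = [("lp1", "lsw1"), ("lp2", "lsw1")]      # (logical port, logical switch)
port_on_host   = [("lp1", "host-1"), ("lp2", "host-2")]  # (logical port, hypervisor)

head = {
    (port, switch, host)
    for (port, switch) in port_on_switch
    for (p, host) in port_on_host
    if p == port
}
print(sorted(head))
# [('lp1', 'lsw1', 'host-1'), ('lp2', 'lsw1', 'host-2')]
```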

Page 18:

Controller Cluster-Scaling and Availability of Computation

Two-layer distributed controller
• Logical controllers: compute flows and tunnels for logical datapaths
• Physical controllers: communicate with hypervisors, gateways, and service nodes

Page 19:

Controller Cluster-Scaling and Availability of Computation

To provide failover within the cluster, NVP provisions hot standbys at both the logical and physical controller layers by exploiting the sharding mechanism.

One controller, acting as a sharding coordinator, ensures that every shard is assigned one master controller and one or more other controllers acting as hot standbys.
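A minimal sketch of how such an assignment could look (an assumed scheme for illustration, not NVP’s actual coordination logic):

```python
# Assumed scheme, for illustration only: the sharding coordinator gives every
# shard one master controller and at least one hot standby; on master failure
# a standby is promoted.

from itertools import cycle

def assign_shards(shards, controllers, standbys_per_shard=1):
    ring = cycle(controllers)
    assignment = {}
    for shard in shards:
        master = next(ring)
        standbys = [c for c in controllers if c != master][:standbys_per_shard]
        assignment[shard] = {"master": master, "standbys": standbys}
    return assignment

def fail_over(assignment, shard):
    roles = assignment[shard]
    roles["master"] = roles["standbys"].pop(0)   # promote a hot standby
    return roles["master"]

table = assign_shards(["shard-0", "shard-1", "shard-2"], ["ctl-a", "ctl-b", "ctl-c"])
print(table["shard-0"])              # {'master': 'ctl-a', 'standbys': ['ctl-b']}
print(fail_over(table, "shard-0"))   # 'ctl-b' takes over
```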

Page 20:

Controller Cluster-Distributed Services

NVP is built on the Onix controller platform and thus has access to the elementary distributed services Onix provides.

Page 21:

Controller Cluster-Distributed Services

Leader election
• Each controller must know which shard it manages, and must also know when to take over responsibility for slices managed by a controller that has disconnected.

Label allocation
• A network packet encapsulated in a tunnel must carry a label that denotes the logical egress port to which the packet is destined, so the receiving hypervisor can process it properly (sketched below).
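The bookkeeping this implies is roughly as follows (identifiers are invented for illustration): the sender writes an allocated label into the tunnel header, and the receiver maps it back to a logical egress port.

```python
# Invented identifiers, shown only to make the bookkeeping concrete: the
# sender places an allocated label in the tunnel header; the receiving
# hypervisor maps it back to the logical egress port (and hence the vNIC).

label_to_logical_port = {}   # allocated cluster-wide so labels are unambiguous
next_label = 1

def allocate_label(logical_port):
    global next_label
    label = next_label
    next_label += 1
    label_to_logical_port[label] = logical_port
    return label

lbl = allocate_label("lport-7")        # sender: place lbl in the tunnel header
print(label_to_logical_port[lbl])      # receiver: label -> logical egress port
```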

Page 22:

Controller Cluster-API for Service Providers

To support integrating with a service provider’s existing cloud management system, NVP exposes an HTTP-based REST API in which network elements, physical or logical, are presented as objects.

Page 23:

Evaluation-Controller Cluster

The configuration in the following tests has 3,000 simulated hypervisors, each with 21 vNICs. There are 7,000 logical datapaths, each coupled with a logical control plane modeling a logical switch.

The test control cluster has three nodes. Each controller is a bare-metal Intel Xeon 2.4 GHz server with 12 cores, 96 GB of memory, and a 400 GB hard disk.
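As a quick sanity check of the scale described above (derived from the numbers on this slide, not additional data):

```python
# Derived from the numbers on this slide only.
hypervisors = 3000
vnics_per_hypervisor = 21
logical_datapaths = 7000

total_logical_ports = hypervisors * vnics_per_hypervisor
print(total_logical_ports)                        # 63000 simulated vNICs
print(total_logical_ports / logical_datapaths)    # 9.0 ports per logical switch on average
```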

Page 24:

Evaluation-Controller Cluster

Cold start
• The test simulates bringing the entire system back online after a major datacenter disaster in which all servers crash and all volatile memory is lost.

Restore
• The test simulates a milder scenario where the whole control cluster crashes and loses all volatile state, but the dataplane remains intact.

Page 25:

Evaluation-Controller Cluster

Failover
• The test simulates the failure of a single controller within a cluster.

Steady state
• We start with a converged idle system. We then add 10 logical ports to existing switches through API calls, wait for connectivity correctness on these new ports, and then delete them.

Page 26:

Evaluation-Controller Cluster

Page 27:

Evaluation-Controller Cluster

Page 28:

Evaluation-Controller Cluster

Page 29:

Evaluation-Controller Cluster

Page 30:

Evaluation-Transport Nodes
