Tricircle in a nutshell (use cases and technologies)
[email protected] edited: May, 2016
Content will be constantly updated at https://docs.google.com/presentation/d/1UQWeAMIJgJsWw-cyz9R7NvcAuSWUnKvaZFXLfRAQ6fI/
● Use cases
  ○ Massive distributed edge cloud
  ○ Large scale cloud
  ○ Hybrid cloud
● Tricircle in a nutshell
The uplink challenge in centralized cloud
Long path and limited uplink bandwidth make for a poor uplink experience in a centralized cloud; the downlink is somewhat better thanks to CDNs and other re-distribution services
Massive distributed small edge clouds: VNF* / App / Storage placement close to end user for better user experience
Thousands of edge data center clouds place VNF/App/Storage close to the end user for a better user experience: for example, video can be processed as soon as it is uploaded/streamed to the nearby cloud by the user, and networking capabilities can be better personalized
VNF: virtualized network function, a telecom application running in the cloud
Massive distributed small edge clouds: bandwidth-sensitive, heavy-load apps like CAD modeling and video editing/uplink streaming call for a cloud close to the end user
Enterprises ask for a cloud close to the end user in production, because heavy-load applications like CAD modeling and video editing are very bandwidth sensitive. An enterprise often has multiple branches in different locations that need to collaborate, for example on video editing across branches, so the edge clouds serving the enterprise also need to be distributed
VNF / App / Storage movement/distribution on demand among edge clouds for better user experience
VNF / App / Storage can be moved/distributed on demand among edge clouds for the best personalized user experience in computation/storage/networking.
Distributed VNF/APP for better reliability, availability and user experience
VNF / App / Storage are distributed into multiple edge data centers for better reliability, availability and user experience.
vEPC distributed into multiple DCs
Service function chaining across sites for flexible service logic
Flexible service logic by dynamically chaining apps across data centers
Why not just put one OpenStack in each site for distributed edge cloud?
Tenant-level L2/L3 networking and its automation for tenant E-W traffic isolation. If a tenant has resources distributed across multiple sites, for example a company's branches, both inter-connection and isolation are needed
Can we make one OpenStack distributed across multiple sites? The question is why we would want to do this, and across how many sites
The benefits of using one OpenStack to manage multiple sites:
● Cross-site L2/L3 networking automation, including tenant-space IP/MAC management and security group management
● Global resource view and multi-site quota control
Challenges in using one OpenStack instance to manage massive multi-sites:
● N*N inter-site access (API, DB, message bus, scheduler)
● If the link to one site is broken, the site is not manageable: no CLI/API is available
● The key is how to localize access
Massive distributed edge clouds bring new requirements
● L2/L3 networking across OpenStack instances in different data centers
● VNF cross-site service chaining
● Volume/VM/object storage migration/distribution
● Distributed image management
● Distributed quota management
● ...
● Use cases
  ○ Massive distributed edge cloud
  ○ Large scale cloud
  ○ Hybrid cloud
● Tricircle in a nutshell
Amazon Region, AZ (1)
http://www.slideshare.net/AmazonWebServices/spot301-aws-innovation-at-scale-aws-reinvent-2014
Amazon AZ capacity
http://www.slideshare.net/AmazonWebServices/spot301-aws-innovation-at-scale-aws-reinvent-2014
Challenge in capacity expansion for one OpenStack in public cloud
Sizing is a headache in capacity expansion for a production public cloud: you have to estimate, calculate, monitor, simulate, test, and do online grey expansion for controller nodes and network nodes whenever you add new machines to the cloud. That is too much work when expanding to 50,000 compute nodes.
Number of Nova API servers... number of Cinder API servers... number of Neutron API servers... number of schedulers... number of conductors... specification of physical servers... specification of physical switches... size of storage for images... size of management-plane bandwidth... size of data-plane bandwidth... reservation of rack space... reservation of networking slots... ...
1000 (compute nodes) -> 2000 -> 50000
Capacity expansion should be controllable, modularized in public cloud
You can't test every size, and sometimes you don't even have enough resources to test. But you can add an already tested and verified building block for capacity expansion (experience from a production public cloud)
1000 (compute nodes) -> 2000 -> 50000
Capacity expansion should be controllable, modularized in public cloud
Don't break user expectations when adding a new building block for capacity expansion. Most of those expectations are about networking...
After capacity expansion, the new building block is added to the same AZ. The end user should not be aware of the expansion:
1. Creating VM6~VM7 in Net1@AZ1 with the SEG should work
2. Creating VM8~VM9 in Net2@AZ1 with the SEG, and adding a router interface (Net2, R), should also work
3. ...
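The expansion-transparency requirement above can be sketched as follows. This is a minimal illustrative model, not real Tricircle code: the names `az_pods`, `pod1`, `pod2` and the least-loaded policy are assumptions; the point is only that the user keeps addressing the same AZ while a pod is chosen internally.

```python
az_pods = {"az1": ["pod1"]}          # AZ -> building blocks (pods); illustrative
vm_location = {}                      # VM name -> pod that hosts it

def add_pod(az, pod):
    """Capacity expansion: register a tested building block under an existing AZ."""
    az_pods.setdefault(az, []).append(pod)

def create_vm(name, az):
    """User-facing call: only the AZ is visible; a pod is picked internally
    (least loaded here, but any scheduling policy would do)."""
    pods = az_pods[az]
    pod = min(pods, key=lambda p: sum(1 for v in vm_location.values() if v == p))
    vm_location[name] = pod
    return pod

create_vm("VM1", "az1")               # before expansion: lands in pod1
add_pod("az1", "pod2")                # expansion: same AZ, new building block
pod = create_vm("VM6", "az1")         # user request is unchanged
print(pod)                            # the new block absorbs the load
```

The user's request (`create_vm("VM6", "az1")`) is identical before and after the expansion; only the internal pod list grew.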
App placement in multi-AZ for higher reliability and availability
DNS and/or load balancing for APPs in different AZs…
(AZ is a fault-domain concept. It is strange that OpenStack AZs share controller nodes/message bus/DB; each AZ should have its own.)
vEPC is designed as a distributed application; the DB/session processing units and front-end load balancers can be distributed into multiple AZs for higher reliability and availability
vEPC distributed into multiple AZs
But end users, PaaS, CLIs, SDKs, ... expect a single endpoint and one OpenStack API for these building blocks
Large scale cloud brings new requirements
○ L2/L3 networking across OpenStack instances
○ Distributed quota management
○ Global resource view of the tenant
○ Volume/VM migration/backup
○ Multi-DC image import/clone/export
○ Single API endpoint with open interfaces, tools, SDKs, ...
○ ...
● Use cases
  ○ Massive distributed edge cloud
  ○ Large scale cloud
  ○ Hybrid cloud
● Tricircle in a nutshell
[Diagram: a private OpenStack cloud (AZ1) with VMs behind a router, alongside AWS and Azure, each with VMs behind a router]
Having already built a private cloud and also wanting to use the power of public clouds, how do we manage the resources in a hybrid cloud? Networking, migration, backup/restore
Tricircle is an OpenStack API gateway with added value such as cross-OpenStack L2/L3 networking, volume/VM movement, image distribution, global resource view, and distributed quota management. This makes massive distributed edge clouds work like one inter-connected cloud, one OpenStack
Tricircle is an OpenStack API gateway with added value such as cross-OpenStack L2/L3 networking, volume/VM movement, image distribution, global resource view, and distributed quota management. This lets Tricircle address the capacity-expansion and multi-AZ challenges in a large-scale cloud, and work like one OpenStack
Tricircle is an OpenStack API gateway with added value such as cross-OpenStack L2/L3 networking, volume/VM movement, image distribution, global resource view, and distributed quota management. This makes Tricircle able to manage a hybrid cloud, leveraging the help of the "Jacket" project (https://wiki.openstack.org/wiki/Jacket)
Region, AZ, Pod, DC, Top, Bottom
[Diagram: Tricircle on top, exposing the OpenStack API; bottom OpenStack instances (pods), each running Nova/Cinder/Neutron, deployed in DC1, DC2, and DC3 and grouped into AZ1, AZ3, and AZ4]
Tricircle provides an OpenStack API gateway and networking automation, allowing multiple OpenStack instances, in one site, multiple sites, or a hybrid cloud, to be managed as a single OpenStack cloud. *Tricircle itself can be deployed across multiple distributed data centers
V1 Tricircle is modified from OpenStack … coupling
[Diagram: V1 Tricircle exposes the OpenStack API toward OpenStack@Site1..3, coupling with modified Nova/Cinder (via drivers) and Neutron L2/L3 agents]
V2 Tricircle is an OpenStack API gateway and networking automation, decoupled from OpenStack
[Diagram: V2 Tricircle: a Nova API gateway, a Cinder API gateway, and the Neutron API with the Tricircle plugin, forwarding the OpenStack API to OpenStack@Site1..3]
Tricircle: the stateless design is like cells, but better...
Cells vs Tricircle:
● Sharding the cloud: Cells: Yes | Tricircle: Yes
● Interface between API entrance and edge data center: Cells: RPC, SQL | Tricircle: RESTful OpenStack API (easy for upgrade, troubleshooting, multi-vendor integration)
● Services involved: Cells: Nova | Tricircle: Nova, Cinder, Neutron, Glance(*), Ceilometer(*)
● North API: Cells: Nova API | Tricircle: OpenStack API
Tricircle is an API gateway (API entrance), just like the Nova API in an API cell, but even simpler: it only forwards requests and leaves request-parameter validation to the bottom OpenStacks. No VM/volume/backup/snapshot data is stored in Tricircle.
* Glance, Ceilometer will be involved later.
Tricircle, OpenStack API gateway routing requests - VM placement
Message route:
1. Boot VM (AZ = az1) through the Nova-APIGW
2. The request is forwarded to the corresponding bottom OpenStack according to the AZ. If there is more than one bottom OpenStack in the AZ, one is scheduled. The bottom OpenStack processes the boot request.
3. The Nova-APIGW caches the routing for the new VM (the VM lives in the bottom OpenStack)
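The boot-request routing just described can be sketched in a few lines. This is an illustrative model, not the real Nova-APIGW code: the pod names, the `routing_db` dict, and the least-loaded tie-break are all assumptions standing in for Tricircle's scheduler and routing table.

```python
pods_by_az = {"az1": ["bottom-os-1a", "bottom-os-1b"]}  # AZ -> bottom OpenStacks
routing_db = {}   # resource id -> bottom OpenStack that owns it (step-3 cache)

def schedule_pod(az):
    # Step 2: if more than one bottom OpenStack serves the AZ, pick one.
    # Any policy works; here: fewest resources already routed to it.
    pods = pods_by_az[az]
    return min(pods, key=lambda p: sum(1 for v in routing_db.values() if v == p))

def boot_vm(vm_id, az):
    pod = schedule_pod(az)            # step 2: choose the bottom OpenStack
    routing_db[vm_id] = pod           # step 3: cache the routing for later ops
    return pod

boot_vm("vm-001", "az1")
print(routing_db["vm-001"])
```

Later operations on `vm-001` never need to re-schedule: they hit the cached routing entry.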
Tricircle, OpenStack API gateway routing requests - operation forwarding
Message route:
1. Nova reboot-VM request
2. Query the routing DB for where the VM is
3. Reboot-VM REST API call to the bottom OpenStack
4, 5. Just a local reboot-VM operation inside that OpenStack
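The forwarding step can be sketched as a routing-DB lookup followed by the same REST call against the owning bottom OpenStack. The endpoint URL and helper below are illustrative only; the real gateway issues an HTTP request rather than returning the URL.

```python
# resource id -> compute endpoint of the bottom OpenStack that owns it
routing_db = {"vm-001": "http://bottom-os-1/compute"}

def forward(vm_id, action):
    endpoint = routing_db[vm_id]      # step 2: where does the VM live?
    # Step 3: in reality an HTTP POST to <endpoint>/servers/<vm_id>/action;
    # here we just build the URL that would be called.
    return "%s/servers/%s/action (%s)" % (endpoint, vm_id, action)

url = forward("vm-001", "reboot")
print(url)
```

Because the gateway only forwards, the bottom OpenStack does all validation and state handling (steps 4 and 5).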
Tricircle, OpenStack API gateway routing requests - VM/Volume co-location
1. Level 1 co-location: availability zone
2. Level 2 co-location: project-ID and OpenStack binding. If one AZ includes more than one OpenStack, then within each AZ there is one currently bound OpenStack for a given project-ID
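The two co-location levels can be sketched as below. All names (`pods_by_az`, `binding`, the first-pod choice) are illustrative; the point is that once a project is bound inside an AZ, every later resource for that project lands in the same bottom OpenStack, so a VM and its volumes co-locate.

```python
pods_by_az = {"az1": ["pod-a", "pod-b"]}   # level 1: AZ narrows the choice
binding = {}                                # level 2: (project, az) -> bound pod

def pick_pod(project_id, az):
    key = (project_id, az)
    if key not in binding:                  # first resource of this project here:
        binding[key] = pods_by_az[az][0]    # bind the project to one pod
    return binding[key]                     # every later request reuses it

vm_pod = pick_pod("tenant-1", "az1")        # VM creation (Nova API-GW)
vol_pod = pick_pod("tenant-1", "az1")       # later volume creation (Cinder API-GW)
print(vm_pod == vol_pod)
```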
Neutron cannot just forward API requests; it needs to do networking for tenant VMs in different edge data centers.
So the Neutron API-GW is not a simple Neutron API forwarding gateway; it needs networking functionality.
The Neutron API-GW includes the Neutron API and a Tricircle plugin for networking, and keeps the Neutron DB for managing tenant-level IP/MAC addresses (which span multiple OpenStack instances); otherwise conflicts would happen.
Tricircle: networking of my VMs across different bottom OpenStacks?
Networking - L2 networking (mixed VLAN/VxLAN)
[Sequence on the diagram: Neutron API + Tricircle plugin with an L2GW driver, the Nova API-GW, and two bottom OpenStacks with L2GW1/L2GW2]
1. Create Network1
2. Create VM1 (Network1, AZ1)
3. Create Network1-1 (VLAN1) in the bottom OpenStack
4. Update Network1 (segment1 = Network1-1 @ AZ1)
5. Create Port1 for VM1
6. Create VM1 (Port1, Network1-1)
*support from Networking L2GW project
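The segment updates in this sequence accumulate one bottom network per AZ on the top network, in the spirit of Neutron's multi-provider extension. The dictionary shape below is illustrative, not the real Tricircle plugin data model:

```python
network1 = {"name": "Network1", "segments": []}  # the top (cross-site) network

def add_segment(net, bottom_net, net_type, seg_id, az):
    """Record one bottom-OpenStack network as a segment of the top network."""
    net["segments"].append({
        "provider:network_type": net_type,     # vlan in AZ1, vxlan in AZ2
        "provider:segmentation_id": seg_id,
        "bottom_network": bottom_net,          # e.g. Network1-1, Network1-2
        "availability_zone": az,
    })

add_segment(network1, "Network1-1", "vlan", 1, "AZ1")    # step 4 on this slide
add_segment(network1, "Network1-2", "vxlan", 2, "AZ2")   # step 9 on the next
print(len(network1["segments"]))
```

Each segment remembers which bottom OpenStack realizes the network and with which encapsulation, which is what lets the L2GW driver later stitch VLAN and VxLAN segments together.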
Networking - L2 networking (mixed VLAN/VxLAN)
7. Create VM2 (Network1, AZ2)
8. Create Network1-2 (VxLAN2) in the second bottom OpenStack
9. Update Network1 (segment2 = Network1-2 @ AZ2)
10. Create Port2 for VM2
11. Create VM2 (Port2, Network1-2)
Networking - L2 networking (mixed VLAN/VxLAN)
[The diagram adds XJob alongside the Neutron API + Tricircle plugin, with L2GW1/L2GW2 in the two bottom OpenStacks]
11. Start an async job for L2 networking between (Network1-1, Network1-2)
12. Create the L2GW local connection (on each site)
13. Create the L2GW remote connection (on each site)
14. Populate remote MAC/IP info (on each site)
L2 networking (EVPN) between the sites
When a new VM is booted, networking is a relatively time-consuming task: security groups, networks, subnets, router attachments, etc. need to be created. For a better user experience, so that the end user does not wait 30 seconds or longer for the VM boot request, this can be done as an async job. XJob is introduced for such async background jobs.
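The pattern XJob implements can be sketched with a worker thread: answer the API call immediately, run the slow networking work in the background. Here `queue`/`threading` stand in for the real RPC-over-message-bus dispatch, and all names are illustrative:

```python
import queue
import threading

jobs = queue.Queue()   # stands in for the message bus toward XJob
done = []              # completed background work, for inspection

def xjob_worker():
    while True:
        job = jobs.get()
        if job is None:                          # shutdown sentinel
            break
        done.append("l2-networking:%s" % job)    # create L2GW connections, etc.
        jobs.task_done()

def boot_vm(vm_id):
    jobs.put(vm_id)          # enqueue the slow networking job ...
    return "ACTIVE-pending"  # ... and answer the user immediately

worker = threading.Thread(target=xjob_worker)
worker.start()
status = boot_vm("vm-001")   # returns without waiting for networking
jobs.join()                  # (only the demo waits; the user never does)
jobs.put(None)
worker.join()
print(status, done)
```

The user-facing call returns before the networking job runs; the job completes asynchronously, exactly the trade-off the slide describes.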
Tricircle, balance between user experience and simplicity
Networking - L3 networking (E-W/N-S, VxLAN or mixed VLAN/VxLAN)
Will be implemented after cross-OpenStack L2 networking (VxLAN, mixed VLAN/VxLAN) is ready, using VxLAN or mixed VLAN/VxLAN as the N-S bridging network; for E-W, just stretch the L2 network to wherever it is needed.
https://review.openstack.org/#/c/304540/
Networking - L3 networking ( Shared VLAN E-W )
Top OpenStack API calls (letters) and the bottom-OpenStack calls they trigger (numbers):
a. Create Network1 → 1. Create Network1
b. Create VM1 (Network1) → 2. Create VM1 (Network1)
c. Create Router → 3. Create Router1
d. Add router-interface (Router1, Network1) → 4. Add router-interface (Router1, Network1)
e. Create Network2 → 5. Create Network2
f. Create VM2 (Network2) → 6. Create VM2 (Network2)
g. Add router-interface (Router, Network2) → 7. Create Router2, 8. Add router-interface (Router2, Network2)
Bridging over provider VLAN1: 9. Create provider network VLAN1, 10. Add router interface (Router1, VLAN1), 11. Create provider network VLAN1, 12. Add router interface (Router2, VLAN1)
Shared VLAN is mainly used intra-data-center, but that doesn't mean it can't be used inter-data-center: for example, put a physical VxLAN gateway (VLAN-to-VxLAN 1:1 mapping) in each DC. It would look like this: VLAN 100 in DC1 → VxLAN gateway in DC1 converts VLAN 100 to VxLAN 5095 → ... → VxLAN gateway in DC2 converts VxLAN 5095 to VLAN 100 → VLAN 100 in DC2. You have to configure the VLAN/VxLAN mapping manually.
[Diagram: three bottom OpenStacks; Router1 and Router2 connect Network1/Network2 over a provider VLAN used as the E-W bridge network (CIDR: cidr E-W); Router3 connects a provider VLAN N-S bridge network (CIDR: cidr N-S) to the external network, with floating IP FIP1 (external) mapped to fip1 (internal)]
Networking - L3 networking ( Shared VLAN E-W/N-S )
Tenant data movement across OpenStack
OpenStack API gateway:
● Move a tenant's data (VM, volume, image, etc.) across sites, leveraging the cross-site tenant L2/L3 networking
Create a VM with the transportation tool, attach the volume (the data to be moved) to the VM, and move the data across OpenStacks through tenant-level L2/L3 networking. *Conveyor, a project built above Tricircle, will help to do this: https://launchpad.net/conveyor
Summary: Stateless designed Tricircle
[Diagram: Tricircle consists of the Admin API, Nova API-GW, Cinder API-GW, Neutron API with the Tricircle plugin, XJob, a DB, and a message bus; bottom OpenStack pods (Nova/Cinder/Neutron) sit underneath]
● RESTful OpenStack API toward the bottom OpenStacks
● Message bus: async XJob RPC API for cross-OpenStack functionality such as networking and volume migration
● DB access for site management and resource routing
New components of Tricircle
What does stateless mean for Tricircle? Tricircle, which works only as an API gateway, does not store object data such as VM/volume/backup/snapshot, and especially not the status of these objects. Networking logical objects like network/subnet/router can span multiple sites, so it is necessary to store these logical abstractions for global IP/MAC address management and networking purposes. But even ports are queried from the bottom OpenStack.
Stateless designed Tricircle: the new components of Tricircle are fully decoupled from OpenStack services like Nova and Cinder, and the Tricircle plugin works just like the OVN or ODL plugin in the Neutron project. The stateless architecture proposal removes uuid mapping and status synchronization, lets resources be provisioned in each bottom OpenStack instance, and makes Tricircle a very slim API gateway.
● Nova API-GW
  ○ A standalone web service that receives all Nova API requests and routes each request to the appropriate bottom OpenStack instance according to availability zone (during creation) or resource ID (during operation and query). If there is more than one pod in an availability zone, the request is routed to the proper pod according to the current tenant-ID-to-pod binding.
  ○ The Nova API-GW triggers automatic networking when new VMs are provisioned.
  ○ Works as a stateless service, and its processes can be distributed across multiple hosts.
● Cinder API-GW
  ○ A standalone web service that receives all Cinder API requests and routes each request to the appropriate bottom OpenStack according to availability zone (during creation) or resource ID (during operation and query). If there is more than one pod in an availability zone, the request is routed to the proper pod according to the current tenant-ID-to-pod binding.
  ○ The Cinder API-GW and Nova API-GW make sure that the volumes for a given VM co-locate in the same OpenStack instance.
  ○ Works as a stateless service, and its processes can be distributed across multiple hosts.
● Neutron API server
  ○ The Neutron API server is reused from Neutron to receive and handle Neutron API requests.
  ○ The Neutron Tricircle plugin runs under the Neutron API server in the same process, like the OVN Neutron plugin. The Tricircle plugin serves tenant-level L2/L3 networking automation across multiple OpenStack instances. It uses a driver interface to call the L2GW API, especially for cross-OpenStack mixed VLAN/VxLAN L2 networking.
● Admin API
  ○ Manages sites (bottom OpenStack instances) and availability-zone mappings.
  ○ Retrieves object uuid routing.
  ○ Exposes APIs for maintenance.
● XJob
  ○ Receives and processes cross-OpenStack functionality and other async jobs from the Nova API-GW, Cinder API-GW, Admin API, or Neutron Tricircle plugin.
  ○ For example, when booting a VM for the first time for a project, the router, security group rules, FIP, and other resources may not yet exist in the bottom OpenStack instance, but they are required. Unlike the network, security group, and SSH keypair, which must exist before a VM boots, these resources can be created asynchronously to accelerate the response to the first VM boot request.
  ○ Cross-OpenStack networking is also done in async jobs.
  ○ The Admin API, Nova API-GW, Cinder API-GW, or Neutron Tricircle plugin can send an async job to XJob through the message bus, using the RPC API provided by XJob.
● Database
  ○ Tricircle has its own database to store pods, pod bindings, jobs, and resource-routing tables.
  ○ The Neutron Tricircle plugin reuses the Neutron DB, since one tenant's networks and routers are spread across multiple pods, and it manages tenant-level IP/MAC addresses.
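The resource-routing table mentioned above can be sketched with an in-memory SQLite schema. The column names here are illustrative, not the real Tricircle schema (see the design doc for that); the point is the top-ID-to-bottom-ID-and-pod mapping that the gateways consult.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE resource_routing (
    top_id TEXT PRIMARY KEY,     -- id the user sees at the API gateway
    bottom_id TEXT,              -- id of the object in the bottom OpenStack
    pod TEXT,                    -- which bottom OpenStack instance owns it
    resource_type TEXT)""")

# Recorded when the Nova API-GW forwards a boot request to a pod:
db.execute("INSERT INTO resource_routing VALUES (?, ?, ?, ?)",
           ("vm-top-1", "vm-bottom-9", "pod1@az1", "server"))

# Consulted on every later operation (reboot, query, delete, ...):
pod, = db.execute("SELECT pod FROM resource_routing WHERE top_id = ?",
                  ("vm-top-1",)).fetchone()
print(pod)
```

Because only this routing metadata is stored, and not the objects or their status, the gateway stays slim and stateless in the sense the slide defines.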
More information
wiki of Tricircle: https://wiki.openstack.org/wiki/Tricircle
play and contribute: https://github.com/openstack/tricircle
Design doc: https://docs.google.com/document/d/18kZZ1snMOCD9IQvUKI5NVDzSASpw-QKj7l2zNqMEd3g
The good aspect of cells is dividing the DB and message bus into a separate small DB and message bus per cell. A cell can be deployed in a data center, so resource locality can be achieved.
Cells V2: a good but not good enough enhancement(1)
The challenges with cells are:
● Only Nova supports cells.
● Using RPC for inter-data-center communication makes inter-DC troubleshooting difficult. If the link to one site is broken, the site is not manageable, since there is no CLI or REST API to manage a child cell.
● Upgrades have to deal with DB and RPC changes.
● Multi-vendor integration of different cells via the RPC/DB interface is difficult.
If there were a RESTful interface between the API cell and the child cell, it would be better: the site (child cell) would still be manageable even if the link to the site were broken. But a RESTful interface does not allow direct access to the DB and message bus. That's why Tricircle uses the OpenStack API as the interface between the "API cell" and the "child cell" in each site.
Cells V2: a good but not good enough enhancement(2)