What's In This Guide
Types of Cloud Workloads
CloudPlatform Supports Both Workload Types
Traditional Workload
Management Server Cluster Deployment
What Type of Workload is the Management Server?
Management Server Cluster Backup and Replication
Management Server Cluster Hardware
Primary Management Server Cluster
Standby Management Server Cluster
Management Server Cluster Configuration
Primary Management Server Cluster Configuration
Cloud-Era Availability Zone Deployment
Cloud-Era Availability Zone Configuration
Traditional Availability Zone Deployment
Traditional Availability Zone Hardware
Choice of Hypervisor in Traditional Availability Zone
Traditional Availability Zone Configuration (for vSphere)
Traditional Availability Zone Configuration (for XenServer)
Disclaimer: Vendors and products mentioned in this document are provided as examples and should not be taken as
endorsements or as an indication of vendor certification.
Citrix CloudPlatform™ is an open source software platform that pools datacenter resources to build public, private, and
hybrid Infrastructure as a Service (IaaS) clouds. CloudPlatform abstracts the network, storage, and compute nodes that
make up a datacenter and enables them to be delivered as a simple-to-manage, scalable cloud infrastructure. These nodes
or components of a cloud can vary greatly from datacenter to datacenter and cloud to cloud because they are defined by
the unique workloads or applications that they support. With so many options for servers, hypervisors, storage, and
networking, it is imperative that cloud operators design with a specific application in mind to ensure the infrastructure
meets the application's scalability and reliability requirements.
The following figure illustrates the steps a cloud operator typically follows to determine the appropriate deployment
architecture for CloudPlatform: first define the target workloads, then determine how each application workload will be
delivered reliably.
Types of Cloud Workloads
Two distinct types of application workload have emerged in cloud operators' datacenters.
The first type is a traditional enterprise workload. The majority of existing enterprise applications fall into this category. They include, for example, applications developed by leading enterprise vendors such as Microsoft, Oracle, and SAP. These applications are typically built to run on a single server or on a cluster of front-end and application server nodes backed by a database. Traditional workloads typically rely on technologies such as enterprise middleware clusters and vertically-scaled databases.
Citrix commonly refers to the second type as a Cloud-Era workload. Internet companies such as Amazon, Google, Zynga, and Facebook long ago realized that traditional enterprise infrastructure was insufficient to serve the load generated by millions of users. These Internet companies pioneered a new style of application architecture that does not rely on enterprise-grade server clusters, but on a large number of loosely coupled computing and storage nodes. Applications developed this way often use technologies such as MySQL sharding, NoSQL databases, and geographic load balancing.
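As a hedged illustration of one of these techniques (this is not CloudPlatform code), MySQL sharding often starts with a stable hash that routes each key to one of N database shards. The key format and shard count below are hypothetical:

```python
import hashlib

def shard_for(user_key: str, num_shards: int = 4) -> int:
    """Map a user key to one of num_shards MySQL shards.

    Uses a stable hash so the same key always routes to the same
    shard; the shard count and key naming are illustrative
    assumptions, not a CloudPlatform convention.
    """
    digest = hashlib.sha1(user_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards
```

The application would then open a connection to the shard returned here; rebalancing when `num_shards` changes is a separate problem (often addressed with consistent hashing) that this sketch deliberately ignores.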
There are two fundamental differences between traditional workloads and cloud-era workloads.
SCALE: The first difference is scale. Traditional enterprise applications serve tens of thousands of users and
hundreds of sessions. Driven by the growth of the Internet and mobile devices, Internet applications serve tens of millions of users. This orders-of-magnitude difference in scale translates into a significant difference in demand for computing infrastructure. As a result, the need to reduce cost and improve efficiency becomes paramount.
RELIABILITY: The difference in scale has an important side effect. Enterprise applications can be designed to run on
reliable hardware. Application developers do not expect the underlying enterprise-grade server or storage cluster to fail during the normal course of operation, and sophisticated backup and disaster recovery procedures can be set up to handle the unlikely scenario of hardware failure. Internet scale changed this paradigm. As the amount of hardware resources grows, it is no longer possible to deliver the same level of enterprise-grade reliability, backup, and disaster recovery in a cost-effective and efficient manner at the scale needed to support Internet workloads.
Traditional vs. Cloud-Era Workload Requirements

                 Traditional Workload         Cloud-Era Workload
Scale            10s of thousands of users    Millions of users
Reliability      99.999% uptime               Assumes failure
Infrastructure   Proprietary                  Commodity
Applications     SAP, Microsoft, Oracle       Web content, web apps, social media
Cloud-era workloads assume that the underlying infrastructure can and will fail. Instead of implementing disaster recovery as an afterthought, multi-site geographic failover must be designed into the application. Because the application expects infrastructure failure, it no longer needs to rely on technologies such as network link aggregation, storage multipathing, VM HA or fault tolerance, or VM live migration. Instead, the application is expected to treat servers and storage as "ephemeral resources": resources that can be used while they are available, but that may become unavailable after a short period of use. Some cloud-era applications, such as the Netflix streaming video service, notably employ a mechanism called "Chaos Monkey" that randomly destroys infrastructure nodes to ensure that the application continues to function despite infrastructure failure.
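The Chaos Monkey idea can be reduced to a very small sketch: periodically pick a live node at random and kill it, then verify the application still works. The node inventory and terminate callback below are purely hypothetical stand-ins for a real orchestration API:

```python
import random

def chaos_round(nodes, terminate, rng=None):
    """Pick one node at random and terminate it.

    A toy version of the Chaos Monkey mechanism described above:
    regularly destroying random infrastructure nodes proves that the
    application tolerates failure. `nodes` (a list of identifiers)
    and `terminate` (a kill callback) are illustrative assumptions.
    """
    rng = rng or random.Random()
    victim = rng.choice(nodes)
    terminate(victim)
    return victim
```

A real deployment would schedule this during business hours (so engineers can respond) and restrict it to nodes tagged as safe to kill.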
Common Cloud Workloads

Traditional Workload Candidates
Communications / Productivity: Outlook, Exchange, or SharePoint
CRM / ERP / Database: Oracle, SAP
Desktop: desktop-based computing, desktop service and support applications, and desktop management applications
Database replication between the primary and standby clusters can be done using MySQL replication with the hot
backup option. More information is available at http://www.innodb.com/wp/products/hot-backup/.
CloudPlatform Internal DNS: CPMS-URL (example URL pointing at the CloudPlatform Management Server)
Management Nodes: 10.52.2.148; 10.52.2.149
CloudPlatform Version: CloudPlatform 3.0.x
MySQL Version: MySQL 5.1.61
MySQL Database (Master) IP Address: 10.52.2.142
MySQL Database (Slave) IP Address: 10.52.2.143
Management Server Node Configuration

Management Servers
Number of Servers (VMs) for Management: 2. This is a redundant design for high availability.
Name(s): CPMGSRV01, CPMGSRV02. These are sample names; no naming standard is implied.
IP Address(es): 10.52.2.148; 10.52.2.149. These addresses are for reference only and must be changed to fit the datacenter's network configuration.
Deployment Hypervisor: XenServer 6.0.2. Version 6.0.2 is the latest version of XenServer and is tested and entitled with CloudPlatform 3.0.x.
Management Server VM Properties: 4 vCPUs, 16 GB RAM, 1 NIC, 250 GB HDD. The Management Server is memory intensive, and sufficient RAM ensures performance requirements are met.
Operating System: RHEL 6.2 (64-bit). RHEL is the recommended OS for its available commercial support.
Load Balancing Used: Yes. Load balancing the Management Servers is a recommended practice to meet performance requirements.
Load Balancer: NetScaler VPX. Considering the load, number of users, and SSL connections this cloud architecture must manage, NetScaler VPX suffices; NetScaler MPX is an option if the load requirement goes beyond what is described in this document.
Load Balancer (NetScaler) Configuration
The CloudPlatform UI is load balanced. CloudPlatform requires that ports 8080 and 8250 be configured on the LB VIP,
with persistence/stickiness maintained across multiple sessions.
Source Port Destination Port Protocol Persistence
8080 8080 HTTP Yes
8250 8250 TCP Yes
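As a hedged sketch, the two load-balanced services in the table above might be defined on the NetScaler as follows. The VIP 10.52.2.150 and all service/vserver names are hypothetical; verify the exact CLI syntax against the documentation for your NetScaler firmware release.

```
# CloudPlatform UI on 8080 (HTTP); VIP and names are examples only
add service svc_cpms01_8080 10.52.2.148 HTTP 8080
add service svc_cpms02_8080 10.52.2.149 HTTP 8080
add lb vserver vs_cp_ui HTTP 10.52.2.150 8080 -persistenceType SOURCEIP
bind lb vserver vs_cp_ui svc_cpms01_8080
bind lb vserver vs_cp_ui svc_cpms02_8080

# Agent traffic on 8250 (TCP), also with source-IP persistence
add service svc_cpms01_8250 10.52.2.148 TCP 8250
add service svc_cpms02_8250 10.52.2.149 TCP 8250
add lb vserver vs_cp_8250 TCP 10.52.2.150 8250 -persistenceType SOURCEIP
bind lb vserver vs_cp_8250 svc_cpms01_8250
bind lb vserver vs_cp_8250 svc_cpms02_8250
```

Source-IP persistence is one way to satisfy the stickiness requirement; cookie-based persistence is an alternative for the HTTP vserver if clients sit behind a shared NAT.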
Master/Slave MySQL Configuration
CloudPlatform requires a MySQL database to store configuration information, VM staging data, and events related to every VM
(every guest VM started as part of the cloud environment creates an associated event, which is stored in the database).
The script provided with the CloudPlatform installation creates two databases, cloud and cloud_usage, and populates the
initial data in each. The CloudPlatform Installation Guide details the scripts used for installing and preparing the
databases for CloudPlatform.
CloudPlatform currently depends on the InnoDB engine in MySQL for foreign key support in both the cloud
and cloud_usage databases; therefore a MySQL Cluster cannot be used. The following section describes a master/slave
configuration of MySQL.
MySQL replication works on a master/slave topology, so there is no requirement for shared storage. Data is kept
consistent between the two servers by means of asynchronous replication. The replication methodology used for
CloudPlatform is row-based.
The MySQL Community Edition (GPL) is deployed on two separate virtual servers running Red Hat Enterprise Linux 6.2,
with replication (master and slave) configured between them for high availability.
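A minimal sketch of this master/slave setup, assuming MySQL 5.1 on the two database VMs listed earlier. The replication account, password placeholder, and binlog file/position are hypothetical; the real file and position must be taken from SHOW MASTER STATUS on the actual master.

```
# /etc/my.cnf fragment on the master (10.52.2.142) -- illustrative only
[mysqld]
server-id     = 1
log-bin       = mysql-bin
binlog-format = ROW          # CloudPlatform uses row-based replication

# /etc/my.cnf fragment on the slave (10.52.2.143)
[mysqld]
server-id     = 2

-- On the master: create a replication account (name/password are examples)
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.52.2.143' IDENTIFIED BY '<password>';

-- On the slave: point at the master, then start replication
CHANGE MASTER TO
  MASTER_HOST='10.52.2.142',
  MASTER_USER='repl',
  MASTER_PASSWORD='<password>',
  MASTER_LOG_FILE='mysql-bin.000001',  -- from SHOW MASTER STATUS
  MASTER_LOG_POS=0;                    -- likewise
START SLAVE;
```

Because replication is asynchronous, a failover to the slave can lose the last few transactions; the hot backup procedure referenced earlier remains necessary for point-in-time recovery.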
Here is the summary of networking configuration in the Cloud-era availability zone:
1. A pair of NetScaler MPX appliances in HA configuration is connected directly to the public Internet on one side, and on the other side to the datacenter core switch on a RFC 1918 private network.
2. Datacenter core switch and aggregation switches create 200 pairs of RFC 1918 private IP networks. Each pod consumes 1 pair of RFC 1918 private IP networks: a storage/management network and a guest network.
3. Each host in the pod is connected to 2 RFC 1918 private IP networks. One is a 10Gbps network used for storage and management traffic. The other is a 1Gbps network used to carry guest VM traffic.
4. There is one NFS server in each pod. The NFS server is connected to the storage/management network via a 10Gbps Ethernet link.
5. Link aggregation may be used in the datacenter core and aggregation switches. Link aggregation is not used in TOR switches, hosts, or primary storage NFS servers.
6. A high performance NFS server is directly connected to the datacenter aggregation switch layer and is used as the secondary storage server for this datacenter.
The datacenter core and aggregation switches set up the appropriate network ACL to ensure that various networks are
properly isolated. The following table details best practices on whether access should be allowed or denied based on
3. Create 2 NFS exports in the primary storage NFS server for each pool
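On a Linux NFS server, the two exports per pool called for above might look like the following /etc/exports fragment. The export paths and the storage/management subnet are hypothetical; adjust them to the pod's actual layout and restrict access to the storage/management network.

```
# /etc/exports on the pod's primary storage NFS server (illustrative)
/export/primary/pool1  10.52.4.0/24(rw,async,no_root_squash)
/export/primary/pool2  10.52.4.0/24(rw,async,no_root_squash)
```

After editing, `exportfs -a` applies the changes; `async` trades durability for throughput, so some operators prefer `sync` for primary storage.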
Availability Zone(s): 1 (a minimum of two availability zones is always recommended)
ZONE-01
Network Mode: Basic (L3 network model). This zone has 200 PODs with two clusters in each POD. The configuration below is specified for one cluster and can be replicated for every other cluster in all PODs.
\ZONE-01 \PODS\ <Z-01-POD01-Xen-CL01-04>
Name of Cluster(s): Z-01-POD01-Xen-CL01, Z-01-POD01-Xen-CL02, Z-01-POD02-Xen-CL03, Z-01-POD02-Xen-CL04
These POD and cluster names are specific to this implementation.
Number of Hypervisors (compute nodes) per Cluster: 8 x XenServer 6.0.x

Storage Infrastructure
Type / Make: NetApp FAS3270
Number of Controllers: 2 (two controllers for availability)
Primary Protocol: NFS
Available Capacity: 20 TB/pod. This is an example; calculate the capacity requirement for primary and secondary storage using the formulas mentioned in the section above.
Primary Storage (two per cluster)
Z-01-POD01-CL (replicate this for every cluster)
Availability Zone: ZONE-01 (zone/POD/cluster names should be treated as examples)
Pod: Z-01-POD01
Cluster: Z-01-POD01-CL01
Protocol: NFS
Size: 4 TB. Refer to the computation in the section above to determine the exact size.
In this section we describe how to design and configure a 64-node traditional server virtualization availability zone. The
availability zone comprises 4 pods of 16 nodes each. Unlike the Cloud-era setup, where each pod has its own
NFS servers, the entire zone shares a centralized storage server over a SAN. The availability zone is connected to 4 shared
VLANs: public, DMZ, test-dev, and production. In addition, tenants can be allocated isolated VLANs from a pool of zone
VLANs. A VM can be connected to one or more of these networks:
An isolated VLAN NAT’ed to public internet via the virtual router
The DMZ VLAN
The test-dev VLAN
The production VLAN
The following figure illustrates the physical network setup for a traditional availability zone:
Every host is connected to 3 networks:
1. A storage network that connects the host to primary storage. Storage multipath technology should be used to ensure reliability.
2. An untagged Ethernet network used for management and vMotion traffic. NIC bonding should be used to ensure reliability.
3. An Ethernet network used for shared and public VLAN traffic. NIC bonding should be used to ensure reliability. This network is used to carry 4 shared VLANs: public, DMZ, test-dev, and production. It is also used to carry the isolated zone VLANs.
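On a RHEL-style host, the NIC bonding called for in items 2 and 3 can be sketched with the fragments below. XenServer and vSphere manage bonds through their own toolstacks, and the device names and active-backup mode here are illustrative assumptions:

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 -- illustrative RHEL 6 bond
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- enslave each physical NIC
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
```

Active-backup needs no switch support; LACP (mode=802.3ad) is an alternative where the TOR switch is configured for it, though item 5 of the Cloud-era design above deliberately avoids link aggregation at that layer.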