What's In This Guide
Types of Cloud Workloads
CloudPlatform Supports Both Workload Types
Traditional Workload
Management Server Cluster Deployment
What Type of Workload is the Management Server?
Management Server Cluster Backup and Replication
Management Server Cluster Hardware
Primary Management Server Cluster
Standby Management Server Cluster
Management Server Cluster Configuration
Primary Management Server Cluster Configuration
Cloud-Era Availability Zone Deployment
Cloud-Era Availability Zone Configuration
Traditional Availability Zone Deployment
Traditional Availability Zone Hardware
Choice of Hypervisor in Traditional Availability Zone
Traditional Availability Zone Configuration (for vSphere)
Traditional Availability Zone Configuration (for XenServer)
Disclaimer: Vendors and products mentioned in this document are provided as examples and should not be taken as
endorsements or as an indication of vendor certification.
Citrix CloudPlatform™ is an open source software platform that pools datacenter resources to build public, private, and
hybrid Infrastructure as a Service (IaaS) clouds. CloudPlatform abstracts the network, storage, and compute nodes that
make up a datacenter and enables them to be delivered as a simple-to-manage, scalable cloud infrastructure. These nodes
or components of a cloud can vary greatly from datacenter to datacenter and cloud to cloud because they are defined by
the unique workloads or applications that they support. With so many options for servers, hypervisors, storage, and
networking, it is imperative that cloud operators design with a specific application in mind to ensure the infrastructure
meets the application's scalability and reliability requirements.
The following figure illustrates the steps a cloud operator typically follows to determine the appropriate deployment
architecture for CloudPlatform: first define the target workloads, then determine how each application workload will be
delivered reliably.
Types of Cloud Workloads
Two distinct types of application workload have emerged in cloud operators' datacenters.
The first type is a traditional enterprise workload. The majority of existing enterprise applications fall into this category. They include, for example, applications developed by leading enterprise vendors such as Microsoft, Oracle, and SAP. These applications are typically built to run on a single server or on a cluster of front-end and application server nodes backed by a database. Traditional workloads typically rely on technologies such as enterprise middleware clusters and vertically-scaled databases.
Citrix commonly refers to the second type as a Cloud-Era workload. Internet companies such as Amazon, Google, Zynga, and Facebook long ago realized that traditional enterprise infrastructure was insufficient to serve the load generated by millions of users. These Internet companies pioneered a new style of application architecture that does not rely on enterprise-grade server clusters, but on a large number of loosely coupled computing and storage nodes. Applications developed this way often use technologies such as MySQL sharding, NoSQL databases, and geographic load balancing.
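As a hedged illustration of one of these techniques (this is not CloudPlatform code), MySQL sharding often starts with a stable hash that routes each key to one of N database shards. The key format and shard count below are hypothetical:

```python
import hashlib

def shard_for(user_key: str, num_shards: int = 4) -> int:
    """Map a user key to one of num_shards MySQL shards.

    Uses a stable hash so the same key always routes to the same
    shard; the shard count and key naming are illustrative
    assumptions, not a CloudPlatform convention.
    """
    digest = hashlib.sha1(user_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards
```

The application would then open a connection to the shard returned here; rebalancing when `num_shards` changes is a separate problem (often addressed with consistent hashing) that this sketch deliberately ignores.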
There are two fundamental differences between traditional workloads and cloud-era workloads.
SCALE: The first difference is scale. Traditional enterprise applications serve tens of thousands of users and
hundreds of sessions. Driven by the growth of the Internet and mobile devices, Internet applications serve tens of millions of users. This orders-of-magnitude difference in scale translates into a significant difference in demand for computing infrastructure. As a result, the need to reduce cost and improve efficiency becomes paramount.
RELIABILITY: The difference in scale has an important side effect. Enterprise applications can be designed to run on
reliable hardware. Application developers do not expect the underlying enterprise-grade server or storage cluster to fail during the normal course of operation, and sophisticated backup and disaster recovery procedures can be set up to handle the unlikely scenario of hardware failure. Internet scale changed this paradigm. As the amount of hardware resources grows, it is no longer possible to deliver the same level of enterprise-grade reliability, backup, and disaster recovery in a cost-effective and efficient manner at the scale needed to support Internet workloads.
Traditional vs. Cloud-Era Workload Requirements

                 Traditional Workload         Cloud-Era Workload
Scale            10s of thousands of users    Millions of users
Reliability      99.999% uptime               Assumes failure
Infrastructure   Proprietary                  Commodity
Applications     SAP, Microsoft, Oracle       Web content, web apps, social media
Cloud-era workloads assume that the underlying infrastructure can and will fail. Instead of implementing disaster recovery as an afterthought, multi-site geographic failover must be designed into the application. Because the application expects infrastructure failure, it no longer needs to rely on technologies such as network link aggregation, storage multipathing, VM HA or fault tolerance, or VM live migration. Instead, the application is expected to treat servers and storage as "ephemeral resources": resources that can be used while they are available, but that may become unavailable after a short period of use. Some cloud-era applications, such as the Netflix streaming video service, notably employ a mechanism called "Chaos Monkey" that randomly destroys infrastructure nodes to ensure that the application continues to function despite infrastructure failure.
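The Chaos Monkey idea can be reduced to a very small sketch: periodically pick a live node at random and kill it, then verify the application still works. The node inventory and terminate callback below are purely hypothetical stand-ins for a real orchestration API:

```python
import random

def chaos_round(nodes, terminate, rng=None):
    """Pick one node at random and terminate it.

    A toy version of the Chaos Monkey mechanism described above:
    regularly destroying random infrastructure nodes proves that the
    application tolerates failure. `nodes` (a list of identifiers)
    and `terminate` (a kill callback) are illustrative assumptions.
    """
    rng = rng or random.Random()
    victim = rng.choice(nodes)
    terminate(victim)
    return victim
```

A real deployment would schedule this during business hours (so engineers can respond) and restrict it to nodes tagged as safe to kill.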
Common Cloud Workloads

Traditional Workload Candidates
Communications / Productivity: Outlook, Exchange, or SharePoint
CRM / ERP / Database: Oracle, SAP
Desktop: desktop-based computing, desktop service and support applications, and desktop management applications
Database replication between the primary and standby clusters can be done using MySQL replication with the hot
backup option. More information is available at http://www.innodb.com/wp/products/hot-backup/.
CloudPlatform Internal DNS: CPMS-URL (example URL pointing at the CloudPlatform Management Server)
Management Nodes: 10.52.2.148; 10.52.2.149
CloudPlatform Version: CloudPlatform 3.0.x
MySQL Version: MySQL 5.1.61
MySQL Database (Master) IP Address: 10.52.2.142
MySQL Database (Slave) IP Address: 10.52.2.143
Management Server Node Configuration

Management Servers
Number of Servers (VMs) for Management: 2. This is a redundant design for high availability.
Name(s): CPMGSRV01, CPMGSRV02. These are sample names; no naming standard is implied.
IP Address(es): 10.52.2.148; 10.52.2.149. These addresses are for reference only and must be changed to fit the datacenter's network configuration.
Deployment Hypervisor: XenServer 6.0.2. Version 6.0.2 is the latest version of XenServer and is tested and entitled with CloudPlatform 3.0.x.
Management Server VM Properties: 4 vCPUs, 16 GB RAM, 1 NIC, 250 GB HDD. The Management Server is memory intensive, and sufficient RAM ensures performance requirements are met.
Operating System: RHEL 6.2 (64-bit). RHEL is the recommended OS for its available commercial support.
Load Balancing Used: Yes. Load balancing the Management Servers is a recommended practice to meet performance requirements.
Load Balancer: NetScaler VPX. Considering the load, number of users, and SSL connections this cloud architecture must manage, NetScaler VPX suffices; NetScaler MPX is an option if the load requirement goes beyond what is described in this document.
Load Balancer (NetScaler) Configuration
The CloudPlatform UI is load balanced. CloudPlatform requires that ports 8080 and 8250 be configured on the LB VIP,
with persistence/stickiness maintained across multiple sessions.
Source Port Destination Port Protocol Persistence
8080 8080 HTTP Yes
8250 8250 TCP Yes
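As a hedged sketch, the two load-balanced services in the table above might be defined on the NetScaler as follows. The VIP 10.52.2.150 and all service/vserver names are hypothetical; verify the exact CLI syntax against the documentation for your NetScaler firmware release.

```
# CloudPlatform UI on 8080 (HTTP); VIP and names are examples only
add service svc_cpms01_8080 10.52.2.148 HTTP 8080
add service svc_cpms02_8080 10.52.2.149 HTTP 8080
add lb vserver vs_cp_ui HTTP 10.52.2.150 8080 -persistenceType SOURCEIP
bind lb vserver vs_cp_ui svc_cpms01_8080
bind lb vserver vs_cp_ui svc_cpms02_8080

# Agent traffic on 8250 (TCP), also with source-IP persistence
add service svc_cpms01_8250 10.52.2.148 TCP 8250
add service svc_cpms02_8250 10.52.2.149 TCP 8250
add lb vserver vs_cp_8250 TCP 10.52.2.150 8250 -persistenceType SOURCEIP
bind lb vserver vs_cp_8250 svc_cpms01_8250
bind lb vserver vs_cp_8250 svc_cpms02_8250
```

Source-IP persistence is one way to satisfy the stickiness requirement; cookie-based persistence is an alternative for the HTTP vserver if clients sit behind a shared NAT.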
Master/Slave MySQL Configuration
CloudPlatform requires a MySQL database to store configuration information, VM staging data, and events related to every VM
(every guest VM started as part of the cloud environment creates an associated event, which is stored in the database).
The script provided with the CloudPlatform installation creates two databases, cloud and cloud_usage, and populates the
initial data in each. The CloudPlatform Installation Guide details the scripts used for installing and preparing the
databases for CloudPlatform.
CloudPlatform currently depends on the InnoDB engine in MySQL for foreign key support in both the cloud
and cloud_usage databases; therefore a MySQL Cluster cannot be used. The following section describes a master/slave
configuration of MySQL.
MySQL replication works on a master/slave topology, so there is no requirement for shared storage. Data is kept
consistent between the two servers by means of asynchronous replication. The replication methodology used for
CloudPlatform is row-based.
The MySQL Community Edition (GPL) is deployed on two separate virtual servers running Red Hat Enterprise Linux 6.2,
with replication (master and slave) configured between them for high availability.
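A minimal sketch of this master/slave setup, assuming MySQL 5.1 on the two database VMs listed earlier. The replication account, password placeholder, and binlog file/position are hypothetical; the real file and position must be taken from SHOW MASTER STATUS on the actual master.

```
# /etc/my.cnf fragment on the master (10.52.2.142) -- illustrative only
[mysqld]
server-id     = 1
log-bin       = mysql-bin
binlog-format = ROW          # CloudPlatform uses row-based replication

# /etc/my.cnf fragment on the slave (10.52.2.143)
[mysqld]
server-id     = 2

-- On the master: create a replication account (name/password are examples)
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'10.52.2.143' IDENTIFIED BY '<password>';

-- On the slave: point at the master, then start replication
CHANGE MASTER TO
  MASTER_HOST='10.52.2.142',
  MASTER_USER='repl',
  MASTER_PASSWORD='<password>',
  MASTER_LOG_FILE='mysql-bin.000001',  -- from SHOW MASTER STATUS
  MASTER_LOG_POS=0;                    -- likewise
START SLAVE;
```

Because replication is asynchronous, a failover to the slave can lose the last few transactions; the hot backup procedure referenced earlier remains necessary for point-in-time recovery.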
Here is the summary of networking configuration in the Cloud-era availability zone:
1. A pair of NetScaler MPX appliances in HA configuration is connected directly to the public Internet on one side, and on the other side to the datacenter core switch on a RFC 1918 private network.
2. Datacenter core switch and aggregation switches create 200 pairs of RFC 1918 private IP networks. Each pod consumes 1 pair of RFC 1918 private IP networks: a storage/management network and a guest network.
3. Each host in the pod is connected to 2 RFC 1918 private IP networks. One is a 10Gbps network used for storage and management traffic. The other is a 1Gbps network used to carry guest VM traffic.
4. There is one NFS server in each pod. The NFS server is connected to the storage/management network via a 10Gbps Ethernet link.
5. Link aggregation may be used in the datacenter core and aggregation switches. Link aggregation is not used in TOR switches, hosts, or primary storage NFS servers.
6. A high performance NFS server is directly connected to the datacenter aggregation switch layer and is used as the secondary storage server for this datacenter.
The datacenter core and aggregation switches set up the appropriate network ACL to ensure that various networks are
properly isolated. The following table details best practices on whether access should be allowed or denied based on
3. Create 2 NFS exports in the primary storage NFS server for each pool
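On a Linux NFS server, the two exports per pool called for above might look like the following /etc/exports fragment. The export paths and the storage/management subnet are hypothetical; adjust them to the pod's actual layout and restrict access to the storage/management network.

```
# /etc/exports on the pod's primary storage NFS server (illustrative)
/export/primary/pool1  10.52.4.0/24(rw,async,no_root_squash)
/export/primary/pool2  10.52.4.0/24(rw,async,no_root_squash)
```

After editing, `exportfs -a` applies the changes; `async` trades durability for throughput, so some operators prefer `sync` for primary storage.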
Availability Zone(s): 1 (a minimum of two availability zones is always recommended)
ZONE-01
Network Mode: Basic (L3 network model). This zone has 200 PODs with two clusters in each POD. The configuration below is specified for one cluster and can be replicated for every other cluster in all PODs.
\ZONE-01 \PODS\ <Z-01-POD01-Xen-CL01-04>
Name of Cluster(s): Z-01-POD01-Xen-CL01, Z-01-POD01-Xen-CL02, Z-01-POD02-Xen-CL03, Z-01-POD02-Xen-CL04
These POD and cluster names are specific to this implementation.
Number of Hypervisors (compute nodes) per Cluster: 8 x XenServer 6.0.x

Storage Infrastructure
Type / Make: NetApp FAS3270
Number of Controllers: 2 (two controllers for availability)
Primary Protocol: NFS
Available Capacity: 20 TB/pod. This is an example; calculate the capacity requirement for primary and secondary storage using the formulas mentioned in the section above.
Primary Storage (two per cluster)
Z-01-POD01-CL (replicate this for every cluster)
Availability Zone: ZONE-01 (zone/POD/cluster names should be treated as examples)
Pod: Z-01-POD01
Cluster: Z-01-POD01-CL01
Protocol: NFS
Size: 4 TB. Refer to the computation in the section above to determine the exact size.
In this section we describe how to design and configure a 64-node traditional server virtualization availability zone. The
availability zone comprises 4 pods of 16 nodes each. Unlike the Cloud-era setup, where each pod has its own
NFS servers, the entire zone shares a centralized storage server over a SAN. The availability zone is connected to 4 shared
VLANs: public, DMZ, test-dev, and production. In addition, tenants can be allocated isolated VLANs from a pool of zone
VLANs. A VM can be connected to one or more of these networks:
An isolated VLAN NAT’ed to public internet via the virtual router
The DMZ VLAN
The test-dev VLAN
The production VLAN
The following figure illustrates the physical network setup for a traditional availability zone:
Every host is connected to 3 networks:
1. A storage network that connects the host to primary storage. Storage multipath technology should be used to ensure reliability.
2. An untagged Ethernet network used for management and vMotion traffic. NIC bonding should be used to ensure reliability.
3. An Ethernet network used for shared and public VLAN traffic. NIC bonding should be used to ensure reliability. This network is used to carry 4 shared VLANs: public, DMZ, test-dev, and production. It is also used to carry the isolated zone VLANs.
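On a RHEL-style host, the NIC bonding called for in items 2 and 3 can be sketched with the fragments below. XenServer and vSphere manage bonds through their own toolstacks, and the device names and active-backup mode here are illustrative assumptions:

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 -- illustrative RHEL 6 bond
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- enslave each physical NIC
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
```

Active-backup needs no switch support; LACP (mode=802.3ad) is an alternative where the TOR switch is configured for it, though item 5 of the Cloud-era design above deliberately avoids link aggregation at that layer.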