NETE4631: Managing the Cloud and Capacity Planning
Lecture Notes #8
Lecture Outline
- Managing the cloud
  - Administrating the cloud
  - Managing responsibilities
  - Lifecycle management
  - Emerging cloud management standards
- Capacity planning
  - Steps for capacity planner
  - Scenario
  - Load testing
  - Resource ceiling
  - Scaling
Administrating the Cloud
Network management systems are often described by FCAPS (ISO):
- Fault
- Configuration
- Accounting
- Performance
- Security
Fundamental features
- Administrating, configuring, and provisioning resources
- Enforcing security policy
- Monitoring operations
- Optimizing performance
- Policy management
- Performance maintenance, etc.
Administrating the Cloud (2)
Network management framework tools
- BMC ProactiveNet Performance Management
- HP OpenView / HP Manager products
- IBM Tivoli Service Automation Manager
- CA (Computer Associates) Unicenter
- Microsoft System Center
Administrating the Cloud (3)
Management Responsibilities
What is different from traditional network management?
Cloud characteristics
- Billing is on a pay-as-you-go basis.
- The management service is extremely scalable.
- The management service is ubiquitous.
- Communication between the cloud and other systems uses cloud networking standards.
The type of cloud affects which monitoring tools can be used
- Level of control over aspects of operations: IaaS > PaaS > SaaS
Management Responsibilities by service model types
What Should Be Monitored in the Cloud?
- End-user services such as HTTP, TCP, POP3/SMTP, etc.
- Browser performance on the client
- Application monitoring in the cloud, such as Apache, MySQL, and so on
- Cloud infrastructure monitoring of services such as Amazon Web Services
- Machine instance monitoring, where the service measures processor utilization, memory usage, disk consumption, queue lengths, etc.
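As a minimal sketch of machine instance monitoring, the Python below checks one sample of metrics against alert thresholds. The metric names, the threshold values, and the `find_alerts` helper are hypothetical and chosen only for illustration; a real monitoring service would collect the samples itself and use limits suited to the workload.

```python
# Hypothetical alert thresholds; real limits depend on the workload.
THRESHOLDS = {
    "cpu_percent": 85.0,
    "memory_percent": 90.0,
    "disk_percent": 80.0,
    "queue_length": 10.0,
}


def find_alerts(sample, thresholds=THRESHOLDS):
    """Return the metrics in `sample` that exceed their threshold.

    The result maps metric name to a (value, limit) pair.
    Metrics without a configured threshold are ignored.
    """
    alerts = {}
    for name, value in sample.items():
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            alerts[name] = (value, limit)
    return alerts


if __name__ == "__main__":
    # Illustrative sample, not real measurements.
    sample = {
        "cpu_percent": 92.5,
        "memory_percent": 40.0,
        "disk_percent": 81.0,
        "queue_length": 3,
    }
    for name, (value, limit) in find_alerts(sample).items():
        print(f"ALERT {name}: {value} exceeds {limit}")
```

Keeping the thresholds in a plain dictionary makes it easy to apply different limits per instance type.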
Lifecycle Management
Six different stages in the lifecycle
1. The definition of the service as a template for creating instances
2. Client interactions with the service, usually through an SLA (Service Level Agreement)
3. The deployment of an instance to the cloud and the runtime management of instances
4. The definition of the attributes of the service while in operation, and the modification of its properties
5. Management of the operation of instances and routine maintenance
6. Retirement of the service
Cloud Management Products
Very young industry
- List of products: see Chapter 11 of the course book
Core management features
- Support of different cloud types
- Creation and provisioning of different types of cloud resources, such as machine instances, storage, or staged applications
- Performance reporting, including availability and uptime, response time, and resource quota usage
- The creation of dashboards that can be customized for a particular client’s needs
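The availability figure in a performance report can be sketched as a simple ratio of uptime to total time. The function names and the 30-day reporting period below are assumptions made for illustration; actual products may define availability differently (for example, by excluding planned maintenance).

```python
def availability_percent(total_minutes, downtime_minutes):
    """Availability as the percentage of the period the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must lie between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def allowed_downtime_minutes(total_minutes, target_percent):
    """Downtime budget implied by an availability target over the period."""
    return total_minutes * (1 - target_percent / 100.0)


if __name__ == "__main__":
    # Assumed reporting period: 30 days, expressed in minutes.
    period = 30 * 24 * 60
    print(f"availability: {availability_percent(period, 43.2):.3f}%")
    print(f"budget at 99.9%: {allowed_downtime_minutes(period, 99.9):.1f} min")
```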
Example - CloudKick
www.cloudkick.com
Emerging Cloud Management Standards
Distributed Management Task Force (DMTF)
- An industry organization that develops system management standards for platform interoperability
- Created a working group to help develop interoperability standards for managing transactions between and within public, private, and hybrid cloud systems
- The working group describes resource management and security protocols, packaging methods, and network management technologies
Emerging Cloud Management Standards (2)
Cloud Commons
- Initiated by CA and donated to the Software Engineering Institute (SEI), CMU, USA
- Establishes cloud-based metrics for file creation and deletion, email availability, console response time, and storage and database benchmarks
- Uses a dashboard called CloudSensor to monitor cloud-based services in real time
Capacity Planning
Capacity planning
- Matches demand to available resources
- Identifies critical resources that have a resource ceiling, and adds more resources to remove the bottleneck under higher demand
- Does not focus on performance tuning or optimization
Steps for Capacity Planner
Iterative process with the following steps:
1. Examine what systems are in place (their characteristics)
2. Measure their workload for the different resources in the system: CPU, RAM, disk, network, and so forth
3. Load the system until it is overloaded, determine when it breaks, and specify what is required to maintain acceptable performance and what factors are responsible for the failure (resource ceiling)
4. Determine usage patterns and predict future demand
5. Add or tear down resources to meet demand
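The last two steps (predict demand, then size resources) can be sketched as follows. This is one possible approach under stated assumptions, not the method prescribed by the course book: demand is assumed to follow a straight-line trend, each server is assumed to have a known capacity, and the function names, sample history, and headroom value are hypothetical.

```python
import math


def forecast_linear(history, periods_ahead):
    """Extrapolate demand with a least-squares straight line.

    `history` holds one demand value per past period (oldest first).
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


def servers_needed(demand, capacity_per_server, headroom=0.25):
    """Servers required to carry `demand` plus a safety margin (at least 1)."""
    return max(1, math.ceil(demand * (1 + headroom) / capacity_per_server))


if __name__ == "__main__":
    # Illustrative monthly peak demand in hits/s, not real data.
    history = [100, 120, 140, 160]
    predicted = forecast_linear(history, periods_ahead=2)
    print(f"predicted demand: {predicted:.0f} hits/s")
    print(f"servers needed: {servers_needed(predicted, capacity_per_server=60)}")
```

Because the process is iterative, the forecast would be recomputed each time new workload measurements arrive.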
Scenario
Example (LAMP)
- The capacity planner works with a system that has a website running on Apache
- The site also processes database transactions (MySQL)
Application-level metrics
- Page views (hits/s)
- Transactions (trans/s)
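One way to obtain the hits/s metric is to count requests per second in the web server's access log. The sketch below assumes the log uses the Apache Common Log Format timestamp; the sample lines and the helper names are made up for illustration.

```python
import re
from collections import Counter

# Timestamp field of the Apache Common Log Format,
# e.g. [10/Oct/2011:13:55:36 +0700]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def hits_per_second(log_lines):
    """Count requests for each one-second timestamp found in the log."""
    counts = Counter()
    for line in log_lines:
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def peak_and_average(counts):
    """Peak hits/s and the average over seconds that saw at least one hit."""
    if not counts:
        return 0, 0.0
    return max(counts.values()), sum(counts.values()) / len(counts)


if __name__ == "__main__":
    # Made-up log lines for illustration.
    sample = [
        '10.0.0.1 - - [10/Oct/2011:13:55:36 +0700] "GET / HTTP/1.1" 200 2326',
        '10.0.0.2 - - [10/Oct/2011:13:55:36 +0700] "GET /a HTTP/1.1" 200 512',
        '10.0.0.3 - - [10/Oct/2011:13:55:37 +0700] "GET /b HTTP/1.1" 404 209',
    ]
    peak, average = peak_and_average(hits_per_second(sample))
    print(f"peak={peak} hits/s, average={average:.2f} hits/s")
```

Note that idle seconds do not appear in the log, so the average here is taken over active seconds only.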
Scenario (2)
System-level metrics
- What each system is capable of
- How the resources of such a system affect system-level performance
Example: a machine instance (physical or virtual)
- CPU
- Memory (RAM)
- Disk
- Network connectivity
Measured by tools such as the sar command, Microsoft Task Manager, or RRDTool for Linux
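For a quick look at a few of these system-level metrics without the tools above, a snapshot can be taken with the Python standard library alone. This is only a rough sketch: the `system_snapshot` name is hypothetical, the load average is available only on Unix-like systems, and memory and network figures are left out because the standard library has no portable call for them (third-party packages such as psutil are commonly used for those).

```python
import os
import shutil


def system_snapshot(path="/"):
    """Collect a few system-level metrics for the machine running this code."""
    usage = shutil.disk_usage(path)
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_used_percent": (
            100.0 * usage.used / usage.total if usage.total else 0.0
        ),
    }
    # os.getloadavg() exists only on Unix-like systems and may be unavailable.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_average_1min"] = os.getloadavg()[0]
        except OSError:
            pass
    return snapshot


if __name__ == "__main__":
    for name, value in system_snapshot().items():
        print(f"{name}: {value}")
```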
RRDTool
Load Testing
Load testing seeks to answer the following questions:
- What is the maximum load that my current system can support?
- Which resources represent the bottleneck in the current system that limits the system’s performance? (resource ceiling)
- Can I alter the configuration of my server in order to increase capacity?
- How does this server’s performance relate to that of my other servers that might have different characteristics?
Tools
- HTTPerf, Siege, Autobench, IBM Rational Performance Tester, HP LoadRunner, JMeter, OpenSTA
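Once a load-testing tool has produced measurements, the first question (maximum supported load) can be answered with a small analysis step. The sketch below assumes the results are available as (load, response time) pairs and that "acceptable performance" means staying under a chosen latency limit; the sample numbers, the limit, and the `max_supported_load` name are hypothetical.

```python
def max_supported_load(samples, max_latency_ms):
    """Highest tested load before response time first exceeds the limit.

    `samples` is an iterable of (load, latency_ms) pairs from a load test.
    Returns None if even the lightest tested load is too slow.
    """
    supported = None
    for load, latency_ms in sorted(samples):
        if latency_ms > max_latency_ms:
            break  # ceiling reached; heavier loads are not considered
        supported = load
    return supported


if __name__ == "__main__":
    # Illustrative results: (concurrent users, mean response time in ms).
    results = [(10, 50), (20, 60), (40, 120), (80, 900)]
    print(max_supported_load(results, max_latency_ms=200))
```

The load at which the limit is first exceeded marks the region where the resource ceiling should be investigated.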
Resource Ceiling (1)
Resource Ceiling (2)
Network Capacity
Three aspects to assessing network capacity:
- Network traffic to and from the network interface at the server (physical or virtual)
  - Measured with system utilities (I/O) and a network monitor (traffic)
- Network traffic from the cloud to the network interface
  - Tools such as those from Apparent Networks
- Network traffic from the cloud through your ISP to your local network interface
  - The connection from the backbone to your computer (through the ISP)
Scaling
Scale vertically (scale up)
- Add resources to a system to make it more powerful
- A virtual system can run more virtual machines (operating system instances), with more RAM and faster compute times
- Example: rendering or memory-limited apps
Scale horizontally (scale out)
- Add more nodes to remove an I/O bottleneck
- Easy to pool resources and partition
- Example: web server apps
Scaling Comparison
Cost
- Scaling up costs more than scaling out.
Maintenance
- Scaling out increases the number of systems you must manage.
Communication
- Scaling out increases the amount of communication between systems.
- Scaling out introduces additional latency to your system.
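To get a feel for how communication grows when scaling out, the sketch below counts the possible node-to-node links. It assumes a full mesh in which every node may talk to every other node; that is an assumption for illustration and not something the notes state, and many real deployments route through a load balancer instead.

```python
def pairwise_links(nodes):
    """Number of distinct node pairs in a full mesh of `nodes` systems."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n} nodes -> {pairwise_links(n)} possible links")
```

Under this assumption the number of links grows roughly with the square of the node count, while the number of systems to manage grows only linearly.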
References
Chapters 6 and 11 of the course book: Cloud Computing Bible, 2011, Wiley Publishing Inc.