Tips and Tricks for Capacity Risk Assessment, Rightsizing and Planning
Kevin Denton, Gilead Sciences
Jim Medeiros, VMware
Monica Sharma, VMware
VCM4992 #VCM4992
Jan 28, 2018
Agenda: Tips and Tricks for vSphere Capacity Planning
• Gilead’s Advantage
• vC Ops – Overview
• Monitor & Analyze
• Right-Size VMs
• Improve Utilization
• Conclusion
Gilead - Overview
Gilead Sciences
• Growing, innovative leader in research-based biopharmaceuticals
• Focus areas: HIV/AIDS, hepatitis, cancer, respiratory & cardiovascular conditions
Goals
• Robust capacity planning based on tangible data
• Forecast growth to know what capacity is needed
Gilead’s Challenges & Needs
Challenges
• No adequate capacity planning (yearly fire drill)
• No understanding of current utilization
• No way to do adequate forecasting
Criteria for an Operations Management Solution
• Drop & play: easy setup & management
• Provides capabilities for showing utilization, capacity management, change management and forecasting
Agenda: Tips and Tricks for vSphere Capacity Planning
vC Ops – Overview
• Today & Roadmap
• Get Right Metrics
• Tune Policies
• Pick Your Visuals
Capacity Planning in vCenter Operations – Today
Do I have any capacity risk?
• Benefit: Ensure performance SLAs
• Description: Policy-driven capacity views/dashboards
Can I improve utilization?
• Benefit: Increase utilization & realize savings
• Description: Optimization & rightsizing recommendations
Do I have enough?
• Benefit: Plan better by what-if modeling
• Description: Modeling of how many VMs can fit & whether I have enough
Capacity Planning in vCenter Operations – Roadmap
Capacity beyond vSphere
• Benefit: Manage capacity across SDDC & hybrid
• Description: Extensible capacity models beyond virtual
Future-proof Forecast
• Benefit: Forecast accurately
• Description: Save, reserve future projects, plan deficit
Automate Recommendations
• Benefit: Optimize utilization
• Description: Policy-driven, automated recommendations; Custom Report Builder
Get the Right Metrics
Example: SQL VM
• 16 GB: Total allocated capacity (Allocation: the amount of a resource that the user configures)
• Buffer
• The most a VM can get (Entitlement)
• 10 GB: What the VM wants (Demand)
• 8 GB: What the VM got (Usage)
• 2 GB: What the VM did not get (Contention)
Demand is what the VM wants: the physical resources an object might consume without constraints
Demand = Usage (what the VM gets) + Contention (what the VM does not get)
① Use Demand for capacity & performance
• If Demand > Entitlement, the VM may have performance issues and may be undersized (‘Stressed’)
• Use Demand vs Consumed for Memory
② Check time resolution: don’t use a one-time peak for planning; use a rolled-up average over time
③ Use BOTH Allocation & Demand models
• Use the Allocation model to create a safe top line, e.g. fill VMs until the cluster is at 200%, then add a new host
• Use the Demand model in conjunction to catch unexpected bursts/peaks and prevent waste
④ Compare actual demand vs. allocation
• To assess performance risk
• To show optimization potential & savings
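The relationship between the metrics above can be illustrated with a minimal Python sketch using the SQL VM figures from this slide. The class and field names are hypothetical, and the 12 GB entitlement is an assumed value (the slide does not give one); this is not how vC Ops itself stores or computes these metrics.

```python
from dataclasses import dataclass


@dataclass
class VmMemorySample:
    """One observation for a VM, in GB. Names are illustrative only."""
    allocated_gb: float    # what the user configured (Allocation)
    entitlement_gb: float  # the most the VM can get (assumed value below)
    usage_gb: float        # what the VM got
    contention_gb: float   # what the VM did not get

    @property
    def demand_gb(self) -> float:
        # Demand = Usage + Contention
        return self.usage_gb + self.contention_gb

    def is_stressed(self) -> bool:
        # Demand above entitlement suggests performance risk or undersizing.
        return self.demand_gb > self.entitlement_gb

    def unused_allocation_gb(self) -> float:
        # Allocation minus demand hints at optimization potential.
        return max(0.0, self.allocated_gb - self.demand_gb)


# SQL VM from the slide; entitlement_gb=12.0 is a made-up figure.
sql_vm = VmMemorySample(allocated_gb=16.0, entitlement_gb=12.0,
                        usage_gb=8.0, contention_gb=2.0)
print(sql_vm.demand_gb)               # 10.0
print(sql_vm.is_stressed())           # False
print(sql_vm.unused_allocation_gb())  # 6.0
```

Comparing demand with both entitlement and allocation in one place mirrors tips ① and ④: the first comparison flags risk, the second shows potential savings.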
Translate Your Operational ‘Knobs’ to vC Ops Policies
How would you like to manage capacity risk? What are your goals to optimize your environment?
• PRODUCTION: performance, ignore waste, safe
• TEST-DEV: higher utilization, higher density
Configure out-of-box policies: Production, Test-Dev, UAT, IT-Apps, etc.
Pick Your Visuals
Out of box
• vSphere Dashboard
• Planning Views
• Canned Reports
Custom
• Custom Templates
• Custom Heatmaps
• Custom Dashboards
Resources available for you
1. VMworld slides from the VMworld site
2. Custom Dashboards from the VMware Management Blog, Tech Tips
Agenda: Tips & Tricks to Analyze Demand, Utilization & Risk
Monitor & Analyze
• VM Growth
• Infra Burn Rate
• Capacity Risk
Audience Poll Question
How many of you have been tasked to monitor infrastructure utilization & risk?
What Has Been My VM Growth Trend?
Views: vC Ops vSphere UI -> Planning -> VM Capacity View; vC Ops Custom UI -> VM Count & Trend – by Cluster
① Metrics: Use Total/Powered-on VM count
② Visuals: Forecast trend to view risk
③ View growth by cluster, LOB, geo, etc.
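To make "forecast trend to view risk" concrete, here is a small Python sketch that fits a straight line to VM counts and estimates when a limit would be reached. It assumes evenly spaced samples and a simple least-squares trend; the function names and the sample counts are hypothetical, and the forecasting method vC Ops actually uses is not described in these slides.

```python
def linear_trend(counts):
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(counts, limit):
    """Periods after the last sample until the trend reaches `limit`.

    Returns None when the trend is flat or shrinking.
    """
    slope, intercept = linear_trend(counts)
    if slope <= 0:
        return None
    last_x = len(counts) - 1
    return max(0.0, (limit - intercept) / slope - last_x)


# Hypothetical powered-on VM counts for one cluster, one value per month.
monthly_vm_counts = [100, 110, 120, 130, 140]
slope, intercept = linear_trend(monthly_vm_counts)
print(f"growth per month: {slope:.1f}")
print(f"months until 200 VMs: {periods_until(monthly_vm_counts, 200):.1f}")
```

Running the same calculation per cluster, LOB or geo gives the breakdown suggested in ③.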
What Has Been My Infrastructure Utilization Trend?
① Metrics: Use Usable Capacity vs. Total Capacity for planning decisions (includes buffers)
② Visuals: Break down by cluster to view actual demand by cluster
How Well Is My Infrastructure Utilized Today?
Metrics: VM Count, Usable Capacity
① Used, remaining?
② Stressed clusters with a high count of VMs
③ Under-utilized clusters: fill or consolidate
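The "used, remaining?" question can be sketched in a few lines of Python. The sketch assumes usable capacity is total capacity minus a fractional buffer, which is one plausible reading of "includes buffers"; the function names, the 20% buffer and the cluster figures are all hypothetical.

```python
def usable_capacity(total, buffer_fraction):
    """Capacity left for workloads once the buffer is set aside (assumed model)."""
    if not 0.0 <= buffer_fraction < 1.0:
        raise ValueError("buffer_fraction must be in [0, 1)")
    return total * (1.0 - buffer_fraction)


def capacity_summary(total, buffer_fraction, demand):
    """Used and remaining capacity, measured against usable (not total) capacity."""
    usable = usable_capacity(total, buffer_fraction)
    remaining = usable - demand
    return {
        "usable": usable,
        "used_pct": 100.0 * demand / usable,
        "remaining": remaining,
        "remaining_pct": 100.0 * remaining / usable,
    }


# Hypothetical cluster: 1000 GB of memory, 20% buffer, 600 GB of demand.
summary = capacity_summary(total=1000.0, buffer_fraction=0.20, demand=600.0)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```

Measured against total capacity the same cluster would look 60% used; against usable capacity it is 75% used, which is why the choice of denominator matters for planning.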
Which Clusters Are at Capacity Risk & Why?
① Which clusters are at capacity risk?
② Why?
- Out of capacity?
- Will run out soon?
- Under-sized?
- VM:Host ratio
③ Compare actual demand to allocation
Assess Risk Based on Your Policy
① Identify & apply out-of-box policies
• By environment, to manage risk
  - Production Policy
  - Test-Dev Policy
• By workload type, for right-sizing
  - Ignore objects
  - Batch Workloads
  - Interactive/Server Workloads
  - Optimized for 15/30 min SLA
② Translate your knobs to policies
• Allocation and Demand model
• Over-commit ratios (CPU, Mem)
• Thresholds for capacity risk
• Buffers
• Business hours
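One way to picture the knobs listed above is as a small policy record per environment. The Python sketch below is only an illustration: the `CapacityPolicy` type, its field names and every numeric value are hypothetical and do not reflect the actual vC Ops policy schema or its defaults. It shows how the "business hours" and "threshold" knobs could change a risk verdict for the same data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityPolicy:
    """Illustrative policy record; not the vC Ops policy format."""
    name: str
    use_allocation_model: bool
    use_demand_model: bool
    cpu_overcommit: float   # vCPU : physical CPU
    mem_overcommit: float   # configured : physical memory
    risk_threshold: float   # fraction of capacity that counts as "at risk"
    buffer_fraction: float
    business_hours: range   # hours of the day that count toward the analysis


# All values below are made up for the example.
PRODUCTION = CapacityPolicy("Production", True, True, 2.0, 1.0, 0.80, 0.20, range(9, 17))
TEST_DEV = CapacityPolicy("Test-Dev", True, True, 4.0, 1.5, 0.90, 0.10, range(0, 24))


def business_hours_average(samples, policy):
    """Average (hour_of_day, demand_fraction) samples inside business hours."""
    values = [value for hour, value in samples if hour in policy.business_hours]
    if not values:
        raise ValueError("no samples fall inside the policy's business hours")
    return sum(values) / len(values)


def at_capacity_risk(samples, policy):
    return business_hours_average(samples, policy) >= policy.risk_threshold


samples = [(3, 0.10), (10, 0.85), (14, 0.95), (22, 0.20)]
print(at_capacity_risk(samples, PRODUCTION))  # only the 10:00 and 14:00 samples count
print(at_capacity_risk(samples, TEST_DEV))    # all four samples count
```

The same demand data is flagged under the stricter production settings and passes under the test-dev settings, which is the point of tuning policies per environment.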
What Do These Settings Impact & When?
① Dashboard: Time Remaining & Capacity Remaining are calculated daily
② Planning Views: the Capacity Risk Details view updates in real time
Which Datastores Are at Capacity Risk & Why?
• Datastores at capacity risk are color coded
• Which VMs are causing the most waste?
Which Top N VMs Are at Capacity Risk & Why?
• VMs out of capacity?
• Undersized VMs?
• VMs out of guest FS?
• VMs running out of capacity soon?
Agenda: Tips and Tricks for Right-Sizing
Right-Size VMs
Tips for Right Sizing VMs
① More vCPUs can actually slow down a VM
• Table for 2: “Just a minute, please”
• Table for 10: 20 minutes
② Trend the CPU Usage | Co-stop metric when usage is low but demand is high
How Do Right Sizing Analytics Work?
Area-based stress analysis
[Chart: % Demand over Time against Current Capacity, with a Stress % Threshold at 70%]
• Moments of stress are summed up as a % of the stress zone area
• If Stress > 1%, show in the under-sized VM list
• A VM is considered undersized/stressed when the amount of CPU demand peaking above 70% is more than 1% of any 1 hour
[Chart: % Demand over Time against Current Capacity, with a Waste % Threshold]
• Moments of waste are summed up as a % of the waste zone area
• If Waste > 99%, show in list
• A VM is considered oversized when the amount of CPU demand below 30% is more than 1% of the entire range (30 days)
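The area-based stress idea can be sketched in Python as follows. This is one interpretation, not the vC Ops implementation: it assumes evenly spaced 5-minute demand samples expressed as a percentage of capacity, treats the stress zone as the band between the threshold and 100%, and checks every run of 12 consecutive samples as "any 1 hour". The function names and sample data are hypothetical.

```python
def stress_percent(demand_pct, threshold=70.0, capacity=100.0):
    """Area of demand above `threshold`, as a % of the stress zone area."""
    if not demand_pct:
        raise ValueError("no samples")
    zone_height = capacity - threshold
    # Clamp each sample's excess to the zone so demand above capacity
    # cannot push the result over 100%.
    over = sum(min(max(d - threshold, 0.0), zone_height) for d in demand_pct)
    return 100.0 * over / (zone_height * len(demand_pct))


def worst_window_stress(demand_pct, window, threshold=70.0):
    """Highest stress % over any run of `window` consecutive samples."""
    if len(demand_pct) < window:
        return stress_percent(demand_pct, threshold)
    return max(
        stress_percent(demand_pct[i:i + window], threshold)
        for i in range(len(demand_pct) - window + 1)
    )


def is_undersized(demand_pct, window=12, threshold=70.0, stress_limit=1.0):
    # 12 five-minute samples make up one hour.
    return worst_window_stress(demand_pct, window, threshold) > stress_limit


quiet = [40.0] * 24
bursty = [40.0] * 20 + [85.0, 100.0, 85.0, 40.0]
print(is_undersized(quiet))   # False
print(is_undersized(bursty))  # True
```

A waste check could mirror this by summing the area below the waste threshold over the entire range instead of a sliding window.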
Step 1: Identify Over/Under Sized VMs/Hosts
① Under Planning Views
• Over/Under-sized VMs
• Under-utilized/Stressed Clusters
Step 2: Profile Workload & Apply Policy
[Chart: 5-minute CPU demand average, from vSphere UI -> Operations -> All Metrics]
Server Workload Profile:
• E.g. Exchange, AD, Citrix
• 9-5 usage pattern
• Account for many micro-bursts in an hour
Interactive Workload Profile:
• E.g. web servers
• Constantly busy
① Apply “Interactive Policy”
② (Optional) Tune settings to catch peaks
• Enable “Stress”
• Use buffers for erratic peaks
• Set sliding window = 1 hour
Step 2: Profile Batch Workload Type & Apply Policy
[Chart: 5-minute CPU demand average]
Batch Workload Profile:
• E.g. month end, backup
• Busy only for small bursts, idle most of the time; peak higher than average
• Ensure sized for when it needs resources (4 hr SLA)
① Apply “Batch Workload Policy”
② (Optional) Tune settings:
• Narrow down business period
• Set “sliding window” for expected duration
③ If VM is idle for 28 days, it will NOT be considered over-sized
Step 3: Report Wasteful VMs with Usage Trends
• Top N over-sized VMs
• Top N by memory, with memory demand usage trend
• Top N by CPU, with CPU demand usage trend
Agenda: Tips and Tricks to Improve Utilization
Improve Utilization
• Reclaim Waste
• Consolidate, Right-Size
• Over-Commit
Audience Poll Question
“How many of you over-commit memory in test-dev but not in production?”
Decide on Your Optimization Phases
① Phase 1: Reclaim Waste (20-50%)
• Idle VMs
• Powered-off VMs
② Phase 2: Increase Utilization (20%)
• Consolidate under-utilized clusters
• Right-size over-sized VMs
③ Phase 3: Increase Over-Commit or Density ‘safely’ (15%)
• Assess potential density without performance risk
Phase 1: Reclaim Unused Resources (Waste)
① View wasteful VMs breakdown (Dashboard)
② Identify list of idle and powered-off VMs in Planning Views/Reports
Phase 2: Consolidate Clusters
① Identify under-utilized clusters to consolidate
② Run what-if scenario
• Select VMs from the under-utilized cluster
• Model if they will fit in the target cluster
③ How many small, medium and large VMs can fit in the target cluster?
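The what-if questions above can be approximated with a back-of-the-envelope Python sketch. It assumes a fit is decided purely by free CPU and memory in the target cluster; the `VmProfile` type, the small/medium/large sizes and the free-capacity figures are all hypothetical, and the what-if modeling in vC Ops may take more factors into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmProfile:
    """Illustrative VM size; the numbers used below are made up."""
    name: str
    cpu_ghz: float
    mem_gb: float


def how_many_fit(profile, free_cpu_ghz, free_mem_gb):
    """VMs of one profile that fit; the scarcer resource decides."""
    return int(min(free_cpu_ghz // profile.cpu_ghz,
                   free_mem_gb // profile.mem_gb))


def will_fit(vms, free_cpu_ghz, free_mem_gb):
    """True if the combined demand of `vms` fits in the free capacity."""
    return (sum(v.cpu_ghz for v in vms) <= free_cpu_ghz
            and sum(v.mem_gb for v in vms) <= free_mem_gb)


SMALL = VmProfile("small", 1.0, 2.0)
MEDIUM = VmProfile("medium", 2.0, 8.0)
LARGE = VmProfile("large", 4.0, 16.0)

# Hypothetical free capacity in the target cluster.
free_cpu, free_mem = 40.0, 128.0
for profile in (SMALL, MEDIUM, LARGE):
    print(profile.name, how_many_fit(profile, free_cpu, free_mem))

# Would 10 medium and 2 large VMs from the under-utilized cluster fit?
print(will_fit([MEDIUM] * 10 + [LARGE] * 2, free_cpu, free_mem))
```

In this example the small profile is limited by CPU while the medium and large profiles are limited by memory, which shows why both resources need to be modeled.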
Phase 3: Increase Over-commit Safely
① (Dashboard) Identify optimal consolidation ratios (based on ‘Demand’)
② Increase over-commit
• Use the allocation model for memory risk management
• Increase memory over-commit by 5-15% and observe
• Set this in the Policy Settings 3c
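The arithmetic behind a 5-15% over-commit increase can be sketched in Python as below. The helper names and the cluster figures are hypothetical, and the sketch only covers the allocation-model side (configured memory versus physical memory); it says nothing about how the setting is stored in the policy.

```python
def overcommit_ratio(allocated_gb, physical_gb):
    """Configured memory across VMs divided by physical memory."""
    if physical_gb <= 0:
        raise ValueError("physical_gb must be positive")
    return sum(allocated_gb) / physical_gb


def raised_ratio(current_ratio, step_percent):
    """Raise the over-commit ratio by a step within the 5-15% range."""
    if not 5 <= step_percent <= 15:
        raise ValueError("step_percent should stay within 5-15")
    return current_ratio * (1 + step_percent / 100)


def allocation_headroom_gb(allocated_gb, physical_gb, ratio):
    """Extra memory that can still be allocated under the given ratio."""
    return physical_gb * ratio - sum(allocated_gb)


# Hypothetical cluster: 512 GB physical memory, 32 VMs with 16 GB each.
allocated = [16.0] * 32
current = overcommit_ratio(allocated, 512.0)
target = raised_ratio(current, 10)
print(f"current ratio: {current:.2f}")
print(f"target ratio:  {target:.2f}")
print(f"new headroom:  {allocation_headroom_gb(allocated, 512.0, target):.1f} GB")
```

Raising the ratio in small steps and watching demand and contention after each step matches the "increase and observe" advice on this slide.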
Conclusion & Takeaways
vCenter Operations Manager enables you to improve your existing process to analyze, optimize & model future capacity needs
Gilead’s Advantage with vCenter Operations Manager
• Realized value within 3 months in production with vCenter Operations
• Identified reclamation opportunities to realize savings
• Got improved insights to plan purchases for future growth
• Gained more visibility into workloads to maintain performance & availability
Other VMware Activities Related to This Session
HOL:
• HOL-SDC-1301: Applied Cloud Operations
• HOL-SDC-1304: vSphere Performance Optimization
THANK YOU