Hotspot mitigation for the masses
Fabien Hermenier, Aditya Ramesh, Abhinay Nagpal, Himanshu Nagpal, Ramesh Chandra
ACM Symposium On Cloud Computing 2019
Enterprise cloud company
~15,000 customers worldwide
~40,000 private cloud deployments
(and we are recruiting)
Private clouds
From converged to hyper-converged infrastructures (HCI)
Converged: SAN based, remote I/Os
HCI: distributed file system favouring local I/Os, one controller VM per node
602 private clouds
~4-node clusters, 13 VMs per node, long-tail distribution
~1.31:1 vCPU/thread, up to 9:1
~25% CPU and ~2% I/Os (dynamic allocation), ~44% memory (static allocation)
small clusters and beefy nodes fit SMB needs
oversubscribed cores
moderate load
no relationship between dimensions
see the distributions in the paper
Fix hotspots induced by dynamic resource allocation
Cron based, threshold based
NP-hard, no holy grail
Scheduler specialisation may alter its applicability
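To make the "threshold based" trigger concrete, here is a minimal sketch of hotspot detection. The 85% threshold, the `Node` structure and the sample values are assumptions made for illustration only; the slides do not give the values ADS uses.

```python
from dataclasses import dataclass

# Hypothetical threshold, not taken from the talk.
CPU_THRESHOLD = 0.85


@dataclass
class Node:
    name: str
    cpu_capacity: float   # any consistent unit, e.g. GHz
    vm_cpu_usage: dict    # VM name -> observed CPU usage, same unit


def node_load(node: Node) -> float:
    """Fraction of the node CPU capacity currently consumed by its VMs."""
    return sum(node.vm_cpu_usage.values()) / node.cpu_capacity


def find_hotspots(nodes, threshold=CPU_THRESHOLD):
    """Return the names of the nodes whose CPU load exceeds the threshold."""
    return [n.name for n in nodes if node_load(n) > threshold]


# Illustrative cluster: n1 is 90% loaded, n2 is 20% loaded.
nodes = [
    Node("n1", 10.0, {"vm1": 6.0, "vm2": 3.0}),
    Node("n2", 10.0, {"vm3": 2.0}),
]
hotspots = find_hotspots(nodes)
```

A cron-based variant would run the same check periodically; the mitigation itself (choosing which VMs to move) is the NP-hard part.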
[Figure: cpu and mem usage bars, repeated across three panels]
Acropolis Dynamic Scheduler (ADS)
Doing great for the 1%
Doing ok for the 99%
Established in 2009
Tech unicorn in 2013
~3,700 employees
Inside ADS
Exact approach on top of BtrPlace
Constraint programming backend to avoid over-filtering
Objective: minimise data movement, tend to balance
Actuation: VM migrations (up to 2 in parallel), admin notification upon no solutions
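The objective above can be illustrated with a toy exact search. This sketch enumerates every placement by brute force instead of using a constraint programming solver, so it is not how BtrPlace or ADS work internally. The capacities, the 80% threshold and the VM sizes are assumptions, and the "tend to balance" part of the objective is not modelled.

```python
from itertools import product

# Hypothetical inputs, chosen for illustration only.
CAPACITY = {"n1": 10, "n2": 10, "n3": 10}   # CPU capacity per node
THRESHOLD = 0.8                             # hotspot threshold
VMS = {  # vm -> (current host, CPU usage, data to move if migrated)
    "vm1": ("n1", 5, 8),
    "vm2": ("n1", 4, 2),
    "vm3": ("n2", 3, 4),
}


def is_viable(placement):
    """True when no node exceeds the hotspot threshold."""
    load = {n: 0 for n in CAPACITY}
    for vm, host in placement.items():
        load[host] += VMS[vm][1]
    return all(load[n] <= THRESHOLD * CAPACITY[n] for n in CAPACITY)


def data_moved(placement):
    """Total data transferred by the migrations the placement implies."""
    return sum(VMS[vm][2] for vm, host in placement.items()
               if host != VMS[vm][0])


def best_plan():
    """Exhaustive search: the viable placement moving the least data."""
    names = list(VMS)
    best = None
    for hosts in product(CAPACITY, repeat=len(names)):
        placement = dict(zip(names, hosts))
        if not is_viable(placement):
            continue
        if best is None or data_moved(placement) < data_moved(best):
            best = placement
    return best  # None means no solution: notify the admin


plan = best_plan()
migrations = {vm: host for vm, host in plan.items() if host != VMS[vm][0]}
```

Here n1 starts at 90% load; the cheapest fix is to migrate vm2, which moves 2 units of data, rather than vm1, which would move 8.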
Lessons learnt
Looking at 2,668 clusters that called ADS at least once
Service latency is good enough
0.5% undecidable problems
Working with an exact approach
Continuous search helps yield better mitigation plans
de-facto sizing limit
Scale beyond sizing limits
[Figure: duration (sec.) of the first solution, last solution and latency at the 90th, 93rd, 94th, 95th and 99th percentiles]
[Figure: CDF of saved migrations (%)]
[Figure: duration (sec.) against cluster size (4, 32, 64, 128, 256 nodes)]
In the paper: engineering particularities
new feature
Optimise to reduce undecidable rate, migrations
Beware of false quick wins
The dataset bias dilemma
Looking for workload agnostic optimisations
Chasing outliers requires trade-offs
Low overall load, local hotspots.
Manage only the supposedly misplaced VMs
Pin the "well placed" VMs
Local search to reduce the problem size
Available in BtrPlace
Enabled in ADS 1.0 during the prototyping phase
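The pinning idea can be sketched as a simple partition of the VMs. The rule used here (every VM on an overloaded node is "supposedly misplaced") and the 80% threshold are assumptions for illustration; the slides do not specify how BtrPlace or ADS select the managed VMs.

```python
def split_vms(placement, usage, capacity, threshold=0.8):
    """Partition VMs into (managed, pinned).

    VMs hosted on an overloaded node are handed to the solver;
    all the others are pinned to their current host.
    """
    load = {n: 0.0 for n in capacity}
    for vm, host in placement.items():
        load[host] += usage[vm]
    hot = {n for n in capacity if load[n] > threshold * capacity[n]}
    managed = sorted(vm for vm, host in placement.items() if host in hot)
    pinned = sorted(vm for vm, host in placement.items() if host not in hot)
    return managed, pinned


# Illustrative cluster: only n1 (load 9 out of 10) is a hotspot.
placement = {"vm1": "n1", "vm2": "n1", "vm3": "n2", "vm4": "n3"}
usage = {"vm1": 5, "vm2": 4, "vm3": 3, "vm4": 1}
capacity = {"n1": 10, "n2": 10, "n3": 10}
managed, pinned = split_vms(placement, usage, capacity)
```

The solver then decides for 2 VMs instead of 4. The risk is that a solution sometimes requires moving a pinned VM to free room, in which case the reduced problem has no solution even though the full one does.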
Local search considered useful and harmful
Over-filtering issues reported
Moved to a 2-phase resolution
[Figure: improvement wrt. full resolution (%), axis from 0 to 60, for latency, migrations and solved problems. Strategies compared: retry without local search on timeout; retry without local search if unsolvable; pure local search. Bar labels as extracted: 62.99, 1.38, 0.3; 34.41, 1.61, 0.22; 32.26, 1.62, −2.34]
Local search enabled, then disabled if needed
Trigger reconsidered over time
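A minimal sketch of the 2-phase resolution follows. The `solve` callable, its parameters and the stand-in solver are hypothetical; this version retries whenever the first phase returns no plan, without distinguishing a timeout from an unsolvable problem.

```python
def two_phase_solve(solve, problem, timeout_s=5.0):
    """Try with local search first, then fall back to the full problem.

    `solve` is a hypothetical callable
    (problem, local_search, timeout_s) -> plan, or None when no plan is found.
    """
    plan = solve(problem, local_search=True, timeout_s=timeout_s)
    if plan is not None:
        return plan, "local search"
    plan = solve(problem, local_search=False, timeout_s=timeout_s)
    return plan, "full resolution"


def fake_solver(problem, local_search, timeout_s):
    """Stand-in solver: fails under local search when a pinned VM must move."""
    if local_search and problem["needs_pinned_vm_to_move"]:
        return None
    return ["migrate vm3 to n3"]


easy = two_phase_solve(fake_solver, {"needs_pinned_vm_to_move": False})
hard = two_phase_solve(fake_solver, {"needs_pinned_vm_to_move": True})
```

Choosing the trigger for the second phase (timeout, unsolvable, or both) is the trade-off that was reconsidered over time.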
Practical effectiveness
How many clusters are in a clean state after a call to ADS?
73.28% if ADS issues a plan
12.24% if unsolvable
Complex to analyse without A/B testing
The success rate is a consequence of subjective modelling choices
Conclusion
It is about supporting diverse workloads
Not all enhancements are safe
Tools and knowledge bases are crucial
Incremental improvements from observation: small wins matter
Trading quality for capability
It is not about developing a new feature, it is about checking its side effects
Exhibit and characterise outliers
It is about preventing regressions
We are hiring: https://www.nutanix.com/careers