Energy-Efficient System Virtualization for Mobile and Embedded Systems Final Review 2014/01/21
Mar 23, 2016
Outline
Project overview
big-LITTLE core architecture
Models
Resource-Guided scheduling
◦Experimental results
Conclusion
What Has Been Done
First half of the year
◦Energy-efficient task scheduling for per-core DVFS architecture
  ◦Offline energy-efficient task scheduling
  ◦Online energy-efficient task scheduling
Second half of the year
◦Energy-efficient task scheduling for big-LITTLE core architecture
Goal of Our big-LITTLE Aware Scheduling
Derive an energy-efficient scheduler for big-LITTLE core architecture that:
◦Satisfies the resource requirement of each task.
◦Minimizes the average power consumption.
big-LITTLE Core Architecture
Developed by ARM in 2011.
Combines two kinds of architecturally compatible processors with different power and performance characteristics.
Three different types
◦1st: Cluster migration
◦2nd: CPU migration / In-Kernel Switcher
◦3rd: Heterogeneous Multi-Processing (HMP)
Type 1: Cluster Migration
Only one cluster, either the big cores or the LITTLE cores, is in use at any time.
Type 2: CPU Migration
Logical CPU: a pair of one big core and one LITTLE core.
Only one of the two cores in a pair is powered up and processing tasks at a time.
Type 3: HMP
All the big and LITTLE cores can be used at the same time.
Building the Power Model
Measure the average power consumption of big and LITTLE cores at different core frequencies under different CPU loads.
Platform: ODROID-XU
◦1st type, cluster migration.
◦Cortex™-A15 and Cortex™-A7.
◦Per-cluster DVFS.
Average Power Consumption
Power Model
The power consumption $P_t$ of an interval $t$ is:
$$P_t = \sum_{i=1}^{n_b} P^b_{i,t} + \sum_{j=1}^{n_L} P^L_{j,t}$$
$n_b$ and $n_L$: the number of big and LITTLE cores.
$P^b_{i,t}$ and $P^L_{j,t}$: the power consumption of big core $i$ and LITTLE core $j$ during interval $t$, where for the $n$-th core of a cluster:
$$P^b_{n,t} = E^b_f \times loading_{n,t}$$
$$P^L_{n,t} = E^L_f \times loading_{n,t}$$
$E^b_f$ and $E^L_f$: the power consumption of a big core and a LITTLE core at frequency $f$ under 100% load.
$loading_{n,t}$: the load of the $n$-th core in interval $t$.
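A minimal Python sketch of the power model, assuming per-cluster DVFS so that all cores in a cluster share one full-load power value E_f. The function names and the numbers in the example are hypothetical and only illustrate the formula; real full-load values would come from the measurements on the platform.

```python
from typing import Sequence


def cluster_power(full_load_power: float, loads: Sequence[float]) -> float:
    """Sum E_f * loading over the powered cores of one cluster.

    full_load_power: power of one core at the current cluster frequency
                     under 100% load (E_f in the model).
    loads:           per-core load in the interval, each between 0.0 and 1.0.
    """
    return sum(full_load_power * load for load in loads)


def interval_power(e_big: float, big_loads: Sequence[float],
                   e_little: float, little_loads: Sequence[float]) -> float:
    """P_t: power of the big cluster plus power of the LITTLE cluster."""
    return cluster_power(e_big, big_loads) + cluster_power(e_little, little_loads)


# Hypothetical numbers: two big cores at 50% and 25% load, one LITTLE core at 100%.
example = interval_power(e_big=2.0, big_loads=[0.5, 0.25],
                         e_little=0.5, little_loads=[1.0])
```

A powered-off cluster is represented by an empty load list, which contributes zero power.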
Task Model
For every $Task_i$ in a scheduling interval, we define:
$$resource_i = CoreFreq_i \times loading_i$$
◦$loading_i$: the percentage of time that $Task_i$ is running on a core in a period of time.
◦$CoreFreq_i$: the current frequency of the core cluster this task is running on.
Task Model (Cont.)
We also define the minimum resource required for $Task_i$ as:
$$res\_req_i = QoSFreq_i \times QoSLoading_i$$
◦$QoSFreq_i$: the minimum core frequency that satisfies the QoS requirement of $Task_i$.
◦$QoSLoading_i$: the CPU load of $Task_i$ while the core frequency is $QoSFreq_i$.
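The two task quantities can be sketched as a small Python class. The class name, field names, and the MHz values are hypothetical; the check `resource >= res_req` is an assumed reading of "the task gets at least its minimum requirement".

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    loading: float      # fraction of the interval the task ran on a core (0.0 to 1.0)
    core_freq: float    # current frequency of the task's cluster (MHz, hypothetical unit)
    qos_freq: float     # minimum frequency that satisfies the task's QoS
    qos_loading: float  # CPU load of the task when running at qos_freq

    @property
    def resource(self) -> float:
        """resource_i = CoreFreq_i * loading_i"""
        return self.core_freq * self.loading

    @property
    def res_req(self) -> float:
        """res_req_i = QoSFreq_i * QoSLoading_i"""
        return self.qos_freq * self.qos_loading

    def is_satisfied(self) -> bool:
        # Assumption: a task is satisfied when it receives at least res_req.
        return self.resource >= self.res_req


# Hypothetical example: the task needs 800 MHz * 0.75 = 600 units of resource.
starved = Task("game", loading=0.9, core_freq=600.0, qos_freq=800.0, qos_loading=0.75)
served = Task("game", loading=0.5, core_freq=1200.0, qos_freq=800.0, qos_loading=0.75)
```

Here `starved` receives about 540 units and falls short of its requirement, while `served` receives 600 units and meets it.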
Objective
Find a scheduling plan for an interval $t$ according to task loadings, such that $P_t$ is minimized while the task resource requirements are satisfied.
◦Recall that
$$P_t = \sum_{i=1}^{n_b} P^b_{i,t} + \sum_{j=1}^{n_L} P^L_{j,t}$$
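One way to write the objective as a constrained minimization. This is a reading of the slide, not a formulation given in it: it assumes a requirement counts as satisfied when the resource a task receives is at least its minimum requirement, and that the decision variables are the cluster frequencies, the number of powered cores, and the task placement.

```latex
\begin{aligned}
\min \quad & P_t = \sum_{i=1}^{n_b} P^b_{i,t} + \sum_{j=1}^{n_L} P^L_{j,t} \\
\text{s.t.} \quad & resource_i \ge res\_req_i \quad \text{for every } Task_i \text{ in interval } t
\end{aligned}
```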
Resource-Guided Scheduling
Consists of three phases:
◦TaskInfo phase
◦LittleCore phase
◦bigCore phase
Runs at every scheduling interval and makes decisions for the next interval.
TaskInfo Phase
Gathers the loading information of each task.
Collects the load and current frequency of each core.
LittleCore Phase
If the LITTLE core cluster is powered off, skip this phase.
Compares $resource_i$ of each $Task_i$ in the LITTLE core cluster with its $res\_req_i$.
◦If any task gets fewer resources than its minimum requirement, adjust the core settings.
Core Adjustment
First, try to provide more resources to tasks by increasing the LITTLE core frequency.
If increasing the frequency cannot provide enough resources, add one LITTLE core.
If adding a core still cannot provide enough resources, migrate the task to the big core cluster.
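The three escalation steps can be sketched in Python as follows. This is a simplified illustration, not the project's implementation: it assumes cluster capacity can be approximated as frequency × number of powered cores, and that the lowest sufficient frequency is preferred. The function name, the action labels, and the frequency table are hypothetical.

```python
from typing import List, Tuple


def adjust_little_cluster(required: float, freqs: List[float], freq_idx: int,
                          cores: int, max_cores: int) -> Tuple[str, int, int]:
    """Return (action, new_freq_idx, new_core_count) for the LITTLE cluster.

    required: total resource the tasks on the cluster need.
    freqs:    available cluster frequencies in ascending order.
    Called only when the current setting is insufficient.
    """
    # Step 1: raise the frequency, keeping the number of cores.
    for idx in range(freq_idx + 1, len(freqs)):
        if freqs[idx] * cores >= required:
            return ("raise_frequency", idx, cores)

    # Step 2: power up one more LITTLE core, then pick the lowest
    # frequency that is sufficient (assumption).
    if cores < max_cores:
        for idx in range(len(freqs)):
            if freqs[idx] * (cores + 1) >= required:
                return ("add_core", idx, cores + 1)

    # Step 3: the LITTLE cluster cannot help; hand the task to the big cluster.
    return ("migrate_to_big", freq_idx, cores)


# Hypothetical LITTLE frequency table (MHz).
LITTLE_FREQS = [250.0, 400.0, 600.0]
```

With this table, a demand of 500 on one core is met by raising the frequency to 600, a demand of 1000 needs a second core, and a demand of 1500 exceeds what one extra core can provide, so the task is migrated.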
bigCore Phase
If the big core cluster is powered off, skip this phase.
Compares $resource_i$ of each $Task_i$ in the big core cluster with its $res\_req_i$.
◦Similar to the LittleCore phase, but without the “migrate to bigger/more powerful cores” step.
If the requirements of every task are satisfied, try to migrate tasks back to LITTLE.
◦By estimating whether the task requirements can be satisfied on LITTLE cores.
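A minimal sketch of the migrate-back estimate in Python. The slides do not say how the estimate is made; this version assumes the LITTLE cluster's capacity is its maximum frequency × number of cores, and that resource values are directly comparable between the big and LITTLE clusters, which ignores per-clock performance differences. The function name and parameters are hypothetical.

```python
def fits_on_little(res_req: float, little_max_freq: float,
                   little_cores: int, little_used: float) -> bool:
    """Estimate whether a task's minimum requirement fits on the LITTLE cluster.

    res_req:         minimum resource the task needs.
    little_max_freq: highest LITTLE cluster frequency.
    little_cores:    number of LITTLE cores that can be powered.
    little_used:     resource already required by tasks on the LITTLE cluster.
    """
    capacity = little_max_freq * little_cores
    return little_used + res_req <= capacity
```

For example, with two LITTLE cores at up to 600 MHz and 500 units already in use, a task requiring 300 units would be migrated back, while a task requiring 900 units would stay on the big cluster.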
Flowchart
Platform
Platform: ODROID-XU
◦1st type, cluster migration
◦Cortex™-A15 and Cortex™-A7
◦Per-cluster DVFS
Benchmark
Three applications
◦TTpod: MP3 player
◦Candy Crush: Game
◦Chrome: Web browser
Benchmark      QoS requirement
TTpod          Play music without interruptions
Candy Crush    At least 24 FPS during gameplay
Chrome         Jump to the next page within one second after clicking a link
Simulation
Simulate the execution of the three applications separately, and measure their average power consumption on the three types of big-LITTLE core architectures.
Compare the average power consumption with that of Linaro’s strategies.
Simulation Results
Linaro’s strategies increase the core frequency when they encounter high CPU load, and eventually use the highest frequency of the big core to run Candy Crush and Chrome.
Our method uses a LITTLE core for Candy Crush and a big core at a lower frequency for Chrome, thus reducing power consumption while maintaining the QoS.
               Resource-Guided Scheduler     Linaro
Model          I       II      III           I       II      III
TTpod          0.019   0.019   0.019         0.019   0.019   0.019
Candy Crush    0.371   0.371   0.371         1.49    1.49    1.49
Chrome         0.916   0.916   0.916         1.88    1.73    1.73
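To read the table as relative savings, the reduction of the resource-guided scheduler against Linaro can be computed per application and model. The sketch below only restates the table values; the dictionary layout and function name are hypothetical.

```python
# Average power consumption from the table, per model I, II, III.
RESOURCE_GUIDED = {
    "TTpod": [0.019, 0.019, 0.019],
    "Candy Crush": [0.371, 0.371, 0.371],
    "Chrome": [0.916, 0.916, 0.916],
}
LINARO = {
    "TTpod": [0.019, 0.019, 0.019],
    "Candy Crush": [1.49, 1.49, 1.49],
    "Chrome": [1.88, 1.73, 1.73],
}


def reduction_percent(ours: float, baseline: float) -> float:
    """Power reduction relative to the baseline, in percent."""
    return 100.0 * (baseline - ours) / baseline


# reductions[app] holds one value per model (I, II, III).
reductions = {
    app: [reduction_percent(o, b) for o, b in zip(RESOURCE_GUIDED[app], LINARO[app])]
    for app in RESOURCE_GUIDED
}
```

This gives no change for TTpod, a reduction of about 75% for Candy Crush on all three models, and about 51% (model I) and 47% (models II and III) for Chrome.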
Experiment
Execute the three applications together, and measure the average power consumption during execution.
Scenario
◦A user first starts TTpod to play some music. A minute later, the user starts playing the game Candy Crush while the music keeps playing. After playing for three minutes, the user quits the game and opens Chrome to search for a way to beat a certain stage of Candy Crush.
Results – Linaro’s Strategy
Results – Resource-Guided
Experimental Summary
Average power consumption
◦Linaro’s strategy: 0.843 Watt
◦Resource-guided: 0.089 Watt
The main reason for the difference is that some applications over-use resources.
◦For example: Candy Crush
◦Linaro’s strategy is unaware of this condition and thus keeps increasing the core frequency.
Conclusion
We build an energy-efficient scheduler for big-LITTLE core architecture that satisfies the resource requirement of each task and minimizes the energy consumption.
We propose a scheduling policy that decides the resource use of the tasks dynamically.
The experimental results demonstrate that, compared to Linaro’s scheduling strategies, our resource-guided scheduling method is more power-efficient.
Realizing Power-aware big-LITTLE scheduler on ICL Hypervisor 2014/01/21
Project Goal
Enable power-aware big-LITTLE scheduler design on ICL Hypervisor
Target hardware platform
◦ODROID-XU+"E" with power meter
Further Details
Schedule vCPUs across host CPUs.
The hypervisor scheduler may not interfere with the guest OS process scheduler.
Two conditions:
◦Guest OS has a big-LITTLE-aware scheduler
  ◦Hypervisor assigns host CPUs to the guest according to guest OS requirements.
◦Guest OS does not have a big-LITTLE-aware scheduler
  ◦Hypervisor scheduler assigns vCPUs according to the guest condition.
Preliminary Thought
Schedules vCPUs to host CPUs.
Guest OS should provide information to the hypervisor
◦Information such as “x big and y little” or “vCore 2 requires at least z processing speed”.
◦What information can ICL hypervisor get from guest OS?
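As an illustration of the kind of information a guest could pass, the two example hints can be modeled as a small Python record. Everything here is hypothetical: the names, the fields, and the placement rule are assumptions for discussion and do not describe an existing ICL Hypervisor interface.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GuestHint:
    big_cores: int = 0       # "x big": number of big cores the guest asks for
    little_cores: int = 0    # "y little": number of LITTLE cores the guest asks for
    # "vCore n requires at least z processing speed": vCPU id -> minimum speed
    min_speed: Dict[int, float] = field(default_factory=dict)


def place_vcpu(hint: GuestHint, vcpu: int, little_max_speed: float) -> str:
    """Pick a host cluster for one vCPU from the guest's hint.

    Assumption: a vCPU with no stated minimum speed can run on LITTLE, and a
    vCPU goes to big only when LITTLE cannot reach its minimum speed.
    """
    required = hint.min_speed.get(vcpu, 0.0)
    return "LITTLE" if required <= little_max_speed else "big"


# Hypothetical hint: one big and two LITTLE cores, vCore 2 needs speed 900.
hint = GuestHint(big_cores=1, little_cores=2, min_speed={2: 900.0})
```

With a LITTLE maximum speed of 600, vCore 2 would be placed on the big cluster and any vCPU without a stated requirement on LITTLE.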
VM Introspection
Since our scheduler needs QoS information from tasks, we need to know what applications a guest OS is running.
◦To maintain QoS and save energy.
Thus we need the hypervisor to support VM introspection.