Energy-Efficient System Virtualization for Mobile and Embedded Systems

Final Review, 2014/01/21

Outline
Project overview
big-LITTLE core architecture
Models
Resource-Guided scheduling
◦ Experimental results
Conclusion



What Has Been Done
The first half of the year
◦ Energy-efficient task scheduling for per-core DVFS architecture
    ◦ Offline energy-efficient task scheduling
    ◦ Online energy-efficient task scheduling
The second half of the year
◦ Energy-efficient task scheduling for big-LITTLE core architecture

Goal of Our big-LITTLE Aware Scheduling
Derive an energy-efficient scheduler for the big-LITTLE core architecture that:
◦ Satisfies the resource requirement of each task.
◦ Minimizes the average power consumption.



big-LITTLE Core Architecture
Developed by ARM in 2011.
Combines two kinds of architecturally compatible processors with different power and performance characteristics.
Three different types:
◦ Type 1: Cluster migration
◦ Type 2: CPU migration / In-Kernel Switcher
◦ Type 3: Heterogeneous Multi-Processing (HMP)

Type 1: Cluster Migration

Either the big cores or the LITTLE cores are in use at any given time, never both at once.

Type 2: CPU Migration

Logical CPU: a pair consisting of one big core and one LITTLE core.

Only one of the two cores in a pair is powered up and processing tasks at a time.

Type 3: HMP

All the big and LITTLE cores can be used at the same time.



Building the Power Model
Measure the average power consumption of the big and LITTLE cores at different core frequencies and under different CPU loads.
Platform: ODROID-XU
◦ Type 1, cluster migration.
◦ Cortex™-A15 and Cortex™-A7.
◦ Per-cluster DVFS.

Average Power Consumption

Power Model
The power consumption $P_t$ of an interval $t$ is:

\[ P_t = \sum_{i=1}^{n_b} P^{b}_{i,t} + \sum_{j=1}^{n_L} P^{L}_{j,t} \]

\[ P^{b}_{n,t} = E^{b}_{f} \times loading_{n,t} \]

\[ P^{L}_{n,t} = E^{L}_{f} \times loading_{n,t} \]

◦ $n_b$ and $n_L$: the number of big and LITTLE cores.
◦ $P^{b}_{i,t}$ and $P^{L}_{j,t}$: the power consumption of big core $i$ and LITTLE core $j$ during interval $t$.
◦ $E^{b}_{f}$ and $E^{L}_{f}$: the power consumption of a big and a LITTLE core at frequency $f$ under 100% load.
◦ $loading_{n,t}$: the load of the $n$-th core in interval $t$.
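As a minimal sketch of how the power model could be evaluated, the following Python computes the interval power as the sum of per-core powers, each being the full-load power at the cluster frequency scaled by that core's load. The tables `E_BIG` and `E_LITTLE` and every number in them are hypothetical placeholders, not measurements from the ODROID-XU, and the function names are illustrative only. Loads are assumed to be fractions in [0, 1], and each cluster is assumed to share one frequency (per-cluster DVFS).

```python
from typing import Mapping, Sequence

# Hypothetical full-load power per core in watts, keyed by frequency in MHz.
# Placeholder numbers for illustration only, NOT measurements from the ODROID-XU.
E_BIG: Mapping[int, float] = {800: 0.60, 1200: 1.10, 1600: 1.90}
E_LITTLE: Mapping[int, float] = {500: 0.10, 800: 0.18, 1200: 0.30}


def core_power(full_load_power: float, loading: float) -> float:
    """Power of one core: E_f * loading, with loading a fraction in [0, 1]."""
    if not 0.0 <= loading <= 1.0:
        raise ValueError("loading must be a fraction between 0 and 1")
    return full_load_power * loading


def interval_power(
    big_freq: int,
    big_loads: Sequence[float],
    little_freq: int,
    little_loads: Sequence[float],
) -> float:
    """P_t: per-core power summed over all big cores plus all LITTLE cores."""
    p_big = sum(core_power(E_BIG[big_freq], load) for load in big_loads)
    p_little = sum(core_power(E_LITTLE[little_freq], load) for load in little_loads)
    return p_big + p_little
```

For example, `interval_power(1200, [0.5, 0.25], 800, [1.0, 0.5])` evaluates two big cores and two LITTLE cores; passing an empty list for a cluster models that cluster being powered off.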

Task Model
For every $Task_i$ in a scheduling interval, we define:
◦ $loading_i$: the percentage of time that $Task_i$ is running on a core in a period of time.
◦ $CoreFreq_i$: the current frequency of the core cluster this task is running on.

\[ resource_i = CoreFreq_i \times loading_i \]

Task Model (Cont.)
We also define the minimum resource required for $Task_i$ as:
◦ $QoSFreq_i$: the minimum core frequency that can satisfy the QoS requirement of $Task_i$.
◦ $QoSLoading_i$: the CPU load of $Task_i$ while the core frequency is $QoSFreq_i$.

\[ res\_req_i = QoSFreq_i \times QoSLoading_i \]
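The task model can be sketched as a small data type, reading resource_i as the product of CoreFreq_i and loading_i, and res_req_i as the product of QoSFreq_i and QoSLoading_i. The class name `Task`, its field names, and the choice of MHz and fractional loads are assumptions made for this illustration, not part of the original design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One task observed during a scheduling interval (hypothetical structure)."""

    name: str
    loading: float      # fraction of the interval the task ran on a core, 0..1
    core_freq: float    # current frequency of the task's cluster (MHz assumed)
    qos_freq: float     # minimum frequency that still meets the task's QoS
    qos_loading: float  # the task's CPU load when running at qos_freq, 0..1

    @property
    def resource(self) -> float:
        """Resource the task currently receives: CoreFreq_i * loading_i."""
        return self.core_freq * self.loading

    @property
    def res_req(self) -> float:
        """Minimum resource the task needs: QoSFreq_i * QoSLoading_i."""
        return self.qos_freq * self.qos_loading

    def is_satisfied(self) -> bool:
        """True when the task receives at least its minimum requirement."""
        return self.resource >= self.res_req
```

A task with `loading=0.9` at 600 MHz receives 540 units; if its requirement is 800 MHz at a load of 0.75 (600 units), `is_satisfied()` returns `False` and the scheduler would have to adjust the core settings.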

Objective
Find a scheduling plan for an interval $t$ according to task loadings, such that $P_t$ is minimized while the task resource requirements are satisfied.
◦ Recall that

\[ P_t = \sum_{i=1}^{n_b} P^{b}_{i,t} + \sum_{j=1}^{n_L} P^{L}_{j,t} \]



Resource-Guided Scheduling
Consists of three phases:
◦ TaskInfo phase
◦ LittleCore phase
◦ bigCore phase
Runs every scheduling interval and makes decisions for the next interval.

TaskInfo Phase
Gathers the loading information of each task.
Collects the load and current frequency of each core.

LittleCore Phase
If the LITTLE core cluster is powered off, skip this phase.
Compare resource_i of each Task_i in the LITTLE core cluster with its res_req_i.
◦ If any task gets fewer resources than its minimum requirement, adjust the core settings.

Core Adjustment
First, try to provide more resources to tasks by increasing the LITTLE core frequency.
If increasing the frequency cannot provide enough resources, add one LITTLE core.
If adding a core still cannot provide enough resources, migrate the task to the big core cluster.
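The escalation order described above can be sketched as a single decision step. This is a simplified illustration under stated assumptions: it takes one step per call, and it treats "increasing the frequency cannot provide enough resources" as "the cluster is already at its highest frequency", which may be coarser than the actual scheduler. `ClusterState`, `adjust_little_cluster`, and the action strings are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Current setting of the LITTLE cluster (hypothetical structure)."""

    freq_index: int    # index into the cluster's ascending frequency table
    active_cores: int  # number of LITTLE cores currently powered on


def adjust_little_cluster(state: ClusterState, num_freqs: int, max_cores: int) -> str:
    """Take one escalation step for an under-served task and report the action.

    Order follows the slide: raise frequency, then add a core, then migrate.
    Assumption: a step is only skipped once it is exhausted.
    """
    if state.freq_index < num_freqs - 1:
        state.freq_index += 1
        return "raise_frequency"
    if state.active_cores < max_cores:
        state.active_cores += 1
        return "add_core"
    # Nothing left to scale inside the LITTLE cluster.
    return "migrate_to_big"
```

Calling the function repeatedly on a cluster with two frequency levels and two cores yields `raise_frequency`, then `add_core`, then `migrate_to_big`.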

bigCore Phase
If the big core cluster is powered off, skip this phase.
Compare resource_i of each Task_i in the big core cluster with its res_req_i.
◦ Similar to the LittleCore phase, but without the "migrate to bigger/more powerful cores" step.
If all task requirements are satisfied, try to migrate tasks back to the LITTLE cluster.
◦ By estimating whether the task requirements can be satisfied on the LITTLE cores.
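The slides do not say how the migrate-back estimate is computed, so the following is only one possible sketch. It assumes a task's requirement measured on the big cluster can be converted to LITTLE-cluster terms with a fixed scaling factor, and that a LITTLE core should not be planned above a chosen load ceiling. The function name, `big_to_little_ratio`, `max_loading`, and their default values are all hypothetical.

```python
def fits_on_little(
    res_req_big: float,
    little_max_freq: float,
    big_to_little_ratio: float = 2.0,  # hypothetical: one big cycle ~ 2 LITTLE cycles
    max_loading: float = 0.9,          # hypothetical ceiling on planned core load
) -> bool:
    """Estimate whether a task now on the big cluster could run on a LITTLE core.

    res_req_big is the task's minimum resource (frequency * load) as observed
    on the big cluster; little_max_freq is the highest LITTLE frequency.
    """
    if big_to_little_ratio <= 0 or not 0.0 < max_loading <= 1.0:
        raise ValueError("ratio must be positive and max_loading in (0, 1]")
    demand_on_little = res_req_big * big_to_little_ratio
    capacity = little_max_freq * max_loading
    return demand_on_little <= capacity
```

With a 1200 MHz LITTLE ceiling and the defaults above, a requirement of 300 units fits (600 against 1080), while a requirement of 700 units does not (1400 against 1080).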

Flowchart



Platform
Platform: ODROID-XU
◦ Type 1, cluster migration
◦ Cortex™-A15 and Cortex™-A7
◦ Per-cluster DVFS

Benchmark
Three applications:
◦ TTpod: MP3 player
◦ Candy Crush: game
◦ Chrome: web browser

Benchmark      QoS requirement
TTpod          Play music without interruption
Candy Crush    At least 24 FPS during gameplay
Chrome         Jump to the next page within one second after clicking a link

Simulation
Simulate the execution of the three applications separately, and measure their average power consumption on the three types of big-LITTLE core architectures.
Compare the average power consumption with Linaro’s strategies.

Simulation Results

Linaro’s strategies increase the core frequency whenever they encounter high CPU load, and eventually use the highest big-core frequency to run Candy Crush and Chrome.

Our method uses the LITTLE cores for Candy Crush and the big cores at a lower frequency for Chrome, thus reducing power consumption while maintaining the QoS.

               Resource-Guided Scheduler      Linaro
Model          I        II       III          I        II       III
TTpod          0.019    0.019    0.019        0.019    0.019    0.019
Candy Crush    0.371    0.371    0.371        1.49     1.49     1.49
Chrome         0.916    0.916    0.916        1.88     1.73     1.73
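To make the gap in the simulation results easier to read, the short script below computes the relative saving of the resource-guided scheduler over Linaro for each application and model. The numbers are copied from the results above; the dictionary and function names are illustrative only.

```python
from typing import Dict, Tuple

# Average power values copied from the simulation results (models I, II, III).
RESOURCE_GUIDED: Dict[str, Tuple[float, float, float]] = {
    "TTpod": (0.019, 0.019, 0.019),
    "Candy Crush": (0.371, 0.371, 0.371),
    "Chrome": (0.916, 0.916, 0.916),
}
LINARO: Dict[str, Tuple[float, float, float]] = {
    "TTpod": (0.019, 0.019, 0.019),
    "Candy Crush": (1.49, 1.49, 1.49),
    "Chrome": (1.88, 1.73, 1.73),
}


def saving_percent(ours: float, baseline: float) -> float:
    """Relative reduction of `ours` against `baseline`, in percent."""
    return (1.0 - ours / baseline) * 100.0


def savings_table() -> Dict[str, Tuple[float, ...]]:
    """Per-application savings for models I, II and III, rounded to 0.1%."""
    return {
        app: tuple(
            round(saving_percent(ours, base), 1)
            for ours, base in zip(RESOURCE_GUIDED[app], LINARO[app])
        )
        for app in RESOURCE_GUIDED
    }


if __name__ == "__main__":
    for app, row in savings_table().items():
        print(app, row)
```

Under these figures the saving is roughly 75% for Candy Crush on all three models and roughly 47% to 51% for Chrome, while TTpod shows no difference.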

Experiment
Execute the three applications together, and measure the average power consumption during execution.
Scenario
◦ A user first starts TTpod to play some music. A minute later, the user starts playing the game Candy Crush while the music keeps playing. After playing for three minutes, the user finishes the game and opens Chrome to search for a way to beat a certain stage of Candy Crush.

Results – Linaro’s Strategy

Results – Resource-Guided

Experimental Summary
Average power consumption:
◦ Linaro’s strategy: 0.843 W
◦ Resource-guided: 0.089 W
The main reason for the difference is that some applications over-use resources.
◦ For example, Candy Crush.
◦ Linaro’s strategy is unaware of this condition and thus keeps increasing the core frequency.
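As a quick worked check of the two averages reported for the combined scenario (0.843 W for Linaro’s strategy and 0.089 W for the resource-guided scheduler), the relative saving and the ratio work out to approximately:

```latex
\[
1 - \frac{0.089}{0.843} \approx 0.894
\qquad\text{and}\qquad
\frac{0.843}{0.089} \approx 9.5
\]
```

That is, in this scenario the resource-guided scheduler draws about 89% less average power, or roughly one ninth of what Linaro’s strategy draws.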



Conclusion
We built an energy-efficient scheduler for the big-LITTLE core architecture that satisfies the resource requirement of each task and minimizes energy consumption.
We proposed a scheduling policy that decides the resource use of tasks dynamically.
The experimental results show that our resource-guided scheduling method is more power-efficient than Linaro’s scheduling strategies (0.089 W versus 0.843 W on average in the combined scenario).

Realizing Power-aware big-LITTLE Scheduler on ICL Hypervisor
2014/01/21

Project Goal
Enable power-aware big-LITTLE scheduler design on the ICL Hypervisor.
Target hardware platform
◦ ODROID-XU+"E" with power meter

Further Details
Schedule vCPUs among host CPUs.
The hypervisor scheduler may not interfere with the guest OS process scheduler.
Two conditions:
◦ The guest OS has a big-LITTLE-aware scheduler
    ◦ The hypervisor assigns host CPUs to the guest according to the guest OS requirements.
◦ The guest OS does not have a big-LITTLE-aware scheduler
    ◦ The hypervisor scheduler assigns vCPUs according to the guest condition.

Preliminary Thought
Schedule vCPUs to host CPUs.
The guest OS should provide information to the hypervisor.
◦ Information such as “x big and y little” or “vCore 2 requires at least z processing speed”.
◦ What information can the ICL hypervisor get from the guest OS?
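One way to picture the kind of information a guest OS might pass to the hypervisor is a small hint record covering both examples above: a requested core mix and a per-vCore minimum speed. This is purely a hypothetical sketch; it does not describe any existing ICL Hypervisor interface, and the names `GuestHint` and `preferred_cluster`, as well as expressing speed in MHz, are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GuestHint:
    """Hypothetical hint a guest OS could hand to the hypervisor."""

    big_cores: int = 0      # "x big": number of big cores requested
    little_cores: int = 0   # "y little": number of LITTLE cores requested
    # "vCore n requires at least z processing speed": vCore id -> minimum speed
    min_speed: Dict[int, float] = field(default_factory=dict)


def preferred_cluster(hint: GuestHint, vcore: int, little_max_speed: float) -> str:
    """Pick the cluster for a vCore: big only if LITTLE cannot reach its minimum.

    A vCore without an entry in min_speed is treated as having no minimum.
    """
    required = hint.min_speed.get(vcore, 0.0)
    return "big" if required > little_max_speed else "LITTLE"
```

For instance, with `GuestHint(big_cores=1, little_cores=3, min_speed={2: 1400.0})` and a LITTLE ceiling of 1200, vCore 2 would be placed on the big cluster while vCores without a stated minimum stay on LITTLE.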

VM Introspection
Since our scheduler needs QoS information from tasks, we need to know which applications a guest OS is running.
◦ To maintain QoS and save energy.
Thus, we need the hypervisor to support VM introspection.
