Exploiting Spatial Locality to Improve Disk Efficiency in Virtualized Environments
Transcript
Xiao Ling¹, Shadi Ibrahim², Hai Jin¹, Song Wu¹, Songqiao Tao¹
¹Cluster and Grid Computing Lab, Services Computing Technology and System Lab,
School of Computer Science and Technology, Huazhong University of Science and Technology
²INRIA Rennes - Bretagne Atlantique, Rennes, France
Exploiting Spatial Locality to Improve Disk Efficiency in Virtualized Environments
Disk efficiency in virtualized environments
• VMs with multiple OSs and applications running on a physical server
• Disk I/O utilization impacts the I/O performance of applications running on VMs
• Disk efficiency depends on exploiting spatial locality
  – Disk scheduling exploits spatial locality
  – Reducing disk seek and rotational overheads
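To make the last point concrete, here is a toy sketch (our illustration with made-up LBAs, not from the paper) of why serving requests in sorted LBA order, as elevator-style disk schedulers do, reduces total head movement:

```python
# Illustrative only: serving requests in LBA order (an elevator sweep)
# shortens total seek distance compared with FIFO arrival order.
def total_seek(order, start=0):
    """Sum of absolute head movements when serving requests in `order`."""
    dist, head = 0, start
    for lba in order:
        dist += abs(lba - head)
        head = lba
    return dist

arrival = [900, 100, 850, 150]   # hypothetical FIFO arrival order
sorted_lba = sorted(arrival)     # one elevator sweep across the disk

assert total_seek(sorted_lba) < total_seek(arrival)
```

The same intuition underlies the talk: requests that are close together on disk are cheap to serve back to back, so a scheduler that can group them wins.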
But achieving high spatial locality is a challenging task in a virtualized environment
Why difficult?
• Complicated I/O behavior of VMs
  – More than one process running on a VM (e.g. virtual desktop, data-intensive applications): mixed applications
• Transparency of virtualization
  – The block layer lacks a global view of the I/O access patterns of processes in the virtualized environment
[Figure: guest OSs running mixed workloads (a streaming app, file editing, processes A-D) issue I/O through the hypervisor software layer to a shared disk]
Shoulders of Giants
• Invasive-mode scheduling
  – Selecting the disk scheduler pair within both the hypervisor and the VMs according to the access patterns of applications [ICPP'11, SIGOPS Oper. Syst. Rev. '10]
  – Introduces additional hypervisor-to-VM interference
• Non-invasive-mode scheduling
  – Stream scheduling [FAST'11], Antfarm [USENIX ATC'06]
  – Assume all VMs run similar read-oriented applications
  – VMs grab bandwidth from one another
• Analysis of the data accesses of VMs
  – Assumes only one specific application is running within a VM

Studies on improving the I/O performance of applications precede ours
What do we solve?
• Considering mixed applications and the transparency of virtualization
• Exploring the benefit of spatial locality and the regularity of data accesses
• How can disk scheduling exploit spatial locality to maximize disk efficiency while preserving the transparency of virtualization?
Outline
• Problem Description
• Related Work
• Observing Disk Access Patterns of VMs
• Prediction Model
• Design of Pregather
• Performance Evaluation
• Conclusions and Future Work
Difference of Data Access
• Traditional environment vs. virtualized environment
• In a virtualized environment, VMs simultaneously access different parts of data blocks within the range of each VM image space
Experiment settings
• Physical server
  – Four quad-core 2.40 GHz Xeon processors
  – 22 GB of memory and one dedicated 1 TB SATA disk
  – Xen 4.0.1 with kernel 2.6.18, Ext3 file system
• Configuration of VMs
  – RHEL 5 with kernel 2.6.18, Ext3 file system, 1 GB memory, 2 VCPUs, 12 GB virtual disk
  – Default Noop scheduler
• Workloads
  – Sysbench file I/O: sequential read/write, random read/write
Access Patterns of VMs
• Regions across VMs
  – Requests from the same VM
• Sub-regions within a VM
  – Different ranges and frequencies of access

Our observations:
[Figure: disk address space divided into regions (one per VM) and sub-regions; some sub-regions exhibit regional or sub-regional spatial locality, others no spatial locality]
Observations
• Special spatial locality
  – Regional spatial locality: bounded by the VM image
  – Sub-regional spatial locality: shaped by the access patterns of applications
• Ignoring this spatial locality
  – Causes seeking among VMs
  – Increases disk head seeks among sub-regions (e.g. CFQ, AS)
• Our goal
  – Take advantage of this special spatial locality to improve physical disk efficiency in the virtualized environment
How to exploit this spatial locality?
• Batch-process requests with special spatial locality in an adaptive non-work-conserving mode
  – The regularity of regional spatial locality is easy to capture
  – The regularity of sub-regional spatial locality is hard to perceive due to the transparency of virtualization
• Open questions:
  – What is the distribution of sub-regions with spatial locality?
  – What is the access interval of these sub-regions?
Prediction Model
Outline
• Problem Description
• Related Work
• Observing Disk Access Patterns of VMs
• Prediction Model
• Design of Pregather
• Performance Evaluation
• Conclusions and Future Work
Prediction Model
• Challenges
  – The distribution of sub-regions with spatial locality changes over time and with the access patterns of applications
  – Interference from background processes running on a VM
  – Different sub-regions may have different access regularity
• Approach: analyze historical data accesses within a VM image to predict sub-regional spatial locality
Prediction Model: vNavigator
• Quantization of access frequency
  – Weighted contributions of historical requests to the prediction
  – A temporal access-density threshold per VM
  – Clustering zones into sub-regions

Prediction Model: vNavigator
• Access regularity of sub-regional spatial locality
  – The range of a sub-region unit
  – The future access interval of the sub-region unit, predicted from the average access interval
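The transcript omits the model's formulas, so the following is only our rough sketch of the idea, not the paper's implementation: zone heat is aged with a hypothetical decay factor, adjacent hot zones are clustered into sub-regions, and the future access interval of a zone is taken as the mean of its past intervals. The zone size is the parameter named later in the slides; everything else (DECAY, the class and method names) is assumed.

```python
# Hypothetical sketch of a vNavigator-style predictor (not the paper's code).
from collections import defaultdict

ZONE_SIZE = 2000   # LBAs per zone, a parameter reported in the slides
DECAY = 0.5        # assumed aging factor for historical accesses

class SubRegionPredictor:
    def __init__(self):
        self.heat = defaultdict(float)      # zone id -> decayed access count
        self.last_access = {}               # zone id -> time of last access
        self.intervals = defaultdict(list)  # zone id -> observed intervals

    def record(self, lba, now):
        """Fold one completed request into the per-zone statistics."""
        zone = lba // ZONE_SIZE
        self.heat[zone] = self.heat[zone] * DECAY + 1.0
        if zone in self.last_access:
            self.intervals[zone].append(now - self.last_access[zone])
        self.last_access[zone] = now

    def hot_subregions(self, threshold=1.0):
        """Cluster adjacent zones whose decayed heat exceeds a threshold."""
        hot = sorted(z for z, h in self.heat.items() if h >= threshold)
        regions, cur = [], []
        for z in hot:
            if cur and z != cur[-1] + 1:
                regions.append((cur[0], cur[-1]))
                cur = []
            cur.append(z)
        if cur:
            regions.append((cur[0], cur[-1]))
        return regions  # list of (first_zone, last_zone) sub-regions

    def predicted_interval(self, zone):
        """Predict the next access interval as the mean of past intervals."""
        iv = self.intervals.get(zone)
        return sum(iv) / len(iv) if iv else None
```

The decayed counter plays the role of the slide's access-frequency quantization (recent requests contribute more), and the per-zone mean interval stands in for the slide's "average access interval" prediction.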
Design of Pregather
• An adaptive non-work-conserving disk scheduler in the hypervisor
  – Decides whether or not to dispatch the pending request, without starving other requests
  – Decides how long to wait for a future request with spatial locality
• A spatial-locality-aware heuristic algorithm (SPLA)
  – Uses the regional spatial locality across VMs and the prediction of sub-regional spatial locality from the vNavigator model
  – Guides Pregather's dispatch decisions
  – Keeps waiting time less than seek time
The SPLA Algorithm
• Setting a timer according to the position of the disk head
  – Coarse waiting time for regional spatial locality: if there is no pending request from the currently served VMx and AvgD(VMx) < |LBA of neighbor VM - LBA of completed request|, set CoarseTimer = AvgT(VMx)
  – Fine waiting time for sub-regional spatial locality: if there is a pending request from the currently served VMx and an existing sub-region SR(Ui) includes the LBA of the completed request, set FineTimer = ST(Ui)
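Our reading of the flowchart above, as a hedged sketch: the names AvgD, AvgT, SR and ST follow the slide, but the function shape and its arguments are ours.

```python
# Sketch of SPLA's timer-setting step (our interpretation of the slide).
def set_timer(completed_lba, pending_from_vm, subregions,
              avg_seek_dist, avg_arrival_time, neighbor_vm_lba):
    """Return ('coarse', t), ('fine', t), or None (no waiting).

    subregions: list of (start_lba, end_lba, predicted_interval) tuples,
    playing the role of SR(Ui) and ST(Ui) from the slide.
    """
    if not pending_from_vm:
        # No pending request from the served VM: wait only if its requests
        # are, on average, closer than the neighbor VM's region (AvgD test).
        if avg_seek_dist < abs(neighbor_vm_lba - completed_lba):
            return ('coarse', avg_arrival_time)   # CoarseTimer = AvgT(VMx)
        return None
    for (start, end, predicted_interval) in subregions:
        if start <= completed_lba <= end:
            return ('fine', predicted_interval)   # FineTimer = ST(Ui)
    return None
```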
The SPLA Algorithm
• Dispatching a request or continuing to wait
  – Compute Seektime(closest pending request, completed request)
  – Within the coarse waiting time: dispatch the request and turn off the timer if the request is from VMx, OR Seektime < AvgT(VMx)
  – Within the fine waiting time: dispatch the request and turn off the timer if the LBA of the request falls in SR(Ui), OR Seektime < ST(Ui)
  – Otherwise, wait until the timer expires, the deadline of a pending request is reached, or a suitable new request arrives
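The dispatch-or-wait test above can be sketched as a single predicate. Again this is our reconstruction of the flowchart, with assumed argument shapes:

```python
# Sketch of SPLA's dispatch decision while a timer is running.
from collections import namedtuple

Request = namedtuple('Request', 'vm lba')

def should_dispatch(mode, req, served_vm, avg_arrival_time,
                    subregion, seek_time):
    """mode is 'coarse' or 'fine'. True means dispatch now and cancel the
    timer; False means keep waiting (until the timer or a deadline fires)."""
    if mode == 'coarse':
        # Dispatch if the request comes from the VM being waited for,
        # or if serving it costs less than the expected wait AvgT(VMx).
        return req.vm == served_vm or seek_time < avg_arrival_time
    if mode == 'fine':
        # subregion stands in for SR(Ui): (start_lba, end_lba, ST(Ui)).
        start, end, predicted_interval = subregion
        return (start <= req.lba <= end) or seek_time < predicted_interval
    return True  # no timer armed: plain work-conserving dispatch
```

Both branches encode the same trade-off stated on the slide: break the wait only when the new request preserves locality, or when serving it is cheaper than continuing to wait.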
Implementation of Pregather
• Pregather allocates each VM an equal serving time slice and serves VMs in a round-robin fashion
• Implemented on a Xen-hosted platform
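A toy sketch of this fairness policy (the 200 ms slice is the value reported on the parameter slide; the function itself is purely illustrative):

```python
# Illustrative round-robin service with equal per-VM time slices.
from itertools import cycle

TIME_SLICE_MS = 200  # per-VM slice, as reported in the evaluation settings

def service_schedule(vms, total_ms):
    """Yield (vm, start_ms) pairs: each VM gets an equal slice in turn."""
    t, order = 0, cycle(vms)
    while t < total_ms:
        yield next(order), t
        t += TIME_SLICE_MS
```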
Outline
• Problem Description
• Related Work
• Observing Disk Access Patterns of VMs
• Prediction Model
• Design of Pregather
• Performance Evaluation
• Conclusions and Future Work
Performance Evaluation
• Goals of the experiments
  – Verify the vNavigator model
  – Measure the overall performance of Pregather for multiple VMs
  – Evaluate the memory overhead
• Parameter settings
  – Zone size: 2000; prediction window: 20 ms; λ: 2
  – Time slice: 200 ms
• Benchmarks
  – Sysbench file I/O, Hadoop, TPC-H
Verification of the vNavigator Model
• The ratio of successful waiting
  – A VM with sequential applications has clear sub-regional locality (e.g. success ratio 90.3%)
  – A VM with only random applications has weak sub-regional locality (e.g. success ratio 80.4%)
• VMs with different access patterns
  [Figure: success-ratio breakdown; chart labels 33%, 31%, 38%, 22%, 10%]
Pregather for Multiple VMs
[Figure: 1.6x and 2.6x speedups]
Pregather for Multiple VMs
• Disk I/O efficiency for data-intensive applications
  – Improvement: 26% over CFQ, 28% over AS, 38% over Deadline
  [Figure: ↓18%, ↓20%; at zero, Pregather 65%, CFQ 53%, AS 36%]
Pregather for Multiple VMs
• Disk I/O efficiency for data-intensive applications mixed with other applications
  – Compared with CFQ: Q2 ↓10%, Q19 ↓8%, Sort ↓12%
  [Figure: Pregather 63%]
Pregather for Multiple VMs
• Memory overhead: 916 KB
Conclusion and Future Work
• Contributions
  – Observed regional spatial locality and sub-regional spatial locality
  – An intelligent prediction model (vNavigator) to predict the regularity of sub-regional spatial locality
  – Pregather, a scheduler with a spatial-locality-aware heuristic algorithm in the hypervisor, improves disk I/O efficiency without any prior knowledge of applications
• Future work
  – Extend Pregather to enable an intelligent allocation of