FlashFQ: A Fair Queueing I/O Scheduler for Flash-Based SSDs

Dec 30, 2015

Kai Shen and Stan Park, University of Rochester
USENIX-ATC 2013

Flash I/O Fairness

NAND Flash storage devices achieve fast I/O without mechanical seek/rotation delay
  High efficiency when there is no OS scheduling (passing requests to the device without delay or reordering, as with Linux noop)

But fairness is important in multi-task systems and clouds
  Concern: heavy I/O operations can unfairly block light operations (e.g., writes block reads, large I/Os block small I/Os)

Existing fair I/O schedulers are mostly timeslice-based
  Linux CFQ, Argon [Wachs et al. 07], FIOS [Park and Shen 12]
  Timeslice schedulers may exhibit poor responsiveness, particularly when there are a large number of co-running tasks
  Timeslice schedulers cannot easily exploit device parallelism

Fair Queueing Resource Scheduler

Originated in network packet scheduling
  Weighted Fair Queueing [Demers et al. 89], Processor Sharing [Parekh 92], and others

Virtual time-based fairness
  Virtual time roughly indicates the accumulated resource use of a task
  Balance virtual time progression (equal resource usage) by dispatching the request from the task with the slowest virtual time

Management of under-utilizing tasks
  These are tasks that do not immediately use their allotted resources
  Prevent them from building up unused resources for bursty dispatches
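The virtual-time rule above can be illustrated with a minimal sketch. It assumes each task always has a request waiting and that every request of a task has a fixed, known cost; the function name `fair_dispatch_order` and the cost values are hypothetical and are not taken from the slides.

```python
def fair_dispatch_order(costs, total_dispatches):
    """Repeatedly dispatch from the backlogged task with the smallest virtual time.

    `costs` maps a task name to the assumed resource cost of one of its requests.
    Returns the dispatch order and the final per-task virtual times.
    """
    virtual_time = {task: 0.0 for task in costs}
    order = []
    for _ in range(total_dispatches):
        # The task with the slowest virtual time goes next (ties broken by name).
        task = min(virtual_time, key=lambda t: (virtual_time[t], t))
        order.append(task)
        # Virtual time accumulates the resource used by the dispatched request.
        virtual_time[task] += costs[task]
    return order, virtual_time


# A writer whose requests cost 4 units competes with a reader whose requests cost 1.
order, usage = fair_dispatch_order({"reader": 1.0, "writer": 4.0}, 10)
print(order.count("reader"), order.count("writer"))
print(usage)
```

In this toy run the reader is dispatched four times as often as the writer, so both tasks end with the same accumulated resource usage. The sketch leaves out the handling of under-utilizing tasks.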

Timeslicing vs. Fair Queueing

Timeslice scheduling

Fair queueing (more responsive)

Fair queueing (allows parallelism on a parallel device)

Concern: Loss of Spatial Locality

Fair queueing with frequent task switches loses spatial locality
  A significant problem for mechanical disks
  Less of a problem for Flash drives
  Logically random writes become physically sequential writes through block remapping in firmware

[Figure: ratio of random I/O latency over sequential I/O latency]

FlashFQ Design Basis

Build on SFQ(D) [Jin et al. 04]
  A request's start tag is roughly the owner task's accumulated resource usage before its service (task virtual time)
  Request dispatches are ordered by their start tags for fairness
  Parallel dispatches are allowed up to the depth D
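The start-tag ordering and the depth limit D might be sketched as follows. This is a simplified illustration, not the SFQ(D) or FlashFQ implementation: the class `DepthLimitedFairQueue`, its method names, and the request costs are hypothetical, and a task's virtual time is advanced when a request is submitted, using a cost assumed to be known in advance.

```python
import heapq
import itertools


class DepthLimitedFairQueue:
    """Start-tag-ordered dispatch with at most `depth` requests outstanding."""

    def __init__(self, depth):
        self.depth = depth
        self.outstanding = 0
        self.task_vtime = {}           # task -> accumulated (estimated) resource usage
        self.pending = []              # min-heap of (start_tag, seq, task, cost)
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, task, cost):
        # The start tag is the task's virtual time before this request is served.
        start_tag = self.task_vtime.get(task, 0.0)
        self.task_vtime[task] = start_tag + cost
        heapq.heappush(self.pending, (start_tag, next(self._seq), task, cost))

    def dispatch(self):
        """Issue requests in start-tag order until the depth limit is reached."""
        issued = []
        while self.pending and self.outstanding < self.depth:
            start_tag, _, task, _ = heapq.heappop(self.pending)
            self.outstanding += 1
            issued.append((task, start_tag))
        return issued

    def complete(self):
        # A completion frees one dispatch slot.
        self.outstanding -= 1


q = DepthLimitedFairQueue(depth=2)
for _ in range(3):
    q.submit("writer", cost=4.0)
for _ in range(3):
    q.submit("reader", cost=1.0)

first = q.dispatch()   # fills both slots
q.complete()           # one request finishes
second = q.dispatch()  # the freed slot goes to the smallest remaining start tag
print(first, second)
```

With depth 2, the first dispatch issues one request from each task, and the freed slot then goes to the reader, whose next start tag (1.0) is smaller than the writer's (4.0).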

Prevent under-utilizing tasks from building up unused resources
  System virtual time: the minimum virtual time of all active tasks
  System virtual time is the lower bound of request start tags: this brings forward the virtual time of under-utilizing tasks after inactivity, so they forfeit unused resources

Challenge: Restricted Parallelism on Flash

Parallel I/O sometimes improves efficiency, but interference exists among concurrently dispatched I/O operations
Challenge: exploit parallelism but manage interference

Solution: Throttled Dispatch

Given interference, uncontrolled parallel dispatches lead to unfairness
  E.g., a writer would utilize much more resource than a reader does

Our approach:
  Account for each task's resource usage
  Throttled dispatch: block a task if its resource usage is excessively ahead of that of the slowest task, which will then catch up with less interference

Challenge: Deceptive Idleness

Existing fair queueing schedulers are work-conserving
  They never idle the device when there is pending work

Deceptive idleness
  An active task that issues its next I/O request a short time after receiving the result of the previous one temporarily appears idle
  Work-conserving schedulers fail to recognize deceptive idleness
  Known to cause poor performance on disks [Iyer and Druschel 01]

Little performance impact on Flash, but it causes poor fairness
  While a task is deceptively idle, the system virtual time may advance, forfeiting the task's resources

Solution: Anticipation for Fairness

Anticipation: let a task stay active continuously when deceptive idleness appears between its consecutive requests
  #1: System virtual time considers such an active task's next anticipated request as a hypothetical outstanding request
  #2: Throttled dispatch also considers such an active task when deciding whether another task is an excessive resource over-user

Work-conserving or not
  Anticipation #2 may idle the device while there is pending work, which wastes resources
  Anticipation #1 maintains the work-conserving property
  Differentiated anticipation timeouts
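Throttled dispatch and anticipation might be combined roughly as sketched below. This is an illustration under stated assumptions rather than the FlashFQ code: the `Task` fields, the function names, the time units, and the timeout and threshold values are all hypothetical. A task counts as active if it has queued or outstanding work, or if its last request completed within an anticipation timeout.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    vtime: float = 0.0                      # accumulated (estimated) resource usage
    busy: bool = False                      # has a queued or outstanding request
    last_completion: float = float("-inf")  # time its last request finished


def active_tasks(tasks, now, anticipation_timeout):
    """Tasks with work, plus tasks still inside their anticipation window."""
    return [
        t for t in tasks
        if t.busy or now - t.last_completion <= anticipation_timeout
    ]


def system_virtual_time(tasks, now, anticipation_timeout):
    # An anticipated task counts as if it had an outstanding request, so the
    # minimum cannot run ahead of a deceptively idle task.
    active = active_tasks(tasks, now, anticipation_timeout)
    return min(t.vtime for t in active) if active else 0.0


def may_dispatch(task, tasks, now, anticipation_timeout, throttle_threshold):
    """Throttled dispatch: block a task that is too far ahead of the slowest one."""
    slowest = system_virtual_time(tasks, now, anticipation_timeout)
    return task.vtime - slowest <= throttle_threshold


# The reader finished its last request at t=100.0 and has not yet issued the next.
reader = Task("reader", vtime=10.0, busy=False, last_completion=100.0)
writer = Task("writer", vtime=50.0, busy=True)
tasks = [reader, writer]

# Shortly after the completion, the reader is still anticipated: writer is throttled.
print(may_dispatch(writer, tasks, now=100.5,
                   anticipation_timeout=1.0, throttle_threshold=20.0))
# Once the timeout expires, the reader no longer holds the writer back.
print(may_dispatch(writer, tasks, now=105.0,
                   anticipation_timeout=1.0, throttle_threshold=20.0))
```

Because the timeout is a parameter of each call, the system virtual time and the throttling check can be given different anticipation timeouts, in the spirit of the differentiated timeouts mentioned above.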

Discussion: Knowledge of Request Cost

Need to know a request's resource use before it completes
  Finish-time-based fair queueing
  Start-time-based fair queueing that allows parallelism

We estimate an I/O operation's resource use based on its type (read/write) and size
  For reads and writes respectively, we assume a linear model (with a non-zero offset) between the I/O size and its resource use
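A linear cost model of this kind could look like the sketch below. The function name `make_cost_model` and all coefficients are placeholders chosen for illustration; they are not measured values and do not come from the slides.

```python
def make_cost_model(read_offset, read_per_kb, write_offset, write_per_kb):
    """Return an estimator: cost = offset + slope * size, separately per I/O type."""

    def estimate(op, size_kb):
        if op == "read":
            return read_offset + read_per_kb * size_kb
        if op == "write":
            return write_offset + write_per_kb * size_kb
        raise ValueError(f"unknown I/O type: {op!r}")

    return estimate


# Placeholder coefficients in arbitrary resource units.
estimate = make_cost_model(read_offset=1.0, read_per_kb=0.25,
                           write_offset=2.0, write_per_kb=0.5)
print(estimate("read", 4))     # small read
print(estimate("write", 128))  # large write
```

The non-zero offset means that even a very small request is charged a fixed cost, so many tiny requests are not treated as free.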

Implementation Issues

We implement FlashFQ in the OS (Linux) to regulate I/O resource use by concurrent applications
  It can also be implemented in a virtual machine monitor to manage I/O resources among VMs

Queue plugging and request merging
  Critical performance enhancement techniques
  A complication for a fair queueing scheduler (re-computing virtual time tags for requests and tasks)

I/O context: the Linux resource principal to receive fairness
  Hard to use (impossible to group multiple threads together)
  Bug on process grouping
  The journaling daemon, inappropriately, has a unique I/O context by itself

Evaluation Setup

Demonstrate the fairness and responsiveness of FlashFQ

Compare against several alternatives:
  Raw device I/O (Linux noop)
  Linux CFQ: timeslice-based, but a timeslice ends if the task appears to be idle (even deceptively idle)
  Quanta: strict enforcement of timeslices
  FIOS: our previously developed timeslice Flash scheduler [FAST 12]
  4-Tag SFQ(D): no support for throttled dispatches or anticipation for fairness

Evaluation on Fairness

Only Quanta, FIOS, and FlashFQ achieve fairness

Evaluation on Responsiveness

Only FlashFQ achieves fairness and responsiveness

Evaluation with Apache and KyotoCabinet

Apache web server: reading mostly small files
KyotoCabinet key-value store: replacing large (128KB) records

Conclusions

Fair queueing is well suited for Flash I/O scheduling
  Mostly work-conserving (efficient), fair, and highly responsive
  Supports I/O parallelism on Flash
  Loss of spatial locality isn't a big concern on Flash

FlashFQ
  Builds on classic fair queueing with parallelism
  Throttled dispatch to address restricted parallelism on Flash
  Anticipation for fairness to address deceptive idleness

Practical lessons
  Requires knowledge (estimation) of request cost
  Linux implementation: queue plugging and request merging, proper I/O context maintenance (journaling)

[Figure labels: Task 1 and Task 2 dispatch timelines; Task 1 timeslice, Task 2 timeslice, an epoch, unresponsiveness]