Jun 30, 2018

Page 1

The CASE of FEMU: Cheap, Accurate, Scalable and Extensible Flash Emulator

Huaicheng Li, Mingzhe Hao, Michael Hao Tong, Swaminathan Sundararaman*, Matias Bjørling+, Haryadi S. Gunawi

ceres.cs.uchicago.edu

Page 2

What SSD platforms are used?

FEMU @ FAST ’18

Trends
- Software-Defined Flash
- Split-Level Architecture

57%

Platforms: Simulator, Emulator, Hardware Platform

Simulator
- Simple
- Time-saving
- Trace driven
- Internal research only
- SSDSim, FlashSim, DiskSim+SSD

Page 3

Hardware Platform
- Full-stack Research
- Accurate
- Expensive
- Complex to use
- Wear-out
- OpenSSD, OpenChannel-SSD

20%

19% Single SSD

1% Distributed SSDs

Page 4

Emulator
- Full-stack Research
- Cheap
- Poor Scalability
- Poor Accuracy
- VSSIM, FlashEm, LightNVM’s QEMU

[Diagram label: Guest OS]

Page 5

The “CASE” of FEMU

FEMU: QEMU/software-based Flash Emulator

❏ Cheap: $0, https://github.com/ucare-uchicago/femu
❏ Accurate: 0.5-38% error rate in latency
  ❏ 11% average at microsecond level
❏ Scalable: supports 32 channels/chips
❏ Extensible
  ❏ modifiable interface
  ❏ modifiable FTL

Page 6

What is FEMU?

[Diagram labels. Typical Fullstack Research: App, Host OS, Hardware Platform. FEMU Fullstack Research: App, Guest OS, VM, NVMe, FEMU, QEMU.]

Supported research:
- Kernel changes
- Interface changes
- FTL changes

Page 7

QEMU Scalability

[Diagram: Guest OS sending parallel IOs (IO, IO, IO, ...) to QEMU. Plot label: Expected]

Page 8

QEMU IDE Scalability

[Diagram: Guest OS sending one IO to QEMU. Plot labels: Expected, 1 IO thread]

Page 9

[Diagram: Guest OS sending two IOs to QEMU. Plot labels: Expected, 2 IO threads]

Page 10

[Diagram: Guest OS sending four IOs to QEMU. Plot label: Expected]

Represents VSSIM

Page 11

QEMU NVMe Scalability

[Diagram: Guest OS sending parallel IOs (IO, IO, IO, ...) to QEMU. Plot label: Expected]

Represents LightNVM’s QEMU

Page 12

QEMU Scalability

QEMU and existing emulators are NOT Scalable!

FEMU is Scalable!

Page 13

Scalability Root Causes & Solutions (1)

Root cause: VM-exit
- Guest OS (App, NVMe driver) rings the Tail DoorBell / Head DoorBell of QEMU NVMe Emulation
- Submission Queue, Completion Queue
- thousands of cycles of interrupt overhead

Solution: polling
- Shadow DoorBell for the Submission Queue and for the Completion Queue
- ZERO VM-exit
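
The polling idea on this slide can be sketched as a toy model. This is not FEMU's or QEMU's code: the class `SubmissionQueue`, the field `shadow_tail`, and the function `poll_once` are hypothetical names chosen for illustration. The sketch assumes the driver publishes its tail index with a plain memory write and that a device-side poller compares it with its own head index, so new commands are discovered without a doorbell trap.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionQueue:
    """Toy ring buffer standing in for an NVMe submission queue (hypothetical)."""
    depth: int
    slots: list = field(default_factory=list)
    head: int = 0         # next entry the device side will consume
    shadow_tail: int = 0  # tail index the driver writes into shared memory

    def __post_init__(self):
        self.slots = [None] * self.depth

    def submit(self, cmd):
        """Driver side: store a command, then advance the shadow tail."""
        next_tail = (self.shadow_tail + 1) % self.depth
        if next_tail == self.head:
            raise BufferError("submission queue full")
        self.slots[self.shadow_tail] = cmd
        self.shadow_tail = next_tail  # plain memory write, no trap in this model


def poll_once(sq):
    """Device side: drain every entry between head and the shadow tail."""
    drained = []
    while sq.head != sq.shadow_tail:
        drained.append(sq.slots[sq.head])
        sq.head = (sq.head + 1) % sq.depth
    return drained
```

In a real emulator the poller would run in a dedicated thread and call something like `poll_once` in a loop; the sketch keeps it single-threaded so the queue logic is easy to follow.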

Page 14

Scalability Root Causes & Solutions (2)

Root cause: DMA Emulation
- NVMe Emulation
- DMA Emulation
- Block Driver
- Image Format Driver
- Raw Device Driver
- AIO Queue
- Thread Pool
- Host File System
- Host Block IO Layer
- Host Device Driver

Solution: FEMU Heap Storage
- NVMe Emulation
- DMA from/to heap storage

More than 20us latency reduction
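
A minimal sketch of what heap-backed storage means, under stated assumptions: data lives in one in-memory buffer and a read or write is a direct copy, with no block driver, image format, or host file system in the path. The class `HeapStorage` and its methods are hypothetical and are not FEMU's API.

```python
class HeapStorage:
    """Toy in-memory backing store addressed by logical block (hypothetical)."""

    def __init__(self, num_blocks, block_size=4096):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # All data lives in process memory; nothing is written to a host file.
        self.data = bytearray(num_blocks * block_size)

    def _span(self, lba, nblocks):
        """Translate a block range into byte offsets, rejecting bad ranges."""
        if lba < 0 or nblocks <= 0 or lba + nblocks > self.num_blocks:
            raise ValueError("block range out of bounds")
        start = lba * self.block_size
        return start, start + nblocks * self.block_size

    def write(self, lba, payload):
        """Copy payload (a whole number of blocks) into the buffer."""
        if len(payload) == 0 or len(payload) % self.block_size != 0:
            raise ValueError("payload must be a whole number of blocks")
        start, end = self._span(lba, len(payload) // self.block_size)
        self.data[start:end] = payload

    def read(self, lba, nblocks):
        """Copy nblocks out of the buffer."""
        start, end = self._span(lba, nblocks)
        return bytes(self.data[start:end])
```

Because the buffer is ordinary process memory, its contents disappear when the process exits and its size is bounded by available RAM.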

Page 15

FEMU Accuracy

[Diagram: the same App runs on FEMU and on an OpenChannel-SSD, giving latencies L_femu and L_oc]

Error = |L_femu - L_oc| / L_oc
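
The error metric above is a relative difference, which a few lines of code make concrete. The function name `latency_error` and its parameter names are illustrative, not taken from FEMU.

```python
def latency_error(l_femu, l_oc):
    """Relative latency error of FEMU against the OpenChannel-SSD baseline.

    l_femu: latency measured on the emulator
    l_oc:   latency measured on the real OpenChannel-SSD (same unit)
    """
    if l_oc <= 0:
        raise ValueError("baseline latency must be positive")
    return abs(l_femu - l_oc) / l_oc
```

Both arguments must use the same unit. The result is a fraction, so multiply by 100 to compare it with the percentages quoted on the slides.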

Page 16

Single-Register model (S-Reg)
- NAND, Data Register, RAM
- Request latency: T_R + T_transfer
- Req2 behind Req1: queueing delay + T_R + T_transfer

Double-Register model (D-Reg)
- NAND, Data Register, Cache Register, RAM
- Req1 and Req2 overlap: faster

[Timeline diagrams: Req1 and Req2 over time for each model]
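
The difference between the two models can be sketched as a small timing calculation. This is one reading of the slide, not FEMU's implementation, and it assumes all requests target the same plane and arrive at time 0. In the single-register case the next NAND read waits until the previous transfer has left the register. In the double-register case a finished page moves to the cache register as soon as that register is free, which frees the data register for the next NAND read. The function names are hypothetical; `t_r` and `t_transfer` stand for T_R and T_transfer.

```python
def s_reg_completion_times(n, t_r, t_transfer):
    """Single register: NAND read and transfer of each request run back to back."""
    done = []
    reg_free = 0  # time at which the only register is free again
    for _ in range(n):
        finish = reg_free + t_r + t_transfer
        done.append(finish)
        reg_free = finish
    return done


def d_reg_completion_times(n, t_r, t_transfer):
    """Double register: the next NAND read overlaps the previous transfer."""
    done = []
    data_reg_free = 0   # when the data register can accept the next NAND read
    cache_reg_free = 0  # when the cache register has finished its transfer
    for _ in range(n):
        read_done = data_reg_free + t_r
        # The page moves to the cache register once that register is free.
        moved = max(read_done, cache_reg_free)
        finish = moved + t_transfer
        done.append(finish)
        data_reg_free = moved
        cache_reg_free = finish
    return done
```

With hypothetical values `t_r=40` and `t_transfer=60`, two back-to-back reads complete at 100 and 200 in the single-register sketch and at 100 and 160 in the double-register sketch, which matches the "faster" label on the slide.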

Page 17

FEMU Accuracy

Latency Error: 11-57% (Single Register Model) ⇒ 0.5-38% (Double Register Model)

[Plots: Single Register Model (S-Reg) and Double Register Model (D-Reg). X: # of channels, Y: # of planes per channel]

Similar!

Page 18

FEMU Limitations

● No persistence
● Further optimizations needed to support higher parallelism (more scalable)
● Accuracy can be improved
● Not able to emulate large-capacity SSDs

Page 19

Downloading, installing and using FEMU can cause side effects including headache, nausea, agitation, and depression. If your research condition does not improve after using FEMU for a week, please talk to your advisor or us right away.

FEMU 150mg

● Cheap
● Accurate
● Scalable
● Extensible

Installing, using and debugging FEMU can cause side effects including headache, nausea, agitation, and depression. If your research condition does not improve after using FEMU for a week, please talk to your advisor or us right away.

Page 20

Thank you! Questions?

http://ucare.cs.uchicago.edu

FEMU: https://github.com/ucare-uchicago/femu