Multiple Device Driver and Flash FTL
Sarah Diesburg
COP 5641
Introduction
The kernel uses logical remapping layers over storage to hide complexity and add functionality
Two examples
Multiple device drivers
Flash Translation Layer (FTL)
The md driver
Provides virtual devices
Created from one or more independent underlying devices
The basic mechanism that supports
RAIDs
Full-disk encryption (software)
LVM
Secure deletion (TrueErase)
The md driver
File systems are mounted on top of the device mapper (DM) virtual device
Virtual device can
Abstract multiple devices
Perform encryption
Other things
[Diagram: user/kernel applications sit above the file system, which sits above the DM virtual device]
Simple Device Mappers
Linear: maps a linear range of a device
Delay: delays reads and/or writes and maps them to different devices
Zero: provides a block device that always returns zeroed data on reads and silently drops writes; similar behavior to /dev/zero, but as a block device instead of a character device
Flakey: used for testing only; simulates intermittent, catastrophic device failure
http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper
Loading a device mapper
#!/bin/sh
# Create an identity mapping for a device
echo "0 `blockdev --getsize $1` linear $1 0" \
| dmsetup create identity
The command, piece by piece:
0 (first field): logical start sector
`blockdev --getsize $1`: command to get the number of sectors of a device (like /dev/sda1); newer versions of blockdev deprecate --getsize in favor of --getsz
linear: type of device mapper device we want. Linear is a one-to-one logical-to-physical sector mapping.
$1: linear parameter, the base device (like /dev/sda1)
0 (last field): linear parameter, the starting offset within the device
|: pipes the table line to dmsetup, which acts like the “table_file” parameter
dmsetup: manages logical devices that use the device mapper driver. See ‘man dmsetup’ for more information.
create: we wish to “create” a new logical device mapper device
identity: we name the new device “identity”
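The table line that the script pipes into dmsetup has a fixed shape: logical start sector, length in sectors, target type, then the target's own parameters. As a minimal sketch of that layout, the helper below builds the same line in C. The function name format_linear_table is hypothetical and is not part of dmsetup or the kernel; it only formats a string.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a dmsetup or kernel API): formats one
 * device-mapper table line for the "linear" target:
 *     <logical start> <length in sectors> linear <device> <offset>
 * Returns 0 on success, -1 if the line does not fit in the buffer.
 */
int format_linear_table(char *buf, size_t buflen,
                        unsigned long long start,
                        unsigned long long num_sectors,
                        const char *device,
                        unsigned long long offset)
{
    assert(buf != NULL && device != NULL);

    int n = snprintf(buf, buflen, "%llu %llu linear %s %llu",
                     start, num_sectors, device, offset);

    /* snprintf reports the length it wanted; treat truncation as failure */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For a 2048-sector /dev/sda1, the identity mapping from the script corresponds to start 0, length 2048, and offset 0.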
Loading a device mapper
A file system can then be mounted directly on top of the virtual device
#!/bin/bash
mount /dev/mapper/identity /mnt
Unloading a device mapper
#!/bin/bash
umount /mnt
dmsetup remove identity
umount /mnt: first unmount the file system
dmsetup remove identity: then use dmsetup to remove the device called identity
dm-linear.c
Documentation
http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper/linear.txt
Code
http://lxr.linux.no/#linux+v3.2/drivers/md/dm-linear.c
dm-linear.c
static struct target_type linear_target = {
	.name   = "linear",		/* target type used in the table line */
	.version = {1, 1, 0},
	.module = THIS_MODULE,
	.ctr    = linear_ctr,		/* constructor: parses the table parameters */
	.dtr    = linear_dtr,		/* destructor */
	.map    = linear_map,		/* remaps each incoming bio */
	.status = linear_status,
	.ioctl  = linear_ioctl,
	.merge  = linear_merge,
	.iterate_devices = linear_iterate_devices,
};
linear_map
static int linear_map(struct dm_target *ti, struct bio *bio,
		      union map_info *map_context)
{
	struct linear_c *lc = (struct linear_c *) ti->private;

	/* Redirect the bio to the underlying block device */
	bio->bi_bdev = lc->dev->bdev;
	/* Translate the logical sector into a sector on that device */
	bio->bi_sector = lc->start + (bio->bi_sector - ti->begin);

	/* Tell the device mapper core to submit the remapped bio */
	return DM_MAPIO_REMAPPED;
}
(Note: this is a simpler function from an earlier kernel version.
Version 3.2 does the same, but with a few more helper functions.)
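The sector arithmetic in linear_map can be tried outside the kernel. The sketch below is a user-space stand-in, not kernel code: struct linear_map_sketch and linear_remap are hypothetical names, and the three fields mirror ti->begin, ti->len, and lc->start from the function above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical user-space stand-in for the state linear_map reads */
struct linear_map_sketch {
    sector_t begin;  /* first logical sector of the target (ti->begin) */
    sector_t len;    /* number of sectors in the target (ti->len) */
    sector_t start;  /* starting offset on the underlying device (lc->start) */
};

/* Same arithmetic as linear_map: physical = start + (logical - begin) */
sector_t linear_remap(const struct linear_map_sketch *m, sector_t logical)
{
    /* The sector must fall inside the range this target covers */
    assert(logical >= m->begin && logical - m->begin < m->len);
    return m->start + (logical - m->begin);
}
```

With begin = 0 and start = 0 this is the identity mapping created by the earlier script; a nonzero start shifts every sector by a constant.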
Memory Technology Device
Different from a character or block device
Exports a special character device with extra ioctls and operations to access flash storage
For raw flash devices (not USB sticks)
Embedded chips
http://www.linux-mtd.infradead.org/
NAND Flash Characteristics
Flash has different constraints than hard drives or character devices
Exports read, write, and erase operations
NAND Flash Characteristics
Can only write to a freshly-erased location
If you want to write again to same physical location, you must first erase the area
Reads and writes operate on small flash pages
Erasures operate on larger flash blocks
Each flash block holds many flash pages
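The erase-before-write rule can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not a driver: the sizes are tiny on purpose, all names (nand_sketch, nand_erase_block, nand_program_page) are hypothetical, and an erased page is modeled as all 0xFF bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_PAGE_SIZE       4   /* bytes per page (tiny, for illustration) */
#define SK_PAGES_PER_BLOCK 4
#define SK_NUM_BLOCKS      2

/* Toy model of a NAND chip: blocks of pages of bytes */
struct nand_sketch {
    uint8_t data[SK_NUM_BLOCKS][SK_PAGES_PER_BLOCK][SK_PAGE_SIZE];
};

/* Erase works on a whole block and resets every byte to 0xFF */
void nand_erase_block(struct nand_sketch *n, unsigned block)
{
    assert(block < SK_NUM_BLOCKS);
    memset(n->data[block], 0xff, sizeof n->data[block]);
}

/* A page counts as freshly erased if all of its bytes are 0xFF */
int nand_page_is_erased(const struct nand_sketch *n,
                        unsigned block, unsigned page)
{
    for (unsigned i = 0; i < SK_PAGE_SIZE; i++)
        if (n->data[block][page][i] != 0xff)
            return 0;
    return 1;
}

/* Write one page; returns -1 if the page was not erased first */
int nand_program_page(struct nand_sketch *n, unsigned block,
                      unsigned page, const uint8_t *src)
{
    assert(block < SK_NUM_BLOCKS && page < SK_PAGES_PER_BLOCK);
    if (!nand_page_is_erased(n, block, page))
        return -1;
    memcpy(n->data[block][page], src, SK_PAGE_SIZE);
    return 0;
}
```

Note the granularity mismatch: rewriting a single page requires erasing its whole block, which also wipes the other pages in that block.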
NAND Flash Characteristics
Each storage location can be erased only 10K-1M times
Writing is slower than reading
Erasures can be 10x slower than writing
Each NAND page has a small, non-addressable out-of-band (OOB) area to hold state and mapping information
Accessed by ioctls
NAND Flash Characteristics
We need a way to avoid wearing out the flash while keeping good performance, using a minimum of writes and erases
Flash Translation Layer
The solution is to stack a flash translation layer (FTL) on top of the raw flash device
Exports a block device
Takes care of the flash operations of reads, writes, and erases
Spreads writes evenly across all flash locations (wear leveling)
Marks old pages as invalid until they can be erased later
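One simple way to spread wear is to always allocate the free erase block that has been erased the fewest times. The sketch below shows only that selection step. It is an assumed policy for illustration, not the one INFTL or any particular FTL uses, and the names wl_block and wl_pick_block are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define WL_NO_BLOCK (-1)

/* Per-erase-block bookkeeping an FTL might keep (hypothetical layout) */
struct wl_block {
    unsigned erase_count;  /* how many times this block has been erased */
    int is_free;           /* nonzero if the block is erased and unused */
};

/*
 * Returns the index of the free block with the lowest erase count,
 * or WL_NO_BLOCK if no block is free (garbage collection is needed).
 * Ties go to the lowest index.
 */
int wl_pick_block(const struct wl_block *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);

    int best = WL_NO_BLOCK;
    for (size_t i = 0; i < n; i++) {
        if (!blocks[i].is_free)
            continue;
        if (best == WL_NO_BLOCK ||
            blocks[i].erase_count < blocks[best].erase_count)
            best = (int)i;
    }
    return best;
}
```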
Data Path
[Diagram of the storage data path, top to bottom: Apps; virtual file system (VFS); file systems (Ext3, JFFS2); multi-device drivers and FTL; disk drivers and MTD drivers]
Flash Translation Layer
Rotates the usage of pages
Example: the OS writes random bits to logical address 1
Logical Address   Physical Address
0                 0
1                 1
[Diagram: flash pages 0 through 6; the mapped pages hold data]
Flash Translation Layer
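Overwrites go to a new page
After the OS writes random bits to logical address 1, the FTL remaps that address:
Logical Address   Physical Address
0                 0
1                 2
[Diagram: flash pages 0 through 6; the random bits land in page 2, and the old copy in page 1 remains as stale data]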
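The remapping in the two slides above can be sketched as a small page-mapped FTL. This is a simplified model, not how INFTL works: pages are handed out in order, there is no garbage collection, and every name (ftl_sketch, ftl_init, ftl_write) is hypothetical.

```c
#include <assert.h>

#define FTL_PAGES    7     /* physical pages 0..6, as in the diagram */
#define FTL_LOGICAL  2     /* logical addresses 0 and 1 */
#define FTL_UNMAPPED (-1)

enum page_state { PAGE_FREE, PAGE_VALID, PAGE_INVALID };

struct ftl_sketch {
    int map[FTL_LOGICAL];              /* logical -> physical page */
    enum page_state state[FTL_PAGES];  /* state of each physical page */
    int next_free;                     /* next never-written page */
};

void ftl_init(struct ftl_sketch *f)
{
    for (int i = 0; i < FTL_LOGICAL; i++)
        f->map[i] = FTL_UNMAPPED;
    for (int i = 0; i < FTL_PAGES; i++)
        f->state[i] = PAGE_FREE;
    f->next_free = 0;
}

/*
 * Write to a logical address. The data always goes to a fresh page;
 * the previous page (if any) is only marked invalid, to be erased later.
 * Returns the physical page used, or -1 if no free page is left.
 */
int ftl_write(struct ftl_sketch *f, int logical)
{
    assert(logical >= 0 && logical < FTL_LOGICAL);
    if (f->next_free >= FTL_PAGES)
        return -1;  /* a real FTL would garbage-collect here */

    int old_page = f->map[logical];
    int new_page = f->next_free++;

    if (old_page != FTL_UNMAPPED)
        f->state[old_page] = PAGE_INVALID;
    f->state[new_page] = PAGE_VALID;
    f->map[logical] = new_page;
    return new_page;
}
```

Writing logical addresses 0 and 1 once and then overwriting address 1 reproduces the second mapping table: address 1 moves from physical page 1 to physical page 2.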
FTL Example
INFTL: Inverse NAND Flash Translation Layer
Open-source FTL in the Linux kernel for DiskOnChip flash
Somewhat outdated
INFTL
Broken into two files
inftlmount.c – load/unload functions
inftlcore.c – flash and wear-leveling operations
http://lxr.linux.no/linux+*/drivers/mtd/inftlmount.c
http://lxr.linux.no/linux+*/drivers/mtd/inftlcore.c
INFTL
Stack-based algorithm that provides the illusion of in-place updates
Each stack (or chain) corresponds to a virtual address with sequentially-addressed pages
INFTL “Chaining”
Chains can grow to any length
Once there are no more freshly-erased erase blocks, some old ones must be garbage-collected
Chain is “folded” so that all valid data is copied into top erase block
Lower erase blocks in chain are erased and put back into the pool
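Folding can be illustrated with a toy chain. The model below is a sketch under simplifying assumptions and does not follow the data structures in inftlcore.c: each erase block records which page offsets it holds, index 0 is the oldest block, the last index is the top of the chain, and the names fold_block, fold_chain, and fold_chain_fold are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define FOLD_PAGES     4   /* pages per erase block */
#define FOLD_MAX_CHAIN 8

struct fold_block {
    int used[FOLD_PAGES];  /* 1 if this page offset was written here */
    int data[FOLD_PAGES];  /* stand-in for the page contents */
};

struct fold_chain {
    struct fold_block blocks[FOLD_MAX_CHAIN]; /* [0] oldest, [len-1] top */
    int len;
};

/*
 * Fold the chain: for every page offset missing from the top block, copy
 * the newest copy from the lower blocks into it. The lower blocks are then
 * "erased" (zeroed here) and the chain shrinks to one block.
 * Returns the number of blocks returned to the free pool.
 */
int fold_chain_fold(struct fold_chain *c)
{
    assert(c->len >= 1 && c->len <= FOLD_MAX_CHAIN);
    struct fold_block *top = &c->blocks[c->len - 1];

    for (int p = 0; p < FOLD_PAGES; p++) {
        if (top->used[p])
            continue;                       /* top already has newest copy */
        for (int b = c->len - 2; b >= 0; b--) {
            if (c->blocks[b].used[p]) {     /* newest lower copy wins */
                top->data[p] = c->blocks[b].data[p];
                top->used[p] = 1;
                break;
            }
        }
    }

    int freed = c->len - 1;
    if (freed > 0) {
        c->blocks[0] = *top;                /* chain is now just the top */
        for (int b = 1; b < c->len; b++)
            memset(&c->blocks[b], 0, sizeof c->blocks[b]);
    }
    c->len = 1;
    return freed;
}
```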
inftlcore.c
static struct mtd_blktrans_ops inftl_tr = {
	.name		= "inftl",
	.major		= INFTL_MAJOR,
	.part_bits	= INFTL_PARTN_BITS,
	.blksize	= 512,
	.getgeo		= inftl_getgeo,
	.readsect	= inftl_readblock,
	.writesect	= inftl_writeblock,
	.add_mtd	= inftl_add_mtd,
	.remove_dev	= inftl_remove_dev,
	.owner		= THIS_MODULE,
};
inftl_writeblock
static int inftl_writeblock(struct mtd_blktrans_dev *mbd, unsigned long block,
			    char *buffer)
{
	struct INFTLrecord *inftl = (void *)mbd;
	unsigned int writeEUN;
	/* Byte offset of this sector within its erase unit */
	unsigned long blockofs = (block * SECTORSIZE) & (inftl->EraseSize - 1);
	size_t retlen;
	struct inftl_oob oob;
	char *p, *pend;

	/* Is block all zero? */
	pend = buffer + SECTORSIZE;
	for (p = buffer; p < pend && !*p; p++)
		;

	if (p < pend) {
		/* Non-zero data: find an erase unit to write it to */
		writeEUN = INFTL_findwriteunit(inftl, block);
		if (writeEUN == BLOCK_NIL) {
			printk(KERN_WARNING "inftl_writeblock(): cannot find "
				"block to write to\n");
			/*
			 * If we _still_ haven't got a block to use, we're screwed.
			 */
			return 1;
		}

		/* Mark the sector as used in the out-of-band area */
		memset(&oob, 0xff, sizeof(struct inftl_oob));
		oob.b.Status = oob.b.Status1 = SECTOR_USED;
		inftl_write(inftl->mbd.mtd, (writeEUN * inftl->EraseSize) +
			blockofs, SECTORSIZE, &retlen, (char *)buffer,
			(char *)&oob);
	} else {
		/* All-zero sector: delete the block instead of writing it */
		INFTL_deleteblock(inftl, block);
	}
	return 0;
}
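Two details of inftl_writeblock are easy to check in isolation: the all-zero scan that decides between writing and deleting, and the mask that computes a sector's byte offset inside its erase unit. The user-space sketch below reproduces both. The function names are hypothetical, SK_SECTORSIZE is assumed to be 512 to match .blksize, and the mask only works when the erase size is a power of two.

```c
#include <assert.h>
#include <stddef.h>

#define SK_SECTORSIZE 512  /* assumed to match .blksize in inftl_tr */

/* Returns 1 if every byte of the sector is zero, else 0 */
int sector_is_all_zero(const char *buffer)
{
    const char *p = buffer;
    const char *pend = buffer + SK_SECTORSIZE;

    /* Same scan as the kernel loop: stop at the first non-zero byte */
    while (p < pend && !*p)
        p++;
    return p == pend;
}

/*
 * Byte offset of a sector within its erase unit, computed as in
 * inftl_writeblock. erase_size must be a power of two for the mask
 * to be equivalent to a modulo.
 */
unsigned long sector_offset_in_unit(unsigned long block,
                                    unsigned long erase_size)
{
    assert(erase_size != 0 && (erase_size & (erase_size - 1)) == 0);
    return (block * SK_SECTORSIZE) & (erase_size - 1);
}
```

With a 16 KiB erase unit, sector 33 starts at byte 16896 overall, which is byte 512 of the second erase unit.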