Multiple Device Driver and Flash FTL
Sarah Diesburg
COP 5641

Introduction
- The kernel uses logical remapping layers over storage to hide complexity and add functionality
- Two examples
  - Multiple device drivers
  - Flash Translation Layer (FTL)

The md driver
- Provides virtual devices
  - Created from one or more independent underlying devices
- The basic mechanism to support
  - RAIDs
  - Full-disk encryption (software)
  - LVM
  - Secure deletion (TrueErase)

The md driver
- File systems are mounted on top of a device mapper virtual device
- The virtual device can
  - Abstract multiple devices
  - Perform encryption
  - Other things

[Figure: layered stack, top to bottom: Applications, user/kernel boundary, File System, DM]

Simple Device Mappers
- Linear
  - Maps a linear range of a device
- Delay
  - Delays reads and/or writes and maps them to different devices
- Zero
  - Provides a block device that always returns zeroed data on reads and silently drops writes
  - Similar behavior to /dev/zero, but as a block device instead of a character device
- Flakey
  - Used for testing only; simulates intermittent, catastrophic device failure
- http://lxr.linux.no/#linux+v3.2/Documentation/devicemapper

Loading a device mapper

    #!/bin/sh
    # Create an identity mapping for a device
    echo "0 `blockdev --getsize $1` linear $1 0" \
      | dmsetup create identity

Parts of the command:
- 0: logical start sector
- `blockdev --getsize $1`: command to get the number of sectors (512-byte units) of a device (like /dev/sda1)
- linear: type of device mapper device we want. Linear is a one-to-one logical-to-physical sector mapping.
Remaining parts of the same command:
- $1 (after "linear"): linear parameter, the base device (like /dev/sda1)
- 0 (last field): linear parameter, the starting offset within the device
- | (pipe): passes the table line to dmsetup; acts like the “table_file” parameter
- dmsetup: manages logical devices that use the device mapper driver. See ‘man dmsetup’ for more information.
- create: we wish to “create” a new logical device mapper device
- identity: we name the new device “identity”
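The table line in the script above carries three numbers that matter for a linear target: the logical start sector, the length, and the starting offset within the base device. As a rough user-space sketch (not kernel code), the remapping such a target performs is a bounds check plus an addition. The names `linear_table` and `linear_remap` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of one table line of the form
 * "<begin> <len> linear <device> <start>". All values are in sectors.
 */
struct linear_table {
    uint64_t begin;  /* logical start sector of the mapping */
    uint64_t len;    /* number of sectors covered by the mapping */
    uint64_t start;  /* starting offset within the base device */
};

/*
 * Translate a logical sector into a sector on the base device.
 * Returns 0 and stores the result in *physical on success,
 * or -1 if the logical sector lies outside the mapped range.
 */
static int linear_remap(const struct linear_table *t, uint64_t logical,
                        uint64_t *physical)
{
    assert(t && physical);

    if (logical < t->begin || logical - t->begin >= t->len)
        return -1;

    /* One-to-one mapping: same distance from the start, shifted by t->start */
    *physical = t->start + (logical - t->begin);
    return 0;
}
```

With the identity table from the script (begin 0, start 0), every logical sector maps to the same sector on the base device; a non-zero start simply shifts the whole range.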
Loading a device mapper
- Can then mount a file system directly on top of the virtual device

    #!/bin/bash
    mount /dev/mapper/identity /mnt

Unloading a device mapper

    #!/bin/bash
    umount /mnt
    dmsetup remove identity

- First unmount the file system
- Then use dmsetup to remove the device called identity

dm-linear.c
- Documentation
  - http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper/linear.txt
- Code
  - http://lxr.linux.no/#linux+v3.2/drivers/md/dm-linear.c

dm-linear.c

    static struct target_type linear_target = {
            .name   = "linear",         /* target type named in the table line */
            .version = {1, 1, 0},
            .module = THIS_MODULE,
            .ctr    = linear_ctr,       /* constructor */
            .dtr    = linear_dtr,       /* destructor */
            .map    = linear_map,       /* remaps each incoming bio */
            .status = linear_status,
            .ioctl  = linear_ioctl,
            .merge  = linear_merge,
            .iterate_devices = linear_iterate_devices,
    };

linear_map

    static int linear_map(struct dm_target *ti, struct bio *bio,
                          union map_info *map_context)
    {
            /* Per-target state: base device and starting offset */
            struct linear_c *lc = (struct linear_c *) ti->private;

            /* Redirect the bio to the underlying device */
            bio->bi_bdev = lc->dev->bdev;
            /* Offset within this target, shifted by the start offset on the device */
            bio->bi_sector = lc->start + (bio->bi_sector - ti->begin);

            /* Tell the device mapper core that the bio was remapped */
            return DM_MAPIO_REMAPPED;
    }

(Note: this is a simpler function from an earlier kernel version.
Version 3.2 does the same, but with a few more helper functions.)

Memory Technology Device
- Different from a character or block device
- For raw flash devices (not USB sticks)
- Exports a special character device with extra ioctls and operations to access flash storage
- Embedded chips
- http://www.linux-mtd.infradead.org/

NAND Flash Characteristics
- Flash has different constraints than hard drives or character devices
- Exports read, write, and erase operations

NAND Flash Characteristics
- Can only write to a freshly erased location
  - To write again to the same physical location, you must first erase the area
- Reads and writes are to smaller flash pages
- Erasures are performed in flash blocks
  - Each block holds many flash pages

NAND Flash Characteristics
- Each storage location can be erased only 10K-1M times
- Writing is slower than reading
- Erasures can be 10x slower than writing
- Each NAND page has a small, non-addressable out-of-band (OOB) area to hold state and mapping information
  - Accessed by ioctls

NAND Flash Characteristics
- We need a way to avoid wearing out the flash and to get good performance with a minimum of writes and erases

Flash Translation Layer
- The solution is to stack a flash translation layer (FTL) on top of the raw flash device
  - Exports a block device
  - Takes care of the flash operations of reads, writes, and erases
  - Spreads writes evenly across all flash locations (wear leveling)
  - Marks old pages as invalid until they can be erased later

Data Path
- Apps
- Virtual file system (VFS)
- Example stacks below the VFS
  - File system -> Multi-device drivers -> Disk driver, Disk driver
  - Ext3 -> FTL -> MTD driver
  - JFFS2 -> MTD driver

Flash Translation Layer
- Rotates the usage of pages

[Figure: the OS issues "write random bits to 1". Logical addresses 0-2 map onto flash physical pages 0-6. Logical 0 -> physical 0 and logical 1 -> physical 1; both pages hold data.]

Flash Translation Layer
- Overwrites go to a new page

[Figure: after the write, logical 1 -> physical 2, which holds the random bits. Physical pages 0 and 1 still hold the earlier data.]

FTL Example
- INFTL: Inverse NAND Flash Translation Layer
- Open-source FTL in the Linux kernel for DiskOnChip flash
- Somewhat outdated

INFTL
- Broken into
  two files
  - inftlmount.c: load/unload functions
  - inftlcore.c: flash and wear-leveling operations
- http://lxr.linux.no/linux+*/drivers/mtd/inftlmount.c
- http://lxr.linux.no/linux+*/drivers/mtd/inftlcore.c

INFTL
- Stack-based algorithm to provide the illusion of updates
- Each stack (or chain) corresponds to a virtual address with sequentially addressed pages

INFTL “Chaining”
- Chains can grow to any length
- Once there are no more freshly erased erase blocks, some old ones must be garbage-collected
  - The chain is “folded” so that all valid data is copied into the top erase block
  - Lower erase blocks in the chain are erased and put back into the pool

inftlcore.c

    static struct mtd_blktrans_ops inftl_tr = {
            .name       = "inftl",
            .major      = INFTL_MAJOR,
            .part_bits  = INFTL_PARTN_BITS,
            .blksize    = 512,              /* exported sector size in bytes */
            .getgeo     = inftl_getgeo,
            .readsect   = inftl_readblock,  /* read one sector */
            .writesect  = inftl_writeblock, /* write one sector */
            .add_mtd    = inftl_add_mtd,
            .remove_dev = inftl_remove_dev,
            .owner      = THIS_MODULE,
    };

inftl_writeblock

    static int inftl_writeblock(struct mtd_blktrans_dev *mbd, unsigned long block,
                                char *buffer)
    {
            struct INFTLrecord *inftl = (void *)mbd;
            unsigned int writeEUN;
            /* Byte offset of this sector within its erase unit */
            unsigned long blockofs = (block * SECTORSIZE) & (inftl->EraseSize - 1);
            size_t retlen;
            struct inftl_oob oob;
            char *p, *pend;

            /* Is block all zero? */
            pend = buffer + SECTORSIZE;
            for (p = buffer; p < pend && !*p; p++)
                    ;

            if (p < pend) {
                    /* Non-zero data: find an erase unit to write into */
                    writeEUN = INFTL_findwriteunit(inftl, block);

                    if (writeEUN == BLOCK_NIL) {
                            printk(KERN_WARNING "inftl_writeblock(): cannot find "
                                   "block to write to\n");
                            /*
                             * If we _still_ haven't got a block to use, we're screwed.
                             */
                            return 1;
                    }

                    /* Mark the sector as used in the out-of-band area, then write it */
                    memset(&oob, 0xff, sizeof(struct inftl_oob));
                    oob.b.Status = oob.b.Status1 = SECTOR_USED;

                    inftl_write(inftl->mbd.mtd, (writeEUN * inftl->EraseSize) +
                                blockofs, SECTORSIZE, &retlen, (char *)buffer,
                                (char *)&oob);
            } else {
                    /* An all-zero sector is not stored; the block is deleted instead */
                    INFTL_deleteblock(inftl, block);
            }

            return 0;
    }
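The core FTL idea from the earlier slides (overwrites go to a fresh page, and the old page is marked invalid until it can be erased) can be sketched in user space as follows. This is a minimal toy model, not INFTL's chaining algorithm: it has no erase blocks, garbage collection, or real wear leveling, and every name in it (`toy_ftl`, `ftl_init`, `ftl_write`, `ftl_read`) and the tiny geometry of 3 logical and 7 physical pages are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define NUM_LOGICAL 3    /* logical pages visible to the OS (assumed) */
#define NUM_PHYS    7    /* physical flash pages (assumed) */
#define PAGE_SIZE   16   /* bytes per page (assumed, unrealistically small) */
#define UNMAPPED    (-1)

enum page_state { PAGE_FREE, PAGE_VALID, PAGE_INVALID };

/* Hypothetical toy FTL: a logical-to-physical map plus per-page state. */
struct toy_ftl {
    int map[NUM_LOGICAL];              /* logical page -> physical page */
    enum page_state state[NUM_PHYS];   /* state of each physical page */
    char flash[NUM_PHYS][PAGE_SIZE];   /* simulated flash contents */
    int next_free;                     /* where to start looking for a free page */
};

static void ftl_init(struct toy_ftl *f)
{
    memset(f, 0, sizeof(*f));          /* PAGE_FREE is 0, so all pages start free */
    for (int i = 0; i < NUM_LOGICAL; i++)
        f->map[i] = UNMAPPED;
}

/*
 * Write PAGE_SIZE bytes to a logical page. The data always lands in a
 * free physical page; a previously mapped page is only marked invalid.
 * Returns the physical page used, or -1 if no free page is left
 * (a real FTL would garbage-collect at that point).
 */
static int ftl_write(struct toy_ftl *f, int logical, const char *data)
{
    int phys = UNMAPPED;

    assert(logical >= 0 && logical < NUM_LOGICAL);

    /* Rotate through the pages, starting after the last one used */
    for (int i = 0; i < NUM_PHYS; i++) {
        int cand = (f->next_free + i) % NUM_PHYS;
        if (f->state[cand] == PAGE_FREE) {
            phys = cand;
            break;
        }
    }
    if (phys == UNMAPPED)
        return -1;

    memcpy(f->flash[phys], data, PAGE_SIZE);
    f->state[phys] = PAGE_VALID;

    /* The old copy stays on flash but is no longer reachable */
    if (f->map[logical] != UNMAPPED)
        f->state[f->map[logical]] = PAGE_INVALID;

    f->map[logical] = phys;
    f->next_free = (phys + 1) % NUM_PHYS;
    return phys;
}

/* Returns the current contents of a logical page, or NULL if never written. */
static const char *ftl_read(const struct toy_ftl *f, int logical)
{
    assert(logical >= 0 && logical < NUM_LOGICAL);
    return f->map[logical] == UNMAPPED ? NULL : f->flash[f->map[logical]];
}
```

Writing logical pages 0 and 1 and then overwriting logical page 1 reproduces the situation in the "Overwrites go to new page" slide: the new data lands in physical page 2 and physical page 1 is left marked invalid. Callers are expected to pass a buffer of at least PAGE_SIZE bytes.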