IBM eServer xSeries Technical Conference © IBM Corporation 2003 1 Session ID: O24 Steve Dobbelstein Lake Buena Vista, FL September 8-12, 2003 Enterprise Volume Management System for Linux © IBM Corporation 2003
Jan 19, 2018
IBM eServer xSeries Technical Conference© IBM Corporation 2003 1
Session ID: O24
Steve Dobbelstein
Lake Buena Vista, FL September 8-12, 2003
Enterprise Volume Management System for Linux
© IBM Corporation 2003

Trademark Notes
  The following terms are trademarks of International Business Machines Corporation in the United States and/or other countries:
    IBM, AIX, OS/2, System/390
  Linux is a trademark of Linus Torvalds.
  Other company, product, and service names may be trademarks or service marks of others.

Agenda
  Volume management basics
    Overview of volume management schemes
  Enterprise Volume Management System (EVMS)
    Overview
    Demonstration

What is Volume Management?
  Provide a logical abstraction of the physical storage devices
  File systems and applications do not need to know about the organization of the physical devices
  [Figure: three disks presented as four volumes]

Where To Use Volume Management
  Systems with lots of physical storage
    Lots of disks (tens, hundreds, thousands)
    Combine many disks into a single pool of storage
      Increased total storage space
      Redundancy to protect against hardware failures
  Systems with little physical storage
    Single disk (most PCs, laptops)
    Divide up disk to provide logically separate storage for different uses

Disk Partitioning
  Divide a disk into one or more logical sections
  Simple, widely used
  Fixed sizes, difficult to resize
  No redundancy
  [Figure: disk hda divided into partitions hda1, hda5, and hda6]

RAID
  Redundant Array of Inexpensive Disks
  Combine several disks
    Increase total storage space
    Provide redundancy
    Improve performance
  Can be done in hardware or software

RAID Linear
  Linear concatenation of several disks
  Increased total storage space
  No redundancy or performance improvement
  [Figure: hda, hdb, hdc, and hdd concatenated into md/md3]
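
A minimal sketch of how a linear (concatenated) device could translate a logical sector into a disk and an offset on that disk. The function name `linear_map` and the sector-count sizes are assumptions made for illustration; this is not the MD driver's code.

```python
from bisect import bisect_right
from itertools import accumulate


def linear_map(sector, disk_sizes):
    """Map a logical sector to (disk_index, sector_on_that_disk).

    disk_sizes lists the size of each member disk in sectors, in
    concatenation order (hypothetical values, for illustration only).
    """
    ends = list(accumulate(disk_sizes))  # exclusive end sector of each disk
    if not 0 <= sector < ends[-1]:
        raise ValueError("sector is outside the linear device")
    disk = bisect_right(ends, sector)    # first disk whose end is past the sector
    start = ends[disk] - disk_sizes[disk]
    return disk, sector - start
```

Because each disk is filled before the next one is used, a single request normally touches one disk, which is why concatenation adds space but no parallelism.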

RAID 0
  Striping
    Data are interleaved across all disks
  Increased total storage space
  Improved performance with parallel I/O
  No redundancy
  [Figure: md/md0 striped across four disks; chunk numbers stored on each disk]
    hda: 1, 5, 9, 13, 17
    hdb: 2, 6, 10, 14, 18
    hdc: 3, 7, 11, 15, 19
    hdd: 4, 8, 12, 16, 20
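
The interleaving in the figure can be sketched as a simple modulo mapping. The function name `stripe_location` is hypothetical, and chunk numbers are 1-based only to match the figure.

```python
def stripe_location(chunk, num_disks):
    """Return (disk_index, row_on_disk) for a 1-based chunk number."""
    if chunk < 1:
        raise ValueError("chunk numbers start at 1")
    index = chunk - 1
    return index % num_disks, index // num_disks


def chunks_on_disk(disk, num_disks, total_chunks):
    """List the chunk numbers that land on one disk."""
    return [c for c in range(1, total_chunks + 1)
            if stripe_location(c, num_disks)[0] == disk]
```

With four disks and twenty chunks, disk 0 (hda in the figure) receives chunks 1, 5, 9, 13, and 17, so consecutive chunks can be read or written in parallel.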

RAID 1
  Mirroring
  Redundancy
    Multiple copies of all data
  No extra storage space
    Device size is equal to a single disk
  Improved read performance
  Reduced write performance
  [Figure: hda and hdb mirrored as md/md1]
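
A toy model of mirroring, assuming in-memory dictionaries stand in for disks. The class `Mirror` and its methods are invented for this sketch; it only illustrates why every write costs one I/O per copy while a read can be served by any surviving copy.

```python
class Mirror:
    """Toy RAID 1 device: each 'disk' is a dict of sector -> data."""

    def __init__(self, copies=2):
        self.disks = [dict() for _ in range(copies)]
        self.failed = set()  # indexes of disks marked as failed

    def fail(self, index):
        self.failed.add(index)

    def write(self, sector, data):
        # Every working copy must be updated, so writes cost more.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[sector] = data

    def read(self, sector):
        # Any working copy can satisfy the read.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[sector]
        raise OSError("all mirror copies have failed")
```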

RAID 4
  Striping with parity
  Redundancy, but less than with mirroring
    One "chunk" of parity bits per stripe
  Increased storage space (minus size of one disk)
  Improved performance, but at cost of CPU overhead
  [Figure: md/md4 with data striped across three disks and parity (P) on a dedicated disk]
    hda: 1, 4, 7, 10, 13
    hdb: 2, 5, 8, 11, 14
    hdc: 3, 6, 9, 12, 15
    hdd: P, P, P, P, P
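
A small sketch of how a parity chunk protects a stripe, assuming XOR parity (the usual choice for RAID 4 and RAID 5). The helper name `xor_blocks` and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data chunks plus one parity chunk (sample data).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# If the second disk is lost, XOR of the survivors and the parity
# chunk reproduces the missing chunk.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Computing the parity on every write is the CPU overhead mentioned on the slide, and a stripe can survive the loss of only one chunk.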

RAID 5
  RAID 4 creates a bottleneck on the parity disk
  Spread parity among all disks for better performance
  Same total size as RAID 4
  [Figure: md/md5 with parity (P) rotated across the four disks]
    hda: 1, P, 7, 10, 13
    hdb: 2, 4, P, 11, 14
    hdc: 3, 5, 8, P, 15
    hdd: P, 6, 9, 12, P
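
The rotation shown in the figure can be reproduced with a short sketch. The function name `raid5_layout` is hypothetical, and the rotation rule simply follows the figure; real implementations offer several parity layouts, so this is not necessarily the MD driver's default.

```python
def raid5_layout(num_stripes, num_disks):
    """Return one row per stripe; each row lists what every disk holds."""
    rows = []
    chunk = 1
    for stripe in range(num_stripes):
        # Parity starts on the last disk and moves one disk per stripe,
        # wrapping around (the pattern drawn on the slide).
        parity_disk = (num_disks - 1 + stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(str(chunk))
                chunk += 1
        rows.append(row)
    return rows
```

Calling `raid5_layout(5, 4)` gives rows whose first column reads 1, P, 7, 10, 13, matching hda in the figure. No single disk holds all of the parity, so parity updates are spread across the array.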

Volume Groups
  A collection of devices (disks, partitions, RAID)
  The space of all devices is combined in the group, but not directly available as a device.
  [Figure: partitions hda1 and hda5 of disk hda, plus disk hdb, combined into group lvm/vg1]

Volume Groups
  Combined space is divided into fixed-sized chunks
    Physical Extents (PEs)
    Similar to memory page frames
  [Figure: hda1, hda5, and hdb in group lvm/vg1, divided into physical extents]

Volume Groups
  Create volumes from free-space in the group
  Volumes consist of Logical Extents (LEs)
    Each LE maps to a PE
  [Figure: volumes lvm/vg1/home and lvm/vg1/web allocated from group lvm/vg1]

Volume Groups
  Simple resizing of groups
    Add new devices to the group to expand total available free-space
    Remove devices that aren't used by any volumes
  [Figure: disk hdc added to group lvm/vg1 alongside hda1, hda5, and hdb]

Volume Groups
  Simple resizing of volumes
    Add or remove extents at the end of the volume
  [Figure: volumes lvm/vg1/home and lvm/vg1/web resized within group lvm/vg1]
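
The volume group slides can be summarized in a small model. This is a sketch only: the class `VolumeGroup`, its method names, and the 4 MB extent size are assumptions, not the LVM or EVMS implementation.

```python
class VolumeGroup:
    """Toy volume group: devices are cut into physical extents (PEs) and
    each volume is a list whose position n holds the PE for logical extent n."""

    def __init__(self, extent_size_mb=4):
        self.extent_size_mb = extent_size_mb
        self.free = []     # free PEs as (device_name, extent_index)
        self.volumes = {}  # volume name -> list of PEs, indexed by LE number

    def add_device(self, name, size_mb):
        count = size_mb // self.extent_size_mb
        self.free.extend((name, i) for i in range(count))

    def remove_device(self, name):
        for extents in self.volumes.values():
            if any(device == name for device, _ in extents):
                raise ValueError("device is still used by a volume")
        self.free = [pe for pe in self.free if pe[0] != name]

    def _take(self, count):
        if count > len(self.free):
            raise ValueError("not enough free extents in the group")
        taken, self.free = self.free[:count], self.free[count:]
        return taken

    def create_volume(self, name, extents):
        self.volumes[name] = self._take(extents)

    def extend_volume(self, name, extents):
        # New extents are added at the end of the volume.
        self.volumes[name].extend(self._take(extents))

    def shrink_volume(self, name, extents):
        # Extents are removed from the end and returned to the free pool.
        volume = self.volumes[name]
        if extents > len(volume):
            raise ValueError("volume is smaller than the requested shrink")
        keep = len(volume) - extents
        self.free.extend(volume[keep:])
        del volume[keep:]
```

Because a volume is just a table of extents, it can span devices and grow without moving existing data; only the table gets longer.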

Snapshots
  Frozen image of a live volume
    Useful for performing consistent backups without needing to take file system off-line
  Copy-On-Write to save old data
  "Origin" volume is always up-to-date
  Snapshot capacity can be smaller than origin
  Multiple simultaneous snapshots of same origin

Snapshots
  Origin volume and snapshot storage are divided into equal sized "chunks"
  [Figure: origin volume, COW table, and snapshot storage]

Snapshots
  Write to the origin volume
    1. Write request is queued to be finished later
    2. Chunk is read from the origin
    3. Write chunk to snapshot storage
    4. New COW table entry: Map chunk 4 to chunk 1
    5. Release all I/Os waiting on this copy
  [Figure: origin volume and snapshot storage at each step of the write]

Snapshots
  Read from the snapshot volume
    Unmapped chunk: get data from origin volume
    Re-mapped chunk: get data from snapshot
  [Figure: snapshot volume reading from the origin volume or from snapshot storage]
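
The write and read paths from the snapshot slides can be sketched in a few lines. The class `Snapshot` and its method names are invented for illustration, chunks are plain Python values, and the queuing of in-flight I/O is left out; this is not the kernel's snapshot code.

```python
class Snapshot:
    """Toy copy-on-write snapshot of an origin volume (a list of chunks)."""

    def __init__(self, origin, capacity):
        self.origin = origin    # live volume, always up to date
        self.capacity = capacity  # snapshot storage size in chunks
        self.storage = []       # saved copies of old chunks
        self.cow_table = {}     # origin chunk index -> storage chunk index

    def write_origin(self, index, data):
        if index not in self.cow_table:
            if len(self.storage) >= self.capacity:
                raise RuntimeError("snapshot storage is full")
            # First write to this chunk since the snapshot was taken:
            # save the old contents, then record the mapping.
            self.storage.append(self.origin[index])
            self.cow_table[index] = len(self.storage) - 1
        self.origin[index] = data  # now the original write can finish

    def read_snapshot(self, index):
        if index in self.cow_table:
            return self.storage[self.cow_table[index]]  # re-mapped chunk
        return self.origin[index]                       # unmapped chunk
```

Only chunks that change after the snapshot is taken consume snapshot storage, which is why the snapshot can be smaller than its origin.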

Bad Block Relocation
  Detect I/O errors
  Remap bad blocks to a reserved area of the device
  [Figure: BBR device with a normal data area followed by reserved sectors]
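
A rough sketch of the remapping idea, assuming write failures are simulated by a set of known-bad sector numbers. The class `BBRDevice` and its fields are hypothetical and do not reflect the on-disk format used by EVMS.

```python
class BBRDevice:
    """Toy bad-block-relocation layer over a simulated disk."""

    def __init__(self, data_sectors, reserved_sectors, bad=()):
        self.data_sectors = data_sectors
        # Reserved sectors sit after the normal data area.
        self.spare = list(range(data_sectors, data_sectors + reserved_sectors))
        self.remap = {}       # logical sector -> replacement sector
        self.bad = set(bad)   # simulated sectors that fail on write
        self.disk = {}        # physical sector -> data

    def _physical(self, sector):
        return self.remap.get(sector, sector)

    def write(self, sector, data):
        if not 0 <= sector < self.data_sectors:
            raise ValueError("sector is outside the data area")
        target = self._physical(sector)
        while target in self.bad:  # simulated I/O error detected
            if not self.spare:
                raise OSError("no reserved sectors left")
            target = self.spare.pop(0)
            self.remap[sector] = target
        self.disk[target] = data

    def read(self, sector):
        return self.disk[self._physical(sector)]
```

Callers keep using the original sector number; the remap table hides the relocation from the layers above.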

Enterprise Volume Management System
  Modular, extensible system for managing storage on Linux
  Integrates all aspects of volume management into a single package
    Disk Partitioning (fdisk, Disk Druid)
    Volume Groups (LVM)
    Software RAID (MD)
    File Systems (mkfs, fsck, resizefs)
    More

EVMS Architecture
  [Figure: architecture diagram]
    User-Space
      User Interfaces: GUI, Text-Mode, CLI
      EVMS Engine
      Plug-ins: DOS, LVM, RAID
    Kernel-Space
      MD, DM
      RAID-0, RAID-1, Linear, Snapshot

EVMS Architecture
  Engine
    Core library
    Provides API for user interfaces
    Coordinates all activities with the plug-ins
    Defines common set of possible tasks
      Creation
      Deletion
      Resize
      Configuration changes

EVMS Engine Architecture
  [Figure: the EVMS Engine and its plug-in layers]
    Device Managers
    Segment Managers
    Region Managers
    EVMS Features
    File System Interface Modules
    Cluster Manager

EVMS Engine Architecture
  Volume discovery bubbles up through the layers
    Local Disk Manager plug-in discovers all disk devices
    Each plug-in examines current list of devices
      Claims a device by removing from the list
      Creates new devices and adds to the list
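
The discovery pass can be pictured as a list handed from one plug-in to the next. The sketch below is illustrative only: the plug-in functions, the `PARTITIONS` and `LVM_GROUPS` tables, and the device names are hypothetical stand-ins for on-disk metadata, and the real EVMS plug-in interface is a C API that is not shown here.

```python
# Hypothetical metadata that real plug-ins would read from the devices.
PARTITIONS = {"hda": ["hda1", "hda5"]}
LVM_GROUPS = {"vg1": {"members": ["hda1", "hda5", "hdb"],
                      "volumes": ["home", "web"]}}


def local_disk_manager(objects):
    # Bottom layer: discovers the disks themselves.
    return objects + ["hda", "hdb"]


def dos_segment_manager(objects):
    result = []
    for obj in objects:
        if obj in PARTITIONS:
            result.extend(PARTITIONS[obj])  # claim the disk, add its segments
        else:
            result.append(obj)              # not recognized, leave it in the list
    return result


def lvm_region_manager(objects):
    result = list(objects)
    for group, info in LVM_GROUPS.items():
        if all(member in result for member in info["members"]):
            for member in info["members"]:
                result.remove(member)       # claim the group's members
            result.extend(f"lvm/{group}/{v}" for v in info["volumes"])
    return result


def discover(plugins):
    objects = []
    for plugin in plugins:                  # lowest layer first
        objects = plugin(objects)
    return objects


volumes = discover([local_disk_manager, dos_segment_manager, lvm_region_manager])
```

Whatever remains in the list after the last layer has run is what the user sees as volumes; here that is `lvm/vg1/home` and `lvm/vg1/web`.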

EVMS Engine Architecture
  Plug-ins
    Each plug-in recognizes a specific volume format
      Partitions
      Volume Groups
      Software RAID
      EVMS Features
      File Systems

EVMS Plug-ins
  [Figure: plug-ins attached to the EVMS Engine]
    Cluster Managers: Linux HA, RSCT
    Device Manager: Local Disk Manager
    Segment Managers: DOS, GPT, BSD, Mac, Bad Block, Cluster, Sparse, S/390
    Region Managers: LVM, AIX, OS/2, MD Linear, MD RAID 0, MD RAID 1, MD RAID 4/5, MD Multipath
    Features: Drive Linking, Bad Block, Snapshot
    FSIMs: ext2/3, JFS, ReiserFS, Swap, XFS

EVMS User Interfaces
  Communicate with Engine through well-defined API
  Graphical User Interface (evmsgui)
    Point-and-click
    Intuitive displays
  Text-Mode (evmsn)
    Graphical-like for terminal windows
    Same look and feel as the GUI
  Command-Line (evms)
    Create scripts for common tasks

Clustering Support in EVMS
  EVMS can operate in a cluster of machines
  Assign ownership to a group of disks
    Private to one cluster node
    Shared by all cluster nodes
    Reassign ownership during fail-over
  Remote Administration
    Ability to administer other nodes in the cluster from a single machine

Clustering Support in EVMS
  Cluster Managers
    Provide membership and communication services
    Used to specify fail-over policies
    Linux-HA
      Open-source, High-Availability cluster manager
    RSCT
      Reliable Scalable Cluster Technology
  OpenGFS (in development)
    Open-source cluster file system
    Mount a file system on all cluster nodes simultaneously

EVMS Platform Support
  EVMS has been tested on Linux running on:
    xSeries: x86, IA64
    pSeries: PPC, PPC64
    zSeries: s390, s390x

Summary of EVMS Capabilities
  EVMS integrates all aspects of disk, partition, and volume management into a single, easy-to-use system.
  EVMS has no design limits on the number of disks, partitions, or volumes that it can handle.
  EVMS minimizes down time due to configuration changes.
  EVMS is very extensible, due to its modular design and support of plug-in modules.
  (continued)

Summary of EVMS Capabilities
  EVMS is designed to be scalable, including use in clustering environments.
  EVMS can read, write, and manipulate volumes created by other volume managers (given the proper plug-ins).
  EVMS can reduce the costs associated with migrating data to Linux.

EVMS Project Information
  Hosted on SourceForge
    http://evms.sf.net
  Live CVS tree
  Mailing lists
    [email protected]
    [email protected]
  Installation instructions
  Documentation
  http://evms.sf.net/docs/EVMS-xSeries-2003.ppt
  IRC: irc.freenode.net, #evms
EVMS Demonstration