SCSI support on Xen
Matsumoto Hitoshi [email protected]
Fujitsu Ltd.
All Rights Reserved, Copyright (C) FUJITSU 2007
Sep 08, 2018

Agenda
- Architecture
- Current status
- Next challenge

Requirements for VM in data center
- Conventional application support
- Server consolidation
- HW fault tolerance
- Performance

Requirements for VM in data center
- Conventional application support
  - Some applications issue SCSI commands, e.g. enterprise databases and backup software.
  - Solution: pvSCSI driver
- Server consolidation
- HW fault tolerance
- Performance

Data center
- Data center management
  - Enterprise data is stored on FC/SCSI devices.
  - Reliability and availability are required.
  - There are many SCSI devices in the data center.
- Hardware snapshot
- Tape operation
[Figure: a DB server (Oracle) and a backup server are connected by a LAN and reach RAID storage and a tape drive over the SAN. SCSI commands trigger the hardware snapshot of the data files on the storage and the load, unload, and reset operations on the tape drive.]

Minimum backup window
Example: SCSI commands for storage
- Disk to tape (D2T)
- Disk to disk (D2D)
- Hardware snapshot
[Figure: backup methods (D2T, D2D, and hardware snapshot followed by online D2D/D2T) compared by the length of the backup window; the goal is to minimize the backup window.]

Example: SCSI commands for tape
- Robot: move cartridge
- Tape: load, unload, rewind
[Figure: backup and restore between disk (RAID) and a tape drive. A tape cartridge is loaded into and unloaded from the drive, and cartridges can be taken out of the box.]
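
The tape operations listed above map onto short SCSI command descriptor blocks (CDBs). As a minimal sketch, assuming the standard 6-byte REWIND (opcode 0x01) and LOAD UNLOAD (opcode 0x1B) commands of the SCSI stream command set, the bytes a backup application hands to the SCSI layer could be built like this. The function names are illustrative and do not come from the slides.

```python
"""Build 6-byte SCSI CDBs for the tape operations named on the slide."""

REWIND_OPCODE = 0x01       # REWIND (SCSI stream commands)
LOAD_UNLOAD_OPCODE = 0x1B  # LOAD UNLOAD (SCSI stream commands)


def rewind_cdb(immed: bool = False) -> bytes:
    """REWIND: bit 0 of byte 1 is IMMED; the remaining bytes are zero."""
    return bytes([REWIND_OPCODE, int(immed), 0, 0, 0, 0])


def load_unload_cdb(load: bool, immed: bool = False) -> bytes:
    """LOAD UNLOAD: bit 0 of byte 4 is LOAD (1 = load, 0 = unload)."""
    return bytes([LOAD_UNLOAD_OPCODE, int(immed), 0, 0, int(load), 0])


if __name__ == "__main__":
    for name, cdb in (
        ("load", load_unload_cdb(load=True)),
        ("unload", load_unload_cdb(load=False)),
        ("rewind", rewind_cdb()),
    ):
        print(name, cdb.hex())
```

Commands like these are why a block-level virtual disk is not enough for backup software: the guest needs a path that carries the CDB itself to the device.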

pvSCSI driver (SCSI passthrough)
- The pvSCSI driver (SCSI passthrough) consists of a SCSI frontend driver and a SCSI backend driver.
- Each guest can issue SCSI commands via the host.
- Each guest can occupy its own FC HBA card.
- Conventional applications work on the guest.
[Figure: on top of the hypervisor, guest 1 and guest 2 each run a guest OS with a SCSI frontend driver. SCSI commands go to the matching SCSI backend driver in the host OS, then through the FC (SCSI) native driver and an FC HBA card to the SAN.]
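
The frontend/backend split described above can be illustrated with a toy model. This is only a sketch of the forwarding idea: the class names, the `attach` step, and the string status values are assumptions made for illustration, and the real drivers exchange requests over Xen shared-memory rings rather than Python calls.

```python
"""Toy model of a split (frontend/backend) SCSI passthrough path."""
from dataclasses import dataclass
from typing import Callable, Dict

# Stand-in for the FC (SCSI) native driver: takes a CDB, returns a status.
NativeDriver = Callable[[bytes], str]


@dataclass(frozen=True)
class ScsiRequest:
    guest: str
    cdb: bytes


class Backend:
    """Host side: forwards each guest's CDB to the HBA that guest occupies."""

    def __init__(self) -> None:
        self._hba_for_guest: Dict[str, NativeDriver] = {}

    def attach(self, guest: str, hba: NativeDriver) -> None:
        self._hba_for_guest[guest] = hba

    def submit(self, request: ScsiRequest) -> str:
        hba = self._hba_for_guest.get(request.guest)
        if hba is None:
            return "rejected: no HBA attached"
        return hba(request.cdb)


class Frontend:
    """Guest side: wraps a CDB in a request and hands it to the backend."""

    def __init__(self, guest: str, backend: Backend) -> None:
        self._guest = guest
        self._backend = backend

    def issue(self, cdb: bytes) -> str:
        return self._backend.submit(ScsiRequest(self._guest, cdb))


def make_hba(name: str) -> NativeDriver:
    """Hypothetical HBA that just reports which command it received."""
    return lambda cdb: f"{name} executed opcode 0x{cdb[0]:02x}"
```

The point of the model is that the guest never touches the HBA; it only sees the frontend, which is what lets an unmodified application keep issuing SCSI commands.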

Requirements for VM in data center
- Conventional application support
- Server consolidation
  - All resources are consolidated on the VM.
  - pvSCSI driver + NPIV
- HW fault tolerance
- Performance

Data center (Enterprise system)
- There are many servers in the data center.
- Each server has several storage devices.
[Figure: a DB server, a backup server, and other servers are connected by a LAN. Each server issues SCSI commands through its own SCSI driver and its own HBAs to disks and tape.]

Data center (Enterprise system) on VM
- Servers are consolidated on VMs.
- Many HBA cards are needed for the data center.
[Figure: guests (DB server, backup server, and other servers) run on one hypervisor. Some guests use VBD frontend drivers paired with VBD backend drivers in the host; others use SCSI frontend drivers paired with SCSI backend drivers. The SCSI drivers in the host send the SCSI commands through many FC HBAs to disk and tape.]

NPIV support
- NPIV: a technology that creates many vHBAs (virtual ports, VPs) on one physical HBA.
- Each guest can have its own vHBA.
- The number of physical HBAs can be reduced.
[Figure: guest 1 and guest 2 send SCSI commands from their SCSI frontend drivers to SCSI backend drivers in the host OS. The FC (SCSI) native driver with NPIV drives a single FC HBA card with NPIV, which exposes one VP per guest to the SAN.]

NPIV (N-Port Identifier Virtualization)
- A virtual port can connect to the SAN independently, just like a physical port.
- Each virtual port is allocated to its owner guest.
- NPIV is standardized by the INCITS T11 committee.
[Figure: without NPIV, the HBA driver reaches the SAN through one physical port (PP) per HBA card; with NPIV, several virtual ports (VPs) share each physical port. VP: virtual port. PP: physical port.]
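
The relationship between physical and virtual ports can be sketched as a small bookkeeping model. This is an illustration under stated assumptions, not an HBA driver API: the class `NpivHba`, its methods, and the WWPN strings used in examples are all hypothetical, and the virtual WWPNs are supplied by the caller rather than generated.

```python
"""Bookkeeping sketch: one physical HBA port hosting several virtual ports."""
from typing import Dict, List


class NpivHba:
    def __init__(self, physical_wwpn: str, max_vports: int = 2) -> None:
        self.physical_wwpn = physical_wwpn
        self._max_vports = max_vports
        self._owner_of: Dict[str, str] = {}  # virtual WWPN -> owner guest

    def create_vport(self, virtual_wwpn: str, guest: str) -> None:
        """Create a virtual port and allocate it to its owner guest."""
        if virtual_wwpn == self.physical_wwpn or virtual_wwpn in self._owner_of:
            raise ValueError("WWPN already in use on this HBA")
        if len(self._owner_of) >= self._max_vports:
            raise RuntimeError("no free virtual ports")
        self._owner_of[virtual_wwpn] = guest

    def owner(self, virtual_wwpn: str) -> str:
        return self._owner_of[virtual_wwpn]

    def fabric_logins(self) -> List[str]:
        """Every port name the SAN sees behind this one physical port."""
        return [self.physical_wwpn] + sorted(self._owner_of)
```

From the SAN side each virtual WWPN looks like a separate port, which is what allows one card to serve several guests.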

Requirements for VM in data center
- Conventional application support
- Server consolidation
- HW fault tolerance
  - Containment of hardware failures: pvSCSI driver + NPIV + driver domain
  - Redundancy against hardware failures: pvSCSI driver + driver domain + multipath driver
- Performance

Containment with NPIV
- A crash on the VP assigned to guest 1 does not affect guest 2.
[Figure: guest 1 and guest 2 send SCSI commands from their SCSI frontend drivers to SCSI backend drivers in the host OS. The FC (SCSI) native driver with NPIV drives one FC HBA card with NPIV, which provides a separate VP per guest to the SAN.]

Driver domain
- If the host goes down during an I/O operation, the whole system does not go down.
- Guests cannot access the I/O devices directly.
[Figure: guest 1 and guest 2 send SCSI commands from their SCSI frontend drivers to SCSI backend drivers in a driver domain that is separate from the host. The driver domain runs the FC (SCSI) native driver and owns the HBA cards connected to the SAN.]

Containment with driver domain
- A crash in driver domain 1, which serves guest 1, does not affect guest 2.
[Figure: the host OS, two driver domains, and two guests run on the hypervisor. Each driver domain has its own SCSI backend driver, FC (SCSI) native driver, and FC HBA card connected to the SAN. Driver domain 1 is marked as crashed, while the other driver domain keeps serving SCSI commands from guest 2.]

Multipath driver
- Failover: retry on an alternate path
- Load balance: access over multiple paths
[Figure: two I/O stacks, one labeled failover and one labeled load balance. In each, the application sits on sd/st/sg and scsi_mod, with the multipath driver below them driving two HBA cards to one disk.]
- Linux provides a multipath driver through the "device mapper".
- Many vendors provide their own multipath drivers.
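
The failover and load balance behavior above can be sketched in a few lines. This is a simplified illustration, assuming round-robin path selection and an exception to signal a failed path; it is not how the Linux device mapper or any vendor driver is implemented, and the names `Multipath`, `PathFailed`, and `make_path` are made up for the example.

```python
"""Sketch of a multipath layer: round-robin load balance plus failover."""
from typing import Callable, List, Optional, Sequence


class PathFailed(Exception):
    """Raised by a path (HBA) that cannot deliver the I/O."""


Path = Callable[[str], str]


class Multipath:
    def __init__(self, paths: Sequence[Path]) -> None:
        if not paths:
            raise ValueError("at least one path is required")
        self._paths: List[Path] = list(paths)
        self._next = 0

    def submit(self, io: str) -> str:
        count = len(self._paths)
        start = self._next
        self._next = (self._next + 1) % count  # load balance: rotate start path
        last_error: Optional[PathFailed] = None
        for offset in range(count):
            path = self._paths[(start + offset) % count]
            try:
                return path(io)
            except PathFailed as error:  # failover: retry on the next path
                last_error = error
        raise PathFailed("all paths failed") from last_error


def make_path(name: str, healthy: bool = True) -> Path:
    """Hypothetical path that tags the I/O with the HBA name, or fails."""
    def send(io: str) -> str:
        if not healthy:
            raise PathFailed(name)
        return f"{name}:{io}"
    return send
```

With two healthy paths the I/O alternates between them; when one path fails, every request still completes on the surviving path.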

Redundancy with driver domain
- Each guest has an alternate path to the I/O device via a driver domain, so each guest can continue to work when an HBA card or a driver domain crashes.
[Figure: guest 1 and guest 2 send SCSI commands from their SCSI frontend drivers to SCSI backend drivers in both driver domain 1 and driver domain 2. Each driver domain has an FC (SCSI) native driver and an FC HBA card with NPIV (two VPs) connected to the SAN. Multipath drivers switch between the two paths; one path is marked as crashed.]

Requirements for VM in data center
- Conventional application support
- Server consolidation
- HW fault tolerance
- Performance
  - The performance of the pvSCSI driver is almost the same as that of VBD.
  - For more performance, use direct I/O: the guest issues I/O to the device without going through the host.
  - PV domain and HVM domain

Direct I/O
- Each guest can access the hardware without going through the host.
[Figure: guest 1 and guest 2 each run the FC (SCSI) native driver in the guest OS and send SCSI commands through the direct I/O hardware to their own FC HBA card on the SAN, bypassing the host domain and host OS.]

Direct I/O / NPIV architecture
- The chipset for direct I/O guarantees that a PCI Express device cannot perform unauthorized access to memory.
- The device can be assigned to a guest domain.
- The address translation table is managed by the hypervisor/Dom0, not by the guest domain.
[Figure: an HBA exposes VPs, each with a DMA controller. DMA requests pass through address translation in the chipset, which consults the address translation table. Authorized access to the memory of the owning guest (guest A or guest B) is allowed; unauthorized access is denied.]
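
The allow/deny decision made by the address translation hardware can be sketched as a table lookup. This is a conceptual model only, assuming 4 KiB pages and a per-device table; the class and method names are hypothetical and do not reflect any chipset's register interface.

```python
"""Conceptual model of per-device DMA address translation and access checks."""
from typing import Dict, Tuple

PAGE_SIZE = 4096  # assumed page size for this sketch


class TranslationTable:
    """Filled in by the hypervisor/Dom0, consulted on every DMA access."""

    def __init__(self) -> None:
        # (device, device page number) -> host page number
        self._entries: Dict[Tuple[str, int], int] = {}

    def map_page(self, device: str, device_page: int, host_page: int) -> None:
        self._entries[(device, device_page)] = host_page

    def translate(self, device: str, device_address: int) -> int:
        """Return the host address, or raise if the access is not authorized."""
        page, offset = divmod(device_address, PAGE_SIZE)
        host_page = self._entries.get((device, page))
        if host_page is None:
            raise PermissionError(f"{device}: access denied")
        return host_page * PAGE_SIZE + offset
```

Because only mapped pages translate, a virtual port assigned to guest A has no way to name memory that belongs to guest B.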

Current status
- The basic function of the pvSCSI driver code was posted to the Xen community.
- NPIV works.
- The pvSCSI driver on a driver domain is under evaluation.
- The pvSCSI driver works on both HVM domains and PV domains.
- Oracle RMAN works on a guest with the pvSCSI driver.

Performance (same as VBD)
- Memory: 2 GB
- CPUs: 2 for each domain
- Tool: iogen 1.3.6
- Dom0 vs VBD vs pvSCSI: the performance ratio of VBD and pvSCSI relative to Dom0 (100%).
- The performance of pvSCSI is almost the same as that of VBD.
[Figure: two bar charts, read and write, showing Dom0, VBD, and pvSCSI at block sizes of 8k, 128k, and 256k. The vertical axis runs from 10% to 100%, with Dom0 = 100%.]
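
The normalization used in these charts (Dom0 = 100%) is a simple ratio. A minimal sketch follows; the function name is illustrative, and any numbers passed to it in examples are placeholders chosen to exercise the arithmetic, not measurements from this presentation.

```python
"""Normalize throughput figures so that the Dom0 result equals 100%."""
from typing import Dict


def ratio_to_dom0(throughput: Dict[str, float]) -> Dict[str, float]:
    """Return each configuration's throughput as a percentage of Dom0."""
    baseline = throughput["Dom0"]
    if baseline <= 0:
        raise ValueError("Dom0 throughput must be positive")
    return {name: round(100.0 * value / baseline, 1)
            for name, value in throughput.items()}
```

Applying the same function per block size and per direction (read, write) gives one bar group of the chart at a time.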

Oracle RMAN works on a guest with the pvSCSI driver
- DB server on guest
[Figure: Oracle and RMAN run in the guest and reach the RAID storage on the SAN through pvSCSI. RMAN handles the data files, control data, and archive log and writes a backup set; the storage takes hardware snapshots of the data files.]

Next challenge
- Complexity of combining direct I/O and non-direct I/O
- LUN assignment

Complexity: direct I/O and non-direct I/O
- PV domains use the pvSCSI driver.
- HVM domains use direct I/O.
[Figure: guest 1 and guest 2 (PV domains) run a multipath driver and SCSI frontend drivers, served by SCSI backend drivers and FC (SCSI) native drivers in driver domain 1 and driver domain 2. Guest 3 and guest 4 (HVM domains) run a multipath driver and reach the hardware through the direct I/O hardware. All paths end at FC HBAs with NPIV (two VPs each) connected to the I/O devices.]

LUN assignment
- LUN allocation to guests with the pvSCSI driver
[Figure: six guests run on the hypervisor: guests 1, 2, 5, and 6 are HVM domains and guests 3 and 4 are PV domains. SCSI frontend drivers in the guests connect to the SCSI backend driver in the driver domain, and the direct I/O hardware exposes VPs to the SAN. LUN0 to LUN5 on the SAN are each allocated to a guest.]

Special thanks to
- Intel Corporation
- QLogic Corporation
- Emulex Corporation
- Brocade Communications Systems Inc
- Sun Microsystems Inc
- Xen community engineers!

This work was partly funded by the Ministry of Economy, Trade and Industry of Japan as the secure platform project of the Association of Super-Advanced Electronics Technologies (ASET).

LUN Assignment to Guest Domain (Direct I/O without VT-d)
- Direct I/O
- But all LUNs are assigned to "one" guest domain.
- An HBA must be occupied by an owner guest.
[Figure: an HBA with six LUNs, several guest domains, and a host domain; the labels show SCSI frontend drivers in the guest domains and a SCSI backend driver in the host domain.]

LUN Assignment (Driver Domain (DD) with pvSCSI)
- A portion of the LUNs can be assigned to the appropriate guest domain (LUN filtering by pvSCSI).
- But not direct I/O (via DD).
[Figure: an HBA with six LUNs is attached to the DD. SCSI backend drivers in the DD serve the SCSI frontend drivers of the guest domains.]
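
The LUN filtering mentioned above amounts to a per-guest allow list checked in the backend. The following is a minimal sketch under that assumption; the class `LunFilter` and its methods are hypothetical and are not the interface of the posted pvSCSI code.

```python
"""Sketch of per-guest LUN filtering in a SCSI backend."""
from typing import Dict, FrozenSet, Iterable, List


class LunFilter:
    def __init__(self, assignments: Dict[str, Iterable[int]]) -> None:
        # guest name -> set of LUN numbers that guest may use
        self._allowed: Dict[str, FrozenSet[int]] = {
            guest: frozenset(luns) for guest, luns in assignments.items()
        }

    def visible_luns(self, guest: str) -> List[int]:
        """LUNs the backend would expose to this guest."""
        return sorted(self._allowed.get(guest, frozenset()))

    def may_forward(self, guest: str, lun: int) -> bool:
        """True if a command from this guest to this LUN should be forwarded."""
        return lun in self._allowed.get(guest, frozenset())
```

Because the check sits in the driver domain, several guests can share one HBA while each sees only its own subset of the LUNs behind it.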

LUN Assignment to Guest Domain (Driver Domain (DD) with pvSCSI/NPIV)
- A portion of the LUNs can be assigned to the appropriate guest domain.
- But not yet direct I/O (via DD).
[Figure: an HBA with two virtual ports (vp) and six LUNs is attached to the DD. SCSI backend drivers in the DD serve the SCSI frontend drivers of the guest domains.]

LUN Assignment to Guest Domain (Direct I/O with VT-d/SR-IOV)
- A portion of the LUNs can be assigned to the appropriate guest domain.
- And direct I/O.
[Figure: an HBA with two virtual ports (vp); each guest domain accesses its own subset of the LUNs directly.]

Virtualization layers
[Figure: three virtualization layers: server (Xen), network (SAN, network virtual storage), and storage (storage box). The layers are connected by the SCSI protocol.]
- The management interface depends on the SCSI protocol.

Network virtual storage
[Figure: an application on a server uses a virtual disk taken from a virtual storage pool. A virtual storage manager (configuration, migration, etc.) and a SAN switch sit between the server and the storage on the SAN.]

Storage data migration
- Virtual storage provides an online migration function using the SCSI protocol.
[Figure: three stages of online migration. Pre-copy: the application reads and writes. Copying: the application keeps reading and writing while data is copied. Migration complete: the application reads and writes on the new storage.]
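
One common way to keep a volume usable while it is being copied is to mirror writes that land on already-copied blocks. The sketch below illustrates that idea only; the slides do not say how the virtual storage implements online migration, and the class `MigratingVolume` and its block-list representation are assumptions made for this example.

```python
"""Sketch of online data migration with write mirroring during the copy."""
from typing import List, Optional


class MigratingVolume:
    def __init__(self, blocks: List[str]) -> None:
        self.source: List[str] = list(blocks)
        self.destination: List[Optional[str]] = [None] * len(blocks)
        self._copied = 0  # blocks [0, _copied) already exist on the destination

    @property
    def done(self) -> bool:
        return self._copied == len(self.source)

    def copy_step(self) -> None:
        """Copy the next block in the background."""
        if not self.done:
            self.destination[self._copied] = self.source[self._copied]
            self._copied += 1

    def write(self, lba: int, data: str) -> None:
        if self.done:
            self.destination[lba] = data      # migration complete
            return
        self.source[lba] = data               # source stays authoritative
        if lba < self._copied:
            self.destination[lba] = data      # mirror to the copied region

    def read(self, lba: int) -> Optional[str]:
        return self.destination[lba] if self.done else self.source[lba]
```

Writes to blocks that have not been copied yet need no mirroring, because the background copy picks up the new contents later.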

Replication
- Network and storage box virtualization provide snapshot copy.
[Figure: backup, restore, and replication on virtual storage.]

Example 1: Guest migration
- Migrate guest 2.
- Keep the connection to the same V-WWN 2.
[Figure: before migration, guest 1 (VP with V-WWN 1) and guest 2 (VP with V-WWN 2) share one HBA and reach the guest 1 data and guest 2 data over the SAN; a second HBA has unused VPs. After migration, guest 2 uses a VP with V-WWN 2 on the second HBA and still reaches the guest 2 data.]
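
Why keeping the same V-WWN matters can be shown with a small model in which the SAN grants access by port name. This is a sketch under that assumption; the classes `San` and `Hba`, the `migrate` helper, and the WWN labels are hypothetical names used only for illustration.

```python
"""Sketch: a guest keeps its storage access by carrying its V-WWN along."""
from typing import Dict, Optional, Set


class San:
    """Grants access to data based on the WWN that logs in."""

    def __init__(self) -> None:
        self._data_for_wwn: Dict[str, str] = {}

    def grant(self, wwn: str, data: str) -> None:
        self._data_for_wwn[wwn] = data

    def login(self, wwn: str) -> Optional[str]:
        return self._data_for_wwn.get(wwn)


class Hba:
    def __init__(self, name: str) -> None:
        self.name = name
        self.vports: Set[str] = set()  # V-WWNs currently hosted on this HBA


def migrate(v_wwn: str, source: Hba, target: Hba) -> None:
    """Move a virtual port to another HBA without changing its V-WWN."""
    source.vports.remove(v_wwn)
    target.vports.add(v_wwn)
```

Since the SAN configuration refers to the V-WWN and not to the physical HBA, nothing on the storage side has to change when the guest moves.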