© 2007 Open Grid Forum Grid provisioning from cloned golden boot images Alan G. Yoder, Ph.D. Network Appliance Inc.

Mar 27, 2015


© 2007 Open Grid Forum

Grid provisioning from cloned golden boot images

Alan G. Yoder, Ph.D.

Network Appliance Inc.


Outline

• Types of grids
• Storage provisioning in various grid types
• Case study
    • performance
    • stability


Grid types

• Cycle scavenging
• Clusters
• Data center grids


Cycle scavenging grids

• Widely distributed as a rule
    • campus or department wide
    • global grids
• Typically for
    • collaborative science
    • resource scavenging
• Main focus to date of GGF, OGF, Globus, et al.
• Category includes "grid of grids"


Clusters

• Grid-like systems
    • good scaleout
    • cluster-wide namespace
• Especially attractive in HPC settings
• Many concepts in common with cycle-scavenging systems
    • but proprietary infrastructure
    • no management standards yet


Data Center Grids

• Focus of this talk
• Typically fairly homogeneous
    • standard compute node hardware
    • two or three OS possibilities
• Two variants
    • Nodes have disks
        • Topologically homomorphic to cycle scavenging grids
        • May use cycle scavenging grid technology
    • Nodes are diskless
        • Storage becomes much more important: storage grids


Storage technology adoption curves

[Figure: Market Adoption Cycles. Curve labels: Enterprise Storage Market ("Today") and Grid Frameworks ("Today ?"). Stages: direct attached storage, networked storage, storage grids, global storage network. Callout: focus of this talk.]


Diskless compute farms

• Connected to storage grids
• Boot over iSCSI or FCP
• OS is provisioned in a boot LUN on a storage array
• Applications can be provisioned as well

Key benefit – nodes can be repurposed at any time from a different boot image

Key benefit – smart storage and provisioning technology can use LUN cloning to deliver storage efficiencies through block sharing

Key benefit – no rotating rust in compute nodes
    • reduced power and cooling requirements
    • no OS/applications to provision on new nodes


Local fabric technologies

[Figure: servers attached to storage over an iSCSI or FC SAN. Example cloning products: ShadowImage, FlexClone.]

• Servers boot over iSCSI or FCP SAN
• Storage server(s) maintain golden image + clones


Global deployment technologies

[Figure: four iSAN sites connected over a WAN. Example replication products: SnapMirror, TrueCopy.]

Long-haul replication from central data center to local centers


Diskless booting

LU – Logical Unit
LUN – Logical Unit Number

Mapping – LUNs :: initiator ports
Masking – Initiators :: LUNs (“views”)

• Node shuts down
• Storage maps desired image to LUN 0 for the zone (FCP) or initiator group (iSCSI) the node is in
• Node restarts
• Node boots from LUN 0
    • mounts scratch storage space if also provided
    • starts up grid-enabled application
• Node proceeds to compute until done or repurposed
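
The boot sequence above can be sketched as a toy in-memory model of LUN masking. This is only an illustration under stated assumptions: `StorageArray`, `map_lun`, `boot_image`, and `repurpose` are hypothetical names, not part of any storage product, and the image paths are taken from the example slide.

```python
from dataclasses import dataclass, field


@dataclass
class StorageArray:
    """Toy model of LUN masking: each initiator group has its own LUN view."""
    # igroup name -> {LUN ID -> backing image path}
    views: dict = field(default_factory=dict)

    def map_lun(self, igroup: str, lun_id: int, image: str) -> None:
        # Present `image` to the igroup under the given LUN ID.
        self.views.setdefault(igroup, {})[lun_id] = image

    def boot_image(self, igroup: str) -> str:
        # A diskless node boots from whatever is presented to it as LUN 0.
        return self.views[igroup][0]


def repurpose(array: StorageArray, igroup: str, new_image: str) -> str:
    """Follow the slide's sequence for one node; return the image it boots."""
    # Step 1: node shuts down (nothing to model on the storage side).
    # Step 2: storage maps the desired image to LUN 0 for the node's igroup.
    array.map_lun(igroup, 0, new_image)
    # Steps 3 and 4: node restarts and boots from LUN 0.
    return array.boot_image(igroup)


array = StorageArray()
array.map_lun("gridsrv1", 0, "/vol/vol1/geotherm2")
array.map_lun("gridsrv2", 0, "/vol/vol1/mysql_on_linux")
booted = repurpose(array, "gridsrv1", "/vol/vol1/mysql_on_linux")
print(booted)
```

The point of the model is that the node itself never changes: only the storage-side view behind LUN 0 does.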


Example

compute grid: gridsrv1, gridsrv2, gridsrv3

storage grid:
• /vol/vol1/geotherm2, LUN 0, mapped to gridsrv1
• /vol/vol1/mysql_on_linux, LUN 0, mapped to gridsrv2
• /vol/vol1/mysql_on_linux, LUN 0, mapped to gridsrv3


What makes this magic happen?

[Figure: the compute grid / storage grid diagram from the Example slide, with an SGME component added.]


SGME

• Storage Grid Management Entity
• Component of overall GME in OGF Reference model
• GME is the collection of software that assembles the components of a grid into a grid
    • Provisioning, monitoring etc.
• Many GME products: Condor et al
• Current storage grid incarnations are often home-rolled scripts
    • Also Stork, Lustre, qlusters


Provisioning a diskless node

• Add HBAs to white box if necessary
• Fiddle with CMOS to boot from SAN
• For iSCSI:
    • DHCP supplies address, node name
    • SGME provisions igroup for node address
    • SGME creates LU for node
    • SGME maps LU to igroup
• For FC:
    • zone, mask, map, etc.

SGME – Grid Storage Management software
HBA – Host Bus Adapter
CMOS – BIOS settings
DHCP – IP boot management


Provisioning a diskless node

• Add HBAs to white box if necessary
    • We used QLogic 4052 adapters
• Fiddle with CMOS to boot from SAN
    • Get your white box vendor to do this
    • Blade server racks generally easily configurable for this as well


Preparing a gold image

• On a client – this is manual one-time work
• Install Windows server (e.g.)
• Set up HBA
    • e.g. QLogic needs iscli.exe and commands in startup.bat
      C:\iscli.exe -n 0 KeepAliveTO 180 IP_ARP_Redirect on
    • Software initiators must be prevented from paging out
        • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management: DisablePagingExecutive => 1
• Run Microsoft sysprep setup mgr and seal image


Preparing a gold lun

• On storage server – manual one-time work
• Copy the golden image to a new base LUN (over CIFS)

    des-3050-2> lun show
    /vol/vol1/gold/win2k3_hs  10g ...

• Create a snapshot of the volume with the gold LUN

    des-3050-2> snap create vol1 windows_lun

• Create an igroup for each initiator

    des-3050-2> igroup create -i -t windows kc65b1 \
        iqn.2000-04.com.qlogic:qmc4052.zj1ksw5c9072.1

Note: the storage commands shown (blue type on the original slides) are NetApp-specific, for purposes of illustration only


Preparing cloned LUNs

• SGME: for each client
    • Create a qtree

        des-3050-2> qtree create /vol/vol1/iscsi/

    • Create a LUN clone from the gold LUN

        des-3050-2> lun create -b \
            /vol/vol1/.snapshot/windows_lun/gold/win2k3_hs \
            /vol/vol1/iscsi/kc65b1

    • Map the LUN to the igroup

        des-3050-2> lun map /vol/vol1/iscsi/kc65b1 kc65b1 0
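
Since current SGMEs are often home-rolled scripts, the per-client steps above might be generated along the following lines. This is a minimal sketch, not the author's tooling: `clone_commands` and its parameters are hypothetical, it assumes the igroup is named after the client (as with kc65b1 on the slide), and it only builds command strings in the slide's syntax without sending them to a storage system.

```python
def clone_commands(client: str,
                   gold_lun: str = "gold/win2k3_hs",
                   snapshot: str = "windows_lun",
                   volume: str = "/vol/vol1",
                   qtree: str = "iscsi") -> list:
    """Return the per-client command strings in the order the slide lists them."""
    # Path of the writable clone that becomes this client's boot LUN.
    clone_path = f"{volume}/{qtree}/{client}"
    # Read-only gold LUN as seen inside the volume snapshot.
    snapshot_lun = f"{volume}/.snapshot/{snapshot}/{gold_lun}"
    return [
        f"qtree create {volume}/{qtree}/",
        f"lun create -b {snapshot_lun} {clone_path}",
        # Assumption: the igroup carries the same name as the client.
        f"lun map {clone_path} {client} 0",
    ]


commands = clone_commands("kc65b1")
for command in commands:
    print(command)
```

Keeping the command construction separate from execution makes it easy to review the generated lines for a whole rack before anything touches the storage server.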


Getting clients to switch horses

• SGME: for each client
    • Notify client to clean up
    • Bring down client
        • remote power strips/blade controllers
    • Remap client LUN on storage

        des-3050-2> lun offline /vol/vol1/iscsi/kc65b1
        des-3050-2> lun unmap /vol/vol1/iscsi/kc65b1 kc65b1 0
        des-3050-2> lun online /vol/vol1/iscsi2/kc65b1
        des-3050-2> lun map /vol/vol1/iscsi2/kc65b1 kc65b1 0

    • Bring up client
        • DHCP
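
The ordering matters here: the LUN is remapped only while the client is down. A rough sketch of that sequence follows, with every name hypothetical: `switch_image`, the `node_ctl` callback (standing in for client notification and a remote power strip or blade controller), and the `run_storage` callback (standing in for whatever sends commands to the storage server). The command strings mirror the slide's syntax and have not been checked against a live system; the new image is assumed to live under /vol/vol1/iscsi2, as the `lun online` line suggests.

```python
from typing import Callable, List


def switch_image(client: str, old_lun: str, new_lun: str,
                 run_storage: Callable[[str], None],
                 node_ctl: Callable[[str, str], None]) -> None:
    """Repurpose one client; the LUN is remapped only while the client is down."""
    node_ctl(client, "notify")   # ask the client to clean up
    node_ctl(client, "off")      # bring the client down
    run_storage(f"lun offline {old_lun}")
    run_storage(f"lun unmap {old_lun} {client} 0")
    run_storage(f"lun online {new_lun}")
    run_storage(f"lun map {new_lun} {client} 0")
    node_ctl(client, "on")       # client restarts and boots from LUN 0


# Dry run: record the actions instead of executing them.
log: List[str] = []
switch_image(
    "kc65b1", "/vol/vol1/iscsi/kc65b1", "/vol/vol1/iscsi2/kc65b1",
    run_storage=lambda cmd: log.append(f"storage: {cmd}"),
    node_ctl=lambda node, action: log.append(f"node: {node} {action}"),
)
print("\n".join(log))
```

Passing the two callbacks in keeps the sequencing logic testable as a dry run before it is wired to real power controllers.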


Lab results

• Experiment conducted at Network Appliance
• FAS 3050 clustered system
• 224 clients (112 per cluster node)
    • dual core 3.2GHz/2GB Intel Xeon IBM H20 Blades
    • QLogic QMC 4052 adapters
    • Windows Server 2003 SE SP1
• Objectives
    • determine robustness and performance characteristics of configuration, under conditions of storage failover and giveback
    • determine viability of keeping paging file on central storage


Network configuration

Not your daddy’s network


Client load

• Program to generate heavy CPU and paging activity (2 GB memory area, lots of reads and writes)

• Several instances per client


Client load, cont.

• ~400 pages/sec


Load on storage

Near 100% disk utilization on storage system in takeover mode

des-3050-1(takeover)> sysstat -u 1
 CPU  Total       Net kB/s        Disk kB/s   Tape kB/s Cache  Cache    CP  CP   Disk
      ops/s      in    out     read   write  read write   age    hit  time  ty   util
 18%   1318    3129   4413   167328      48     0     0    13    98%    0%   -   100%
 42%   2708   67637   6165   166210       8     0     0    13    99%    0%   -   100%
 53%   2035   71519   5258   155134   52419     0     0    13    99%   45%   D   100%
 54%   1852   62163   4488   124647   99591     0     0    13    99%  100%   :    79%
 49%   2021   70115   5083   123828   58347     0     0    13    99%  100%   D    73%
 83%   1005   24380   2414   110473   54491     0     0    13    99%  100%   :    83%
 42%   2892   65357   7878   211645   56495     0     0    13    99%  100%   :   128%
 39%   2250   29027   7839   155554   19597     0     0    13    99%   35%   D    93%
 74%   1671   39249   4393   184457   57014     0     0    15   100%  100%   :   112%
 38%   2323   57148   6777   161911   69163     0     0    15    99%  100%   :   100%
 51%   2105   52591   5354   147766   95826     0     0    12    99%   90%   D    86%
 29%    382     957    988   163609   60946     0     0    12    98%  100%   :   100%
 19%   1331    2232   4305   163301    6938     0     0    12    98%   49%   :   100%
 18%   1247    1547   4390   164802      24     0     0    13    98%    0%   -   100%
 30%   2037   31462   5717   167336       0     0     0    13    99%    0%   -   100%
 33%   2000    4047   5909   169060      24     0     0    13    98%    0%   -   100%
 67%   1580    2177   5471   167101      32     0     0    13    99%    0%   -   100%


Observations

• Failover and giveback transparent
• No BSOD when failover times stay within the timeout windows
    • recall: KeepAliveTO = 180
    • some tuning opportunities here
    • actual failover was < 60 seconds
    • iscsi stop+start used to increase “failover time” for testing
• Slower client access during takeover
    • expected behavior
• Heavy paging activity not an issue
• Higher number of clients / storage server an option, depending on application behavior


Economic analysis

• Assume
    • 256 clients / storage server
    • 20 W / drive
    • $80 / client-side drive
    • 80 GB client-side drive, 10 GB used per application
    • $3000 / server-side drive
    • 300 GB server-side drive
• Calculate
    • server-side actual usage
    • cost of client-side drives vs. cost for server space
    • cost of power+cooling for client-side drives and server space


Results

• Server side usage
    • 512 clients x 10 GB per application = 5 TB
    • Assume
        • 50% usable space on server
        • 20 W typical per drive
        • 2.3 x multiplier to account for cooling
    • 5000 GB * 2 / 300 GB/drive * 20 W/drive * 2.3 = 1.53 kW
    • 10 TB raw @ $10/GB = $100,000
• Workstation side drives
    • Same assumptions (note: power supply issue)
    • 512 drives * 20 W/drive * 2.3 = 23.5 kW
    • 512 drives * $80/drive = $40,960
• At $0.10/kWh, cost curves cross over in three years
    • in some scenarios, it’s less than two years
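
The crossover figure can be reproduced from the numbers on this slide. The sketch below is a worked calculation only; it assumes the drives draw power around the clock (24 x 365 hours per year), which the slide does not state, and all variable names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365          # assumption: drives are powered all year
WATTS_PER_DRIVE = 20
COOLING_MULTIPLIER = 2.3
PRICE_PER_KWH = 0.10

# Server side: 5000 GB used, 50% usable space, 300 GB drives, $10/GB raw.
server_drives = 5000 * 2 / 300
server_kw = server_drives * WATTS_PER_DRIVE * COOLING_MULTIPLIER / 1000
server_cost = 10_000 * 10          # 10 TB raw at $10/GB

# Client side: one $80 drive in each of 512 nodes.
client_kw = 512 * WATTS_PER_DRIVE * COOLING_MULTIPLIER / 1000
client_cost = 512 * 80

# Central storage costs more up front but saves power every year.
extra_capital = server_cost - client_cost
yearly_saving = (client_kw - server_kw) * HOURS_PER_YEAR * PRICE_PER_KWH
crossover_years = extra_capital / yearly_saving

print(f"server {server_kw:.2f} kW, clients {client_kw:.2f} kW")
print(f"yearly power saving ${yearly_saving:,.0f}")
print(f"crossover after {crossover_years:.1f} years")
```

Under these assumptions the extra capital for server-side storage is paid back by the power saving after roughly three years, which matches the slide.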


Conclusion

• Dynamic provisioning from golden images is here

• Incredibly useful technology in diskless workstation farms
    • Fast turnaround
    • Central control
    • Simple administration
    • Nearly effortless client replacement

• Green!


Questions?