Transcript
If you build it, will they come?
Dirk Petersen, Scientific Computing Director, Fred Hutchinson Cancer Research Center
Joe Arnold, Chief Product Officer & President, SwiftStack
Using OpenStack Swift to build a large-scale active archive in a Scientific Computing environment
Challenge
Need an archive to offload expensive storage
❖ Low-cost storage
❖ High throughput: load large genome files into the HPC cluster
❖ Faster and lower cost than S3 & no proprietary lock-in
About Fred Hutch
❖ Cancer & HIV research
❖ 3 Nobel Laureates
❖ $430M budget / 85% NIH funding
❖ 2,700 employees
❖ Conservative use of information technology
IT at Fred Hutch
❖ Multiple data centers with >1,000 kW capacity
❖ 100 staff in Center IT plus divisional IT
❖ Team of 3 Sysadmins to support storage
❖ IT funded by indirects (F&A)
❖ Storage Chargebacks started Nov 2014
❖ 1.03 PUE, naturally air-cooled
Inside Fred Hutch data center
About SwiftStack
❖ Object Storage software
❖ Built with OpenStack Swift
❖ SwiftStack is the leading contributor and Project Technical Lead
❖ Software-defined storage platform for object storage
[Architecture diagram: the out-of-band, software-defined SwiftStack Controller manages SwiftStack storage clusters spanning multiple datacenters. Runtime agents handle load balancing, monitoring, utilization, and device inventory; the Swift object storage engine runs on standard OS/hardware nodes and exposes the Swift API and NFS/CIFS, with device & node management, authentication & authorization, and a user dashboard.]
Researchers concerned about ….
❖ Significant storage costs: $40/TiB/month chargebacks (first 5 TB free) amid declining grant funding
❖ “If you charge us, please give us some cheap storage for old and big files”
❖ (Mis)perception of storage value (“I can buy a hard drive at Best Buy”)
Not what you want: unsecured and unprotected external USB storage
Finance concerned about ….
❖ Cost predictability and scale
❖ Data growth drives storage costs of up to $1M per year
❖ Genomics data grows at 40%/year and chargebacks don’t cover all costs
❖ Expensive forklift upgrades every few years
❖ The public cloud (e.g. Amazon S3) has set a new, transparent cost benchmark
How much does it cost?
❖ Only small changes vs. 2014
❖ Kryder’s law obsolete at <15%/year?
❖ Swift is now down to Glacier cost (hardware down to $3/TB/month)
❖ No price reductions in the cloud
❖ 4TB (~$120) and 6TB (~$250) drives cost the same
❖ Do you want a fault domain of 144TB or 216TB in your storage servers? (with 36 drives per server: 36 × 4TB = 144TB, 36 × 6TB = 216TB)
❖ Don’t save on CPU: erasure coding is coming!
[Chart: $/TB/month by platform – NAS: 40, Amazon S3: 28, Google: 26, SwiftStack: 11. For comparison, AWS EFS is $300/TB/month.]
Economy File in production in 2014
❖ Chargebacks drove the Hutch to embrace
more economical storage
❖ Selected Swift object storage managed by
SwiftStack
❖ Go-live in 2014, strong interest and
expansion in 2015
❖ Researchers do not want to pay the price
for standard enterprise storage
Chargebacks spike Swift utilization!
❖ Started storage chargebacks
on Nov 1st
❖ Triggered strong growth in October
❖ Users sought to avoid high cost of
enterprise NAS and put as much as
possible into lower cost Swift
❖ Underestimated the success of Swift
❖ Had to pause migrations while buying more hardware
❖ Can migrate 30+ TB per day today
Standard Hardware
❖ Supermicro with Silicon Mechanics
❖ 2.1PB raw capacity; ~700TB usable
❖ No RAID controllers; no storage lost to RAID
❖ Seagate SATA drives (desktop class)
❖ 2 × 120GB Intel S3700 SSDs for OS + metadata
❖ 10GBase-T connectivity
❖ (2) Intel Xeon E5 CPUs
❖ 64GB RAM
Management of OpenStack Swift using SwiftStack
❖ Out-of-band management controller
❖ SwiftStack provides control & visibility
❖ Monitoring and stats at cluster, node,
and drive levels
❖ Authentication & Authorization
❖ Capacity & Utilization Management
via Quotas and Rate Limits
❖ Alerting & Diagnostics
SwiftStack Automation
❖ Deployment automation
❖ Lets us roll out Swift nodes in 10 minutes
❖ Upgrade Swift across clusters with one click
❖ 0.25 FTE to manage cluster
HPC Requirements
❖ High aggregate throughput
❖ The current network architecture is the bottleneck
❖ Many parallel streams are used to max out throughput (see the sketch below)
❖ Ideal for an HPC cluster architecture
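As a rough sketch (using the standard python-swiftclient; container name, paths and thread counts are illustrative, not our production settings), a single client can already drive several parallel streams via the client’s thread options:
$ swift upload --object-threads=10 --segment-threads=10 \
    --segment-size=2G --use-slo "container" "/my/local/folder"
The HPC cluster examples later in this deck push this further by running many such clients at once.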
Not a Filesystem
No traditional file system hierarchy; we just have containers, which can hold millions of objects (aka files).
Huh, no sub-directories? But how the heck can I upload my uber-complex bioinformatics file tree with 11 folder levels to Swift?
Filesystem Mapping with Swift
We simulate the hierarchical structure by simply putting forward slashes (/) into the object name (i.e. the file name).
❖ So, how do you actually copy a folder?
❖ However, the Swift client is frequently used, well supported, maintained, and really fast!
$ swift upload --changed --segment-size=2G --use-slo \
    --object-name="pseudo/folder" "container" "/my/local/folder"
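Once uploaded, those “folders” are just shared name prefixes. A small sketch (container and prefix names are illustrative): you can list a single pseudo-folder with the standard client:
$ swift list "container" --prefix "pseudo/folder/"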
Really? Can’t we get this a little easier?
Introducing Swift Commander
❖ Swift Commander, a simple shell wrapper around the Swift client, curl, and some other tools, makes working with Swift very easy
❖ Sub commands such as swc ls, swc cd,
swc rm, swc more give you a feel that is
quite similar to a Unix file system
❖ Actively maintained and available at:
❖ https://github.com/FredHutch/Swift-commander/
$ swc upload /my/posix/folder /my/Swift/folder
$ swc compare /my/posix/folder /my/Swift/folder
$ swc download /my/Swift/folder /my/scratch/fs
Much easier…
Some additional examples
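A few of the file-system-style sub commands in action (paths and file names are illustrative; see the GitHub README for the full list):
$ swc ls /my/Swift/folder
$ swc cd /my/Swift/folder
$ swc more /my/Swift/folder/samples.txt
$ swc rm /my/Swift/folder/old-results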
Swift Commander + Metadata
❖ Didn’t someone say that object storage
systems were great at using metadata?
❖ Yes, and you can just add a few key:value pairs as upload arguments
❖ Query the metadata via swc, or use an external search engine such as Elasticsearch
$ swc meta /my/Swift/folder
Meta Cancer: breast
Meta Collaborators: jill,joe,jim
Meta Project: grant-xyz
$ swc upload /my/posix/folder /my/Swift/folder \
    project:grant-xyz collaborators:jill,joe,jim cancer:breast
Integrating with HPC
❖ Integrating Swift into HPC workflows is not really hard
❖ Example: running samtools against persistent scratch space (files are deleted if not accessed for 30 days)
if ! [[ -f /fh/scratch/delete30/pi/raw/genome.bam ]]; then
    swc download /Swiftfolder/genome.bam /fh/scratch/delete30/pi/raw/genome.bam
fi
samtools view -F 0xD04 -c /fh/scratch/delete30/pi/raw/genome.bam > otherfile
A complex 50-line HPC submission script prepping a GATK workflow needs just 3 more lines!
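A sketch of that pattern inside a submission script (partition, paths and file names are illustrative, not the actual GATK script): the extra lines simply stage the input from Swift into scratch before the workflow starts:
#!/bin/bash
#SBATCH --partition=campus --cpus-per-task=4
# the 3 extra lines: fetch the input from Swift if scratch does not have it yet
if ! [[ -f /fh/scratch/delete30/pi/raw/genome.bam ]]; then
    swc download /Swiftfolder/genome.bam /fh/scratch/delete30/pi/raw/genome.bam
fi
# ... the existing ~50 lines of GATK prep continue here, reading from scratch ...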
Other HPC Integrations
❖ Use the HPC system to download lots of bam files in parallel
❖ 30 cluster jobs run in parallel on 30 1G nodes (which is my HPC limit)
❖ My scratch file system says it loads data at 1.4 GB/s
❖ That means each bam file is downloaded at 47 MB/s on average, and downloading this 1.2 TB dataset takes 14 min
$ swc ls /Ext/seq_20150112/ > bamfiles.txt
$ while read FILE; do
>   sbatch -N1 -c4 --wrap="swc download /Ext/seq_20150112/$FILE ."
> done < bamfiles.txt
$ squeue -u petersen
JOBID PARTITION NAME USER ST TIME NODES NODELIST
17249368 campus sbatch petersen R 15:15 1 gizmof120
17249371 campus sbatch petersen R 15:15 1 gizmof123
17249378 campus sbatch petersen R 15:15 1 gizmof130
$ fhgfs-ctl --userstats --names --interval=5 --nodetype=storage
====== 10 s ======
Sum: 13803 [sum] 13803 [ops-wr] 1380.300 [MiB-wr/s]
petersen 13803 [sum] 13803 [ops-wr] 1380.300 [MiB-wr/s]
Swift Commander + Small Files
So, we could tar up this entire directory structure… but then we would have one giant tar ball.
Solution: tar up the files of each sub directory separately, i.e. create one tar ball per directory level.
E.g. for /folder1/folder2/folder3: to restore folder2 and below, we only need folder2.tar.gz + folder3.tar.gz
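A rough sketch of the resulting layout (object names are illustrative; the exact naming is defined by swbundler.py):
/my/Swift/folder/folder1.tar.gz
/my/Swift/folder/folder1/folder2.tar.gz
/my/Swift/folder/folder1/folder2/folder3.tar.gz
Each tar ball holds only the files directly inside that directory, so any subtree can be restored by fetching just the tar balls at or below its level.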
$ swc arch /my/posix/folder /my/Swift/folder
$ swc unarch /my/Swift/folder /my/scratch/fs
It’s available at https://github.com/FredHutch/Swift-commander/blob/master/bin/swbundler.py
It’s Easy
It’s Fast
❖ Archiving uses multiple processes, measured at up to 400 MB/s from one Linux box
❖ Each process uses pigz multithreaded gzip compression (example: compressing a 1GB DNA string down to 272MB takes 111 seconds with gzip, 5 seconds with pigz)
❖ Restores can use standard gzip
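For reference, a generic sketch of the pigz speedup (thread count and paths are illustrative, not the swbundler internals):
$ tar -cf - /my/posix/folder | pigz -p 8 > folder.tar.gz
$ tar -xzf folder.tar.gz    # restores work with standard gzip/tar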
Desktop Clients & Collaboration
❖ Reality: Every archive requires access via
GUI tools
❖ Requirements
❖ Easy to use
❖ Do not create any proprietary data
structures in Swift that cannot be read by
other tools
Cyberduck desktop client running on Windows
Desktop Clients & Collaboration
❖ Another example: ExpanDrive and Storage Made Easy
❖ Works with Windows and Mac
❖ Integrates into the Mac Finder and mounts as a drive in Windows
rclone: mass copy, backup, data migration
❖ rclone is a multithreaded data copy / mirror tool
❖ Consistent performance on Linux, Mac, and Windows
❖ E.g. keep a mirror of a Synology workgroup NAS (QNAP has a built-in Swift mirror option)
❖ Data remains accessible via swc and desktop clients
❖ Mirror is protected by Swift undelete (currently 60 days retention)
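A minimal sketch of such a mirror job (the remote name “hutchswift” and paths are illustrative; the Swift remote is created once with rclone config):
$ rclone config    # one-time: add a remote of type "swift"
$ rclone sync /mnt/synology/workgroup hutchswift:workgroup-mirror --transfers 16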
Galaxy: Scientific Workflow Management
❖ Galaxy, web-based high-throughput computing at the Hutch, uses Swift as primary storage in production today
❖ SwiftStack patches have been contributed to the Galaxy Project
❖ Swift allows us to delegate “root” access to bioinformaticians
❖ Integrated with the Slurm HPC scheduler: automatically assigns the default PI account for each user
Summary
Discovery is driven by technologies that generate larger and larger datasets
❖ Object storage is ideal for:
❖ Ever-growing data volumes
❖ High throughput required for HPC
❖ Faster and lower cost than S3 & no proprietary lock-in