Swift: A quick introduction (March 2014)

Initial presentation of swift (for montreal user group)

Jan 15, 2015


Marcos Garcia

 
Page 1: Initial presentation of swift (for montreal user group)

Swift: A quick introduction

March 2014

Page 2: Initial presentation of swift (for montreal user group)

Index

● What is object storage
● A quick look at Amazon S3
● Swift use cases
● History & architecture
● Swift features
● The API
● Demo using Cyberduck

Page 3: Initial presentation of swift (for montreal user group)

What is object storage

● HTTP-accessible storage of objects (files) in buckets (folders)
● Like FTP or WebDAV
● Added security access, metadata
● Everything is a URL
● Cheap and hassle-free
○ Notion of unlimited capacity
○ No fragmentation or integrity checks
○ No locks or concurrency problems
○ Support of partial reads or writes

Page 4: Initial presentation of swift (for montreal user group)

What is object storage

● Designed for cloud-era requirements
○ Secure
○ Reliable
○ Scalable
○ Fast
○ Inexpensive
○ Simple

Page 5: Initial presentation of swift (for montreal user group)

A quick look at Amazon S3

Page 6: Initial presentation of swift (for montreal user group)

Some Amazon S3 use cases

● Content storage and distribution
○ Serve static files or whole websites directly from S3
● Better scalability for the web server tier
○ Reduces ‘data gravity’: low I/O on the server, everything over HTTP
● Storage for data analysis
● Fine-grained access control to buckets
● Backup, archiving and disaster recovery
○ although Amazon Glacier is a cheaper option
● ... but it’s not a Content Distribution Network
○ doesn’t optimize routing for lowest latency
○ is not optimized for content streaming
○ that’s why Amazon CloudFront exists

Page 7: Initial presentation of swift (for montreal user group)

The cost of Amazon S3

● Main reason to use S3: price
● Example: 1 TB stored, 100 GB modified per month
○ Storage cost: $85 / month
○ Data transfer (upload): $0
○ Data transfer (download): $12, at $0.12/GB
● A cheaper option: reduced redundancy (99.9% instead of 99.999999999%)
○ Storage cost: $68
● Even cheaper, but only for backups (very limited functionality): Glacier
○ Storage cost: $10
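The arithmetic behind the standard-storage example can be checked with a few lines of Python. This is only a sketch: it uses the 2014 prices quoted in these slides ($0.085 per GB stored, $0.12 per GB downloaded), assumes 1 TB = 1000 GB, and assumes the 100 GB "modified" per month is the amount downloaded. The function name is made up for illustration.

```python
# Prices quoted in the slides (2014 figures), not current AWS pricing.
STORAGE_USD_PER_GB = 0.085
DOWNLOAD_USD_PER_GB = 0.12


def monthly_cost(stored_gb, downloaded_gb):
    """Return (storage, transfer, total) in USD for one month."""
    storage = stored_gb * STORAGE_USD_PER_GB
    transfer = downloaded_gb * DOWNLOAD_USD_PER_GB
    return storage, transfer, storage + transfer


# Assumption: 1 TB = 1000 GB, 100 GB downloaded per month.
storage, transfer, total = monthly_cost(stored_gb=1000, downloaded_gb=100)
print(f"storage ${storage:.2f}, download ${transfer:.2f}, total ${total:.2f}")
```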

Page 8: Initial presentation of swift (for montreal user group)

Swift use cases

What it is
● Object storage system
● Massively scalable
● Runs on commodity hardware
● An S3-like solution

What it is NOT
● A hard drive or filesystem
● An NFS / SMB share
● Block storage
● Any SAN/NAS/DAS
● Not even a CDN

Page 9: Initial presentation of swift (for montreal user group)

Swift use cases

● Multi-tenancy
○ Ideal for public or private clouds
○ Different URLs, groups of users, access codes, fine-grained privileges
● Backups
○ Write once, read never (long-term archiving)
○ Disaster recovery
● Web content
○ Write many, read many
○ File-sharing websites (temporary access)
○ Static websites or media-focused blogs (e.g. imgur)
● Large objects
○ Medical/scientific images
○ Store your fancy images from the moon (e.g. NASA)
○ Store your VM from the cloud

Page 10: Initial presentation of swift (for montreal user group)

History

● Rackspace Cloud Files V1
○ Distributed storage
○ Centralized metadata
○ PostgreSQL DB
● 2009: Rackspace Cloud Files V2 (Swift)
○ Full redesign and rewrite. Open source.
○ API compatible with Amazon S3
○ Worked closely with ops
○ Distributed storage and metadata
○ Logical placement, based on an algorithm

Page 11: Initial presentation of swift (for montreal user group)

Swift architecture

● Highly available, distributed, eventually consistent object storage, using commodity servers
● Eventually consistent: a write is acknowledged without waiting for confirmation of full replication
○ In terms of the CAP theorem, Swift chose:
■ availability and partition tolerance
■ over consistency
● 3 rings to replicate
○ Accounts
○ Containers
○ Objects

Page 12: Initial presentation of swift (for montreal user group)

Swift architecture

[Diagram: Proxy ×4, Storage ×4, The Ring]

● Multiple components, usually on 2 types of nodes
○ Proxy servers: run the swift-proxy-server processes, which proxy requests to the appropriate storage nodes. They also contain the TempAuth service as WSGI middleware.
○ Storage servers: run the swift-account-server, swift-container-server, and swift-object-server processes, which control storage of the account databases, the container databases, and the actual stored objects.

Page 13: Initial presentation of swift (for montreal user group)

Swift architecture

● Proxy tier
○ Handles incoming requests
○ Scales horizontally

Page 14: Initial presentation of swift (for montreal user group)

Swift architecture

● The Ring
○ Maps data (accounts, containers, objects) to storage servers

[Diagram: example of 3-replication]

Page 15: Initial presentation of swift (for montreal user group)

Swift architecture

● Storage zones
○ Isolate physical failures

Page 16: Initial presentation of swift (for montreal user group)

Swift architecture

● Quorum writes
○ The proxy acknowledges once the 2nd replica is written, without waiting for the 3rd

[Diagram: lookup]
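The quorum rule on this slide (acknowledge after 2 of 3 replicas) can be sketched as a simple majority check. This is an illustration only, not Swift's actual code; the function names are made up for the example, and the majority formula is an assumption that happens to match the 2-of-3 case described here.

```python
def quorum_size(replicas):
    """Smallest number of successful writes that forms a majority."""
    return replicas // 2 + 1


def write_acknowledged(results):
    """results: one boolean per replica write attempt (True = written)."""
    return sum(results) >= quorum_size(len(results))


print(quorum_size(3))                            # 2
print(write_acknowledged([True, True, False]))   # True
print(write_acknowledged([True, False, False]))  # False
```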

Page 17: Initial presentation of swift (for montreal user group)

Swift architecture

● Single-disk reads

[Diagram: lookup]

Page 18: Initial presentation of swift (for montreal user group)

Swift architecture

● Replication
○ A process that runs continuously and also checks integrity

Page 19: Initial presentation of swift (for montreal user group)

Swift features

● ACL
○ Free-form, implemented by the auth system middleware
● Healthcheck
○ Simple healthcheck page for load balancers
● Ratelimit
○ Rate limiting of requests
● Staticweb
○ Serves index.html from containers
● TempURL
○ Temporary URL generation for objects
● FormPost
○ Translates a browser form post into a regular Swift object PUT
● Domain Remap
○ Pretty URLs with domain-based containers
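To make the TempURL feature more concrete, here is a minimal sketch of how a temporary URL is typically signed: an HMAC over the method, the expiry timestamp and the object path, using a secret key set on the account. The account, container, object and key values below are hypothetical, and HMAC-SHA1 is assumed; check the TempURL middleware documentation of your Swift version for the digests it accepts.

```python
import hmac
from hashlib import sha1


def temp_url(method, path, key, expires):
    """Build a signed, time-limited URL path for one object.

    method  -- HTTP method the URL will allow, e.g. "GET"
    path    -- object path, e.g. "/v1/AUTH_demo/photos/cat.jpg" (hypothetical)
    key     -- secret key configured on the account (hypothetical value below)
    expires -- Unix timestamp after which the URL stops working
    """
    body = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), body.encode(), sha1).hexdigest()
    return f"{path}?temp_url_sig={signature}&temp_url_expires={expires}"


print(temp_url("GET", "/v1/AUTH_demo/photos/cat.jpg", "mysecret", 1400000000))
```

Anyone holding the resulting URL can fetch the object until the expiry time, without needing an auth token.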

Page 20: Initial presentation of swift (for montreal user group)

Swift features

● Bulk operations
○ Multiple DELETEs or uploads, or even a tar.(b|g)z upload
● Account quotas
○ Give the operator the ability to limit accounts or set them read-only
● Container quotas
○ Allow a user to restrict a public container (e.g. with FormPost)
● Large objects (upload > 5 GB)
○ Split into segments when uploaded. Downloads return a single assembled object; supports files of virtually unlimited size
● CORS
○ Upload directly from the browser to Swift via JavaScript
● Versioning
○ Allows versioning of all objects in a container
● Swift3
○ S3-compatible, but this one has been pulled out of Swift
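The large-object idea (many segments stored separately, served back as one object) can be sketched as follows. This is a toy model working on in-memory bytes, not Swift's implementation: the segment naming scheme and the prefix are assumptions chosen so that sorting by name restores the original order.

```python
def split_into_segments(data, segment_size, prefix):
    """Cut data into fixed-size segments named '<prefix>/<zero-padded index>'."""
    segments = []
    for index, start in enumerate(range(0, len(data), segment_size)):
        name = f"{prefix}/{index:08d}"
        segments.append((name, data[start:start + segment_size]))
    return segments


def assemble(segments):
    """Concatenate segments in name order, as a download would see them."""
    return b"".join(chunk for _, chunk in sorted(segments))


payload = bytes(range(10))
parts = split_into_segments(payload, segment_size=4, prefix="video_segments/movie")
print([name for name, _ in parts])
print(assemble(parts) == payload)
```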

Page 21: Initial presentation of swift (for montreal user group)

The API

● Bindings for different languages: Python, Ruby, Java…
● Multiple CLI tools: python-swiftclient, jclouds, fog

Page 22: Initial presentation of swift (for montreal user group)

The API

● Swift CLI:
○ delete, download, list, post, stat, upload, capabilities
○ post: updates meta information for the account, container, or object
● Examples of metadata (HTTP headers)
○ X-Account-Access-Control (for ACL)
○ X-Account-Sysmeta-Global-Write-Ratelimit (for ratelimit)
○ X-Object-Manifest (for dynamic large objects)
○ X-Versions-Location (for object versioning)
○ X-Container-Sync-* (used internally for container synchronisation)
○ X-Delete-At and X-Delete-After (for object expiration)
○ X-Container-Meta-Access-Control (for CORS)
● Other
○ crossdomain.xml (for cross-domain policies)
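Since these features are driven by plain HTTP headers, a request can be described with nothing more than a method, a URL and a header dictionary. The sketch below builds (but does not send) an upload that asks for the object to expire via X-Delete-After. The storage URL, container, object and token values are hypothetical, and X-Auth-Token is the usual authentication header rather than something listed on the slide.

```python
def expiring_upload_request(storage_url, container, obj, ttl_seconds, token):
    """Describe (not send) a PUT that asks for the object to be deleted later."""
    return {
        "method": "PUT",
        "url": f"{storage_url}/{container}/{obj}",
        "headers": {
            "X-Auth-Token": token,               # assumption: token-based auth
            "X-Delete-After": str(ttl_seconds),  # object expiration, in seconds
        },
    }


request = expiring_upload_request(
    "http://swift.example.com:8080/v1/AUTH_demo",  # hypothetical storage URL
    "backups", "db.dump", ttl_seconds=3600, token="hypothetical-token",
)
print(request["method"], request["url"])
print(request["headers"])
```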

Page 23: Initial presentation of swift (for montreal user group)

Demo using Cyberduck

Connection templates here: https://trac.cyberduck.io/wiki/help/en/howto/openstack

Page 24: Initial presentation of swift (for montreal user group)

Thank you

Page 25: Initial presentation of swift (for montreal user group)
Page 26: Initial presentation of swift (for montreal user group)

BACKUP SLIDES (for Q&A)

Page 27: Initial presentation of swift (for montreal user group)

Proxy Servers

● Swift’s public face
○ The entry point, and it has to do a lot of work too
● Determines the appropriate storage nodes
○ By using a logical map
● Coordinates responses
○ Ensures at least two replicas have been successfully written to disk before confirming to the client

Page 28: Initial presentation of swift (for montreal user group)

The ring

● Used by proxies and replication processes
● Maps requests to storage nodes
● Availability zones
○ Ensure replicas of your objects are placed as far apart as possible
● Regions
○ Support for global clusters, multi-region replication
● Scale out without affecting most entities
○ Only a fraction needs to be moved around
○ Still, it’s better to use the weighting system
● Up to you how to synchronise the ring

Page 29: Initial presentation of swift (for montreal user group)

The ring

Example:
- partition power of 3
- the first 3 digits of the MD5 hash are the ring coordinates

[Diagram: MD5 hash]
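The example on this slide can be sketched in Python: hash the object path with MD5 and keep the leading bits, as many as the partition power, to obtain the partition number. This is a simplified illustration. The device table and the placement function are made up, and a real deployment also mixes a per-cluster secret into the hash and uses a ring file built by the ring-builder tooling.

```python
import hashlib
import struct

PART_POWER = 3  # 2**3 = 8 partitions, as in the slide's example
REPLICAS = 3
# Hypothetical devices; a real ring assigns partitions using zones and weights.
DEVICES = ["node1/sdb", "node2/sdb", "node3/sdb", "node4/sdb"]


def partition_for(account, container, obj):
    """Map an object path to a partition using the top bits of its MD5 hash."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    # Read the first 4 bytes as a big-endian integer, keep the top PART_POWER bits.
    return struct.unpack_from(">I", digest)[0] >> (32 - PART_POWER)


def devices_for(partition):
    """Toy placement: REPLICAS consecutive devices, starting at the partition index."""
    return [DEVICES[(partition + r) % len(DEVICES)] for r in range(REPLICAS)]


part = partition_for("AUTH_demo", "photos", "cat.jpg")
print(part, devices_for(part))
```

Because the partition depends only on the hash, any proxy can compute it without asking a central metadata service.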

Page 30: Initial presentation of swift (for montreal user group)

Account / Container Servers

● Stored in SQLite databases
● Simple schema
○ Table for listing
○ Table for metadata
○ Stats information
● Scaling
○ With high concurrency, SQLite generates a lot of I/O
○ Wait, this is when you use ‘ratelimit’
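A rough idea of what "a table for listing, a table for metadata, and stats" could look like is sketched below with Python's sqlite3 module. The table and column names are hypothetical and much simpler than the schema Swift really uses; the point is only that a container is a small SQLite database that can be listed and summarised.

```python
import sqlite3


def new_container_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE object (name TEXT PRIMARY KEY, size INTEGER, created_at TEXT);
        CREATE TABLE metadata (key TEXT PRIMARY KEY, value TEXT);
    """)
    return db


def stats(db):
    """Return (object_count, bytes_used) for the container."""
    query = "SELECT COUNT(*), COALESCE(SUM(size), 0) FROM object"
    return db.execute(query).fetchone()


db = new_container_db()
db.execute("INSERT INTO object VALUES ('a.txt', 10, '1400000000.00000')")
db.execute("INSERT INTO object VALUES ('b.txt', 20, '1400000001.00000')")
print(stats(db))
```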

Page 31: Initial presentation of swift (for montreal user group)

Object Servers

● Uses the filesystem to store files
○ The file (object) is written to disk ‘as is’
● Uses ‘xattrs’ to store metadata
○ On ext4, XFS
● Files are named by timestamp
○ Last write always wins
○ A deletion is treated as a version of the file, marked by a tombstone object
● Directory structure
○ /mount/data_dir/partition/hash_suffix/hash/object.ts
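The "last write wins" rule and the tombstone convention can be illustrated with a small function that looks at the file names in one object directory. It assumes, for the sake of the example, that data files end in .data, tombstones end in .ts, and the part before the extension is the write timestamp; the function name is made up.

```python
def current_state(filenames):
    """Return (state, newest_file) for one object's directory listing.

    state is 'present', 'deleted' (newest file is a tombstone) or 'missing'.
    """
    candidates = [f for f in filenames if f.endswith((".data", ".ts"))]
    if not candidates:
        return "missing", None
    # The timestamp is everything before the final extension.
    newest = max(candidates, key=lambda f: float(f.rsplit(".", 1)[0]))
    state = "deleted" if newest.endswith(".ts") else "present"
    return state, newest


print(current_state(["1400000000.00000.data"]))
print(current_state(["1400000000.00000.data", "1400000100.00000.ts"]))
```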

Page 32: Initial presentation of swift (for montreal user group)

Replication

● N-factor, configurable; the default is 3
● Asynchronous, peer-to-peer replicator process
○ Traverses the local filesystem to detect changes
○ Performs operations concurrently, balancing load across physical disks
● Push model
○ Records and files are generally only copied from local to remote replicas
○ It’s the duty of a node holding data to ensure its data gets to where it belongs
○ Replica placement is handled by the ring

Page 33: Initial presentation of swift (for montreal user group)

Replication

● DB replication
○ Hash comparison of DB files
○ Replicates the whole database file using rsync; a new unique id is assigned
● Object replication
○ Uses rsync for transport
○ Syncs only subsets of directories
○ Hash-based
○ Bound by the number of uncached directories it has to traverse
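The hash-based, push-style object replication described here can be sketched as: hash each local subdirectory, compare with the hashes reported by the remote replica, and push only the subdirectories that differ. The data layout below (a dict from suffix directory to file names) and the function names are assumptions made for the example; the actual transfer (rsync in the slides) is left out.

```python
import hashlib


def suffix_hashes(partition):
    """partition: {suffix_dir: [file names]} -> {suffix_dir: hash of its listing}."""
    return {
        suffix: hashlib.md5("".join(sorted(names)).encode()).hexdigest()
        for suffix, names in partition.items()
    }


def suffixes_to_sync(local, remote):
    """Suffix directories the local node should push to the remote replica."""
    local_hashes = suffix_hashes(local)
    remote_hashes = suffix_hashes(remote)
    return sorted(s for s, h in local_hashes.items() if remote_hashes.get(s) != h)


local = {"a83": ["1400000000.00000.data"], "f01": ["1400000050.00000.data"]}
remote = {"a83": ["1400000000.00000.data"], "f01": []}
print(suffixes_to_sync(local, remote))
```

Only the directories whose hashes differ are transferred, which is why the cost is dominated by directory traversal rather than by data volume.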

Page 34: Initial presentation of swift (for montreal user group)

Middleware

● Standard WSGI
○ Pipeline composed of a succession of middleware, ending with one application. The last one,
● Usually provided by the proxy
○ But it can be provided by other server roles
● Auth is pluggable via middleware
○ swauth
○ keystone
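A WSGI pipeline of this kind can be sketched in plain Python: each middleware wraps the next callable and either answers the request itself or passes it along. The example below wraps a stand-in application with a healthcheck-style filter. All names here are illustrative; this is not Swift's middleware code, and the small `call` helper exists only to exercise the pipeline without a web server.

```python
def app(environ, start_response):
    """Stand-in for the final application at the end of the pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"handled by app"]


def healthcheck_filter(next_app):
    """Middleware: answer /healthcheck directly, pass everything else on."""
    def middleware(environ, start_response):
        if environ.get("PATH_INFO") == "/healthcheck":
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"OK"]
        return next_app(environ, start_response)
    return middleware


pipeline = healthcheck_filter(app)


def call(wsgi_app, path):
    """Helper for the example: invoke a WSGI app and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


print(call(pipeline, "/healthcheck"))
print(call(pipeline, "/v1/AUTH_demo/photos"))
```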

Page 35: Initial presentation of swift (for montreal user group)

Swift cost estimation

Amazon S3 in the initial slides: price of $0.085 per GB per month. ROI after 5-6 months

http://www.slideshare.net/joearnold/7-steps-to-roll-out-a-private-open-stack-swift-cluster-joe-arnold-swiftstack-20120417

Page 36: Initial presentation of swift (for montreal user group)

Swift cost estimation

Amazon S3 in the initial slides: price of $0.085 per GB per month. ROI after barely 9 months
○ Monthly S3 cost for 145 TB = $10,600 ($8.5k with reduced redundancy)
○ Monthly S3 cost for 1.3 PB = $82,600 ($66k with reduced redundancy)

http://www.slideshare.net/joearnold/7-steps-to-roll-out-a-private-open-stack-swift-cluster-joe-arnold-swiftstack-20120417

Page 37: Initial presentation of swift (for montreal user group)

Connecting to Swift (I)

1. (Example using a ca.enocloud.com account)
2. Download your openrc.sh file
3. Source it (e.g. source marcos.garcia-openrc.sh)
4. Enter your password
5. Run “keystone catalog” to validate the keystone public URL
6. Retrieve the object-store public URL (e.g. http://198.154.188.142:8080/v1/AUTH_17698de747ea403283730999605716c9)
7. Use the swift CLI to validate (e.g. swift list)
8. In Cyberduck, set up an ‘Openstack Swift (Keystone HTTP)’ connection with tenant:username (e.g. marcos.garcia:marcos.garcia) and password, server ca.enocloud.com and port 5000

Page 38: Initial presentation of swift (for montreal user group)

Connecting to Swift (II)

Page 39: Initial presentation of swift (for montreal user group)

Connecting to Swift (III)

Page 40: Initial presentation of swift (for montreal user group)

Connecting to Swift (IV)