FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Jan 03, 2016

elaine-sutton

Transcript
Page 1: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Page 2: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Introduction

• Farsite: serverless distributed file system
  • Logically functions as a centralized file server
• Designed for desktop environments
  • Needs some effort for initial configuration
  • Needs little central administration to maintain

Page 3: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Farsite Characteristics

• Peer-to-peer among untrusted machines
• Need to handle privacy, integrity, durability
  • Cryptography
  • Randomized replication
  • Byzantine fault-tolerance

Page 4: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Farsite Workloads

• High access locality
• Low update rate
• Sequential accesses with rare concurrency

Page 5: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Administration

• Machine certificates bind machines to their public keys
• User certificates bind users to their public keys
• Namespace certificates bind namespace roots to their managing machines

Page 6: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Design Assumptions

• Designed for ~10^5 machines
• All interconnected by a high-bandwidth, low-latency network
• Majority of machines to be up most of the time
• Uncorrelated permanent machine failures
• Read-mostly sharing
• Few malicious users

Page 7: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Enabling Technology Trends

• Increase in unused disk capacity
  • In 2000, 58% of disk capacity unused at Microsoft
  • Can replicate data for reliability
• Decrease in the computational cost of cryptography
  • Can easily encrypt at 53 MB/sec
  • Disk transfers at 32 MB/sec
  • Can use strong cryptography for security

Page 8: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Namespace Roots

• Allow multiple roots for multiple machines

Page 9: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Trust and Certification

• Based on public-key-cryptographic certificates
  • Encrypt(Key_public, text_plain) → text_cipher
  • Decrypt(Key_private, text_cipher) → text_plain
  • Encrypt(Key_private, text_plain) → text_cipher
  • Decrypt(Key_public, text_cipher) → text_plain

Page 10: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Public Key Encryption Basics

• Idea
  • Public key is published
  • Private key is the secret
• Encrypt(Key_my_public, “Hi, Andy”)
  • Anyone can create it, but only I can read it
• Encrypt(Key_my_private, “I’m Andy”)
  • Everyone can read it, but only I can create it

Page 11: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Public Key Encryption Basics

• Encrypt(Key_your_public, Encrypt(Key_my_private, “I know your secret”))
  • Only you can read it, and only I can send it
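The nested call on this slide (private-key operation inside, public-key operation outside) can be sketched with textbook RSA. This is a toy under loud assumptions: the primes are tiny, there is no padding, the message is a small integer, and the names `make_keypair` and `apply_key` are hypothetical. It is not secure and is not Farsite's implementation.

```python
# Textbook RSA with tiny primes: illustration only, never secure.
def make_keypair(p, q, e):
    """Return ((e, n), (d, n)): a public key and its matching private key."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (needs Python 3.8+)
    return (e, n), (d, n)

def apply_key(key, value):
    """Encrypt or decrypt: both are modular exponentiation with the given key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

my_public, my_private = make_keypair(61, 53, 17)     # modulus 3233
your_public, your_private = make_keypair(89, 97, 5)  # modulus 8633

message = 42  # must be smaller than both moduli in this toy

# Only I can create the inner layer; only you can remove the outer layer.
inner = apply_key(my_private, message)
sealed = apply_key(your_public, inner)

# You remove the outer layer with your private key, then check the inner
# layer with my public key.
opened = apply_key(your_private, sealed)
recovered = apply_key(my_public, opened)
```

The inner value stays below my modulus, which is smaller than yours, so the outer layer can wrap it without loss in this toy setup.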

Page 12: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Basic System

• Every machine has three roles
  • Client
    • A machine that interacts with a user
  • Directory group
    • A set of machines that manage files via a Byzantine-fault-tolerant protocol
    • Every group member owns a replica
  • File host

Page 13: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

More on the Basic System

+ Reliability
+ Data integrity
- Performance
  • Byzantine fault-tolerant protocols tolerate fewer than 1/3 faulty replicas
  • Need lots of replicas
- Privacy
- Storage consumption

Page 14: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

System Enhancements

• Local caching
  • A client can lease a copy of a file
• Encrypt written files with public keys of all authorized clients
  • Offload those files to file hosts
  • Store only the content hash of those files locally
  • Can detect damaged copies by checking them against the hash
  • With n file hosts, can tolerate n - 1 file host failures
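Why a stored content hash lets the system survive n - 1 bad file hosts can be sketched as follows. SHA-256 and the helper names are assumptions made for illustration; the slides do not name a hash function.

```python
import hashlib

def content_hash(data: bytes) -> str:
    # SHA-256 is an assumption; any collision-resistant hash would do here.
    return hashlib.sha256(data).hexdigest()

def first_valid_copy(copies, expected_hash):
    """Return the first copy whose hash matches the trusted hash, else None.

    `copies` holds what each file host returned; None models a host that is down.
    """
    for data in copies:
        if data is not None and content_hash(data) == expected_hash:
            return data
    return None

original = b"quarterly report"
stored_hash = content_hash(original)  # the only thing kept outside the file hosts

# Three file hosts: one is down, one returns a damaged copy, one is intact.
copies = [None, b"quarterly rep0rt", original]
good = first_valid_copy(copies, stored_hash)
```

One intact copy out of n is enough, because any damaged copy is rejected by the hash check rather than outvoted.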

Page 15: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Traditional Byzantine Approach [CL99]

[Figure: a client, the Byzantine fault-tolerant protocol, and Byzantine servers holding both file and meta-data]

• 3f + 1 file copies to handle f failures

Page 16: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Farsite: BFT only for meta-data

[Figure: a client, the Byzantine fault-tolerant protocol, a directory group, and file hosts]

• f + 1 file copies for f failures
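The storage difference between the two designs is simple arithmetic on the replica counts quoted on these two slides. A small sketch, with hypothetical function names:

```python
def bft_file_copies(f: int) -> int:
    """File copies when the file data itself goes through the BFT protocol."""
    return 3 * f + 1

def hash_checked_file_copies(f: int) -> int:
    """File copies when only meta-data uses BFT and file data is hash-checked."""
    return f + 1

# Copies needed to survive f failures under each design.
table = {f: (bft_file_copies(f), hash_checked_file_copies(f)) for f in (1, 2, 3)}
for f, (bft, hashed) in table.items():
    print(f"f={f}: {bft} copies with BFT on file data, {hashed} with hash-checked file hosts")
```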

Page 17: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Semantic Differences from NTFS

• Hard limit on concurrent writes
• Soft limit on concurrent reads
  • Sometimes supplies stale snapshots
• No name-locking on an open file’s path

Page 18: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

File System Features

• Reliability
• Availability
• Security
• Durability
• Consistency
• Scalability
• Efficiency
• Manageability

Page 19: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Reliability and Availability

• Replication
  • When a machine is unavailable for an extended period, its functions migrate to others
• Caching

Page 20: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Privacy

• File content and metadata are encrypted
• Convergent encryption
  • Encrypt(Hash_one_way(block_plain), block_plain) → block_cipher

[Figure: data blocks, hash, encrypt]
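Convergent encryption as written on this slide uses the one-way hash of a block as that block's encryption key. The sketch below shows the idea with Python's standard library only. The XOR keystream cipher built from SHA-256 is a stand-in chosen so the example runs without extra packages; it is an assumption, not the cipher Farsite uses, and it should not be used for real data.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter (stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def convergent_encrypt(block: bytes):
    """Key the cipher with the hash of the plaintext block itself."""
    key = hashlib.sha256(block).digest()
    return key, xor_cipher(key, block)

def convergent_decrypt(key: bytes, cipher_block: bytes) -> bytes:
    return xor_cipher(key, cipher_block)

# Two users encrypting the same block independently get the same ciphertext,
# so duplicates can be spotted without reading the plaintext.
key_a, cipher_a = convergent_encrypt(b"common application block")
key_b, cipher_b = convergent_encrypt(b"common application block")
key_c, cipher_c = convergent_encrypt(b"a different block")
```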

Page 21: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

More on Convergent Encryption

• Block hashes are used to identify identical block contents
• Block-level encryption allows block-level changes without re-encrypting the entire file

Page 22: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

More on Convergent Encryption

• Encrypt(Key_file, file_hashes_plain) → file_hashes_cipher

[Figure: block hashes, encrypt]

Page 23: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

More on Convergent Encryption

• Encrypt(Key_client1_public, Key_file) → Key_file_cipher1
• Encrypt(Key_client2_public, Key_file) → Key_file_cipher2
• …
• Store both the encrypted file and the encrypted keys

Page 24: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Directories

• Also encrypted
• Use exclusive encryption
  • Prevents a malicious client from encrypting a syntactically illegal name

Page 25: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Integrity

• Use hash trees to compare files
  • If the roots match, the two files are identical
  • If not, compare the hashes at the lower level
  • Repeat until the discrepancy is identified
• The cost of an in-place update is logarithmic in the file size
• Logarithmic time to verify the integrity of an individual block
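The root-then-descend comparison on this slide can be sketched with a small binary hash tree. SHA-256, the function names, and the restriction to a power-of-two number of blocks are all assumptions made to keep the example short.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return tree levels: levels[0] holds leaf hashes, levels[-1] holds the root."""
    if not blocks or len(blocks) & (len(blocks) - 1):
        raise ValueError("this sketch needs a power-of-two block count")
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def first_difference(tree_a, tree_b):
    """Index of the leftmost differing block, or None if the roots match."""
    if tree_a[-1] == tree_b[-1]:
        return None
    index = 0
    # Walk down from just below the root, following the child that differs.
    for depth in range(len(tree_a) - 2, -1, -1):
        left = 2 * index
        index = left if tree_a[depth][left] != tree_b[depth][left] else left + 1
    return index

tree_a = build_tree([b"a", b"b", b"c", b"d"])
tree_b = build_tree([b"a", b"b", b"X", b"d"])
```

The walk touches one pair of hashes per level, which is why locating a changed block costs a number of comparisons proportional to the tree height rather than to the number of blocks.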

Page 26: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Durability

• Updates are logged and compressed locally
• The log is pushed back to the directory group periodically and when a lease is recalled
• Each log entry is verified

Page 27: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Consistency

• Control can be loaned to clients
  • Content leases
  • Name leases
  • Mode leases
  • Access leases

Page 28: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Data Consistency

• Content leases
  • Read/write
  • Read-only
    • Assures no stale data
• Single-writer, multiple-reader semantics
• A lease is kept until it expires or is recalled
• Can lease a file, a directory, or a tree
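The single-writer, multiple-reader rule with expiry and recall can be sketched as a small lease table for one file. The class, its method names, and the integer clock are hypothetical; a real directory group would recall conflicting leases itself rather than return False to the caller.

```python
class ContentLeases:
    """Read-only and read/write leases on one file (illustrative sketch)."""

    def __init__(self, duration):
        self.duration = duration
        self.readers = {}    # client -> expiry time of its read-only lease
        self.writer = None   # (client, expiry time) of the read/write lease

    def _expire(self, now):
        # Drop every lease whose expiry time has passed.
        self.readers = {c: t for c, t in self.readers.items() if t > now}
        if self.writer and self.writer[1] <= now:
            self.writer = None

    def acquire_read(self, client, now):
        self._expire(now)
        if self.writer and self.writer[0] != client:
            return False  # another client holds the read/write lease
        self.readers[client] = now + self.duration
        return True

    def acquire_write(self, client, now):
        self._expire(now)
        others_reading = any(c != client for c in self.readers)
        if others_reading or (self.writer and self.writer[0] != client):
            return False  # readers or another writer are still active
        self.writer = (client, now + self.duration)
        return True

    def recall(self, client):
        # Take back every lease the client holds on this file.
        self.readers.pop(client, None)
        if self.writer and self.writer[0] == client:
            self.writer = None

leases = ContentLeases(duration=10)
```

For example, two clients can hold read-only leases at time 0, a third client's write request at time 1 is refused, and the same request succeeds at time 11 once the read leases have expired.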

Page 29: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Namespace Consistency

• Name leases
  • Can create a file name
  • Can create a directory and its files and subdirectories

Page 30: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Windows File-Sharing Semantics

• Mode leases
  • Read, write, delete, exclude-read, exclude-write, exclude-delete

Page 31: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Windows Deletion Semantics

• Open it, mark it for deletion, close it
• A file is not deleted until the last file close
• Access leases
  • Public: Lease holder has the file open
  • Protected
    • No other client will be granted access without first contacting the lease holder
  • Private
    • No other client has any access lease on the file
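The delete-on-last-close rule from this slide can be sketched with a handle counter. The class and its names are hypothetical, and the sketch ignores the access leases a distributed system needs to learn that the last handle has closed.

```python
class TrackedFile:
    """Models open, mark-for-deletion, close on a single file (sketch)."""

    def __init__(self):
        self.handles = 0             # number of currently open handles
        self.delete_pending = False  # set by mark_for_deletion
        self.deleted = False

    def open(self):
        if self.deleted:
            raise FileNotFoundError("file has already been deleted")
        self.handles += 1

    def mark_for_deletion(self):
        self.delete_pending = True

    def close(self):
        self.handles -= 1
        # The file only goes away when the last handle closes.
        if self.handles == 0 and self.delete_pending:
            self.deleted = True

f = TrackedFile()
f.open()                # first client opens the file
f.open()                # second client opens the file
f.mark_for_deletion()
f.close()
still_there = not f.deleted   # one handle remains, so the file survives
f.close()
gone = f.deleted              # last close triggers the deletion
```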

Page 32: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Scalability

• Hint-based pathname translation
• Caching
• Delayed directory-change notification

Page 33: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Space Efficiency

• Reclaim space from duplicate files
  • Workgroup-shared documents
  • Multiple copies of common applications
  • Can save 50% of storage requirement
  • Based on hash comparisons
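Finding duplicate files by hash comparison can be sketched as grouping files by content hash and counting the bytes of every copy beyond the first. The function name, the in-memory file mapping, and SHA-256 are assumptions for illustration.

```python
import hashlib
from collections import defaultdict

def duplicate_savings(files):
    """files maps path -> content bytes.

    Returns (groups of duplicate paths, bytes reclaimable by keeping one copy).
    """
    by_hash = defaultdict(list)
    for path, data in files.items():
        by_hash[hashlib.sha256(data).hexdigest()].append(path)

    groups = [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
    reclaimable = sum(
        len(files[paths[0]]) * (len(paths) - 1) for paths in by_hash.values()
    )
    return groups, reclaimable

files = {
    "alice/app.exe": b"x" * 100,   # same application installed twice
    "bob/app.exe": b"x" * 100,
    "bob/notes.txt": b"hello",
}
groups, reclaimable = duplicate_savings(files)
```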

Page 34: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Time Efficiency

• Insert a delay between a file's creation and its replication
  • Expect many files to be deleted shortly after their creation
  • Reduced network traffic
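The effect of the replication delay can be sketched by replaying a list of create and delete events and replicating only files that have survived for the whole delay. The event format, the function name, and the numeric timestamps are hypothetical.

```python
def files_to_replicate(events, delay, now):
    """events: (time, op, name) tuples, where op is 'create' or 'delete'.

    Returns the names that still exist and are at least `delay` old at `now`.
    """
    created = {}
    for time, op, name in sorted(events):
        if op == "create":
            created[name] = time
        elif op == "delete":
            created.pop(name, None)  # deleted before replication: nothing to send
    return sorted(n for n, t in created.items() if now - t >= delay)

events = [
    (0, "create", "tmp.log"),
    (1, "create", "report.doc"),
    (2, "delete", "tmp.log"),     # short-lived file never gets replicated
    (9, "create", "new.txt"),     # too young at time 10
]
due = files_to_replicate(events, delay=5, now=10)
```

Only `report.doc` is due here: the temporary file was deleted inside the delay window, so no network traffic was spent on it.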

Page 35: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Local-Machine Administration

• Machine replacement
  • A special case of hardware failure
• Little need for backup

Page 36: FARSITE:  Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

Performance Measurements

• Used only five machines…
• With only 1 hour of file-system trace
  • 450,164 file operations
• 2 to 4 times as long as NTFS for reads/writes/closes
• 9 times as long for opens
• 20 times as long for metadata accesses
• 5.5 times slower I/O latencies