LEC16 Dist Para File Systems - ranger.uta.edu

Post on 18-Dec-2021


CSE 3320 Operating Systems

Distributed and Parallel File Systems

Jia Rao, Department of Computer Science and Engineering

http://ranger.uta.edu/~jrao

Recap of Previous Classes

• File systems provide an abstraction of permanently stored data
  o Namespace: files and directories
    } Translate paths to corresponding locations on disks
  o Space management and optimizations
    } Free blocks
    } Caching and prefetching
  o Reliability and consistency

Distributed and Parallel File Systems

• Provide similar abstractions of data on multiple machines
  o Namespace: path name → machine ID : disk block address
  o Management: placement of files on machines
    } Replication
    } Striping
• Designed for performance and availability

Distributed vs. Parallel File Systems

• Design objectives
  o Fault tolerance vs. concurrent performance
• Data distribution
  o Entire file on a single node vs. striping over multiple nodes
• Symmetry
  o Storage co-located with apps vs. storage separated from apps
• Fault tolerance
  o Designed for fault tolerance vs. relying on enterprise storage
• Workload
  o Loosely coupled, distributed apps vs. coordinated HPC apps

The boundary is blurring

Examples

• Distributed File Systems
  o NFS, GFS (Google File System), HDFS (Hadoop Distributed File System), GlusterFS
• Parallel File Systems
  o PVFS (Parallel Virtual File System), Lustre, OCFS2, GPFS

Design Issues (1)

• Name server
  o Maps file names to objects (files, directories, blocks)
  o Implementation options
    } Single name server
      ¨ Simple implementation, reliability and performance issues
    } Several name servers (on different hosts)
      ¨ Each server responsible for a domain
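
The "several name servers" option can be illustrated with a small sketch. It assumes that a domain is a path prefix and that a lookup goes to the server owning the longest matching prefix; the class name, host names, and prefix rule are hypothetical and not taken from any particular file system.

```python
from typing import Dict


class NameServiceRouter:
    """Routes a path lookup to the name server responsible for its domain."""

    def __init__(self, domains: Dict[str, str]):
        # domains maps a path prefix (the "domain") to a hypothetical host name.
        self._domains = {self._norm(prefix): host for prefix, host in domains.items()}

    @staticmethod
    def _norm(path: str) -> str:
        # Normalize to a single leading slash and no trailing slash.
        return "/" + path.strip("/")

    def server_for(self, path: str) -> str:
        """Return the host whose domain is the longest prefix of path."""
        path = self._norm(path)
        best = None
        for prefix, host in self._domains.items():
            # Match whole path components only, so /home does not own /homework.
            matches = prefix == "/" or path == prefix or path.startswith(prefix + "/")
            if matches and (best is None or len(prefix) > len(best[0])):
                best = (prefix, host)
        if best is None:
            raise KeyError(f"no name server is responsible for {path}")
        return best[1]
```

With domains `{"/": "ns0", "/home": "ns1"}`, a lookup of `/home/bob` goes to `ns1` while `/homework/x` falls back to `ns0`. A single name server corresponds to registering only the `/` domain.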

Design Issues (2)

• Caching
  o Caching at the client: main memory vs. disk
  o Cache consistency
    } Server initiated
      ¨ Server informs cache managers when data in client caches is stale
      ¨ Client cache managers invalidate stale data or retrieve new data
      ¨ Disadvantage: extensive communication
    } Client initiated
      ¨ Cache managers at the clients validate data with the server before returning it to clients
      ¨ Disadvantage: extensive communication
    } Prohibit file caching when concurrent-writing
      ¨ Several clients open a file, at least one of them for writing
      ¨ Server informs all clients to purge that cached file
    } Lock files when concurrent-write sharing (at least one client opens for write)
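
The client-initiated scheme above can be sketched as follows. The sketch assumes the server keeps a version counter per file and that the client compares its cached version with the server's before returning data; the class and method names are hypothetical. The `validations` counter makes the "extensive communication" cost visible: every cached read still costs one round trip.

```python
class VersionedServer:
    """Toy file server that bumps a version number on every write."""

    def __init__(self):
        self._files = {}        # name -> (version, data)
        self.validations = 0    # number of validation round trips served

    def write(self, name, data):
        version = self._files.get(name, (0, b""))[0] + 1
        self._files[name] = (version, data)

    def version(self, name):
        self.validations += 1
        return self._files[name][0]

    def read(self, name):
        return self._files[name]


class ValidatingClientCache:
    """Client cache manager that validates with the server before each hit."""

    def __init__(self, server):
        self._server = server
        self._cache = {}        # name -> (version, data)

    def read(self, name):
        cached = self._cache.get(name)
        # Return the cached copy only if the server confirms it is current.
        if cached is not None and cached[0] == self._server.version(name):
            return cached[1]
        # Miss or stale copy: fetch the new data and remember its version.
        self._cache[name] = self._server.read(name)
        return self._cache[name][1]
```

A server-initiated design would invert the direction: the server would track which clients cache a file and notify them on each write.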

Design Issues (3)

• Update (write) policy
  o Once a client writes into a file (and the local cache), when should the modified cache be sent to the server?
    } Write-through: all writes at the clients are immediately transferred to the servers
      ¨ Advantage: reliability
      ¨ Disadvantage: performance; it does not take advantage of the cache
    } Delayed writing: delay transfer to servers
      ¨ Advantage: many writes take place (including intermediate results) before a transfer
      ¨ Disadvantage: reliability; some data may be lost
    } Delayed writing until file is closed at client
      ¨ For short open intervals, same as delayed writing
      ¨ For long intervals, reliability problems
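
A minimal sketch of the update policies above, assuming a block-granularity client cache; the class names and the `transfers` counter are hypothetical and only serve to make the trade-off measurable. Write-through sends every write to the server, while delayed writing sends only the final contents of each dirty block at flush or close time.

```python
class CountingFileServer:
    """Toy server that records how many block transfers it receives."""

    def __init__(self):
        self.blocks = {}
        self.transfers = 0

    def store(self, block_id, data):
        self.blocks[block_id] = data
        self.transfers += 1


class WritingClient:
    """Client cache with a selectable update policy."""

    def __init__(self, server, write_through):
        self.server = server
        self.write_through = write_through
        self.cache = {}
        self.dirty = set()      # blocks modified locally but not yet sent

    def write(self, block_id, data):
        self.cache[block_id] = data
        if self.write_through:
            # Reliability first: the server sees every write immediately.
            self.server.store(block_id, data)
        else:
            # Delayed writing: intermediate results never leave the client.
            self.dirty.add(block_id)

    def flush(self):
        for block_id in sorted(self.dirty):
            self.server.store(block_id, self.cache[block_id])
        self.dirty.clear()

    def close(self):
        # Delayed writing until close: all dirty blocks go out at close time.
        self.flush()
```

Three writes to the same block cost three transfers under write-through and one under delayed writing, but a client crash before `flush()` loses all three delayed writes.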

Design Issues (4)

• Availability
  o What is the level of availability of files in a distributed file system?
  o Use replication to increase availability, i.e. many copies (replicas) of files are maintained at different sites/servers
  o Replication issues:
    } How to keep replicas consistent
    } How to detect inconsistency among replicas
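
One possible way to detect inconsistency among replicas, sketched under the assumption that each site can report the full contents (or a digest) of its copy: compare SHA-256 digests and flag the sites that disagree with the most common digest. The function name and the majority rule are illustrative choices; real systems may instead rely on version vectors or quorum protocols.

```python
import hashlib
from collections import Counter
from typing import Dict, Set


def divergent_replicas(replicas: Dict[str, bytes]) -> Set[str]:
    """Return the sites whose copy differs from the most common copy."""
    if not replicas:
        return set()
    # Comparing fixed-size digests avoids shipping whole files between sites.
    digests = {site: hashlib.sha256(data).hexdigest() for site, data in replicas.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return {site for site, digest in digests.items() if digest != majority_digest}
```

The sketch only detects divergence. Deciding which copy is correct when there is no clear majority, and repairing the others, is the harder consistency problem named in the first bullet.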

Design Issues (5)

• Scalability
  o Deal with a growing system?
  o Issues
    } Node join and leave (fail)
    } Cache consistency
    } Name server
  o Solutions
    } Replication
    } Design cache consistency protocol for scalability
    } Multiple name (meta) servers
    } Take advantage of multi-thread and multi-core

Example – GlusterFS (DFS)

[Figure: Client-1, Client-2, …, Client-N access the Gluster Global Namespace (Gluster Native) over an IP network; the Gluster Virtual Storage Pool is built on donated partitions on each machine.]

Example – GlusterFS (2)

• Three ways to place files
  o Distribute: place entire files on different servers
    } Pros: good scalability, efficient disk space usage
    } Cons: poor reliability
  o Replicate: place identical copies of files on different servers
    } Pros: reliability
    } Cons: wasted disk space, moderate scalability
  o Stripe: place only part of a file on one server
    } Pros: good performance for concurrent and random access
    } Cons: poor scalability and reliability
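
The three placement options above can be compared with a toy sketch. It is not GlusterFS's actual placement algorithm: the CRC32 hash used to pick a server, the round-robin striping, and all function names are assumptions made for illustration. Each function returns a mapping from server name to the list of byte chunks stored there.

```python
import zlib
from typing import Dict, List


def distribute(name: str, data: bytes, servers: List[str]) -> Dict[str, List[bytes]]:
    """Place the entire file on one server chosen by hashing its name."""
    target = servers[zlib.crc32(name.encode()) % len(servers)]
    return {target: [data]}


def replicate(name: str, data: bytes, servers: List[str]) -> Dict[str, List[bytes]]:
    """Place an identical copy of the file on every server."""
    return {server: [data] for server in servers}


def stripe(name: str, data: bytes, servers: List[str], stripe_size: int) -> Dict[str, List[bytes]]:
    """Cut the file into fixed-size units and deal them out round-robin."""
    placement: Dict[str, List[bytes]] = {server: [] for server in servers}
    for offset in range(0, len(data), stripe_size):
        server = servers[(offset // stripe_size) % len(servers)]
        placement[server].append(data[offset:offset + stripe_size])
    return placement


def bytes_stored(placement: Dict[str, List[bytes]]) -> int:
    """Total disk space consumed across all servers."""
    return sum(len(chunk) for chunks in placement.values() for chunk in chunks)
```

For a 10-byte file on three servers, `replicate` stores 30 bytes while the other two store 10, which is the "wasted disk space" cost. With `stripe`, losing any one server leaves every striped file incomplete, which matches the reliability concern listed above.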

Example – PVFS (PFS)

Significant improvement in throughput

What could be the issues?

1. Server coordination affects efficiency
2. Client QoS?

DFS and PFS in the Cloud (1)

• Both approaches provide cheap, reliable and high-performance cloud storage solutions

Use case 1

DFS and PFS in the Cloud (2)

Use case 2

Some Real Results…

• Hosted an 8-VM Hadoop cluster on 8 Dell machines
• Performed micro and real I/O-intensive workloads
• Two storage solutions: PVFS and local ext3

Workload                       PVFS           Local ext3
Gridmix websort, 20 GB data    2391 seconds   4693 seconds

             16k      32k      64k      256k     1M
Sequential   58.89    60.15    60.47    104.80   130.47
Random       12.34    20.84    33.51    50.43    108.71

             16k      32k      64k      256k     1M
Sequential   120.11   120.56   120.39   120.39   120.57
Random       4.01     7.80     14.71    43.20    92.19

PVFS

Local ext3

Network bandwidth bottleneck
