Thomas Hollstegge Distributed File Systems
Distributed File Systems

Feb 25, 2016

Page 1: Distributed File Systems

Thomas Hollstegge

Distributed File Systems

Page 2: Distributed File Systems

2

Distributed file systems

Agenda

Motivation
Distributed file system basics
Case studies
Summary and outlook

Page 4: Distributed File Systems

Motivation

ICT allows for distributed work
  - Users work separated in time and space
  - They need access to common data collections
  - This access is provided by distributed file systems (DFS)

Distributed work leads to new business models
  - 24/7 customer service
  - Analysis of worldwide financial information (stock prices etc.)
  - Hence the economic relevance of DFSs

Different DFSs have been developed in the past, so a structured discussion is necessary

Page 5: Distributed File Systems

Agenda

Motivation
Distributed file system basics
Case studies
Summary and outlook

Page 6: Distributed File Systems

Basics – Storage fundamentals

"Storage": fundamental abstraction in computing
  - Data encapsulated in objects
  - Explicit creation and deletion
  - Unaffected by system failures

"File system": refinement of this abstraction

Three different usage dimensions
  - Single user vs. multiple users
  - Single-thread vs. multi-thread OS
  - Single site vs. multiple sites

Page 7: Distributed File Systems

Basics – Requirements for DFS (1/2)

Transparency
  - User must be unaware of the internal separation of components
  - Access, performance, location, scaling transparency

Availability
  - System should be fault tolerant

Concurrent updates
  - Simultaneous access to a single resource

Replication
  - File may be present at different locations
  - Shares load between servers, enhances fault tolerance

Page 8: Distributed File Systems

Basics – Requirements for DFS (2/2)

Hardware and software heterogeneity
  - Support for various platforms

Consistency
  - Data integrity has to be maintained

Security
  - Access control, user authentication, confidentiality

Efficiency
  - Performance should be comparable to local file systems

Page 9: Distributed File Systems

Basics – Abstract file service model (1/3)

Source: [CDK01], p. 318

Page 10: Distributed File Systems

Basics – Abstract file service model (2/3)

Service operations

Directory service
  - Lookup(Dir, Name) -> FileId – throws NotFound
  - AddName(Dir, Name, File) – throws NameDuplicate
  - UnName(Dir, Name) – throws NotFound
  - GetNames(Dir, Pattern) -> NameSeq

Flat file service
  - Read(FileId, i, n) -> Data – throws BadPosition
  - Write(FileId, i, Data) – throws BadPosition
  - Create() -> FileId
  - Delete(FileId)
  - GetAttributes(FileId) -> Attr
  - SetAttributes(FileId, Attr)

Source: [CDK01], p. 319-322
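
The two services listed above can be sketched as a pair of small in-memory Python classes. This is only an illustration under several assumptions: the class and method names are made up for this sketch, directories are identified by plain integer IDs rather than being files themselves, and GetNames uses shell-style patterns via `fnmatch`, which the slide does not specify.

```python
import fnmatch


class NotFound(Exception):
    """Raised when a name is not present in a directory."""


class NameDuplicate(Exception):
    """Raised when a name already exists in a directory."""


class BadPosition(Exception):
    """Raised when a read or write position lies outside the file."""


class FlatFileService:
    """Stores contents and attributes, addressed only by file ID."""

    def __init__(self):
        self._files = {}   # file_id -> bytearray
        self._attrs = {}   # file_id -> dict
        self._next_id = 1

    def create(self):
        file_id = self._next_id
        self._next_id += 1
        self._files[file_id] = bytearray()
        self._attrs[file_id] = {}
        return file_id

    def delete(self, file_id):
        del self._files[file_id]
        del self._attrs[file_id]

    def read(self, file_id, i, n):
        data = self._files[file_id]
        if i < 0 or i > len(data):
            raise BadPosition(i)
        return bytes(data[i:i + n])

    def write(self, file_id, i, data):
        buf = self._files[file_id]
        if i < 0 or i > len(buf):
            raise BadPosition(i)
        buf[i:i + len(data)] = data  # overwrites and extends as needed

    def get_attributes(self, file_id):
        return dict(self._attrs[file_id])

    def set_attributes(self, file_id, attr):
        self._attrs[file_id] = dict(attr)


class DirectoryService:
    """Maps text names to file IDs; knows nothing about file contents."""

    def __init__(self):
        self._dirs = {}  # dir_id -> {name: file_id}

    def _entries(self, dir_id):
        return self._dirs.setdefault(dir_id, {})

    def lookup(self, dir_id, name):
        try:
            return self._entries(dir_id)[name]
        except KeyError:
            raise NotFound(name) from None

    def add_name(self, dir_id, name, file_id):
        entries = self._entries(dir_id)
        if name in entries:
            raise NameDuplicate(name)
        entries[name] = file_id

    def un_name(self, dir_id, name):
        entries = self._entries(dir_id)
        if name not in entries:
            raise NotFound(name)
        del entries[name]

    def get_names(self, dir_id, pattern):
        # Assumption: shell-style wildcards stand in for "Pattern".
        return sorted(fnmatch.filter(self._entries(dir_id), pattern))


# Usage: create a file, write to it, then register a name for it.
ROOT = 0  # hypothetical ID of the root directory
files = FlatFileService()
dirs = DirectoryService()
fid = files.create()
files.write(fid, 0, b"hello")
dirs.add_name(ROOT, "greeting.txt", fid)
```

Keeping names in one service and contents in another is what allows the directory layer to be replaced or distributed without touching file storage.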

Page 11: Distributed File Systems

Basics – Abstract file service model (3/3)

Access control
  - Server-side user authorisation
  - Access rights checked upon directory lookup or on every request

Hierarchical file structure
  - Realised within the client module
  - Directories may store references to other directories

File groups
  - Set of files that can be moved between servers
  - Similar to a file system
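
How a client module might build a hierarchy on top of a flat directory service can be sketched as repeated Lookup calls, one per path component. The directory table, the IDs, and the `resolve` helper below are hypothetical and exist only for this illustration.

```python
class NotFound(Exception):
    """Raised when a path component cannot be resolved."""


# Hypothetical flat directory table: directory ID -> {name: ID}.
# Directories store references to other directories by ID.
DIRECTORIES = {
    "root": {"home": "d1"},
    "d1": {"alice": "d2"},
    "d2": {"notes.txt": "f42"},
}


def lookup(dir_id, name):
    """Stand-in for the directory service's Lookup(Dir, Name)."""
    try:
        return DIRECTORIES[dir_id][name]
    except KeyError:
        raise NotFound(name) from None


def resolve(path, root="root"):
    """Client-side path resolution: one lookup per path component."""
    current = root
    for component in path.strip("/").split("/"):
        if component:  # skip empty components such as in "/"
            current = lookup(current, component)
    return current


file_id = resolve("/home/alice/notes.txt")
```

The server only ever sees single-level lookups; the notion of a path lives entirely in the client.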

Page 12: Distributed File Systems

Agenda

Motivation
Distributed file system basics
Case studies
  - Network File System (NFS)
  - Andrew File System (AFS)
  - Lustre
Summary and outlook

Page 13: Distributed File Systems

NFS – History

198?: NFSv1
  - Developed at Sun Microsystems, unreleased

1984: NFSv2
  - Developed at Sun Microsystems
  - First released version, widely accepted
  - Supports files < 4 GB, synchronous writes

1992: NFSv3
  - Developed by a group of researchers
  - Overcomes NFSv2 drawbacks (larger files, asynchronous writes)

2002: NFSv4
  - Enhanced security, user authentication
  - Better Windows support

Page 14: Distributed File Systems

NFS – General description (1/2)

Source: [CDK01], p. 324

Page 15: Distributed File Systems

NFS – General description (2/2)

Stateless protocol
  - Server does not maintain client state
  - Client requests are blocking (exception: asynchronous write)

User authentication
  - Default: UNIX user ID (insecure!)
  - Optional: Kerberos, DES

Caching
  - Read cache: yes
  - Write cache: no!

Server file system
  - Not restricted, should support unique file IDs
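
The practical consequence of a stateless protocol can be illustrated with a toy server in which every read request carries the file handle, offset, and length itself. This is not the NFS wire protocol; the class, the handle string, and the in-memory "disk" are assumptions made for the sketch.

```python
class StatelessFileServer:
    """Hypothetical server that keeps no per-client state between requests."""

    def __init__(self, storage):
        self._storage = storage  # file handle -> bytes, stands in for the disk

    def read(self, handle, offset, count):
        # Everything needed to serve the request is in its arguments;
        # there is no "open file" or "current position" on the server.
        return self._storage[handle][offset:offset + count]


disk = {"fh-1": b"distributed"}

server = StatelessFileServer(disk)
first = server.read("fh-1", 0, 4)

# Simulated server crash and restart: a fresh instance over the same disk.
server = StatelessFileServer(disk)

# The client simply continues, because it tracks the offset itself.
second = server.read("fh-1", len(first), 4)
```

Because the client supplies the position on every call, a restarted server can answer the next request without any recovery step.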

Page 16: Distributed File Systems

NFS – Abstract model (1/2)

[Figure: abstract file service model vs. NFS architecture]

Page 17: Distributed File Systems

NFS – Abstract model (2/2)

Operations
  - Similar to UNIX file system calls
  - All abstract operations can be represented

Access control
  - Checked upon every request

Hierarchical file system
  - Realised within the client module

File groups
  - Not supported, only manual movement of files

Page 18: Distributed File Systems

NFS – Requirements

Transparency
Availability
Concurrent updates
Replication
Heterogeneity
Consistency
Security
Efficiency

Page 19: Distributed File Systems

Agenda

Motivation
Distributed file system basics
Case studies
  - Network File System (NFS)
  - Andrew File System (AFS)
  - Lustre
Summary and outlook

Page 20: Distributed File Systems

AFS – History

1982: Initial version
  - Developed at Carnegie Mellon University (CMU), Pittsburgh
  - Part of the Andrew distributed computing environment
  - Provides support for teaching and research

1989: Spin-off
  - Development outsourced to Transarc Inc.

1994: Transarc acquired by IBM
  - All rights owned by IBM

2000: Open source
  - Code was released under an open source license
  - Since then: continuous development

Page 21: Distributed File Systems

AFS – General description (1/3)

Page 22: Distributed File Systems

AFS – Name spaces

Page 23: Distributed File Systems

AFS – General description (2/3)

Cached? No!

Page 24: Distributed File Systems

AFS – General description (3/3)

Caching
  - "Callback promises"
  - Workstations are notified when cached files change

Stateful protocol
  - Server maintains client state
  - Problematic when client fails

User authentication
  - Kerberos

Server file system
  - Not restricted, should support unique file IDs
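
The callback-promise idea can be sketched as a server that remembers which clients hold a cached copy of each file and notifies them when the file changes. The classes and method names below are invented for this illustration and are not the AFS interface; whole-file caching and synchronous notification are simplifying assumptions.

```python
class Server:
    """Hypothetical stateful server that tracks callback promises."""

    def __init__(self):
        self._files = {}      # name -> bytes
        self._callbacks = {}  # name -> set of clients holding a promise

    def fetch(self, client, name):
        # Handing out a copy includes a promise to notify this client.
        self._callbacks.setdefault(name, set()).add(client)
        return self._files[name]

    def store(self, writer, name, data):
        self._files[name] = data
        # Break the promise for every other client caching this file.
        for client in self._callbacks.pop(name, set()):
            if client is not writer:
                client.break_callback(name)
        # The writer's copy is current, so it keeps a valid promise.
        self._callbacks.setdefault(name, set()).add(writer)


class Client:
    """Hypothetical workstation cache manager."""

    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> bytes; presence means the promise is valid
        self.fetches = 0  # counts round trips to the server

    def break_callback(self, name):
        self.cache.pop(name, None)

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(self, name)
            self.fetches += 1
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)


server = Server()
a, b = Client(server), Client(server)
b.write("report.txt", b"v1")
a.read("report.txt")          # first read goes to the server
a.read("report.txt")          # second read is served from the cache
b.write("report.txt", b"v2")  # server breaks a's callback promise
latest = a.read("report.txt") # a must fetch again
```

The `_callbacks` table is exactly the per-client state that makes the protocol stateful: if a client disappears without notice, the server still holds a promise for it.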

Page 25: Distributed File Systems

AFS – Abstract model (1/2)

[Figure: abstract file service model vs. AFS architecture]

Page 26: Distributed File Systems

AFS – Abstract model (2/2)

Operations
  - Differ from abstract model
  - Some operations combined, callback promises added

Access control
  - Rights checked upon every request
  - Extended access lists per directory

Hierarchical file system
  - Realised within the client module

File groups
  - File identifier contains link to file group
  - Location database maps file groups to servers
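
The file-group mechanism can be sketched as a file identifier that embeds its group, plus a location database consulted to find the current server. The field names, server names, and helper functions here are illustrative assumptions, not the actual AFS data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileId:
    """Hypothetical file identifier that embeds its file group."""
    volume: int  # the file group the file belongs to
    vnode: int   # the file within that group


# Hypothetical location database: file group -> server currently hosting it.
location_db = {7: "server-a"}


def server_for(fid):
    """Find the server responsible for a file via its file group."""
    return location_db[fid.volume]


def move_volume(volume, new_server):
    """Moving a whole file group only updates the location database."""
    location_db[volume] = new_server


fid = FileId(volume=7, vnode=12)
before = server_for(fid)
move_volume(7, "server-b")
after = server_for(fid)
```

The file identifier itself never changes when the group moves, which is what keeps the move invisible to clients holding that identifier.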

Page 27: Distributed File Systems

AFS – Requirements

Transparency
Availability
Concurrent updates
Replication
Heterogeneity
Consistency
Security
Efficiency

Page 28: Distributed File Systems

Agenda

Motivation
Distributed file system basics
Case studies
  - Network File System (NFS)
  - Andrew File System (AFS)
  - Lustre
Summary and outlook

Page 29: Distributed File Systems

Lustre (1/3)

"Lustre": Linux Cluster
  - File system especially suited for clusters
  - Easily handles thousands of clients and servers

Uses object-based storage
  - Objects offer methods for data access, attributes, policies
  - High-level abstraction
  - Lower performance than block-based storage

Three system roles
  - Object Storage Targets (OST)
  - Metadata Servers (MDS)
  - Clients

Page 30: Distributed File Systems

Lustre (2/3)

[Figure: interactions between Clients, Metadata Servers (MDS), and Object Storage Targets (OST). Interaction labels: directory metadata; file operations, locking; recovery, file status]

Source: [BS02], p. 51

Page 31: Distributed File Systems

Lustre (3/3)

Lustre partly follows abstract model
  - Separation of directory and flat file service
  - File attributes managed by OSTs

Hierarchical file systems
  - Realised within the client module

High availability
  - Heavy use of redundancy
  - Caching of metadata
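
The division of labour between the three roles can be sketched as follows: the metadata server only resolves names to objects, while object storage targets hold the data and the file attributes. All class names, method names, and the single-object-per-file layout are assumptions made for this sketch and do not reflect the real Lustre interfaces.

```python
class ObjectStorageTarget:
    """Hypothetical OST: stores objects together with their attributes."""

    def __init__(self):
        self._objects = {}  # object_id -> {"data": bytes, "attrs": dict}

    def put(self, object_id, data):
        self._objects[object_id] = {"data": data, "attrs": {"size": len(data)}}

    def read(self, object_id):
        return self._objects[object_id]["data"]

    def get_attributes(self, object_id):
        return dict(self._objects[object_id]["attrs"])


class MetadataServer:
    """Hypothetical MDS: maps paths to (OST index, object ID), nothing more."""

    def __init__(self):
        self._names = {}  # path -> (ost_index, object_id)

    def register(self, path, ost_index, object_id):
        self._names[path] = (ost_index, object_id)

    def lookup(self, path):
        return self._names[path]


class Client:
    """Asks the MDS where an object lives, then talks to the OST directly."""

    def __init__(self, mds, osts):
        self.mds = mds
        self.osts = osts

    def read(self, path):
        ost_index, object_id = self.mds.lookup(path)
        return self.osts[ost_index].read(object_id)

    def size(self, path):
        ost_index, object_id = self.mds.lookup(path)
        return self.osts[ost_index].get_attributes(object_id)["size"]


osts = [ObjectStorageTarget(), ObjectStorageTarget()]
mds = MetadataServer()
osts[1].put("obj-9", b"payload")
mds.register("/data/a", 1, "obj-9")

client = Client(mds, osts)
content = client.read("/data/a")
```

This mirrors the directory service and flat file service split of the abstract model, with the difference noted on the slide that attributes live with the objects on the OSTs.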

Page 32: Distributed File Systems

Agenda

Motivation
Distributed file system basics
Case studies
Summary and outlook

Page 33: Distributed File Systems

Summary and outlook

Abstract file service model
  - Developed to meet many requirements for DFSs

Different implementations
  - NFS: stateless, concurrency control
  - AFS: stateful, heavy use of caching, better performance

Other approach: Lustre
  - Modularised approach, especially suited for clusters

Future developments
  - Large-scale environments
  - Cloud computing
  - Issues: data security, privacy

Page 34: Distributed File Systems

ANY QUESTIONS?
Thank you for your attention!

Page 35: Distributed File Systems

Literature

[BS02] Peter J. Braam, Philip Schwan: Lustre: The intergalactic file system, Proceedings of the 2002 Ottawa Linux Symposium, pp. 50–54, 2002.
[CDK01] George Coulouris, Jean Dollimore, Tim Kindberg: Distributed Systems, Concepts and Design, 3rd ed., Addison-Wesley, 2001.
[Kir06] Olaf Kirch: Why NFS Sucks, Proceedings of the Linux Symposium, 2nd. ed., pp. 51–63, 2006.
[MSC+86] James H. Morris, Mahadev Satyanarayanan, Michael H. Conner, John H. Howard, David S. H. Rosenthal, F. Donelson Smith: Andrew: A distributed personal computing environment, Communications of the ACM, 29(3), pp. 184–201, Association for Computing Machinery, 1986.
[PJS+94] Brian Pawlowski, Chet Juszczak, Peter Staubach, Carl Smith, Diane Lebel, David Hitz: NFS Version 3: Design and Implementation, Proceedings of the Summer 1994 USENIX Technical Conference, pp. 137–151, 1994.
[Sat89] Mahadev Satyanarayanan: Distributed file systems, Distributed systems, S. Mullender (ed.), pp. 149–188, ACM Press, 1989.
[Sch03] Philip Schwan: Lustre: Building a file system for 1000-node clusters, Proceedings of the 2003 Ottawa Linux Symposium, pp. 380–386, 2003.
[Tan03] Andrew S. Tanenbaum: Moderne Betriebssysteme, 2nd ed., Prentice Hall, 2003.
[Tv07] Andrew S. Tanenbaum, Maarten van Steen: Distributed Systems: Principles and Paradigms, 2nd ed., Prentice Hall, 2007.