Grupo de Arquitectura de Computadores, Comunicaciones y Sistemas (Computer Architecture, Communications and Systems Group)
http://www.arcos.inf.uc3m.es
UNIVERSIDAD CARLOS III DE MADRID

A parallel File System for Networks of Windows Workstations

José María Pérez, Jesús Carretero, José Daniel García, Félix García, Alejandro Calderón

Jan 20, 2016

Grupo de Arquitectura de Computadores, Comunicaciones y Sistemas
http://www.arcos.inf.uc3m.es
UNIVERSIDAD CARLOS III DE MADRID
Expanding Windows Kernel to Integrate Heterogeneous Resources on Data Grids

Outline

Introduction

Goals

Design

Evaluation

Conclusion

High performance and data storage

Growing need for high-performance data storage.
  Growing capacity of disks.
  Growing data storage from applications.

I/O becomes a bottleneck.

Typical solution: parallel I/O.
  Join several storage resources.
  Large storage.
  Increased scalability and performance.
  Load balancing.

Parallel File Systems

Several nodes with storage devices.
  Accesses performed in parallel.
  Data striped among nodes.

Striping allows:
  Parallel access to different files.
  Parallel access to the same file.

Striping was originally used in RAID.
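
The striping idea above can be illustrated with a small sketch. It assumes a plain round-robin layout with a fixed stripe unit, which is the classic RAID-0 style placement; the slides do not state the exact layout WinPFS uses, and the function name and parameters are hypothetical.

```python
def locate(offset: int, stripe_size: int, num_servers: int) -> tuple[int, int]:
    """Map a byte offset of the logical file to (server index, offset on that server).

    Assumption: stripe units are placed round-robin over the servers.
    """
    if offset < 0 or stripe_size <= 0 or num_servers <= 0:
        raise ValueError("offset must be >= 0, stripe_size and num_servers > 0")
    unit = offset // stripe_size            # stripe unit that holds the byte
    server = unit % num_servers             # round-robin placement of units
    units_before = unit // num_servers      # units already stored on that server
    return server, units_before * stripe_size + offset % stripe_size
```

With a 64 KiB stripe unit and 4 servers, consecutive units land on servers 0, 1, 2, 3, 0, ... so a large sequential access touches all servers and different clients tend to hit different nodes.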

Current state

Current solutions are neither general nor flexible.

They do not use standard servers.
  Difficult to integrate into existing networks of workstations.
  New, complex servers need to be installed.
  Available only for specific platforms.

They are implemented outside the operating system.
  A new I/O API is needed.
  Applications need to be modified or recompiled.

WinPFS: Goal

Build a parallel file system for networks of Windows workstations using standard data sharing services (such as Windows shared folders).

A first prototype has been built using CIFS/SMB servers.

Detailed goals

Integrate existing storage resources using shared folders rather than installing new servers.
  Accomplished by using Windows redirectors.

Simple setup.
  Implemented as a new Windows file system in the kernel (a new stackable driver in the I/O hierarchy).

Easy to use.
  No special APIs.
  Applications work without recompilation.

Enhance performance, scalability and capacity.
  Request splitting, balanced data allocation, load balancing, ...

[Figure: WinPFS architecture diagram. Labels: clients (Win32, MPI-IO) on top of WinPFS; redirectors (NFS, CIFS, HTTP-WebDAV, local, ...); an intranet with Site 1, Site 2 and Site 3; distributed partitions 1 and 2.]

WinPFS design

Design based on a new Windows kernel component: a file system redirector.
  Implements the basis of the file system.
  Isolates users from the parallel file system.
  Uses protocols to connect to different network file systems.

A redirector forwards requests to remote servers using a specific protocol (e.g., CIFS/SMB).

WinPFS is registered as a virtual remote file system; it implements the parallel I/O mechanisms and uses other remote data services (redirectors).

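[Figure: position of WinPFS in the Windows I/O stack. Layers shown: Win32, POSIX and DOS subsystems; native NT API; I/O Manager; CIFS, WebDAV, NetWare, NFS and local file systems; WinPFS.]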

Remote data access

User point of view

Access to remote data through shared folders.

WinPFS creates a new shared folder: \\PFS.

Users can access parallel files through this shared folder.

Kernel point of view

Access through CIFS/SMB, …

Capture requests through the use of the Universal Naming Convention (UNC).

Special kind of file system: a redirection of redirector drivers.

File striping and requests

Layered I/O

Windows NT family has a layered I/O model.

Several layers to process a request in the I/O subsystem.

Each layer is a driver which can receive a request and pass it to lower layers in the I/O stack.

The model allows the insertion of new layers, using new drivers.

File systems are implemented as drivers in the I/O model, so new file systems can be added at the kernel level.
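
The layered model can be pictured with a toy sketch in which each layer records that it saw a request and hands it to the layer below. This is only an analogy written in user-space Python: the class, the method names and the layer names are hypothetical and are not the Windows driver interfaces.

```python
class Layer:
    """One layer of a toy I/O stack; it forwards requests to the layer below."""

    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower

    def handle(self, request: dict) -> dict:
        request.setdefault("path", []).append(self.name)   # trace the traversal
        if self.lower is not None:
            return self.lower.handle(request)              # pass down the stack
        request["status"] = "completed"                    # lowest layer completes it
        return request


def build_stack(names):
    """Build a stack from a top-to-bottom list of layer names."""
    lower = None
    for name in reversed(names):
        lower = Layer(name, lower)
    return lower
```

Inserting a new layer amounts to adding one more name to the list, for example `build_stack(["I/O Manager", "WinPFS", "CIFS redirector"])`, which mirrors how a new driver can be placed in the I/O hierarchy without changing the layers around it.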

I/O Request Management

IRP (I/O Request Packet)
  Describes an I/O request.
  Sent to kernel-mode drivers by the I/O Manager (on behalf of the client).

I/O Manager
  Receives system calls.
  Creates an IRP describing the request.
  Delivers the IRP to the appropriate driver.

MUP (Multiple UNC Provider)
  Identifies the kernel-mode driver in charge of a network name.
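
The role of the MUP can be sketched as a lookup from the server part of a UNC name to the driver that claims it. The sketch below is a simplification under stated assumptions: it uses a plain dictionary instead of querying each provider, and the class, function and driver names are hypothetical.

```python
def unc_server(path: str) -> str:
    """Return the server component of a UNC path such as \\\\PFS\\dir\\file."""
    if not path.startswith("\\\\"):
        raise ValueError("not a UNC path: " + path)
    server = path[2:].split("\\")[0]
    if not server:
        raise ValueError("missing server name: " + path)
    return server.upper()          # network names are compared case-insensitively here


class PrefixResolver:
    """Toy stand-in for the MUP: maps a network name to the driver in charge."""

    def __init__(self):
        self._drivers = {}

    def register(self, server_name: str, driver: str) -> None:
        self._drivers[server_name.upper()] = driver

    def resolve(self, path: str):
        return self._drivers.get(unc_server(path))   # None if nobody claims the name
```

In this picture, registering the name `PFS` for the WinPFS driver is what lets paths under `\\PFS` reach it, while any other server name falls through to another driver or to none.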

Request management

Create: IRPs are replicated and sent to each server.

Read/Write: the request is split into smaller subrequests.

Create Directory: IRPs are replicated and sent to each server.
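
A minimal sketch of the two behaviours listed above: replicating a create to every server, and splitting a read or write into per-server subrequests. It assumes round-robin striping with a fixed stripe unit; the type and function names are hypothetical and the real driver works on IRPs, not on Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubRequest:
    server: int          # index of the server that holds this piece
    server_offset: int   # offset inside that server's part of the file
    length: int          # number of bytes in this piece


def split_request(offset: int, length: int, stripe_size: int, num_servers: int) -> list[SubRequest]:
    """Split a logical read/write into pieces that never cross a stripe unit."""
    pieces = []
    end = offset + length
    while offset < end:
        unit = offset // stripe_size
        within = offset % stripe_size
        chunk = min(stripe_size - within, end - offset)
        pieces.append(SubRequest(unit % num_servers,
                                 (unit // num_servers) * stripe_size + within,
                                 chunk))
        offset += chunk
    return pieces


def replicate_create(name: str, servers: list[str]) -> list[tuple[str, str]]:
    """A create (file or directory) is issued once per server."""
    return [(server, name) for server in servers]
```

The pieces can then be issued in parallel, one per server, and the original request completes when all of them have completed.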

Using WinPFS

Administration / installation:
  Install a new driver in client nodes.
  Share folders in server nodes.
  Indicate the shared folders using the registry in client nodes.

User:
  Prefix paths with \\PFS.

We plan to map remote names to common drive letters.

WinPFS may be used with any API that is on top of Windows services.
  Win32, POSIX, DOS, Cygwin, …

Other Features

Caching.
  Caching mechanisms performed by redirectors.
  Limited to the Windows caching model.
  More advanced caching left for future work.

Security and authentication.
  Current model works on a Windows domain, forests and trusted domains.
  Standard Windows mechanisms are used to manage policies and security in enterprises, labs and departments.
  Uses the standard Windows security model.
  Changes are needed for workgroups or untrusted domains.

Data consistency between clients.
  Currently solved only when all servers use CIFS, relying on the default mechanism of the CIFS redirector: oplocks (opportunistic locks).

Evaluation

Creating a file of 100 MB.

Write sequentially.

Read sequentially.

Static buffer size.

Client cache disabled.

Two clusters with four nodes each.

Node:
  Dual-processor Pentium III, 1 GHz.
  1 GB main memory.
  200 GB disk.

Gigabit Ethernet network.

1 node running Windows 2003 Server, 7 nodes running Windows XP Professional.
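
A benchmark of this kind can be sketched as follows. This is not the authors' test program: it is a minimal user-space approximation with hypothetical function names, and it does not disable the client cache, which Python cannot do portably.

```python
import os
import time


def sequential_write(path: str, total_bytes: int, buffer_size: int) -> float:
    """Write total_bytes sequentially with a fixed buffer; return elapsed seconds."""
    buffer = bytes(buffer_size)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(buffer_size, total_bytes - written)
            f.write(buffer[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())       # make sure the data left the process buffers
    return time.perf_counter() - start


def sequential_read(path: str, buffer_size: int) -> tuple[int, float]:
    """Read the whole file sequentially; return (bytes read, elapsed seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(buffer_size):
            total += len(chunk)
    return total, time.perf_counter() - start


def mbit_per_s(num_bytes: int, seconds: float) -> float:
    """Throughput in Mbit/s (1 Mbit = 10**6 bits)."""
    return num_bytes * 8 / 1e6 / seconds
```

To approximate the setup above, the same pair of calls would be repeated for a 100 MB file and for each buffer size from 1 KB to 1 MB, once against a plain CIFS share and once against a path under \\PFS.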

Evaluation infrastructure

[Figure: eight PCs connected through two Gigabit Ethernet switches.]

Configurations

CIFS: One server.

PFS88: 8 servers in parallel.

PFS44: 4 servers in parallel.

PFS84: 4 servers in parallel and selected randomly from a set of 8.

In all cases, 8 clients are running (1 client per node).
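
The four configurations can be written down as a small table plus a server-selection helper. This is a sketch under assumptions: the pool sizes for PFS88 and PFS44 are read off the configuration names, and the slides do not say whether the random choice in PFS84 is made per file or per client; the names below are hypothetical.

```python
import random

# name -> (servers available, servers used in parallel); pool sizes are assumptions
CONFIGS = {
    "CIFS": (1, 1),
    "PFS88": (8, 8),
    "PFS44": (4, 4),
    "PFS84": (8, 4),
}


def pick_servers(config: str, rng: random.Random) -> list[int]:
    """Choose which server indices are used under the given configuration."""
    available, used = CONFIGS[config]
    return sorted(rng.sample(range(available), used))
```

Passing an explicit `random.Random` instance keeps a run reproducible, which matters when PFS84 results are compared across buffer sizes.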

Write results

[Figure: write throughput (Mbit/s, axis from 0 to 300) versus buffer size (1K to 1M bytes) for CIFS, PFS88, PFS44 and PFS84.]

Read results

[Figure: read throughput (Mbit/s, axis from 0 to 1400) versus buffer size (1K to 1M bytes) for CIFS, PFS88, PFS44 and PFS84.]

Results

All WinPFS solutions provide better results than CIFS.

PFS88 provides the best performance because it has the highest degree of parallelism.

Performance reaches 250 Mbit/s for write operations.
  Writes are limited by the disks.

Performance reaches 1200 Mbit/s for read operations.
  Reads are limited by the network.

Write Speedup PFS88/CIFS

Read Speedup PFS88/CIFS

Speedup results

Speedup is higher with more concurrent clients.

Write speedup from 500% to 700% may be achieved.

Read speedup is less than 100% because data are obtained from server caches without disk accesses.

WinPFS performance is limited by the striping size buffer.

Conclusions

WinPFS is a parallel file system implemented as a kernel-mode driver.

Integration into the kernel provides higher performance. Uses existing mechanisms at kernel level.

No change or recompilation needed in client applications.

We can run an application that uses parallel I/O, taking advantage of the shared folders in our organizations, without affecting users.
  For example, launch an I/O-intensive application in a classroom that accesses the shared folders.

Future Work

Use of Active Directory Service to create a metadata repository and give a consistent image of parallel file systems.
  Objective: no need for manual editing of the client registry to provide information about shared folders.

Evaluation with other operating systems.
  Linux, FreeBSD and Solaris sharing folders with Samba.

Evaluation with other protocols (redirectors).
  NFS (redirector provided by Services for UNIX 3.5) and WebDAV.
  This way, WinPFS can connect to more servers, including NAS.

Parallel usage of heterogeneous resources and protocols in networks of workstations.

Dynamic addition and removal of storage nodes.

Data allocation and load balancing for heterogeneous distributed systems.