Parallel I/O in Bulk-Synchronous Parallel ML
Frédéric Gava
PAPP 2004
Outline
Introduction
• The BSP model
• The BSML language
External Memory in BSML
• Cost model
• Problems and solutions
Conclusion and Future Work
Introduction
Bulk Synchronous Parallelism +
Functional Programming = BSML
Advantages of the BSP model:
1. Portability
2. Scalability, deadlock free
3. Simple cost model => performance prediction
Advantages of functional programming:
1. High level features (higher order functions, pattern-matching, concrete types, etc.)
2. Safety of the environment
3. Program proofs
The Caraml Project
Funded by the ACI Grid program (French National Grid program)
Organized in 3 phases:
• First phase: safety
• Second phase: multiprogramming
• Third phase: extensions for Grid computing
Tools and applications
The BSP model
$T(s) = \max_{0 \le i < p} w_i + h \cdot g + L$
[Figure: one BSP superstep on processors 0, 1, 2, 3, …, p-1]
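The superstep cost formula can be evaluated directly. The following is a minimal sketch, assuming the usual BSP reading of the symbols (w_i is the local work of processor i, h the maximum number of words sent or received by a processor, g the communication gap, L the synchronization latency). The function name `superstep_cost` and the sample values are illustrative, not taken from the talk.

```ocaml
(* Cost of one BSP superstep: T(s) = max_i w_i + h * g + L.
   All names and numbers here are illustrative assumptions. *)
let superstep_cost ~(w : float array) ~(h : float) ~(g : float) ~(l : float) : float =
  (* Largest local computation time; work is assumed non-negative *)
  let w_max = Array.fold_left max 0.0 w in
  w_max +. h *. g +. l

let () =
  (* Four processors with different amounts of local work *)
  let w = [| 100.0; 250.0; 180.0; 90.0 |] in
  Printf.printf "T(s) = %.1f\n" (superstep_cost ~w ~h:50.0 ~g:2.0 ~l:30.0)
```

The slowest processor dominates the computation term, which is why load balance matters in this model.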
The BSML language
Library for the « Objective Caml » language (called BSMLlib)
Operations on a parallel data structure called a parallel vector (type par)
Operations to access the BSP parameters
4 operations on parallel vectors
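To make the notion of a parallel vector concrete, here is a sequential model in plain OCaml, where a vector is simply an array holding one value per process. This is a sketch for intuition only, not BSMLlib itself: the fixed process count `p` is an assumption, `mkpar` mirrors the name used later in the talk, and `apply` is an illustrative pointwise-application helper, since the slide does not list the four operations.

```ocaml
(* Sequential model of a parallel vector: one value per process.
   This is NOT BSMLlib; it only mimics the shape of the data structure. *)
let p = 4                                   (* assumed number of processes *)
type 'a par = 'a array

(* mkpar f builds the vector < f 0, f 1, ..., f (p-1) > *)
let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* Illustrative helper: apply a vector of functions pointwise
   to a vector of values, with no communication between processes *)
let apply (fs : ('a -> 'b) par) (vs : 'a par) : 'b par =
  Array.init p (fun i -> fs.(i) vs.(i))

(* Each process holds its own identifier, then squares it locally *)
let pids = mkpar (fun pid -> pid)
let squares = apply (mkpar (fun _ x -> x * x)) pids
```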
Global Conditional
if vec at n then e1 else e2
with vec = < b0, b1, …, true (at process n), …, bp-1 >
= e1
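The global conditional can be modeled sequentially as follows. This is a sketch under stated assumptions: a parallel vector is represented as an array, `p` is a fixed illustrative process count, and `if_at` is a hypothetical function standing in for the `if vec at n then … else …` construct. The branches are passed as thunks so that only the selected one is evaluated.

```ocaml
(* Sequential model of the global conditional.
   if_at and p are illustrative names, not BSMLlib definitions. *)
let p = 4
type 'a par = 'a array
let mkpar f = Array.init p f

(* The boolean held by process n decides, for every process,
   which branch is evaluated *)
let if_at (vec : bool par) (n : int) (e1 : unit -> 'a) (e2 : unit -> 'a) : 'a =
  if n < 0 || n >= p then invalid_arg "if_at: process index out of range";
  if vec.(n) then e1 () else e2 ()

(* Only process 2 holds true *)
let vec = mkpar (fun pid -> pid = 2)
let result = if_at vec 2 (fun () -> "e1") (fun () -> "e2")
```

In a real BSP execution the value at process n has to be broadcast first, so this construct implies a communication and a synchronization.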
External Memory
Model
We have:
• M = Size of the main memory
• D = Number of disks
• B = Size of one block on a disk
• G = Time to read/write D blocks in parallel, one per disk (D*B data)
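These parameters suggest a simple estimate for the I/O time of one process. The sketch below assumes that data is striped perfectly across the D disks, so that each parallel I/O operation moves D*B items at cost G. The record type, the field names and the sample values are illustrative, and the talk's actual cost model may include further terms.

```ocaml
(* External-memory parameters of one process (illustrative record) *)
type em_params = {
  m : int;        (* size of the main memory, unused in this estimate *)
  d : int;        (* number of disks *)
  b : int;        (* size of one block *)
  g_io : float;   (* time for one parallel I/O of D blocks *)
}

(* Number of parallel I/O operations needed to move n items:
   ceil (n / (D * B)), assuming perfect striping over the disks *)
let io_ops params n =
  let per_op = params.d * params.b in
  (n + per_op - 1) / per_op

(* Estimated time to read or write n items *)
let io_time params n = float_of_int (io_ops params n) *. params.g_io

let example = { m = 1 lsl 20; d = 4; b = 1024; g_io = 5.0 }
let () = Printf.printf "%d operations, time %.1f\n"
    (io_ops example 10_000) (io_time example 10_000)
```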
Problem
let bug = mkpar (fun pid -> if pid = 0 then open_write "toto.dat" else NOTHING) in
open_read "toto.dat"
Only process 0 creates the file, yet every process then tries to open it.
« local side effects » => modification of the global environment
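The problem can be reproduced in a small sequential simulation. The sketch below assumes a distributed setting in which each process has its own file system, modeled here as one hash table per process. `write_on` and `can_read_on` are hypothetical stand-ins for the slide's `open_write` and `open_read`, and `p` is an illustrative process count.

```ocaml
let p = 4
let mkpar f = Array.init p f

(* One simulated file system per process: file name -> contents *)
let local_fs : (string, string) Hashtbl.t array =
  Array.init p (fun _ -> Hashtbl.create 8)

(* Hypothetical stand-ins for open_write / open_read *)
let write_on pid name data = Hashtbl.replace local_fs.(pid) name data
let can_read_on pid name = Hashtbl.mem local_fs.(pid) name

(* Only process 0 creates the file, as on the slide *)
let _bug = mkpar (fun pid -> if pid = 0 then write_on pid "toto.dat" "hello")

(* The "global" read is executed by every process *)
let outcomes = mkpar (fun pid -> can_read_on pid "toto.dat")

let () =
  Array.iteri
    (fun pid ok -> Printf.printf "process %d: %s\n" pid (if ok then "ok" else "file not found"))
    outcomes
```

Under this assumption the read succeeds only on process 0, so the processes no longer agree on the result of a global expression. With a shared file system all reads would succeed, which means the program's behavior depends on the architecture.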
Solutions
Two file systems
• local files => one file system on each process
• global files:
o a shared file system
o or local files replicated in a different directory
New primitives for the different kinds of files
Confluence of the semantics
Compositional cost model
Example: scan_list
scan_list (+) < [0;1], [2;3], [4] > = < [0; 0+1], [0+1+2; 0+1+2+3], [0+1+2+3+4] >
Read/write values in blocks using temporary files
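The expected result of scan_list can be checked with a sequential reference version. This is a sketch of the specification only: it models the parallel vector as an array of lists and ignores both the communication between processes and the block-wise reads and writes to temporary files that the external-memory version performs.

```ocaml
(* Sequential reference for scan_list: a prefix computation that runs
   through the lists of process 0, then process 1, and so on.
   A parallel vector of lists is modeled as an array of lists. *)
let scan_list (op : 'a -> 'a -> 'a) (v : 'a list array) : 'a list array =
  (* Running total carried from one element, and one process, to the next *)
  let acc = ref None in
  let scan_local l =
    List.rev
      (List.fold_left
         (fun out x ->
            let y = match !acc with None -> x | Some a -> op a x in
            acc := Some y;
            y :: out)
         [] l)
  in
  let res = Array.make (Array.length v) [] in
  (* Explicit loop so that processes are visited in increasing order *)
  for i = 0 to Array.length v - 1 do
    res.(i) <- scan_local v.(i)
  done;
  res

let example = scan_list (+) [| [0; 1]; [2; 3]; [4] |]
```

On the slide's input this yields the partial sums 0, 1, 3, 6 and 10, distributed over the processes in the same shape as the input.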
Benchmark
Conclusion
BSML = BSP + ML
External Memory in BSML
New cost model
New Primitives
Confluence
Compositional cost model
Future Work
Add to BSML:
Parallel composition
Exceptions
Pattern-matching of parallel values
Polymorphic type system for BSML with I/O
Implementation of « big » applications