U.C. Berkeley and LBNL
• Global address space languages support
• Global pointers and distributed arrays
• User controls layout of data across nodes
• Direct read and write to remote memory
• Single Program Multiple Data (SPMD) control
• Similar to using threads, but with remote accesses
• Global synchronization, barriers
• Languages: UPC, Co-Array Fortran, Titanium
• GASNet: a common communication system tailored for global address space languages
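The ideas in the list above (global pointers, distributed arrays, user-controlled layout) can be illustrated with a toy model. The following is a minimal sketch in Python, not code from UPC, Titanium, Co-Array Fortran, or GASNet. The names `GlobalPtr`, `DistributedArray`, and `locate`, and the choice of a block-cyclic layout, are assumptions made for illustration; the "nodes" are simulated as lists within one process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalPtr:
    """Hypothetical global pointer: owning node plus offset in that node's memory."""
    node: int
    offset: int


class DistributedArray:
    """Toy distributed array with a user-chosen block-cyclic layout (assumed)."""

    def __init__(self, length: int, nodes: int, block: int = 1):
        if length < 0 or nodes <= 0 or block <= 0:
            raise ValueError("length must be >= 0; nodes and block must be > 0")
        self.length = length
        self.nodes = nodes
        self.block = block
        # One separate list per simulated node stands in for that node's memory.
        self.memory = [[0] * self._local_size(n) for n in range(nodes)]

    def locate(self, i: int) -> GlobalPtr:
        """Map a global index to the (node, offset) that stores it."""
        if not 0 <= i < self.length:
            raise IndexError(i)
        blk = i // self.block            # which block the element falls in
        node = blk % self.nodes          # blocks are dealt out round-robin
        local_block = blk // self.nodes  # how many earlier blocks this node holds
        return GlobalPtr(node, local_block * self.block + i % self.block)

    def _local_size(self, node: int) -> int:
        # Count the elements that the layout assigns to this node.
        return sum(1 for i in range(self.length) if self.locate(i).node == node)

    def read(self, i: int) -> int:
        p = self.locate(i)
        return self.memory[p.node][p.offset]  # models a direct remote read

    def write(self, i: int, value: int) -> None:
        p = self.locate(i)
        self.memory[p.node][p.offset] = value  # models a direct remote write
```

Changing `block` changes which node owns each element without changing the code that calls `read` and `write`, which is the sense in which the user controls the layout.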
Distributed Data Structures
Latency Performance
GASNet Goals
• Language-independence: compatibility with several global-address space languages and compilers
• UPC, Titanium, Co-Array Fortran, possibly others
• Hide language- or compiler-specific details, such as shared-pointer representation
• Hardware-independence: support a variety of parallel architectures and OSes
• SMP: Linux/UNIX SMPs, Origin 2000, etc.
• Clusters of uniprocessors or SMPs: IBM SP, Compaq AlphaServer, Linux/UNIX clusters, etc.
• Support many high-performance networks: MPI, Myrinet/GM, Quadrics/elan, IBM/LAPI, InfiniBand
• Ease of implementation on new hardware
• Allow quick prototype implementations
• Implementations can leverage performance features of hardware
• Provide both portability & performance
GASNet Core API
Global Address Space Languages
GASNet Extended API
Bandwidth Performance
Christian Bell, Dan Bonachea, Wei Chen, Jason Duell, Paul Hargrove, Parry Husbands, Costin Iancu, Mike Welcome, Kathy Yelick
[Figure: system layers, top to bottom: Compiler-generated code; Compiler-specific runtime system; GASNet Extended API; GASNet Core API; Network Hardware.]
• Extended API: a wider interface that includes more complicated operations
• We provide a reference implementation of the extended API in terms of the core API
• Implementors can directly implement any subset for performance, leveraging hardware support for higher-level operations
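To make the layering concrete, here is a minimal sketch of how put and get operations could be expressed purely in terms of a small message-delivery layer. It is a single-process Python simulation, not GASNet code: `Node`, `Handle`, `am_request`, `put_nb`, `get_nb`, and `wait` are hypothetical names chosen for illustration, and passing a Python object inside a message is a simulation shortcut.

```python
from collections import deque


class Handle:
    """Hypothetical completion handle for a non-blocking operation."""

    def __init__(self):
        self.done = False
        self.value = None


class Node:
    def __init__(self, rank, network, mem_size):
        self.rank = rank
        self.network = network
        self.memory = [0] * mem_size
        self.inbox = deque()

    # Stand-in for the core layer: queue a message for a named handler.
    def am_request(self, dest, handler, *args):
        self.network[dest].inbox.append((handler, self.rank, args))

    def poll(self):
        while self.inbox:
            handler, src, args = self.inbox.popleft()
            getattr(self, handler)(src, *args)

    # Handlers run on the node that receives the message.
    def h_put(self, src, offset, data, handle):
        self.memory[offset:offset + len(data)] = data
        self.am_request(src, "h_done", handle, None)

    def h_get(self, src, offset, count, handle):
        self.am_request(src, "h_done", handle, self.memory[offset:offset + count])

    def h_done(self, src, handle, value):
        handle.value = value
        handle.done = True

    # Extended-style operations built only from am_request and poll.
    def put_nb(self, dest, offset, data):
        handle = Handle()
        self.am_request(dest, "h_put", offset, list(data), handle)
        return handle

    def get_nb(self, dest, offset, count):
        handle = Handle()
        self.am_request(dest, "h_get", offset, count, handle)
        return handle

    def wait(self, handle):
        # Simulation only: drive progress on every node until completion.
        while not handle.done:
            for node in self.network:
                node.poll()
        return handle.value

    def put(self, dest, offset, data):
        self.wait(self.put_nb(dest, offset, data))  # blocking = non-blocking + wait

    def get(self, dest, offset, count):
        return self.wait(self.get_nb(dest, offset, count))


def make_network(nodes, mem_size=8):
    network = []
    for rank in range(nodes):
        network.append(Node(rank, network, mem_size))
    return network
```

In this sketch a platform with hardware remote-memory support would replace only `put_nb` and `get_nb`, leaving callers unchanged; that mirrors the choice described above between the reference implementation and a direct one.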
• Core API: the most basic required network primitives
• Implemented directly on each platform
• Minimal set of network functions needed to support a working implementation
• General enough to implement everything else
• Based heavily on active messages paradigm
• Provides powerful extensibility mechanism
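The active messages paradigm mentioned above can be sketched as follows: each message names a handler that runs on the receiving node, and a handler may send a reply. This is a toy single-process model in Python under stated assumptions, not the GASNet interface; `Node`, `register`, `request`, `poll`, and the `INCREMENT`/`ACK` handler indices are hypothetical.

```python
from collections import deque

INCREMENT, ACK = 0, 1  # hypothetical handler indices


class Node:
    def __init__(self, rank, network):
        self.rank = rank
        self.network = network
        self.handlers = {}   # handler index -> function
        self.inbox = deque()
        self.state = {}

    def register(self, index, fn):
        self.handlers[index] = fn

    def request(self, dest, index, *args):
        """Send an active message: handler index plus small arguments."""
        self.network[dest].inbox.append((self.rank, index, args))

    def poll(self):
        """Run the named handler for each queued message; return the count."""
        handled = 0
        while self.inbox:
            src, index, args = self.inbox.popleft()
            self.handlers[index](self, src, *args)
            handled += 1
        return handled


def on_increment(node, src, amount):
    # Runs on the receiver: update local state, then reply to the sender.
    node.state["counter"] = node.state.get("counter", 0) + amount
    node.request(src, ACK, node.state["counter"])


def on_ack(node, src, value):
    node.state["last_ack"] = value


HANDLERS = {INCREMENT: on_increment, ACK: on_ack}


def make_network(size, handlers):
    network = []
    for rank in range(size):
        node = Node(rank, network)
        for index, fn in handlers.items():
            node.register(index, fn)
        network.append(node)
    return network
```

Registering a new handler adds a new remote operation without touching `request` or `poll`, which is one way to read the extensibility claim in the list above.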
[Figure: GASNet Put/Get Latency (min over msg sz). Y-axis: microseconds, 0 to 40. Series: put_nb, get_nb. Configurations: quadrics - falcon (mpi-refext, elan-refext, elan-elan); quadrics - lemieux (mpi-refext, elan-refext, elan-elan); myrinet - millennium (mpi-refext, gm-gm); myrinet - alvarez (mpi-refext, gm-gm).]
[Figure: GASNet Put/Get Bulk Bandwidth (max over msg sz). Y-axis: MB/sec, 0 to 300. Series: put_nb_bulk, get_nb_bulk. Configurations: quadrics - falcon (mpi-refext, elan-refext, elan-elan); quadrics - lemieux (mpi-refext, elan-refext, elan-elan); myrinet - millennium (mpi-refext, gm-gm); myrinet - alvarez (mpi-refext, gm-gm).]
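The two summary charts report latency as a minimum over message sizes and bandwidth as a maximum over message sizes. A minimal sketch of that arithmetic in Python follows. The function names are hypothetical, the convention that 1 MB equals 10^6 bytes is an assumption (the poster does not state which convention it uses), and any timings passed in would come from the reader's own measurements.

```python
def latency_us(seconds):
    """Convert a per-operation time in seconds to microseconds."""
    return seconds * 1e6


def bandwidth_mb_per_s(size_bytes, seconds):
    """Bytes moved per second, expressed in MB/sec (assumes 1 MB = 10**6 bytes)."""
    return size_bytes / seconds / 1e6


def summarize(timings):
    """Reduce per-size timings to the two summary numbers.

    timings maps message size in bytes to measured seconds per operation.
    Returns (minimum latency in microseconds, maximum bandwidth in MB/sec).
    """
    if not timings:
        raise ValueError("timings must not be empty")
    min_latency = min(latency_us(t) for t in timings.values())
    max_bandwidth = max(bandwidth_mb_per_s(s, t) for s, t in timings.items())
    return min_latency, max_bandwidth
```

Small messages typically determine the latency figure and large messages the bandwidth figure, which is why the two charts use opposite reductions over the same sweep of sizes.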
[Figure: GASNet Myrinet/GM Bandwidth. X-axis: size (bytes), 0 to 70000. Y-axis: MB/sec, 0 to 200. Series: get_bulk (blocking), get_bulk (non-blocking), put_bulk (blocking), put_bulk (non-blocking).]
[Figure: GASNet Myrinet/GM Latency. X-axis: size (bytes), 1 to 10000. Y-axis: microseconds, 0 to 60. Series: get (blocking), get (non-blocking), put (blocking), put (non-blocking).]
[Figure: GASNet Quadrics/elan Bandwidth. X-axis: size (bytes), 0 to 70000. Y-axis: MB/sec, 0 to 300. Series: get_bulk (blocking), get_bulk (non-blocking), put_bulk (blocking), put_bulk (non-blocking).]
[Figure: GASNet Quadrics/elan Latency. X-axis: size (bytes), 1 to 10000. Y-axis: microseconds, 0 to 20. Series: get (blocking), get (non-blocking), put (blocking), put (non-blocking).]
http://upc.nersc.gov [email protected]@lbl.gov