www.openfabrics.org Open MPI Project State of the Union - April 2007 Jeff Squyres Cisco, Inc.
Overview
Project purpose
Sub projects
Current status
Continuing / future directions
Why does Open MPI exist?
Maximize all MPI expertise
• Research / academia
• Industry
• …elsewhere
Capitalize on [literally] years of MPI research and implementation experience
The sum is greater than the parts
[Figure: research / academia and industry combined]
Why separate from M[VA]PICH?
Open, inclusive community
Not limited to just OpenFabrics
• Common: TCP, shared memory, OFED* (MVAPICH only)
• OMPI-specific: Myrinet, Portals, InfiniPath
M[VA]PICH have different project goals
• They both chose to remain separate
Current membership
14 members, 6 contributors
• 4 US DOE labs
• 8 universities
• 7 vendors
• 1 individual
Not-so-subtle hint
…would love to see an iWARP vendor in the list! (please come talk to me!)
Current projects
“Open MPI Project” is an umbrella organization for multiple projects
• OMPI: Open MPI
• ORTE: Open Run-Time Environment
• PLPA: Portable Linux Processor Affinity
• MTT: MPI (Middleware) Testing Tool
Project: Open MPI / ORTE
Recently released new 1.2 series
OF-related changes compared to v1.1 series
• Better overall performance, lots of bug fixes
• Improvements for run-time/launch scalability
• Relocate installed MPI (good for ISVs)
• Support for fork() with OFED 1.2
• Support fixed limits for registered memory
• Fixes for heterogeneous network environments
• Native InfiniPath support
Success stories
OFED + Open MPI
• Thunderbird Sandia cluster
    • #6 in Top 500
• Road Runner Los Alamos cluster
    • 16k Opteron cores + 16k cell broadband engines
• Coyote Los Alamos cluster
    • 2580 Opteron cores
Sun ClusterTools v7
OFED involvement
Initially planned on “v1.2ofed”
• Included some OF-specific updates
But community released v1.2.1 before OFED 1.2
Therefore, included community OMPI v1.2.1 release in OFED v1.2
[Figure: timeline of the OMPI SVN development trunk and the v1.2 series branch, marking v1.2, v1.2ofed, v1.2.1, and “Today”]
OFED involvement
“MPI Selector”
• Menu-based and CLI commands
Trivially set system-wide and per-user default MPI selection
• No editing of “dot” files necessary
• Displays / selects between all installed MPIs
Works with all MPIs
• Including HP MPI and Intel MPI
Ongoing OFA-related work
More flexible OF wireup schemes
• Heterogeneous networking scenarios
• Multiple QPs per connection
More flexible resource affinity schemes
• Processor / core, HCA / port
Automatic path migration
RDMA CM functionality
Better LMC / multi-LID routing
Ongoing OFA-related work
Message coalescing
Asynchronous progress
Exploit new Mellanox HCA capabilities
Better utilization of network resources
Heterogeneity
Multicast, UD
Roadmap
1.2 series is current stable
• v1.2.1 latest release
1.3 series tentatively targeted at end of year
• Checkpoint / restart (and other FT)
• Integration with debuggers
• Windows support (*)
• MPI collectives performance improvements
• LSF integration
Project: Processor affinity (PLPA)
Linux API for affinity has changed 3 times
• Changed number and type of arguments
• Used same function name (!)
• Both kernel and glibc functions
• Installed glibc may not match kernel!
Affinity is critical for performance
• Especially with increasing core count per host
• Already critical on NUMA machines (locality!)
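Illustration (not from the original slides): a minimal sketch of what "setting affinity" means in practice, assuming a Linux host where Python's standard library exposes os.sched_getaffinity and os.sched_setaffinity. The helper name pin_briefly is hypothetical, and this is not the PLPA API; it only shows a process restricting itself to one processor and then restoring its original mask.

```python
import os


def pin_briefly():
    """Pin this process to one allowed CPU, read the mask back, then restore.

    Returns (target_cpu, mask_while_pinned), or None when the affinity
    calls are unavailable (non-Linux) or refused by the OS.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # assumption: only Linux builds expose these calls
    try:
        allowed = os.sched_getaffinity(0)      # 0 means "the calling process"
        target = min(allowed)                  # pick any CPU we may run on
        os.sched_setaffinity(0, {target})      # restrict to that single CPU
        pinned = os.sched_getaffinity(0)       # read the mask back
        os.sched_setaffinity(0, allowed)       # restore the original mask
    except OSError:
        return None
    return target, pinned


if __name__ == "__main__":
    print(pin_briefly())
```

Going through the interpreter's wrapper hides the argument-format differences the slide describes; a C program calling the kernel or glibc function directly has to get them right itself.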
Which API to use?
Compile-time solution not sufficient
• Need complex “configure” script to figure it out
• Only determines glibc API, not kernel API, so it may not even be sufficient
• Does not help for shipping static binaries (ISVs)
Need a run-time solution
• Paul Hargrove (LBNL) devised safe kernel probe
• PLPA library born
PLPA library
Constant API suitable for ISVs
• BSD license
Automatically performs the run-time probe
• Dispatches to correct back-end kernel function
• Bypasses glibc
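Illustration (not from the original slides): the probe-then-dispatch idea can be sketched as below. Everything here is hypothetical: the Backend and AffinityDispatcher names, the side-effect-free probe callables, and the string return values stand in for real kernel calling conventions. The actual PLPA probe is not described on the slides and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass
class Backend:
    """One candidate calling convention (hypothetical stand-in)."""
    name: str
    probe: Callable[[], bool]                 # must have no side effects
    set_affinity: Callable[[Set[int]], str]   # the real work


class AffinityDispatcher:
    """Constant front-end API; picks a back end once, at first use."""

    def __init__(self, backends: Iterable[Backend]):
        self._backends: List[Backend] = list(backends)
        self._chosen: Optional[Backend] = None

    def _select(self) -> Backend:
        if self._chosen is None:
            for backend in self._backends:
                if backend.probe():           # first back end that works wins
                    self._chosen = backend
                    break
            else:
                raise RuntimeError("no usable affinity back end found")
        return self._chosen                   # cached: probe runs only once

    def set_affinity(self, cpus: Iterable[int]) -> str:
        return self._select().set_affinity(set(cpus))


if __name__ == "__main__":
    demo = AffinityDispatcher([
        Backend("old", lambda: False, lambda c: "old:%s" % sorted(c)),
        Backend("new", lambda: True, lambda c: "new:%s" % sorted(c)),
    ])
    print(demo.set_affinity([1, 0]))
```

Callers only ever see set_affinity, so the same binary keeps working when the underlying calling convention differs between hosts.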
Current status
Releases:
• Stable series: 1.0.x
• Upcoming series: 1.1
New 1.1 features
• Topology information
    • Mapping between (socket, core) tuple, hardware threads, CPU node, and Linux processor IDs
• plpa-taskset(1) command
    • Same as taskset(1), but groks topology information
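Illustration (not from the original slides): one way to picture the (socket, core) to Linux processor ID mapping is to group /proc/cpuinfo-style records by their "physical id" and "core id" fields. This is a sketch under assumptions: the function name parse_topology and the SAMPLE text are made up for the example, and the slides do not say how PLPA itself obtains topology information.

```python
from collections import defaultdict

# Hypothetical excerpt in the style of /proc/cpuinfo on an x86 Linux host.
SAMPLE = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 1

processor   : 2
physical id : 1
core id     : 0

processor   : 3
physical id : 0
core id     : 0
"""


def parse_topology(text):
    """Map (socket, core) -> list of Linux processor IDs.

    More than one ID under the same key means hardware threads
    that share a core.
    """
    mapping = defaultdict(list)
    for block in text.strip().split("\n\n"):   # one block per processor
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if {"processor", "physical id", "core id"} <= fields.keys():
            key = (int(fields["physical id"]), int(fields["core id"]))
            mapping[key].append(int(fields["processor"]))
    return dict(mapping)


if __name__ == "__main__":
    for key, cpus in sorted(parse_topology(SAMPLE).items()):
        print(key, cpus)
```

In the sample, processors 0 and 3 land under the same (socket, core) key, which is how two hardware threads of one core would show up.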
Project: MPI Testing Tool (MTT)
Could be named “Middleware Testing Tool”
• Very little (nothing?) is MPI-specific
Not specific to Open MPI
• Has been used with LAM, MPICH2, MVAPICH2
Used as primary test mechanism for OMPI
• Distributed testing by member organizations
Open MPI MTT Usage
Distributed regression testing
• Nightly and weekend runs
• Results e-mailed every weekday morning
Supports various resource managers
Supports correctness and performance tests
Cornerstone of Open MPI release process
Each member tests the platforms they care about
Nightly Regression Testing
Nightly tarballs created at Indiana U.
Member sites
• Download tarball and tests
• Compile and run tests
Members upload results to central DB
E-mail sent at 12 and 24 hour intervals
Real-time web querying
[Figure: Indiana U. server hosting the tarball and central DB, connected to four member sites]
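Illustration (not from the original slides): the 12 and 24 hour e-mail summaries could be pictured as a windowed roll-up over uploaded results. The record layout (site, when, passed) and the function name summarize are assumptions made for this sketch; MTT's real database schema and reporting code are not shown in the slides.

```python
from datetime import datetime, timedelta


def summarize(results, now, window_hours):
    """Count passes and failures per site inside the trailing window.

    `results` is an iterable of dicts with hypothetical keys:
    "site" (str), "when" (datetime), "passed" (bool).
    """
    cutoff = now - timedelta(hours=window_hours)
    summary = {}
    for record in results:
        if cutoff <= record["when"] <= now:
            counts = summary.setdefault(record["site"], {"pass": 0, "fail": 0})
            counts["pass" if record["passed"] else "fail"] += 1
    return summary


if __name__ == "__main__":
    now = datetime(2007, 4, 30, 8, 0)
    uploads = [
        {"site": "site-a", "when": now - timedelta(hours=2), "passed": True},
        {"site": "site-a", "when": now - timedelta(hours=20), "passed": False},
    ]
    for hours in (12, 24):
        print(hours, summarize(uploads, now, hours))
```

Running the same roll-up with two window sizes separates failures from the most recent runs from those that were already visible in the previous report.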
Usage in Open MPI
Currently available to all OMPI members
• Strongly “encouraged”
E-mail results examined every day
• 12 and 24 hour windows
• Weekend windows
MTT software to be released publicly later this year
The Open MPI Project
More than just MPI
• Concerned with real-world HPC
Open community
• Come join us!
Solid OpenFabrics support is critical
• Many unanswered questions
• Plenty of room for academic and industry ongoing work