www.openfabrics.org Open MPI Project State of the Union - April 2007 Jeff Squyres Cisco, Inc.

Jan 03, 2016

Open MPI Project
State of the Union - April 2007

Jeff Squyres

Cisco, Inc.

Overview

Project purpose
Sub projects
Current status
Continuing / future directions

Why does Open MPI exist?

Maximize all MPI expertise
• Research / academia
• Industry
• …elsewhere

Capitalize on [literally] years of MPI research and implementation experience

The sum is greater than the parts

[Figure labels: Research / academia, Industry]

Why separate from M[VA]PICH?

Open, inclusive community

Not limited to just Open Fabrics
• Common: TCP, shared memory, OFED* (MVAPICH only)
• OMPI-specific: Myrinet, Portals, InfiniPath

M[VA]PICH have different project goals
• They both chose to remain separate

Current membership

14 members, 6 contributors
• 4 US DOE labs
• 8 universities
• 7 vendors
• 1 individual

Not-so-subtle hint

…would love to see an iWARP vendor in the list! (please come talk to me!)

Current projects

“Open MPI Project” is an umbrella organization for multiple projects
• OMPI: Open MPI
• ORTE: Open Run-Time Environment
• PLPA: Portable Linux Processor Affinity
• MTT: MPI (Middleware) Testing Tool

Project: Open MPI / ORTE

Recently released new 1.2 series

OF-related changes compared to v1.1 series
• Better overall performance, lots of bug fixes
• Improvements for run-time/launch scalability
• Relocate installed MPI (good for ISVs)
• Support for fork() with OFED 1.2
• Support fixed limits for registered memory
• Fixes for heterogeneous network environments
• Native InfiniPath support

Version history

Success stories

OFED + Open MPI
• Thunderbird Sandia cluster
  • #6 in Top 500
• Road Runner Los Alamos cluster
  • 16k Opteron cores + 16k cell broadband engines
• Coyote Los Alamos cluster
  • 2580 Opteron cores

Sun ClusterTools v7

OFED involvement

Initially planned on “v1.2ofed”
• Included some OF-specific updates

But community released v1.2.1 before OFED 1.2

Therefore, included community OMPI v1.2.1 release in OFED v1.2

[Figure: timeline of the OMPI SVN development trunk and the v1.2 series branch, marking v1.2, v1.2ofed, v1.2.1, and Today]

OFED involvement

“MPI Selector”
• Menu-based and CLI commands

Trivially set system-wide and per-user default MPI selection
• No editing of “dot” files necessary
• Displays / selects between all installed MPI’s

Works with all MPI’s
• Including HP MPI and Intel MPI
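
The system-wide and per-user defaults can be pictured as a two-level lookup. The following is a minimal sketch, assuming a per-user choice takes precedence over the system-wide one. The function name and the installation names are hypothetical, and this is not the MPI Selector implementation.

```python
from typing import Optional


def resolve_default_mpi(installed: list[str],
                        system_default: Optional[str],
                        user_default: Optional[str]) -> Optional[str]:
    """Return the MPI to use: per-user choice first, then system-wide."""
    for choice in (user_default, system_default):
        # Ignore a selection that points at an MPI that is not installed.
        if choice is not None and choice in installed:
            return choice
    return None


# Hypothetical installation names, for illustration only.
installed = ["openmpi-1.2.1", "mvapich-0.9.9", "hpmpi"]
system_only = resolve_default_mpi(installed, "mvapich-0.9.9", None)
with_user = resolve_default_mpi(installed, "mvapich-0.9.9", "openmpi-1.2.1")
print(system_only)
print(with_user)
```

With only a system-wide default set, every user gets that MPI; a user who records a personal choice gets that one instead, without touching any "dot" files.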

Ongoing OFA-related work

More flexible OF wireup schemes
• Heterogeneous networking scenarios
• Multiple QP’s per connection

More flexible resource affinity schemes
• Processor / core, HCA / port

Automatic path migration
RDMA CM functionality
Better LMC / multi-LID routing

Ongoing OFA-related work

Message coalescing
Asynchronous progress
Exploit new Mellanox HCA capabilities
Better utilization of network resources
Heterogeneity
Multicast, UD

Roadmap

1.2 series is current stable
• v1.2.1 latest release

1.3 series tentatively targeted at end of year
• Checkpoint / restart (and other FT)
• Integration with debuggers
• Windows support (*)
• MPI collectives performance improvements
• LSF integration

Project: Processor affinity (PLPA)

Linux API for affinity has changed 3 times
• Changed number and type of arguments
• Used same function name (!)
• Both kernel and glibc functions
• Installed glibc may not match kernel!

Affinity is critical for performance
• Especially with increasing core count per host
• Already critical on NUMA machines (locality!)
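
To make "affinity" concrete, here is a minimal sketch of pinning the current process to one CPU. It uses Python's standard `os.sched_getaffinity` / `os.sched_setaffinity`, which exist only on platforms that support them (notably Linux) and go through the C library, so this illustrates the concept rather than PLPA itself. The helper name `pin_to_one_cpu` is an assumption made for this example.

```python
import os


def pin_to_one_cpu(pid: int = 0) -> set[int]:
    """Restrict a process (0 = the caller) to one CPU it may already use."""
    allowed = os.sched_getaffinity(pid)
    target = {min(allowed)}          # arbitrary choice for the sketch
    os.sched_setaffinity(pid, target)
    return os.sched_getaffinity(pid)


if hasattr(os, "sched_setaffinity"):
    before = os.sched_getaffinity(0)
    pinned = pin_to_one_cpu()
    print("allowed before:", sorted(before), "after:", sorted(pinned))
    os.sched_setaffinity(0, before)  # restore the original mask
else:
    pinned = set()
    print("CPU affinity calls are not available on this platform")
```

Once pinned, the scheduler no longer migrates the process between cores, which is what keeps its memory accesses local on a NUMA machine.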

Which API to use?

Compile-time solution not sufficient
• Need complex “configure” script to figure it out
• Only determines glibc API, not kernel API, so it may not even be sufficient
• Does not help for shipping static binaries (ISVs)

Need a run-time solution
• Paul Hargrove (LBNL) devised safe kernel probe
• PLPA library born

PLPA library

Constant API suitable for ISVs
BSD license

Automatically performs the run-time probe
• Dispatches to correct back-end kernel function
• Bypasses glibc
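
The probe-then-dispatch idea can be sketched as follows. The slides do not describe how the actual kernel probe works, so this is a simulation under stated assumptions: `FakeKernel`, `AffinityDispatcher`, and the `variant_N` labels are all hypothetical stand-ins, and the "probe" is simply a read-only call that either succeeds or is rejected.

```python
import errno
from typing import Optional


class FakeKernel:
    """Stand-in for a kernel that accepts only one calling convention."""

    def __init__(self, accepted_variant: str) -> None:
        self.accepted_variant = accepted_variant
        self.mask = {0, 1, 2, 3}

    def syscall_get(self, variant: str) -> set[int]:
        if variant != self.accepted_variant:
            raise OSError(errno.EINVAL, "wrong argument convention")
        return set(self.mask)

    def syscall_set(self, variant: str, cpus: set[int]) -> None:
        if variant != self.accepted_variant:
            raise OSError(errno.EINVAL, "wrong argument convention")
        self.mask = set(cpus)


class AffinityDispatcher:
    """One constant API; the back-end variant is chosen at run time."""

    VARIANTS = ("variant_1", "variant_2", "variant_3")

    def __init__(self, kernel: FakeKernel) -> None:
        self.kernel = kernel
        self._variant: Optional[str] = None

    def _probe(self) -> str:
        # Probe once, then cache the answer for later calls.
        if self._variant is None:
            for variant in self.VARIANTS:
                try:
                    self.kernel.syscall_get(variant)  # read-only, so harmless
                except OSError:
                    continue
                self._variant = variant
                break
            else:
                raise RuntimeError("no supported affinity API found")
        return self._variant

    def set_affinity(self, cpus: set[int]) -> None:
        self.kernel.syscall_set(self._probe(), cpus)

    def get_affinity(self) -> set[int]:
        return self.kernel.syscall_get(self._probe())


api = AffinityDispatcher(FakeKernel("variant_2"))
api.set_affinity({1})
print(api.get_affinity())
```

The caller never names a variant, which is what lets one binary run unchanged on hosts whose kernels disagree about the calling convention.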

Current status

Releases
• Stable series: 1.0.x
• Upcoming series: 1.1

New 1.1 features
• Topology information
  • Mapping between (socket,core) tuple, hardware threads, CPU node, and Linux processor ID’s
• plpa-taskset(1) command
  • Same as taskset(1), but groks topology information
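
The topology mapping can be illustrated with a small lookup table. This is a sketch over a made-up machine (two sockets, two cores per socket, two hardware threads per core); the table and the function name `processors_by_core` are assumptions for illustration and do not reflect the PLPA API.

```python
from collections import defaultdict

# Hypothetical topology: Linux processor ID -> (socket, core).
PROCESSOR_TO_SOCKET_CORE = {
    0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1),
    4: (0, 0), 5: (0, 1), 6: (1, 0), 7: (1, 1),
}


def processors_by_core(topology: dict[int, tuple[int, int]]) -> dict[tuple[int, int], list[int]]:
    """Invert the table: (socket, core) -> sorted Linux processor IDs."""
    by_core = defaultdict(list)
    for proc_id, socket_core in sorted(topology.items()):
        by_core[socket_core].append(proc_id)
    return dict(by_core)


by_core = processors_by_core(PROCESSOR_TO_SOCKET_CORE)
print(by_core[(1, 0)])  # hardware threads sharing socket 1, core 0
```

A topology-aware taskset can then accept "socket 1, core 0" and translate it into the processor IDs that the kernel affinity call expects.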

Project: MPI Testing Tool (MTT)

Could be named “Middleware Testing Tool”
• Very little (nothing?) in it is MPI-specific

Not specific to Open MPI
• Has been used with LAM, MPICH2, MVAPICH2

Used as primary test mechanism for OMPI
• Distributed testing by member organizations

Open MPI MTT Usage

Distributed regression testing
• Nightly and weekend runs
• Results e-mailed every weekday morning

Supports various resource managers
Supports correctness and performance tests
Cornerstone of Open MPI release process

Each member tests the platforms they care about

Nightly Regression Testing

[Figure: Indiana U. server, tarball, and four member sites]

Nightly tarballs created at Indiana U.

Member sites
• Download tarball and tests
• Compile and run tests

Members upload results to central DB

E-mail sent at 12 and 24 hour intervals

Real-time web querying
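
The run-and-upload half of this cycle can be sketched in a few lines. This is an illustration under stated assumptions, not MTT itself: the tests are plain Python callables with hypothetical names, the download and compile steps are omitted, and an in-memory SQLite database stands in for the central DB.

```python
import sqlite3

# Hypothetical test suite: name -> callable returning True on success.
TESTS = {
    "ring": lambda: True,
    "allreduce": lambda: 1 + 1 == 3,   # deliberately failing test
}


def run_tests(tests):
    """Run every test; an exception counts as a failure."""
    results = []
    for name, test in tests.items():
        try:
            passed = bool(test())
        except Exception:
            passed = False
        results.append((name, passed))
    return results


def upload(db, site, results):
    """Store one member site's results in the shared results table."""
    db.executemany(
        "INSERT INTO results (site, test, passed) VALUES (?, ?, ?)",
        [(site, name, int(passed)) for name, passed in results],
    )
    db.commit()


db = sqlite3.connect(":memory:")   # stand-in for the central database
db.execute("CREATE TABLE results (site TEXT, test TEXT, passed INTEGER)")
upload(db, "member-site-a", run_tests(TESTS))
summary = db.execute("SELECT test, passed FROM results ORDER BY test").fetchall()
print(summary)
```

Because every site writes to the same table, a single query can compare how one test fares across all member platforms.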

Usage in Open MPI

Currently available to all OMPI members
• Strongly “encouraged”

E-mail results examined every day
• 12 and 24 hour windows
• Weekend windows

MTT software to be released publicly later this year
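
The 12 and 24 hour windows amount to filtering results by timestamp before counting them. The sketch below assumes each result is a `(timestamp, passed)` pair; the function name `summarize` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def summarize(results, now, window_hours):
    """Count passes and failures reported within the last `window_hours`."""
    cutoff = now - timedelta(hours=window_hours)
    recent = [passed for stamp, passed in results if stamp >= cutoff]
    return {"pass": sum(recent), "fail": len(recent) - sum(recent)}


now = datetime(2007, 4, 30, 8, 0)
results = [
    (now - timedelta(hours=2), True),
    (now - timedelta(hours=10), False),
    (now - timedelta(hours=20), True),
    (now - timedelta(hours=30), True),   # outside both windows
]
last_12 = summarize(results, now, 12)
last_24 = summarize(results, now, 24)
print(last_12)
print(last_24)
```

A weekend window is the same computation with a larger `window_hours` value.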

The Open MPI Project

More than just MPI
• Concerned with real-world HPC

Open community
• Come join us!

Solid OpenFabrics support is critical
• Many unanswered questions
• Plenty of room for academic and industry ongoing work

Thank You

http://www.open-mpi.org/