Infiniband and RDMA Technology Doug Ledford
Source file: rich/Infiniband/Summit2006-RDMA.pdf

Sep 29, 2020

Page 1

Infiniband and RDMA Technology
Doug Ledford

Page 2

Top 500 Supercomputers, Nov 2005

● #5 Sandia National Labs, 4500 machines, 9000 CPUs, 38 TFlops, 1 big headache
● Performance great... but...
● Adding new machines problematic due to software interactions
● Diagnosing and locating faults very difficult

Page 3

OpenFabrics Software Stack

[Figure: layered diagram of the OpenFabrics software stack, annotated "Headache"]

Layer labels: Hardware; Provider; Core; Verbs / API Layer; Upper Layer Protocol; User APIs; Application Level

Components shown: InfiniBand HCA; iWARP NIC; Hardware Specific Driver (two boxes); InfiniBand Connection Manager (CM); InfiniBand MAD; InfiniBand Specific Verbs; InfiniBand Subnet Admin Client (SA Client); iWARP Connection Manager (CM); iWARP Specific Verbs/API; Connection Manager Abstraction (CMA); Common Verbs/API; User Level Verbs; SDP; IPoIB; SRP; iSER; RDS; UDAPL; SDP Library; User Level MAD API; Open SM; Diagnostic Tools; NFS-RDMA RPC; Clustered DB Access; Cluster FS; Sockets Based Access; Various MPIs; Access to File Systems; Block Storage Access; IP based App Access; Other FS

Page 4

[Figure: bar chart of Throughput (MB/s, axis 0 to 1600) and CPU Utilization (%, axis 0 to 35)]

● 10GigE w/ TOE: 690 MB/s throughput, 30% CPU utilization
● 20Gig InfiniBand: 1400 MB/s throughput, 6% CPU utilization

Source: "Head to TOE" from OSU, "InfiniBand and 10-Gigabit Ethernet for I/O in cluster computing" from Sandia National Laboratories, and Mellanox
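
The chart's point is that InfiniBand moves more data while using less CPU. One way to put a single number on that is throughput per percentage point of CPU. The sketch below is only illustrative: it assumes the data labels 690 and 30 belong to 10GigE w/ TOE and 1400 and 6 to 20Gig InfiniBand, and the "MB/s per 1% CPU" metric is introduced here, not taken from the slide.

```python
# Data labels read off the Page 4 chart. The pairing of labels to bars
# is an assumption; the extracted text does not state it explicitly.
links = {
    "10GigE w/ TOE": {"mb_per_s": 690, "cpu_pct": 30},
    "20Gig InfiniBand": {"mb_per_s": 1400, "cpu_pct": 6},
}


def mb_per_cpu_pct(link):
    """Throughput delivered per percentage point of CPU utilization."""
    return link["mb_per_s"] / link["cpu_pct"]


efficiency = {name: mb_per_cpu_pct(values) for name, values in links.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.1f} MB/s per 1% CPU")
```

Under that pairing, 10GigE w/ TOE delivers 23 MB/s per CPU point and InfiniBand roughly ten times as much.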

Page 5

Wall Street Trading Environment Challenges

[Figure: Cancels per Trade, CAGR 269%]
● 2003: 1.2
● 2004: 4.3
● 2005*: 16.1
● 2006 p: 60.3

[Figure: Quotes per Trade, CAGR 72%]
● 2003: 12.6
● 2004: 20.5
● 2005*: 36.1
● 2006 p: 63.6

2005* = Aug 05 – Source: NASDAQ

[Figure: Aggregated One Minute Peak MPS Rates: CTS, CQS, OPRA, NQDS]
● 2000: 4,799
● 2001: 7,063
● 2002: 9,650
● 2003: 12,906
● 2004: 25,869
● Feb-05: 55,105
● Jun-05: 80,000
● Dec 2005 Proj: 120,000

Source: SIAC, OPRA, and NASDAQ
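
The CAGR labels on the two NASDAQ charts can be checked from their endpoint values. This is a minimal sketch, assuming the standard compound-growth formula and three annual periods between 2003 and 2006; the slide does not say how its figures were derived, and the function name `cagr` is illustrative.

```python
def cagr(first, last, periods):
    """Compound annual growth rate between two endpoint values."""
    return (last / first) ** (1.0 / periods) - 1.0


# Endpoint values from the charts: 2003 and 2006 (projected).
cancels_cagr = cagr(1.2, 60.3, 3)   # Cancels per Trade
quotes_cagr = cagr(12.6, 63.6, 3)   # Quotes per Trade

print(f"Cancels per Trade CAGR: {cancels_cagr:.0%}")
print(f"Quotes per Trade CAGR: {quotes_cagr:.0%}")
```

Under those assumptions the results round to 269% and 72%, matching the labels on the charts.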

Page 6

[Figure: interconnect positioning chart]

Vertical axis: Performance (Low Latency, High Bandwidth, Efficient CPU Utilization, Reliable Transport)
Horizontal axis: Number of Concurrent Applications
Interconnects plotted: InfiniBand, 10GigE, 1GigE, Fibre Channel
Other labels: High Performance Computing, Storage, Middleware Servers, Aggregation, Web Servers, Enterprise LANs, High Performance Embedded, Interconnects, Database Servers

Page 7

● Price/Performance
– $69 (OEM) adapter IC vs. $500 for similar 10GigE adapter IC solution
– $200 (OEM) adapter card vs. $2000 for comparable 10GigE card
– 1.4GB/s and 2.7µs latency

● Virtualization
– Highest utilization of computing and storage resources
– Simplifies adding resources for rapidly expanding data centers

[Figure labels: IB HCA, IB HCA, 20 Gb/s, Virtual Machine …, Hypervisor, GbE NIC]

Page 8

Page 9

New Top500 Cluster

[Figure: bar chart, axis 0 to 80, comparing 2004 and 2005 for New IB clusters, New Myrinet clusters, and New Quadrics clusters]

Infiniband/RDMA use climbing rapidly

Page 10

Level 1 – IPoIB

[Figure: excerpt of the OpenFabrics stack. Layer labels: Kernel Provided Interface; User APIs; Application Level. Components shown: Common Verbs/API; User Level Verbs; SDP; IPoIB; UDAPL; SDP Library; Sockets Based Access; Various MPIs; IP based App Access]

● Easiest to use, requires no modification of applications
● Lowest overall payback

Page 11

Level 2 – SDP

[Figure: excerpt of the OpenFabrics stack. Layer labels: Kernel Provided Interface; User APIs; Application Level. Components shown: Common Verbs/API; User Level Verbs; SDP; IPoIB; UDAPL; SDP Library; Sockets Based Access; Various MPIs; IP based App Access]

● You might be able to use the libsdp library to enable SDP in your application without any code changes or recompiles
● If not, the code changes to natively support SDP are minimal
● This method gets a good deal of the RDMA benefit
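
To give a sense of how small the native SDP change can be, the sketch below opens a stream socket in the SDP address family and falls back to ordinary TCP when the kernel does not offer it. It is illustrative only: the numeric value 27 for `AF_INET_SDP` follows the convention used by OFED-era SDP stacks and is an assumption to check against your own headers, and the helper name `open_stream_socket` is hypothetical.

```python
import socket

# Assumption: 27 is the AF_INET_SDP value used by OFED-era SDP stacks.
# Python's socket module does not define this constant.
AF_INET_SDP = 27


def open_stream_socket(prefer_sdp=True):
    """Return a stream socket, using SDP when the kernel offers it."""
    if prefer_sdp:
        try:
            return socket.socket(AF_INET_SDP, socket.SOCK_STREAM)
        except OSError:
            pass  # no SDP support on this host; fall back to plain TCP
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)


sock = open_stream_socket()
print("address family:", int(sock.family))
sock.close()
```

The only difference from ordinary sockets code is the address family passed at creation time; connect, send, and receive calls stay the same. The fallback branch is an unmodified TCP socket, which is also what runs over an IPoIB interface at Level 1.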

Page 12

Level 3 – Verbs/MPI

[Figure: excerpt of the OpenFabrics stack. Layer labels: Kernel Provided Interface; User APIs; Application Level. Components shown: Common Verbs/API; User Level Verbs; SDP; IPoIB; UDAPL; SDP Library; Sockets Based Access; Various MPIs; IP based App Access]

● Code must be written to either the verbs or MPI API
● Code changes are not minimal, and in some cases require rethinking of application design
● This method gets the full benefit of RDMA capabilities

Page 13

OpenFabrics Software Stack

[Figure: layered diagram of the OpenFabrics software stack, with future components marked]

Layer labels: Hardware; Provider; Core; Verbs / API Layer; Upper Layer Protocol; User APIs; Application Level

Components shown: InfiniBand HCA; iWARP NIC*; Hardware Specific Driver; Hardware Specific Driver*; InfiniBand Connection Manager (CM); InfiniBand MAD; InfiniBand Specific Verbs; InfiniBand Subnet Admin Client (SA Client); iWARP Connection Manager (CM)*; iWARP Specific Verbs/API*; Connection Manager Abstraction (CMA); Common Verbs/API; User Level Verbs; SDP; IPoIB; SRP; iSER*; RDS*; UDAPL; SDP Library; User Level MAD API; Open SM; Diagnostic Tools; Clustered DB Access; Sockets Based Access; Various MPIs; Block Storage Access; IP based App Access

Key: Common; IB Specific; iWARP Specific
* - Future