Develop Application with Open Fabrics Yufei Ren Tan Li

Dec 20, 2015

Page 1:

Develop Application with Open Fabrics

Yufei Ren, Tan Li

Page 2:

Agenda

• RDMA concept review
• Modules in OFED-1.5.1 userspace
• librdmacm (RDMA Communication Manager)
• libibverbs (InfiniBand verbs)
• Installing OFED on Fedora Core 12/RHEL5
• About Lustre & future work

Page 3:

RDMA?

• RDMA: networking technologies that have a software interface with three features:
– Remote DMA (RDMA write, RDMA read)
– Asynchronous work queues (as Tan has illustrated)
– Kernel bypass
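The asynchronous work-queue idea can be illustrated without any RDMA hardware. The sketch below is a toy, single-threaded model in plain C; every name in it (toy_qp, toy_post_send, toy_hw_progress, toy_poll_cq) is hypothetical and is not part of libibverbs. It only shows the pattern: the application posts a work request and returns at once, the "adapter" executes it later, and the application learns the result by polling a completion queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DEPTH 8   /* ring size; a power of two so index wraparound is safe */

/* A work request: copy len bytes from src to dst (stands in for an RDMA write). */
struct toy_wr { unsigned long wr_id; void *dst; const void *src; size_t len; };

/* A completion entry: identifies the request that finished. */
struct toy_wc { unsigned long wr_id; };

/* Toy "queue pair": one send queue plus one completion queue. */
struct toy_qp {
    struct toy_wr sq[TOY_DEPTH]; size_t sq_head, sq_tail;
    struct toy_wc cq[TOY_DEPTH]; size_t cq_head, cq_tail;
};

/* Post a request. Returns 0 immediately, or -1 if the send queue is full. */
static int toy_post_send(struct toy_qp *qp, const struct toy_wr *wr)
{
    assert(qp && wr);
    if (qp->sq_tail - qp->sq_head == TOY_DEPTH)
        return -1;
    qp->sq[qp->sq_tail++ % TOY_DEPTH] = *wr;
    return 0;
}

/* Stand-in for the adapter: execute one queued request, emit a completion. */
static void toy_hw_progress(struct toy_qp *qp)
{
    struct toy_wr *wr;

    if (qp->sq_head == qp->sq_tail)                 /* nothing posted */
        return;
    if (qp->cq_tail - qp->cq_head == TOY_DEPTH)     /* completion queue full */
        return;
    wr = &qp->sq[qp->sq_head++ % TOY_DEPTH];
    memcpy(wr->dst, wr->src, wr->len);
    qp->cq[qp->cq_tail++ % TOY_DEPTH].wr_id = wr->wr_id;
}

/* Poll for one completion. Returns 1 if one was found, 0 if the queue is empty. */
static int toy_poll_cq(struct toy_qp *qp, struct toy_wc *wc)
{
    if (qp->cq_head == qp->cq_tail)
        return 0;
    *wc = qp->cq[qp->cq_head++ % TOY_DEPTH];
    return 1;
}
```

In this model a poll right after toy_post_send() finds nothing; the completion only appears once toy_hw_progress() has run, which is the behaviour an application has to plan for with real work queues.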

Page 4:

RDMA - Kernel bypass

[Figure: non-iWARP vs. iWARP]

Page 5:

RDMA Verbs and Objects

• Not quite an API
• Abstract definition of functionality
• “Resources (Objects) operated on by Verbs (functions).”
– For example, a Queue Pair or Completion Queue is operated on by Create/Destroy.
– rdma_create_qp()/rdma_destroy_qp() in librdmacm/include/rdma/rdma_cma.h
• Can be thought of as Objects and Methods in an OO language (C++/Java).

Page 6:

What is OpenFabrics

• Includes:
– Kernel-level drivers
– Channel-oriented RDMA bypasses
– Application Program Interface (API)
• For:
– Parallel message passing (MPI)
– Socket data exchange (SDP)
– File systems (Lustre)

Page 7:

Modules in OFED-1.5.1 userspace

• librdmacm: Linux library to abstract connection setup.
• libibverbs: a library that allows programs to use RDMA "verbs" for direct access to RDMA (currently InfiniBand and iWARP) hardware from userspace.
• Device-specific drivers:
– IB: libmthca, libmlx4, libipathverbs, libehca
– iWARP: libcxgb3, libamso

Page 8:

librdmacm

• Linux library to abstract connection setup. The same code runs on IB and iWARP fabric technologies.
• Mimics the TCP socket model (socket, connect, bind, listen, accept, getaddrinfo, etc.). The cm_id is the socket analog.
• IP addressing can be used on iWARP and also on InfiniBand (via IPoIB).
• Additional address/route resolution steps:
– rdma_resolve_addr()
– rdma_resolve_route()
• Events are reported through “channels”:
– rdma_create_event_channel()
– rdma_get_cm_event()
– rdma_ack_cm_event()

Page 9:

An example of ftp via OpenFabrics

[Figure: call sequence between an RDMA FTP Client and an RDMA FTP Server for Put and Get; layers labelled FTP Protocol and FS; phases labelled connection establishment and data]

• RDMA FTP Server: rdma_getaddrinfo(), rdma_create_ep(), rdma_listen(), rdma_accept() (blocks until connection from client), rdma_get_recv_comp(), rdma_post_send(), rdma_disconnect(), rdma_dereg_mr(), rdma_destroy_ep()
• RDMA FTP Client: rdma_getaddrinfo(), rdma_create_ep(), rdma_connect(), rdma_post_send(), rdma_get_recv_comp(), rdma_disconnect(), rdma_dereg_mr(), rdma_destroy_ep()

Page 10:

librdmacm – initialization

• rdma_create_event_channel()
– Open a channel used to report communication events. Asynchronous events are reported to users through event channels. Each event channel maps to a file descriptor.
• rdma_create_id()
– Allocate a communication identifier, which is used to track communication information. It is analogous to a socket file descriptor.

Page 11:

librdmacm – active connection steps

• rdma_resolve_addr()
– Resolve destination and optional source addresses from IP addresses to an RDMA address. If successful, the specified rdma_cm_id will be bound to a local device. Analogous to getaddrinfo() in the socket API.
• rdma_resolve_route()
– Resolve the route information needed to establish a connection. This is called on the client side of a connection after calling rdma_resolve_addr(), but before calling rdma_connect().
• rdma_connect()
– Initiate an active connection request.

Page 12:

librdmacm – passive connection steps

• rdma_bind_addr()
– Bind an RDMA identifier to a source address.
• rdma_listen()
– Listen for incoming connection requests.
• rdma_accept()
– Called to accept a connection request.

Page 13:

librdmacm – data transfer

• ibv_post_send()
– wr.opcode == IBV_WR_RDMA_READ: RDMA read
• ibv_post_send()
– wr.opcode == IBV_WR_RDMA_WRITE: RDMA write
• Example: librdmacm/examples/rping.c

Page 14:

librdmacm – Abbreviations

• QP: queue pair
• CQ: completion queue
• WQ: work queue
• MR: memory region
• PD: protection domain
• SRQ: shared receive queue
• AH: address handle
• MW: memory window

Page 15:

libibverbs

• libibverbs is a library that allows programs to use RDMA "verbs" for direct access to RDMA (currently InfiniBand and iWARP) hardware from userspace.
• Linux implementation of RDMA verbs.
• Loads device-specific drivers for hardware support.
– IB: libmthca, libmlx4, libipathverbs, libehca
– iWARP: libcxgb3, libamso

Page 16:

Install OFED on FedoraCore12

http://docs.google.com/Doc?docid=0AYXBBIFwi6bqZGY5cm1jeGJfNjAzc2N6eGt2Mw&hl=en 

Page 17:

Lustre

• File system clients
• Object Storage Servers (OSS): provide file I/O services
• Metadata Servers (MDS): manage the names and directories in the file system

Page 18:

Lustre – cont’d

Page 19:

Future work

• Run an OpenFabrics example on netqos04.
• Configure Lustre on netqos04. A real cluster needs more machines. LPAR?
• OpenFabrics sources and RFC 5040/5041/5044.