High Bandwidth High Throughput in the MB-NG & DataTAG Projects

Richard Hughes-Jones, Stephen Dallison, Gareth Fairey
Dept. of Physics and Astronomy, University of Manchester

Robin Tasker
Daresbury Laboratory, CLRC

Miguel Rio, Yee Ting Li
Dept. of Physics and Astronomy, University College London

e-Science All Hands Meeting, 1-4 Sep 03, R. Hughes-Jones, Manchester

Page 2: Topology of the MB-NG Network

[Figure: MB-NG network topology.]

Key: Gigabit Ethernet; 2.5 Gbit POS Access; MPLS; Admin. Domains
Labelled networks: UCL Domain, Manchester Domain, RAL Domain, UKERNA Development Network
Routers: Edge Router Cisco 7609; Boundary Router Cisco 7609 (x2)
Hosts: man01, man02, man03, lon01, lon02, lon03

Page 3: DataTAG Testbed

[Figure: DataTAG testbed topology, Geneva (gva) and Chicago (chi); last update 20030701.]

Routers, Geneva: r04gva (Cisco 7606), r05gva (Juniper M10), r06gva (Alcatel 7770)
Routers, Chicago: r04chi (Cisco 7609), r05chi (Juniper M10), r06chi (Alcatel 7770), ar3-chicago (Cisco 7606)
CERN routers: cernh4 (Cisco 7609), cernh7 (Cisco 7609)
Switches: s01gva (Extreme S1i), s01chi (Extreme S5i), s02gva (Cisco 5505, management), Cisco 2950 (management)
Optical equipment: ONS15454 (x3), Alcatel 1670 (x2)
Hosts, Geneva: w01gva to w06gva, w20gva, v02gva, v03gva
Hosts, Chicago: w01chi to w06chi, v10chi to v13chi
Other hosts: w01bol
Links: stm16 (DTag), stm4 (DTag), stm16 (Colt) backup + projects, stm16 (FranceTelecom), stm16 (Swisscom), stm64 (GC), STM64, CCC tunnel
Connected networks: SURFNET, GEANT, VTHD/INRIA, CANARIE, CESNET, SWITCH, CNAF, CERN/Caltech production network
Key: 1000baseSX, 1000baseT, 10GbaseLX, SDH/Sonet
Multiplicity labels on the diagram: 2x, 3x, 7x, 8x

Page 4: End Hosts: how good are they really?

Page 5: End Hosts b2b & end-to-end UDP Tests

Test with UDPmon
Supermicro P4DP6; PCI: 64 bit 66 MHz

Max throughput 975 Mbit/s
  20% CPU utilisation on the receiver for packets > 1000 bytes
  40% CPU utilisation for smaller packets
Latency 6.1 ms & well behaved (lon3-man1, 6 routers)
  Latency slope 0.0761 µs/byte
  B2B expect: 0.0118 µs/byte (PCI 0.00188 + GigE 0.008 + PCI 0.00188)
Jitter small: 2-3 µs FWHM

[Figure: lon3-man1, received wire rate (Mbit/s, 0-1000) against delay between sending the frames (0-40 µs), for packet sizes of 50, 100, 200, 400, 600, 800, 1000, 1200, 1400 and 1472 bytes.]
[Figures: UDP jitter histograms against latency (0-300 µs) for 1472 byte packets with w=100, w=200 and w=300.]
[Figure: lon3-man1, latency (µs) against message length (0-1400 bytes); fitted line y = 0.0761x + 6075.9.]
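The back-to-back expectation of 0.0118 µs/byte can be reproduced from the three terms quoted on the slide. The sketch below is only an illustration: it assumes peak burst rates for a 64 bit 66 MHz PCI bus and a 1 Gbit/s Ethernet link, and the function names are made up for this example.

```python
# Expected back-to-back latency slope: each byte crosses the sender's PCI bus,
# the Gigabit Ethernet link, and then the receiver's PCI bus.
# Assumption: peak rates only, no bus setup or protocol overheads.

def pci_us_per_byte(width_bits: int, clock_mhz: float) -> float:
    """Microseconds to move one byte over a PCI bus at its peak rate."""
    bytes_per_us = (width_bits / 8) * clock_mhz  # bytes per clock * clocks per µs
    return 1.0 / bytes_per_us


def link_us_per_byte(rate_gbit_s: float) -> float:
    """Microseconds to serialise one byte onto a link of the given rate."""
    bits_per_us = rate_gbit_s * 1000.0
    return 8.0 / bits_per_us


pci = pci_us_per_byte(64, 66)    # close to the 0.00188 quoted on the slide
gige = link_us_per_byte(1.0)     # 0.008
expected_b2b = pci + gige + pci  # about 0.0118

measured_slope = 0.0761          # lon3-man1 fit, path with 6 routers
print(f"PCI {pci:.5f} + GigE {gige:.5f} + PCI {pci:.5f} = {expected_b2b:.4f} us/byte")
print(f"measured minus expected: {measured_slope - expected_b2b:.4f} us/byte")
```

The measured end-to-end slope is larger than this back-to-back figure; the sketch does not try to attribute the difference to individual routers or links.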

Page 6: Signals on the PCI bus

1472 byte packets every 15 µs, Intel Pro/1000

PCI: 64 bit 33 MHz
  82% usage
PCI: 64 bit 66 MHz
  65% usage
  Data transfers half as long

[Figures: PCI bus traces for each case, showing the send PCI and receive PCI signals with the send setup, data transfers and receive transfers marked.]
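The "half as long" observation follows from the bus clock alone. As a rough sketch, assuming the peak burst rate of the bus and counting only the 1472 bytes of user data (the frame moved across the bus is slightly larger), the transfer times can be estimated as below. The function name is illustrative.

```python
# Peak-rate estimate of the time to move one packet across a PCI bus.
# Assumption: pure burst transfer, ignoring bus setup, descriptors and headers,
# so this does not reproduce the 82% and 65% bus usage seen on the traces.

def pci_transfer_us(payload_bytes: int, width_bits: int, clock_mhz: float) -> float:
    """Microseconds to burst payload_bytes over the bus at its peak rate."""
    bytes_per_us = (width_bits / 8) * clock_mhz
    return payload_bytes / bytes_per_us


t33 = pci_transfer_us(1472, 64, 33)  # roughly 5.6 µs
t66 = pci_transfer_us(1472, 64, 66)  # roughly 2.8 µs

print(f"33 MHz: {t33:.2f} us, 66 MHz: {t66:.2f} us, ratio {t33 / t66:.1f}")
```

Both estimates are well below the 15 µs packet spacing, so the measured bus usage is dominated by the setup and other overheads that this estimate leaves out.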

Page 7: Interrupt Coalescence Investigations

Kernel parameters for socket buffer size: rtt*BW
TCP mem-mem lon2-man1

Tx   Tx-abs   Rx   Rx-abs   Throughput
64   64       0    128      820-980 Mbit/s +- 50 Mbit/s
64   64       20   128      937-940 Mbit/s +- 1.5 Mbit/s
64   64       80   128      937-939 Mbit/s +- 1 Mbit/s
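The rtt*BW socket buffer size is the bandwidth-delay product of the path. A minimal sketch of the arithmetic follows, using the round-trip times quoted elsewhere in the talk (6.2 ms for MB-NG, 119 ms for DataTAG) and assuming a 1 Gbit/s bottleneck, which is an assumption made for this example rather than a figure from this slide.

```python
# Bandwidth-delay product: the amount of data in flight on a full pipe,
# and hence the minimum socket buffer needed to keep it full.

def bdp_bytes(rtt_s: float, bandwidth_bit_s: float) -> float:
    """Return rtt * bandwidth, expressed in bytes."""
    return rtt_s * bandwidth_bit_s / 8


GIGABIT = 1e9  # assumed bottleneck rate in bit/s

mbng = bdp_bytes(6.2e-3, GIGABIT)      # about 775 kbytes
datatag = bdp_bytes(119e-3, GIGABIT)   # about 14.9 Mbytes
six_ms = bdp_bytes(6.0e-3, GIGABIT)    # 750 kbytes, the value quoted for Gridftp

for name, value in [("MB-NG", mbng), ("DataTAG", datatag), ("6 ms path", six_ms)]:
    print(f"{name}: {value / 1e3:.0f} kbytes")
```

A buffer smaller than this value caps the TCP window below what the path can carry, which is why the 1 Mbyte buffers used later in the talk sit just above the 750 kbytes figure.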

Page 8: txqueuelen vs sendstalls

Tx queue is located between the IP stack & the NIC driver
TCP treats 'Queue full' as congestion!
Results for Lon-Man
Select txqueuelen = 2000

Page 9: Network Investigations

Page 10: Network Bottlenecks

Backbones 2.5 and 10 Gbit: usually good (in Europe)
Access links need care: GEANT-NRN and Campus-SuperJANET4
NNW-SJ4 Access, given as an example of good forward planning:
  10 November 2002
  1 Gbit link
  24 February 2003
  26 Feb 2003: upgraded to 2.5 Gbit
Trunking: use of multiple 1 Gbit Ethernet links

Page 11: 24 Hours HighSpeed TCP mem-mem

TCP mem-mem lon2-man1
Tx 64 Tx-abs 64 Rx 64 Rx-abs 128
941.5 Mbit/s +- 0.5 Mbit/s
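One way to read the 941.5 Mbit/s figure is as the ceiling for TCP user data on Gigabit Ethernet. The sketch below works that ceiling out under two assumptions that the slide does not state: a 1500 byte MTU and the 12 byte TCP timestamp option on every segment.

```python
# Upper bound on TCP user-data rate over Gigabit Ethernet.
# Assumptions (not from the slide): 1500 byte MTU, TCP timestamps enabled.

ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, MAC header, FCS, inter-frame gap
IP_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12       # 10 byte option padded to 12


def tcp_goodput_mbit(mtu: int = 1500, line_rate_mbit: float = 1000.0,
                     timestamps: bool = True) -> float:
    """User data rate when every frame on the wire is a full-sized segment."""
    options = TCP_TIMESTAMP_OPTION if timestamps else 0
    payload = mtu - IP_HEADER - TCP_HEADER - options
    bytes_on_wire = mtu + ETH_OVERHEAD
    return line_rate_mbit * payload / bytes_on_wire


with_ts = tcp_goodput_mbit()                     # about 941.5 Mbit/s
without_ts = tcp_goodput_mbit(timestamps=False)  # about 949.3 Mbit/s
print(f"with timestamps: {with_ts:.1f} Mbit/s, without: {without_ts:.1f} Mbit/s")
```

Under these assumptions the 24 hour transfer ran essentially at line rate.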

Page 12: TCP sharing man1-lon2

1 stream every 60 s: man1-lon2, man2-lon2, man3-lon2
Sample every 10 ms

1 Stream: Average 940 Mbit/s; No Dup ACKs, No SACKs, No Sendstalls
2 Streams: Average ~500 Mbit/s; Many Dup ACKs, Cwnd reduced
3 Streams: Average ~300 Mbit/s

Page 13: 2 TCP streams man1-lon2

2 Streams:
  Dips in throughput due to Dup ACK
  ~4 losses/sec; a bit regular?
  Cwnd decreases:
    1 point 33%
    Ramp starts at 62%
    Slope 70 Bytes/µs

[Figure scale marker: 1 sec]

Page 14: TCP Protocol Stack Comparisons

Stacks compared: Standard TCP, HighSpeed TCP, Scalable TCP
Kernel on the receiver dropped packets periodically

MB-NG network: rtt 6.2 ms, recovery time 1.6 s
DataTAG network: rtt 119 ms, recovery time 590 s (9.8 min)

Recovery time = C * RTT^2 / (2 * MSS)

Throughput on the DataTAG network was a factor of ~5 lower than that on the MB-NG network

[Figures: plots labelled MB-NG and DataTAG.]
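The recovery times quoted for the two networks are consistent with the usual estimate for standard TCP: after a loss the congestion window is halved, then grows by one segment per round trip, so regaining half of a full window of C*RTT/MSS segments takes C*RTT/(2*MSS) round trips. The sketch below evaluates this with C = 1 Gbit/s and MSS = 1500 bytes. Both values are assumptions chosen for this example because they reproduce the slide's numbers.

```python
# Time for standard TCP to return to full rate after a single loss:
#   recovery = C * RTT^2 / (2 * MSS)
# Assumptions: C = 1 Gbit/s and MSS = 1500 bytes (not stated on the slide).

def recovery_time_s(capacity_bit_s: float, rtt_s: float, mss_bytes: int) -> float:
    """Seconds needed to regain half a window at one segment per round trip."""
    mss_bits = mss_bytes * 8
    return capacity_bit_s * rtt_s ** 2 / (2 * mss_bits)


CAPACITY = 1e9  # bit/s, assumed
MSS = 1500      # bytes, assumed

mbng = recovery_time_s(CAPACITY, 6.2e-3, MSS)     # about 1.6 s
datatag = recovery_time_s(CAPACITY, 119e-3, MSS)  # about 590 s

print(f"MB-NG:   {mbng:.1f} s")
print(f"DataTAG: {datatag:.0f} s ({datatag / 60:.1f} min)")
```

Because the recovery time grows with the square of the round-trip time, the roughly 19 times longer DataTAG path takes several hundred times longer to recover, which motivates the HighSpeed and Scalable TCP stacks.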

Page 15: Application Throughput

Page 16: MB-NG SuperJANET4 Development Network

[Figure: MB-NG SuperJANET4 Development Network topology.]

Key: Gigabit Ethernet; 2.5 Gbit POS Access; 2.5 Gbit POS core; MPLS; Admin. Domains
Labels: UCL, MCC, MAN; SJ4 Dev (x4); OSM-1OC48-POS-SS (x2)
End systems: PCs (x6), two of them with 3ware RAID0

Page 17: Gridftp Throughput HighSpeedTCP

RAID0 disk tests: 120 Mbytes/s read, 100 Mbytes/s write
Int Coal 64 128
Txqueuelen 2000
TCP buffer 1 Mbyte (rtt*BW = 750 kbytes)

Interface throughput
Data rate: 520 Mbit/s
Same for B2B tests
So it's not that simple!

[Figure: interface throughput, showing data traffic and TCP ACK traffic.]

Page 18: Gridftp Throughput + Web100

Throughput Mbit/s: alternates between 600/800 Mbit/s and zero
Cwnd smooth
No dup Ack / send stall / timeouts

Page 19: http data transfers HighSpeed TCP

Bulk data moved by web servers
Apache web server, out of the box!
Prototype client: curl http library
1 Mbyte TCP buffers
2 Gbyte file
Throughput ~720 Mbit/s
Cwnd: some variation
No dup Ack / send stall / timeouts

Page 20: BaBar Case Study: Disk Performance

BaBar disk server:
  Tyan Tiger S2466N motherboard
  1 64 bit 66 MHz PCI bus
  Athlon MP2000+ CPU
  AMD-760 MPX chipset
  3Ware 7500-8 RAID5
  8 * 200 GB Maxtor IDE 7200 rpm disks
Note the VM parameter readahead max

Disk to memory (read): max throughput 1.2 Gbit/s (150 MBytes/s)
Memory to disk (write): max throughput 400 Mbit/s (50 MBytes/s) [not as fast as RAID0]

Page 21: BaBar Case Study: Throughput & PCI Activity

3Ware forces PCI bus to 33 MHz
BaBar Tyan to MB-NG SuperMicro:
  Network mem-mem 619 Mbit/s
  Disk-disk throughput with bbcp 40-45 Mbytes/s (320-360 Mbit/s)
PCI bus effectively full!

[Figures: PCI activity for read from RAID5 disks and write to RAID5 disks.]

Page 22: Conclusions

The MB-NG Project has achieved:
  Continuous memory to memory data transfers with an average user data rate of 940 Mbit/s for over 24 hours using the HighSpeed TCP stack.
  Sustained high throughput data transfers of 2 GByte files between RAID0 disk systems using Gridftp and bbcp.
  Transfers of 2 GByte files using the http protocol from the standard Apache Web server and HighSpeed TCP that achieved data rates of ~725 Mbit/s.
Ongoing operation and comparison of different Transport Protocols - Optical Switched Networks
Detailed investigation of Routers, NICs & end-host performance.
Working with e-Science groups to get high performance to the user.

Sustained data flows at Gigabit rates are achievable
Use Server quality PCs not Supermarket PCs + care with interfaces
Be kind to the Wizards!

Page 23: More Information: Some URLs

MB-NG project web site: http://www.mb-ng.net/
DataTAG project web site: http://www.datatag.org/
UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
Motherboard and NIC Tests: www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt & http://datatag.web.cern.ch/datatag/pfldnet2003/
TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html & http://www.psc.edu/networking/perf_tune.html

Page 24: Backup Slides

Page 25: EU Review Demo

Consisted of:
  Raid0 Disk and GridFTP at each end
  Data over TCP Streams
  Dante Monitoring
  Node Monitoring
  Site Monitoring

Page 26: Throughput on the day!

Data: ~400 Mbit/s
TCP ACKs

[Figure: throughput on the day, showing data traffic and TCP ACK traffic.]

Page 27: Some Measurements of Throughput CERN-SARA

Using the GÉANT Backup Link
1 GByte file transfers
Plots: blue = data, red = TCP ACKs

Standard TCP (txlen 100, 25 Jan 03): average throughput 167 Mbit/s
  Users see 5 - 50 Mbit/s!
High-Speed TCP (txlen 2000, 26 Jan 03): average throughput 345 Mbit/s
Scalable TCP (txlen 2000, 27 Jan 03): average throughput 340 Mbit/s

[Figures: one plot per stack of interface rate (Mbit/s, 0-500, series "Out Mbit/s") and receive rate (Mbit/s, 0-2, series "In Mbit/s") against time. The time axes run from 1043509370 to 1043509770 (Standard TCP), 1043577520 to 1043577920 (HighSpeed TCP) and 1043678800 to 1043679200 (Scalable TCP).]
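The time axis values on these plots look like Unix epoch seconds. As a quick check, under that assumption, the first tick of each plot can be converted to a UTC date and compared with the dates in the plot titles (25, 26 and 27 Jan 03). The dictionary keys are just labels for this example.

```python
from datetime import datetime, timezone

# First time-axis tick of each plot, assumed to be Unix epoch seconds.
first_ticks = {
    "Standard TCP": 1043509370,
    "HighSpeed TCP": 1043577520,
    "Scalable TCP": 1043678800,
}


def to_utc(epoch_seconds: int) -> datetime:
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


dates = {stack: to_utc(tick).date().isoformat() for stack, tick in first_ticks.items()}
for stack, day in dates.items():
    print(f"{stack}: {day}")
```

The three ticks fall on 2003-01-25, 2003-01-26 and 2003-01-27, matching the plot titles, and each axis covers 400 seconds.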

Page 28: What the Users Really find

CERN-RAL using production GÉANT
CMS Tests, 8 streams
50 Mbit/s @ 15 MB buffer
Firewall 100 Mbit/s
NNW-SJ4 Access: 1 Gbit link

[Figure: CERN-RAL, 12 Dec 02: throughput (Mbit/s, 0-90) against time (0-50, axis label "time 0.5 hr"), showing total rate and rate per stream.]