Top Banner
1 Characterization and Evaluation of TCP and UDP- based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones, Michael Chen, Larry McIntosh, Frank Leers SLAC, Manchester University, Chelsio and Sun Protocols for Fast Long Distance Networks, Lyon, France February, 2005 www.slac.stanford.edu/grp/scs/net/talk05/pfld-feb05.ppt Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM)
20

1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

Dec 13, 2015

Download

Documents

Flora Ross
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

1

Characterization and Evaluation of TCP and UDP-based Transport on Real

Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

Michael Chen, Larry McIntosh, Frank LeersSLAC, Manchester University, Chelsio and Sun

Protocols for Fast Long Distance Networks, Lyon, FranceFebruary, 2005

www.slac.stanford.edu/grp/scs/net/talk05/pfld-feb05.ppt

Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring

(IEPM)

Page 2: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

2

Project goals• Evaluate various techniques for achieving high bulk-

throughput on fast long-distance real production WAN links

• How useful for production: ease of configuration, throughput, convergence, fairness, stability etc.

• For different RTTs• Recommend “optimum” techniques for data intensive

science (BaBar) transfers using bbftp, bbcp, GridFTP• Provide input for validation of simulator & emulator

findings

Page 3: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

3

Techniques rejected• Jumbo frames

– Not an IEEE standard– May break some UDP applications– Not supported on SLAC LAN

• Sender mods only, HENP model is few big senders, lots of smaller receivers– Simplifies deployment, only a few hosts at a few

sending sites– So no Dynamic Right Sizing (DRS)

• Runs on production nets– No router mods (XCP/ECN)

Page 4: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

4

Software Transports• Advanced TCP stacks

– To overcome AIMD congestion behavior of Reno based TCPs

– BUT: • SLAC “datamover” are all based on Solaris, while

advanced TCPs currently are Linux only• SLAC production systems people concerned about non-

standard kernels, ensuring TCP patches keep current with security patches for SLAC supported Linux version

• So also very interested in transport that runs in user space (no kernel mods)– Evaluate UDT from UIC folks

Page 5: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

5

Hardware Assists• For 1Gbits/s paths, cpu, bus etc. not a problem• For 10Gbits/s they are more important• NIC assistance to the CPU is becoming popular

– Checksum offload– Interrupt coalescence– Large send/receive offload (LSO/LRO)– TCP Offload Engine (TOE)

• Several vendors for 10Gbits/s NICs, at least one for 1Gbits/s NIC

• But currently restricts to using NIC vendor’s TCP implementation

• Most focus is on the LAN– Cheap alternative to Infiniband, MyriNet etc.

Page 6: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

6

Protocols Evaluated• TCP (implementations as of April 2004)

– Linux 2.4 New Reno with SACK: single and parallel streams (Reno)

– Scalable TCP (Scalable)– Fast TCP– HighSpeed TCP (HSTCP)– HighSpeed TCP Low Priority (HSTCP-LP)– Binary Increase Control TCP (BICTCP)– Hamilton TCP (HTCP)– Layering TCP (LTCP)

• UDP – UDT v2.

Page 7: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

7

• Chose 3 paths from SLAC– Caltech (10ms), Univ Florida (80ms), CERN

(180ms)

• Used iperf/TCP and UDT/UDP to generate traffic

• Each run was 16 minutes, in 7 regions

Methodology (1Gbit/s)

Ping 1/s

Iperf or UDT

ICMP/ping traffic

TCP/UDP

bottleneck

iperf

SLACCaltech/UFL/CERN

2 mins 4 mins

Page 8: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

8

Behavior Indicators• Achievable throughput

• Stability S= σ/μ (standard deviation/average)

• Intra-protocol fairness F =

n

ii

n

ii n

1

22

1

n

ii

n

ii n

1

22

1

n

ii

n

ii n

1

22

1

Page 9: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

9

Behavior wrt RTT• 10ms (Caltech): Throughput, Stability (small is

good), Fairness minimum (over regions 2 thru 6) (closer to 1 is better)

– Excl. FAST ~ 720±64Mbps, S~0.18±0.04, F~0.95– FAST ~ 400±120Mbps, S=0.33, F~0.88

• 80ms (U. Florida): Throughput, Stability– All ~ 350±103Mbps, S=0.3±0.12, F~0.82

• 180ms (CERN): – All ~ 340±130Mbps, S=0.42±0.17, F~0.81

• The Stability and Fairness effects are more manifest on longer RTT, so focus on CERN

Page 10: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

10

Reno single stream• Low performance on fast long distance paths

– AIMD (add a=1 pkt to cwnd / RTT, decrease cwnd by factor b=0.5 in congestion)

– Net effect: recovers slowly, does not effectively use available bandwidth, so poor throughput

• Remaining flows do not take up slack when flow removed

Congestion has a dramatic effect

Recovery is slow

Multiple streams increase recovery rate

SLAC to CERN

RTT increases when achieves best throughput

Page 11: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

11

Fast • Also uses RTT to detect congestion

– RTT is very stable: σ(RTT) ~ 9ms vs 37±0.14ms for the others

SLAC-CERN

Big drops in throughput which take several seconds to recover from

2nd flow never gets equal share of bandwidth

Page 12: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

12

HTCP• One of the best performers

– Throughput is high– Big effects on RTT when achieves best throughput– Flows share equally

Appears to need >1 flow toachieve best throughput

Two flows share equally

SLAC-CERN

> 2 flows appears less stable

Page 13: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

13

BICTCP• Needs > 1 flow for best throughput

Page 14: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

14

UDTv2• Similar behavior to better TCP stacks

– RTT very variable at best throughputs– Intra-protocol sharing is good– Behaves well as flows add & subtract

Page 15: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

15

Overall ProtoAvg thru (Mbps)

S (σ/μ)

min (F)

σ (RTT)

MHz/ Mbps

Scal. 423±115 0.27 0.83 22 0.64

BIC 412±117 0.28 0.98 55 0.71

HTCP 402±113 0.28 0.99 57 0.65

UDT 390±136 0.35 0.95 49 1.2

LTCP 376±137 0.36 0.56 41 0.67

Fast 335±110 0.33 0.58 9 0.66

HSTCP 255±187 0.73 0.79 25 0.9

Reno 248±163 0.66 0.6 22 0.63

HSTCP-LP 228±114 0.5 0.64 33 0.65

Scalable is one of best, but inter-protocol fairness is poor (see Bullot et al.)BIC & HTCP are about equalUDT is close, BUT cpu intensive (factor of 2, used to be >factor of 10 worse)Fast gives low RTT values & variabilityAll TCP protocols use similar cpu (HSTCP looks poor because throughput low)

Page 16: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

16

10Gbps tests• At SC2004 using two 10Gbps dedicated paths

between Pittsburgh and Sunnyvale– Using Solaris 10 (build 69) and Linux 2.6– On Sunfire Vx0z (dual & quad 2.4GHz 64 bit AMD

Opterons) with PCI-X 133MHz 64 bit– Only 1500 Byte MTUs

• Achievable performance limits (using iperf)– Reno TCP (multi-flows) vs UDTv2, – TOE (Chelsio) vs no TOE (S2io)

Page 17: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

17

Results• UDT limit was ~ 4.45Gbits/s

– Cpu limited

• TCP Limit was about 7.5±0.07 Gbps, regardless of:– Whether LAN (back to back) or WAN

• WAN used 2MB window & 16 streams

– Whether Solaris 10 or Linux 2.6– Whether S2io or Chelsio NIC

• Gating factor=PCI-X – Raw bandwidth 8.53Gbps– But transfer broken into segments to allow interleaving– E.g. with max memory read byte count of 4096Bytes with Intel

Pro/10GbE LR NIC limit is 6.83Gbits/s

• One host with 4 cpus & 2 NICs sent 11.5±0.2Gbps to two dual cpu hosts with 1 NIC each

• Two hosts to two hosts (1 NIC/host) 9.07Gbps goodput forward & 5.6Gbps reverse

Page 18: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

18

TCP CPU Utilization• CPU power important• Each cpu=2.4GHz• Throughput increases with

flows• Util. not linear(throughput) • Depends on flows too

Chelsio(TOE)

• Normalize GHz/Gbps• Chelsio + TOE + Linux 2.6.6• S2io + CKS offload + Sol10

– S2io supports LSO but Sol10 did not, so not used

– Microsoft reports 0.017GHz/Gbps with Windows+S2io/LSO, 1 flow

Page 19: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

19

Conclusions• Need testing on real networks

– Controlled simulation & emulation critical for understanding

– BUT need to verify, and results look different than expected (e.g. Fast)

• Most important for transoceanic paths• UDT looks promising, still needs work for >

6Gbits/s• Need to evaluate various offloads (TOE, LSO ...)

• Need to repeat inter-protocol fairness vs Reno• New buses important, need NICs to support

then evaluate

Page 20: 1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones,

20

Further Information• Web site with lots of plots & analysis

– www.slac.stanford.edu/grp/scs/net/papers/pfld05/ruchig/Fairness/

• Inter-protocols comparison (Journal of Grid Comp, PFLD04)– www.slac.stanford.edu/cgi-wrap/getdoc/slac-pub-10402.pdf

• SC2004 details– www-iepm.slac.stanford.edu/monitoring/bulk/sc2004/