Page 1: FAST TCP in Linux Cheng Jin David Wei .

FAST TCP in Linux

Cheng Jin David Wei

http://netlab.caltech.edu/FAST/WANinLab/nsfvisit

Page 2: FAST TCP in Linux Cheng Jin David Wei .

netlab.caltech.edu

Outline

Overview of FAST TCP.

Implementation Details.

SC2002 Experiment Results.

FAST Evaluation and WAN-in-Lab.

Page 3: FAST TCP in Linux Cheng Jin David Wei .

FAST vs. Linux TCP

Distance = 10,037 km; Delay = 180 ms; MTU = 1500 B; Duration: 3600 s

Linux TCP Experiments: Jan 28-29, 2003

Flows  Protocol                  Bmps (peta)  Throughput (Mbps)  Transfer (GB)
1      Linux TCP, txqlen=100     1.86         185                78
1      Linux TCP, txqlen=10000   2.67         266                111
1      FAST (19.11.2002)         9.28         925                387
2      Linux TCP, txqlen=100     3.18         317                133
2      Linux TCP, txqlen=10000   9.35         931                390
2      FAST (19.11.2002)         18.03        1,797              753
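The Bmps and transfer figures on this slide follow from throughput, distance, and duration. The sketch below shows that arithmetic. It assumes the transfer column counts gigabytes as 2^30 bytes, an assumption that reproduces the slide's numbers to within rounding; the function names are illustrative only.

```python
DISTANCE_M = 10_037 * 1000   # path length from the slide, in meters
DURATION_S = 3600            # experiment duration from the slide, in seconds


def bmps_peta(throughput_mbps: float, distance_m: float = DISTANCE_M) -> float:
    """Bit-meters per second, expressed in units of 10^15 (peta)."""
    return throughput_mbps * 1e6 * distance_m / 1e15


def transfer_gb(throughput_mbps: float, duration_s: float = DURATION_S) -> float:
    """Data moved at a constant rate, in units of 2^30 bytes (assumed unit)."""
    return throughput_mbps * 1e6 * duration_s / 8 / 2**30


if __name__ == "__main__":
    # Throughput values are the FAST rows reported on the slide.
    for label, mbps in [("FAST, 1 flow", 925), ("FAST, 2 flows", 1797)]:
        print(f"{label}: {bmps_peta(mbps):.2f} Pbm/s, {transfer_gb(mbps):.1f} GB")
```

Because the slide's throughput values are already rounded, the recomputed figures can differ from the reported ones in the last digit.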

Page 4: FAST TCP in Linux Cheng Jin David Wei .

Aggregate Throughput

[Figure: aggregate throughput for Linux TCP (txq=100), Linux TCP (txq=10000), and FAST, on 1G and 2G scales.]

Standard MTU. Utilization averaged over 1 hr.

Average utilization

      Linux TCP (txq=100)  Linux TCP (txq=10000)  FAST
1G    19%                  27%                    95%
2G    16%                  48%                    92%

Page 5: FAST TCP in Linux Cheng Jin David Wei .

Summary of Changes

RTT estimation: fine-grain timer.

Fast convergence to equilibrium.

Delay monitoring in equilibrium.

Pacing: reducing burstiness.

Page 6: FAST TCP in Linux Cheng Jin David Wei .

FAST TCP Flow Chart

[Flow chart states: Slow Start, Fast Convergence, Equilibrium, Loss Recovery, Normal Recovery, Time-out.]

Page 7: FAST TCP in Linux Cheng Jin David Wei .

RTT Estimation

Measure queueing delay.

Kernel timestamp with μs resolution.

Use SACK to increase the number of RTT samples during recovery.

Exponential averaging of RTT samples to increase robustness.
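The averaging and queueing-delay steps can be sketched as follows. This is a minimal illustration, not the kernel code: the smoothing weight is a hypothetical value, and using the minimum observed RTT as the propagation-delay baseline is an assumption not stated on the slide.

```python
class RttEstimator:
    """Tracks a smoothed RTT, the minimum RTT seen, and their difference."""

    def __init__(self, weight: float = 0.125):
        # `weight` is a hypothetical smoothing factor, not taken from the slides.
        self.weight = weight
        self.avg_rtt = None    # exponentially averaged RTT, in seconds
        self.base_rtt = None   # minimum observed RTT (assumed baseline)

    def add_sample(self, rtt: float) -> None:
        """Fold one RTT sample (seconds) into the running estimates."""
        if self.avg_rtt is None:
            self.avg_rtt = rtt
            self.base_rtt = rtt
            return
        self.avg_rtt = (1 - self.weight) * self.avg_rtt + self.weight * rtt
        self.base_rtt = min(self.base_rtt, rtt)

    def queueing_delay(self) -> float:
        """Estimated queueing delay: smoothed RTT minus baseline RTT."""
        if self.avg_rtt is None:
            return 0.0
        return self.avg_rtt - self.base_rtt
```

A smaller weight makes the estimate less sensitive to a single noisy sample, at the cost of reacting more slowly to real changes in delay.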

Page 8: FAST TCP in Linux Cheng Jin David Wei .

Fast Convergence

Rapidly increase or decrease cwnd toward equilibrium.

Monitor the per-ack queueing delay to avoid overshoot.

Page 9: FAST TCP in Linux Cheng Jin David Wei .

Equilibrium

Vegas-like cwnd adjustment in large time-scale -- per RTT.

Small step-size to maintain stability in equilibrium.

Per-ack delay monitoring to enable timely detection of changes in equilibrium.
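As an illustration of a Vegas-like per-RTT adjustment with a small step size, the sketch below uses the window update published for FAST TCP, w <- min(2w, (1 - gamma) w + gamma (baseRTT/RTT * w + alpha)). The slides do not state this formula, so treat it as an assumption here; the alpha and gamma defaults are hypothetical.

```python
def fast_window_update(cwnd: float, base_rtt: float, avg_rtt: float,
                       alpha: float = 200.0, gamma: float = 0.5) -> float:
    """Return the next congestion window, in packets.

    alpha: target number of packets queued in the network (hypothetical value).
    gamma: step size in (0, 1]; smaller values move more cautiously.
    """
    # Window that would leave `alpha` packets queued at the current delay.
    target = (base_rtt / avg_rtt) * cwnd + alpha
    # Move only a fraction gamma of the way toward the target.
    smoothed = (1.0 - gamma) * cwnd + gamma * target
    # Never more than double the window in one update.
    return min(2.0 * cwnd, smoothed)
```

The window stops changing when cwnd * (1 - base_rtt / avg_rtt) equals alpha, that is, when the flow keeps about alpha packets queued along the path.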

Page 10: FAST TCP in Linux Cheng Jin David Wei .

Pacing

What do we pace? Increment to cwnd.

Time-driven vs. event-driven.

Trade-off between complexity and performance.

Timer resolution is important.

Page 11: FAST TCP in Linux Cheng Jin David Wei .

Time-Based Pacing

cwnd increments are scheduled at fixed intervals.

[Figure: timeline of data and ack packets.]
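One way to picture fixed-interval scheduling is to split a pending cwnd increment evenly across timer ticks. The sketch below is a simplified model under that assumption; the function name and parameters are illustrative and do not come from the FAST implementation.

```python
def schedule_increments(total_increment: int, interval: float, steps: int):
    """Spread `total_increment` packets of cwnd growth over `steps` timer ticks.

    Returns a list of (time_offset_seconds, packets) pairs; ticks that would
    add zero packets are omitted.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    base, extra = divmod(total_increment, steps)
    schedule = []
    for i in range(steps):
        # The first `extra` ticks carry one additional packet each.
        packets = base + (1 if i < extra else 0)
        if packets:
            schedule.append(((i + 1) * interval, packets))
    return schedule
```

The finer the timer interval, the smaller each step can be, which is why timer resolution matters for this approach.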

Page 12: FAST TCP in Linux Cheng Jin David Wei .

Event-Based Pacing

Detect sufficiently large gap between consecutive bursts and delay cwnd increment until the end of each such burst.
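The gap-detection step can be sketched offline on a list of ack arrival times. This is an illustrative model only: the threshold is a hypothetical parameter, and a live sender would learn that a burst has ended only once the gap has actually elapsed.

```python
def burst_end_times(ack_times, gap_threshold):
    """Return the arrival time of the last ack in each burst.

    A new burst starts whenever the gap since the previous ack exceeds
    `gap_threshold`. In this model, a deferred cwnd increment would be
    applied at each returned time.
    """
    ends = []
    previous = None
    for t in ack_times:
        if previous is not None and t - previous > gap_threshold:
            ends.append(previous)  # the previous ack closed a burst
        previous = t
    if previous is not None:
        ends.append(previous)      # the trace ends with the final burst
    return ends
```

For a trace with acks at 0, 1, 2, 50, and 51 ms and a 10 ms threshold, the function reports two bursts, ending at 2 ms and 51 ms.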

Page 13: FAST TCP in Linux Cheng Jin David Wei .

SCinet Caltech-SLAC experiments

netlab.caltech.edu/FAST

SC2002 Baltimore, Nov 2002

C. Jin, D. Wei, S. Low

FAST Team and Partners

Experiment

[Map: Sunnyvale, Baltimore, Chicago, Geneva; link lengths 3000 km, 1000 km, 7000 km.]

Theory

Internet: distributed feedback system.

[Block diagram labels: TCP, AQM, Rf(s), Rb'(s), x, p.]

Implementation

Sender-side modification.

Delay based.

Highlights

FAST TCP, standard MTU.

Peak window = 14,255 pkts.

Throughput averaged over > 1 hr.

925 Mbps single flow/GE card: 9.28 petabit-meter/sec, 1.89 times LSR.

8.6 Gbps with 10 flows: 34.0 petabit-meter/sec, 6.32 times LSR.

21 TB in 6 hours with 10 flows.

[Chart: FAST vs. I2 LSR by #flows (1, 2; 1, 2, 7, 9, 10) for Geneva-Sunnyvale and Baltimore-Sunnyvale.]

Page 14: FAST TCP in Linux Cheng Jin David Wei .

Network

(Sylvain Ravot, caltech/CERN)

Page 15: FAST TCP in Linux Cheng Jin David Wei .

FAST BMPS

[Chart: FAST vs. Internet2 Land Speed Record by #flows (1, 2; 1, 2, 7, 9, 10) for Geneva-Sunnyvale and Baltimore-Sunnyvale.]

FAST, standard MTU. Throughput averaged over > 1 hr.

Page 16: FAST TCP in Linux Cheng Jin David Wei .

Aggregate Throughput

FAST, standard MTU. Utilization averaged over > 1 hr.

Flows  Average utilization  Duration
1      95%                  1 hr
2      92%                  1 hr
7      90%                  6 hr
9      90%                  1.1 hr
10     88%                  6 hr

Page 17: FAST TCP in Linux Cheng Jin David Wei .

Caltech-SLAC Entry

Rapid recovery after possible hardware glitch.

Power glitch.

Reboot.

100-200 Mbps ACK traffic.

Page 18: FAST TCP in Linux Cheng Jin David Wei .

Acknowledgments

Prototype: C. Jin, D. Wei

Theory: D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)

Experiment/facilities

Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh

CERN: O. Martin, P. Moroni

Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip

DataTAG: E. Martelli, J. P. Martin-Flatin

Internet2: G. Almes, S. Corbato

Level(3): P. Fernes, R. Struble

SCinet: G. Goddard, J. Patton

SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams

StarLight: T. deFanti, L. Winkler

TeraGrid: L. Winkler

Major sponsors: ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF

Page 19: FAST TCP in Linux Cheng Jin David Wei .

Evaluating FAST

End-to-End monitoring doesn’t tell the whole story.

Existing network emulation (dummynet) is not always enough.

Better optimization if we can look inside and understand the real network.

Page 20: FAST TCP in Linux Cheng Jin David Wei .

Dummynet and Real Testbed

Page 21: FAST TCP in Linux Cheng Jin David Wei .

Dummynet Issues

Not running on a real-time OS -- imprecise timing.

Lack of priority scheduling of dummynet events.

Bandwidth fluctuates significantly with workload.

Much work needed to customize dummynet for protocol testing.

Page 22: FAST TCP in Linux Cheng Jin David Wei .

10 GbE Experiment

Long-distance testing of Intel 10GbE cards.

Sylvain Ravot (Caltech) achieved 2.3 Gbps using a single stream with jumbo frames and stock Linux TCP.

Tested HSTCP, Scalable TCP, FAST, and stock TCP under Linux.

1500B MTU: 1.3 Gbps SNV -> CHI; 9000B MTU: 2.3 Gbps SNV -> GVA

Page 23: FAST TCP in Linux Cheng Jin David Wei .

TCP Loss Mystery

Frequent packet loss with 1500-byte MTU. None with larger MTUs.

Packet loss even when cwnd is capped at 300 - 500 packets.

Routers have a large queue size of 4000 packets.

Packets captured at both sender and receiver using tcpdump.

Page 24: FAST TCP in Linux Cheng Jin David Wei .

How Did the Loss Happen?

[Figure annotation: loss detected.]

Page 25: FAST TCP in Linux Cheng Jin David Wei .

How Can WAN-in-Lab Help?

We will know exactly where packets are lost.

We will also know the sequence of events (packet arrivals) that leads to loss.

We can either fix the problem in the network, if there is one, or improve the protocol.

Page 26: FAST TCP in Linux Cheng Jin David Wei .

Conclusion

FAST improves the end-to-end performance of TCP.

Many issues are still to be understood and resolved.

WAN-in-Lab can help make FAST a better protocol.