Copyright © 2005 Department of Computer Science 1 Solving the TCP-incast Problem with Application-Level Scheduling Maxim Podlesny, University of Waterloo Carey Williamson, University of Calgary
Solving the TCP-incast Problem with Application-Level Scheduling

Jan 31, 2016

Page 1: Solving the TCP-incast Problem with Application-Level Scheduling

Solving the TCP-incast Problem with Application-Level Scheduling

Maxim Podlesny, University of Waterloo
Carey Williamson, University of Calgary

Page 2: Solving the TCP-incast Problem with Application-Level Scheduling

Motivation

• Emerging IT paradigms
– Data centers, grid computing, HPC, multi-core
– Cluster-based storage systems, SAN, NAS
– Large-scale data management “in the cloud”
– Data manipulation via “services-oriented computing”

• Cost and efficiency advantages from IT trends, economy of scale, specialization marketplace

• Performance advantages from parallelism
– Partition/aggregation, MapReduce, BigTable, Hadoop
– Think RAID at Internet scale! (1000x)

Page 3: Solving the TCP-incast Problem with Application-Level Scheduling

Problem Statement

• High-speed, low-latency network (RTT ≤ 0.1 ms)
• Highly-multiplexed link (e.g., 1000 flows)
• Highly-synchronized flows on bottleneck link
• Limited switch buffer size (e.g., 32 KB)

How to provide high goodput for data center applications?

TCP retransmission timeouts

TCP throughput degradation

Page 4: Solving the TCP-incast Problem with Application-Level Scheduling

Related Work

• E. Krevat et al., “On Application-based Approaches to Avoiding TCP Throughput Collapse in Cluster-based Storage Systems”, Proceedings of SuperComputing 2007

• A. Phanishayee et al., “Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems”, Proceedings of FAST 2008

• Y. Chen et al., “Understanding TCP Incast Throughput Collapse in Datacenter Networks”, WREN 2009

• V. Vasudevan et al., “Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication”, Proceedings of ACM SIGCOMM 2009

• M. Alizadeh et al., “Data Center TCP”, Proc. ACM SIGCOMM 2010

• A. Shpiner et al., “A Switch-based Approach to Throughput Collapse and Starvation in Data Centers”, IWQoS 2010

Page 5: Solving the TCP-incast Problem with Application-Level Scheduling

Summary of Related Work

• Data centers have specific network characteristics

• TCP-incast throughput collapse problem emerges

• Possible solutions:

– Tweak TCP timers and/or parameters for this environment

– Redesign (or replace!) TCP in this environment

– Rewrite applications for this environment (Facebook)

– Increase switch buffer sizes (extra queueing delay!)

– Smart edge coordination for uploads/downloads

Page 6: Solving the TCP-incast Problem with Application-Level Scheduling

Data Center System Model

[Figure labels: N servers (1, 2, 3, ..., N); logical data block (S), e.g., 1 MB; Server Request Unit (SRU), e.g., 32 KB; packet size S_DATA; switch with small buffer B; link capacity C; client]

Page 7: Solving the TCP-incast Problem with Application-Level Scheduling

Performance Comparisons

Internet vs. data center network:
• Internet propagation delay: 10-100 ms
• Data center propagation delay: 0.1 ms
• Packet size 1 KB, link capacity 1 Gbps -> packet transmission time is about 0.01 ms
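
The transmission-time figure can be checked with a short calculation. This is only a sketch: it assumes that 1 KB means 1000 bytes and 1 Gbps means 10^9 bits per second, and the function name is illustrative rather than taken from the slides.

```python
def transmission_time_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to serialize one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


# Values from the slide, under the assumptions stated above:
# 1 KB packets on a 1 Gbps link.
tx_ms = transmission_time_ms(1000, 1e9)

# Data center propagation delay from the slide: 0.1 ms.
packets_per_propagation_delay = 0.1 / tx_ms

print(f"transmission time: {tx_ms:.3f} ms")
print(f"packet times per propagation delay: {packets_per_propagation_delay:.1f}")
```

Under these assumptions the transmission time is 0.008 ms, which rounds to the 0.01 ms on the slide. The data center propagation delay is then only about a dozen packet times, while a 10-100 ms Internet delay is more than a thousand.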

Page 8: Solving the TCP-incast Problem with Application-Level Scheduling

Analysis Overview (1 of 2)

• Determine maximum TCP flow concurrency (n) that can be supported without any packet loss

• Arrange the servers into k groups of (at most) n servers each, by staggering the group scheduling

Page 9: Solving the TCP-incast Problem with Application-Level Scheduling

Analysis Overview (2 of 2)

• Determine maximum TCP flow concurrency (n) that can be supported without any packet loss

– Determine flow size in packets (based on SRU and MSS)

– Determine maximum outstanding packets per flow (Wmax)

– Determine max flow concurrency (based on B and Wmax)

• Arrange the servers into k groups of (at most) n servers each, by staggering the group scheduling
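
The steps in this overview can be sketched in code. The slides name the inputs (SRU, MSS, B, Wmax) but this transcript does not give the exact expressions, so the buffer rule used here, n = floor(B / (Wmax × packet size)), is an assumption made for illustration, and all function names are hypothetical. Wmax is taken as an input.

```python
def flow_size_packets(sru_bytes: int, mss_bytes: int) -> int:
    """Packets needed to carry one SRU (the last packet may be partial)."""
    return (sru_bytes + mss_bytes - 1) // mss_bytes


def max_concurrency(buffer_bytes: int, wmax_packets: float, packet_bytes: int) -> int:
    """ASSUMPTION (not from the slides): n flows are lossless when
    n * Wmax packets fit in the switch buffer B. At least one flow is allowed."""
    return max(1, int(buffer_bytes // (wmax_packets * packet_bytes)))


def num_groups(total_servers: int, n: int) -> int:
    """Smallest k such that k groups of at most n servers cover all N servers."""
    return (total_servers + n - 1) // n


# Illustrative values: 32 KB SRU and 32 KB buffer (examples from the slides),
# 1 KB packets, and a made-up Wmax of 8 packets.
packets = flow_size_packets(32 * 1024, 1024)
n = max_concurrency(32 * 1024, 8, 1024)
k = num_groups(10, n)
print(packets, n, k)
```

The grouping step follows directly from the slide: with at most n servers per group, N servers need k = ceil(N / n) groups.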

Page 10: Solving the TCP-incast Problem with Application-Level Scheduling

Determining Wmax

• Recall TCP slow start dynamics:

– Initial TCP congestion window (cwnd) is 1 packet

– ACKs cause cwnd to double every RTT (1, 2, 4, 8, 16, ...)

• Consider TCP transfer of an arbitrary SRU (e.g., 21)

• Determine peak power-of-2 cwnd value (WA)

• Determine “residual window” for the last RTT (WB)

• Wmax depends on both WA and WB (e.g., WA + WB/2)
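
A small sketch of this calculation under idealized slow start (initial cwnd of 1 packet, doubling every RTT, no losses). Reading the "21" in the example as an SRU of 21 packets is an assumption, the function names are hypothetical, and WA + WB/2 is used only because the slide offers it as an example rule.

```python
def slow_start_windows(flow_packets: int) -> tuple[int, int]:
    """Return (WA, WB) for a flow of `flow_packets` packets.

    WA is the largest full power-of-2 window sent during slow start;
    WB is the residual window left for the last RTT (0 if none).
    """
    if flow_packets < 1:
        raise ValueError("flow must contain at least one packet")
    sent, cwnd, w_a = 0, 1, 0
    # Send full windows 1, 2, 4, ... while a whole window still fits.
    while sent + cwnd <= flow_packets:
        sent += cwnd
        w_a = cwnd
        cwnd *= 2
    w_b = flow_packets - sent
    return w_a, w_b


def wmax_estimate(flow_packets: int) -> float:
    """Example rule from the slide: Wmax = WA + WB / 2."""
    w_a, w_b = slow_start_windows(flow_packets)
    return w_a + w_b / 2


w_a, w_b = slow_start_windows(21)
wmax = wmax_estimate(21)
print(w_a, w_b, wmax)
```

For 21 packets the full windows are 1, 2, 4, and 8 (15 packets in total), which leaves a residual of 6 packets, so WA = 8, WB = 6, and the example rule gives Wmax = 11.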

Page 11: Solving the TCP-incast Problem with Application-Level Scheduling

Scheduling Overview

[Figure: the N servers arranged into groups of n servers each]
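
The grouping can be illustrated with a short sketch. The slides do not say which servers go into which group; placing consecutive server indices in the same group is an assumption made here, and the function names are hypothetical.

```python
def group_of(server: int, n: int) -> int:
    """1-based group index for a 1-based server index, with n servers per group."""
    return (server - 1) // n + 1


def schedule_groups(total_servers: int, n: int) -> list[list[int]]:
    """Partition servers 1..N into groups of at most n servers each."""
    groups: dict[int, list[int]] = {}
    for server in range(1, total_servers + 1):
        groups.setdefault(group_of(server, n), []).append(server)
    return [groups[j] for j in sorted(groups)]


# Example: N = 7 servers with at most n = 3 responding at once.
groups = schedule_groups(7, 3)
print(groups)
```

Only the last group can be smaller than n, so the number of groups is k = ceil(N / n).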

Page 12: Solving the TCP-incast Problem with Application-Level Scheduling

Scheduling Details

Using lossless scheduling of server responses: at most n servers respond simultaneously, and the responding servers are scheduled in k groups

Server i (1 <= i <= N) starts responding at:

Page 13: Solving the TCP-incast Problem with Application-Level Scheduling

Theoretical Results

Maximum goodput of an application in a data center with lossless scheduling is:

$$g = \frac{S}{(k-1)\,\tilde{T} + T + d_{\max}}$$

where:
• $S$ - size of a logical data block
• $T$ - actual completion time of an SRU
• $\tilde{T}$ - SRU completion time used for scheduling
• $k$ - how many groups of servers to use
• $d_{\max}$ - real system scheduling variance
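
A minimal numerical sketch, reading the goodput expression as g = S / ((k - 1)·T̃ + T + d_max). The helper `group_start_offset` encodes an assumption that is merely consistent with the (k - 1)·T̃ term, namely that group j starts (j - 1)·T̃ after the first group; it is not the authors' stated start-time rule. All names and input values are hypothetical.

```python
def lossless_goodput(block_bytes: float, t_actual: float, t_sched: float,
                     k: int, d_max: float) -> float:
    """Goodput in bytes per second: S / ((k - 1) * T~ + T + d_max).

    block_bytes: S, size of the logical data block
    t_actual:    T, actual completion time of an SRU (seconds)
    t_sched:     T~, SRU completion time used for scheduling (seconds)
    k:           number of server groups
    d_max:       scheduling variance of the real system (seconds)
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return block_bytes / ((k - 1) * t_sched + t_actual + d_max)


def group_start_offset(j: int, t_sched: float) -> float:
    """ASSUMPTION: group j (1-based) starts (j - 1) * T~ after the first group."""
    return (j - 1) * t_sched


# Made-up inputs: 1 MB block, 8 groups, T = T~ = 1.1 ms, d_max = 0.1 ms.
g = lossless_goodput(1_000_000, 0.0011, 0.0011, 8, 0.0001)
print(f"goodput: {g / 1e6:.1f} MB/s")
```

With these made-up inputs the denominator is 8.9 ms, giving roughly 112 MB/s (about 0.9 Gbps). Adding groups or increasing T̃ lengthens the schedule and lowers the goodput.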

Page 14: Solving the TCP-incast Problem with Application-Level Scheduling

Solution Analytical Model Results

Page 15: Solving the TCP-incast Problem with Application-Level Scheduling

Results for 10 KB Fixed SRU Size (1 of 2)

Page 16: Solving the TCP-incast Problem with Application-Level Scheduling

Results for 10 KB Fixed SRU Size (2 of 2)

Page 17: Solving the TCP-incast Problem with Application-Level Scheduling

Results for Varied SRU Size (1 MB / N)

Page 18: Solving the TCP-incast Problem with Application-Level Scheduling

Effect of TCP Timer Granularity

Page 19: Solving the TCP-incast Problem with Application-Level Scheduling

Summary and Conclusion

Application-level scheduling for TCP-incast throughput collapse

Main idea: schedule the servers' responses so that no packets are lost

Maximum goodput with lossless scheduling

Non-monotonic goodput, highly sensitive to network configuration parameters

Page 20: Solving the TCP-incast Problem with Application-Level Scheduling

Future Work

Implementing and testing our solution in real data centers

Evaluating our solution for different application traffic scenarios