
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014

Jul 02, 2015


Tuning your EC2 web server helps improve application server throughput and cost-efficiency and reduces request latency. In this session we walk through tactics to identify bottlenecks using tools such as CloudWatch in order to drive the appropriate allocation of EC2 and EBS resources. We also review performance optimizations and best practices for popular web servers such as Nginx and Apache that take advantage of the latest EC2 capabilities.

HTTP


• Optimize the web server stack


• Remember: optimizations by definition are app-specific


CloudWatch

[Chart: average request size over time, 10:00 to 10:15]

Filters


[Chart: latency at percentile vs. average latency]

[Chart: latency histogram (frequency by latency)]
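The percentile and histogram charts make a point that is easy to reproduce: an average can look healthy while the tail is slow. The following is a minimal sketch, not taken from the talk; the sample latencies and the function names are hypothetical, and it uses the nearest-rank percentile definition.

```python
import math
import statistics


def percentile(sorted_values, p):
    """Nearest-rank percentile (p is an integer 1..100) of a sorted list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def latency_summary(latencies_ms):
    """Return the mean next to a few percentiles for comparison."""
    ordered = sorted(latencies_ms)
    return {
        "mean": statistics.mean(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }


# Hypothetical sample: 98 fast requests and 2 very slow ones.
samples = [3.0] * 98 + [250.0] * 2
summary = latency_summary(samples)
print(summary)
```

With this made-up sample the mean stays under 8 ms while the 99th percentile is 250 ms, which is the kind of gap the percentile view is meant to expose.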


[Chart: response_processing_time, request_processing_time, backend_processing_time]
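The three field names in this chart match the timing fields of classic ELB access logs. As a rough sketch of how they could be pulled out of a log line, assuming the space-separated classic ELB format (timestamp, load balancer name, client, backend, the three timings, status codes, byte counts, quoted request): the sample line and the helper name are illustrative only.

```python
import shlex

TIMING_FIELDS = (
    "request_processing_time",
    "backend_processing_time",
    "response_processing_time",
)


def parse_elb_timings(line):
    """Return (HTTP method, timings dict) from one classic ELB log line."""
    fields = shlex.split(line)  # shlex keeps the quoted request as one field
    timings = {
        name: float(value) for name, value in zip(TIMING_FIELDS, fields[4:7])
    }
    method = fields[11].split(" ", 1)[0]
    return method, timings


# Illustrative line in the assumed format, not real traffic.
sample = (
    '2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 '
    '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -'
)
method, timings = parse_elb_timings(sample)
print(method, timings)
```

Grouping the parsed timings by method would give the per-type averages plotted on the next slide.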


[Chart: average latency by type (GET, POST)]

[Chart: average latency (total)]


• Whatever makes most sense to you!


Justin Lintz


Who am I?

• Senior Web Operations Engineer at Chartbeat

• Previously worked at

– Bitly

– TheStreet.com

– Corsis

@lintzston


Chartbeat measures and monetizes attention on the web. Working with 80% of the top US news sites and global media sites in 50 countries, Chartbeat brings together editors and advertisers to identify in real time the active time an audience consumes articles, videos, paid content, and display advertising.


http://chartbeat.com/publishing/demo


• 400–500 servers

• Peak traffic: 275,000 requests/second

• 11–12 million concurrent users across all sites in our network


http://chartbeat.com/totaltotal


Traffic characteristics

Every 15 seconds

213-byte request + headers

43-byte response size
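For a sense of scale, these message sizes can be combined with the peak rate of 275,000 requests/second quoted earlier. This is a back-of-the-envelope sketch only: it assumes the 43-byte figure is the whole response and ignores TCP/IP framing overhead, neither of which the slides state.

```python
PEAK_REQUESTS_PER_SECOND = 275_000  # peak rate from the earlier slide
REQUEST_BYTES = 213                 # request + headers
RESPONSE_BYTES = 43                 # response size (assumed to be the full response)


def megabits_per_second(bytes_per_message, messages_per_second):
    """Application-level throughput in Mbit/s, ignoring protocol overhead."""
    return bytes_per_message * messages_per_second * 8 / 1_000_000


inbound = megabits_per_second(REQUEST_BYTES, PEAK_REQUESTS_PER_SECOND)
outbound = megabits_per_second(RESPONSE_BYTES, PEAK_REQUESTS_PER_SECOND)
print(f"inbound ~{inbound:.1f} Mbit/s, outbound ~{outbound:.1f} Mbit/s")
```

Under these assumptions the workload is dominated by the number of small requests rather than by bandwidth, which is why the rest of the talk concentrates on per-connection and per-request overhead.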


Logs


Logging not “free”

Sequential writes are fast

Logs grow and then...


What do you do with them?

• Rotate

• Compress

• Ship them elsewhere?

All impact latency of your requests!


Gzip impact on request latency

● 8 GB file

● Default GZIP compression settings

● EXT4

● C3.xlarge on SSD ephemeral storage


Simple tweaks


Hourly rotate

• Logrotate doesn’t support this out of the box

0 * * * * /usr/sbin/logrotate -f /etc/logrotate.d/nginx > /dev/null 2>&1

Goal: smaller latency spikes spread throughout the day


Avoid compression

• But if you must, use

– LZ4

– LZO

– Snappy

An order of magnitude faster than gzip or bzip2, at a fraction of the CPU
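The CPU cost of compressing logs is easy to measure on your own data. The sketch below is a stand-in rather than a benchmark from the talk: LZ4, LZO, and Snappy are not in the Python standard library, so it only times zlib (the DEFLATE implementation behind gzip) at a fast level, the default level 6, and the maximum level. The log line is a hypothetical example.

```python
import time
import zlib


def time_compression(data, level):
    """Compress data at the given zlib level; return (seconds, size, blob)."""
    start = time.perf_counter()
    blob = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return elapsed, len(blob), blob


# Hypothetical access-log-like payload, repeated to a few megabytes.
line = (
    b'203.0.113.7 - - [12/Nov/2014:10:00:00 +0000] '
    b'"GET /ping?h=example.com HTTP/1.1" 200 43\n'
)
data = line * 50_000

for level in (1, 6, 9):
    elapsed, size, _ = time_compression(data, level)
    print(f"level={level} seconds={elapsed:.4f} bytes={size}")
```

Timings vary by machine and by how repetitive the logs are, so treat the output as a relative comparison only.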


Extent-based file system

EXT4 or XFS


SSD

• GP2 Amazon EBS volumes

• New generation Amazon EC2 instance types

– C3

– M3

– R3

– I2


More involved tweaks


Stream logs via Syslog

• Max 1 KB line length per RFC 3164

• Only supported in Nginx 1.7.1+

• Apache supported via CustomLog piping to logger


Only log at load balancer

• Only one side of picture

• Can’t log custom headers or format logs

• Logs are delayed


Pull node on rotate

• Using prerotate/postrotate in logrotate

– Pull node from ELB via API and place back on completion

• Requires staggering nodes

• Probably not worth the effort?


Sysctl tweaks


Listen queue backlog

net.core.somaxconn = 128 (should be larger, because it caps the values below)

Apache: ListenBackLog 511

Nginx: listen backlog=511


man listen(2)

If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn, then it is silently truncated to that value; the default value in this file is 128. In kernels before 2.4.25, this limit was a hard-coded value, SOMAXCONN, with the value 128.
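The truncation described in the man page is just a minimum of the two values. A minimal sketch follows; the helper names are hypothetical, and the fallback of 128 is only used when the /proc file cannot be read (for example on a non-Linux machine).

```python
from pathlib import Path

SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def effective_backlog(requested, somaxconn):
    """Backlog the kernel actually uses after silent truncation."""
    return min(requested, somaxconn)


def read_somaxconn(default=128):
    """Read the current limit, falling back to the documented default."""
    try:
        return int(SOMAXCONN_PATH.read_text().strip())
    except (OSError, ValueError):
        return default


limit = read_somaxconn()
print(f"somaxconn={limit}, backlog=511 is effectively {effective_backlog(511, limit)}")
```

With the default limit of 128, the 511 requested by Apache or Nginx is cut to 128, which is why the sysctl has to be raised before the web server setting has any effect.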


Additional TCP backlog

• net.core.netdev_max_backlog = 1000

– Per CPU backlog

– Network frames

• net.ipv4.tcp_max_syn_backlog = 128

– Half-open connections


Initial congestion window

TCP congestion window - initcwnd (initial)

Starting in Kernel 2.6.39, set to 10

Previous default was 3!

http://research.google.com/pubs/pub36640.html

Older Kernel?

$ ip route change default via 192.168.1.1 dev eth0 proto static initcwnd 10
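To see why the initial congestion window matters, it helps to count round trips for a response of a given size. The sketch below uses a deliberately simplified slow-start model: the window doubles every round trip, there is no loss, the receive window never limits the sender, and the MSS of 1460 bytes is an assumption.

```python
import math


def round_trips_to_deliver(response_bytes, initcwnd, mss=1460):
    """Round trips needed to send a response under idealized slow start."""
    segments_left = math.ceil(response_bytes / mss)
    window = initcwnd  # segments that may be in flight this round trip
    round_trips = 0
    while segments_left > 0:
        segments_left -= window
        window *= 2    # slow start doubles the window each round trip
        round_trips += 1
    return round_trips


for initcwnd in (3, 10):
    trips = round_trips_to_deliver(20 * 1024, initcwnd)
    print(f"initcwnd={initcwnd}: 20 KB response takes {trips} round trips")
```

In this model a 20 KB response needs three round trips with an initial window of 3 and two with an initial window of 10, while a response that fits in one segment needs a single round trip either way.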


net.ipv4.tcp_slow_start_after_idle

• Set to 0 so that connections don’t drop back to the initial congestion window after being idle too long

Example: HTTP KeepAlive


TIME_WAIT sockets


net.ipv4.tcp_max_tw_buckets

• Max number of sockets in TIME_WAIT. We actually set this very high, because before we moved instances behind a load balancer it was normal to have 200K+ sockets in TIME_WAIT state.

• Exceeding this leads to sockets being torn down until under the limit


net.ipv4.tcp_fin_timeout

• The time a connection should spend in FIN_WAIT_2 state. Default is 60 seconds; lowering this will free memory more quickly and transition the socket to TIME_WAIT.

• This will NOT reduce the time a socket is in TIME_WAIT, which is set to 2 * MSL (max segment lifetime).


net.ipv4.tcp_fin_timeout continued...

The TIME_WAIT length is hardcoded in the kernel at 60 seconds!

https://github.com/torvalds/linux/blob/master/include/net/tcp.h#L115

#define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT
                                  * state, about 60 seconds */
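Because the TIME_WAIT duration is fixed, the number of sockets in that state is roughly the rate of closes multiplied by 60 seconds. The sketch below is a steady-state estimate only; it assumes the server is the side that closes first (only that side enters TIME_WAIT) and that the close rate is constant.

```python
TIME_WAIT_SECONDS = 60  # TCP_TIMEWAIT_LEN from the kernel source quoted above


def steady_state_time_wait(closes_per_second, time_wait_seconds=TIME_WAIT_SECONDS):
    """Approximate TIME_WAIT socket count for a constant close rate."""
    return closes_per_second * time_wait_seconds


def closes_per_second_for(time_wait_sockets, time_wait_seconds=TIME_WAIT_SECONDS):
    """Close rate that would sustain the given TIME_WAIT socket count."""
    return time_wait_sockets / time_wait_seconds


print(steady_state_time_wait(3500))           # sockets for 3,500 closes/second
print(round(closes_per_second_for(200_000)))  # closes/second behind 200K sockets
```

By this estimate, the 200K+ TIME_WAIT sockets mentioned earlier correspond to a few thousand server-side closes per second on one host, so net.ipv4.tcp_max_tw_buckets needs to be sized against the expected close rate.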


“If it is on the Internet then it must be true, and you can’t question it”

—Abraham Lincoln


net.ipv4.tcp_tw_recycle DANGEROUS

• Clients behind NAT/stateful FW will get dropped

• *99.99999999% of the time it should never be enabled

* Probably 100%, but there may be a valid case out there


net.ipv4.tcp_tw_reuse

Makes a safer attempt at freeing sockets in TIME_WAIT state


Recycle vs. reuse deep dive

http://bit.ly/tcp-time-wait


net.ipv4.tcp_rmem/wmem

Format: min default max (in bytes)

• The kernel will autotune the number of bytes to use for each socket based on these settings. It will start at default and work between the min and max.


net.ipv4.tcp_mem

Format: low pressure max (in pages!)

• Below low, the kernel won’t put pressure on sockets to reduce memory usage. When pressure is hit, sockets reduce memory until usage falls back to low. If max is hit, no new sockets.
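Since net.ipv4.tcp_mem is expressed in pages rather than bytes, it is worth converting before comparing it with instance memory. The sketch below also models the pressure behavior described above as a small state function. The page size of 4096 bytes and the sample values are assumptions, and the helper names are hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumed, check `getconf PAGESIZE` on the host


def parse_tcp_mem(text):
    """Split a 'low pressure max' string into three page counts."""
    low, pressure, maximum = (int(value) for value in text.split())
    return low, pressure, maximum


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


def next_pressure_state(under_pressure, pages_in_use, low, pressure):
    """Enter pressure mode above `pressure`; leave it only below `low`."""
    if pages_in_use > pressure:
        return True
    if pages_in_use < low:
        return False
    return under_pressure  # between low and pressure: keep the current mode


# Hypothetical setting, not a recommendation.
low, pressure, maximum = parse_tcp_mem("188319 251092 376638")
print(f"max = {pages_to_mib(maximum):.0f} MiB")
```

The state function makes the hysteresis explicit: once pressure mode is entered, usage has to fall below low, not merely below pressure, before the kernel relaxes again.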


Additional readings

https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

man tcp(7)


Nginx/Apache


listen backlog

Apache: ListenBackLog 511

Nginx: listen backlog=511

– limited by net.core.somaxconn


tcp_defer_accept

Apache: AcceptFilter http data

AcceptFilter https data

Nginx: listen [deferred]

– Wait until we receive a data packet before passing the socket to the server. Completing the TCP handshake won’t trigger an accept()
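At the socket level, deferred accept corresponds to the Linux TCP_DEFER_ACCEPT option on the listening socket. The following is a minimal sketch of what a server might do, not how Apache or Nginx implement it; the function name, port, and timeout are illustrative, and the option is skipped on platforms where Python does not expose it.

```python
import socket


def make_deferred_listener(host="127.0.0.1", port=0, backlog=511, defer_seconds=1):
    """Create a listening socket that defers accept() until data arrives."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    defer_opt = getattr(socket, "TCP_DEFER_ACCEPT", None)  # Linux only
    if defer_opt is not None:
        # Seconds to wait for the first data packet after the handshake.
        sock.setsockopt(socket.IPPROTO_TCP, defer_opt, defer_seconds)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen(backlog)     # still capped by net.core.somaxconn
    return sock


listener = make_deferred_listener()
print("listening on", listener.getsockname())
listener.close()
```

The benefit is that the server process is not woken up for connections that have completed the handshake but have not yet sent a request.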


sendfile

Apache: EnableSendfile off

Nginx: sendfile off

– Saves context switching from userspace on read/write

– “zero copy”; happens in kernel space


tcp_cork

Apache: Enabled w/ sendfile

Nginx: tcp_nopush off

– aka TCP_CORK sockopt

– allows application to control building of packet; e.g., pack a packet with full HTTP response

– Only works with sendfile


tcp_nodelay (Nagle’s algo)

Apache: On

• No ability to turn off

Nginx: tcp_nodelay on

• Only affects keep-alive connections

• Will add latency if turned off in favor of bandwidth
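Underneath the web server directive, this is the TCP_NODELAY socket option, which disables Nagle's algorithm so that small writes are sent immediately instead of being coalesced. A minimal sketch of setting and reading the option on a plain socket, with hypothetical helper names:

```python
import socket


def new_tcp_socket(nodelay=True):
    """Create a TCP socket with Nagle's algorithm disabled or enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if nodelay else 0)
    return sock


def nodelay_enabled(sock):
    """Return True when TCP_NODELAY is set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


sock = new_tcp_socket(nodelay=True)
print("TCP_NODELAY:", nodelay_enabled(sock))
sock.close()
```

For small responses on keep-alive connections, leaving the option on trades a little bandwidth efficiency for lower latency, which matches the trade-off noted in the last bullet.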


HTTP Keep-Alive

Apache: KeepAlive On

KeepAliveTimeout 5

MaxKeepAliveRequests 100

Nginx: keepalive_timeout 75s

keepalive_requests 100

Note: If using ELB, you must match the timeout to the ELB timeout setting


HTTP Keep-Alive

• Also enable on upstream proxies

– Available since Nginx 1.1.4

proxy_http_version 1.1;
proxy_set_header Connection "";

upstream foo {
    server 10.1.1.1;
    keepalive 1024;
}


HTTP Keep-Alive


everything

your

quantifiable

continuously


Please give us your feedback on this session.

Complete session evaluations and earn re:Invent swag.

http://bit.ly/awsevals