Never gonna give you up, never gonna let you down

7 lessons learned building HP/HA systems

@EdMcBane

Jul 18, 2015

Page 1: Never gonna give you up

@EdMcBane 7 lessons learned building HP/HA systems

Never gonna give you up

Never gonna let you down

Page 2: Never gonna give you up

@EdMcBane

Francesco Degrassi

Enthusiastic yet pragmatic Lean Software Developer.

Uppish and cynical nihilist from time to time.

Page 3: Never gonna give you up

Lean Software Development and team coaching

Continuous Delivery, High availability, performance

Security-sensitive & high-uncertainty domains

Page 4: Never gonna give you up

The challenge

● Major European client

● Innovative service for the consumer market

● Large userbase (200K+ users)

● Very high request rate

● Low latency requirement (<< RTT)

Page 5: Never gonna give you up

What we built

Page 6: Never gonna give you up

What did we learn?

Page 7: Never gonna give you up

Make your assumptions explicit and keep testing them

Don’t eat the yellow snow

Page 8: Never gonna give you up

#1 Make your assumptions explicit and keep challenging them

Page 9: Never gonna give you up

#2 Performance & High Availability are not extra features

Page 10: Never gonna give you up

Page 11: Never gonna give you up

#3 Do not reinvent the wheel

...but keep things simple

Page 12: Never gonna give you up

Page 13: Never gonna give you up

In our case...

● Everything worked well in the single-core scenario

Page 14: Never gonna give you up

SO_REUSEPORT

For TCP, so_reuseport allows multiple listener sockets to be bound to the same port.

Received packets are distributed to multiple sockets bound to the same port using a 4-tuple hash.

With so_reuseport the distribution is uniform.
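
A minimal sketch of the mechanism described above, assuming Linux 3.9 or later (where the option is available) and Python's standard `socket` module. The helper names `make_reuseport_listener` and `make_listener_group` are hypothetical, chosen for illustration; in a real deployment each listener would typically live in its own worker process or thread.

```python
import socket


def make_reuseport_listener(host, port):
    """Create a TCP listener that may share its port with other SO_REUSEPORT sockets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind() on every socket that shares the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


def make_listener_group(host, count):
    """Bind `count` listeners to one port; the first one lets the OS pick a free port."""
    first = make_reuseport_listener(host, 0)
    port = first.getsockname()[1]
    others = [make_reuseport_listener(host, port) for _ in range(count - 1)]
    return [first] + others
```

Each socket in the group can then run its own `accept()` loop, and the kernel spreads incoming connections across them.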

Page 15: Never gonna give you up

Everything should be made as simple as possible, but not simpler

— Albert Einstein

Page 16: Never gonna give you up

LESS(1)                 General Commands Manual                 LESS(1)

NAME
       less - opposite of more

SYNOPSIS
       less -?
       less --help
       less -V
       less --version
       less [-[+]aABcCdeEfFgGiIJKLmMnNqQrRsSuUVwWX~]
            [-b space] [-h lines] [-j line] [-k keyfile]
            [-{oO} logfile] [-p pattern] [-P prompt] [-t tag]
            [-T tagsfile] [-x tab,...] [-y lines] [-[z] lines]
            [-# shift] [+[+]cmd] [--] [filename]...
       (See the OPTIONS section for alternate option syntax with long option names.)

DESCRIPTION
       LESS IS similar to MORE (1), but has many more features. Less does not have to read the entire input file before starting, so with large input files it starts up faster than text editors like vi (1). Less uses termcap (or terminfo on some systems), so it can run on

Manual page less(1) line 1 (press h for help or q to quit)

Page 17: Never gonna give you up

#4 Be wary of cargo-cult optimization

Page 18: Never gonna give you up

Page 19: Never gonna give you up

TCP_TW_RECYCLE

Enable fast recycling TIME-WAIT sockets. Default value is 0. It should not be changed without advice/request of technical experts.

Linux will drop any segment from the remote host whose timestamp is not strictly bigger than the latest recorded timestamp

TCP_TW_RECYCLE + NAT = MADNESS
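
Why NAT breaks this can be illustrated with a toy model of the rule quoted above. This is a deliberately simplified sketch, not the kernel's actual logic: the class `PerHostTimestampFilter`, the client labels, and the timestamp values are all made up for illustration. The only assumption carried over from the slide is that the server remembers one "latest timestamp" per remote IP and drops anything that is not strictly bigger.

```python
class PerHostTimestampFilter:
    """Toy model: remember the latest TCP timestamp seen per remote IP."""

    def __init__(self):
        self.latest = {}  # remote IP -> latest accepted timestamp

    def accept(self, remote_ip, tsval):
        last = self.latest.get(remote_ip)
        if last is not None and tsval <= last:
            return False  # not strictly bigger than the recorded value: dropped
        self.latest[remote_ip] = tsval
        return True


def simulate_nat():
    """Two clients with independent timestamp clocks share one public IP."""
    nat_ip = "203.0.113.7"  # documentation-range address standing in for the NAT
    server = PerHostTimestampFilter()
    segments = [("A", 900_000), ("B", 5_000), ("A", 900_010), ("B", 5_010)]
    return [(client, server.accept(nat_ip, ts)) for client, ts in segments]
```

In this model client B is never accepted: its clock is behind client A's, and the server cannot tell the two apart because they arrive from the same address.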

Page 20: Never gonna give you up

Page 21: Never gonna give you up

#5 High Availability is much more than just redundancy

Page 22: Never gonna give you up

Page 23: Never gonna give you up

Failure impact

● Redundant hardware

● Redundant software components

But there’s more!

● Graceful degradation

● Incremental rollouts
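
Graceful degradation, as listed on this slide, could look something like the sketch below. It is only an illustration under assumed names: `with_fallback`, `personalised_offers`, and `default_offers` are hypothetical and do not come from the system described in the talk. The idea shown is that a failing dependency yields a reduced answer plus a "degraded" flag rather than an error for the user.

```python
def with_fallback(primary, fallback):
    """Wrap `primary` so that a failure yields a degraded answer instead of an error."""
    def call(*args, **kwargs):
        try:
            return primary(*args, **kwargs), False
        except Exception:
            # Degraded mode: serve a cheaper or staler answer and flag it.
            return fallback(*args, **kwargs), True
    return call


def personalised_offers(user_id):
    """Hypothetical primary path; here it always fails to simulate an outage."""
    raise TimeoutError("recommendation backend unavailable")


def default_offers(user_id):
    """Hypothetical fallback path that needs no backend."""
    return ["standard-offer"]


get_offers = with_fallback(personalised_offers, default_offers)
```

The returned flag can feed monitoring, so a period spent in degraded mode is visible rather than silent.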

Page 24: Never gonna give you up

Failure frequency

But then also:

● proven technology

● high quality hardware

● automation (to avoid errors)

Page 25: Never gonna give you up

Time to recover

● Effective monitoring

  ○ realtime

  ○ reliable

  ○ understandable

  ○ thorough

  ○ meaningful

  ○ actionable

● Rollback / rollforward

● Automation (for speed)

Page 26: Never gonna give you up

Our response plan goes something like this...

AaaaaAAaaaah

Page 27: Never gonna give you up

...but be prepared to improvise

● In house experience

● Developers on call

● Drills (chaos monkeys)

Processes designed for ordinary times are not resilient in a crisis and need to be changed.

Page 28: Never gonna give you up

#6 Embrace diversity

Page 29: Never gonna give you up

Page 30: Never gonna give you up

Page 31: Never gonna give you up

#7 Monitoring is essential

… and we can do way better

Page 32: Never gonna give you up

No one size fits all

● “Monitor everything”, like “100% test coverage”, is a nice slogan.

● Each environment requires a slightly different solution

● Balance between data availability, cost and ability to keep it actionable

Page 33: Never gonna give you up

Page 34: Never gonna give you up

We are doing logging wrong

● Unstructured

● Inconsistent

● Poor defaults

● Complex, obscure components

● A huge waste of computing power
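
One possible answer to the "unstructured" and "inconsistent" points is to emit every log record as a single JSON object with a fixed set of keys. The sketch below uses only Python's standard `logging` and `json` modules; the names `JsonFormatter`, `make_logger`, `demo`, the `fields` convention, and the sample field values are assumptions made for illustration, not something taken from the talk.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a consistent set of keys."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Extra key/value pairs are passed via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(name, stream):
    """Build a logger that writes JSON lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def demo():
    """Log one event into a buffer and parse it back, as a log pipeline would."""
    buf = io.StringIO()
    log = make_logger("demo", buf)
    log.info("request_served", extra={"fields": {"request_id": "r-1", "latency_ms": 12}})
    return json.loads(buf.getvalue())
```

Because every record carries named fields such as a request identifier, logs can be filtered and joined with metrics and alerts without regex parsing.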

Page 35: Never gonna give you up

We need a complete overview

● Logs

● Metrics

● Alerts

● Together, coherent, cross-referenced

Page 36: Never gonna give you up

Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so.

Douglas Adams