
Your Code is Wrong

Sep 08, 2014

nathanmarz

My keynote at NoSQL Now! on August 21st, 2013
Transcript
Page 1: Your Code is Wrong

Your Code is Wrong

Nathan Marz @nathanmarz

Page 2: Your Code is Wrong

Let’s start with an example

Page 3: Your Code is Wrong

Storm’s “reportError” method

Page 4: Your Code is Wrong

(Storm is a realtime computation system, like Hadoop but for realtime)

Page 5: Your Code is Wrong

Storm architecture

Page 6: Your Code is Wrong

Storm architecture

Master node (similar to Hadoop JobTracker)

Page 7: Your Code is Wrong

Storm architecture

Used for cluster coordination

Page 8: Your Code is Wrong

Storm architecture

Run worker processes

Page 9: Your Code is Wrong

Storm’s “reportError” method

Page 10: Your Code is Wrong

Used to show errors in the Storm UI

Page 11: Your Code is Wrong

Error info is stored in Zookeeper

Page 12: Your Code is Wrong

What happens when a user deploys code like this?

Page 13: Your Code is Wrong

Denial-of-service on Zookeeper and cluster goes down

Page 14: Your Code is Wrong

Robust!

Designed input space Actual input space

Page 15: Your Code is Wrong

Your code is wrong

Page 16: Your Code is Wrong

Your code is literally wrong

Page 17: Your Code is Wrong

Your code is wrong

Page 18: Your Code is Wrong
Page 19: Your Code is Wrong

Why do you believe your code is correct?

Page 20: Your Code is Wrong

Your code

Dependency 1

Dependency 2

Dependency 3

Page 21: Your Code is Wrong

Dependency 1

Dependency 4

Dependency 5

Page 22: Your Code is Wrong

Dependency 4

Dependency 6

Dependency 9

Dependency 7

Dependency 8

Page 23: Your Code is Wrong

Dependency 3,000,000

Hardware

Page 24: Your Code is Wrong

Electronics

Page 25: Your Code is Wrong

Chemistry

Page 26: Your Code is Wrong

Atomic physics

Page 27: Your Code is Wrong

Quantum mechanics

Page 28: Your Code is Wrong

I think I can safely say that nobody understands

quantum mechanics.

Richard Feynman

Page 29: Your Code is Wrong

Your code is wrong

Page 30: Your Code is Wrong

Your code

...

Page 31: Your Code is Wrong

All the software you’ve used has had bugs in it

Page 32: Your Code is Wrong

Including the software you’ve written

Page 33: Your Code is Wrong

Your code is sometimes correct

Page 34: Your Code is Wrong

That’s good enough!

Page 35: Your Code is Wrong
Page 36: Your Code is Wrong

Treat code as nondeterministic

Page 37: Your Code is Wrong

Embrace “your code is wrong” to design better software

Page 38: Your Code is Wrong

Robust!

Designed input space Actual input space

Page 39: Your Code is Wrong

Robust!

Designed input space Actual input space

Page 40: Your Code is Wrong

An example

Page 41: Your Code is Wrong

Learning from Hadoop

JobTracker

Job

Job

Job

Page 42: Your Code is Wrong

Learning from Hadoop

JobTracker

Job

Job

Job

Page 43: Your Code is Wrong

Learning from Hadoop

JobTracker

Job

Job

Job

Page 44: Your Code is Wrong

Your code is wrong

Page 45: Your Code is Wrong

So your processes will crash

Page 46: Your Code is Wrong

Storm’s daemons are process fault-tolerant

Page 47: Your Code is Wrong

Storm

Nimbus

Topology

Topology

Topology

Page 48: Your Code is Wrong

Storm

Nimbus

Topology

Topology

Topology

Page 49: Your Code is Wrong

Storm

Nimbus

Topology

Topology

Topology

Page 50: Your Code is Wrong

Storm

Nimbus

Topology

Topology

Topology

Page 51: Your Code is Wrong

Storm

Nimbus

Topology

Topology

Topology

Page 52: Your Code is Wrong

Robust!

Designed input space Actual input space

Page 53: Your Code is Wrong

Robust!

Designed input space Actual input space

Page 54: Your Code is Wrong

The impact of code being wrong

Page 55: Your Code is Wrong

Robust!

Designed input space Actual input space

Failures! Bad performance! Security holes!

Irrelevant!

Page 56: Your Code is Wrong

Design principle #1

Measuring and monitoring are the foundation of solid engineering

Page 57: Your Code is Wrong

Measuring: Under what range of inputs does my software function well?

Page 58: Your Code is Wrong

Monitoring: What’s the actual input space of my software?

Page 59: Your Code is Wrong

Measure & Monitor

Latency
Throughput
Stack traces
Buffer sizes
Memory usage
CPU usage
#threads spawned
...
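
As a rough illustration of measuring one of the items above (latency), here is a minimal sketch. The `LatencyRecorder` class and its method names are hypothetical and are not part of Storm or of the talk; a real system would more likely ship these numbers to a metrics service than keep them in memory.

```python
import time
from contextlib import contextmanager


class LatencyRecorder:
    """Collects wall-clock durations (in seconds) for one operation."""

    def __init__(self):
        self.samples = []

    @contextmanager
    def timed(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record even when the body raises, so failures are measured too.
            self.samples.append(time.perf_counter() - start)

    def summary(self):
        if not self.samples:
            return {"count": 0}
        ordered = sorted(self.samples)
        # A crude 99th-percentile estimate that needs no extra libraries.
        p99 = ordered[min(len(ordered) - 1, int(0.99 * len(ordered)))]
        return {"count": len(ordered), "max": ordered[-1], "p99": p99}
```

Wrapping a call as `with recorder.timed(): handle(request)` (where `handle` is whatever you want to measure) gives numbers for the measuring question; exporting `summary()` periodically is one way to approach the monitoring question.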

Page 60: Your Code is Wrong

How you monitor your software is as important as its functionality

Page 61: Your Code is Wrong

Design principle #2

Embrace immutability

Page 62: Your Code is Wrong

Read/write database

Application

Page 63: Your Code is Wrong

MySQL

Application

Page 64: Your Code is Wrong

MongoDB

Application

Page 65: Your Code is Wrong

Riak

Application

Page 66: Your Code is Wrong

Cassandra

Application

Page 67: Your Code is Wrong

HBase

Application

Page 68: Your Code is Wrong

Your code is wrong

Page 69: Your Code is Wrong

So data will be corrupted

Page 70: Your Code is Wrong

And you may not know why

Page 71: Your Code is Wrong

Views

Immutable, ever-growing data

Application

Architecture based on immutability
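
A minimal sketch of the idea, assuming a page-view counter as the application. `PageView`, `EventLog`, and `views_per_url` are hypothetical names chosen for illustration; the slides do not show an implementation.

```python
from collections import Counter
from typing import NamedTuple


class PageView(NamedTuple):
    # One immutable fact: "this URL was viewed at this time".
    url: str
    timestamp: int


class EventLog:
    """Append-only store: records are added, never updated or deleted."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def all(self):
        # Return a tuple so callers cannot mutate the log through the result.
        return tuple(self._records)


def views_per_url(records):
    """A view is a pure function of the whole log."""
    return Counter(r.url for r in records)
```

Because the application only ever appends facts, a bug in `views_per_url` can corrupt the view but not the underlying data; fixing the function and running it again over `log.all()` repairs the view.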

Page 72: Your Code is Wrong

Views

Immutable, ever-growing data

Application

Lambda architecture

Page 73: Your Code is Wrong

Design principle #3

Minimize dependencies

Page 74: Your Code is Wrong

The less that can go wrong, the less that will go wrong

Page 75: Your Code is Wrong

Example: Storm’s usage of Zookeeper

Page 76: Your Code is Wrong

Worker locations stored in Zookeeper

Page 77: Your Code is Wrong

All workers must know locations of other workers to send messages

Page 78: Your Code is Wrong

Two ways to get location updates

Page 79: Your Code is Wrong

1. Poll Zookeeper

Worker Zookeeper

Page 80: Your Code is Wrong

2. Use Zookeeper “watch” feature to get push notifications

Worker Zookeeper

Page 81: Your Code is Wrong

Method 2 is faster but relies on another feature

Page 82: Your Code is Wrong

Storm uses both methods

Worker Zookeeper

Page 83: Your Code is Wrong

If watch feature fails, locations still propagate via polling

Page 84: Your Code is Wrong

Eliminating dependence justified by small amount of code required
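
The combination of push and poll can be sketched as below. This is not Storm's code and does not use the real ZooKeeper client API; `FakeCoordinator` is a hypothetical in-memory stand-in whose watch delivery can be switched off to simulate the watch feature failing.

```python
class FakeCoordinator:
    """In-memory stand-in for a coordination service (hypothetical)."""

    def __init__(self, watches_work=True):
        self.locations = {}
        self.watches_work = watches_work
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)

    def read(self):
        return dict(self.locations)

    def write(self, locations):
        self.locations = dict(locations)
        if self.watches_work:
            for callback in self._watchers:
                callback(self.read())


class LocationCache:
    """A worker's local copy of where the other workers are."""

    def __init__(self, coordinator):
        self.coordinator = coordinator
        self.locations = {}
        # Fast path: push notifications.
        coordinator.watch(self._update)

    def _update(self, locations):
        self.locations = locations

    def poll_once(self):
        # Slow path: meant to run on a timer; works even if no watch fires.
        self._update(self.coordinator.read())
```

Both paths funnel into the same `_update`, so the polling fallback costs only a few lines, which matches the justification given above.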

Page 85: Your Code is Wrong

Design principle #4

Explicitly respect functional input ranges

Page 86: Your Code is Wrong

Storm’s “reportError” method

Page 87: Your Code is Wrong

Implement self-throttling to avoid overloading other systems
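
One possible shape for such a throttle is a fixed-window limiter in front of the expensive write. The sketch below is an assumption about how this could be done, not Storm's actual reportError implementation; `ErrorThrottle`, `report_error`, and `sink` are hypothetical names.

```python
import time


class ErrorThrottle:
    """Allows at most max_reports per interval_secs window; drops the rest."""

    def __init__(self, max_reports, interval_secs, clock=time.monotonic):
        self.max_reports = max_reports
        self.interval_secs = interval_secs
        self.clock = clock
        self.window_start = clock()
        self.count = 0
        self.dropped = 0

    def allow(self):
        now = self.clock()
        if now - self.window_start >= self.interval_secs:
            # Start a new fixed window.
            self.window_start = now
            self.count = 0
        if self.count < self.max_reports:
            self.count += 1
            return True
        self.dropped += 1
        return False


def report_error(throttle, sink, error):
    # sink is any callable that performs the expensive write (hypothetical).
    if throttle.allow():
        sink(error)
```

With a limit like `ErrorThrottle(max_reports=5, interval_secs=10.0)`, user code that reports errors in a tight loop results in a bounded number of writes to the downstream system, and `dropped` records how many reports were discarded.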

Page 88: Your Code is Wrong

Design principle #5

Embrace recomputation

Page 89: Your Code is Wrong

“Your code is wrong” meanings

1. Design input space differs from actual input space
2. The logic of your code is wrong
3. Requirements are constantly changing

Page 90: Your Code is Wrong

You must be able to change your code to match shifting requirements

Page 91: Your Code is Wrong

Example: blogging software

Page 92: Your Code is Wrong

New requirement: search

Page 93: Your Code is Wrong

Have to build a search index
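
If every post ever written is still available, the new index can be produced by a function that runs over all of them. A minimal sketch, assuming posts are stored as `(post_id, text)` pairs; `build_search_index` and `search` are hypothetical names, and the tokenizer is deliberately simplistic.

```python
import re
from collections import defaultdict


def build_search_index(posts):
    """Recompute an inverted index (word -> set of post ids) from all posts."""
    index = defaultdict(set)
    for post_id, text in posts:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(post_id)
    return dict(index)


def search(index, word):
    # Return matching post ids in a stable order.
    return sorted(index.get(word.lower(), set()))
```

The same function serves the first build and any later rebuild, for example after changing the tokenizer or fixing a bug in it.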

Page 94: Your Code is Wrong
Page 95: Your Code is Wrong

Recomputation gives you so much more

Page 96: Your Code is Wrong

Views

Immutable, ever-growing data

Application

Page 97: Your Code is Wrong

Building software is no different from any other engineering

Page 98: Your Code is Wrong

The underlying challenges are the same

Page 99: Your Code is Wrong
Page 100: Your Code is Wrong
Page 101: Your Code is Wrong

What will break it?

Page 102: Your Code is Wrong

What are the limits of my dependencies?

Page 103: Your Code is Wrong

How can I add redundancy to increase robustness?

Page 104: Your Code is Wrong

Can I isolate failures?

Page 105: Your Code is Wrong

Our raw materials are ideas instead of matter

Page 106: Your Code is Wrong

Thank you