Leveling the Playing Field Aaron Bedra Chief Security Officer, Eligible @abedra keybase.io/abedra
Right now, your web applications are being
attacked
And it will happen again, and again, and again
As you grow so will the target on you
Keeping up with security is difficult
Actually, it’s unfair
Things you have to get right vs. things the attacker has to get right
Time the attacker has to focus on you vs. time you have to focus on the attacker
It’s asymmetric warfare
There’s no way to manually keep up
Manual → Automated → Intelligent
Scaling your defenses means strategic
automation
STOP!
Let’s talk about the problem we are solving
for a minute
Problems
• We don’t know what people are doing
• We don’t know how often they are doing it
• We don’t know how effective we are
• We don’t have enough resources to keep up
Goals
• Reduce noise
• Generate better signal
• Reduce operational overhead
• Build better business cases
• Spend energy on the really important stuff
Reducing Noise
It starts with really simple stuff
Tie up the loose ends with static configuration
Static configuration checklist
• At least a B+ rating on SSL Labs*
• Reject extensions that you don’t want to accept
• Reject known bad user agents
• Reject specific known bad actors
• Custom error pages that fit your application
• Basic secure headers
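As a concrete sketch, parts of the checklist above might translate into nginx configuration like the fragment below. The patterns, extensions, and header values are illustrative, not a complete hardening guide.

```nginx
# Reject known bad user agents (illustrative patterns)
if ($http_user_agent ~* (sqlmap|nikto|masscan)) {
    return 403;
}

# Reject extensions you never serve
location ~* \.(php|asp|aspx|cgi)$ {
    return 404;
}

# Basic secure headers
add_header X-Frame-Options "DENY";
add_header X-Content-Type-Options "nosniff";
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";

# Custom error pages that fit your application
error_page 403 404 /error.html;
```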
You’ll be surprised how well this works
It has a fringe benefit of creating better
awareness
You can feed this back to your intelligence
Reducing Operational Overhead
Dealing with malicious actors has to be easy
It shouldn’t require deploys, reloads, or any potential forward impact
Let’s talk about how to create something that will
help
Step 1: Put everything in one place!
Centralization of events is critical
If you can’t see it, it didn’t happen
There are options
Log aggregation and a query engine
The query engine can serve as your discovery
agent
A nice first step
But it will eventually fall over
That’s when you reach for a messaging system
Log to topics in a queue
Create processors to understand events
Step 2: Process Events
For every event type you will need to understand
how to process it
Structured logging can help, but it doesn’t fit
everywhere
The goal is to accept an event and return
consumable details
type logEntry struct {
	Address      string
	Method       string
	Uri          string
	ResponseCode string
}

func processEntry(entry string) logEntry {
	parts := strings.Split(entry, " ")
	return logEntry{
		Address:      parts[0],
		Method:       strings.Replace(parts[5], "\"", "", 1),
		Uri:          parts[6],
		ResponseCode: parts[8],
	}
}
You will likely have multiple processors
Split topics by event type or application
Once you have the data accessible, figure out
what happened
Track everything!
• HTTP Method
• Time since last request/average requests per sec
• Failed responses
• Failure of intended action (e.g. login, add credit card, edit, etc)
• Anything noteworthy
type Actor struct {
	Methods         map[string]int
	FailedLogins    int
	FailedResponses map[string]int
}

func updateEvents(event logEntry, counts map[string]*Actor) {
	actor, ok := counts[event.Address]
	if !ok {
		actor = &Actor{Methods: map[string]int{}, FailedResponses: map[string]int{}}
		counts[event.Address] = actor
	}
	actor.Methods[event.Method] += 1
	if event.ResponseCode != "200" && event.ResponseCode != "302" {
		actor.FailedResponses[event.ResponseCode] += 1
	}
	// A POST to /login that renders the form again (200) instead of
	// redirecting (302) is treated as a failed login
	if event.Method == "POST" && event.Uri == "/login" && event.ResponseCode == "200" {
		actor.FailedLogins += 1
	}
}
Once you have things in one place, it’s all about counting
Simple counts with thresholds go a long way
Step 3: Thresholds, Patterns, and Deviations
Exceeding a count is a signal that something
needs to be done
There are a lot of signals that could be malicious
You can start with simple thresholds
• Too many failed logins
• Too many bad response codes (4xx, 5xx)
• Request volume too high
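The simple thresholds above reduce to a small predicate over the per-actor counters. A minimal sketch, reusing the Actor type from earlier; the limits are hypothetical starting points to be tuned per application:

```go
package main

import "fmt"

// Actor mirrors the per-address counters built by the event processors.
type Actor struct {
	Methods         map[string]int
	FailedLogins    int
	FailedResponses map[string]int
}

// exceedsThresholds sketches the three simple checks: too many failed
// logins, too many bad responses, request volume too high. The limits
// are illustrative, not recommendations.
func exceedsThresholds(a Actor) bool {
	failed := 0
	for _, n := range a.FailedResponses {
		failed += n
	}
	requests := 0
	for _, n := range a.Methods {
		requests += n
	}
	return a.FailedLogins > 5 || failed > 25 || requests > 1000
}

func main() {
	suspect := Actor{FailedLogins: 8}
	fmt.Println(exceedsThresholds(suspect)) // prints "true"
}
```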
These provide a lot of signal
But they don’t get you all the way there
There are patterns of behavior that signal
malicious intent
Example
10.20.253.8 - - [23/Apr/2013:14:20:21 +0000] "POST /login HTTP/1.1" 200 267 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
10.20.253.8 - - [23/Apr/2013:14:20:22 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
10.20.253.8 - - [23/Apr/2013:14:20:23 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2083 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
10.20.253.8 - - [23/Apr/2013:14:20:24 +0000] "POST /users/king-roland/credit_cards HTTP/1.1" 302 2085 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0" "77.77.165.233"
That was a carding attack
As you dig in, you will find many patterns like
these
But again it doesn’t cover everything
There will also be interesting deviations
[Pie chart: distribution of request methods — GET, POST, HEAD, PUT, DELETE at roughly 59%, 27%, 5%, 5%, 4%]
Deviations in normal flow are interesting but not necessarily malicious
You will have to build more intelligent processing to
understand them
Example
A password reset request comes from a new
location
Is it a harmless request or an account takeover?
Your processors will have to make complicated choices based on lots of information
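A processor for the password reset case might weigh several signals into a rough risk score before deciding. Everything below — the signal names, the weights, the scoring shape — is purely illustrative:

```go
package main

import "fmt"

// resetRequest bundles the signals a processor might consider when a
// password reset arrives. Fields and weights here are hypothetical.
type resetRequest struct {
	NewLocation    bool // request came from a location not seen before
	NewDevice      bool // unrecognized user agent or device
	RecentFailures int  // failed logins shortly before the reset
}

// resetRisk returns a rough risk score; above some threshold the
// processor would mark the actor for observation rather than block
// outright, since the request may be harmless.
func resetRisk(r resetRequest) int {
	score := 0
	if r.NewLocation {
		score += 2
	}
	if r.NewDevice {
		score += 2
	}
	score += r.RecentFailures
	return score
}

func main() {
	fmt.Println(resetRisk(resetRequest{NewLocation: true, RecentFailures: 3})) // prints "5"
}
```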
Nailing deviation requires the largest amount of
effort
Step 4: Act
Once you have enough information to make a decision, you must act
There are multiple ways to act
• Blacklist
• Whitelist
• Mark
• Do nothing
Blacklist and whitelist are pretty straightforward
Blacklist when thresholds are exceeded or
patterns/deviation fit
Whitelist things you never want to be
blacklisted
Marking is more interesting
Marking allows you to tag actors as potentially
malicious
This allows you to dynamically modify your
responses
And choose how you react
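The decision recorded for an actor can then drive the response the web tier serves. A minimal sketch, with hypothetical status values and responses; the point is that a marked actor gets a subtly modified response while you continue to observe:

```go
package main

import "fmt"

// Status is the decision recorded for an actor in the shared cache.
type Status int

const (
	OK Status = iota
	Marked
	Blacklisted
	Whitelisted
)

// respond sketches how the web tier might vary behavior by status.
// The degraded response for marked actors is the kind of small change
// that renders bots useless and exposes them as bots.
func respond(s Status) string {
	switch s {
	case Blacklisted:
		return "403 Forbidden"
	case Marked:
		// e.g. add latency, serve a captcha, or strip sensitive
		// functionality while continuing to watch the actor
		return "200 OK (degraded response)"
	default:
		return "200 OK"
	}
}

func main() {
	fmt.Println(respond(Marked)) // prints "200 OK (degraded response)"
}
```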
“Of course machines can't think as people do. A machine is different from a person. Hence, they think differently.”
-- Alan Turing, The Imitation Game
You can often render bots useless with small
changes
Which exposes them as bots
And gives you the confidence you need to
blacklist them
Marking also helps you lower the rate of false
positives
Step 5: Visualize
Visualization is incredibly helpful
You need a window into your automation
Spending a few minutes a day looking at what
happened is vital
You can pretty easily catch bugs this way
Architecture & Performance
There are three main ideas
• The thing that acts on actors
• The shared cache
• The event processors
Acting on actors should be fast
Fast in a web request is single digit milliseconds
You can choose to embed this in your applications
or your web servers
Data locality is important
It usually involves replicating the global cache
to each decision point
The cache should hold everything needed to act
on actors
The web server asks the cache what to do
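In the request path this is a single lookup against the locally replicated cache. A minimal in-process sketch (a real deployment would replicate from a shared store and the status values here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// decisionCache sketches the local cache each decision point holds;
// in practice it is replicated from the shared global cache.
type decisionCache struct {
	mu       sync.RWMutex
	statuses map[string]string // address -> "blacklisted" | "whitelisted" | "marked"
}

// lookup is what the web server calls per request: a single
// read-locked map access, comfortably inside a millisecond budget.
func (c *decisionCache) lookup(addr string) string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if s, ok := c.statuses[addr]; ok {
		return s
	}
	return "ok"
}

func main() {
	cache := &decisionCache{statuses: map[string]string{"10.20.253.8": "blacklisted"}}
	fmt.Println(cache.lookup("10.20.253.8")) // prints "blacklisted"
	fmt.Println(cache.lookup("203.0.113.7")) // prints "ok"
}
```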
The event processors work out of band
Their sole purpose is to populate the cache
Processors tend to be more custom
But the cache and the acting logic is common
github.com/repsheet
Pitfalls
Things to consider
• False positives
• Decision latency
• Incorrect modeling
• Bad data
• Monitoring
There’s a good chance you will block incorrectly
Make use of whitelisting
Mobile carriers will be a problem
So will NATed IP addresses
Time to decision should be monitored
Create a solid regression suite
Run all your models through it when you make
even a single change
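A regression suite for this can be as simple as replaying recorded cases through the models and asserting the decisions never change unexpectedly. The case shape and the decide function below are placeholders for your real models and captured traffic:

```go
package main

import "fmt"

// regressionCase pairs a recorded input with the decision the model
// is expected to make. The fields here stand in for real event data.
type regressionCase struct {
	name     string
	failures int
	want     bool
}

// decide stands in for a real model; replace with the logic under test.
func decide(failures int) bool { return failures > 5 }

// runRegressions replays every case and returns the names of any
// whose decision has drifted.
func runRegressions(cases []regressionCase) []string {
	var broken []string
	for _, c := range cases {
		if decide(c.failures) != c.want {
			broken = append(broken, c.name)
		}
	}
	return broken
}

func main() {
	cases := []regressionCase{
		{"known brute forcer stays blocked", 20, true},
		{"normal user stays clean", 1, false},
	}
	fmt.Println(len(runRegressions(cases))) // prints "0"
}
```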
Understand where bad data can impact you
Build tolerance of bad data so you don’t make
incorrect decisions
Monitor everything!
This type of automation deserves every monitor and metric you can get
Questions?