Advanced Threat Detection on Streaming Data. Carol McDonald, Solution Architect. Strata + Hadoop World, March 2016. © 2016 MapR Technologies
Advanced Threat Detection on Streaming Data

Apr 15, 2017

Carol McDonald
Transcript
Page 1: Advanced Threat Detection on Streaming Data

Advanced Threat Detection on Streaming Data Carol McDonald, Solution Architect Strata + Hadoop World March 2016

Page 2: Advanced Threat Detection on Streaming Data

Meeting Advanced Threats Head On

•  Solutionary: Managed Security Services Provider
   –  Provides Threat Intelligence as a Service

Page 3: Advanced Threat Detection on Streaming Data

Real-time Detection of Advanced Threats

•  Objective:
   –  Provide real-time threat intelligence on trillions of messages per year
   –  Store and process lots of unstructured security data
   –  Combine machine learning and predictive analytics

Page 4: Advanced Threat Detection on Streaming Data

Event-based Detection of Advanced Threats

[Diagram labels: Store and Process Unstructured Data; Anomaly Detection; Predictive Analytics; Machine Learning; Real-time Threat Intelligence; Threat Alerts]

Page 5: Advanced Threat Detection on Streaming Data

Meeting Advanced Threats Head On

•  Challenges:
   –  Expanding data storage in an RDBMS is expensive ($$)
   –  Could not process unstructured data at scale

[Diagram: Scaling Unstructured Data Processing. Challenges: RDBMS Economics, Unstructured Data]

Page 6: Advanced Threat Detection on Streaming Data

What Did the Solution Need to Do?

[Diagram: Data Sources → Collect Data (?) → Process Data (?) → Store Data (?) → Serve Data (?)]

•  Data source labels: Security Feeds, HTTP, Syslog, Firewall, Other

Page 7: Advanced Threat Detection on Streaming Data

How to do this with High Performance at Scale?

•  Parallel, Partitioned = fast, scalable

Page 8: Advanced Threat Detection on Streaming Data

Solution: Stream Processing Architecture

[Diagram: Sources (Security Feeds, HTTP, Syslog, Firewall, Other) → Data Ingest (Topics)]

•  Data Ingest:
   –  Kafka or MapR Streams: fast distributed messaging

Page 9: Advanced Threat Detection on Streaming Data

Fast Distributed Messaging

•  Topics organize events into categories

•  Topics decouple producers from consumers
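
The two points above can be illustrated with a toy in-memory broker. This is only a sketch of the concept, not the Kafka or MapR Streams API: the `Broker` class, its `publish` and `poll` methods, and the topic and consumer names are all hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker: one append-only log per topic (hypothetical API)."""

    def __init__(self):
        self.logs = defaultdict(list)    # topic name -> list of events
        self.offsets = defaultdict(int)  # (consumer, topic) -> next index to read

    def publish(self, topic, event):
        # Producers only name a topic; they never reference consumers.
        self.logs[topic].append(event)

    def poll(self, consumer, topic):
        # Each consumer tracks its own read position in the topic's log.
        start = self.offsets[(consumer, topic)]
        events = self.logs[topic][start:]
        self.offsets[(consumer, topic)] = len(self.logs[topic])
        return events


broker = Broker()
broker.publish("firewall", {"ip_src": "10.0.0.5", "action": "deny"})
broker.publish("syslog", {"host": "db01", "msg": "login failure"})
broker.publish("firewall", {"ip_src": "10.0.0.9", "action": "allow"})

# Two independent consumers read the same topic without affecting each other.
alerts_view = broker.poll("alerting", "firewall")
archive_view = broker.poll("archiver", "firewall")
```

Both consumers receive the two firewall events, and neither one is known to the producer, which is the decoupling the slide describes.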

Page 10: Advanced Threat Detection on Streaming Data

Fast Distributed Messaging

•  Topics are partitioned for fast throughput and scalability
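
A minimal sketch of key-based partitioning, assuming four partitions and the source IP as the message key. The CRC32 hash is an arbitrary choice for illustration; real messaging systems use their own partitioners, and the function name `partition_for` is hypothetical.

```python
import zlib

NUM_PARTITIONS = 4  # assumed partition count for this sketch


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a message key to a partition using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partitions = [[] for _ in range(NUM_PARTITIONS)]
events = [
    ("10.0.0.5", "login failure"),
    ("10.0.0.9", "login ok"),
    ("10.0.0.5", "login failure"),
]
for ip_src, message in events:
    # Events with the same key always land in the same partition,
    # so separate partitions can be written and read in parallel.
    partitions[partition_for(ip_src)].append((ip_src, message))
```

Because the mapping depends only on the key, adding consumers (one per partition) scales throughput while keeping each source's events in order within its partition.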

Page 11: Advanced Threat Detection on Streaming Data

How to do this with High Performance at Scale?

•  Parallel, Partitioned:
   –  Messaging

Page 12: Advanced Threat Detection on Streaming Data

Complex Event Processing with Storm and Esper

[Diagram: Data Ingest (Topics) → Stream Processing. Labels: Kafka Spout, Parser Bolt, Enrich Bolts, Esper Kafka Bolt, Topic, Esper Spout, Alert Bolts. Caption: Cross-topology correlation of events]

•  Stream Processing:
   –  Storm: distributed real-time computation
   –  Esper: Complex Event Processing
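
The spout-and-bolt flow on this slide can be sketched as a chain of Python generators. This is a single-process illustration only, not Storm code: the function names, the `ip_src|activity|outcome` message format, and the "internal address" rule are assumptions made for the example.

```python
def spout(raw_messages):
    # Stands in for the Kafka spout: emits raw messages read from a topic.
    yield from raw_messages


def parser_bolt(stream):
    # Parse an assumed "ip_src|activity|outcome" line format into a dict.
    for line in stream:
        ip_src, activity, outcome = line.split("|")
        yield {"ip_src": ip_src, "ec_activity": activity, "ec_outcome": outcome}


def enrich_bolt(stream, internal_prefix="10."):
    # Add context to each event; here, whether the source address is internal.
    for event in stream:
        yield {**event, "internal": event["ip_src"].startswith(internal_prefix)}


def alert_bolt(stream):
    # Emit an alert for each failed logon that comes from an external source.
    for event in stream:
        if event["ec_outcome"] == "Failure" and not event["internal"]:
            yield f"ALERT external logon failure from {event['ip_src']}"


raw = [
    "10.0.0.5|Logon|Failure",
    "203.0.113.7|Logon|Failure",
    "203.0.113.7|Logon|Success",
]
alerts = list(alert_bolt(enrich_bolt(parser_bolt(spout(raw)))))
```

In a real topology each stage would run as many parallel tasks across a cluster; the generator chain only shows how events flow from one stage to the next.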

Page 13: Advanced Threat Detection on Streaming Data

Complex Event Processing with Esper

•  Detect a related set or pattern of events within a time window

•  Example pattern, Excess Login Failure:
   –  Same user, same source login failure

SELECT * FROM Event(ip_src IS NOT NULL AND ec_activity='Logon' AND ec_outcome='Failure').std:groupwin(ip_src).win:time(300 sec)
GROUP BY ip_src
HAVING COUNT(*) = 10
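
To make the windowing logic concrete, here is a rough Python equivalent of that pattern: a per-source sliding window of 300 seconds that fires when the failure count reaches 10. It is a sketch under assumptions, not Esper: the class name is hypothetical, timestamps are passed in explicitly, and the check runs only when an event arrives.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300
THRESHOLD = 10


class LoginFailureDetector:
    """Hypothetical detector mirroring the filter, window, and HAVING clause."""

    def __init__(self):
        self.windows = defaultdict(deque)  # ip_src -> timestamps of recent failures

    def on_event(self, event, now):
        """Return True when this event brings its source's count to the threshold."""
        if (event.get("ip_src") is None
                or event.get("ec_activity") != "Logon"
                or event.get("ec_outcome") != "Failure"):
            return False
        window = self.windows[event["ip_src"]]
        window.append(now)
        # Evict failures that have fallen out of the time window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        return len(window) == THRESHOLD


detector = LoginFailureDetector()
failure = {"ip_src": "203.0.113.7", "ec_activity": "Logon", "ec_outcome": "Failure"}
# Ten failures from the same source, ten seconds apart.
fired = [detector.on_event(failure, now=t) for t in range(0, 100, 10)]
```

Only the tenth failure inside the window triggers the alert; failures spread over more than 300 seconds never accumulate to the threshold.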

Page 14: Advanced Threat Detection on Streaming Data

How to do this with High Performance at Scale?

•  Parallel, Partitioned:
   –  Processing

Page 15: Advanced Threat Detection on Streaming Data

Real-time Detection of Advanced Threats: Examples

Data transferred from critical database servers

Large traffic flows from a host to a given IP address

Employee accessing database servers at unusual hours

User logging in from two different countries within a short window
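
As an illustration of the last example, the sketch below flags a user whose consecutive logins come from different countries within a short window. The one-hour window, the `(timestamp, user, country)` tuple layout, and the function name are assumptions; the slides do not specify them.

```python
WINDOW_SECONDS = 3600  # assumed "short window" of one hour


def detect_country_change(logins, window=WINDOW_SECONDS):
    """logins: iterable of (timestamp, user, country), sorted by timestamp."""
    last_seen = {}  # user -> (timestamp, country) of the previous login
    alerts = []
    for ts, user, country in logins:
        previous = last_seen.get(user)
        if previous is not None:
            prev_ts, prev_country = previous
            # Alert when the country changed and the gap is within the window.
            if country != prev_country and ts - prev_ts <= window:
                alerts.append((user, prev_country, country, ts))
        last_seen[user] = (ts, country)
    return alerts


logins = [
    (0, "alice", "US"),
    (600, "alice", "RO"),    # ten minutes later, different country
    (700, "bob", "US"),
    (90000, "alice", "US"),  # more than a day later, outside the window
]
alerts = detect_country_change(logins)
```

Only the second login is flagged; the later return to the original country falls outside the window.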

Page 16: Advanced Threat Detection on Streaming Data

Complex Event Processing with Storm and Esper

Cross-topology correlation of events

Page 17: Advanced Threat Detection on Streaming Data

Solution: Stream Processing Architecture

[Diagram: Stream Processing (HDFS Bolt, Index Bolt, HBase Bolt) → NoSQL Storage (MapR-FS, MapR-DB)]

•  NoSQL Storage
   –  HBase: fast scalable storage and caching
   –  Elasticsearch: indexing for real-time search analytics

Page 18: Advanced Threat Detection on Streaming Data

Scalability with HBase (MapR-DB)

[Diagram: three tables, each with columns Key, colB, colC]

Storage model:
•  RDBMS: normalized schema → joins for queries can cause a bottleneck
•  HBase: de-normalized schema → data that is read together is stored together
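
The contrast can be sketched with plain Python dictionaries. The table contents, the `user#ip` row key, and the `family:qualifier` column names are made up for the example and are not taken from the talk.

```python
# Normalized (RDBMS style): answering the query requires a join across tables.
users = {1: {"name": "alice", "dept": "finance"}}
logins = [{"user_id": 1, "ip_src": "203.0.113.7", "outcome": "Failure"}]


def logins_with_user_joined():
    # Each login row must look up its user row at query time.
    return [{**login, **users[login["user_id"]]} for login in logins]


# De-normalized (HBase style): one row per key holds everything read together.
login_rows = {
    "alice#203.0.113.7": {
        "user:name": "alice",
        "user:dept": "finance",
        "login:ip_src": "203.0.113.7",
        "login:outcome": "Failure",
    }
}


def login_by_key(row_key):
    # A single lookup by row key returns the whole record, with no join.
    return login_rows[row_key]


joined = logins_with_user_joined()
row = login_by_key("alice#203.0.113.7")
```

The de-normalized layout duplicates user attributes in every row, trading storage for reads that touch a single key.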

Page 19: Advanced Threat Detection on Streaming Data

MapR-DB (HBase API) is Designed to Scale

[Diagram: three tables (columns Key, colB, colC), each assigned its own key range]

•  Fast reads and writes by key!
•  Data is automatically partitioned by key range!
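
Key-range partitioning can be sketched with a sorted list of split points and a binary search. The split keys, the three in-memory "regions", and the function names are assumptions for illustration; in MapR-DB and HBase the splits are managed by the system rather than hard-coded.

```python
import bisect

# Assumed split points: region 0 holds keys < "g",
# region 1 holds "g" <= key < "p", region 2 holds keys >= "p".
SPLIT_KEYS = ["g", "p"]
regions = [dict() for _ in range(len(SPLIT_KEYS) + 1)]


def region_for(row_key):
    """Find the region whose key range contains row_key."""
    return bisect.bisect_right(SPLIT_KEYS, row_key)


def put(row_key, row):
    regions[region_for(row_key)][row_key] = row


def get(row_key):
    # A read touches exactly one region, located from the key alone.
    return regions[region_for(row_key)].get(row_key)


put("alice", {"colB": "val", "colC": "val"})
put("host42", {"colB": "val", "colC": "val"})
put("zeta", {"colB": "val", "colC": "val"})
```

Because every read and write is routed by key to a single region, regions can be served by different nodes in parallel.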

Page 20: Advanced Threat Detection on Streaming Data

How to do this with High Performance at Scale?

•  Parallel, Partitioned:
   –  Storage

Page 21: Advanced Threat Detection on Streaming Data

Solution: Stream Processing Architecture

[Diagram: NoSQL Storage (MapR-FS, MapR-DB) → Serve Data]

•  Machine Learning
   –  threat modeling
   –  anomaly detection
•  Security Analytics
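
The slides do not say which anomaly detection method is used. As one simple illustration only, the sketch below flags a value whose z-score against recent history exceeds a threshold; the function name, the threshold of 3.0, and the sample byte counts are assumptions.

```python
import statistics


def is_anomalous(history, value, threshold=3.0):
    """Flag value if it lies more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # With no variation in history, any different value is unusual.
        return value != mean
    return abs(value - mean) / stdev > threshold


# Hypothetical megabytes transferred per hour by one database server.
history = [100, 110, 95, 105, 90, 100]
normal_hour = is_anomalous(history, 104)
exfiltration_hour = is_anomalous(history, 5000)
```

A production system would likely learn per-host baselines and use richer models, but the principle of comparing new activity to a learned norm is the same.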

Page 22: Advanced Threat Detection on Streaming Data

Data Driven Forensics Investigation

•  What can the data tell us?
   –  What happened within a time range?
   –  How did the threat get in?
   –  What are all the activities associated with a specific IP/user?
   –  How much data was affected?
   –  Has this occurred elsewhere in the past?
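
Questions such as "what happened within a time range for a specific IP" map naturally onto a range scan over sorted row keys. The sketch below assumes a hypothetical `ip#timestamp` row key with a zero-padded timestamp; the key design, sample events, and function names are illustrative and not taken from the talk.

```python
import bisect


def row_key(ip, ts):
    # Zero-pad the timestamp so lexicographic key order matches time order.
    return f"{ip}#{ts:010d}"


# Events stored sorted by row key, as in a key-ordered table.
events = sorted([
    (row_key("203.0.113.7", 1000), "logon failure"),
    (row_key("203.0.113.7", 1200), "logon success"),
    (row_key("203.0.113.7", 5000), "db export"),
    (row_key("198.51.100.2", 1100), "logon success"),
])
keys = [key for key, _ in events]


def scan(ip, start_ts, end_ts):
    """Return activities for one IP with start_ts <= timestamp < end_ts."""
    lo = bisect.bisect_left(keys, row_key(ip, start_ts))
    hi = bisect.bisect_left(keys, row_key(ip, end_ts))
    return [activity for _, activity in events[lo:hi]]


early_activity = scan("203.0.113.7", 0, 2000)
```

Putting the IP first groups all activity for one source together, so both "everything for this IP" and "this IP within a time range" become contiguous scans.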

Page 23: Advanced Threat Detection on Streaming Data

Solution: Stream Processing Architecture

Page 24: Advanced Threat Detection on Streaming Data

Key to Real Time: Event-based Data Flows

Key to Scale = Parallel, Partitioned:
•  Messaging
•  Processing
•  Storage

Page 25: Advanced Threat Detection on Streaming Data

Building a Complete Data Architecture

[Diagram labels: Sources/Apps; Stream Processing; Bulk Processing; Web-Scale Storage (MapR-FS); Database (MapR-DB); Event Streaming (MapR Streams)]

Page 26: Advanced Threat Detection on Streaming Data

Key to Real Time: Convergence

[Diagram: Converged Data Platform
 Apps: Customer Experience; Data Architecture Optimization; Security Investigation & Event Management; Operational Intelligence; Managed Services & Custom Apps
 Platform services: Event Streaming; Database; Storage
 Platform capabilities: High Availability; Data Protection; Unified Security; Real Time; Multi-tenancy
 Unified Management & Monitoring]

Page 27: Advanced Threat Detection on Streaming Data

Why Hadoop for Security Analytics?

•  Cost effective for storing and analyzing large volumes of data in real-time

•  Provides search & query, machine learning for activity correlation and anomaly detection

•  When it comes to Hadoop, select an enterprise distribution (e.g. MapR Converged Data Platform) so you can focus on your primary objective

Page 28: Advanced Threat Detection on Streaming Data

To Learn More:

•  http://learn.mapr.com/

Page 29: Advanced Threat Detection on Streaming Data

To Learn More:

•  Download example code
   –  https://github.com/caroljmcdonald/mapr-streams-sparkstreaming-hbase
•  Read explanation of example code
   –  https://www.mapr.com/blog/spark-streaming-hbase

Page 30: Advanced Threat Detection on Streaming Data

Q & A

@mapr

https://www.mapr.com/blog/author/carol-mcdonald

Engage with us!

mapr-technologies