
Fluturas presentation @ Big Data Conclave

Nov 01, 2014

Technology

fluturads

Flutura presented at the Big Data Conclave. Please find the presentation here.
Transcript
Page 1: Fluturas presentation @ Big Data Conclave
Page 2: Fluturas presentation @ Big Data Conclave

Agenda

• 3 industries, 5 real-life Flutura user stories

• 7 Key “Gotchas” & Big Data Best Practices

Page 3: Fluturas presentation @ Big Data Conclave

Case Study-1 : Reducing Network threats by Detecting Patterns in perimeter device logs

Page 4: Fluturas presentation @ Big Data Conclave

What is the Biz problem being solved ?

Page 5: Fluturas presentation @ Big Data Conclave

What is the problem being solved?

Network threats are growing ...

Page 6: Fluturas presentation @ Big Data Conclave

What is the problem being solved?

• 2 types of threats – Internal ( Social Unrest & Watch List ) & External ( Hackers )

External hackers Internal Activists

Page 7: Fluturas presentation @ Big Data Conclave

Who is experiencing the pain ? Telecom Security Operations centre

Page 8: Fluturas presentation @ Big Data Conclave

Lots of Telecom Machine data left untapped !

This is typically flushed but has gold in it

Page 9: Fluturas presentation @ Big Data Conclave

Why is it important to solve this problem?

• Reduces network disruption from hackers

• Minimizes social disruption and unrest

Page 10: Fluturas presentation @ Big Data Conclave

Traditional RDBMS architectures can't handle high velocity machine data !

Page 11: Fluturas presentation @ Big Data Conclave

SOCs can't see threat patterns … running BLIND

• Being Blind = Risk

• Cannot be blind to patterns anymore

• The capability to “see” patterns previously not seen

• Network activity and behaviour – Firewalls, routers

• Saves lives, provides social stability – WL Chatter !

Page 12: Fluturas presentation @ Big Data Conclave

Capability to remove “data blind folds” to “SEE” behavioural patterns key to security

MACHINE DATA KEY TO UNCOVERING SECURITY PATTERNS !

Page 13: Fluturas presentation @ Big Data Conclave

What are some “behavioural signatures” ?

1. Sudden increase in YouTube uploads @ night

2. Viral rate of propagation of MMS videos
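A signature such as a "sudden increase" needs a baseline to compare against. One possible sketch, assuming the "2 sigma" rule mentioned later in this deck is applied to nightly counts: flag a value that exceeds the historical mean by more than two standard deviations. The function name, the threshold default and the sample counts are hypothetical, not taken from the presentation.

```python
from statistics import mean, stdev

def is_two_sigma_spike(history, current, k=2.0):
    """Return True when `current` exceeds the historical mean by more than k standard deviations."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    return current > mu + k * sigma

# Hypothetical nightly upload counts for one network segment (illustrative data only).
nightly_uploads = [100, 110, 90, 105, 95, 100, 100]

print(is_two_sigma_spike(nightly_uploads, 104))  # within the normal band
print(is_two_sigma_spike(nightly_uploads, 180))  # far above the baseline
```

The same check could be run per tower, per subnet or per URL category, depending on where the baseline is kept.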

Page 14: Fluturas presentation @ Big Data Conclave

So what does the data look like ? National content filtering log – 1 billion events/day !

Page 15: Fluturas presentation @ Big Data Conclave

Decoding National content filtering logs

Decoding 7 components of the Netsweeper log entry

1329031890 http://photogallery.indiatimes.com/photo/4686985.cms 94.200.107.14 94.200.0.0 Du_Public_IP_Address 0 37

1. EPOCH time stamp

2. URL requested

3. Source IP

4. Client subnet

5. Client group name

6. Denied flag ( 0 allowed, 1 denied )

7. URL category ( description TBD )

• 50 categories in the system: Education, Pornography, Phishing, Criminal Skills etc

• "23" is related to "Pornography", "45" is related to "GENERAL"
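The seven-field layout of the sample log entry can be turned into a structured record. This is a minimal sketch, assuming the fields are whitespace-separated and the URL contains no spaces; the class and function names are hypothetical and are not part of Netsweeper.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class FilterLogEntry:
    timestamp: datetime   # field 1: epoch time stamp, converted to UTC
    url: str              # field 2: URL requested
    source_ip: str        # field 3: source IP
    client_subnet: str    # field 4: client subnet
    client_group: str     # field 5: client group name
    denied: bool          # field 6: 0 = allowed, 1 = denied
    url_category: int     # field 7: numeric URL category code

def parse_filter_log_line(line: str) -> FilterLogEntry:
    """Split one log line into its seven components (assumes whitespace-separated fields)."""
    parts = line.split()
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    epoch, url, source_ip, subnet, group, denied_flag, category = parts
    return FilterLogEntry(
        timestamp=datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        url=url,
        source_ip=source_ip,
        client_subnet=subnet,
        client_group=group,
        denied=(denied_flag == "1"),
        url_category=int(category),
    )

sample = ("1329031890 http://photogallery.indiatimes.com/photo/4686985.cms "
          "94.200.107.14 94.200.0.0 Du_Public_IP_Address 0 37")
entry = parse_filter_log_line(sample)
print(entry)
```

At a billion events a day the parsing itself would run inside the ingestion layer; the sketch only shows the field mapping.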

Page 16: Fluturas presentation @ Big Data Conclave

Expand to ingest variety of watched events

• OS logs: File Delete Events, User Login Failure Events, Root access Failures, 2 Sigma events

• Database logs: Table Drop Events, Table Delete Events, Column Drop Events, Critical Proc recompilation

• Application logs: Critical tsn value changes, Master data changes, App login failures, Login at unusual time windows

• Web server logs: Search for specific keywords, 2 Sigma event for URL’s, Decomp tree - failed requests, Login Failure

• CDR logs: Dropped call frequency, Watch List inbound/outbound, Cut calls - poor connection, Call Failure event frequency

• IPR logs: Timeout event frequency, Swarm event detected, Dropped IP calls frequency, Failed IP call frequency

• Router logs: SMS Capacity events, Unusual sms traffic events, User defined router events, Compliance related router event

• Firewall logs: Odd hour Unsuccessful logins, X happens Y times in Z time, User defined firewall events, Compliance oriented firewall e

Frequency of login failures high in certain pockets

Recency of late night events noticed in certain pockets

Certain corridors experiencing high dropped calls
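A watched event of the form "X happens Y times in Z time" can be expressed as a sliding-window counter. The sketch below is one possible reading of that rule, assuming event timestamps arrive in increasing order; the class name and parameters are hypothetical.

```python
from collections import deque

class FrequencyRule:
    """Fire when an event occurs at least `threshold` times within `window_seconds`."""

    def __init__(self, threshold: int, window_seconds: float):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps still inside the window

    def observe(self, timestamp: float) -> bool:
        """Record one event and return True if the rule fires at this timestamp."""
        self._times.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while self._times and timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.threshold

# Example: 3 login failures within 60 seconds (illustrative timestamps in seconds).
rule = FrequencyRule(threshold=3, window_seconds=60)
fired = [rule.observe(t) for t in [0, 10, 20, 200, 210]]
print(fired)
```

One rule instance per key (user, source IP, router) keeps the counters independent of each other.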

Page 17: Fluturas presentation @ Big Data Conclave

Converting raw data to Actionable Intelligence

Holistic Value Chain

• INTEGRATED EVENT 360 REPOSITORY

• SENSE & RESPOND LAYER

• LOG FILE INGESTION

• MACHINE LEARNING ALGORITHMS ON GRANULAR LOG EVENT DATA

• INFER INTENT FROM PATTERNS AND CREATE EVENT PROFILES

• LOAD RISK / BEHAVIOR PROFILE TO RULES ENGINE DB

• INTERCEPT OR OFFLINE REVIEW OF EVENTS

• CONSOLIDATE & REVIEW EVENT INTERCEPTS TO ASSESS EVENT RULE EFFECTIVENESS

• MEASURE PATTERN RULE EFFECTIVENESS - TRUE POSITIVE / FALSE POSITIVES

• CASE MANAGEMENT WORKFLOW

• TELECOM SWITCHES, OTHER DEVICES: CDR LOG FILES, IP LOG FILES, MISC LOG FILES

• BIG DATA REPOSITORY

Page 18: Fluturas presentation @ Big Data Conclave

Case Study-2 : Decoding travellers intent

Page 19: Fluturas presentation @ Big Data Conclave

What's the problem we are trying to solve ?

• Travellers are “signalling” to us through the behaviour they exhibit

• OTA is unable to sense and respond to these varied behaviours

Page 20: Fluturas presentation @ Big Data Conclave

Why is it important to solve this problem ?

• Impacts look-to-book ratio

• Increase revenue from cross sell

Page 21: Fluturas presentation @ Big Data Conclave

Srikanth intends to travel from San Fran to NYC

Page 22: Fluturas presentation @ Big Data Conclave

Srikanth searches !

Page 23: Fluturas presentation @ Big Data Conclave

Srikanth's First Moment of Truth !

Page 24: Fluturas presentation @ Big Data Conclave

Srikanth sees the options rendered !

Page 25: Fluturas presentation @ Big Data Conclave

Is Srikanth Price Sensitive or Time conscious traveller?

87 % 13%

Page 26: Fluturas presentation @ Big Data Conclave

Does Srikanth have a bias towards any airline ?

Those small clicks reveal a lot !

Page 27: Fluturas presentation @ Big Data Conclave

So who is Srikanth? Do we 'know' him ?

What's his behavioural DNA ? Key vectors ?

• Early bird ( days = 21 )

• Price insensitive ( click % = 89 % )

• Prefers American Airlines

• Most valuable customer ( Decile-1 )

• Intra visit interval = 17 days

• Visit dispersion = 12 % International

• Churn propensity = 0

• Bargain hunter = No ( 3 % coupon )

• Roadie = Yes ( 28000 miles per qtr )

• Sentiment index = 73 %

Page 28: Fluturas presentation @ Big Data Conclave

How do we respond in real time to Srikanth's experience and the behavioural patterns we’ve seen ?

• If Srikanth is a high value customer

• If he does not book within 8 min window

• In real time route to high performing agent

• Short circuit the queue

• Extra 10 % discount since he is vulnerable

• If search response time velocity is trending downward

• Signal to beef up infrastructure

• Optimise code base

• Property recommendations
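The first response rule above (high value customer, no booking within the 8 min window) can be sketched as a small decision function. This is an illustration only: the field names and action labels are hypothetical, and "high value" is assumed here to mean Decile-1, as in the behavioural DNA slide.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SearchSession:
    customer_decile: int         # 1 = most valuable customers (assumption)
    minutes_since_search: float  # time elapsed since the search was rendered
    has_booked: bool

def next_best_actions(session: SearchSession,
                      booking_window_minutes: float = 8.0) -> List[str]:
    """Return the real-time responses for a session, or an empty list if no rule applies."""
    is_high_value = session.customer_decile == 1
    window_expired = session.minutes_since_search > booking_window_minutes
    if is_high_value and not session.has_booked and window_expired:
        return [
            "route_to_high_performing_agent",
            "short_circuit_queue",
            "offer_extra_10_percent_discount",
        ]
    return []

print(next_best_actions(SearchSession(customer_decile=1,
                                      minutes_since_search=9.5,
                                      has_booked=False)))
```

The infrastructure rule (search response time trending downward) would follow the same shape, with a different condition and a different set of actions.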

Page 29: Fluturas presentation @ Big Data Conclave

Case Study-3 : Watched List

Page 30: Fluturas presentation @ Big Data Conclave

What is the problem being solved?

• Internal watch lists

• Can we get e signals in their behavior ? Call patterns ?

SMS patterns ?

Youtube upload patterns ?

Watched countries ?

Intrawatch list chatter ?

Late night communication behavior ?

• Watch list activity intelligence takes 6 weeks

• Bring it down to < day

• Enhance it to make it real time

Page 31: Fluturas presentation @ Big Data Conclave

Why is it important to solve this problem ?

• Threat signals are there in telecom and communication logs

• Saves lives !

• Ensures national security !

Page 32: Fluturas presentation @ Big Data Conclave

Under the hood

• Remote Authentication Dial-In User Service (RADIUS) provides authentication, authorization and accounting for network access.

• When a user wants to get access to the Internet, he first has to give his user credentials (in most cases username and password) to a local RADIUS client.

Page 33: Fluturas presentation @ Big Data Conclave

Deconstructing Radius Logs

The IP address of the NAS ( Network Access server ) that is sending the request

The framed address to be configured for the user

3 time stamps

User Identity

Page 34: Fluturas presentation @ Big Data Conclave

Radius logs Netsweeper logs

Subscriber database

Rich Security intelligence !

Triangulate from 3 event data pools

Page 35: Fluturas presentation @ Big Data Conclave

Access/Device

Framed IP address

Customer ethnicity

URL accessed

Date/time

Day

Week

Client IP address

Customer type

Customer browse location

Post paid Subscriber Database

1329031890 http://photogallery.indiatimes.com/photo/4686985.cms 94.200.107.14 94.200.0.0 Du_Public_IP_Address 0 37

Status

Enterprise

Residential

Asian

European

Dubai

Smart Phone

Desktop

Ipad

Others

URL Type

Gaming sites

News sites

Others

?

? Yes

No

Business rule to derive access device to be elicited from

SME

Location mapping business logic to be elicited from SME

Social Networking

Blogs

P2P sites

VPN/VOIP

NAS Port Id

Username Nas port id RADIUS Logs

Correlating fragmented telecom log files: Info model
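One way the three event pools might be tied together is sketched below. The join keys are assumptions, not something the slides state: a content filtering event is matched to a RADIUS session when its source IP equals the session's framed IP address and its timestamp falls inside the session, and the session's username is then looked up in the subscriber database. All records and field names are hypothetical.

```python
# Hypothetical RADIUS sessions: who held which framed IP address, and when (epoch seconds).
radius_sessions = [
    {"username": "user_a", "framed_ip": "94.200.107.14",
     "start": 1329030000, "stop": 1329035000},
]

# Hypothetical subscriber database keyed by username.
subscribers = {
    "user_a": {"customer_type": "Residential", "plan": "Post paid"},
}

def enrich_event(event, sessions, subscriber_db):
    """Attach username and subscriber attributes to a filtering event, or return None."""
    for session in sessions:
        same_ip = session["framed_ip"] == event["source_ip"]
        in_session = session["start"] <= event["timestamp"] <= session["stop"]
        if same_ip and in_session:
            profile = subscriber_db.get(session["username"], {})
            return {**event, "username": session["username"], **profile}
    return None  # no RADIUS session explains this event

event = {"timestamp": 1329031890, "source_ip": "94.200.107.14", "url_category": 37}
print(enrich_event(event, radius_sessions, subscribers))
```

A linear scan is only for illustration; at production volume the sessions would need an index on framed IP and time.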

Page 36: Fluturas presentation @ Big Data Conclave

Calls to watched countries

Intra Watch list Chatter velocity is high

Call patterns reveal malicious intent

Page 37: Fluturas presentation @ Big Data Conclave


Entity on watch list

NOT on watched list but high level of interactions

Are people ‘n’ degrees away from watched list performing 2 sigma activity across multiple Call dimensions – sms, voice, conference and other behavioral activity ?

CDR fields: From BTN, To TN, Date/Time, Duration, Call type, Approximate tower location which carried call

Watch List Recommender Data Product Modeling Unique behavioural signature
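Finding entities 'n' degrees away from the watch list can be sketched as a breadth-first search over a call graph built from CDR pairs. This is an illustrative sketch, not the recommender itself: calls are treated as undirected links, and the identifiers and sample records are hypothetical.

```python
from collections import defaultdict, deque

def degrees_from_watch_list(call_records, watch_list, max_degree):
    """Map each non-watched entity within `max_degree` hops to its distance from the watch list."""
    graph = defaultdict(set)
    for caller, callee in call_records:
        # Treat a call as an undirected link between the two numbers.
        graph[caller].add(callee)
        graph[callee].add(caller)

    distance = {entity: 0 for entity in watch_list}
    queue = deque(watch_list)
    while queue:
        current = queue.popleft()
        if distance[current] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in graph[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return {entity: d for entity, d in distance.items() if d > 0}

# Hypothetical (From BTN, To TN) pairs.
calls = [("W1", "A"), ("A", "B"), ("B", "C"), ("D", "E")]
print(degrees_from_watch_list(calls, {"W1"}, max_degree=2))
```

The entities returned here would then be candidates for the 2 sigma activity check across sms, voice and conference dimensions.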

Page 38: Fluturas presentation @ Big Data Conclave

Discarded Telecom data--> Actionable Security patterns

Page 39: Fluturas presentation @ Big Data Conclave

Case Study-4 : Mobile forensics

Page 40: Fluturas presentation @ Big Data Conclave

Mobile funnel data: Analyzing Mobile Sub Channel Behavioural shift to Drive revenues for a leading online travel company

Page 41: Fluturas presentation @ Big Data Conclave

What's the problem being solved ?

• More applications becoming mobile

• There is a dip in transaction completion rate

• Friction points and hot spots exist

• No way to “see” these hot spots and patterns

Page 42: Fluturas presentation @ Big Data Conclave

• Spot friction points

• Mobile funnel drops

• Payment gateway drops

• Airline connector drops
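Friction points like these show up as the stage-to-stage drop-off in a funnel. A minimal sketch, assuming session counts per stage are already available; the stage names and counts are hypothetical.

```python
def funnel_drop_off(stage_counts):
    """Return (transition, drop-off fraction) for each pair of consecutive funnel stages."""
    report = []
    for (prev_name, prev_n), (name, n) in zip(stage_counts, stage_counts[1:]):
        # Fraction of sessions lost between the previous stage and this one.
        drop = 0.0 if prev_n == 0 else (prev_n - n) / prev_n
        report.append((f"{prev_name} -> {name}", round(drop, 3)))
    return report

# Hypothetical mobile booking funnel: (stage, sessions reaching that stage).
funnel = [
    ("search", 1000),
    ("select_flight", 400),
    ("payment", 200),
    ("confirmation", 150),
]

report = funnel_drop_off(funnel)
worst = max(report, key=lambda row: row[1])
print(report)
print("largest friction point:", worst)
```

Splitting the same counts by device or sub channel would show where the behavioural shift is concentrated.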

Page 43: Fluturas presentation @ Big Data Conclave

Funnel Analysis

Page 44: Fluturas presentation @ Big Data Conclave

Churn Scoring Model

Page 45: Fluturas presentation @ Big Data Conclave

Case Study-5 : Money transmission

Page 46: Fluturas presentation @ Big Data Conclave

Minimizing fund leakages to watched entities

Money transmission event stream Threat matrix Graph Analysis

Page 47: Fluturas presentation @ Big Data Conclave

Money transmission behavioral modeling

Page 48: Fluturas presentation @ Big Data Conclave

Modeling money transmission behavior

Page 49: Fluturas presentation @ Big Data Conclave

Graph analysis to monitor money transmission patterns

• Each account can be modelled as a node in a graph

• Behaviour across nodes can be analyzed

• Proxy behaviours can be easily discerned
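As an illustration of the graph view, the sketch below models accounts as nodes and transfers as directed edges, then looks for one simple form of proxy behaviour: an account that pays an intermediary which in turn pays a watched account. The definition of "proxy" used here, the function name and the sample transfers are all assumptions made for the example.

```python
from collections import defaultdict

def find_proxy_senders(transfers, watched_accounts):
    """Return {sender: proxies} where a sender pays a proxy that pays a watched account."""
    outgoing = defaultdict(set)
    for source, destination in transfers:
        outgoing[source].add(destination)

    # A proxy is a non-watched account that sends money directly to a watched account.
    proxies = {
        account for account, destinations in outgoing.items()
        if destinations & watched_accounts and account not in watched_accounts
    }

    result = {}
    for sender, destinations in outgoing.items():
        if sender in proxies or sender in watched_accounts:
            continue
        used_proxies = destinations & proxies
        if used_proxies:
            result[sender] = used_proxies
    return result

# Hypothetical (from_account, to_account) transfer events.
transfers = [("acct_1", "acct_2"), ("acct_2", "watched_x"), ("acct_3", "acct_4")]
watched = {"watched_x"}
print(find_proxy_senders(transfers, watched))
```

Longer chains of intermediaries would need a path search rather than this two-hop check.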

Page 50: Fluturas presentation @ Big Data Conclave

7 Key “gotchas” ( best practices)

Page 51: Fluturas presentation @ Big Data Conclave

Lesson-1 : Think “Polyglot persistence”

Logical Business Model: Asset, Sensor, Parameters, Asset tags, Sensor tags, Events

• Column family ( Hbase/Cassandra ): Heavy duty write workloads

• Document db ( Mongo ): Photos, Videos, text

• Graph db ( Neo4j ): Inter relationships

• RDBMS ( Oracle ): Low velocity self service

“Different strokes for different folks”

Page 52: Fluturas presentation @ Big Data Conclave

Lesson-2 : Think “pattern extraction”

1. Collaborative filtering

2. Text Mining

3. Scoring Models ( Logistic etc )

Embedding one ML process can help SPOT patterns not previously seen
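As a small illustration of the third item, a logistic scoring model turns a weighted sum of features into a score between 0 and 1. The feature names, weights and bias below are hypothetical placeholders; in practice they would be fitted from historical data.

```python
import math

def logistic_score(features, weights, bias=0.0):
    """Return a score in (0, 1) from a weighted sum of features passed through the logistic function."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical churn-style features and weights (illustrative values only).
weights = {"days_since_last_visit": 0.05, "coupon_share": 1.2}
customer = {"days_since_last_visit": 17, "coupon_share": 0.03}

score = logistic_score(customer, weights, bias=-2.0)
print(round(score, 3))
```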

Page 53: Fluturas presentation @ Big Data Conclave

Lesson-3 : Think “Baby steps”

• 60-90 day Hadoop Sandbox

• Build quick wins to build momentum

• Pick a few low hanging use cases to demonstrate impact

No Big Bang !

Page 54: Fluturas presentation @ Big Data Conclave

Lesson-4 : Think “Data Products”

• Data Product = “Action an end user takes”

• EXAMPLE

• Watch List recommender vs tons of “feel good” graphs

• Next best action vs lots of dials, graphs

Focus on Outcomes more than Analysis

Page 55: Fluturas presentation @ Big Data Conclave

Lesson-5 : Think “MVP-Minimum Viable Product”

• Minimalist ... Key is to start simple

• Only core features ... No bells and whistles

• Get feedback from early adopters and enrich features

Page 56: Fluturas presentation @ Big Data Conclave

How can Big Data co-exist with existing DW solutions ?

Big Data Existing DW

Page 57: Fluturas presentation @ Big Data Conclave

Lesson-6 : Gracefully Co-exist

Existing DW

OSS BSS CRM

ETL

Existing BI tools

Radius logs, IP traffic logs, Comments

File copy / Bulk load / Agent based

Operational App Integration

Page 58: Fluturas presentation @ Big Data Conclave

Lesson-7 : Think “Biz backward … NOT Tech forward”

1. What is the business problem you are solving ? Tightly framed ?

2. Why is it important to solve this problem ?

3. What happens if we don't solve this problem ?

4. Is status quo an option ?

5. Is the business pain acknowledged ?

6. How would the end user “feel” when the product is deployed ?

7. Are budgets allocated ?

8. What is the actual use case to solve the pain ?

Connect with business @ a deeper level !

Page 59: Fluturas presentation @ Big Data Conclave

To summarize !

1. Think “Polyglot Persistence”

2. Think “Pattern Extraction”

3. Think “Crawl-Walk-Run”

4. Think “Data Products”

5. Think “MVP”

6. Think “Co-existence”

7. Think “Business Impact/Outcomes”

Page 60: Fluturas presentation @ Big Data Conclave

Taming and channelising the data beast is going to be a crucial capability for survival !

Page 61: Fluturas presentation @ Big Data Conclave

Please feel free to reach out …

[email protected]