Big Data for Everyman Erik Swan, Michael Wilde

Dec 20, 2014

A presentation given by Erik Swan, CTO/Co-Founder of Splunk, and Michael Wilde, Splunk Ninja, at the SXSW Interactive 2012 Conference on March 11, 2012.
Transcript
Page 1: Big Data for Everyman

Big Data for Everyman

Erik Swan, Michael Wilde

Page 2: Big Data for Everyman

Hi... We work at Splunk.

Page 3: Big Data for Everyman

We stare at data all day.

Page 4: Big Data for Everyman
Page 5: Big Data for Everyman

WTF is Big Data?!

Page 6: Big Data for Everyman

larger than small data?

Page 7: Big Data for Everyman

smaller than giant data?

Page 8: Big Data for Everyman

some cool sauce for DBAs?

Page 9: Big Data for Everyman

Aaaahhh, no.

Page 10: Big Data for Everyman

a simple way to describe a massive problem

*or opportunity depending on your p.o.v.

Page 11: Big Data for Everyman

Volume | Velocity | Variety | Variability

GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops

Big data comes out of machines

Page 12: Big Data for Everyman

Volume | Velocity | Variety | Variability

GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops

Machine-generated data is one of the fastest growing, most complex and most valuable segments of big data

Big data comes out of machines

Page 13: Big Data for Everyman
Page 14: Big Data for Everyman

no, not us. we're just nice guys who want to show you cool stuff

Page 15: Big Data for Everyman

you are a producer and consumer of data

building a service?

using an app?

Page 16: Big Data for Everyman

Location-Based Messaging and Intelligence For Your App and Your Customers

Seth Rabinowitz, CEO

James Rodmell, CTO

Page 17: Big Data for Everyman

2011-11-06 11:57:31,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.75496,-73.963853,60

2011-11-06 12:17:32,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.755001,-73.963886,70

2011-11-06 12:37:33,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754982,-73.963849,75

2011-11-06 12:57:34,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754984,-73.963883,85

2011-11-06 13:17:35,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754941,-73.9639,90

2011-11-06 13:37:36,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754948,-73.963874,90

2011-11-06 13:57:37,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754931,-73.963892,95

2011-11-06 14:17:38,50,00027d27-ae02-627d-a79a-fa0004d3a347,40.755232,-73.963522,100

2011-11-06 14:37:33,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.754979,-73.9639,100

Data! Good!

DATE/TIME

DEVICE ID

LAT/LONG

BATTERY STRENGTH
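A minimal sketch of reading rows like the ones on this slide, assuming they are plain comma-separated values. The column mapping is a guess based on the slide's labels: the second column carries no label, so it is kept as a raw string, and the last column is assumed to be battery strength. The name `parse_rows` and the dictionary keys are hypothetical.

```python
import csv
from datetime import datetime
from io import StringIO

# Three rows copied from the slide.
SAMPLE = """\
2011-11-06 11:57:31,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.75496,-73.963853,60
2011-11-06 12:17:32,65,00027d27-ae02-627d-a79a-fa0004d3a347,40.755001,-73.963886,70
2011-11-06 14:17:38,50,00027d27-ae02-627d-a79a-fa0004d3a347,40.755232,-73.963522,100
"""

def parse_rows(text):
    """Yield one dict per CSV row, using the labels shown on the slide."""
    for ts, unlabeled, device_id, lat, lon, battery in csv.reader(StringIO(text)):
        yield {
            "time": datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
            "unlabeled": unlabeled,  # assumption: the slide does not name this column
            "device_id": device_id,
            "lat": float(lat),
            "lon": float(lon),
            "battery": int(battery),  # assumption: last column is battery strength
        }

rows = list(parse_rows(SAMPLE))
for row in rows:
    print(row["time"], row["device_id"], row["lat"], row["lon"], row["battery"])
```

Once the rows are typed like this, questions such as "where was this device when the battery was low?" become simple filters over `rows`.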

Page 19: Big Data for Everyman

Oh, real quick. Did you check in

or tweet #splunk #sxsw

...please

Page 20: Big Data for Everyman

All this data can be pretty cool and empowering

Page 21: Big Data for Everyman

except one little

PROBLEM

Page 22: Big Data for Everyman

a lot of it looks like this

Page 23: Big Data for Everyman

13/Apr/2011 08:52:53,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.16,192.168.1.6,(empty),(empty),1099,135,epmap,(empty),0,1

13/Apr/2011 08:52:53,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.16,192.168.1.6,(empty),(empty),1100,43025,43025_tcp,(empty),0,1

13/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1048,135,epmap,(empty),0,1

13/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1049,43025,43025_tcp,(empty),0,1

13/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1051,135,epmap,(empty),0,1

13/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.75,192.168.1.6,(empty),(empty),1052,43025,43025_tcp,(empty),0,1

13/Apr/2011 08:52:55,Info,Teardown,ASA-session-6-302014,TCP,192.168.2.64,192.168.1.6,(empty),(empty),1694,135,epmap,(empty),0,1

Page 24: Big Data for Everyman

and we’re expected to talk to it like this

Page 25: Big Data for Everyman

select (select max(answer.answer) from answer where answer.member_id in (select member_id from team_members where project_id in (select project_id from project where Business_stream='Upstream' and stage='Appraise' and project_id in (select project_id from projectextra where subteam<>1))) and answer.page_id=page.page_id) as thinl, (select max(avgscore) from task_project where task_project.project_id not in (select project_id from projectextra where subteam=1) and task_project.project_id in (select project_id from project where stage='Appraise' and Business_stream = 'Upstream') and task_project.page_id=page.page_id) as bmax, (select max(answer) from answer where answer.page_id=page.page_id) as datamax, (select avg(avgscore) from task_project where project_id=1 and task_project.page_id=page.page_id) as projavg, (select avg(avgscore) from task_project where project_id not in (select project_id from projectextra where subteam=1) and task_project.page_id=page.page_id) as companyavg, (select avg(avgscore) from task_project where project_id not in (select project_id from projectextra where subteam=1) and project_id in (select project_id from project where Business_stream = 'Upstream') and task_project.page_id=page.page_id) as businessavg, page.* from page, riverorder where page.category_name='BusinessBoundaries' and stage_name='Appraise' and riverorder.category_name=page.category_name order by riverorder.riverorder, page.order_id

Page 26: Big Data for Everyman

It could be better. yes? better is good!

Page 27: Big Data for Everyman

{
    checkin: {
        badges: [],
        created: 1331454784,
        geolat: "30.2640941786",
        geolong: "-97.7414819408",
        mayor: {
            type: "nochange"
        },
        primarycategory: {
            fullpathname: "Food:American Restaurants",
            iconurl: "https://foursquare.com/img/categories/food/default.png",
            id: "4bf58dd8d48988d14e941735",
            nodename: "American Restaurants"
        },
        timezone: "America/Chicago",
        user: {
            gender: "male"
        },
        venue: {
            id: "4d752b1bba682d43e7563876",
            name: "CNN Grill @ SXSW (Max's Wine Dive)"
        }
    }
}

readable, ya think?

Page 28: Big Data for Everyman

source=foursquare | timechart count by checkin.venue.name

The languages to talk to data are getting better for us humans
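To make concrete what a search like this asks for, here is a rough Python sketch of "count check-ins per venue, bucketed by time". It is not how Splunk implements `timechart`; it only illustrates the idea. It assumes `created` is a Unix timestamp in seconds and uses a fixed one-hour bucket, which is an assumption of this sketch. The first event is trimmed from the check-in shown on the earlier slide; the other two, and the name `timechart_count`, are made up for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# Raw events, one JSON document per line.
# Line 1 is trimmed from the slide; lines 2 and 3 are hypothetical.
RAW = """\
{"checkin": {"created": 1331454784, "venue": {"name": "CNN Grill @ SXSW (Max's Wine Dive)"}}}
{"checkin": {"created": 1331455000, "venue": {"name": "CNN Grill @ SXSW (Max's Wine Dive)"}}}
{"checkin": {"created": 1331458500, "venue": {"name": "Hypothetical Venue"}}}
"""

def timechart_count(lines, span_seconds=3600):
    """Count events per (time bucket, venue name)."""
    counts = Counter()
    for line in lines:
        checkin = json.loads(line)["checkin"]
        created = checkin["created"]
        bucket = created - created % span_seconds  # floor to the start of the bucket
        counts[(bucket, checkin["venue"]["name"])] += 1
    return counts

chart = timechart_count(RAW.splitlines())
for (bucket, venue), n in sorted(chart.items()):
    start = datetime.fromtimestamp(bucket, tz=timezone.utc).isoformat()
    print(start, venue, n)
```

The one-line search and this loop express the same question; the difference is how much of the plumbing the person asking has to write.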

Page 29: Big Data for Everyman

Guys.. come on! Go back to the data please.

Page 30: Big Data for Everyman
Page 31: Big Data for Everyman

a simple way to describe a massive problem

A friend in Boulder can help

Need data?

Page 33: Big Data for Everyman
Page 34: Big Data for Everyman

Just when you think you’re all done, wait. There is another

consumer you may have forgotten

Page 35: Big Data for Everyman

Someone with a different

perspective sees your service as

input to theirs

Page 36: Big Data for Everyman

DEMAND REALTIME DATA IN A STREAM OVER THE WEB

IN JSON FORMAT
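As a small illustration of why a JSON stream is easy to consume, the sketch below reads newline-delimited JSON from any iterable of lines. The one-object-per-line framing is an assumption; the slide does not specify one. The helper `iter_json_events` and the sample payloads are hypothetical, and a list of bytes stands in for a live HTTP response body.

```python
import json

def iter_json_events(lines):
    """Yield one decoded object per non-empty line of a newline-delimited JSON stream."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if line:  # skip blank lines, which some streams send as keep-alives
            yield json.loads(line)

# Stand-in for a streaming response; the payloads are made up.
fake_stream = [
    b'{"venue": "A", "count": 1}\n',
    b'\n',
    b'{"venue": "B", "count": 2}\n',
]

events = list(iter_json_events(fake_stream))
for event in events:
    print(event["venue"], event["count"])
```

Because the function only needs an iterable of lines, the same loop works whether the source is a file, a socket, or an HTTP client that yields lines as they arrive.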

Page 37: Big Data for Everyman

Hey audience! We still have a few

minutes.

What questions might you have

been saving until this exact moment?

Page 38: Big Data for Everyman

Thanks.

Erik Swan, CTO/Co-Founder, Splunk

Michael Wilde, Splunk Ninja

Who else sends you on your way with a cute dog photo?