OCTOBER 11-14, 2016 • BOSTON, MA

Searching The Enterprise Data Lake With Solr - Watch Us Do It!: Presented by Paul Nelson, Search Technologies

Apr 16, 2017

LucidWorks
Transcript
Page 1: Searching The Enterprise Data Lake With Solr  - Watch Us Do It!: Presented by Paul Nelson, Search Technologies

Page 2

Searching the Enterprise Data Lake with Solr - Watch us do it!

Paul Nelson, Chief Architect, Search Technologies

Page 3

THERE WILL BE A DEMO. Stay Tuned!

Page 4

205+ Search Consultants Worldwide

[Map labels: San Diego; San Jose, CR; Cincinnati; Washington (HQ); London, UK; Frankfurt, DE; Prague, CZ; Manila, PH]

• Founded 2005
• Deep search expertise
• 900+ customers worldwide
• Consistent profitability
• Search engines & Big Data
• Vendor independent

Page 5

Agenda

• The Enterprise Data Lake (EDL)
• Why Search the EDL?
• The Process
• How To: Step By Step
• And then what?

Page 6

In The Beginning

[Diagram labels: Computer Users; Application; Database; Dashboards; Reports; Search & Troubleshooting; Alerts]

Page 7

This Evolved to Data Warehouses

[Diagram labels: Many Computer Users; Dozens of Applications; Extract, Transform, Load; Enterprise Data Warehouse; Dashboards; Reports; Search & Troubleshooting; Alerts]

Page 8

And Now the Enterprise Data Lake

[Diagram labels: Many, many, many Computer Users; Hundreds of Applications; Enterprise Data Lake; Raw Data and Processed Data; Dashboards; Reports; Search & Troubleshooting; Alerts; Analyze]

Page 9

What’s new about the Data Lake?

• Ingest RAW DATA
• Keep it FOREVER
• Make it ALL AVAILABLE
• Analyze it ONLY WHEN NEEDED
• Do it at MASSIVE SCALE

Page 10

Why the Data Lake?

• You never know what’s important up front
   – New data mining techniques invented daily
   – Therefore, keep everything
• There is too much data variety
   – Therefore, only process what you need
• Save money by not ETL’ing useless stuff
• There are many different use cases
   – Shared re-use of data by anyone
   – Data is power! Power to the people!

Page 11

But Now There’s a Problem:

• Tens of thousands of databases
• Billions of records

How to find the data you need?

Page 12

SO LET’S SEARCH THE DATA LAKE

Page 13

“People today think search and big data are separate but in two or three years, everyone will wonder why we ever thought that.”

Doug Cutting, Chief Architect, Cloudera; Creator of Lucene & Hadoop

Page 14

The Process

1. Ingest
2. Research the Data
3. Configure Solr
4. Parse & Index
5. Search & Analyze
6. Production

Page 15

1. Ingest

[Diagram labels: Load Data; HDFS; Hadoop]
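The ingest step comes down to copying source files into HDFS. The following is a minimal sketch, assuming the `hdfs` command-line client is available; the local file and the HDFS directory are hypothetical names chosen for illustration, and the commands are only assembled and printed here, not executed.

```python
import shlex


def build_ingest_commands(local_files, hdfs_dir):
    """Build `hdfs dfs` commands that create hdfs_dir and upload local_files into it."""
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    for path in local_files:
        commands.append(["hdfs", "dfs", "-put", path, hdfs_dir])
    return commands


# Hypothetical paths, for illustration only.
ingest_commands = build_ingest_commands(["exports/orders.csv"], "/data/raw/orders")
for argv in ingest_commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each argument list could be handed to `subprocess.run` on a machine with a configured Hadoop client.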

Page 16

2. Research the Data

[Diagram labels: Research; HDFS; Hadoop]
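One way to research a data set before designing a schema is to profile a small sample of it. The sketch below assumes the sample is comma-delimited text with a header row (the slides do not say what format the demo data uses); the column names and the `profile_columns` helper are hypothetical.

```python
import csv
import io
from collections import Counter


def profile_columns(csv_text, max_examples=3):
    """Summarize each column of a delimited sample: fill count, distinct values, guessed type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    stats = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if name not in stats:
                continue  # extra, unnamed cells on an over-long row
            if value not in (None, ""):
                stats[name][value] += 1
    profile = {}
    for name, counts in stats.items():
        values = list(counts)
        if not values:
            guess = "unknown"
        elif all(v.lstrip("-").isdigit() for v in values):
            guess = "integer"
        else:
            guess = "text"
        profile[name] = {
            "filled": sum(counts.values()),
            "distinct": len(values),
            "type": guess,
            "examples": values[:max_examples],
        }
    return profile


# Hypothetical sample rows, for illustration only.
SAMPLE = "order_id,customer,notes\n1001,Acme,\n1002,Acme,rush order\n1003,Globex,\n"
report = profile_columns(SAMPLE)
for column, summary in report.items():
    print(column, summary)
```

Fill counts and distinct counts of this kind help decide which fields are worth indexing and which are good facet candidates.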

Page 17

3. Configure Solr

[Diagram labels: solrconfig.xml; schema.xml; HDFS; Hadoop]
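Much of the schema.xml work is declaring one `<field>` per attribute found during data research. As a rough sketch, the helper below turns a field-to-type mapping into `<field>` elements. The field names are hypothetical, and the `string` and `text_general` types are assumed to be defined as field types in the schema being edited.

```python
import xml.etree.ElementTree as ET


def build_schema_fields(field_types, stored=True):
    """Return <field> elements for schema.xml from a {field name: Solr type} mapping."""
    elements = []
    for name, solr_type in field_types.items():
        elements.append(ET.Element("field", {
            "name": name,
            "type": solr_type,
            "indexed": "true",
            "stored": "true" if stored else "false",
        }))
    return elements


# Hypothetical fields; the types are assumed to exist in the target schema.
fields = build_schema_fields({
    "order_id": "string",
    "customer": "string",
    "notes": "text_general",
})
schema_lines = [ET.tostring(field, encoding="unicode") for field in fields]
print("\n".join(schema_lines))
```

The printed elements would be pasted into the fields section of schema.xml; solrconfig.xml is not touched by this sketch.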

Page 18

4. Parse & Index

[Diagram labels: Morphlines; Index; HDFS; Hadoop]
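A morphline is a chain of commands, each of which transforms a record and passes it on. The sketch below imitates that idea in plain Python to show the data flow; it is not the Kite Morphlines API, and real morphlines are written as HOCON configuration files. The two commands loosely mirror what `readCSV` and `sanitizeUnknownSolrFields` do, and all field names are hypothetical.

```python
import csv


def read_csv(columns):
    """Command: split the raw 'message' line into named fields."""
    def command(record):
        values = next(csv.reader([record["message"]]))
        if len(values) != len(columns):
            return None  # malformed line: drop the record
        return dict(zip(columns, values))
    return command


def keep_fields(allowed):
    """Command: discard fields that the Solr schema does not define."""
    def command(record):
        return {key: value for key, value in record.items() if key in allowed}
    return command


def run_chain(commands, records):
    """Pass each record through every command in order; a None result drops it."""
    for record in records:
        for command in commands:
            record = command(record)
            if record is None:
                break
        else:
            yield record


# Hypothetical input lines and field names, for illustration only.
chain = [read_csv(["order_id", "customer", "notes"]), keep_fields({"order_id", "customer"})]
lines = [{"message": "1001,Acme,rush order"}, {"message": "broken line"}]
documents = list(run_chain(chain, lines))
print(documents)
```

In an actual pipeline the last command in the chain would load each surviving record into Solr rather than collect it in a list.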

Page 19

5. Search & Analyze

[Diagram labels: Hue; Index; Morphlines; HDFS; Hadoop]
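Dashboards such as Hue ultimately issue Solr queries with facets. As a small sketch of what such a request looks like, the function below builds a select URL using standard Solr query parameters (`q`, `rows`, `wt`, `facet`, `facet.field`). The host, collection name, and field names are hypothetical.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, collection, query, facet_fields, rows=10):
    """Build a Solr select URL that returns matching documents plus facet counts."""
    params = [("q", query), ("rows", rows), ("wt", "json"), ("facet", "true")]
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host, collection, and fields, for illustration only.
search_url = build_facet_query(
    "http://localhost:8983/solr", "orders", "customer:Acme", ["customer"]
)
print(search_url)
```

Fetching the URL (for example with `urllib.request.urlopen`) against a running Solr instance would return the hits together with per-value counts for each facet field.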

Page 20

6. Move to Production

• Testing, Quality Control
   – Field processing
   – Search Features
   – Analytics
• Incremental Processing
   – Flume, Spark Streaming, Incremental Batches
• Workflow / Scheduled Jobs (Oozie)
• Security Controls
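Of the incremental processing options listed, incremental batches are the simplest to illustrate: each scheduled run indexes only the files that arrived since the previous run. The sketch below assumes a file listing with modification times is already available (for instance parsed from an HDFS directory listing); the `select_new_files` helper, the paths, and the timestamps are hypothetical.

```python
def select_new_files(listing, last_indexed):
    """Return the paths modified after last_indexed, oldest first, plus the new watermark.

    listing is an iterable of (path, modification_time) pairs.
    """
    fresh = sorted((mtime, path) for path, mtime in listing if mtime > last_indexed)
    new_watermark = fresh[-1][0] if fresh else last_indexed
    return [path for _, path in fresh], new_watermark


# Hypothetical listing; times are plain integers standing in for epoch seconds.
listing = [
    ("/data/raw/orders/a.csv", 100),
    ("/data/raw/orders/b.csv", 250),
    ("/data/raw/orders/c.csv", 300),
]
batch, watermark = select_new_files(listing, last_indexed=200)
print(batch, watermark)
```

A scheduled job would persist the returned watermark after a successful indexing run and pass it back in on the next run.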

Page 21

WATCH US DO IT!

Page 22

Resources

• HDFS File System Commands
   – https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/FileSystemShell.html
• solrctl Reference Guide
   – https://www.cloudera.com/documentation/enterprise/5-7-x/topics/search_solrctl_ref.html
• Morphlines Reference Guide
   – http://kitesdk.org/docs/1.1.0/morphlines/morphlines-reference-guide.html
   – https://github.com/typesafehub/config/blob/master/HOCON.md
• MapReduce Indexer Tool
   – https://github.com/cloudera/search/tree/cdh5-1.0.0_5.2.1/search-mr
• Crunch Indexer
   – https://github.com/cloudera/search/tree/cdh5-1.0.0_5.2.1/search-crunch
• Lily HBase Indexer
   – http://www.cloudera.com/documentation/enterprise/latest/topics/search_hbase_batch_indexer.html

Page 23

What’s Next

• Explore other analytic interfaces
   – Banana, Zoom Data
• Spark
   – Streaming Data
   – Complex Analytics → Store results in Solr → More analytics!
• Index Many More Collections
   – Create a Process: Data research → Data Model Design → Implement
• Self-Service Ingestion
   – Document processes for others to use
   – Templates for ingestion
• Hire Search Technologies!

Page 24

QUESTIONS? ANSWERS! Thank you!
