Webinar: Fusion for Business Intelligence

Post on 19-Jan-2017


Category:

Technology


Transcript

Fusion for Business Intelligence

Allan Syiek, Senior Sales Engineer
September 14, 2016

Session Objectives

By the end of this session, you will:
– Have a high-level awareness of the variety of search and discovery functionality available
– Select the right product for a particular use case
– Know why this baby is so happy

Agenda
• The Beer and Diaper Legend
• DIKW Pyramid
• What is Enterprise Search
• Indexing 101
• Statistics vs. Data Mining vs. Machine Learning
• What is Business Intelligence
• Where does Fusion Fit?

Parable of the Beer and the Diapers

Illustrates the difference between querying and data mining, already firmly enshrined in BI mythology.
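The distinction the parable draws can be sketched in a few lines of Python. This is only an illustration: the baskets are invented, and the function names and the 0.4 support threshold are assumptions, not anything from the talk. A query answers a question you already thought to ask; mining scores every item pair and surfaces the ones that stand out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, invented for illustration only.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "diapers", "milk"},
    {"milk", "bread"},
    {"chips", "bread"},
    {"milk", "chips"},
]

def query_count(item_a, item_b):
    """Querying: count baskets for a pair you already suspect."""
    return sum(1 for b in baskets if item_a in b and item_b in b)

def mine_pairs(min_support=0.4):
    """Data mining: score every pair and keep the frequent ones."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets with both
        if support < min_support:
            continue
        confidence = count / item_counts[a]       # estimate of P(b | a)
        lift = confidence / (item_counts[b] / n)  # > 1 means positive association
        rules.append((a, b, support, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)

rules = mine_pairs()
print(query_count("beer", "diapers"))
print(rules)
```

With real data the mining step would run over millions of baskets and larger item sets, but the support, confidence and lift measures stay the same.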

The DIKW Pyramid (Data, Information, Knowledge, Wisdom)

What is Enterprise Search

Q. What do you do with a mountain of data located everywhere?
A. Depends… What do you need it for?

Aspects of Enterprise Search

• Crawling, Parsing, Indexing, Searching
• Advanced Searches
• Searching Structured Data
• Searching Unstructured Data
• Metadata
• Ranking
• Results
• Access Control
• UI
• Tuning
• Reporting
• Scale and Performance

Index Pipeline

• Tika Parser
• Exclusion Filter
• Field Mapper
• HTML Transform Stage
• XML Transform Stage
• OpenNLP Entity Extraction
• Gazetteer Extraction
• Regular Expression
• Aggregating
• Javascript (custom scripts)
• …and others…

Query Pipeline

• Search Fields/Parameters
• Facets
• Landing Pages
• Boost Documents
• Block Documents
• Security Trimming
• Recommendation Boosting
• Rollup Aggregator
• Sub Query
• Javascript (custom scripts)
• …and others…

[Diagram labels: Documents, Index Pipeline, Search Collection, Query Pipeline, Search UI]
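The pipeline idea, a document passing through an ordered chain of stages before it reaches the collection, can be sketched generically. This is not Fusion's API: the stage functions, field names and the ".tmp" exclusion rule below are hypothetical stand-ins that only borrow the stage names from the slide.

```python
import re

def exclusion_filter(doc):
    """Drop documents matching an excluded pattern (assumed rule: *.tmp)."""
    if doc["id"].endswith(".tmp"):
        return None
    return doc

def field_mapper(doc):
    """Rename a source field to the name the index expects (assumed names)."""
    doc = dict(doc)
    if "body" in doc:
        doc["text"] = doc.pop("body")
    return doc

def regex_extractor(doc):
    """Pull e-mail addresses out of the text into their own field."""
    doc = dict(doc)
    doc["emails"] = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", doc.get("text", ""))
    return doc

def run_pipeline(doc, stages):
    """Apply each stage in order; a stage returning None drops the document."""
    for stage in stages:
        doc = stage(doc)
        if doc is None:
            return None
    return doc

stages = [exclusion_filter, field_mapper, regex_extractor]
indexed = run_pipeline({"id": "a.txt", "body": "mail bob@example.com now"}, stages)
dropped = run_pipeline({"id": "scratch.tmp", "body": "ignore me"}, stages)
print(indexed, dropped)
```

A query pipeline follows the same pattern, except that the object passed from stage to stage is the query (and later the result list) rather than a document.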

Indexing 101

An index is a system used to make finding information easier.

• Every word is converted into a wordID by using an in-memory hash table, the lexicon.
• Occurrences in the current document are translated into hit lists and are written into the forward "barrels".
• The inverted barrels hold the same hits, re-sorted by wordID.
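A toy version of these steps, assuming whitespace tokenization and plain Python dictionaries in place of on-disk barrels (all names here are illustrative):

```python
from collections import defaultdict

lexicon = {}  # in-memory hash table: word -> wordID

def word_id(word):
    """Assign the next free integer ID the first time a word is seen."""
    if word not in lexicon:
        lexicon[word] = len(lexicon)
    return lexicon[word]

def build_forward(docs):
    """Forward index: doc_id -> {wordID: [positions]} (the hit lists)."""
    forward = {}
    for doc_id, text in docs.items():
        hits = defaultdict(list)
        for pos, word in enumerate(text.lower().split()):
            hits[word_id(word)].append(pos)
        forward[doc_id] = dict(hits)
    return forward

def invert(forward):
    """Inverted index: wordID -> [(doc_id, positions)], sorted by wordID."""
    inverted = defaultdict(list)
    for doc_id, hits in forward.items():
        for wid, positions in hits.items():
            inverted[wid].append((doc_id, positions))
    return {wid: sorted(postings) for wid, postings in sorted(inverted.items())}

docs = {1: "beer and diapers", 2: "diapers and more diapers"}
forward = build_forward(docs)
inverted = invert(forward)
print(inverted[lexicon["diapers"]])
```

Looking up a word then costs one hash lookup in the lexicon plus one read of its posting list, instead of a scan over every document.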

 

Indexing 101 - Ranking

• Score results for presentation
– Weighted by Term Frequency-Inverse Document Frequency (TF-IDF)
– Clustering
– Complex proprietary algorithms
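As a rough illustration of TF-IDF weighting, the sketch below scores documents against a query using the plain textbook formula (term frequency times the log of inverse document frequency). It is not the scoring used by Solr or any other product; the function name and sample documents are assumptions.

```python
import math
from collections import Counter

def tf_idf_scores(query, docs):
    """Rank docs (doc_id -> text) for a query with textbook TF-IDF."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))
    scores = {}
    for doc_id, words in tokenized.items():
        if not words:
            scores[doc_id] = 0.0
            continue
        counts = Counter(words)
        score = 0.0
        for term in query.lower().split():
            if df[term] == 0:
                continue  # term appears nowhere, contributes nothing
            tf = counts[term] / len(words)
            idf = math.log(n_docs / df[term])
            score += tf * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

docs = {
    "a": "beer and diapers",
    "b": "diapers and wipes",
    "c": "wine and cheese",
}
ranked = tf_idf_scores("beer diapers", docs)
print(ranked)
```

Document "a" ranks first because it contains both query terms, and "beer" counts for more than "diapers" because it appears in fewer documents.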

   

Indexing 101 - Relevance

Statistics vs. Data Mining vs. Machine Learning

– Statistics quantifies numbers
– Data Mining explains patterns
– Machine Learning predicts with models
– Artificial Intelligence behaves and reasons

What is Business Intelligence

• BI technologies provide historical, current and predictive views of business operations
• Business intelligence is made up of an increasing number of components, including:
– Multidimensional aggregation and allocation (OLAP, Online Analytical Processing)
– Denormalization, tagging and standardization (relational database)
– Real-time reporting with analytical alerts
– A method of interfacing with unstructured data sources (data mining)
– Group consolidation, budgeting and rolling forecasts
– Statistical inference and probabilistic simulation
– Key performance indicator optimization
– Version control and process management
– Open item management

Why Fusion for Log Analytics?

• Secure access to dashboards
• ETL of logs using index pipelines
• Run analysis models on logs with Spark and leverage them in an ML index pipeline
• Time series index management

Massive-scale log analytics

• Index billions of log events per day, in real time
• Recent event and historical analysis: analyze logs over time (today, recent, past week, past 30 days, …)
• Easy-to-use dashboards to visualize common questions and allow for ad hoc analysis
• Ability to scale linearly as business grows … with sub-linear growth in costs!
• Easy to set up, easy to manage, easy to use
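The "analyze logs over time" idea comes down to counting events inside rolling time windows. A minimal in-memory sketch, assuming events arrive as (timestamp, level) pairs; the window names and sample events are invented, and a real deployment would run this as a query against the index rather than in Python:

```python
from collections import Counter
from datetime import datetime, timedelta

def count_by_window(events, now, windows):
    """Count log levels per window. windows maps a name to a timedelta."""
    counts = {name: Counter() for name in windows}
    for ts, level in events:
        age = now - ts
        for name, span in windows.items():
            if timedelta(0) <= age <= span:
                counts[name][level] += 1
    return counts

now = datetime(2016, 9, 14, 12, 0)
windows = {
    "past 24 hours": timedelta(days=1),
    "past week": timedelta(days=7),
    "past 30 days": timedelta(days=30),
}
events = [
    (now - timedelta(hours=1), "ERROR"),
    (now - timedelta(days=3), "ERROR"),
    (now - timedelta(days=20), "WARN"),
    (now - timedelta(days=40), "ERROR"),  # older than every window
]
counts = count_by_window(events, now, windows)
print(counts)
```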

Signals & Recommendations

Fusion can capture, store, and aggregate signals from a variety of sources to drive predictive search capabilities and continuous relevancy tuning.

Signals can include:
• Clicks and queries
• Add-to-cart and purchase behavior
• Geo-location
• User behavior and preferences
• User history and past orders
• Device
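One way to picture signal aggregation feeding relevancy: sum weighted signals per query and document, then use the totals to boost base search scores. The sketch below is a generic illustration, not Fusion's signals API; the weights, the multiplicative boost formula and all names are assumptions.

```python
from collections import defaultdict

# Assumed weight per signal type; the values are illustrative only.
SIGNAL_WEIGHTS = {"click": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

def aggregate_signals(signals):
    """Sum weights from (query, doc_id, signal_type) events per query and doc."""
    boosts = defaultdict(lambda: defaultdict(float))
    for query, doc_id, signal_type in signals:
        boosts[query.lower()][doc_id] += SIGNAL_WEIGHTS.get(signal_type, 0.0)
    return boosts

def rerank(query, base_scores, boosts, strength=0.1):
    """Multiply each base score by (1 + strength * aggregated signal weight)."""
    doc_boosts = boosts.get(query.lower(), {})
    boosted = {
        doc: score * (1 + strength * doc_boosts.get(doc, 0.0))
        for doc, score in base_scores.items()
    }
    return sorted(boosted, key=boosted.get, reverse=True)

signals = [
    ("diapers", "d2", "click"),
    ("diapers", "d2", "purchase"),
    ("Diapers", "d1", "click"),
]
boosts = aggregate_signals(signals)
order = rerank("diapers", {"d1": 1.0, "d2": 0.8}, boosts)
print(order)
```

Here "d2" starts with the lower base score but overtakes "d1" because users who searched for "diapers" clicked and purchased it.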

Visualization & Insight with SILK

• SILK dashboards provide a rich visual interface for users to search, inspect and visualize event/log data
• Gives users the power to perform ad hoc search and analysis on massive amounts of multi-structured and time-series data
• Real-time insights and trends for on-the-fly decision making using the most accurate and up-to-date data
• Users can share visualizations and dashboards

Where Does Fusion Fit?

[Architecture diagram; recoverable labels:]
• Connectors: Database, Web, File, Logs, Hadoop, Cloud
• Core Services: Alerting/Messaging, NLP, Pipelines, Blob Storage, Scheduling, Recommenders/Signals
• Apache Spark: Worker, Worker, Cluster Mgr.
• Apache Solr: Shards, Shards; HDFS (optional)
• Apache ZooKeeper (ZK 1 … ZK N): Shared Config Mgmt, Leader Election, Load Balancing
• Other labels: REST API, Admin UI, Lucidworks View, Security built-in

Learn more at lucenerevolution.org

Thank You

Q & A
