Big Data Meetup Machine Data Analytics Raghuram Velega IBM Software Architect Big Data Analytics © 2013 IBM Corporation

Machine Data Analytics

Jan 15, 2015


Technology

Nicolas Morales

Gain New Insights by Analyzing Machine Logs using Machine Data Analytics and BigInsights.

Half of Fortune 500 companies experience more than 80 hours of system downtime annually. Spread evenly over a year, that amounts to approximately 13 minutes every day. As a consumer, the thought of online bank operations being inaccessible so frequently is disturbing. As a business owner, when systems go down, all processes come to a stop. Work in progress is destroyed, and failure to meet SLAs and contractual obligations can result in expensive fees, adverse publicity, and loss of current and potential future customers. Ultimately, the inability to provide a reliable and stable system results in lost revenue. While the failure of these systems is inevitable, the ability to predict failures in time and intercept them before they occur is now a requirement.

A possible solution to the problem can be found in the huge volumes of diagnostic data generated at the hardware, firmware, middleware, application, storage and management layers, which indicate failures or errors. Machine analysis and understanding of this data is becoming an important part of debugging, performance analysis, root cause analysis and business analysis. In addition to preventing outages, machine data analysis can also provide insights for fraud detection, customer retention and other important use cases.
Transcript
Page 1: Machine Data Analytics

Big Data Meetup: Machine Data Analytics

Raghuram Velega

IBM Software Architect, Big Data Analytics

© 2013 IBM Corporation

Page 2: Machine Data Analytics

Relevant Operations Data is Huge

A typical enterprise of 5,000 servers with 125 applications across 2 or 3 data centers generates in excess of 1.4 TB of data per day

• 9 GB of storage data per day: 175K fiber ports

175 fiber ports, 10 metrics per port, collected every 5 minutes, 0.5 KB per port

25K volumes, 10 metrics per volume, 0.5 KB per volume

0.5 KB × (65K ports and volumes) × 12 × 24 = 9.3 GB/day

• 2 GB of network performance data for data center networks (not access networks)

180 x 64-port switches and 4 routers to manage the physical network.

Data flow of approximately 1 TB of unstructured data and 0.4 TB of metric data per day. Scaled to 20K servers: approximately 4 TB unstructured and 1.6 TB metric data

Daily Metric Output:

• 250 MB of event data from 125,000 events

• 125 MB of endpoint management data from 5K servers

• 12 GB of performance data for 5,000 servers

• 1 GB of performance data for 5,000 virtual machines

• 8 GB of application middleware data

Assumptions: 40% of servers running monitored middleware

Average 60 metrics each, collected every 15 minutes

Average PMDB insert 1000 bytes, 40 inserts/server

• 500 MB of application transaction tracking data for 125 applications

• 1 TB of log file data per day

200 MB average per server (some will be smaller, some larger)

Example: WAS instances typically produce 400-750 MB of logs per day

• 0.35 TB of security data collected per day

Operational data is growing 15-20% per year.
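The arithmetic behind three of these estimates can be checked with a short script. This is only a sketch under stated assumptions: decimal units (1 KB = 1000 bytes), 0.5 KB per port or volume per sample, and "40 inserts/server" read as 40 inserts per monitored server per 15-minute collection interval. The slide does not spell out any of these, and the function names are illustrative.

```python
# Back-of-the-envelope check of three daily-volume estimates from the slide.
# Assumption: decimal units (1 KB = 1000 bytes); the slide does not say.
KB, MB, GB, TB = 1e3, 1e6, 1e9, 1e12


def storage_bytes_per_day(entities=65_000, bytes_per_sample=0.5 * KB,
                          samples_per_hour=12, hours=24):
    """Ports and volumes sampled every 5 minutes at 0.5 KB per sample."""
    return entities * bytes_per_sample * samples_per_hour * hours


def middleware_bytes_per_day(servers=5000, monitored_fraction=0.40,
                             inserts_per_server=40, bytes_per_insert=1000,
                             minutes_between_samples=15):
    """Assumes 40 inserts per monitored server at every collection interval."""
    samples_per_day = 24 * 60 // minutes_between_samples  # 96 intervals
    monitored_servers = servers * monitored_fraction
    return (monitored_servers * inserts_per_server
            * bytes_per_insert * samples_per_day)


def log_bytes_per_day(servers=5000, avg_bytes_per_server=200 * MB):
    """Average log volume per server times the number of servers."""
    return servers * avg_bytes_per_server


if __name__ == "__main__":
    print(f"storage:    {storage_bytes_per_day() / GB:.2f} GB/day")
    print(f"middleware: {middleware_bytes_per_day() / GB:.2f} GB/day")
    print(f"log files:  {log_bytes_per_day() / TB:.2f} TB/day")
```

Under these assumptions the results land close to the slide's rounded figures of 9 GB, 8 GB and 1 TB per day.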

Page 3: Machine Data Analytics

Shifting market for IT Operations

• APM Digest survey* of Senior IT Ops @ Fortune 500

− 50% growing dissatisfaction with traditional performance management solutions for Production IT

− Inability to adapt to rapidly changing applications & workloads

− 30% of them believe that they do not have a way to proactively detect problems

− Looking to operate on raw data and gain actionable insights

• IT Analytics solutions can predict, detect and help solve problems by churning through piles of data and translating this to understandable, relevant information, and actionable insights.

* Source: APMDigest: http://apmdigest.com/it-analytics-emerging-as-dissatisfaction-grows-with-apm-and-bsm-tools

Operational Visibility

IT Overwhelmed by data

Page 4: Machine Data Analytics

Simple ad-hoc and scheduled Reporting to enable comparison of multiple metrics and data-sources

Streaming data analytics to provide realtime information and process Big Data volumes easily

Predictive Analytics enables forecasting and trending to provide foresight in resource demand, capacity & availability and clarify potential risks.

Provide holistic and accurate diagnosis by using guiding technology with behavioral learning capabilities.

Advanced correlation and pattern recognition to identify and resolve complex and undetectable events in real-time.

Performance trending to plan for growth

Self-learning capabilities to automatically adapt to change

Detect capacity issues prior to business impact

Notice problems sooner and more accurately

Reduce false alerts to lower management costs

Automated threshold setting for quicker deployment

Leveraging analytics for IT Operations

Exploiting IBM’s breadth of Analytics Initiatives

Proactively mitigate risk, attain insights to optimize actions, and reduce cost of ownership across Business, IT Operations, Asset Management, and more.

InfoSphere BigInsights

Page 5: Machine Data Analytics

• How should I plan maintenance to efficiently keep my assets operational, given what I know today about my six month resource availability.

• “What-if” we change our preventive maintenance strategy?

• Help me track capacity and performance of applications & services in cloud / virtual environments, when do I need to add more capacity?

• Show me how to reduce cost of running my virtual infrastructure & making it more compliant with best practices.

Search

• What is driving my high maintenance costs and what can I do to address this?

• How can I reduce reserved material inventory due to work order backlog?

• How do we make sense out of the terabytes of metric and log data that is generated by our applications and the infrastructure on which they run to isolate problems and reduce downtime?

• Can I use analysis of my channel traffic analysis to achieve improved customer insight and intelligence?

Optimize

IT Operations needs analytics to predict, to search and to optimize

Predict

• Can we predict/project failure occurrences for specific asset types?

• How can we get early warning of failures in my critical retail applications?

• Can I predict which KPIs are going to cause application issues without manually configuring thresholds? I have 100s of thousands of KPIs.

• I want to predict my online banking outages and take corrective actions before customers hit them.

Page 6: Machine Data Analytics

How Can the Big Data Platform Help?

Raghuram Velega - IBM Software Architect (Big Data Analytics)

Page 7: Machine Data Analytics

Enabling organizations to

• Assemble and combine relevant mix of information

• Discover and explore with smart visualizations

• Analyze, predict and automate for more accurate answers

• Take action and automate processes

• Optimize analytical performance and IT costs

• Reduce infrastructure complexity and cost

• Manage, govern and secure information

Performance Management

Content Analytics

Decision Management

Risk Analytics

Business Intelligence and Predictive Analytics

Information Integration and Governance

BIG DATA PLATFORM

SECURITY, SYSTEMS, STORAGE AND CLOUD

Sales | Marketing | Finance | Operations | IT | Risk | HR

ANALYTICS

SOLUTIONS

Industry

CONSULTING and IMPLEMENTATION SERVICES

Content Management

Data Warehouse

Stream Computing

Hadoop System

IBM Provides a Holistic and Integrated Approach to Big Data and Analytics

Page 8: Machine Data Analytics

Accelerators

Information Integration & Governance

Data Warehouse

Stream Computing

Hadoop System

Discovery

Application Development

Systems Management

Data Media Content Machine Social

BIG DATA PLATFORM

The Platform for New Insight and Applications

InfoSphere Streams: Analyze streaming data and large data bursts for real-time insights

InfoSphere BigInsights: Cost-effectively analyze petabytes of unstructured and structured data

InfoSphere Data Explorer: Discover, understand, search, and navigate federated sources of big data

Page 9: Machine Data Analytics

Big Data Exploration: Find, visualize, understand all big data to improve business knowledge

Enhanced 360° View of the Customer: Achieve a true unified view, incorporating internal and external sources

Operations Analysis: Analyze a variety of machine data for improved business results

Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency

Security/Intelligence Extension: Lower risk, detect fraud and monitor cyber security in real-time

The 5 High Value Big Data Use Cases

Page 10: Machine Data Analytics

Observed Big Data Use Cases

10 12/11/2013

[Bar chart: number of observed use cases per category, horizontal axis from 0 to 200]

Categories, in chart order: BigInsights as NoSQL store; Transportation/SCM; Medical/Transcriptional Profiling; File storage or ECM offload; Event Processing; Smart Grid Apps; Environmental Sensor apps; Real Time Processing; Fraud/Risk; Financial Apps Algo Trading; Statistical/predictive Analysis; Geospatial Location/Space exploration; Cyber Security; Analytic Apps; Audio, Video, Image Analysis; Telco Apps; Text Analytics; Database Offload, reporting, mining; Customer behavior/Social analysis; Machine Data Analysis

Bar values, in chart order: 4, 5, 8, 8, 10, 13, 13, 14, 18, 19, 20, 22, 23, 24, 29, 32, 71, 139, 143, 197

Source: multiple websites; n=933, data available for n=812; counts of use cases are not mutually exclusive

Page 11: Machine Data Analytics

Big Data Creates A Challenge – And an Opportunity

What If You Could...

Traditional Big Data Approach

Leverage All of the Data Captured

Reduce Effort Required to Leverage Data

Let Data Lead The Way, and continuously explore

Leverage data as it is captured – In Motion

Page 12: Machine Data Analytics

IBM InfoSphere BigInsights: Machine Data Analytics

Page 13: Machine Data Analytics

Machine Data Analytics: Customer Example

• Intelligent Infrastructure Management: log analytics, energy bill forecasting, energy consumption optimization, anomalous energy usage detection, presence-aware energy management

• Optimized building energy consumption with centralized monitoring; Automated preventive and corrective maintenance

• Utilized InfoSphere Streams, InfoSphere BigInsights, IBM Cognos

Would Operations Analysis benefit you?

• Do you deal with large volumes of machine data?

• How do you access and search that data?

• How do you perform root cause analysis?

• How do you perform complex real-time analysis to correlate across different data sets?

• How do you monitor and visualize streaming data in real time and generate alerts?

Product Starting Point: InfoSphere BigInsights, InfoSphere Streams

Page 14: Machine Data Analytics

Raw Logs and Machine Data

Indexing, Search

Statistical Modeling

Root Cause Analysis

Federated Navigation & Discovery

Real-time Analysis

Only store what is needed

BigInsights: Machine Data Analytics

Machine Data Accelerator

Page 15: Machine Data Analytics

Taking Full Advantage of Machine Data Requires New Thinking

Machine Data Characteristics

• From a variety of complex systems with complex formats – no standards

• May not always have context

• Structured and unstructured data

• Extremely large volumes of data

• Streaming data as well as data at rest

• Time sensitive - agile in interpretation and ability to respond

• Requires sophisticated text analysis

• Adaptive/dynamic algorithms to efficiently process data

• Large scale indexing

Page 16: Machine Data Analytics

Taking Full Advantage of Machine Data Requires New Thinking

• Correlation across different data sets and/or different environments

• Data may need to be enriched or transformed to provide proper context

• Causal analysis (if there is a problem on Tuesday, what happened on Monday to cause it?)

• Pattern analysis

• Time and spatial based analysis

• Unique visualization/UI needs based on data type and industry/application

• Sophisticated search capabilities

Page 17: Machine Data Analytics

Customer Usage Pattern of Log Analysis with MDA

• Step 1:

− “What is happening in my systems?”

• Step 2:

− “Let me try to use my experience to correlate the events and sequence”

• Step 3:

− “I need a tool to do Step 2 – I have too many systems and too many logs”

• Step 4:

− “I need to combine with my system KPI data and monitor / report in a dashboard. Provide possible solutions to the problem / anomaly”

• Step 5:

− “I need to predict the behavior when I make changes, add error codes, or add new systems”

Page 18: Machine Data Analytics

Step 1: What is happening in my system?

• This is accomplished by collecting all the log data, then extracting, parsing, indexing, and searching it through a faceted interface.

• This is also the phase where basic event-level metrics (max, min, counts, built-in range metrics, alerts when KPIs are not in range) are desired and tested.

• Dashboards that are dynamic, actionable, and in sync with the searches are highly desirable.

• The MDA provides the faceted search interface.

• KEY TECHNOLOGIES – Text Analytics, Faceted Search, BI
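To make the Step 1 ideas concrete, here is a minimal Python sketch of facet counts, facet-filtered search, basic event-level metrics and a KPI range check over already-parsed records. It is not the MDA implementation: the record fields (`server`, `severity`, `response_ms`) and all function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical parsed log records; the field names are illustrative only.
records = [
    {"server": "web01", "severity": "ERROR", "response_ms": 950},
    {"server": "web01", "severity": "INFO", "response_ms": 120},
    {"server": "web02", "severity": "ERROR", "response_ms": 2300},
    {"server": "web02", "severity": "WARN", "response_ms": 400},
]


def facet_counts(records, field):
    """Count records per distinct value of one field (one search facet)."""
    return Counter(r[field] for r in records if field in r)


def search(records, **selected):
    """Keep records whose fields equal every selected facet value."""
    return [r for r in records
            if all(r.get(k) == v for k, v in selected.items())]


def basic_metrics(records, field):
    """Count, min and max of a numeric field across the records."""
    values = [r[field] for r in records if field in r]
    if not values:
        return {"count": 0, "min": None, "max": None}
    return {"count": len(values), "min": min(values), "max": max(values)}


def out_of_range(records, field, low, high):
    """Return records whose KPI value falls outside [low, high]."""
    return [r for r in records
            if field in r and not low <= r[field] <= high]


if __name__ == "__main__":
    print(facet_counts(records, "severity"))
    print(search(records, server="web02", severity="ERROR"))
    print(basic_metrics(records, "response_ms"))
    print(out_of_range(records, "response_ms", 0, 1000))
```

A dashboard "in sync with the searches" would recompute the facet counts and metrics over the output of `search` each time the user narrows the selection.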

Page 19: Machine Data Analytics

Step 2: Let me correlate events

• In this phase, the customer performs searches and endeavors to make sense of the events and sequences

− We usually work side by side with the customer in this stage

− We extract the vital tribal knowledge and applications in the domain.

− We log their “experiential” notions of event sequences and correlations – this is essential to verify results when the user wants to go to Step 3.

• KEY TECHNOLOGIES – BigSheets

Page 20: Machine Data Analytics

Step 3: I have too many systems and logs to correlate

• In this phase, the customer essentially wants to find relationships and patterns of occurrence between log events across systems and applications.

• The MDA uses sessionization and sequence mining capabilities to accomplish this step.

• KEY TECHNOLOGIES – Text Analytics, Machine Learning
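The two techniques named here can be illustrated with a small Python sketch. It assumes events are `(timestamp_seconds, key, event_type)` tuples, that a session ends after a fixed idle gap, and that "frequent" means a contiguous event sequence appears in at least `min_support` sessions. These are simplifying assumptions for illustration, not a description of how the MDA implements the step.

```python
from collections import Counter


def sessionize(events, gap_seconds=300):
    """Group (timestamp, key, event_type) tuples into per-key sessions.

    A new session starts when the gap since the previous event with the
    same key exceeds gap_seconds.
    """
    sessions = []
    last_seen = {}  # key -> (last timestamp, that key's current session)
    for ts, key, event_type in sorted(events):
        previous = last_seen.get(key)
        if previous is None or ts - previous[0] > gap_seconds:
            current = []
            sessions.append(current)
        else:
            current = previous[1]
        current.append(event_type)
        last_seen[key] = (ts, current)
    return sessions


def frequent_sequences(sessions, length=2, min_support=2):
    """Contiguous event sequences that occur in at least min_support sessions."""
    support = Counter()
    for session in sessions:
        windows = {tuple(session[i:i + length])
                   for i in range(len(session) - length + 1)}
        support.update(windows)  # each session counts a sequence once
    return {seq: n for seq, n in support.items() if n >= min_support}


# Hypothetical events: (seconds, server, event type)
events = [
    (0, "web01", "DISK_WARN"), (60, "web01", "IO_TIMEOUT"),
    (90, "web01", "APP_ERROR"), (30, "web02", "DISK_WARN"),
    (100, "web02", "IO_TIMEOUT"), (1000, "web01", "LOGIN"),
]

if __name__ == "__main__":
    sessions = sessionize(events)
    print(sessions)
    print(frequent_sequences(sessions))
```

Counting each sequence once per session keeps a single noisy session from dominating the support counts.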

Page 21: Machine Data Analytics

Step 4: Combine with my KPI, Topology data

• Once Step 3 is completed, integration with the KPI, topology, and monitoring data is possible.

• This step allows us to expose the capabilities to the network operator and end user.

• KEY TECHNOLOGIES – Data Joins, SQL/JAQL, BigSheets, Reporting Dashboards

Page 22: Machine Data Analytics

Step 5: Predict events based on patterns

• The more advanced customers and network operators would like to build predictive models based on the patterns they see in the log events.

• Customers want to build models that help meet enterprise SLAs for systems.

• Downtime scheduling for systems is a complex problem for most data centers.

• KEY TECHNOLOGIES – Machine Learning (R, SPSS, System ML)

Page 23: Machine Data Analytics

High-Level Workflow

Apply Adapter

Page 24: Machine Data Analytics

Import

• What

– Copy the logs from the machines where they are generated into HDFS.

• How

– BigInsights Distributed Copy app + MDA extensions

• Advantages

– Use the FTP/SFTP protocols supported by the Distributed Copy app

– MDA extensions allow batch incremental processing and batch replacement

– MDA extensions associate metadata, such as server names, with the logs and make it available to downstream analysis

Page 25: Machine Data Analytics

Extract

• What

– Identify log record boundaries

– Extract information from log records in text and XML

• How

– BigInsights Text Analytics

• Advantages

– Robust text extraction using an SQL-like language

• Avoid ‘brittle’ custom parsers

– Library of extractors for common log files

• Syslogs, WebSphere, web access, DataPower, CSV, generic

– Extensive tooling for custom extractor development and app customization

• Eclipse-based IDE

Page 26: Machine Data Analytics

The Extract Stage: Text analytics applied to log files

Raw Log Files (HDFS/GPFS) → Record Splitting (AQL) → Log Records (text) → Field and Entity Extraction (AQL) → Semi-Structured Data (JSON) → To Transform Stage

AQL extractors are available for many common formats (syslog, WebSphere, CSV, ...). BigInsights ships with tools for creating new extractors.
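The two Extract sub-stages (record splitting, then field and entity extraction into JSON) can be sketched in plain Python. The product uses AQL extractors for this. The sketch below swaps in regular expressions purely for illustration, and both the log layout and the field names are assumptions.

```python
import json
import re

# Assumed layout: each record starts with "YYYY-MM-DD HH:MM:SS"; lines that
# do not (e.g. stack-trace lines) continue the previous record.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
FIELDS = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<severity>[A-Z]+) "
    r"\[(?P<component>[^\]]+)\] (?P<message>.*)$",
    re.DOTALL,
)


def split_records(text):
    """Record splitting: group physical lines into logical log records."""
    records, current = [], []
    for line in text.splitlines():
        if RECORD_START.match(line) and current:
            records.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def extract_fields(record):
    """Field extraction: turn one record into a semi-structured dict."""
    match = FIELDS.match(record)
    # Keep unparseable records rather than dropping them silently.
    return match.groupdict() if match else {"message": record}


raw_log = (
    "2013-12-11 10:00:01 ERROR [db] Connection lost\n"
    "  at Pool.get(Pool.java:42)\n"
    "2013-12-11 10:00:02 INFO [web] Request served"
)

if __name__ == "__main__":
    for record in split_records(raw_log):
        print(json.dumps(extract_fields(record)))
```

Hand-written patterns like these are exactly the "brittle custom parsers" the deck warns about, which is the motivation for a declarative extractor library.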

Page 27: Machine Data Analytics

Index

• What

− Index and facet extracted records and fields so that they are available for searching via the faceted search user interface

• How

− BigInsights BigIndex

• Advantages

− Find correlated log entries based on time through an interactive UI

− Add/inject other data (e.g., Excel) to enrich log context.

− Allow operations staff to quickly find log entries based on search terms such as web service name, server name, exception code, transaction ID, etc.
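Finding "correlated log entries based on time" amounts to a range lookup over a time-sorted index. The following is a minimal in-memory sketch using binary search. It does not reflect how BigIndex is built, and the `ts` field (epoch seconds) and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_time_index(records):
    """Sort records by timestamp and keep a parallel list of timestamps."""
    ordered = sorted(records, key=lambda r: r["ts"])
    return [r["ts"] for r in ordered], ordered


def entries_near(index, ts, window_seconds=5):
    """Return all records within +/- window_seconds of ts."""
    timestamps, ordered = index
    lo = bisect_left(timestamps, ts - window_seconds)
    hi = bisect_right(timestamps, ts + window_seconds)
    return ordered[lo:hi]


# Hypothetical records from different log sources (ts in epoch seconds).
log_records = [
    {"ts": 104, "source": "mq", "message": "queue depth high"},
    {"ts": 100, "source": "web", "message": "HTTP 500"},
    {"ts": 200, "source": "web", "message": "HTTP 200"},
    {"ts": 103, "source": "db", "message": "connection lost"},
]

if __name__ == "__main__":
    index = build_time_index(log_records)
    for entry in entries_near(index, ts=103, window_seconds=5):
        print(entry)
```

Each lookup costs two binary searches plus the size of the result, so it stays fast as the index grows.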

Page 28: Machine Data Analytics

Transform

• What

– Link and enrich log information from different entities

• Find relationships between log records

• Integrate structured data with log data

– network configuration, user account information…

• How

– JAQL

• Advantages

– High-level language that is Big Data aware

– Out-of-the-box transformers

– Extensive tooling for application customization

• Eclipse IDE

Page 29: Machine Data Analytics

The Transform Stage: Linking logs and other information from varied sources

• Input: Parsed log records, additional structured data

• Output: Individual log records, from different IT entities, linked and enriched

[Diagram labels: Raw Logs (HDFS/GPFS): fault data, network log, web log, MQ log, server log, transaction log, performance data; Text Files (HDFS/GPFS): structured data from non-log sources; Performance and Fault data; outputs: Outlier Detection, Correlations, Predictive Models]

Link logs corresponding to:

1. IT logs of a single business activity or transaction

– Up & down the IT stack

2. Log of an activity across one layer of the IT stack (e.g. OS layer)

– Messages flowing through a sequence of routers

3. …
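As an illustration of linking and enriching, the sketch below groups parsed records from different sources by a shared key and merges in attributes from a structured reference table. The deck does this with JAQL, so the Python here is a stand-in. The field names (`transaction_id`, `server`, `ts`) and the reference data are hypothetical.

```python
from collections import defaultdict


def enrich(records, reference, join_field="server"):
    """Add reference attributes (e.g. network configuration) to each record.

    Fields already present in the log record win over reference fields.
    """
    return [{**reference.get(r.get(join_field), {}), **r} for r in records]


def link_by_key(records, key="transaction_id"):
    """Group records that share a key, ordered by time within each group."""
    linked = defaultdict(list)
    for record in records:
        if key in record:
            linked[record[key]].append(record)
    for group in linked.values():
        group.sort(key=lambda r: r["ts"])
    return dict(linked)


# Hypothetical parsed records from three log sources.
parsed = [
    {"ts": 2, "source": "web", "server": "web01", "transaction_id": "t1"},
    {"ts": 1, "source": "mq", "server": "mq01", "transaction_id": "t1"},
    {"ts": 3, "source": "server", "server": "web01", "transaction_id": "t2"},
]
# Hypothetical structured data from a non-log source.
network_config = {"web01": {"datacenter": "dc-east", "rack": "r12"}}

if __name__ == "__main__":
    linked = link_by_key(enrich(parsed, network_config))
    for transaction, group in linked.items():
        print(transaction, [(r["source"], r.get("datacenter")) for r in group])
```

Linking on a transaction ID follows one activity up and down the stack, while linking on a field such as the server name would instead follow one layer across activities.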

Page 30: Machine Data Analytics

Analyze

• What

– Correlate across fields

– Find frequently occurring sequences and combinations of events

– Potential for predictive modeling in the future

• How

– System ML

• Advantages

– Scalable to perform analytics on Big Data

– Flexible and customizable

– Easy to plug into applications via a JAQL/Java interface

Page 31: Machine Data Analytics

Agenda

• Introduction

• High Level Workflow

• Some Highlights

• Demo

Page 32: Machine Data Analytics

Machine Data Adapters

• What are Adapters?

− Adapt a variety of inputs to a standard output

• Why do we need Machine Data Adapters?

− To handle different ‘machine data’ formats

Page 33: Machine Data Analytics

Adapters in High-Level Workflow

Apply Adapter

Page 34: Machine Data Analytics

Adapter Functions

• Create

− Enter the Adapter-Name, LogType, ‘sample machine data’, and the first ‘timestamp’ in the ‘sample machine data’

− Check the recommended ‘DataTime Format’ and ‘preTimeStamp Regex’, and select defaults such as ‘timezone’, ‘year’, and ‘month’.

− Verify the extracted output and save it if it looks correct

− If the extracted output is bad, go back and edit the parameters ‘Data Source Type’, ‘DataTime Format’, and ‘preTimeStamp Regex’

• Edit

• View

• Apply

• Delete
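The role of the adapter parameters can be sketched as follows: a regex for the text that precedes the timestamp, a date-time format, and defaults (year, timezone) for values the log line omits. This is only a conceptual illustration, not the MDA adapter. It uses Python `strptime` directives rather than the product's own format syntax, supports only the handful of directives listed in the code, and all names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Regex fragments for the few strptime directives this sketch supports.
DIRECTIVE_PATTERNS = {
    "%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{1,2}", "%b": r"[A-Za-z]{3}",
    "%H": r"\d{2}", "%M": r"\d{2}", "%S": r"\d{2}",
}


def format_to_regex(fmt):
    """Translate a strptime format into a regex matching such timestamps."""
    pattern, i = "", 0
    while i < len(fmt):
        token = fmt[i:i + 2]
        if token in DIRECTIVE_PATTERNS:
            pattern += DIRECTIVE_PATTERNS[token]
            i += 2
        else:
            pattern += re.escape(fmt[i])
            i += 1
    return pattern


def make_adapter(pre_timestamp_regex, datetime_format,
                 default_year=None, tz=timezone.utc):
    """Build a function that maps a raw line to a standard record."""
    if "%Y" not in datetime_format and default_year is None:
        raise ValueError("format has no year; supply default_year")
    line_pattern = re.compile(
        "(?:" + pre_timestamp_regex + ")"
        "(?P<stamp>" + format_to_regex(datetime_format) + ")"
    )

    def apply(line):
        match = line_pattern.match(line)
        if match is None:
            return None  # line does not fit this adapter
        stamp, fmt = match.group("stamp"), datetime_format
        if "%Y" not in fmt:
            # Fill in the default year before parsing.
            stamp, fmt = f"{default_year} {stamp}", "%Y " + fmt
        parsed = datetime.strptime(stamp, fmt).replace(tzinfo=tz)
        return {"timestamp": parsed.isoformat(),
                "message": line[match.end():].strip()}

    return apply


if __name__ == "__main__":
    syslog_adapter = make_adapter(r"<\d+>", "%b %d %H:%M:%S",
                                  default_year=2013)
    print(syslog_adapter("<34>Dec 11 10:15:32 host01 sshd: Failed password"))
```

Verifying the extracted output on a sample line, as the Create step describes, corresponds to calling the adapter once and inspecting the result before saving the parameters.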

Page 35: Machine Data Analytics

Create Machine Data Adapter – Step-1

Page 36: Machine Data Analytics

Create Machine Data Adapter – Step-2

Page 37: Machine Data Analytics

Create Machine Data Adapter – Step-3

Page 38: Machine Data Analytics

Display Machine Data Adapter

Page 39: Machine Data Analytics

Edit Machine Data Adapter – Step-1

Page 40: Machine Data Analytics

Edit Machine Data Adapter – Step-2

Page 41: Machine Data Analytics

Edit Machine Data Adapter – Step-3

Page 42: Machine Data Analytics

Display Machine Data Adapter

Page 43: Machine Data Analytics

Apply Machine Data Adapter

Page 44: Machine Data Analytics

Verify the Adapter (metadata.json)

Page 45: Machine Data Analytics

Delete

Page 46: Machine Data Analytics

Data Explorer for Indexing Application

• Data Explorer Index Configuration File to support a generic schema for extracted machine data.

• Parallelizing data pushing to the Data Explorer Indexer.

• Run the Data Explorer Index Application

Page 47: Machine Data Analytics

Data Explorer Index Configuration File

• The Data Explorer index config file specifies which fields to index, which field contains the record ID, and the Data Explorer index field definitions: field name, type, searchable, retrievable, filterable and sortable.

Example:

{
  "source": {
    "dateFormat": "MMM dd yyyy HH:mm:ss.SSS Z",
    "fieldName": "LogDatetime[].normalized_text",
    "suppress": false
  },
  "target": {
    "deFieldName": "LogDatetime",
    "filterable": true,
    "isRecordID": false,
    "retrievable": true,
    "searchable": true,
    "sortable": true,
    "type": "Date"
  }
}

• A default index configuration file is provided.
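A small Python sketch shows how such field definitions might be read and summarized. It assumes the configuration is a JSON list of entries shaped like the example above. The list wrapper, the second entry (`RecordID`) and the function name are assumptions made for illustration, not the documented file layout.

```python
import json

# First entry mirrors the slide's example; the second is hypothetical.
CONFIG_TEXT = """
[
  {"source": {"dateFormat": "MMM dd yyyy HH:mm:ss.SSS Z",
              "fieldName": "LogDatetime[].normalized_text",
              "suppress": false},
   "target": {"deFieldName": "LogDatetime", "filterable": true,
              "isRecordID": false, "retrievable": true,
              "searchable": true, "sortable": true, "type": "Date"}},
  {"source": {"fieldName": "RecordID", "suppress": false},
   "target": {"deFieldName": "RecordID", "filterable": false,
              "isRecordID": true, "retrievable": true,
              "searchable": false, "sortable": false, "type": "String"}}
]
"""


def summarize_index_config(entries):
    """Report the record-ID field and the fields carrying each index flag."""
    summary = {"record_id": None, "searchable": [],
               "filterable": [], "sortable": []}
    for entry in entries:
        target = entry["target"]
        name = target["deFieldName"]
        if target.get("isRecordID"):
            summary["record_id"] = name
        for flag in ("searchable", "filterable", "sortable"):
            if target.get(flag):
                summary[flag].append(name)
    return summary


if __name__ == "__main__":
    print(summarize_index_config(json.loads(CONFIG_TEXT)))
```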

Page 48: Machine Data Analytics

Parallelizing data pushing to Data Explorer Indexer

• The application uses an Oozie Jaql action to parallelize the job across multiple tasks.

[Diagram labels: Indexing app (BI platform/IDE); HDFS; Jaql Hadoop Task 1 through Jaql Hadoop Task M, each with BigSearch; Zookeeper Cluster (locate shards); DE Backend Shard 1 through DE Backend Shard N]
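The shape of this parallel push (M tasks, each locating a shard and pushing its records there) can be mimicked in a few lines of Python. Threads stand in for the Jaql Hadoop tasks, an in-memory class stands in for the Data Explorer backend, and routing by a CRC32 hash of the record ID is an assumption. The slide only says that shards are located via the Zookeeper cluster.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class InMemoryShards:
    """Stand-in for N backend shards; not the Data Explorer API."""

    def __init__(self, shard_count):
        self.shards = [[] for _ in range(shard_count)]
        self._lock = Lock()

    def locate(self, record_id):
        # Assumed routing rule: stable hash of the record ID.
        return zlib.crc32(record_id.encode("utf-8")) % len(self.shards)

    def push(self, record):
        shard = self.locate(record["id"])
        with self._lock:
            self.shards[shard].append(record)


def partition(records, tasks):
    """Split the records into one slice per task."""
    return [records[i::tasks] for i in range(tasks)]


def index_in_parallel(records, backend, tasks=4):
    """Push every record using `tasks` concurrent workers; return the count."""
    def run(part):
        for record in part:
            backend.push(record)
        return len(part)

    with ThreadPoolExecutor(max_workers=tasks) as pool:
        return sum(pool.map(run, partition(records, tasks)))


if __name__ == "__main__":
    backend = InMemoryShards(shard_count=3)
    sample = [{"id": f"rec-{i}"} for i in range(100)]
    print(index_in_parallel(sample, backend), "records pushed")
    print([len(shard) for shard in backend.shards])
```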

Page 49: Machine Data Analytics

Run Data Explorer index Application

Page 50: Machine Data Analytics

Basic Facet Search UI on Application Builder

Page 51: Machine Data Analytics

BI Log Monitoring and Analysis

• Ingest BigInsights logs into HBase in real time.

• Create a Log Monitoring Extraction application that extracts log records from HBase.

• Create an Index Management application to delete old index log records from DFS.

• Embed the MDA Search UI within the BigInsights Dashboard for BigInsights log search.

Page 52: Machine Data Analytics

Ingesting BigInsights Logs into HBase

• Chukwa agents are set up on the Name Node and each of the Data Nodes

• Adapters are programmatically installed and removed depending on user configuration.

• A custom Chukwa writer class was created to add logs into HBase in real time.

• The Log4j interface streams logs to the adapters, which stream logs to HBase

• Different log types are concurrently recorded in HBase in a single table

Page 53: Machine Data Analytics

Data Collection Diagram

HBASE

Name Node

Hadoop Secondary Name Node

Hadoop Jobtracker

Hadoop Name Node

Data Node 1

Hadoop Data Node

Hadoop Task Tracker

Hadoop Task Attempt

Data Node 2

Hadoop Data Node

Hadoop Task Tracker

Hadoop Task Attempt

Data Node 3

Hadoop Data Node

Hadoop Task Tracker

Hadoop Task Attempt

• For HDFS with Symphony MapReduce installation: Hadoop Data Node, Hadoop Name Node and Hadoop Secondary Name Node logs are supported

• For GPFS with Apache MapReduce installation: Hadoop Job Tracker, Hadoop Task Tracker and Hadoop Task Attempt logs are supported

• For GPFS with Symphony MapReduce installation: only Hadoop Task Attempt logs are supported

HDFS with Apache Map Reduce

Page 54: Machine Data Analytics

BigInsights Dashboard

• The user starts the BigInsights log collection from the LogCollection app.

• The user can stop the BigInsights log collection from the LogCollection app or by turning off Monitoring.

• The MDA Search UI is wrapped by a frame in the BigInsights Dashboard.

Page 55: Machine Data Analytics

Dashboard

Page 56: Machine Data Analytics

LogCollection app.

Page 57: Machine Data Analytics

BigInsights Log Monitoring Application

• Is a BigInsights chained application.

• Contains the Log Monitoring Extraction application and the Index application.

• Assumes that the Log Monitoring Extraction application is running in schedule mode.

• BigInsights Logs is assumed to be the selected workflow for the Index application.

• Any configuration files are assumed to be the default configuration files installed with MDA

• The “Index Only New Logs” check-box in the Index application is assumed to be unchecked.

Page 58: Machine Data Analytics

BigInsights Log Monitoring Application

Page 59: Machine Data Analytics

Agenda

• Introduction

• High Level Workflow

• New Features in MDA 2.1

• Demo