
Big Data Analytics

download.microsoft.com/.../2BigDataAnalytics.pdf

May 29, 2020


dariahiddleston

Objectives

The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation.

Portions © 2013 Project Botticelli Ltd & entire material © 2012 Microsoft Corp unless noted otherwise. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.

Page 5

Big data, or just complex data?

[Figure labels: Data; volume, velocity, variety, complexity; preparing, interpreting]

Page 6

Today’s big data, tomorrow’s little data

Complexity vs. current capabilities

FAA International Flight Service Station, Honolulu, Hawaii, 1964 (Public Domain Image)

Page 7

Domain: Common big data scenarios

Financial services: Modeling true risk; Threat analysis and fraud detection; Trade surveillance; Credit scoring and analysis

Media & Entertainment: Recommendation engines; Ad targeting; Search quality; Abuse and click fraud detection

Retail: Point of sales transaction analysis; Customer churn analysis; Sentiment analysis

Telecommunications: Customer churn prevention; Network performance optimization; Call Detail Record (CDR) analysis; Network failure prediction

Government: Cyber security (botnets, fraud); Traffic congestion and re-routing; Environmental monitoring; Antisocial monitoring via social media

Healthcare: Genomics research; Cancer research; Health pandemics early detection; Air quality monitoring

Page 8

Big data + traditional BI = power & simplicity

Big, fast, or complex data

Microsoft HDInsight

SQL Server tabular, multidimensional, relational DW, or PDW

Interaction, exploration, visualisation

Page 9

Microsoft HDInsight

Apache Hadoop distribution

Developed by Hortonworks & Microsoft

Integrated with Microsoft BI

Page 10

Hadoop Principles

Practical method for massive parallelisation of analytical data processing

Page 11

DEMO

Part 1: the job

Page 12

Hadoop Principles: Data

Page 13

Hadoop Principles: MapReduce
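
The slide body is not part of this transcript, so the following is only a minimal sketch of the general MapReduce idea, not the presenter's material. It runs in a single Python process to show the three phases (map, shuffle, reduce) that Hadoop distributes across a cluster. The `map_reduce` helper, the `CALLS` sample, and the caller names are all hypothetical, loosely modelled on the Call Detail Record scenario listed earlier.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Single-process illustration of the map, shuffle, and reduce phases."""
    groups = defaultdict(list)
    for record in records:
        # Map: each record independently yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle: values are grouped by key before reduction.
            groups[key].append(value)
    # Reduce: each key's values are combined independently of other keys.
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical call detail records: (caller, call duration in seconds).
CALLS = [("alice", 120), ("bob", 30), ("alice", 45), ("carol", 300), ("bob", 15)]

def call_mapper(record):
    caller, seconds = record
    yield caller, seconds

def total_reducer(caller, durations):
    return sum(durations)

totals = map_reduce(CALLS, call_mapper, total_reducer)
print(totals)
```

Because the mapper sees one record at a time and the reducer sees one key at a time, both phases can in principle be spread over many machines, which is the property a cluster exploits.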

Page 14

Hadoop cluster

Page 15

Hadoop cluster

Buster Cluster, an early research project by Miles Osborne, University of Edinburgh, School of Informatics. Picture used with permission. http://homepages.inf.ed.ac.uk/miles/

Page 16

Hadoop cluster

Cloud

Rent-a-Hadoop-cluster, or: “Supercomputer for cents”

Windows Azure HDInsight

Page 17

Processing logic in HDInsight

Page 18

JS MapReduce Wordcount
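
The JavaScript listing from this slide did not survive in the transcript. As a stand-in, here is a hedged Python sketch of a word count written in the line-oriented style used by Hadoop Streaming jobs: the mapper emits tab-separated `word<TAB>1` lines, and the reducer sums counts over lines that arrive sorted by key. The function names and the sample sentences are assumptions for illustration, and the local `sorted` call merely stands in for the framework's shuffle-and-sort step.

```python
import re
from itertools import groupby

def mapper(lines):
    """Emit one 'word<TAB>1' line for every word in the input lines."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            yield f"{word}\t1"

def reducer(sorted_lines):
    """Sum the counts of consecutive lines that share the same word."""
    for word, group in groupby(sorted_lines, key=lambda l: l.split("\t")[0]):
        total = sum(int(l.split("\t")[1]) for l in group)
        yield f"{word}\t{total}"

def word_count(lines):
    # Sorting locally plays the role of the shuffle-and-sort between phases.
    return list(reducer(sorted(mapper(lines))))

# Hypothetical input text, borrowed from two slide titles in this deck.
sample = [
    "Big data, or just complex data?",
    "Today's big data, tomorrow's little data",
]
result = word_count(sample)
for line in result:
    print(line)
```

The reducer relies on its input being sorted so that identical words are adjacent, which is why `groupby` is sufficient and no dictionary of all words is needed.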

Page 19

Pig Latin Example — It’s All Parallel!

… [see http://pig.apache.org/docs/r0.7.0/tutorial.html]

Page 20

Reusing processing logic — libraries

Collaborative filtering, recommenders, clustering, singular value decomposition, parallel frequent pattern mining, naive Bayes, decision tree
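
To make one item in this list concrete, the sketch below shows a very small item co-occurrence recommender, one simple form of collaborative filtering. It is an illustration under stated assumptions and not the API of any library the slide may refer to: the `BASKETS` data, the user ids, and the function names are all hypothetical, and a production library would use far more refined similarity measures.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase history: user id -> set of items bought.
BASKETS = {
    "u1": {"book", "lamp", "pen"},
    "u2": {"book", "pen"},
    "u3": {"lamp", "desk"},
    "u4": {"book", "pen", "desk"},
}

def cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = defaultdict(Counter)
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def recommend(user, baskets, top_n=2):
    """Rank items the user lacks by co-occurrence with items the user has."""
    counts = cooccurrence(baskets)
    owned = baskets[user]
    scores = Counter()
    for item in owned:
        for other, n in counts[item].items():
            if other not in owned:
                scores[other] += n
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

print(recommend("u2", BASKETS))
```

The pair counting in `cooccurrence` is itself a map-and-sum pattern, which is one reason this family of algorithms parallelises well over large purchase logs.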

Page 21

DEMO

Part 2: the results

Page 22

From HDInsight to attractive Microsoft BI

Page 23

Operationalising Hadoop

Page 25

The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation.

Portions © 2013 Project Botticelli Ltd & entire material © 2013 Microsoft Corp unless noted otherwise. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.