
Big data (reversim)

Jan 15, 2015


Nati Shalom

Adopting Hadoop to manage your Big Data is an important step, but not the end-solution to your Big Data challenges. Here are some of the additional considerations you must face:

Choosing the right cloud for the job: The massive computing and storage resources needed to support Big Data applications make cloud environments an ideal fit, and the number of cloud infrastructure types and providers to choose from keeps growing. Given the diverse options and the dynamic environments involved, it becomes ever more important to maintain flexibility across all your IT needs.

Big Data is a complex beast: It involves many different moving parts, in large clusters, and is continually growing and evolving. Managing such an environment manually is not a viable option. The question is, how can you automate all this complexity?

The world beyond Hadoop: Big Data is not just Hadoop. There is a whole, rapidly growing ecosystem to contend with, including NoSQL, data processing, and analytics tools, as well as your own application services. How can you manage deployment, configuration, scaling and failover of all the different pieces in a consistent way?

In this session, you'll learn how to deploy and manage your Hadoop cluster on any Cloud, as well as manage the rest of your big data application stack using a new open source framework called Cloudify.
Transcript
Page 1: Big data (reversim)

Thank you

Page 2: Big data (reversim)

Big Data in the Cloud

@natishalom

Page 3: Big data (reversim)


About GigaSpaces

Managing Big Data on the Cloud

100’s of Enterprise Customers

Page 4: Big data (reversim)

My Data Out of My Hands.. No Way!

Page 5: Big data (reversim)


The Reality of Big Data..

• 2.7 ZB: Global Digital Data
• 0.5 Petabytes: Two years of tweets
• 66%: Plan to use Big Data/Cloud
• 43% think that data analytics could be improved in their organization if data analytics was part of cloud services

Page 6: Big data (reversim)

Large ISV Case Study

• Application
– Call Center surveillance
• Background
– Previously – voice data
• Goal for a new system
– Monitor data & voice
– Multiple data sources
– Advanced correlations

Page 7: Big data (reversim)

The Challenges..

Ever Growing Data

Deeper Correlation

Tight Performance

Page 8: Big data (reversim)

A Classic Case for..

Page 9: Big data (reversim)

A Typical Big Data System…

Page 10: Big data (reversim)

The Challenge

Cost: Infrastructure, Operational

Business Impact: Lower Margins, Competitiveness, Time to Market, Customer Satisfaction

Page 11: Big data (reversim)

The Solution: Big Data in the Cloud

Page 12: Big data (reversim)

Big Data in the Cloud: 3 Reasons

• Skills
– Do you really need/want this all in-house?
• Huge amounts of external data
– Does it make sense to move and manage all this data behind your firewall?
• Focus on the value of your data
– Instead of big data management.

Holger Kisker

Page 13: Big data (reversim)

Managing Big Data on the Cloud

• Auto start VMs
• Install and configure app components
• Monitor
• Repair
• (Auto) Scale
• Burst…

Page 14: Big data (reversim)

Big Data in the Cloud..

Reduce the Infrastructure Cost

Choose the Right Cloud for the Job

Running Bare-Metal for high I/O workloads, Public cloud for sporadic workloads..

Page 15: Big data (reversim)

Big Data in the Cloud ..

Reducing The Operational Complexity

• Consistent Management

• Automation Through the Entire Stack

Page 16: Big data (reversim)

Let's Take a Closer Look …

Page 17: Big data (reversim)

Before We Begin.. We’ll Need to Break Some Common Myths on Portability

Page 18: Big data (reversim)

Cloud Portability Myth #1

No one really needs cloud portability

Page 19: Big data (reversim)

Cloud Portability

Facts

Zynga moved ~80% of their workload from Amazon to their private zCloud

“own the base, rent the spike”

http://code.zynga.com/2012/02/the-evolution-of-zcloud/

Page 20: Big data (reversim)

Cloud Portability

Facts

Mixpanel started with Linode, then moved to RackSpace, then to AWS

http://code.mixpanel.com/2010/11/08/amazon-vs-rackspace/

Page 21: Big data (reversim)

Cloud Portability

Facts

• You want the flexibility to choose what’s right for you, when it’s right for you

• Based on pricing, features, availability, performance, etc.

Page 22: Big data (reversim)

Cloud Portability Myth #2

Cloud Portability ==

Cloud API Standardization

Page 23: Big data (reversim)

Cloud APIs, Today

Standard APIs (?): OCCI, VCloud

OSS Frameworks: OpenStack, CloudStack, Eucalyptus

Abstraction Frameworks: JClouds, Deltacloud, Fog, Libvirt

Page 24: Big data (reversim)

Cloud APIs, Today

Standard APIs: Not practical in the foreseeable future

OSS Projects: Need a couple more years to converge & mature

Abstraction Frameworks: Probably the only practical (near-term) option

Page 25: Big data (reversim)

Realization: What You Really Care about Is App Portability

OS is the same on any cloud

Most clouds have compute & storage

Elasticity & scaling have same effects on the app, regardless of the cloud

Page 26: Big data (reversim)

And Now to a Closer Look …

Consistent Management

Portability

Automation

Page 27: Big data (reversim)

® Copyright 2012 GigaSpaces Ltd. All Rights Reserved


Consistent Management

Recipes: a consistent description for running any app:

• What middleware services to run
• Dependencies between services
• How to install services
• Where application and service binaries are
• When to spawn or terminate instances
• How to monitor each of the services
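As a rough illustration of what such a recipe captures, the sketch below models a recipe as plain Python data and derives a start order from the declared dependencies. This is not Cloudify's recipe syntax. The service names (`cassandra`, `storm`, `webapp`), the script names, the `monitor` values, and the `start_order` helper are all hypothetical, chosen only to show the idea.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical recipe: every name and value here is an illustrative assumption.
# Each service declares its dependencies, how it is installed, and how it is monitored.
RECIPE = {
    "cassandra": {"depends_on": [], "install": "install_cassandra.sh", "monitor": "process_alive"},
    "storm": {"depends_on": ["cassandra"], "install": "install_storm.sh", "monitor": "process_alive"},
    "webapp": {"depends_on": ["storm"], "install": "install_webapp.sh", "monitor": "http_200"},
}


def start_order(recipe):
    """Return service names so that every service comes after its dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec["depends_on"]) for name, spec in recipe.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in start_order(RECIPE):
        print(f"install {name} using {RECIPE[name]['install']}")
```

Keeping the description as data is what makes it reusable: the same recipe can drive installation, monitoring, and scaling without per-cloud scripts.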

Page 28: Big data (reversim)

The Right Cloud for the Job (Cloud Portability)

Page 29: Big data (reversim)


Choosing the Right Cloud for the Job

    // The service recipe refers to a template by name only
    compute {
        template "SMALL_LINUX"
    }

    // Amazon EC2 definition of the template
    SMALL_LINUX : template {
        imageId "us-east-1/ami-76f0061f"
        remoteDirectory "/home/ec2-user/gs-files"
        machineMemoryMB 1600
        hardwareId "m1.small"
        locationId "us-east-1"
        localDirectory "upload"
        keyFile "myKeyFile.pem"
        options ([
            "securityGroups" : ["default"] as String[],
            "keyPair" : "myKeyFile"
        ])
        overrides ([
            "jclouds.ec2.ami-query" : "",
            "jclouds.ec2.cc-ami-query" : ""
        ])
        privileged true
    }

    // OpenStack definition of the same template name
    SMALL_LINUX : template {
        imageId "1234"
        machineMemoryMB 3200
        hardwareId "103"
        remoteDirectory "/root/gs-files"
        localDirectory "upload"
        keyFile "gigaPGHP.pem"
        options ([
            "openstack.securityGroup" : "default",
            "openstack.keyPair" : "gigaPGHP"
        ])
        privileged true
    }
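The two definitions above share the name SMALL_LINUX, so the application only ever asks for a logical template and each cloud supplies its own details. A minimal Python sketch of that lookup follows. The values are copied from the slide, but the `ec2` and `openstack` keys, the `TEMPLATES` table, and the `resolve_template` helper are assumptions made for illustration, not part of Cloudify.

```python
# Per-cloud definitions of one logical template name.
# Field values come from the slide; the dictionary layout is a hypothetical stand-in.
TEMPLATES = {
    "ec2": {
        "SMALL_LINUX": {
            "imageId": "us-east-1/ami-76f0061f",
            "hardwareId": "m1.small",
            "machineMemoryMB": 1600,
        },
    },
    "openstack": {
        "SMALL_LINUX": {
            "imageId": "1234",
            "hardwareId": "103",
            "machineMemoryMB": 3200,
        },
    },
}


def resolve_template(cloud, name):
    """Look up the cloud-specific settings behind a logical template name."""
    try:
        return TEMPLATES[cloud][name]
    except KeyError:
        raise LookupError(f"template {name!r} is not defined for cloud {cloud!r}") from None


if __name__ == "__main__":
    for cloud in TEMPLATES:
        print(cloud, resolve_template(cloud, "SMALL_LINUX"))
```

Moving the application to another cloud then means adding one more entry to the table, while the recipe that asks for SMALL_LINUX stays unchanged.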

Page 30: Big data (reversim)

Automation across the stack

1 Upload your recipe.

2 Cloudify creates VMs & installs agents

3 Agents install and manage your app

4 Cloudify automates the scaling
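Step 4 can be pictured as a decision made on each monitoring cycle. The sketch below assumes a simple threshold rule on average load per instance. It is not Cloudify's actual scaling logic, and the function name, the load metric, and every threshold and limit are hypothetical.

```python
def desired_instances(current, load_per_instance, low=0.3, high=0.8,
                      min_instances=1, max_instances=10):
    """Return the instance count a threshold-based rule would ask for.

    All parameters are illustrative assumptions: load is a 0..1 utilisation
    figure, and the rule adds or removes one instance at a time.
    """
    if load_per_instance > high:
        # Overloaded: scale out by one, but never past the upper limit.
        return min(current + 1, max_instances)
    if load_per_instance < low:
        # Underused: scale in by one, but never below the lower limit.
        return max(current - 1, min_instances)
    # Load is within the band: leave the cluster as it is.
    return current


if __name__ == "__main__":
    print(desired_instances(current=2, load_per_instance=0.9))
```

Changing one instance at a time and keeping a gap between the low and high thresholds reduces the chance of the cluster oscillating between sizes.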

Page 31: Big data (reversim)

Big Data Apps, on Any Cloud, Your Way

Open source (Apache2)

Page 32: Big data (reversim)


Demo Time – Storm on Demand..

Page 33: Big data (reversim)

Other Similar Solutions…

Page 34: Big data (reversim)

RightScale

Page 35: Big data (reversim)

Amazon Elastic MapReduce

Page 36: Big data (reversim)

Large ISV Case Study

• Application
– Call Center surveillance system
• Background
– Previously – voice data
• Goal for a new system
– Monitor data & voice
– Multiple data sources
– Advanced correlations

Mission Accomplished

Page 37: Big data (reversim)

Additional Benefits

• True Cloud Economics

• One product -> any Customer Environment

• Increased Agility

Page 38: Big data (reversim)

Try a simple Big Data Demo Yourself

The app

The Cloudify dashboard

launch.cloudifysource.org/d

Page 39: Big data (reversim)

Thank You!

References: http://www.cloudifysource.org http://github.com/CloudifySource