Big Data in the Cloud? Yes, you can do it in OpenStack

Jan 22, 2018

Obed N Muñoz
Transcript
Page 1: Big Data in the Cloud? Yes, you can do it in OpenStack

Big Data in the Cloud? Yes, you can do it in OpenStack

Page 2: Big Data in the Cloud? Yes, you can do it in OpenStack

Hello!
I am Obed N Muñoz.
I am here because I love to give presentations.

-vvvv

Page 3: Big Data in the Cloud? Yes, you can do it in OpenStack

Who am I?
- Software Engineer
- Musician
- Fast driver

Page 4: Big Data in the Cloud? Yes, you can do it in OpenStack

Agenda
- Introduction: Cloud and OpenStack
- Data-processing
- Sahara Project

Page 5: Big Data in the Cloud? Yes, you can do it in OpenStack

1. Introduction
Cloud Computing and OpenStack

Page 6: Big Data in the Cloud? Yes, you can do it in OpenStack

Cloud and XaaS Era
Everything as a Service

Page 7: Big Data in the Cloud? Yes, you can do it in OpenStack

“The term cloud computing is used for a variety of services and applications that users access on demand over the Internet, as opposed to using them on-premises.”

Page 8: Big Data in the Cloud? Yes, you can do it in OpenStack

OpenStack

Page 9: Big Data in the Cloud? Yes, you can do it in OpenStack

“OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard, CLI, RESTful API ...”

Page 10: Big Data in the Cloud? Yes, you can do it in OpenStack
Page 11: Big Data in the Cloud? Yes, you can do it in OpenStack

Architecture

Page 12: Big Data in the Cloud? Yes, you can do it in OpenStack
Page 13: Big Data in the Cloud? Yes, you can do it in OpenStack

2. Data-Processing
Data-Processing in the Cloud

Page 14: Big Data in the Cloud? Yes, you can do it in OpenStack

What’s around Data-Processing?
◇ Big Data
◇ Data Science
◇ Cloud
◇ Machine Learning
◇ Pattern Recognition
◇ Neural Networks
◇ Etc ...

Page 15: Big Data in the Cloud? Yes, you can do it in OpenStack

Data-Processing Technologies

Page 16: Big Data in the Cloud? Yes, you can do it in OpenStack

3. Sahara Project
Data-Processing in OpenStack

Page 17: Big Data in the Cloud? Yes, you can do it in OpenStack

OpenStack Sahara
The Sahara project provides a simple means to provision a data-intensive application cluster (Spark or Hadoop) on top of OpenStack.

https://wiki.openstack.org/wiki/Sahara

Page 18: Big Data in the Cloud? Yes, you can do it in OpenStack

Architecture

http://docs.openstack.org/developer/sahara/architecture.html

Page 19: Big Data in the Cloud? Yes, you can do it in OpenStack

Getting Started
- Clusters
- Templates
- Provisioning Plugins
- Image Registry
- Data Processing Frameworks
- Elastic Data Processing (EDP)

http://docs.openstack.org/developer/sahara/userdoc/edp.html
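To make the relationship between templates and clusters concrete, here is a minimal sketch that only builds the JSON request bodies one might send to Sahara; it does not contact any service. The field names follow my reading of the Sahara REST API v1.1 and should be checked against the API reference for your release. The helper functions, the plugin version, the flavor ID, and the placeholder template IDs are all assumptions made for illustration.

```python
import json

PLUGIN = "vanilla"        # assumption: the Vanilla Apache Hadoop plugin
HADOOP_VERSION = "2.7.1"  # assumption: must be a version your plugin offers


def node_group_template(name, flavor_id, node_processes):
    """Describe one kind of node: its flavor and the processes it runs."""
    return {
        "name": name,
        "plugin_name": PLUGIN,
        "hadoop_version": HADOOP_VERSION,
        "flavor_id": flavor_id,
        "node_processes": list(node_processes),
    }


def cluster_template(name, node_groups):
    """Combine node groups; node_groups is a list of (name, template_id, count)."""
    return {
        "name": name,
        "plugin_name": PLUGIN,
        "hadoop_version": HADOOP_VERSION,
        "node_groups": [
            {"name": group, "node_group_template_id": template_id, "count": count}
            for group, template_id, count in node_groups
        ],
    }


# Hypothetical values: flavor "2" and the <...> IDs are placeholders.
master = node_group_template("master", "2", ["namenode", "resourcemanager"])
worker = node_group_template("worker", "2", ["datanode", "nodemanager"])
template = cluster_template(
    "demo-template",
    [("master", "<master-template-id>", 1), ("worker", "<worker-template-id>", 3)],
)

print(json.dumps(template, indent=2))
```

In a real deployment the template IDs would come back from Sahara after each node group template is created, and the cluster itself would reference the cluster template plus an image from the image registry.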

Page 20: Big Data in the Cloud? Yes, you can do it in OpenStack

More Features ...
- OpenStack Block Storage support
- Cluster Scaling
- Data locality
- Distributed Mode
- Hadoop HDFS High Availability
- Orchestration support
- …
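As an illustration of cluster scaling, the sketch below builds a request body that resizes existing node groups. The `resize_node_groups` key reflects my understanding of the Sahara v1.1 cluster scaling call and is an assumption to verify against the API reference; `scale_request` is a hypothetical helper, not part of any Sahara library.

```python
def scale_request(new_counts):
    """Build a scaling body from a mapping of node group name -> desired count."""
    for name, count in new_counts.items():
        if count < 0:
            raise ValueError(f"node group {name!r} cannot have a negative count")
    return {
        "resize_node_groups": [
            {"name": name, "count": count}
            for name, count in sorted(new_counts.items())
        ]
    }


# Hypothetical example: grow the "worker" node group to five instances.
body = scale_request({"worker": 5})
print(body)
```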

Page 21: Big Data in the Cloud? Yes, you can do it in OpenStack

Clusters (Hadoop)


Page 22: Big Data in the Cloud? Yes, you can do it in OpenStack

Data-Processing Frameworks

- Hadoop
- Spark
- Storm


Page 23: Big Data in the Cloud? Yes, you can do it in OpenStack

Provisioning Plugins
- Vanilla: Vanilla Apache Hadoop
- Ambari: Hortonworks Data Platform
- Spark: Apache Spark with Cloudera HDFS
- MapR Distribution: MapR plugin with MapR File System
- Cloudera: Cloudera Hadoop


Page 24: Big Data in the Cloud? Yes, you can do it in OpenStack

Elastic Data Processing (EDP)
Allows the execution of jobs on clusters created by Sahara. It supports:
- Hive, Pig, MapReduce.Streaming, Java, and Shell job types on Hadoop clusters
- Spark jobs
- Storage of job binaries in the Shared File Systems service (manila) or in Sahara's own database
- Access to input and output data sources in:
  - HDFS
  - Swift
  - Manila

http://docs.openstack.org/developer/sahara/userdoc/edp.html
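The EDP flow can be sketched as three small request bodies: an input data source, an output data source, and a job execution that ties them to a cluster. This is only an illustration that builds dictionaries locally. The field names are based on my reading of the Sahara REST API v1.1 and are assumptions to confirm against the EDP documentation; the helper names, URLs, and `<...>` IDs are hypothetical.

```python
# Data source types named on the slide above.
SUPPORTED_SOURCE_TYPES = {"hdfs", "swift", "manila"}


def data_source(name, source_type, url):
    """Describe where a job reads its input or writes its output."""
    if source_type not in SUPPORTED_SOURCE_TYPES:
        raise ValueError(f"unsupported data source type: {source_type!r}")
    return {"name": name, "type": source_type, "url": url}


def job_execution(cluster_id, input_id, output_id, configs=None, args=None):
    """Describe one run of an already registered job on a given cluster."""
    return {
        "cluster_id": cluster_id,
        "input_id": input_id,
        "output_id": output_id,
        "job_configs": {
            "configs": dict(configs or {}),
            "args": list(args or []),
        },
    }


# Hypothetical example: read from Swift, write to HDFS.
source = data_source("demo-input", "swift", "swift://demo-container/input")
sink = data_source("demo-output", "hdfs", "/user/hadoop/output")
run = job_execution("<cluster-id>", "<input-source-id>", "<output-source-id>")

print(source, sink, run, sep="\n")
```

Swift data sources normally also need credentials; those are left out here because how they are supplied depends on the deployment.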

Page 25: Big Data in the Cloud? Yes, you can do it in OpenStack

Resources
- Documentation
  - http://docs.openstack.org/developer/sahara/
  - https://wiki.openstack.org/wiki/Sahara
- Hadoop/Spark Images
  - http://sahara-files.mirantis.com/images/upstream/mitaka/
- OpenStack Auto-deployment with RDO
  - https://www.rdoproject.org/install/quickstart/
- Videos
  - https://www.youtube.com/watch?v=idAaLo1stbw
  - https://www.youtube.com/watch?v=TgPTjrf1y0A


Page 26: Big Data in the Cloud? Yes, you can do it in OpenStack

http://hackathon.openstackgdl.org/

Page 27: Big Data in the Cloud? Yes, you can do it in OpenStack

Q & A
CONCLUSION

Page 28: Big Data in the Cloud? Yes, you can do it in OpenStack

Thanks!
Any questions?
You can find me at:
◇ @obedmr
◇ [email protected]