Using Hadoop stack to build a cloud VAT declarations revising service. Alex Chistyakov, Git in Sky. Grodno, LVEE 2016
My talk at LVEE 2016

Jan 09, 2017

Alex Chistyakov
Transcript
Page 1: My talk at LVEE 2016

Using Hadoop stack to build a cloud VAT declarations revising service

Alex Chistyakov, Git in Sky

Grodno, LVEE 2016

Page 2: My talk at LVEE 2016

Who I am

● Hello, my name is Alex

● Principal Engineer @ Git in Sky

● Hadoop operations engineer

● Former Java developer (not only Java and not so “former” in fact)

Page 3: My talk at LVEE 2016

Who are you?

● Linux and OSS enthusiasts?

● Software developers?

● DevOps engineers?

● Big data guys?

Page 4: My talk at LVEE 2016

Well, what is this all about?

● Configuring a Hadoop/HBase cluster is easy

Page 5: My talk at LVEE 2016

Well, what is this all about?

● Configuring a Hadoop/HBase cluster is easy

● 1) Buy a lot of hardware

Page 6: My talk at LVEE 2016

Well, what is this all about?

● Configuring a Hadoop/HBase cluster is easy

● 1) Buy a lot of hardware

● 2) Configure the bloody cluster!

Page 7: My talk at LVEE 2016

Well, what is this all about?

● Configuring a Hadoop/HBase cluster is easy

● 1) Buy a lot of hardware

● 2) Configure the bloody cluster!

● 3) ???

Page 8: My talk at LVEE 2016

Well, what is this all about?

● Configuring a Hadoop/HBase cluster is easy

● 1) Buy a lot of hardware

● 2) Configure the bloody cluster!

● 3) ???

● 4) PROFIT!!!

Page 9: My talk at LVEE 2016

Big Data is hard!

● A customer wants a number of environments for different purposes (dev, testing, staging & production)

● DevOps culture requires repeatability!

● (Observe a beautiful snowflake to the right)

● Business wants to reduce costs

Page 10: My talk at LVEE 2016

So, we need a detailed plan

● 1) Buy an enterprise subscription from Oracle

Page 11: My talk at LVEE 2016

So, we need a detailed plan

● 1) Buy an enterprise subscription from Oracle

● ^ FAIL!

Page 12: My talk at LVEE 2016

So, we need a detailed plan

● 1) Read the manual on the product site

Page 13: My talk at LVEE 2016

So, we need a detailed plan

● 1) Read the manual on the product site

● 2) Configure everything manually

Page 14: My talk at LVEE 2016

So, we need a detailed plan

● 1) Read the manual on the product site

● 2) Configure everything manually

● ^ FAIL!

Page 15: My talk at LVEE 2016

So, we need a detailed plan

● 1) Take Cloudera distribution of Hadoop

Page 16: My talk at LVEE 2016

So, we need a detailed plan

● 1) Take Cloudera distribution of Hadoop

● 2) Configure everything from a web interface

Page 17: My talk at LVEE 2016

So, we need a detailed plan

● 1) Take Cloudera distribution of Hadoop

● 2) Configure everything from a web interface

● 3) Don’t forget to buy an enterprise subscription

Page 18: My talk at LVEE 2016

So, we need a detailed plan

● 1) Take Cloudera distribution of Hadoop

● 2) Configure everything from a web interface

● 3) Don’t forget to buy an enterprise subscription

● 4) ^ MULTIPLE FAILS!!!

Page 19: My talk at LVEE 2016

A word on proprietary software

● Proprietary software is full of nasty bugs, period

Page 20: My talk at LVEE 2016

A word on open source software

● Open source software is awesome

Page 21: My talk at LVEE 2016

Software market in 2016

● It’s not “proprietary vs open source”

Page 22: My talk at LVEE 2016

Software market in 2016

● It’s not “proprietary vs open source”

● It’s “open source vs open source”

Page 23: My talk at LVEE 2016

Open source vs open source

● Cloudera CDH vs vanilla Apache

Page 24: My talk at LVEE 2016

So, we need a detailed plan

● 1) Hire a DevOps engineer

Page 25: My talk at LVEE 2016

So, we need a detailed plan

● 1) Hire a DevOps engineer

● 2) Use Chef or something

Page 26: My talk at LVEE 2016

So, we need a detailed plan

● 1) Hire a DevOps engineer

● 2) Use Chef or something

● 3) Automate all the things

Page 27: My talk at LVEE 2016

So, we need a detailed plan

● 1) Hire a DevOps engineer

● 2) Use Chef or something

● 3) Automate all the things

● 4) ???

Page 28: My talk at LVEE 2016

So, we need a detailed plan

● 1) Hire a DevOps engineer

● 2) Use Chef or something

● 3) Automate all the things

● 4) ???

● 5) PROFIT!!!

Page 29: My talk at LVEE 2016

100 reasons not to use Cloudera CDH

● Cloudera CDH obscures configuration

● Cloudera CDH generates textual configs from the DB

● Cloudera CDH is web-interface centric

● Cloudera CDH is a monolith with a vendor lock-in
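
One way to picture the alternative to database-generated configs is to keep the settings as plain data under version control and render the Hadoop `*-site.xml` files from it, so that every environment is built the same way. The sketch below is only an illustration in Python, not the tooling used in this talk (which is Ansible-based); the environment names follow the earlier slide, while the hostnames, the port and the `render_site_xml` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-environment settings; hostnames and port are made up.
ENVIRONMENTS = {
    "dev": {"fs.defaultFS": "hdfs://namenode.dev.example.com:8020"},
    "production": {"fs.defaultFS": "hdfs://namenode.prod.example.com:8020"},
}


def render_site_xml(properties):
    """Render a dict of settings in the Hadoop <configuration>/<property> layout."""
    root = ET.Element("configuration")
    # Sort keys so the same input always produces the same text (clean diffs).
    for name in sorted(properties):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(properties[name])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for env in sorted(ENVIRONMENTS):
        print(env)
        print(render_site_xml(ENVIRONMENTS[env]))
```

Because the rendered text depends only on the input data, two environments differ exactly where their settings differ, and the result can be reviewed and diffed like any other file in the repository.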

Page 30: My talk at LVEE 2016

Our own little open source product

● Based on Ansible (Ansible is like Chef but awesome)

● https://github.com/gitinsky/ansible-hadoop-stack-howto

● https://github.com/gitinsky/ansible-role-*

Page 31: My talk at LVEE 2016

Problems

● Lack of documentation

Page 32: My talk at LVEE 2016

Problems

● Lack of documentation

● Lack of manpower

Page 33: My talk at LVEE 2016

Problems

● Lack of documentation

● Lack of manpower

● Nobody uses our product (except us)

Page 34: My talk at LVEE 2016

What about the VAT service thing?

● Forget it, it’s not that relevant

Page 35: My talk at LVEE 2016

Conclusions

● Open source software is awesome

● But Cloudera CDH is not

● We can make open source software better

Page 36: My talk at LVEE 2016

So long, and thanks for all the fish!

● Ask your questions please

● Alex Chistyakov, Principal Engineer @ Git in Sky

● http://gitinsky.com

[email protected]

● http://meetup.com/DevOps-40