By Sriram Study Point
Page 1: Big data and hadoop ecosystem tools

By Sriram Study Point

 

Page 2: Big data and hadoop ecosystem tools

Introduction to Big Data

Properties of Big Data

Introduction to Hadoop

Core components in Hadoop

MapReduce

Hadoop Ecosystem tools

Conclusion

Page 3: Big data and hadoop ecosystem tools

Big data is data that is beyond the storage capacity and processing power of conventional systems

Properties of Big Data

According to IBM

Volume

Velocity

Variety

Page 5: Big data and hadoop ecosystem tools

1. Structured Data

RDBMS

2. Semi Structured Data

Log Files

3. Unstructured Data

Text, audio, video, images, etc.
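To make the three categories above concrete, here is a minimal Python sketch of how each kind of data is typically handled. The sample rows, the log line, and the regular expression are made up for illustration and are not taken from the slides.

```python
import csv
import io
import re

# Structured: fixed columns, like an export of an RDBMS table (hypothetical rows).
table = io.StringIO("id,name,city\n1,Asha,Chennai\n2,Ravi,Pune\n")
rows = list(csv.DictReader(table))

# Semi-structured: a log line follows a loose pattern but has no enforced schema.
log_line = '192.168.1.5 - - [10/Oct/2015:13:55:36] "GET /index.html" 200'
match = re.match(r'(\S+) .*\[(.*?)\] "(\w+) (\S+)" (\d+)', log_line)
ip, timestamp, method, path, status = match.groups()

# Unstructured: free text has no fields at all, so we can only tokenize it.
text = "Hadoop stores big data. Hadoop processes big data."
words = re.findall(r"[a-z]+", text.lower())

print(rows[0]["name"], ip, status, words.count("hadoop"))
```

The structured rows can be addressed by column name, the log line needs a pattern to pull fields out, and the free text offers nothing but raw tokens.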

Page 11: Big data and hadoop ecosystem tools

Name Node

Master of the system

Maintains and manages the blocks stored on the data nodes

Data Node

Slaves that provide the actual storage

Responsible for read and write operations
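The division of labour between the two node types can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the class names, the `put` and `get` helpers, the 4-byte block size, and the round-robin placement are all hypothetical and are not the Hadoop API.

```python
BLOCK_SIZE = 4  # bytes; deliberately tiny so the split into blocks is visible


class DataNode:
    """Stores the actual block contents and serves reads and writes."""

    def __init__(self):
        self.blocks = {}  # block_id -> bytes

    def write(self, block_id, chunk):
        self.blocks[block_id] = chunk

    def read(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Holds metadata only: which blocks make up a file and where they live."""

    def __init__(self, node_names):
        self.node_names = node_names
        self.block_map = {}  # filename -> list of (block_id, node name)

    def allocate(self, filename, num_blocks):
        # Round-robin placement; a stand-in for a real placement policy.
        self.block_map[filename] = [
            (f"{filename}#{i}", self.node_names[i % len(self.node_names)])
            for i in range(num_blocks)
        ]
        return self.block_map[filename]

    def locate(self, filename):
        return self.block_map[filename]


def put(name_node, data_nodes, filename, data):
    # The client asks the name node where blocks go, then writes to the data nodes.
    chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    plan = name_node.allocate(filename, len(chunks))
    for (block_id, node), chunk in zip(plan, chunks):
        data_nodes[node].write(block_id, chunk)


def get(name_node, data_nodes, filename):
    # The client looks up block locations, then reads each block from its data node.
    return b"".join(
        data_nodes[node].read(block_id)
        for block_id, node in name_node.locate(filename)
    )


nodes = {name: DataNode() for name in ("dn1", "dn2", "dn3")}
nn = NameNode(list(nodes))
put(nn, nodes, "log.txt", b"hello hadoop")
print(nn.locate("log.txt"))
print(get(nn, nodes, "log.txt"))
```

Note that the name node in this sketch never touches file contents; it only answers "which blocks, on which nodes".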

Page 13: Big data and hadoop ecosystem tools

Highly fault-tolerant

High Throughput

Suitable for applications with large data sets

Write once and read many times

Can be built from commodity hardware

Replicates data across different data nodes
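How replication gives fault tolerance can be sketched in a few lines of Python. The function names and the round-robin placement are assumptions made for illustration; 3 is used because it is the usual default replication factor in HDFS.

```python
REPLICATION = 3  # the usual HDFS default replication factor


def place_replicas(block_ids, node_names, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes (simple round-robin).

    Assumes replication <= len(node_names), otherwise replicas would repeat.
    """
    placement = {}
    for i, block_id in enumerate(block_ids):
        placement[block_id] = [
            node_names[(i + r) % len(node_names)] for r in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """A block stays readable while at least one replica is on a live node."""
    return {
        block_id
        for block_id, replicas in placement.items()
        if any(node not in failed_nodes for node in replicas)
    }


nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = place_replicas(["b0", "b1", "b2", "b3"], nodes)
survivors = readable_blocks(placement, failed_nodes={"dn1", "dn2"})
print(sorted(survivors))
```

With three replicas per block, losing any two of the four nodes still leaves every block readable; only when three nodes fail does a block become unavailable.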

Page 14: Big data and hadoop ecosystem tools

Areas where HDFS is not a good fit:

Low-latency data access (quick access to small amounts of data)

Lots of small files

Multiple writes, arbitrary file modifications
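Why lots of small files are a problem can be shown with rough arithmetic: the name node keeps an in-memory entry for every file and every block. The sketch below assumes about 150 bytes per entry, which is a commonly quoted rule of thumb and not an exact figure, and a 128 MB block size, which is a common default. The function name is hypothetical.

```python
import math

BYTES_PER_OBJECT = 150        # assumed rule-of-thumb memory cost per file or block entry
BLOCK_SIZE = 128 * 1024**2    # 128 MB, a common HDFS default block size


def namenode_bytes(total_bytes, file_size):
    """Estimate name node memory needed to track `total_bytes` stored as equal-sized files."""
    files = math.ceil(total_bytes / file_size)
    blocks_per_file = math.ceil(file_size / BLOCK_SIZE)
    # One entry per file plus one entry per block of that file.
    return files * (1 + blocks_per_file) * BYTES_PER_OBJECT


one_tb = 1024**4
small = namenode_bytes(one_tb, 1024**2)   # 1 TB stored as 1 MB files
large = namenode_bytes(one_tb, 1024**3)   # 1 TB stored as 1 GB files
print(small, large, round(small / large))
```

Under these assumptions the same terabyte costs the name node roughly 300 MB of memory as 1 MB files but only about 1.3 MB as 1 GB files.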

Page 17: Big data and hadoop ecosystem tools

Hive

Suited to users who are familiar with SQL

Initially developed by Facebook

Internally runs queries as MapReduce jobs

Uses HiveQL (Hive Query Language), which Hive interprets into MapReduce jobs

Can load thousands of rows at a time
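To illustrate how a SQL-style query can run as MapReduce, here is a minimal Python sketch of a query such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` expressed as map, shuffle, and reduce steps. The table, the query, and the function names are hypothetical, and this is not how Hive's planner is actually implemented; it only shows the shape of the computation.

```python
from collections import defaultdict

# Hypothetical table rows: (employee, department)
employees = [("asha", "sales"), ("ravi", "hr"), ("meena", "sales"), ("john", "it")]


def map_phase(rows):
    # Emit (department, 1) for every row, as a mapper would.
    for _name, dept in rows:
        yield dept, 1


def shuffle(pairs):
    # Group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Sum the ones per department, which yields COUNT(*) per group.
    return {key: sum(values) for key, values in groups.items()}


counts = reduce_phase(shuffle(map_phase(employees)))
print(counts)
```

The GROUP BY column becomes the map output key, and the aggregate function becomes the reducer.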

Page 18: Big data and hadoop ecosystem tools

Sqoop

Imports data from RDBMS to HDFS

Exports data from HDFS to RDBMS

Used to store data in HBase

Used to upload data to Hive
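The import and export directions can be pictured with a small stand-in written in Python. This is a conceptual sketch only: it uses an in-memory SQLite database in place of a real RDBMS and a list of comma-delimited lines in place of files on HDFS. The table names and the two helper functions are hypothetical and unrelated to the real tool's command line.

```python
import sqlite3

# An in-memory SQLite database stands in for the RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])


def import_table(conn, table):
    """RDBMS to files: read every row and render it as a comma-delimited line."""
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
    return [",".join(str(value) for value in row) for row in cursor]


def export_lines(conn, table, lines):
    """Files to RDBMS: parse delimited lines and insert them into `table`."""
    rows = [line.split(",") for line in lines]
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)


lines = import_table(conn, "customers")  # stands in for files written to HDFS
conn.execute("CREATE TABLE customers_copy (id INTEGER, name TEXT)")
export_lines(conn, "customers_copy", lines)
print(lines)
```

The real tool runs these transfers in parallel as map tasks; the sketch shows only the row-to-delimited-text round trip.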

Page 19: Big data and hadoop ecosystem tools

Pig

Does not require much knowledge of programming or SQL

Simplifies the work done by MapReduce programs

Initially developed by Yahoo

Has its own scripting language, Pig Latin

Page 20: Big data and hadoop ecosystem tools

Works as a server

Coordinates more than one job at a time

Page 21: Big data and hadoop ecosystem tools

HBase

NoSQL

Column-oriented format

Data can be stored and processed
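A column-oriented NoSQL layout can be pictured with a toy Python model in which each row key maps to a sparse set of `family:qualifier` columns. The `ToyTable` class, its methods, and the sample data are hypothetical and are not the HBase client API.

```python
from collections import defaultdict


class ToyTable:
    """Row key -> {"family:qualifier": value}; rows may have different columns."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        self.rows[row_key][column] = value

    def get(self, row_key, column=None):
        row = self.rows.get(row_key, {})
        return dict(row) if column is None else row.get(column)

    def scan_family(self, family):
        # Return only the columns of one family, in row-key order.
        prefix = family + ":"
        return {
            key: {c: v for c, v in cols.items() if c.startswith(prefix)}
            for key, cols in sorted(self.rows.items())
            if any(c.startswith(prefix) for c in cols)
        }


users = ToyTable()
users.put("user1", "info:name", "Asha")
users.put("user1", "contact:email", "asha@example.com")
users.put("user2", "info:name", "Ravi")  # no contact column for this row
print(users.scan_family("contact"))
```

Unlike an RDBMS table, no schema forces every row to carry every column, so `user2` simply has no `contact` data.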

Page 22: Big data and hadoop ecosystem tools

Hadoop can handle any type of data

Open Source from Apache

Fault Tolerant

Provides tools for various domains

Works very fast compared to others
