Hadoop: The Pile of Data
View the Big Data and Hadoop course at: http://www.edureka.co/big-data-and-hadoop
For more details, please contact us: US: 1800 275 9730 (toll free); INDIA: +91 88808 62004; Email us: [email protected]
For queries: post on Twitter @edurekaIN with #askEdureka, or on Facebook /edurekaIN
Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of commodity computers using a simple programming model.
It is an open-source data management framework with scale-out storage and distributed processing.
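The "simple programming model" mentioned above is MapReduce. As a rough illustration only, the following single-process Python sketch mimics its three stages (map, shuffle, reduce) on a word count. The function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and are not part of any Hadoop API; a real job would run these stages in parallel across the cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key.

    On a real cluster the framework does this between the map and reduce
    stages; here it is simulated with an in-memory dictionary.
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data and Hadoop"]))
```

The point of the sketch is that the programmer writes only the map and reduce functions; splitting the input, grouping by key, and distributing the work are left to the framework.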
RDBMS vs. HADOOP
Resources
  RDBMS: Known Entity
  HADOOP: Growing, Complexities, Wide
Best Fit Use
  RDBMS: OLTP; Complex ACID Transactions; Operational Data Store
  HADOOP: Data Discovery; Processing Unstructured Data; Massive Storage/Processing
What is Big Data?
Lots of data (terabytes or petabytes).
Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.
[Figure: "Big Data" word cloud with the terms cloud, tools, statistics, NoSQL, compression, storage, support, database, analyze, information, terabytes, processing, mobile]
IBM’s Definition: Big Data Characteristics
http://www-01.ibm.com/software/data/bigdata/
To implement Hadoop on your data, you should first understand how complex the data is and the rate at which it is going to grow.
So we need cluster planning. It may begin with building a small or medium cluster sized for the data available at present (in GBs or a few TBs), and then scaling the cluster in the future depending on the growth of your data.
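The planning step above comes down to simple arithmetic, which can be sketched roughly as follows. This is a back-of-the-envelope estimate, not a sizing method from the slides: the function names, the 8 TB of usable disk per node, the 25% headroom, and the compound monthly growth model are all illustrative assumptions. The only Hadoop-specific figure is the replication factor of 3, which is the HDFS default.

```python
import math


def projected_data_tb(current_tb: float, monthly_growth_rate: float, months: int) -> float:
    """Project data volume forward, assuming compound monthly growth.

    monthly_growth_rate is a fraction, e.g. 0.10 for 10% per month
    (an assumed growth model; measure your own data to choose it).
    """
    return current_tb * (1 + monthly_growth_rate) ** months


def nodes_needed(
    data_tb: float,
    replication: int = 3,            # HDFS default replication factor
    usable_tb_per_node: float = 8.0,  # assumption: usable disk per data node
    headroom: float = 0.25,           # assumption: fraction of disk kept free
) -> int:
    """Estimate how many data nodes are needed to hold data_tb of data."""
    raw_tb = data_tb * replication / (1 - headroom)
    return max(1, math.ceil(raw_tb / usable_tb_per_node))


if __name__ == "__main__":
    today_tb = 2.0  # hypothetical starting volume
    in_a_year_tb = projected_data_tb(today_tb, 0.10, 12)
    print("nodes today:", nodes_needed(today_tb))
    print("nodes in 12 months:", nodes_needed(in_a_year_tb))
```

Running the estimate for both the present volume and a projected volume gives a starting cluster size and a rough idea of how many nodes to add later.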