POORNIMA INSTITUTE OF ENGINEERING & TECHNOLOGY, JAIPUR
DEPARTMENT OF COMPUTER ENGINEERING
A PRACTICAL TRAINING PRESENTATION ON BIG DATA HADOOP
SESSION 2014 – 15
Presented By: Ashutosh Tiwari (CE/11/083), Ashok Rayal (CE/11/025)
Guided By: Dr. E.S. Pilli, Assistant Professor, CS Department, MNIT, Jaipur
Presentation on Big Data Hadoop (Summer Training Demo)
Start Date: 28/05/2014
Last Date: 9/07/2014
No. of Days: 45 (30+15)
Timing: 9 AM to 5 PM
Our training at MNIT was broadly divided into three phases:
o Case study of Hadoop and related papers (first 30 days).
o Hadoop cluster making (first 30 days).
o Implementation of Near Duplicate Detection Using Hadoop MapReduce (last 15 days).
ABOUT PROJECT
Near Duplicate Detection:
Comparative analysis of the millions of documents that exist on a network, to find similar documents based on a predefined threshold value.
Near duplicate detection is mainly used in web crawling and many other data mining tasks.
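The threshold idea above can be illustrated with a small single-machine sketch. The slides do not say which similarity measure the project used, so this example assumes Jaccard similarity over word shingles. The class name NearDuplicateSketch, the shingle size k, and the threshold 0.5 are hypothetical choices for illustration. This is not the Hadoop MapReduce implementation from the training.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only: Jaccard similarity over word shingles is an
// assumed measure; the slides do not name the one used in the project.
public class NearDuplicateSketch {

    // Split a document into overlapping runs of k consecutive words ("shingles").
    static Set<String> shingles(String text, int k) {
        Set<String> result = new HashSet<>();
        String cleaned = text.toLowerCase().trim();
        if (cleaned.isEmpty()) {
            return result; // no words, so no shingles
        }
        String[] words = cleaned.split("\\s+");
        for (int i = 0; i + k <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + k)));
        }
        return result;
    }

    // Jaccard similarity: |A intersect B| / |A union B|, a value in [0, 1].
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // two empty documents are treated as identical
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    // Two documents are near duplicates if their similarity reaches the threshold.
    static boolean isNearDuplicate(String d1, String d2, int k, double threshold) {
        return jaccard(shingles(d1, k), shingles(d2, k)) >= threshold;
    }

    public static void main(String[] args) {
        String d1 = "the quick brown fox jumps over the lazy dog";
        String d2 = "the quick brown fox jumps over the sleepy dog";
        String d3 = "hadoop stores files in blocks across the cluster";
        double threshold = 0.5; // hypothetical predefined threshold

        System.out.println("d1 vs d2 near duplicate: " + isNearDuplicate(d1, d2, 2, threshold));
        System.out.println("d1 vs d3 near duplicate: " + isNearDuplicate(d1, d3, 2, threshold));
    }
}
```

Comparing every pair of documents this way does not scale to millions of documents on one machine, which is the motivation for distributing the work with MapReduce.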
TECHNOLOGY SPECIFICATION OF PROJECT
Project: Near Duplicate Detection
Technology Used:
o Hadoop MapReduce
o HDFS
o SSH and Shell Scripting
o Java
SNAPSHOTS-HDFS
SNAPSHOTS-MAPREDUCE PROCESSING
SNAPSHOTS-OUTPUT
CONCLUSION
Training in Big Data helped us understand a major current trend in the IT industry and how technology is contributing to human development.
Big Data is the future. A lot of research is currently going on in this field. As data is growing at an ever faster rate, there is a strong need for tools and technologies that can handle it.
Hadoop is a rapidly emerging framework, used by large firms such as Facebook, Microsoft, IBM, Yahoo, Amazon and many others.
Our experience at MNIT was excellent, as it gave us the platform and support for our tasks and case study.