Big Data and Hadoop
www.skillpeed.com Page 1
GOT ANY DOUBTS? TWEET WITH #ASKSKILLSPEED AND WE'LL GET THEM CLEARED!
INSTALLATION MANUAL: HADOOP
Step 1: Download VMware from this link:
https://my.vmware.com/web/vmware/downloads
Step 2: Download the Ubuntu 14.04 image from the following link:
http://www.traffictool.net/vmware/ubuntu1404t.html
Step 3: Extract the file to a location on your machine
Step 4: Open VMware and click on “Open a Virtual Machine”
Step 5: Now add the extracted Ubuntu image.
When VMware shows the virtual machine as in the image, click on “Edit virtual machine settings”
Step 6: Increase the virtual machine's memory to at least 6 GB, then click OK.
Step 7: Now click on “Play virtual machine”
Step 8: You’ll see the console below:
Step 9: Click on Terminal
Step 10: Install Java. First add the repository as shown in the screenshot below
Step 11: When asked for the password, enter “password”, then press Enter again as shown below:
Step 12: Now update your system
Step 13: The update takes a few minutes. After completion, the screen will look as shown below
Step 14: Now invoke the java-7-installer as shown below
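A minimal sketch of what Steps 10 to 14 likely amount to on the command line. It assumes the repository is the WebUpd8 PPA (ppa:webupd8team/java) and that the installer package is oracle-java7-installer; both names are inferred from the "java-7-installer" wording and are assumptions, and the function name install_oracle_java7 is hypothetical.

```shell
# Hypothetical helper wrapping Steps 10-14; nothing runs until it is called.
install_oracle_java7() {
  # Step 10: add the Java repository (assumed to be the WebUpd8 PPA)
  sudo add-apt-repository ppa:webupd8team/java || return 1
  # Step 12: refresh the package lists
  sudo apt-get update || return 1
  # Step 14: start the Java 7 installer (assumed package name)
  sudo apt-get install oracle-java7-installer
}
```

Calling `install_oracle_java7` runs the three steps in order; the password and license prompts described in Steps 11, 15 and 16 appear while it runs. Afterwards, `java -version` should report a 1.7 release (Step 18).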
Step 15: You will be asked to accept the license; press Enter for OK as shown below
Step 16: Select Yes
Step 17: The Java installation will now continue for some time. Once it is complete, the screen will
look as shown below
Step 18: Confirm the Java version as shown below
Step 19: Now install the OpenSSH server
Step 20: Generate SSH keys
Step 21: When asked for a password, just press Enter; do not type one. Once done, the screen will
look as shown below
Step 22: Now just enable ssh access
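Steps 20 to 22 can be sketched as follows, assuming the OpenSSH tools from Step 19 are installed (typically via `sudo apt-get install openssh-server`). The RSA key type, the empty passphrase and the default path ~/.ssh/id_rsa are assumptions.

```shell
# Steps 20-21: generate an RSA key pair with an empty passphrase,
# skipping generation if a key already exists at the default path.
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"

# Step 22: authorize the public key for logins to this machine.
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```

Because `-N ""` and `-f` are given, ssh-keygen does not show the prompts mentioned in Step 21. Afterwards, `ssh localhost` should log in without asking for a password (Step 23).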
Step 23: Test SSH access
Step 24: Now disable IPv6 as shown below
Open the sysctl.conf file in vi editor as follows:
Step 25: Add the following lines as shown in screenshot below
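A sketch of Steps 24 and 25, assuming the lines in question are the three standard disable_ipv6 kernel settings. The scratch file name ipv6-off.conf is hypothetical; it is used so that nothing under /etc changes until the settings have been reviewed.

```shell
# Write the assumed IPv6 settings to a scratch file for review.
cat > ipv6-off.conf <<'EOF'
# disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF
cat ipv6-off.conf
```

To apply the settings, append the file to /etc/sysctl.conf with `sudo sh -c 'cat ipv6-off.conf >> /etc/sysctl.conf'`, or type the same lines into the file with vi.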
Step 26: Now reboot your virtual machine
Step 27: After the machine restarts, test whether IPv6 is disabled. Your output should be as shown in
the screenshot below:
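One way to run the check in Step 27, assuming the kernel exposes the setting at /proc/sys/net/ipv6/conf/all/disable_ipv6 (the usual location on Linux). A value of 1 in that file means IPv6 is disabled.

```shell
flag=/proc/sys/net/ipv6/conf/all/disable_ipv6
if [ ! -r "$flag" ]; then
  ipv6_state="unknown"          # setting not exposed on this system
elif [ "$(cat "$flag")" = "1" ]; then
  ipv6_state="disabled"
else
  ipv6_state="enabled"
fi
echo "IPv6 is $ipv6_state"
```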
Step 28: Now we are ready for Hadoop installation
Download the Hadoop release from this URL:
http://apache.bytenet.in/hadoop/common/stable1/
Download hadoop-1.2.1.tar.gz as shown below:
Step 29: Now copy the downloaded tar file from the Downloads folder to your home folder as shown
below
Step 30: Now you should see the tar file in your home folder as shown below
Step 31: Now extract the tar file as shown below
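Steps 29 to 31 can be sketched as below, assuming the browser saved the archive to ~/Downloads (an assumption about the download location).

```shell
tarball="hadoop-1.2.1.tar.gz"
cd "$HOME"

# Step 29: copy the archive from Downloads to the home folder
[ -f "Downloads/$tarball" ] && cp "Downloads/$tarball" .

# Steps 30-31: confirm the archive is present, then extract it
if [ -f "$tarball" ]; then
  tar -xzf "$tarball"
  ls -d hadoop-1.2.1            # the extracted Hadoop folder
else
  echo "$tarball not found in $HOME" >&2
fi
```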
Step 32: Now you should see the Hadoop folder as follows
Step 33: Now open your .bashrc file as shown below
Step 34: Now add the highlighted lines as shown below
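A sketch of lines that would fit Step 34, assuming Java was installed to /usr/lib/jvm/java-7-oracle and Hadoop was extracted to ~/hadoop-1.2.1. Both paths are assumptions; adjust them to match your machine.

```shell
# Append the assumed environment settings to ~/.bashrc.
# The quoted 'EOF' keeps $HOME and $PATH unexpanded in the file.
cat >> "$HOME/.bashrc" <<'EOF'

# Hadoop and Java environment
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_HOME=$HOME/hadoop-1.2.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
EOF
```

In a new terminal, `echo $JAVA_HOME` and `echo $HADOOP_HOME` should then print the two paths (Step 35).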
Step 35: Close the terminal and open a new one to check the Hadoop and Java home paths
as shown below
Step 36: Now we will configure Hadoop. Go to the conf directory in the Hadoop folder; you should be
able to see all the config files as shown below
Step 37: Edit hadoop-env.sh as shown below
Step 38: Change hadoop-env.sh as shown below
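A sketch of the change, assuming it sets JAVA_HOME in conf/hadoop-env.sh, that Java is at /usr/lib/jvm/java-7-oracle, and that Hadoop lives in ~/hadoop-1.2.1 when HADOOP_HOME is unset. All three are assumptions.

```shell
conf_dir="${HADOOP_HOME:-$HOME/hadoop-1.2.1}/conf"
mkdir -p "$conf_dir"            # no-op when the folder already exists

# Appending works because hadoop-env.sh is sourced as a shell script,
# so the last assignment of JAVA_HOME wins.
echo 'export JAVA_HOME=/usr/lib/jvm/java-7-oracle' >> "$conf_dir/hadoop-env.sh"
tail -n 1 "$conf_dir/hadoop-env.sh"
```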
Step 39: Now create a Hadoop_data folder in your home location as shown below
Step 40: Now edit core-site.xml as follows
Step 41: Change the content as shown in the screenshot below
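A sketch of a single-node core-site.xml, assuming the NameNode listens on hdfs://localhost:9000 and that Hadoop's temporary directory is the Hadoop_data folder from Step 39. The port and the use of hadoop.tmp.dir are assumptions. The script overwrites any existing core-site.xml.

```shell
conf_dir="${HADOOP_HOME:-$HOME/hadoop-1.2.1}/conf"
mkdir -p "$conf_dir" "$HOME/Hadoop_data"   # no-ops when both exist

# Unquoted EOF so that $HOME expands to an absolute path in the file.
cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$HOME/Hadoop_data</value>
  </property>
</configuration>
EOF
```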
Step 42: Edit the mapred-site.xml
Step 43: Change the contents as shown below:
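A sketch of a single-node mapred-site.xml, assuming the JobTracker runs at localhost:9001 (the port is an assumption) and that Hadoop lives in ~/hadoop-1.2.1 when HADOOP_HOME is unset. The script overwrites any existing mapred-site.xml.

```shell
conf_dir="${HADOOP_HOME:-$HOME/hadoop-1.2.1}/conf"
mkdir -p "$conf_dir"            # no-op when the folder already exists

cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
```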
Step 44: Now edit the hdfs-site.xml
Step 45: Change the content as shown below
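A sketch of a single-node hdfs-site.xml, assuming the change sets the block replication factor to 1, which suits a machine with only one DataNode. The choice of property is an assumption. The script overwrites any existing hdfs-site.xml.

```shell
conf_dir="${HADOOP_HOME:-$HOME/hadoop-1.2.1}/conf"
mkdir -p "$conf_dir"            # no-op when the folder already exists

cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```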
Step 46: Format the NameNode
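In Hadoop 1.x the NameNode is formatted with `hadoop namenode -format`. A guarded sketch, assuming Hadoop lives in ~/hadoop-1.2.1 when HADOOP_HOME is unset:

```shell
hadoop_bin="${HADOOP_HOME:-$HOME/hadoop-1.2.1}/bin/hadoop"
if [ -x "$hadoop_bin" ]; then
  "$hadoop_bin" namenode -format
else
  echo "hadoop launcher not found at $hadoop_bin" >&2
fi
```

Formatting initializes a fresh HDFS namespace, so it is meant to be run once, before the services are started for the first time.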
Step 47: Once formatted, the screen should look as shown below:
Step 48: Now change into the bin folder of Hadoop
Step 49: Start all the services of Hadoop as shown below:
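Steps 48 and 49 can be sketched as below, using the start-all.sh script that ships in the bin folder of Hadoop 1.x. The fallback path ~/hadoop-1.2.1 is an assumption.

```shell
hadoop_home="${HADOOP_HOME:-$HOME/hadoop-1.2.1}"
if [ -x "$hadoop_home/bin/start-all.sh" ]; then
  cd "$hadoop_home/bin"         # Step 48
  ./start-all.sh                # Step 49: starts the HDFS and MapReduce daemons
else
  echo "start-all.sh not found in $hadoop_home/bin" >&2
fi
```

The matching stop-all.sh script in the same folder shuts the services down again.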
Step 50: Once all the services are up, the screen should look as shown below
Step 51: You can confirm the services by running the jps command as shown below
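On a single-node Hadoop 1.x setup, jps is expected to list NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. A small sketch that reports which of them are absent; the helper name missing_daemons is hypothetical:

```shell
# Daemons a single-node Hadoop 1.x setup is expected to run.
expected_daemons="NameNode DataNode SecondaryNameNode JobTracker TaskTracker"

# Hypothetical helper: takes jps output as $1 and prints every expected
# daemon that does not appear in it (prints nothing when all are up).
missing_daemons() {
  for daemon in $expected_daemons; do
    printf '%s\n' "$1" | grep -qw "$daemon" || echo "$daemon"
  done
}

# Run the check only where the JDK's jps tool is available.
if command -v jps >/dev/null 2>&1; then
  missing_daemons "$(jps)"
fi
```

Empty output means all five services are running. Any name printed points to a daemon whose log under the Hadoop logs folder is worth checking.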
Step 52: Now test whether a sample MapReduce program runs by launching the pi calculation
job as shown below
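A sketch of launching the pi example that ships with Hadoop 1.2.1 in hadoop-examples-1.2.1.jar. The fallback path ~/hadoop-1.2.1 and the arguments 10 and 100 are assumptions, chosen only to keep the run short.

```shell
hadoop_home="${HADOOP_HOME:-$HOME/hadoop-1.2.1}"
examples_jar="$hadoop_home/hadoop-examples-1.2.1.jar"
if [ -f "$examples_jar" ]; then
  # pi <number of map tasks> <samples per map task>
  "$hadoop_home/bin/hadoop" jar "$examples_jar" pi 10 100
else
  echo "examples jar not found at $examples_jar" >&2
fi
```

A successful run ends with a line beginning "Estimated value of Pi is".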
Step 53: Once completed, you’ll see the output as shown below: