Distributed Computing With Raspberry Pi
Dec 15, 2015
The main aim of the project is to build a cluster rack from Raspberry Pi boards.
It can be used as a testing server and a playground for research and development.
Multiple Raspberry Pis sharing a job can finish it much faster than a single board.
Aim of the project
The Raspberry Pi is a low cost, credit-card sized computer that plugs into a computer monitor or TV, and uses a standard keyboard and mouse. It is a capable little device that enables people of all ages to explore computing, and to learn how to program in languages like Scratch and Python.
What is Raspberry Pi?
A computer cluster consists of a set of loosely connected or tightly connected computers that work together so that in many respects they can be viewed as a single system.
Cluster based computing
• Scalability (in terms of both time and space)
• Ability to deal with different data types
• Minimal requirements for domain knowledge to determine input parameters
• Able to deal with noise and outliers
• Insensitive to order of input records
• Incorporation of user-specified constraints
• Interpretability and usability
Desirable Properties of a Clustering Algorithm
Connecting all the Raspberry Pis to the router using LAN cables.
Assigning a dedicated IP address to each Raspberry Pi through DHCP.
Providing every Raspberry Pi with an RSA key pair to establish communication between them.
Making one of the Raspberry Pis the load balancer.
Connecting the Raspberry Pis
The IPs are dedicated so that the Python code on the Pis can send data to the same, known Pi each time for computing.
The RSA keys are provided so that the Pis can communicate with each other without authenticating each time.
The load balancer's task is to assign jobs to each Raspberry Pi.
Need for the previous steps
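As a rough illustration of why fixed addresses matter, the sketch below maps jobs onto a list of worker IPs. The addresses and the round-robin policy are assumptions made for the example; the slides do not say which addresses or which assignment policy the load balancer actually used.

```python
from itertools import cycle

# Hypothetical reserved addresses; the real ones depend on the router setup.
WORKER_IPS = ["192.168.1.101", "192.168.1.102", "192.168.1.103"]


def assign_jobs(jobs, workers=WORKER_IPS):
    """Hand out jobs to workers in round-robin order (an assumed policy)."""
    plan = {ip: [] for ip in workers}
    for job, ip in zip(jobs, cycle(workers)):
        plan[ip].append(job)
    return plan


if __name__ == "__main__":
    for ip, jobs in assign_jobs(list(range(7))).items():
        print(ip, jobs)
```

Because the addresses never change, the same plan can be reused on every run without rediscovering the boards.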
• Wnet Watcher
• WinSCP
• PuTTY
Tools Used
Providing each Raspberry Pi with a dedicated screen and input device (not possible).
Instead, using PuTTY to log in to each Raspberry Pi, and
WinSCP to access the terminal of each Pi on a single screen.
Task 1
After getting access to each Pi's terminal, our job was to prepare a task that can run on a single Pi and on the cluster too.
Preparing merge sort Python code for a single Pi.
Running the code on each Pi and measuring the time taken (44.048 sec for a random array of 2 lakh, i.e. 200,000, elements).
Task 2
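The single-Pi measurement can be sketched roughly as follows. This is not the original Merge1.py, only a minimal merge sort with a timer around it; the function names, the value range and the seed are assumptions for the example.

```python
import random
import time


def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Top-down merge sort that returns a new sorted list."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


def timed_sort(n_items, seed=0):
    """Sort n_items random integers and return (result, seconds taken)."""
    rng = random.Random(seed)
    data = [rng.randint(0, 1_000_000) for _ in range(n_items)]
    start = time.perf_counter()
    result = merge_sort(data)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    result, elapsed = timed_sort(200_000)
    print(f"sorted {len(result)} items in {elapsed:.3f} s")
```

The measured time depends heavily on the board and the Python version, so it will not necessarily match the figure reported above.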
Building a client-server architecture for communication between the load balancer and the other Pis.
Assigning one Pi as the server and the other Pis as clients.
Running the merge sort Python code on the same 2 lakh (200,000) random elements and measuring the time (24.535 sec, about 1.8 times faster than on a single Pi).
Task 3
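The client-server split can be sketched as below. This is not the original MergeServer.py or MergeClient.py: it assumes Python's standard `multiprocessing.connection` module for the messaging, uses the built-in `sorted` as a stand-in for the per-Pi sort, and runs the clients as local threads so that it can be tried on one machine. The shared key and all function names are hypothetical.

```python
import heapq
import random
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"pi-cluster"  # hypothetical shared secret for the demo


def worker(address):
    """Client role: receive one chunk, sort it, send it back."""
    with Client(address, authkey=AUTHKEY) as conn:
        chunk = conn.recv()
        conn.send(sorted(chunk))  # stand-in for the per-Pi merge sort


def distribute_sort(data, listener, n_workers):
    """Server role: split the data, give each client a chunk, merge results."""
    size = -(-len(data) // n_workers)  # ceiling division
    chunks = [data[i * size:(i + 1) * size] for i in range(n_workers)]
    conns = []
    for chunk in chunks:
        conn = listener.accept()  # wait for the next client to connect
        conn.send(chunk)
        conns.append(conn)
    sorted_chunks = [conn.recv() for conn in conns]
    for conn in conns:
        conn.close()
    # Final merge step: combine the sorted chunks into one sorted list.
    return list(heapq.merge(*sorted_chunks))


def demo(n_items=20_000, n_workers=3):
    """Run the server and n_workers clients on localhost."""
    data = [random.randint(0, 1_000_000) for _ in range(n_items)]
    with Listener(("127.0.0.1", 0), backlog=n_workers,
                  authkey=AUTHKEY) as listener:
        threads = [threading.Thread(target=worker, args=(listener.address,))
                   for _ in range(n_workers)]
        for t in threads:
            t.start()
        result = distribute_sort(data, listener, n_workers)
        for t in threads:
            t.join()
    return data, result


if __name__ == "__main__":
    data, result = demo()
    print("sorted correctly:", result == sorted(data))
```

On a real cluster each client would run `worker` on its own Pi and connect to the server's dedicated IP instead of `127.0.0.1`; threads on one machine show the message flow but not the speedup.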
Server Running
Client Running
Time with Single Pi
Time on Distributed System
Server module (MergeServer.py)
Client module (MergeClient.py)
Task module for single Pi (Merge1.py)
Task module for distributed system (MergeSort.py)
Modules in the Pi
• Data analysis
• To learn aspects of cluster computing
• Parallel processing
• Hadoop in general
• How servers work
Main focus
• Low cost
• Higher efficiency
• Software like Hadoop is open source and hence can be used freely, which eases the research work
• Aspects of the server-based infrastructure stack will be known
Advantages
Pi Connection to Screen
Booting Pi for first use
Thank You