Cluster Computing in Android based Devices in Collaborative Environment
B.Tech Project Mid-Term Evaluation, April 2014

Group Members
Kari Dharani Kumar    10114024
Vishruth Reddy        10114030
Sumit Badwal          10114040
Sumit Kumar           10114041

Supervisor
Dr. P. Sateesh Kumar
Assistant Professor, Dept. of Computer Science and Engineering,
IIT Roorkee

Dept. of Computer Science and Engineering, Indian Institute of
Technology Roorkee

Table of Contents
Problem Statement
Introduction
Work Done
    Cluster setup on x86
    Functioning of Android Application
Modules
    User Identification
    Splitting Text File
    Searching
Snapshots of Application
Performance Evaluation
    Measuring Speedup
    Network Performance (using NETPerf)
Future Work
Conclusion
References

Problem Statement
Our topic, Cluster Computing in Android based Devices, focuses on
two points: load balancing and high availability. A program or task
given to an embedded system such as a mobile phone or a digital TV
can be parallelized for better performance and throughput. The
program is divided into separate tasks, and each task is given to a
node in the cluster, so the whole job completes in less time. Working
as a cluster also makes the overall system computationally more
powerful than any single device. In this project we take work that is
normally done on x86 machines and move it to ARM processors. To be
precise, we want to get the same work done on ARM-based embedded
systems, such as mobile phones and digital TVs, that run Android or
any flavour of Linux.
Introduction
Cluster Computing: In cluster computing, a collection of stand-alone
computers is interconnected to form a single integrated computing
resource that offers better performance and availability than a
single computer. The most common uses are load balancing and
providing high availability.
Load Balancing - a single workload (e.g. a computation) is shared by
several computers that are linked together and work as a single
unit. Workloads arriving at the system are distributed among the
computers in the cluster so that the work is balanced among them.
This improves the performance of the whole system.
High Availability - redundant nodes are provided to make sure that
the service provided by the cluster stays available even when some
system components fail.
Clusters can deliver a large improvement in performance relative to
their price.
Our project focuses on a mobile phone cluster implemented on the
Android platform. The phones run the Android application, discover
each other, transfer data, and send messages to each other over the
same Wi-Fi network. A task is generated or received by the master
node, which is already on that Wi-Fi network. The master transfers
and assigns the task to the mobile slaves in the same way, and each
slave then sends its result back to the master. Compared with a
conventional computer cluster, our system model has these additional
features:
Mobility - the mobile devices can be relocated anywhere because they
do not depend on location. They only need a network to connect to
each other, which in our case is Wi-Fi.
Scalability - the system is highly scalable since mobile devices can
be added or removed when needed. When the load on the system is
high, new Android devices running the same application can be
introduced into the Wi-Fi network.
Decentralization - all slaves are treated alike, and the master
gives each slave its share of the processing.
Work Done
Cluster setup on x86
Fig 1. Setting up the cluster
Functioning of Android Application
The devices running the application can detect other devices running
the same application over the same Wi-Fi network. This is achieved
by UDP-broadcast discovery, followed by a TCP/IP-based protocol
stack that creates a reliable, local, peer-to-peer communications
network.
UDP stands for User Datagram Protocol. Broadcast packets are sent to
the entire subnet (e.g. 192.168.7.1 to 192.168.7.254). Each node
transmits a UDP broadcast periodically and parses the broadcast
messages from other nodes to discover all the nodes on the same
subnet.
TCP stands for Transmission Control Protocol. Unlike UDP, it is used
for creating reliable connections: in a TCP connection an
acknowledgement is sent upon data reception. A connection has 3
phases:
1. Connection Establishment
2. Send Data
3. Connection Termination
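The discovery step described above can be sketched in Python as follows. This is a minimal illustration and not the application's actual code: the JSON message layout, the port number, the timeout, and the names encode_announcement, parse_announcement, PeerTable and make_broadcast_socket are all assumptions made for the example.

```python
import json
import socket

DISCOVERY_PORT = 50000   # hypothetical port, not taken from the report
PEER_TIMEOUT_S = 10.0    # hypothetical: forget peers silent for this long


def encode_announcement(node_id, name):
    """Build the datagram a node broadcasts periodically."""
    return json.dumps({"id": node_id, "name": name}).encode("utf-8")


def parse_announcement(datagram):
    """Return (node_id, name), or None if the datagram is not one of ours."""
    try:
        msg = json.loads(datagram.decode("utf-8"))
        return msg["id"], msg["name"]
    except (ValueError, KeyError, TypeError):
        return None


class PeerTable:
    """Tracks which peers have been heard from recently."""

    def __init__(self, own_id, timeout_s=PEER_TIMEOUT_S):
        self.own_id = own_id
        self.timeout_s = timeout_s
        self._last_seen = {}   # node_id -> (name, timestamp)

    def observe(self, datagram, now):
        """Record a received broadcast; ignore garbage and our own packets."""
        parsed = parse_announcement(datagram)
        if parsed is None or parsed[0] == self.own_id:
            return
        self._last_seen[parsed[0]] = (parsed[1], now)

    def alive(self, now):
        """Map of node ID to name for peers heard within the timeout."""
        return {nid: name
                for nid, (name, seen) in self._last_seen.items()
                if now - seen <= self.timeout_s}


def make_broadcast_socket(port=DISCOVERY_PORT):
    """Create a UDP socket that may send to the subnet broadcast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", port))
    return sock
```

In a running node, a timer would call sock.sendto(encode_announcement(...), ("255.255.255.255", DISCOVERY_PORT)) periodically, while a receive loop feeds each datagram from sock.recvfrom into PeerTable.observe. Once a peer is known, a TCP connection to it would carry the reliable traffic.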
The application can share data such as text messages and images with
selected members of the Wi-Fi network. There are two kinds of
channels through which data is sent:
Public Channel - the mobile device or node is automatically added to
the public channel as soon as it starts discovery. This channel does
not restrict any node.
Private Channels - nodes can create their own channels by naming
them. When a node joins a channel with some name and other nodes
join a channel with the same name, that channel becomes a private
channel shared by those nodes.
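The channel rules above amount to a mapping from channel names to sets of nodes. The sketch below illustrates that idea only; ChannelRegistry, its method names, and the "public" channel name are assumptions for the example, not the application's API.

```python
from collections import defaultdict

PUBLIC_CHANNEL = "public"   # hypothetical name for the default channel


class ChannelRegistry:
    """Keeps track of which nodes belong to which channel."""

    def __init__(self):
        self._members = defaultdict(set)   # channel name -> node IDs

    def start_discovery(self, node_id):
        """A node that starts discovery lands in the public channel."""
        self._members[PUBLIC_CHANNEL].add(node_id)

    def join(self, node_id, channel_name):
        """Joining by name creates the private channel on first use."""
        self._members[channel_name].add(node_id)

    def recipients(self, sender_id, channel_name):
        """Nodes that receive a message the sender posts on the channel."""
        members = self._members.get(channel_name, set())
        return sorted(members - {sender_id})
```

For instance, if nodes 1 to 6 start discovery and nodes 1 and 2 both join "Channel A", a message from node 1 on "Channel A" reaches only node 2, while a message on the public channel reaches nodes 2 to 6.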
Fig 2. Mobile Channels (nodes 1 to 6 grouped into Channel A,
Channel B and the Public Channel)
Modules
Our project can be divided into modules that deal with communication
and with the distribution of tasks. Our task is specifically
designed to search for key-value pairs in a text file. This task is
computationally intensive when there is a very large number of
entries. In our case we split the text file into smaller files and
distribute these files to the nodes; each node searches its file and
returns the result.
User Identification
A mobile device has a unique identification code such as the IP or
MAC address of its Wi-Fi adapter. This ID can be used for
communication, but it is difficult for a user to recognise a
particular device from it. The application therefore uses
registration-based entry: each device has a name, and that name is
sent along with the ID, so the user can tell which mobile device
he/she is sharing data with. In essence this is a hash map from ID
to name.
IDs and keys are assigned m-bit identifiers using consistent
hashing, so that both keys and IDs (IP addresses) are uniformly
distributed in the same identifier space. Consistent hashing is
also what lets nodes join and leave the network without disruption.
Each key is mapped onto a node. For nodes, the identifier is a hash
of the node's IP address. For keys, the identifier is a hash of a
keyword, such as a file name. It is not uncommon to use the words
"nodes" and "keys" to refer to these identifiers rather than to the
actual nodes or keys.
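A minimal sketch of this key-to-node mapping is given below. It assumes SHA-1 as the hash, a 16-bit identifier space, and the Chord-style rule that a key belongs to the first node whose identifier is at or after the key's identifier. None of these choices is stated in the report, and the names identifier and Ring are hypothetical.

```python
import bisect
import hashlib

M_BITS = 16   # hypothetical identifier width; the report only says "m-bit"


def identifier(text, m=M_BITS):
    """Hash a node IP or a key into the m-bit identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


class Ring:
    """Nodes placed on a circle of identifiers."""

    def __init__(self):
        self._ids = []     # sorted node identifiers
        self._nodes = {}   # identifier -> node IP

    def add(self, ip):
        """Place a node on the ring (a colliding identifier is overwritten)."""
        ident = identifier(ip)
        if ident not in self._nodes:
            bisect.insort(self._ids, ident)
        self._nodes[ident] = ip

    def remove(self, ip):
        """Take a node off the ring if it is currently on it."""
        ident = identifier(ip)
        if self._nodes.get(ident) == ip:
            self._ids.remove(ident)
            del self._nodes[ident]

    def lookup(self, key):
        """Return the first node at or after the key's identifier, wrapping."""
        if not self._ids:
            raise LookupError("ring is empty")
        pos = bisect.bisect_left(self._ids, identifier(key))
        return self._nodes[self._ids[pos % len(self._ids)]]
```

The property that matters for nodes joining and leaving is visible here: removing one node only reassigns the keys that node owned, and every other key keeps its owner.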
Splitting Text File
A text file is split into multiple smaller text files so that the
work can be done by as many devices as possible. The advantage is
that multiple devices, instead of a single one, finish the task. The
split can be based on the size of the file or on the number of lines
each split file should hold. The algorithm works in a similar way to
the split command in Linux. Split files with similar names are
created dynamically, and their number depends on the size of the
input file. Files are accessed through file descriptors: a file must
be opened before it is read or written, and its descriptor must be
closed once the work is complete. A large text file is thus broken
into small chunks of data, each stored as an individual text file.
The maximum line length and the number of lines per file can both be
chosen, which affects the performance of the algorithm. The loop
runs until no more lines exist, and a buffer sized to the maximum
line length is used to write the data into the file. When the number
of lines in a split file reaches the limit, which is tracked with a
line counter, the file is closed, a new text file is created, and
the same process is repeated.

Example device ID to name map (see User Identification):
DEVICE ID           VALUE
Mt7kR1R!l*mdC94E    Sumit-xperia
hKRjtKmj#kh1FE47    vishruth-galaxy
sKTh3Kde$gf3N89O    Badwal-note3
cPmchsbj&bc5FgC9    Dharani-duos

Pseudocode:
Create 2 file pointers, 1 to read and the other to write
Set a char buffer to the max line size
Create an array to store the new split files
Set FileCounter and LineCounter to 1
Open the file to read
Create and open a new split file
Loop through all lines of the file
    If LineCounter exceeds the max no. of lines in a split file
        Close the split file
        Reset the LineCounter to 1
        Increment the FileCounter
        Create and open a new split file
    End if
    Write the line to the current split file
    Increment the LineCounter
End loop
Close the split file
Close the read file

Searching
Pseudocode:
Create an array to store the line numbers
Set Result to 0 (the number of matches stored so far)
Set LineNumber to 0
Open the file to read
Loop through each line of the file, searching for a matching string
    Increment LineNumber
    If a match is found
        Store LineNumber in the array
        Increment Result by 1
    End if
End loop
Close the file
Return the array as the result of search_in_file
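The two pseudocode listings can be combined into one runnable sketch. The version below splits by line count and searches by plain substring match, which is an assumption about what "matching string" means. The function names, the part_NNNN.txt naming scheme, and the sample data written by demo are hypothetical.

```python
import os
import tempfile


def split_file(path, max_lines, out_dir):
    """Split `path` into files of at most `max_lines` lines each."""
    parts = []
    out = None
    try:
        with open(path, "r", encoding="utf-8") as src:
            for line_no, line in enumerate(src):
                if line_no % max_lines == 0:   # time to start a new part
                    if out is not None:
                        out.close()
                    name = "part_%04d.txt" % len(parts)
                    part_path = os.path.join(out_dir, name)
                    out = open(part_path, "w", encoding="utf-8")
                    parts.append(part_path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return parts


def search_file(path, needle):
    """Return the 1-based numbers of the lines that contain `needle`."""
    with open(path, "r", encoding="utf-8") as src:
        return [n for n, line in enumerate(src, start=1) if needle in line]


def search_split(parts, needle, max_lines):
    """Search every part and map hits back to original line numbers."""
    hits = []
    for index, part in enumerate(parts):
        offset = index * max_lines   # lines held by the earlier parts
        hits.extend(offset + n for n in search_file(part, needle))
    return hits


def demo():
    """Split a 10-line key=value file into 4-line parts and search them."""
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "data.txt")
        with open(src, "w", encoding="utf-8") as f:
            f.writelines("key%d=value%d\n" % (i, i) for i in range(1, 11))
        parts = split_file(src, 4, work)
        return len(parts), search_split(parts, "key7=", 4)
```

In the cluster, each part would be sent to a different node and search_file would run there. The master only needs the part index to translate each node's line numbers back to positions in the original file, as search_split does.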
Snapshots of Application
Fig 3. First Screen
The first screen checks for a Wi-Fi network. If there is no Wi-Fi
network, it shows an error and the user cannot proceed to the next
screen, the Discovery Process.
Fig 4. Chord Discovery
On the next screen the user can start the discovery of his/her
Android device and get access to channels. The Start Discovery
button starts and stops the discovery. After Start Discovery has
been tapped, the Android device sends UDP broadcast packets
continuously to find other devices on the same Wi-Fi network. The
application also parses the broadcast messages received from other
devices to get the list of devices already present on the network.
Fig 5. Public Channel joined
The Android device has received UDP packets from another device and
has therefore joined the public channel. The IDs of the other phones
are displayed here.
Fig 6. Private Channel joined
The private channel is a form of secure channel for sharing data. A
channel is joined by giving its name, so to make it secure the user
can give it a unique name. A private channel has the same features
as the public channel; the only difference is the name-based entry.
Fig 7. All Available nodes
After pressing the More button, the user gets this screen, which
lists all the mobile devices on the channel. The public channel and
any joined private channel provide the same screen and features. The
name of the channel on which data will be shared is shown at the
top. Nodes are selected using the checkboxes, and the File button
attaches a file to share with them. The file types have been
restricted to text and images only.
Fig 8. Data Transfer
For now, the shared data is displayed in the form of a chat to keep
a record.
Libraries and Software Packages Used
Eclipse IDE
Android SDK
ZeroMQ (ØMQ) - a high-performance asynchronous messaging library
that lets us design a complex communication system easily. We could
have used the Berkeley socket interface, but maintaining raw sockets
is difficult and cumbersome when building a scalable system. ZeroMQ
is fast and simple (8M msgs/sec, 30 µs latency) [2].
Performance Evaluation
Measuring Speedup
We can find the performance gain (speedup) obtained by parallelizing
an application over N processors.
Speedup: the ratio of the time it takes to solve a problem on a
single processor, T(1), to the time it takes to solve the problem on
N processors, T(N):
\[ S(N) = \frac{T(1)}{T(N)} \]    Eq. [1]
Upon parallelization, the run time depends on
T_s - serial run time
T_p - parallel run time
For N processors, the speedup is
\[ S(N) = \frac{T_s + T_p}{T_s + T_p / N} \]    Eq. [2]
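As a worked example of the speedup on N processors, write T_s for the serial run time and T_p for the parallel run time (symbol names assumed here), and take the purely illustrative values T_s = 10 s, T_p = 90 s and N = 4. These numbers are not measurements from the project.

```latex
\begin{align*}
T(1) &= T_s + T_p = 10 + 90 = 100\ \text{s} \\
T(4) &= T_s + \frac{T_p}{4} = 10 + 22.5 = 32.5\ \text{s} \\
S(4) &= \frac{T(1)}{T(4)} = \frac{100}{32.5} \approx 3.08 \\
\lim_{N \to \infty} S(N) &= \frac{T_s + T_p}{T_s} = \frac{100}{10} = 10
\end{align*}
```

With these values, four nodes give a speedup of about 3 rather than 4, and no number of nodes can exceed a speedup of 10, because the serial part of the run time is not shortened by adding nodes.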
Network Performance (using NETPerf)
NETPerf is a benchmark for measuring various aspects of network
performance. For our work we restrict ourselves to monitoring the
network load on the cluster during communication and file transfers.
For communication, we measure the packet drop rate and the latency.
For file transfer, we try to maximize the bandwidth by finding the
appropriate packet size.
Future Work
We have so far achieved a secure connection between multiple Android
devices and used it to send messages and transfer files securely to
one or more devices. The connection among the devices is established
over Wi-Fi, and the devices communicate using the ZeroMQ (ØMQ)
library. We now aim to use this underlying connection to implement a
cluster of Android-based devices that runs a task requiring high
computational resources in parallel on all the connected nodes. We
need to identify each mobile device and provide a security feature,
such as registration or a password, to ensure that only authentic
users can access the application. Registration would also give the
devices an identity, so that users know which device is trying to
connect and share data. Since we are already able to connect devices
and communicate among them, our main focus will be to distribute the
heavy task among the devices without compromising on power and
connection cost. We will also evaluate the performance of the
Beowulf cluster and thereafter compare cost efficiency and
performance efficiency.
1. Multiple Android Nodes Available
2. Connect over Wi-Fi using ZeroMQ
3. All Nodes connected in Cluster
4. Task request sent to Master
5. High Computational Task distributed among all nodes
6. Result from each node
7. Results of Every Node sent to Master to get final Result
8. Final Result sent
9. Requesting Node received Final Result
Fig 9. Future Course of Work.
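The planned flow, in which the master distributes a task, every node returns a partial result, and the master combines them, can be sketched as below. Threads stand in for Android nodes and the ZeroMQ transport is left out entirely, so this only illustrates the control flow. The names distribute and run_on_cluster are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(task_items, node_count):
    """Deal items round-robin so every node gets a near-equal share."""
    return [task_items[i::node_count] for i in range(node_count)]


def run_on_cluster(task_items, node_count, work, combine):
    """Master side: split the task, collect one result per node, merge."""
    shares = distribute(task_items, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as nodes:
        partials = list(nodes.map(work, shares))   # one call per node
    return combine(partials)
```

For example, run_on_cluster(list(range(1, 101)), 4, sum, sum) lets four simulated nodes each add up their share before the master adds the four partial sums. In the real system, the work call would become a message to a slave and the partial result its reply.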
Conclusion
The first phase involved in-depth research into cluster computing
and its advantages and disadvantages. We had to find out how
clusters are used and what kinds of tasks they perform efficiently.
In the next phase we implemented a Beowulf cluster on our laptops
and saw how effectively it could use resources to complete a set of
tasks. We also applied our research in an Android application, which
uses the Wi-Fi network to communicate with the other nodes. In the
next phase we will improve it by adding automatic task assignment
and result compilation.
References
1. http://www.mpich.org/
2. http://www.slideshare.net/pieterh/overview-of-zeromq
3. Karl Johan Andersson, Daniel Aronsson and Patrick Karlsson, An
   evaluation of the system performance of a Beowulf cluster -
   http://www.nsc.liu.se/grendel/
4. Byobu - Building a simple Beowulf cluster with Ubuntu -
   http://byobu.info/article/Building_a_simple_Beowulf_cluster_with_Ubuntu/
5. ZeroMQ - http://zguide.zeromq.org/page:all/
6. Chord protocol -
   http://en.wikipedia.org/wiki/Chord_%28peer-to-peer%29/