Optimizing Data Partitioning at Broadcasting the Data on Balanced N-ary Tree Formed P2P Systems. Takashi Yamanoue @ Fukuyama University. ESKM2015, 4th IIAI AAI @ Okayama, Japan, 15 Jul, 2015.

Optimizing Data Partitioning at Broadcasting the Data

Aug 21, 2015

Transcript
Page 1: Optimizing Data Partitioning at Broadcasting the Data

Optimizing Data Partitioning at Broadcasting the Data on Balanced N-ary Tree Formed P2P Systems

Takashi Yamanoue @ Fukuyama University    ESKM2015, 4th IIAI AAI @ Okayama, Japan, 15Jul, 2015.

Page 2: Optimizing Data Partitioning at Broadcasting the Data

Contents
• I. INTRODUCTION
• II. THEORETICAL EQUATION FOR OPTIMIZING THE NUMBER OF DATA PARTITIONING AT BROADCASTING ON A BALANCED N-ARY TREE
• III. SOLAR-CATS
• IV. PERFORMANCE IMPROVEMENT OF SOLAR-CATS
• V. RELATED WORKS
• VI. CONCLUDING REMARKS

Page 3: Optimizing Data Partitioning at Broadcasting the Data

I. Introduction

• SOLAR-CATS
– A teaching tool for large computer laboratories and small seminar classes
– Does not need a server because it uses peer-to-peer (P2P) technology.

Page 4: Optimizing Data Partitioning at Broadcasting the Data

• Remote operation
– of an application equipped with SOLAR-CATS,
– on every PC in the class, from one PC in the class.
• Interactive operation
– of an application by all class members.
– Has a mutual exclusion function.

Page 5: Optimizing Data Partitioning at Broadcasting the Data

• Sending of images
– from one display in the class to all other displays, quickly.
• Annotation
– is also possible, and live changes on one display can be continuously distributed to all other displays.
• Recording and replaying of operations
– of applications on SOLAR-CATS.

Page 6: Optimizing Data Partitioning at Broadcasting the Data

• The P2P technology of SOLAR-CATS
– is a kind of structured P2P system.
– adopts the balanced N-ary tree as its structure.
– uses TCP connections between nodes.
– When a node receives a broadcast message on one connection, it sends the message to all of its other connections, if it has any.
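The relay rule above can be sketched without any networking, as a minimal simulation. The function names, the numbering of nodes (root is node 0) and the hop counting are assumptions made for illustration; they are not taken from SOLAR-CATS itself.

```python
from collections import deque

def build_balanced_tree(b, height):
    """Adjacency lists of a balanced b-ary tree with `height` levels.

    Node 0 is the root; each entry lists the peers a node is connected to
    (its parent and its children), standing in for TCP connections.
    """
    n = sum(b ** level for level in range(height))
    links = {node: [] for node in range(n)}
    for child in range(1, n):
        parent = (child - 1) // b
        links[parent].append(child)
        links[child].append(parent)
    return links

def broadcast(links, origin):
    """Apply the relay rule: forward to every connection except the incoming one.

    Returns a dict mapping each node to the hop count at which it
    received the message.
    """
    hops = {origin: 0}
    queue = deque([(origin, None)])
    while queue:
        node, came_from = queue.popleft()
        for peer in links[node]:
            if peer != came_from:
                hops[peer] = hops[node] + 1
                queue.append((peer, node))
    return hops
```

Because the connections form a tree, every node receives the message exactly once, whichever node originates it. With `b=2` and four levels, a broadcast from the root reaches the leaves in 3 hops, and a broadcast from a leaf reaches the farthest leaf in 6 hops.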

Page 7: Optimizing Data Partitioning at Broadcasting the Data

[Figure: balanced binary tree with numbered transfers; b=2, i=4, t=3]

Page 8: Optimizing Data Partitioning at Broadcasting the Data

• If the traffic on any connection does not affect the traffic on any other connection in the group,
• and a broadcast message is sent from one node to all of its connections,
• then all nodes in the group receive the message in O(log n) time, where n is the number of nodes in the group.

Page 9: Optimizing Data Partitioning at Broadcasting the Data

• A theoretical equation for optimizing the partitioning number of the data when broadcasting on a balanced N-ary tree shaped P2P system.
• The equation was applied to improve the performance of the image broadcasting function of SOLAR-CATS.
• The performance has been improved.
• SOLAR-CATS has been used in real classes.

Page 10: Optimizing Data Partitioning at Broadcasting the Data

II. THEORETICAL EQUATION FOR OPTIMIZING THE NUMBER OF DATA PARTITIONING AT BROADCASTING ON A BALANCED N-ARY TREE

• We would like to make it fast
– Broadcasting on a balanced N-ary tree, using TCP.
• We know
– Usually, sending
• a small number of large messages is faster than sending a large number of small messages
– when the data is partitioned into message pieces and they are sent from one node to another node.
– For example, Jumbo Frame.

Page 11: Optimizing Data Partitioning at Broadcasting the Data

• However,
– when a message is sent from one node to multiple nodes,
– without using IP multicast or frame broadcast, in order to keep the message passing reliable,
– there are periods during which one or more of the receiving nodes are not working.
– As the size of the message becomes larger, these periods also become longer. So there should be an optimum partitioning number.

Page 12: Optimizing Data Partitioning at Broadcasting the Data

• We assume
– Bandwidths and latencies of all connections are the same
– Performances of all nodes are the same
– Traffic of any connection does not affect the traffic of other connections.

Page 13: Optimizing Data Partitioning at Broadcasting the Data

• b: the number of branches of the balanced N-ary tree
• N: the number of nodes.
• i: the height of the tree.
– if b=2, N is given by the equation $N = b^i - 1$.
• t: the partitioning number.
• The messages are sent in turn from the root node.
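As a small sketch of how N and i relate, the helpers below count the nodes of a balanced b-ary tree by summing the levels and find the smallest height that can hold a given number of nodes. The function names are hypothetical, and the code assumes that the root counts as level 1, which agrees with $N = b^i - 1$ for b=2.

```python
def node_count(b: int, height: int) -> int:
    """Nodes in a balanced b-ary tree with `height` levels (root is level 1)."""
    return sum(b ** level for level in range(height))

def min_height(b: int, nodes: int) -> int:
    """Smallest height whose balanced b-ary tree can hold `nodes` nodes."""
    height = 0
    while node_count(b, height) < nodes:
        height += 1
    return height
```

For b=2 this gives 3 nodes at i=2, 7 nodes at i=3 and 15 nodes at i=4. A group of 6 nodes also needs i=3, because a tree of height 2 holds only 3 nodes.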

Page 14: Optimizing Data Partitioning at Broadcasting the Data

[Figure, pages 14 to 20: animation of a broadcast on a balanced binary tree with no partitioning; b=2, i=4, t=1]

There are periods during which many nodes are not working.

Page 21: Optimizing Data Partitioning at Broadcasting the Data

[Figure, pages 21 to 31: animation of the same broadcast with the data partitioned into three pieces; b=2, i=4, t=3]

The periods during which nodes are not working are reduced.

Page 32: Optimizing Data Partitioning at Broadcasting the Data

• $T = C_B\, b\,(t + i - 2) + d(t)$ (1)
– $C_B$: the time to transfer one piece of the data
– $b$: the number of branches of the b-ary tree; $t$: the partitioning number
– $d(t)$: the relay time
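The factor $b(t + i - 2)$ in equation (1) can be checked with a small step-counting simulation. This is a sketch under assumptions that are one reading of the slides, not a description of the SOLAR-CATS code: every node sends the pieces in order, one child at a time; each send takes one unit of $C_B$; a node may receive while it sends; and the relay time $d(t)$ is ignored. The function name is hypothetical.

```python
def last_arrival_step(b: int, i: int, t: int) -> int:
    """Step at which the last piece reaches the last leaf.

    b: branches per node, i: tree height (root is level 1, i >= 2),
    t: number of pieces. One step is the time to send one piece over
    one connection.
    """
    # Each entry holds the arrival steps of pieces 1..t at one node.
    nodes = [[0] * t]  # the root owns every piece at step 0
    for _ in range(i - 1):
        next_level = []
        for arrivals in nodes:
            children = [[] for _ in range(b)]
            clock = 0
            for arrived in arrivals:
                # A piece cannot be relayed before it has arrived.
                clock = max(clock, arrived)
                for child in children:
                    clock += 1  # one send per step, one child at a time
                    child.append(clock)
            next_level.extend(children)
        nodes = next_level
    return max(arrivals[-1] for arrivals in nodes)
```

For b=2 and i=4, no partitioning (t=1) needs 6 steps of a full-size message, while t=3 needs 10 steps of a message one third of the size, which is roughly 3.3 full-message times when latency is ignored.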

Page 33: Optimizing Data Partitioning at Broadcasting the Data

• $C_B = \frac{8s}{wt} + d_H$ (2)
– $s$: the whole size (bytes) of the data
– $w$: the bandwidth (bit/sec)
– $d_H$: the latency between connected nodes.

Page 34: Optimizing Data Partitioning at Broadcasting the Data

• $d(t) = a + ct$ (3)
– $a$, $c$: constants

Page 35: Optimizing Data Partitioning at Broadcasting the Data

• Substituting (2) and (3) into (1),
• $T = \frac{8sb}{w} + a + d_H b\,(i-2) + \frac{8sb(i-2)}{wt} + t\,(d_H b + c)$ (4)
$= A + \frac{B}{t} + Ct$ (5)
– There is a $t_0$ that minimizes $T$ when it is substituted for $t$ in (5).

Page 36: Optimizing Data Partitioning at Broadcasting the Data

• $A = \frac{8sb}{w} + a + d_H b\,(i-2)$
• $B = \frac{8sb(i-2)}{w}$
• $C = d_H b + c$
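For reference, the intermediate step between (1) and (4) can be written out as follows. This is only the algebra of the substitution; it adds no assumptions beyond equations (1), (2) and (3).

```latex
\begin{align*}
T &= \left(\frac{8s}{wt} + d_H\right) b\,(t + i - 2) + a + ct \\
  &= \frac{8sb}{w} + \frac{8sb(i-2)}{wt} + d_H b\,t + d_H b\,(i-2) + a + ct \\
  &= \underbrace{\frac{8sb}{w} + a + d_H b\,(i-2)}_{A}
     + \underbrace{\frac{8sb(i-2)}{w}}_{B}\,\frac{1}{t}
     + \underbrace{\left(d_H b + c\right)}_{C}\,t
\end{align*}
```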

Page 37: Optimizing Data Partitioning at Broadcasting the Data

• $t_0$ is the positive solution of the following
– $T' = -\frac{B}{t^2} + C = 0$
• So
– $t_0 = \sqrt{B/C} = \sqrt{\frac{8s(i-2)}{w\,(d_H + c/b)}}$ (7)
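Equations (1) to (3) and (7) translate directly into a few lines of Python. The sketch below is illustrative: the function names are hypothetical, and rounding $t_0$ to a whole number of pieces is an added step that the slides do not describe.

```python
import math

def sending_time(t, s, w, b, i, d_h, a, c):
    """Total sending time T for partitioning number t, from (1), (2), (3)."""
    c_b = 8 * s / (w * t) + d_h        # equation (2): time per piece
    relay = a + c * t                  # equation (3): relay time d(t)
    return c_b * b * (t + i - 2) + relay  # equation (1)

def optimal_partitioning(s, w, b, i, d_h, c):
    """Real-valued t0 from equation (7)."""
    return math.sqrt(8 * s * (i - 2) / (w * (d_h + c / b)))

def best_integer_partitioning(s, w, b, i, d_h, a, c):
    """Whole number of pieces with the smallest T (assumed rounding step)."""
    t0 = optimal_partitioning(s, w, b, i, d_h, c)
    # T = A + B/t + Ct is convex for t > 0, so only the two integers
    # around t0 need to be compared; at least one piece is always sent.
    candidates = {max(1, math.floor(t0)), max(1, math.ceil(t0))}
    return min(candidates,
               key=lambda t: sending_time(t, s, w, b, i, d_h, a, c))
```

Since $T'' = 2B/t^3 \ge 0$ for $t > 0$, the stationary point is a minimum. When i=2, B is zero and $t_0$ is zero, so the helper falls back to a single piece.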

Page 38: Optimizing Data Partitioning at Broadcasting the Data

III. SOLAR-CATS

A Computer Assisted Teaching System for large computer laboratories and small seminar classes

[Figure: a Group Manager, the teacher's node system, and six students' node systems connected in a tree by TCP connections]

Page 39: Optimizing Data Partitioning at Broadcasting the Data

[Figure: structure of a node system. Applications (Writer's Assistant, Web Browser, Programming Environment, Text Editor, Draw), Main Controller, Command Transceiver, Event Recorder/Player, Network]

Page 40: Optimizing Data Partitioning at Broadcasting the Data
Page 41: Optimizing Data Partitioning at Broadcasting the Data

• SOLAR-CATS includes
– a PC screen image sharing function.
= Sends the screen image from one PC to other PCs

Page 42: Optimizing Data Partitioning at Broadcasting the Data

• Remote operation
• Interactive operation
• Sending of images
• Annotation
• The recording and replaying operations

Page 43: Optimizing Data Partitioning at Broadcasting the Data

IV. PERFORMANCE IMPROVEMENT OF SOLAR-CATS

• Environment
– An image of 1.3 MB, not compressed, is sent from one node.
– 100 Mbps switch. The latency of the switch is 2.3 μsec.
– CPUs: Intel Pentium 1.3 GHz or better.
– The memory sizes were more than 500 MB.
• Before
– The image was partitioned into 572 pieces.

Page 44: Optimizing Data Partitioning at Broadcasting the Data

TABLE I. SENDING TIME OF OLD FUNCTION (t=572)

Number of nodes    Sending time (sec., measured 10 times)
(with root)        Min.     Max.     Ave.
3                  8.84     9.51     9.01
6 (i=3)            8.68     10.05    9.33

Page 45: Optimizing Data Partitioning at Broadcasting the Data

TABLE II. PARTITIONING NUMBERS AND SENDING TIMES

Number of nodes    Partitioning number    Sending time (sec., measured 10 times)
(with root)        (width)                Min.     Max.     Avg.
3                  572 (24)               4.47     8.58     6.13
3                  143 (48)               1.72     2.54     2.09
3                  36 (96)                2.10     3.35     2.56
3                  21 (128)               1.58     2.05     1.82
3                  9 (196)                1.65     2.35     1.99
3                  5 (256)                1.48     2.14     1.92
7                  572 (24)               3.21     9.75     4.67
7                  143 (48)               2.03     4.95     3.11
7                  36 (96)                2.21     3.09     2.64
7                  21 (128)               1.97     2.45     2.25
7                  9 (196)                1.90     2.48     2.18
7                  5 (256)                1.99     2.08     2.01

Note: Frozen (4 times)

Page 46: Optimizing Data Partitioning at Broadcasting the Data

[Figure 4 (a): sending time versus partitioning number, 3 nodes (i=2)]

Page 47: Optimizing Data Partitioning at Broadcasting the Data

• When the number of nodes is three and b is 2, i.e. i=2, B in equation (5) is 0. So there should be a linear relationship between the partitioning number and the sending time.
• $T = A + B/t + Ct$ (5)
– $B = 8sb(i-2)/w$

Page 48: Optimizing Data Partitioning at Broadcasting the Data

• The fitting line in (a) of Figure 4 shows that
– the relationship between the partitioning number and the sending time is almost linear,
– because R^2 is 0.937. So our assumption (3) is realistic.
• A of (5) is approximately 1.8, and C of (5) is approximately 0.007.
• So a of (3) should be about 1.6 (sec.). d(t) = a + ct (3)
– a includes the time for screen capturing and screen rendering. This value seems realistic.
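The values quoted above can be reproduced, as a rough check, by an ordinary least-squares line through the 3-node average sending times from the partitioning table. The helper name is hypothetical, and taking 1.3 MB as 1.3e6 bytes is an assumption. Since i=2, $A = 8sb/w + a$, so $a$ follows from the intercept.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, sxy ** 2 / (sxx * syy)

# 3-node rows of the partitioning table: partitioning number, average time (sec).
partitions = [572, 143, 36, 21, 9, 5]
avg_3_nodes = [6.13, 2.09, 2.56, 1.82, 1.99, 1.92]

A, C, r_squared = fit_line(partitions, avg_3_nodes)

# With i = 2 the term d_H * b * (i - 2) vanishes, so a = A - 8sb/w.
s, w, b = 1.3e6, 100e6, 2   # assumed: 1.3 MB = 1.3e6 bytes, 100 Mbps, binary tree
a = A - 8 * s * b / w
```

The fit gives an intercept near 1.8, a slope near 0.007 and an R^2 near 0.937, and subtracting $8sb/w \approx 0.2$ leaves $a$ at about 1.6 seconds.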

Page 49: Optimizing Data Partitioning at Broadcasting the Data

[Figure 4 (b): sending time versus partitioning number, 7 nodes (i=3)]

Page 50: Optimizing Data Partitioning at Broadcasting the Data

• When the number of nodes is seven, i.e. i=3, B in equation (5) should be about 0.2, because we know s=1.3MB, b=2, i=3, w=100Mbps. We assume that B can be neglected, because the partitioning number is large enough for almost every entry in TABLE II.

$T = A + B/t + Ct$ (5), $B = 8sb(i-2)/w$
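The value of B can be worked out from the stated parameters, taking 1.3 MB as $1.3 \times 10^6$ bytes (an assumption; the slides do not say which megabyte is meant):

```latex
\begin{align*}
B &= \frac{8sb(i-2)}{w}
   = \frac{8 \times 1.3\times10^{6} \times 2 \times (3-2)}{100\times10^{6}}
   \approx 0.208 \\
\frac{B}{t} &\approx 0.042 \quad (t = 5), \qquad
\frac{B}{t} \approx 0.0004 \quad (t = 572)
\end{align*}
```

Even at the smallest measured partitioning number, $B/t$ is only a few hundredths of a second against sending times of about two seconds.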

Page 51: Optimizing Data Partitioning at Broadcasting the Data

$t_0 = \sqrt{B/C} = \sqrt{\frac{8s(i-2)}{w\,(d_H + c/b)}}$ (7)

• C in equation (5) is about 0.004 from the best-fit line in (b) of Figure 4. The optimal partitioning number t0 from equation (7) should be about seven. Unfortunately, we could not measure the sending time at that partitioning number. However,

• this number is also realistic, because when the partitioning number was five or nine, the average sending times in TABLE II were about two seconds, and they are the smallest values in the table.
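As a rough check of this estimate, the sketch below fits a least-squares line to the 7-node average sending times from the partitioning table and puts the slope into $t_0 = \sqrt{B/C}$. The helper name is hypothetical, 1.3 MB is taken as 1.3e6 bytes, and B is neglected in the fit, following the assumption stated on the previous slide.

```python
import math

def fit_slope(xs, ys):
    """Least-squares slope of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# 7-node rows of the partitioning table: partitioning number, average time (sec).
partitions = [572, 143, 36, 21, 9, 5]
avg_7_nodes = [4.67, 3.11, 2.64, 2.25, 2.18, 2.01]

C = fit_slope(partitions, avg_7_nodes)   # slope of T = A + Ct

s, w, b, i = 1.3e6, 100e6, 2, 3          # assumed: 1.3 MB = 1.3e6 bytes
B = 8 * s * b * (i - 2) / w
t0 = math.sqrt(B / C)
```

The fitted slope comes out at roughly 0.004 seconds per piece, which together with $B \approx 0.2$ places $t_0$ close to seven.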

Page 52: Optimizing Data Partitioning at Broadcasting the Data

• Video of an improved SOLAR-CATS. (from 1min.30sec.)

Page 53: Optimizing Data Partitioning at Broadcasting the Data

V. RELATED WORKS

• The electronic chalk board by Hirahara and others [3][8] enables sharing the same image with a large number of users using P2P. It transmits an image of the teacher's screen to the students' screens uni-directionally. In contrast, SOLAR-CATS enables bi-directional exchange of information between the teacher and the students through real-time sharing of operations.

Page 54: Optimizing Data Partitioning at Broadcasting the Data

• QuickBoard [4] is a web-based WYSIWIS system, and it can be used for a large class which has up to two hundred terminals. However, it uses a high-performance server, and it is also uni-directional.

• Wb [2] is a popular tool for real-time communication among remote users using a draw program on a multicast network. However, wb is not equipped with a mutual exclusion function.

Page 55: Optimizing Data Partitioning at Broadcasting the Data

• ESM [1], RelayCast [5] and Emma [6] are ALM (Application Level Multicast) systems; our system is also a kind of ALM. They are used for exchanging streaming data while SOLAR-CATS is used for sharing the same operation.

Page 56: Optimizing Data Partitioning at Broadcasting the Data

VI. CONCLUDING REMARKS

• A theoretical equation for optimizing the partitioning number at broadcasting data.

• Improved a computer assisted teaching system

• The equation and the results of our experiment did not conflict much with the network programming guide in Tanenbaum's textbook [7].

• We are improving the performance of SOLAR-CATS further.

Page 57: Optimizing Data Partitioning at Broadcasting the Data

ACKNOWLEDGMENT

• We thank our students who helped us to develop and test SOLAR-CATS.

• A part of this work was supported by Grant-in-Aid for Scientific Research of Japan Society for the Promotion of Science, Fundamental Research(C), 17500041.
