Towards Real-Time, Many Task Applications on Large Distributed Systems - focusing on the implementation of RT-BOINC Sangho Yi ([email protected])
Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)

Sep 24, 2020

Towards Real-Time, Many Task Applications on Large Distributed Systems

- focusing on the implementation of RT-BOINC

Sangho Yi ([email protected])

Content
- Motivation and Background
- RT-BOINC in a nutshell
- Internal structures
- Design & implementation
- Conclusions and future work

Motivation
- Demand for computing large-scale real-time (RT) tasks has increased in distributed computing environments
  - Chess, Game of Go
  - Real-time forensic analysis
  - Ultra-HD-level real-time multimedia processing
  - …
- Lack of RT support in existing desktop grid and volunteer computing environments

About BOINC
- BOINC is tailored for maximizing task throughput, not for minimizing latency on the order of seconds.
  - XtremWeb and Condor have similar characteristics.
- A BOINC project has
  - A BOINC server (web, storage, database, ...)
  - Multiple BOINC clients
  - Network connections between the server and the clients

BOINC Projects
- Normally perform only a few transactions per second with host clients.
  - 1~50 transactions per second (ref. http://boincstats.com)
- Send large chunks of computation to the host clients.
  - A couple of hours, or even days, of computation
- Do not provide an RT guarantee
  - Because BOINC is tailored for maximizing the total amount of computation.

Significant Gaps here...
- "I need a 10-second car." - in the movie "Fast & Furious"

Vin Diesel, the main actor in the movie

Significant Gaps here...
- "We need a 10-second completion." - in a "Chess game"

RT-BOINC in a Nutshell
- RT-BOINC features
  - Low WCET (worst-case execution time) for all components
  - No database operations at run-time
  - O(1) interfaces for data structures
  - Reduced complexity for server daemons: almost O(1)

Original BOINC Internal

[Figure: internal structure of the BOINC server. Labels in the figure: BOINC Project (BOINC server) with work-generator, transitioner, feeder, scheduler, validator, assimilator, and file-deleter; workunits in DB; workunit-results in DB; workunit-result ready queue; requests for work distribution; results of work; BOINC hosts. Two kinds of arrows mark the flow of distributing work requests and the flow of reporting work results.]

RT-BOINC Internal

Data management
- MySQL database vs. in-memory data structures

[Figure with two panels]
(a) BOINC: the BOINC DB (workunits, results, hosts, users, apps, platforms, and …), based on MySQL. Complexity for lookup, insert, and remove: O(log N) ~ O(N^2).
(b) RT-BOINC: in-memory data structures with O(1) operations. Multi-level lookup tables and fixed-size lists (lookup pools) point into the main database, which holds in-memory data records with data format compaction (workunits, results, hosts, users, ...), based on shm-IPC.

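Example 1) select from where;
- Retrieving a RESULT from the O(1) data structure

Ex) select * from result where workunitid = '0x1234';

[Figure: the ID of the result (hex digits 1 2 3 4) is split into fields of 8 bits, 4 bits, and 4 bits. The fields index lookup tables with 2^8 = 256 entries and 2^4 = 16 entries, which lead to the result table in main memory.]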
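
The lookup in Example 1 can be sketched as follows. This is a minimal illustration, not RT-BOINC's actual code: it assumes a 16-bit ID split into 8-, 4-, and 4-bit fields (as in the figure) that index three levels of preallocated tables. The class and method names are hypothetical.

```python
class MultiLevelTable:
    """Three-level lookup table indexed by the 8/4/4-bit fields of a 16-bit ID.

    Hypothetical sketch: the field widths follow the slide's example; the
    layout that RT-BOINC really uses may differ.
    """

    def __init__(self):
        # Preallocate every level so that a lookup never allocates memory:
        # 256 x 16 x 16 = 65,536 slots, all initially empty (None).
        self.levels = [[[None] * 16 for _ in range(16)] for _ in range(256)]

    @staticmethod
    def split(key):
        """Split a 16-bit key into its (8-bit, 4-bit, 4-bit) indices."""
        if not 0 <= key <= 0xFFFF:
            raise ValueError("key must fit in 16 bits")
        return (key >> 8) & 0xFF, (key >> 4) & 0xF, key & 0xF

    def insert(self, key, record):
        i, j, k = self.split(key)
        self.levels[i][j][k] = record

    def lookup(self, key):
        """Return the record for key, or None: always three array indexings."""
        i, j, k = self.split(key)
        return self.levels[i][j][k]


table = MultiLevelTable()
table.insert(0x1234, {"id": 0x1234, "name": "example result"})
print(MultiLevelTable.split(0x1234))  # (18, 3, 4), i.e. 0x12, 0x3, 0x4
print(table.lookup(0x1234))
```

The cost of a lookup does not depend on how many records are stored, which is the property the slide contrasts with the O(log N) ~ O(N^2) database operations; the price is the memory reserved for empty slots.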

Performance Evaluation
- 1) Micro- and macro-benchmarks
  - Based on dummy server load
- 2) Case studies
  - Game of Go AI (and Chess AI, soon)

Macro-benchmarks (high load)

Performance Evaluation - #2
- Case studies
  - Game of Go - 9x9 board (currently working)
    - Fuego - a Monte-Carlo-based AI
    - GTP (Go Text Protocol)
    - KGS Go Server - can play with AIs and humans
  - Chess (developing with Emmanuel Jeannot)
    - Distributed depth-first-search-based AI
    - UCI protocol (Universal Chess Interface)

Summary
- RT-BOINC provides...
  - Faster response time and better real-time performance than BOINC.
  - 300~1,000 times lower WCET (worst-case execution time) for each server-side operation.
  - A smaller difference between average and worst-case performance.
  - A smaller difference between low-load and high-load conditions.

Future work (the remaining part)

[Figure: RT-BOINC server workflow]
- The project manager requests work with T (deadline), Nc (# of computations), and Ps (probability of successful execution).
- The RT-BOINC server provides the worst-case number of transactions processed per second: Nt.
- The server distributes the work to a lot of volunteer hosts, which return the results.
- The deadline T has to cover:
  - Time for handling transactions in the server (Nc/Nt)
  - Time for computation in the volunteer hosts
  - Time for communication between the server and the hosts
- Checkpointing & replication are required in the presence of host failures.

Red: What we have done in the first paper
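
The timing budget in the figure can be made concrete with a small worked check. The sketch below assumes that the three time components simply add up and must fit within the deadline T, and that the server-side time is Nc / Nt. The function name and the sample numbers are illustrative only; they are not measurements from the talk.

```python
def meets_deadline(deadline_s, n_computations, n_transactions_per_s,
                   compute_time_s, comm_time_s):
    """Return (feasible, total_s) for an additive worst-case time budget.

    Assumptions (not from the slides): server time = Nc / Nt, and the
    server, computation, and communication times are summed.
    """
    if n_transactions_per_s <= 0:
        raise ValueError("Nt must be positive")
    server_time_s = n_computations / n_transactions_per_s
    total_s = server_time_s + compute_time_s + comm_time_s
    return total_s <= deadline_s, total_s


# Illustrative numbers: T = 30 s, Nc = 480, Nt = 100 transactions/s,
# 10 s of computation on the hosts, 10 s of communication delay.
feasible, total = meets_deadline(30.0, 480, 100.0, 10.0, 10.0)
print(feasible, round(total, 1))
```

With these made-up inputs the server needs 4.8 s to handle the transactions, so the request fits a 30-second deadline but would miss a 20-second one.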

Future work (the remaining part)


Blue: What we will show in the next paper

Go AI on RT-BOINC

[Sequence diagram. Participants: KGS Go server; GTP client; Go AI master; RT-BOINC server (work generator, transitioner, feeder, scheduler, validator, assimilator (aggregator), file deleter); RT-BOINC clients (workers).]

Steps shown in the diagram:
1. Ask to move.
2. Send "genmove" command.
3. Send input file.
4. Generate a workunit (initiate deadline timer).
5. Generate workunit-result pairs.
6. Insert pairs into the scheduler pool.
7. Send work to clients.
8. Compute work (5~10 secs).
9. Return results to the scheduler.
10. Store results, set need_validate = TRUE, and activate the transitioner.
11. Validate results, and set ASSIMILATE_READY.
12. Assimilate results into one file and return it to the master.
13. Select the best move (0~1 secs) and return it.
14. Set ASSIMILATE_DONE and activate the transitioner; set FILE_DELETE_READY and activate the file deleter.
15. Delete the result files.
16. Set FILE_DELETE_DONE, and activate the feeder to clean the in-memory data structures.
17. Delete the in-memory data.

Notes:
- Response time = 15~25 secs
- Network communication delay: 5~10 secs
- The deadline timer can activate the transitioner.
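
The flags named in the diagram can be summarized as a small state machine. This sketch only encodes one plausible linear order of the flags (ASSIMILATE_DONE before FILE_DELETE_READY is an assumption). The enum, the transition table, and the function name are hypothetical and do not mirror BOINC's real database fields.

```python
from enum import Enum, auto


class Stage(Enum):
    """Workunit stages, named after the flags on the slide."""
    SENT = auto()               # work sent to the clients
    NEED_VALIDATE = auto()      # results stored, need_validate = TRUE
    ASSIMILATE_READY = auto()   # results validated
    ASSIMILATE_DONE = auto()    # results merged and returned to the master
    FILE_DELETE_READY = auto()  # result files may be deleted
    FILE_DELETE_DONE = auto()   # files deleted; in-memory data can be cleaned


# Assumed linear order of the stages; the real daemons may interleave them.
NEXT_STAGE = {
    Stage.SENT: Stage.NEED_VALIDATE,
    Stage.NEED_VALIDATE: Stage.ASSIMILATE_READY,
    Stage.ASSIMILATE_READY: Stage.ASSIMILATE_DONE,
    Stage.ASSIMILATE_DONE: Stage.FILE_DELETE_READY,
    Stage.FILE_DELETE_READY: Stage.FILE_DELETE_DONE,
}


def advance(stage):
    """Return the stage that follows `stage`; the final stage has no successor."""
    if stage not in NEXT_STAGE:
        raise ValueError(f"{stage.name} is the final stage")
    return NEXT_STAGE[stage]


stage = Stage.SENT
history = [stage]
while stage in NEXT_STAGE:
    stage = advance(stage)
    history.append(stage)
print(" -> ".join(s.name for s in history))
```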

Experimental Setup (1)
- We used a fairly fast machine, but only 2 cores for these experiments.
- We'll extend the scale of the experiments when we have a greater # of volunteers.

Component | Description | Notes
Processor | 2.00 GHz (dual quad-core) | Intel Xeon E5504
Main Memory | 32 GB (1,000 MHz) |
Secondary Storage | HDD | sorry for the lack of info :')
Operating System | Ubuntu 9.10 (karmic) | Linux Kernel 2.6.31-19

Experimental Setup (2)
- RT-BOINC
  - Up to 50k active workunits, results, hosts, and users
  - 3.9 GB of memory usage on a 64-bit machine
    - 1.9 GB of memory usage for O(1) data structures (49.5% of total)
- BOINC
  - Recent server-stable version (Jun. 2010)

Minor Things for Experiments
- Apache & MySQL
  - Max # of connections (default is 100~256)
- Need 2 identical (physical) servers
  - For BOINC vs. RT-BOINC testing

Preliminary Results (Go AI)
- Only preliminary results are available now.
  - Two cases: 160 and 480 cores (of volunteers)
  - Deadline = 30 secs / move

Screen Shot on KGS

Macro-benchmarks
- Difference in worst-case performance between low-load and high-load conditions

Performance Evaluation - #1
- Purpose: to measure the real-time performance of BOINC and RT-BOINC
  - Criteria: the worst-case and the average execution time
  - Method: micro- and macro-benchmarks
    - Micro-benchmark: for each primary operation of the server processes
    - Macro-benchmark: for each server process (including feeder, scheduler, transitioner, work-generator, assimilator, validator, and file-deleter)
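
A micro-benchmark of this kind can be sketched as below: time one operation many times and report both the average and the worst observed execution time. The harness and the dictionary lookup used as the measured operation are stand-ins chosen for illustration, not the benchmarks used in the talk. An observed maximum is also only a lower bound on the true WCET.

```python
import time


def micro_benchmark(operation, repetitions=10_000):
    """Run `operation` repeatedly; return (average_s, worst_s) wall-clock times."""
    if repetitions <= 0:
        raise ValueError("repetitions must be positive")
    total = 0.0
    worst = 0.0
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        elapsed = time.perf_counter() - start
        total += elapsed
        worst = max(worst, elapsed)
    return total / repetitions, worst


# Stand-in workload: one lookup in a table of 50,000 in-memory records.
records = {i: ("result", i) for i in range(50_000)}
average_s, worst_s = micro_benchmark(lambda: records[0x1234])
print(f"average: {average_s:.2e} s, worst: {worst_s:.2e} s")
```

Reporting both numbers matters because the talk's criteria include the gap between average and worst-case behavior, not only the mean.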

Experimental Environment
- We used a fairly slow, commercial off-the-shelf system. ;-)
  - For ease of reproducing the results

Component | Description | Notes
Processor | 1.60 GHz, 3 MB L2 cache | Intel Core 2 Duo
Main Memory | 3 GB (800 MHz) | Dual-channel DDR3
Secondary Storage | Solid State Drive | SLC type
Operating System | Ubuntu 9.10 (karmic) | Linux Kernel 2.6.31-19
BOINC version | Server stable version | Nov. 11, 2009 (from SVN)

Micro-benchmarks
- Average execution time (in seconds)

Micro-benchmarks
- Worst-case execution time (in seconds)

Micro-benchmarks
- Performance improvement ratio (RT-BOINC / BOINC)

Micro-benchmarks
- Performance gap between worst-case and average

Macro-benchmarks (low load)

Source code on the Web
- http://sourceforge.net/projects/rt-boinc

Size of Data Structures
- RT-BOINC uses 'shared memory segment' IPC between server daemon processes to share the data structures.
  - For 10,000 entries of hosts, results, and workunits, it consumes 1.09 GB of main memory in total.
  - The memory overhead of the O(1) data structures is 38.6% of the total usage.
  - Using 1 GB of memory is reasonable on commercial off-the-shelf 64-bit hardware platforms.

Detailed information on the Web
- http://rt-boinc.sourceforge.net

Future work (Remaining issues)
- Providing 'dynamic shared-memory management' to reduce memory usage
  - Studying trade-offs between execution time and memory usage
  - Studying better data structure management for O(1) response
- Finding a better task deployment policy to
  - Reduce server-side load and latency
  - Improve real-time performance

Thanks! / Questions?

Example 2) insert into values(...);
- Inserting a RESULT into the O(1) data structure

Ex) insert into result ... values (...);

[Figure (a) Insertion]
- Lookup pool for available results: get an available result field's id from the end of the list, then remove the 'id' from the end of the list.
- Result table in main memory: insert the result at this place.

Example 3) delete from where;
- Deleting a RESULT from the O(1) data structure

Ex) delete from result where id='1234';

[Figure (b) Deletion]
- Result table in main memory: invalidate the 1234th result.
- Lookup pool for available results: insert '1234' at the end of the result lookup list.
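
Examples 2 and 3 describe a free-list scheme: insertion takes an available slot id from the end of the lookup pool, and deletion invalidates the slot and puts its id back at the end of the pool. The following is a minimal sketch, assuming a fixed-capacity table and a Python list as the pool; the class and method names are hypothetical.

```python
class ResultTable:
    """Fixed-capacity record table with a pool of available slot ids."""

    def __init__(self, capacity):
        self.slots = [None] * capacity  # None marks an invalid (free) slot
        # Pool of available ids; ids are taken from the end of the list.
        self.free_ids = list(range(capacity - 1, -1, -1))

    def insert(self, record):
        """Store record in a free slot and return the slot id."""
        if not self.free_ids:
            raise MemoryError("result table is full")
        slot_id = self.free_ids.pop()   # take an id from the end of the pool
        self.slots[slot_id] = record
        return slot_id

    def delete(self, slot_id):
        """Invalidate the slot and return its id to the pool."""
        if self.slots[slot_id] is None:
            raise KeyError(slot_id)
        self.slots[slot_id] = None
        self.free_ids.append(slot_id)   # put the id back at the end

    def get(self, slot_id):
        return self.slots[slot_id]


table = ResultTable(capacity=4)
a = table.insert("result A")
b = table.insert("result B")
table.delete(a)
c = table.insert("result C")  # reuses the slot freed by the delete
print(a, b, c)  # 0 1 0
```

Neither operation scans the table, so the cost is independent of the number of stored results. In Python, `pop` and `append` are amortized O(1); a fixed-size array with an index, as one might use in shared memory, would make the bound strict.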

Prototype Implementation
- Additional information
  - Compaction of BOINC's data format
  - Modification of PHP code
  - Trade-offs between memory usage and WCET: statically adjustable with parameters
  - Compatibility with BOINC: the remaining parts are still compatible with BOINC.