Makefile::Parallel - Dependency specification language

Post on 30-Jun-2015

Makefile::Parallel

Dependency Specification Language

Rúben Fonseca

root@cpan.org

"Well done is quickly done" (Caesar Augustus)

SeARCH Cluster

~ 100 64-bit cores
~ 100 GB RAM
~ 4 TB storage
Myrinet 10Gb
Linux (CentOS)

How to run those processes?

3 options!

Solution 1

while (! end) {
    run process $p
    while (! finished $p) {
        sleep n
    }
    mark $p as done
}

Solution 2

#!/bin/foo

run p1
run p2 result1
run p3 result2
run p4 result1 result2

Solution 3

Makefile::Parallel

:-)

M::P - Design Goals

Formal specification to describe process dependencies
Reuse a well-known language syntax (don't reinvent the wheel)
Embed other languages to reuse their expressive power
Make it easy to run and maintain
Generate profiling data
Handle and recover from errors
Run on different parallel / distributed computing platforms
Have fun and profit :-)

M::P - The language

Started as a Makefile subset
Evolved with our own sugar syntax
Support for parametric rules

Perl
Parse::Yapp

M::P - Simple example

prepare: (5:00)
	mkdir OutputData
	p <- sub{ print "$_\n" for (3..10) }

run$p: prepare (20:00:00) [2]
	runMyProgram -p $p InputData > OutputData/run.$p

cleanup: run$p (5:00)
	for a in @p; do rm -f OutputData/run.${a}.tmp; done
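One way to read the parametric rule: once `p` has produced its values (3 through 10 here), `run$p` stands for one job per value, and `cleanup` waits for all of them. A minimal Python sketch of that expansion follows. The function `expand_rules`, the dict layout, and the choice to pair a parametric rule with the same-valued instance of a parametric dependency are assumptions made for illustration, not Makefile::Parallel's internals.

```python
def expand_rules(rules, var, values):
    """Expand `$var` in rule names and dependencies into concrete jobs.

    `rules` maps a rule name to its list of dependency names
    (a hypothetical layout chosen for this sketch).
    """
    token = "$" + var
    values = list(values)
    expanded = {}
    for name, deps in rules.items():
        if token in name:
            # Parametric rule: one concrete job per value of the variable.
            for v in values:
                expanded[name.replace(token, str(v))] = [
                    d.replace(token, str(v)) for d in deps
                ]
        else:
            # Plain rule: a parametric dependency means "wait for every instance".
            concrete = []
            for d in deps:
                if token in d:
                    concrete.extend(d.replace(token, str(v)) for v in values)
                else:
                    concrete.append(d)
            expanded[name] = concrete
    return expanded


rules = {
    "prepare": [],
    "run$p": ["prepare"],
    "cleanup": ["run$p"],
}
jobs = expand_rules(rules, "p", range(3, 11))
print(sorted(jobs))
```

With the values 3 to 10, the three rules become ten concrete jobs: `prepare`, `run3` through `run10`, and a `cleanup` that depends on all eight runs.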

M::P - Scheduler

Try to run the processes in parallel
Must be extensible
Should generate logs and visual profiling data

PBS - Portable Batch System

Job Scheduling for Clusters

Torque
OpenPBS

Common API

qsub
qstat
qdel
tracejob
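As a rough illustration of how a scheduler might drive these commands, the sketch below only builds the command lines and does not run them. Mapping a rule's time annotation such as `(20:00:00)` to `-l walltime=` is an assumption, and the helper names (`qsub_command`, `qstat_command`, `qdel_command`) and the example job id are hypothetical; the slides do not show how Makefile::Parallel actually calls PBS.

```python
def qsub_command(script, name, walltime):
    """Build a qsub command line that submits `script` as job `name`."""
    # -N sets the job name, -l walltime=HH:MM:SS requests a run-time limit.
    return ["qsub", "-N", name, "-l", "walltime=" + walltime, script]


def qstat_command(job_id):
    """Build a qstat command line that queries one job's status."""
    return ["qstat", job_id]


def qdel_command(job_id):
    """Build a qdel command line that removes one job from the queue."""
    return ["qdel", job_id]


print(" ".join(qsub_command("run3.sh", "run3", "20:00:00")))
# prints: qsub -N run3 -l walltime=20:00:00 run3.sh
```

Returning argument lists rather than strings keeps the commands ready for `subprocess.run` without shell quoting issues.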

M::P Scheduler - Main Loop

do {
    launch processes with fulfilled deps
    collect ended processes
    for each proc in ended_processes
        if proc has variables to expand
            calculate variables
            manipulate the dependency graph
    save journal
    sleep 10 seconds
} while (!all processes completed)
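The loop above can be sketched in Python for the local case. This is a simplified illustration under stated assumptions, not the module's scheduler: the function `run_all`, the `jobs` layout, and the `max_parallel` and `poll_seconds` parameters are hypothetical, and variable expansion and the journal are left out.

```python
import subprocess
import time


def run_all(jobs, max_parallel=2, poll_seconds=0.1):
    """Run shell commands in dependency order, in parallel where possible.

    `jobs` maps a job name to (command, [dependency names]).
    Returns the order in which the jobs finished.
    """
    done, running, finished_order = set(), {}, []
    while len(done) < len(jobs):
        # Launch every job whose dependencies are all fulfilled.
        for name, (command, deps) in jobs.items():
            ready = (name not in done and name not in running
                     and all(d in done for d in deps))
            if ready and len(running) < max_parallel:
                running[name] = subprocess.Popen(command, shell=True)
        if not running:
            # Nothing is running and nothing could start: missing or cyclic deps.
            raise RuntimeError("unsatisfiable dependencies")
        # Collect the processes that ended since the last pass.
        for name, proc in list(running.items()):
            if proc.poll() is not None:
                if proc.returncode != 0:
                    # Sketch only: other running jobs are not cleaned up here.
                    raise RuntimeError(
                        f"job {name!r} failed with status {proc.returncode}")
                del running[name]
                done.add(name)
                finished_order.append(name)
        if running:
            time.sleep(poll_seconds)
    return finished_order


order = run_all({
    "prepare": ("exit 0", []),
    "run3": ("exit 0", ["prepare"]),
    "run4": ("exit 0", ["prepare"]),
    "cleanup": ("exit 0", ["run3", "run4"]),
})
print(order)
```

Polling with a short sleep mirrors the "sleep 10 seconds" step; `prepare` always finishes first and `cleanup` last, while `run3` and `run4` may finish in either order.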

M::P GetOpts

-local=(n)   Schedule on the local machine
-pbs         Schedule on a PBS capable cluster
-continue    Recover from the last error
-dump        Dumps the AST of the parsed specification
-clean       Remove temporary files
-debug       (no need to debug)
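The `-continue` option, together with the "save journal" step of the main loop, suggests that finished jobs are recorded on disk so a later run can skip them. A minimal sketch of that idea follows, assuming a JSON file that lists finished job names; the real journal format is not shown in the slides, and `save_journal` and `load_journal` are hypothetical names.

```python
import json
import os
import tempfile


def save_journal(path, done):
    """Atomically record which jobs have finished."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(sorted(done), handle)
    # Rename over the old file so a crash never leaves a half-written journal.
    os.replace(tmp, path)


def load_journal(path):
    """Return the set of finished jobs, or an empty set on a fresh start."""
    try:
        with open(path) as handle:
            return set(json.load(handle))
    except FileNotFoundError:
        return set()


with tempfile.TemporaryDirectory() as workdir:
    journal = os.path.join(workdir, "journal.json")
    fresh = load_journal(journal)             # no journal yet
    save_journal(journal, {"prepare", "run3"})
    recovered = load_journal(journal)         # what a resumed run would skip
    print(sorted(recovered))
```

A resumed scheduler would seed its set of completed jobs from `load_journal` and launch only the remaining ones.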

M::P Profiling

Runtime logging:

[...]
2006/12/12 10:49:22 The job "ipfp005" is ready to run. Launching
2006/12/12 10:49:22 Launched "ipfp005" (23996)
2006/12/12 10:49:52 Process 23996 (ipfp005) has terminated [30s]
2006/12/12 10:49:52 The job "postipfp005" is ready to run. Launching
2006/12/12 10:49:52 Launched "postipfp005" (23997)
2006/12/12 10:50:02 Process 23997 (postipfp005) has terminated [10s]
[...]
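Log lines in this shape are easy to post-process. The sketch below pulls the per-job duration out of the "has terminated" lines; it assumes the bracketed duration is always whole seconds, as in the excerpt, and the regular expression and the `durations` helper are illustrative rather than part of the module.

```python
import re

# Matches lines such as:
# 2006/12/12 10:49:52 Process 23996 (ipfp005) has terminated [30s]
TERMINATED = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"Process (?P<pid>\d+) \((?P<job>[^)]+)\) has terminated \[(?P<secs>\d+)s\]$"
)


def durations(lines):
    """Map each job name to the duration, in seconds, reported at termination."""
    result = {}
    for line in lines:
        match = TERMINATED.match(line.strip())
        if match:
            result[match.group("job")] = int(match.group("secs"))
    return result


log = [
    '2006/12/12 10:49:22 The job "ipfp005" is ready to run. Launching',
    '2006/12/12 10:49:22 Launched "ipfp005" (23996)',
    "2006/12/12 10:49:52 Process 23996 (ipfp005) has terminated [30s]",
    '2006/12/12 10:49:52 The job "postipfp005" is ready to run. Launching',
    '2006/12/12 10:49:52 Launched "postipfp005" (23997)',
    "2006/12/12 10:50:02 Process 23997 (postipfp005) has terminated [10s]",
]
print(durations(log))
# prints: {'ipfp005': 30, 'postipfp005': 10}
```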

M::P Profiling

Final report:

ID          Start Time           End Time             Elapsed
codify      2006-12-12T10:41:10  2006-12-12T10:49:11  8m 1s
ngramsA     2006-12-12T10:49:11  2006-12-12T11:07:46  18m 34s
ngramsB     2006-12-12T10:49:11  2006-12-12T11:05:44  16m 33s
initmat001  2006-12-12T10:49:11  2006-12-12T10:50:12  1m
initmat002  2006-12-12T10:49:11  2006-12-12T10:50:43  1m 31s
initmat003  2006-12-12T10:49:11  2006-12-12T10:51:03  1m 51s
[...]

M::P Error Handling

M::P Real World Usage

NLP process

~ 100 lines of Makefile::Parallel syntax
~ 4 GB of text
Desktop P4 3 GHz == 1 week
SeARCH cluster == 12 hours

M::P About the module

use Clone qw(clone);
use Cwd;
use Data::Dumper;
use Digest::MD5;
use GraphViz;
use Log::Log4perl;
use Parse::Yapp;
use Proc::Reliable;
use Proc::Simple;
use Time::HiRes qw(gettimeofday tv_interval);
use Time::Interval;
use Time::Piece::ISO;

M::P New features

0.4
- support for multiple parametric variables on each rule
- new bugs added

0.5
- fixed a bunch of bugs
- new bugs added

0.6
- fixed a small (but important!) bug
- ran out of new ideas

Need a direction!

ambs@cpan.org

Thank you

root@cpan.org
