Workflows: A Blind Alley in Grid Computing?
Søren-Aksel Sørensen
Department of Computer Science, UCL
Mar 17, 2016
Abingdon 2005 Slide 2
Sixties work cycle
Data
Program
Job Control
Timeshare
Abingdon 2005 Slide 3
Seventies work cycle
Timeshare
Graphics
Hardcopy
Abingdon 2005 Slide 4
Eighties/Nineties work cycle
Timeshare
Abingdon 2005 Slide 5
GRID work cycle
Resource Network
Abingdon 2005 Slide 6
GRID work cycle
Resource Network
HPC evolution since 1960
Timeshared batch environment retained
Job Control cards replaced by Work Flows
“Grid designers estimate that the average grid job will take anywhere from days to weeks” www.ppdg.net/mtgs/18jun02-lbl/ppdg-idat-ucb-interactivity.ppt
Abingdon 2005 Slide 7
eScience
eScience is not financially viable.
Must rely on Commodities Off The Shelf (CESDIS 1993).
eScience must follow eBusiness.
Workflow/Batch model unacceptable for eBusiness.
Even RPC unsuitable.
Alternatives must be found if Grid computing is to survive.
Abingdon 2005 Slide 8
This is what I want
Science requires hands-on experience
Abingdon 2005 Slide 9
Requirements
Steering capability
Rendering resources & devices
Haptic devices
Real time modeling
Resource prioritization
Network QoS
Dynamic resource control
How close are we to this?
Abingdon 2005 Slide 10
My setup 1994
JANET
Workstation
Batch
Data
TimeShare
Abingdon 2005 Slide 11
My setup 1995
JANET
Workstation
Workstation
PVM (1989)
Data
Abingdon 2005 Slide 12
My setup 1997
JANET
Workstation
Hub
Workstation × 8
Abingdon 2005 Slide 13
My setup 2004
JANET
Workstation
Server, Server
Switch × 6
Gb switch
Abingdon 2005 Slide 14
My setup 2005
JANET
Workstation
Server, Server
Switch × 6
Gb switch
Processing on Demand?
Abingdon 2005 Slide 15
Example: Particle interaction
~5,000 particles falling onto a surface.
18 processors are used in this example.
Processors are colour coded.
Observe colour changes as objects change their home.
Sørensen 2004
Abingdon 2005 Slide 16
But there are problems
Because we are using human interaction, smooth progression is essential.
We immediately recognise the problem of load balancing.
But each model iteration does not require the same effort.
And don’t forget model induced variations in time increments (Δt).
We therefore need a dynamic resource supply.
Abingdon 2005 Slide 17
Load balance
Distributing the load among the available processors is relatively easy.
The model uses 90 herders on 18 processors, so there is scope for herder migration to compensate for object migration.
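The idea of herder migration can be sketched in a few lines. This is a minimal illustration, not the model's actual scheme: it assumes each herder's load is simply the number of objects it currently holds, and it uses a greedy rule that moves one herder at a time from the busiest processor to the idlest one. The names `processor_loads` and `rebalance` are hypothetical.

```python
def processor_loads(assignment, herder_load, n_procs):
    """Sum the load of all herders placed on each processor."""
    loads = [0] * n_procs
    for herder, proc in assignment.items():
        loads[proc] += herder_load[herder]
    return loads


def rebalance(assignment, herder_load, n_procs, max_moves=100):
    """Greedy herder migration (an assumed policy, for illustration only).

    assignment:  dict herder id -> processor index
    herder_load: dict herder id -> number of objects held by that herder
    Returns a new assignment; the input dict is left unchanged.
    """
    assignment = dict(assignment)
    for _ in range(max_moves):
        loads = processor_loads(assignment, herder_load, n_procs)
        busiest = max(range(n_procs), key=loads.__getitem__)
        idlest = min(range(n_procs), key=loads.__getitem__)
        gap = loads[busiest] - loads[idlest]
        # Moving a herder only helps if its load is smaller than the gap.
        candidates = [h for h, p in assignment.items()
                      if p == busiest and 0 < herder_load[h] < gap]
        if not candidates:
            break
        # Prefer the herder whose load is closest to half the gap.
        best = min(candidates, key=lambda h: abs(herder_load[h] - gap / 2))
        assignment[best] = idlest
    return assignment
```

With 90 herders on 18 processors there are on average five herders per processor, which is what gives a scheme like this room to move work around as objects pile up in one region.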
Abingdon 2005 Slide 18
Time progression
As collisions become dominant, time progression slows down.
Transition is gradual because of particle size variations.
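One way to picture why the time step shrinks is a step-size rule of the following kind. This is an assumed heuristic, not the rule used in the model shown: a particle in contact may move at most a fixed fraction of its own radius per step, and the global step is the smallest such limit. The function name and the parameters `dt_free` and `safety` are hypothetical.

```python
def choose_time_step(particles, dt_free=1e-2, safety=0.1):
    """Pick a global time step for one iteration.

    particles: iterable of (radius, speed, in_contact) tuples.
    dt_free:   step used while no particle is colliding (assumed value).
    safety:    largest fraction of its radius a colliding particle
               may travel in one step (assumed value).
    """
    dt = dt_free
    for radius, speed, in_contact in particles:
        if in_contact and speed > 0:
            # Smaller or faster colliding particles force a smaller step.
            dt = min(dt, safety * radius / speed)
    return dt
```

Under a rule like this the step drops as soon as the first particles collide, and keeps dropping as smaller particles join the pile, which is consistent with a gradual rather than abrupt slowdown.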
Abingdon 2005 Slide 19
Resource management
Load balancing is not sufficient.
We need to manage the progression gradient throughout to compensate for changes in time step.
This requires dynamic resource management.
User progression
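A back-of-envelope estimate shows why a shrinking time step translates into a demand for more processors. The sketch below assumes the progression gradient means model time advanced per second of wall-clock time, and that an iteration's cost divides evenly across processors up to a fixed parallel efficiency. Both are simplifying assumptions, and the function and parameter names are hypothetical.

```python
import math


def processors_needed(target_gradient, dt, work_per_iteration, efficiency=0.8):
    """Estimate the processors required to hold a progression gradient.

    target_gradient:    model seconds to advance per wall-clock second.
    dt:                 current model time step (model seconds).
    work_per_iteration: processor-seconds of work in one iteration.
    efficiency:         assumed parallel efficiency, in (0, 1].
    """
    if dt <= 0 or not 0 < efficiency <= 1:
        raise ValueError("dt must be positive and efficiency in (0, 1]")
    # Iterations needed per wall second: target_gradient / dt.
    # Iterations delivered per wall second by p processors:
    #   p * efficiency / work_per_iteration.
    p = target_gradient * work_per_iteration / (dt * efficiency)
    return max(1, math.ceil(p))
```

Halving the time step doubles the estimate, so a fixed allocation cannot keep the user's view of the model progressing smoothly once collisions dominate.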
Abingdon 2005 Slide 20
Virtual Relocateable Execution Controller
Creates an interface between the application and a virtual machine.
Virtual machine interacts with schedulers and resource discovery services.
Responsible for:
information sharing
fault recovery
resource management
migration policy
Based on PVM+SSH
Abingdon 2005 Slide 21
Domain 0
JPortal: Authentication, Task generation, Task submission
GriDM: Discovery, Scheduling
Requests
Permits
Domain 1: GriDM, SGE
Domain 2: GriDM, SGE
Domain 3: GriDM, SGE
Domain 4: GriDM, SGE
VREC Resource Holders
Application, Application
JYDE
Abingdon 2005 Slide 22
Ready for questions