Exa-Scale Volunteer Computing David P. Anderson Space Sciences Laboratory U.C. Berkeley
Jan 01, 2016
Outline
• Volunteer computing
• BOINC
• Applications
• Research directions
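[Figure: computing platforms arranged by number of processors (1, 100, 1000, 10K-1M) and by workload (single job vs. multiple jobs). Starting point: program runs too slow on PC. High-performance computing (single job): cluster (MPI), supercomputer. High-throughput computing (multiple jobs): cluster (batch), Grid, commercial cloud, volunteer computing.]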
Volunteer computing
• Early projects
– 1997: GIMPS, distributed.net
– 1999: SETI@home, Folding@home
• Today
– ~50 projects
– 500K volunteers
– 1M computers, 2.4M cores
– 10 PetaFLOPS
The potential of volunteer computing
• The volunteer resource pool
• Current PetaFLOPS breakdown:
[Bar chart: PetaFLOPS by processor type. Categories: NVIDIA, CPU, PS3 (Cell), ATI. Bar labels: 4.6, 2.4, 2.2, 1.2.]
• Potential: ExaFLOPS by 2010
– 4M GPUs * 1 TFLOPS * 0.25 availability
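The potential estimate on this slide (4M GPUs * 1 TFLOPS * 0.25 availability) can be checked with a few lines of arithmetic. The sketch below uses only the slide's three figures; the function and constant names are illustrative, not part of BOINC.

```python
# Back-of-envelope check of the slide's estimate.
NUM_GPUS = 4_000_000      # 4M GPUs (slide figure)
FLOPS_PER_GPU = 1e12      # 1 TFLOPS per GPU (slide figure)
AVAILABILITY = 0.25       # fraction of time a GPU is available (slide figure)


def potential_flops(num_gpus, flops_per_gpu, availability):
    """Aggregate throughput: device count * per-device speed * availability."""
    return num_gpus * flops_per_gpu * availability


total = potential_flops(NUM_GPUS, FLOPS_PER_GPU, AVAILABILITY)
print(f"{total / 1e18:.2f} ExaFLOPS")
```

With these inputs the product is 10^18 FLOPS, i.e. one ExaFLOPS, about 100 times the ~10 PetaFLOPS quoted for today's pool.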
BOINC
• Middleware for volunteer computing
– client, server, web
• Based at UC Berkeley Space Sciences Lab
• Open source (LGPL)
• NSF-funded since 2002
• http://boinc.berkeley.edu
BOINC: volunteers and projects
[Diagram: volunteers are linked to projects (CPDN, LHC@home, WCG) by attachments.]
The Utopian vision
• Better research gets more computing power
• An enlightened public decides what’s better
[Diagram: the public provides resources to scientific research; scientific research provides education/outreach to the public.]
Science areas using BOINC
• Biology
– protein study, genetic analysis
• Medicine
– drug discovery, epidemiology
• Physics
– LHC, nanotechnology, quantum computing
• Astronomy
– data analysis, cosmology, galactic modeling
• Environment
– climate modeling, ecosystem simulation
• Math
• Graphics rendering
Application types
• Computing-intensive analysis of large data
• Physical simulations
• Genetic algorithms
– GA-based optimization
Climateprediction.net
Einstein@home
• Gravitational waves; gravitational pulsars
SETI@home
Milkyway@home
GPUGRID.net
AQUA@home
• D-Wave Systems
• Simulation of “adiabatic quantum algorithms” for binary quadratic optimization
Quake Catcher Network
Account managers
BOINC software overview
[Diagram: the volunteer host (client, apps, screensaver, GUI) communicates over HTTP with the project server (scheduler, MySQL, data server, daemons).]
Client: job scheduling
• Queue multiple jobs
– avoid starvation
– minimize communication
– variety
• Job scheduling
– Round-robin time-slicing
– Earliest deadline first
Client: work fetch policy
• When? From which project? How much?
• Goals
– maintain enough work
– minimize scheduler requests
– honor resource shares
• per-project “debt”
[Diagram: queued work on CPU 0 to CPU 3 relative to min and max levels.]
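The per-project "debt" idea and the min/max levels can be illustrated with a small sketch. It assumes that a project's debt grows by the processing time its resource share entitles it to and shrinks by the time it actually used, that work is fetched from the project with the largest debt, and that a request tops the buffer up to the max level once it falls below the min level. All names and the exact accounting are hypothetical, not the client's actual code.

```python
def update_debts(debts, shares, used, dt):
    """Accrue debt over an interval of dt seconds.

    debts:  project -> current debt (seconds), modified in place
    shares: project -> resource share (any positive weights)
    used:   project -> seconds of processing actually received in the interval
    """
    total_share = sum(shares.values())
    for project in debts:
        entitled = dt * shares[project] / total_share
        debts[project] += entitled - used.get(project, 0.0)
    # Normalize so debts average zero and do not drift over time.
    mean = sum(debts.values()) / len(debts)
    for project in debts:
        debts[project] -= mean
    return debts


def choose_project(debts):
    """From which project? The one that is owed the most processing time."""
    return max(debts, key=debts.get)


def seconds_to_request(buffered, min_level, max_level):
    """When and how much? Refill to max_level only once below min_level."""
    return max_level - buffered if buffered < min_level else 0.0


debts = update_debts({"A": 0.0, "B": 0.0}, {"A": 1, "B": 1}, {"A": 10.0}, dt=10.0)
print(choose_project(debts), seconds_to_request(100.0, 600.0, 3600.0))
```

Refilling to the max level only after dropping below the min level is one way to keep the number of scheduler requests low while still maintaining enough work.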
BOINC scheduler
[Diagram: applications have app versions for different platforms (Win32, Win32 N-core, Win32 + NVIDIA, Win64, Mac OS X); jobs have instances.]
• Request
– HW, SW description
– existing workload
– per resource type: # of instances requested, # of seconds requested
• Reply
– app version descriptions
– job descriptions
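The request contents listed above can be pictured as a small data structure, together with one plausible filtering step on the server side. Everything in this sketch is hypothetical: the class names, the field names, and the matching rule are illustrative assumptions and do not reproduce BOINC's actual request format or scheduler logic.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRequest:
    instances: int   # number of instances requested
    seconds: float   # number of seconds of work requested


@dataclass
class SchedulerRequest:
    platforms: list   # part of the HW/SW description, e.g. ["Win32"]
    resources: dict   # resource type (e.g. "CPU", "NVIDIA") -> ResourceRequest
    existing_workload: list = field(default_factory=list)


@dataclass
class AppVersion:
    app: str
    platform: str
    resource: str     # resource type this version runs on


def usable_versions(request, versions):
    """Keep app versions the host can run and for which it asked for work."""
    return [
        v for v in versions
        if v.platform in request.platforms
        and v.resource in request.resources
        and request.resources[v.resource].seconds > 0
    ]


request = SchedulerRequest(
    platforms=["Win32"],
    resources={"CPU": ResourceRequest(instances=1, seconds=3600.0)},
)
versions = [
    AppVersion("example_app", "Win32", "CPU"),
    AppVersion("example_app", "Win32", "NVIDIA"),
    AppVersion("example_app", "Mac OS X", "CPU"),
]
print(usable_versions(request, versions))
```

Here only the Win32 CPU version survives, because the host reported no NVIDIA resource request and does not run Mac OS X.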
Anonymous platform mechanism
• Volunteer supplies app versions.
– security
– optimization
– unsupported platforms
Umbrella projects
Example: IBM World Community Grid
[Diagram: Project; publicity, web development, sysadmin, app porting.]
The Berkeley@home model
• A university has
– scientists
– a powerful “brand”
– PR resources
– IT infrastructure
– lots of alumni (UCB: 500,000)
Hubs
• nanoHUB: “science portal” for nanoscience
– social network + “app store”
– sharing of ideas, data, software
– computational portal
• HUBzero: generalization to other areas
– currently ~20 hubs
• Integration of BOINC with HUBzero
– each hub has a volunteer computing project
Volunteer computing research
• Host characterization
• Simulation-based performance study
• MPI-type apps
• Apps in VMs
• Data-intensive computing
• Volunteer motivation
Conclusion
• Volunteer computing: Exa-scale potential
– GPUs are crucial
• BOINC: enabling technology
• Bottlenecks
– the culture of scientific computing
– organizational models