Scientific Computing at SLAC
Richard P. Mount
Director: Scientific Computing and Computing Services
Stanford Linear Accelerator Center
HEPiX
October 11, 2005
SLAC Scientific Computing: A Balancing Act
• Aligned with the evolving science mission of SLAC
but neither
• Subservient to the science mission
nor
• Unresponsive to SLAC mission needs
SLAC Scientific Computing Drivers
• BaBar (data-taking ends December 2008)
  – The world's most data-driven experiment
  – Data analysis challenges until the end of the decade
• KIPAC
  – From cosmological modeling to petabyte data analysis
• Photon Science at SSRL and LCLS
  – Ultrafast science, modeling and data analysis
• Accelerator Science
  – Modeling electromagnetic structures (PDE solvers in a demanding application)
• The Broader US HEP Program (aka LHC)
  – Contributes to the orientation of SLAC Scientific Computing R&D
DOE Scientific Computing Funding at SLAC
• Particle and Particle Astrophysics
– $14M SCCS
– $5M Other
• Photon Science
– $0M SCCS
– $1M SSRL?
• Computer Science
– $1.5M
Scientific Computing: the relationship between Science and the components of Scientific Computing

[Diagram: a layered stack, each layer with examples.]
• Application Sciences: high-energy and particle-astro physics, accelerator science, photon science …
• Issues addressable with "computing": particle interactions with matter, electromagnetic structures, huge volumes of data, image processing …
• Computing techniques: PDE solving, algorithmic geometry, visualization, meshes, object databases, scalable file systems …
• Computing architectures: single system image, low-latency clusters, throughput-oriented clusters, scalable storage …
• Computing hardware: processors, I/O devices, mass-storage hardware, random-access hardware, networks and interconnects …
(SCCS staffing shown on the diagram: ~20 FTE and ~26 FTE.)
Scientific Computing: SLAC's goals for Scientific Computing

[Diagram: the same layered stack, with two overlaid goals – "Computing for Data-Intensive Science" and "The Science of Scientific Computing" – spanning the computing-techniques, architectures and hardware layers; "SLAC + Stanford Science" at the application-science and issues layers; all pursued in collaboration with Stanford and industry.]
Scientific Computing: current SLAC leadership and recent achievements in Scientific Computing
• World's largest database
• Internet2 Land-Speed Record; SC2004 Bandwidth Challenge
• Huge-memory systems for data analysis
• Scalable data management
• GEANT4 photon/particle interaction in complex structures (in collaboration with CERN)
• PDE solving for complex electromagnetic structures
What does SCCS run (1)?
• Data analysis "farms" (also good for HEP simulation)
  – ~4000 processors
  – Linux and Solaris
• Shared-memory multiprocessor
  – SGI Altix 3700
  – 72 processors
  – Linux
• 256 dual-Opteron Sun V20zs
• Myrinet cluster
  – 128 processors
  – Linux
What does SCCS run (2)?
• Application-specific clusters
  – each 32 to 128 processors
  – Linux
• PetaCache prototype
  – 64 nodes
  – 16 GB memory per node
  – Linux/Solaris
• And even …
What does SCCS run (3)?
• Disk servers
  – About 500 TB, network attached
  – Mainly xrootd; some NFS, some AFS
  – About 120 TB of Sun FibreChannel disk arrays on Sun/Solaris servers
• Tape storage
  – 6 STK Powderhorn silos
  – Up to 6 petabytes capacity; currently store 2 petabytes
  – HPSS
What does SCCS run (4)?
• Networks
  – 10 Gigabits/s to ESnet
  – 10 Gigabits/s for R&D
  – 96 fibers to Stanford
  – 10 Gigabits/s core in the computer center (as soon as we unpack the boxes)
SLAC Computing - Principles and Practice (Simplify and Standardize)
• Lights-out operation – no operators for the last 10 years
  – Run 24x7 with 8x5 (in theory) staff
  – (When there is a cooling failure on a Sunday morning, 10–15 SCCS staff are on site by the time I turn up)
• Science (and business-unix) raised-floor computing
  – Adequate reliability essential
  – Solaris and Linux
  – Scalable "cookie-cutter" approach
  – Only one type of farm CPU bought each year
  – Only one type of file server + disk bought each year
  – Highly automated OS installation and maintenance
    • e.g., see the talk on how SLAC does clusters by Alf Wachsmann:
      http://www.slac.stanford.edu/~alfw/talks/RCcluster.pdf
SLAC-BaBar Computing Fabric

[Diagram: clients, disk servers and tape servers joined by a Cisco IP network.]
• Clients: 1700 dual-CPU Linux boxes and 800 single-CPU Sun/Solaris boxes
• Disk servers: 120 dual/quad-CPU Sun/Solaris systems with ~400 TB of Sun FibreChannel RAID arrays (+ some SATA), running HEP-specific ROOT software (xrootd) + the Objectivity/DB object database, some NFS
• Tape servers: 25 dual-CPU Sun/Solaris systems driving 40 STK 9940B and 6 STK 9840A drives in 6 STK Powderhorn silos, holding over 1 PB of data under HPSS + SLAC enhancements to ROOT and Objectivity server code
Scientific Computing Research Areas (1)
(Funded by DOE-HEP, DOE SciDAC and DOE-MICS)
• Huge-memory systems for data analysis (SCCS Systems group and BaBar)
  – Expected major growth area (more later)
• Scalable data-intensive systems (SCCS Systems and Physics Experiment Support groups)
  – "The world's largest database" (OK, not really a database any more)
  – How to maintain performance with data volumes growing like "Moore's Law"?
  – How to improve performance by factors of 10, 100, 1000, …? (intelligence plus brute force)
  – Robustness, load balancing and troubleshootability in 1,000–10,000-box systems
  – Astronomical data analysis on a petabyte scale (in collaboration with KIPAC)
Scientific Computing Research Areas (2)
(Funded by DOE-HEP, DOE SciDAC and DOE-MICS)
• Grids and security (SCCS Physics Experiment Support, Systems and Security groups)
  – PPDG: building the US HEP Grid – Open Science Grid
  – Security in an open scientific environment
  – Accounting, monitoring, troubleshooting and robustness
• Network research and stunts (SCCS Network group – Les Cottrell et al.)
  – Land-speed record and other trophies
• Internet monitoring and prediction (SCCS Network group)
  – IEPM: Internet End-to-End Performance Monitoring (~5 years)
  – INCITE: Edge-based Traffic Processing and Service Inference for High-Performance Networks
Scientific Computing Research Areas (3)
(Funded by DOE-HEP, DOE SciDAC and DOE-MICS)
• GEANT4: simulation of particle interactions in million- to billion-element geometries (SCCS Physics Experiment Support group – M. Asai, D. Wright, T. Koi, J. Perl …)
  – BaBar, GLAST, LCD …
  – LHC program
  – Space
  – Medical
• PDE solving for complex electromagnetic structures (Kwok Ko's Advanced Computing Department + SCCS clusters)
Growing Competences
• Parallel computing (MPI …)
  – Driven by KIPAC (Tom Abel) and ACD (Kwok Ko)
  – SCCS competence in parallel computing (= Alf Wachsmann, currently)
  – MPI clusters and the SGI SSI system
• Visualization
  – Driven by KIPAC and ACD
  – SCCS competence is currently experimental-HEP focused (WIRED, HepRep …)
  – (A polite way of saying that growth is needed)
The PetaCache Project
A Leadership-Class Facility for Data-Intensive Science
Richard P. Mount
Director, SLAC Computing Services
Assistant Director, SLAC Research Division
Washington DC, April 13, 2004
PetaCache Goals
• The PetaCache architecture aims to revolutionize the query and analysis of scientific databases with complex structure.
– Generally this applies to feature databases (terabytes–petabytes) rather than bulk data (petabytes–exabytes)
• The original motivation comes from HEP
– Sparse (~random) access to tens of terabytes today, petabytes tomorrow
– Access by thousands of processors today, tens of thousands tomorrow
PetaCache: The Team
• David Leith, Richard Mount, PIs
• Randy Melen, Project Leader
• Chuck Boeheim (Systems group leader)
• Bill Weeks, performance testing
• Andy Hanushevsky, xrootd
• Systems group members
• Network group members
• BaBar (Stephen Gowdy)
Random-Access Storage Performance

[Chart: "Latency and Speed – Random Access" – retrieval rate (MBytes/s, log scale from 1e-9 to 1000) versus log10(object size in bytes, 0–10) for PC2100 DRAM, a WD 200 GB disk, and an STK 9940B tape drive.]
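The shape of those three curves follows from a simple model: for a random read, effective throughput is object_size / (access_latency + object_size / streaming_bandwidth). A minimal Python sketch of that model follows; the latency and bandwidth figures are order-of-magnitude assumptions for the three technologies, not the measured numbers behind the chart.

```python
# Simple random-access model: rate = size / (latency + size / bandwidth).
# Latency/bandwidth figures are rough assumptions for illustration only,
# not the measurements plotted on the slide.
DEVICES = {
    "PC2100 DRAM":    {"latency_s": 100e-9, "bw_Bps": 2.1e9},  # ~100 ns, ~2.1 GB/s
    "WD 200 GB disk": {"latency_s": 10e-3,  "bw_Bps": 50e6},   # ~10 ms seek, ~50 MB/s
    "STK 9940B tape": {"latency_s": 60.0,   "bw_Bps": 30e6},   # ~1 min mount/seek, ~30 MB/s
}

def retrieval_rate_MBps(size_bytes: float, latency_s: float, bw_Bps: float) -> float:
    """Effective MBytes/s when every object costs one random access."""
    return size_bytes / (latency_s + size_bytes / bw_Bps) / 1e6

for size in (100, 10_000, 1_000_000, 100_000_000):
    row = {name: retrieval_rate_MBps(size, d["latency_s"], d["bw_Bps"])
           for name, d in DEVICES.items()}
    print(f"{size:>11} B", {k: f"{v:.3g} MB/s" for k, v in row.items()})
```

For small objects the latency term dominates, which is why memory beats disk by several orders of magnitude and tape by far more; for very large objects all three converge toward their streaming bandwidth.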
The PetaCache Strategy
• Sitting back and waiting for technology is a BAD idea
• Scalable petabyte memory-based data servers require much more than just cheap chips. Now is the time to develop:
  – Data-server architecture(s) delivering latency and throughput cost-optimized for the science
  – Scalable data-server software supporting a range of data-serving paradigms (file access, direct addressing, …)
  – "Liberation" from entrenched legacy approaches to scientific data analysis that are founded on the "knowledge" that accessing small data objects is crazily inefficient
• Applications will take time to adapt not just their codes, but their whole approach to computing, to exploit the new architecture
• Hence: three phases

1. Prototype machine (in operation)
   • Commodity hardware
   • Existing "scalable data server software" (as developed for disk-based systems)
   • HEP-BaBar as co-funder and principal user
   • Tests of other applications (GLAST, LSST …)
   • Tantalizing "toy" applications only (too little memory for flagship analysis applications)
   • Industry participation

2. Development machine (next proposal)
   • Low-risk (purely commodity hardware) and higher-risk (flash-memory system requiring some hardware development) components
   • Data-server software: improvements to performance and scalability, investigation of other paradigms
   • HEP-BaBar as co-funder and principal user
   • Work to "liberate" BaBar analysis applications
   • Tests of other applications (GLAST, LSST …)
   • Major impact on a flagship analysis application
   • Industry partnerships, DOE lab partnerships

3. Production machine(s)
   • Storage-class memory with a range of interconnect options matched to the latency/throughput needs of differing applications
   • Scalable data-server software offering several data-access paradigms to applications
   • Proliferation: machines deployed at several labs
   • Economic viability: cost-effective for programs needing dedicated machines
   • Industry partnerships transitioning to commercialization
Prototype Machine (Operational)

[Diagram: data servers and clients linked by Cisco switches.]
• Data servers (PetaCache: MICS + HEP-BaBar funding): 64–128 nodes, each a Sun V20z with 2 Opteron CPUs and 16 GB memory; up to 2 TB total memory; Solaris or Linux (mix and match)
• Clients (existing HEP-funded BaBar systems): up to 2000 nodes, each with 2 CPUs and 2 GB memory; Linux
Object-Serving Software
• Xrootd/olbd (Andy Hanushevsky/SLAC)
  – Optimized for read-only access
  – File-access paradigm (filename, offset, bytecount) – see the sketch below
  – Makes 1000s of servers transparent to user code
  – Load balancing
  – Self-organizing
  – Automatic staging from tape
  – Failure recovery
• Allows BaBar to start getting benefit from a new data-access architecture within months, without changes to user code
• The application can ignore the hundreds of separate address spaces in the data-cache memory
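For illustration, a minimal Python sketch of what the (filename, offset, bytecount) paradigm looks like to application code. It reads from the local POSIX filesystem purely as a stand-in: the real xrootd client speaks the xroot protocol to a redirector that locates the right server. The function name and path below are hypothetical.

```python
import os

def read_object(filename: str, offset: int, bytecount: int) -> bytes:
    """Fetch bytecount bytes starting at offset from the named file.
    Under xrootd the caller uses the same three parameters and never
    learns which of the 1000s of servers actually held the data."""
    fd = os.open(filename, os.O_RDONLY)
    try:
        return os.pread(fd, bytecount, offset)  # local stand-in for a remote read
    finally:
        os.close(fd)

# Hypothetical usage: pull one small event record out of a large file.
# data = read_object("/store/babar/run42/events.dat", offset=4_096_000, bytecount=8_192)
```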
Prototype Machine: Performance Measurements
• Latency
• Throughput (transaction rate)
• (Aspects of) Scalability
Latency (microseconds) versus data retrieved (bytes)

[Chart: measured request latency from 0 to 250 µs for reads of 100 to 8100 bytes, stacked into components: server xrootd overhead, server xrootd CPU, client xroot overhead, client xroot CPU, TCP stack/NIC/switching, and minimum transmission time.]
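A back-of-envelope reconstruction of that latency budget as a Python sketch. Only the wire-transmission term is computed from first principles (bytes × 8 / link rate, assuming Gigabit Ethernet); the fixed per-request components are illustrative placeholders, not the measured values in the chart.

```python
LINK_BPS = 1e9  # assume Gigabit Ethernet between client and data server

# Assumed fixed per-request costs in microseconds: placeholders standing in
# for the stacked bands in the chart, not the measured SLAC numbers.
FIXED_US = {
    "server xrootd overhead": 30,
    "server xrootd CPU": 10,
    "client xroot overhead": 20,
    "client xroot CPU": 10,
    "TCP stack, NIC, switching": 50,
}

def request_latency_us(nbytes: int) -> float:
    """Fixed overheads plus the minimum time to clock nbytes onto the wire."""
    transmission_us = nbytes * 8 / LINK_BPS * 1e6
    return sum(FIXED_US.values()) + transmission_us

for nbytes in (100, 1_000, 4_000, 8_000):
    print(f"{nbytes:>5} bytes -> {request_latency_us(nbytes):6.1f} us")
```

The model shows why latency is nearly flat for small reads (fixed overheads dominate) and grows roughly linearly once transmission time takes over.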
Throughput Measurements

[Chart: transactions per second (0–100,000) versus number of clients for one server (1–50), for three setups: Linux client – Solaris server, Linux client – Linux server, and Linux client – Solaris server (bge).]

22 processor microseconds per transaction
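That 22 µs of processor time per transaction caps the rate one server can sustain, however many clients connect. A quick Python check; the dual-CPU server count is our assumption, matching the V20z data servers described earlier.

```python
CPU_US_PER_TRANSACTION = 22   # measured figure from the slide
CPUS_PER_SERVER = 2           # assumption: dual-CPU data server

per_cpu = 1e6 / CPU_US_PER_TRANSACTION    # ~45,000 transactions/s per CPU
per_server = per_cpu * CPUS_PER_SERVER    # ~91,000 transactions/s per server

print(f"per CPU:    {per_cpu:,.0f} transactions/s")
print(f"per server: {per_server:,.0f} transactions/s")
```

That ceiling of roughly 90,000 transactions/s is consistent with where the measured curves flatten out as clients are added.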
Storage-Class Memory
• New technologies are coming to market in the next 3–10 years (Jai Menon, IBM)
• The current not-quite-crazy example is flash memory
Development Machine Plans

[Diagram: two tiers of data servers and the existing clients, linked by a switch with 10 Gigabit ports and the Cisco switch fabric.]
• Data servers (PetaCache): 80 nodes, each with 8 Opteron CPUs and 128 GB memory; up to 10 TB total memory; Solaris/Linux
• Data servers (PetaCache): 30 nodes, each with 2 Opteron CPUs and 1 TB flash memory; ~30 TB total memory; Solaris/Linux
• Clients (SLAC-BaBar system): up to 2000 nodes, each with 2 CPUs and 2 GB memory; Linux
Minor Details?
• 1970s
  – SLAC Computing Center designed for ~35 Watts/square foot
  – 0.56 MWatts maximum
• 2005
  – Over 1 MWatt by the end of the year
  – Locally high densities (16 kW racks)
• 2010
  – Over 2 MWatts likely needed
• Onwards
  – Uncertain, but increasing power/cooling need is likely
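The scale of the problem falls out of simple arithmetic, sketched below; the ~20 square-foot rack footprint (rack plus aisle share) is our assumption for illustration.

```python
DESIGN_W_PER_SQFT = 35       # 1970s design point
DESIGN_MAX_W = 0.56e6        # 0.56 MW maximum

floor_area_sqft = DESIGN_MAX_W / DESIGN_W_PER_SQFT   # ~16,000 sq ft

RACK_W = 16_000              # a 2005-era 16 kW rack
RACK_FOOTPRINT_SQFT = 20     # assumed footprint including aisle share

rack_density = RACK_W / RACK_FOOTPRINT_SQFT          # ~800 W/sq ft
print(f"designed floor area: {floor_area_sqft:,.0f} sq ft")
print(f"16 kW rack density:  {rack_density:,.0f} W/sq ft, "
      f"~{rack_density / DESIGN_W_PER_SQFT:.0f}x the 1970s design point")
```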
Crystal Ball (1)
• The pessimist’s vision:
– PPA computing winds down to about 20% of its current level as BaBar analysis ends in 2012 (GLAST is negligible, KIPAC is small)
– Photon Science is dominated by non-SLAC/non-Stanford scientists who do everything at their home institutions
– The weak drivers from the science base make SLAC unattractive for DOE computer science funding
Crystal Ball (2)
• The optimist's vision:
  – PPA computing in 2012 includes:
    • Vigorous/leadership involvement in LHC physics analysis using innovative computing facilities
    • Massive computational cosmology/astrophysics
    • A major role in LSST data analysis (petabytes of data)
    • Accelerator simulation for the ILC
  – Photon Science computing includes:
    • A strong SLAC/Stanford faculty, leading much of LCLS science, fully exploiting SLAC's strengths in simulation, data analysis and visualization
    • A major BES accelerator research initiative
  – Computer Science includes:
    • National/international leadership in computing for data-intensive science (supported at $25M to $50M per year)
  – SLAC and Stanford:
    • University-wide support for establishing leadership in the science of scientific computing
    • A new SLAC/Stanford scientific computing institute