Barcelona Supercomputing Center
Post on 19-Jan-2016
• The BSC-CNS objectives:
• R&D in Computer Sciences, Life Sciences and Earth Sciences.
• Supercomputing support to external research.
• BSC-CNS is a consortium that includes:
• the Spanish Government (MEC) – 51%
• the Catalonian Government (DIUE) – 37%
• the Technical University of Catalonia (UPC) – 12%
• 300 people
Research areas
• Influence the way machines are built, programmed and used
• Through demonstration, ideas, cooperation with manufacturers & products
e-science
Programming models
• Evolving standards (OpenMP x.y)
• Prototyping infrastructure (Mercurium, Nanos library, …)
• Dependences/data-flow (StarSs for Cell, SMP, GPU, Grid)
• Hierarchical/hybrid (MPI/SMPSs, NestedSs, …)
• Software Distributed Shared Memory
• Use of Transactional memory
Resource management
• OS scheduling: resource/power aware job scheduling, dynamic load balancing
• Scalable file systems
• Efficient execution on distributed computing environments: GRIDSs @ MN/RES, Grid I/O, heterogeneous workloads
• Management for next-generation data centers: virtualization
Performance analysis
• Tracing: scalable/online, sampling
• Visualization: Paraver
• Automatic analysis: spectral, clustering, …
• Methodologies and training material
• Integration with other tools
Prediction and evaluation infrastructure
• Dimemas: multiscale simulation
• Interconnection network: overlap, contention, …
• Node and microarchitecture level simulators: MPsim, TaskSim
• Architecture support for programming models and runtimes
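As a rough illustration of the kind of estimate a network simulator builds on, a common first-order model charges each point-to-point message a fixed latency plus its size divided by the bandwidth. The sketch below is that generic model only. Whether Dimemas uses exactly this form is an assumption, the function and parameter names are hypothetical, and overlap and contention are ignored.

```c
#include <stddef.h>

/* First-order cost of one point-to-point message (illustrative model only;
 * not taken from the slides, and not claimed to be Dimemas's exact model).
 *   latency_s              fixed startup cost in seconds
 *   bandwidth_bytes_per_s  sustained link bandwidth, must be > 0
 *   bytes                  message size
 * Overlap and contention are deliberately left out. */
double transfer_time(double latency_s, double bandwidth_bytes_per_s, size_t bytes)
{
    return latency_s + (double)bytes / bandwidth_bytes_per_s;
}
```

With a latency of 2 s and a bandwidth of 4 bytes/s (toy values chosen to keep the arithmetic exact), an 8-byte message costs 2 + 8/4 = 4 s.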
Users: Earth Sciences, Life Sciences, Engineering apps
Programming models
• Implementations on top of other low level run times, FPGAs, OpenCL
• Granularity control
• Locality aware scheduling
• Application porting: hybrid MPI/StarSs and comparison with other models
• Load balancing in nested/hybrid implementations
• Instrumentation and analysis for task based systems
StarSs
• CellSs
• SMPSs
• GPUSs
• GridSs
• ClearSpeedSs
• ClusterSs
• CompSs (Java)
    #pragma css task input(A, B) output(C)
    void vadd3 (float A[BS], float B[BS], float C[BS]);

    #pragma css task input(sum, A) output(B)
    void scale_add (float sum, float A[BS], float B[BS]);

    #pragma css task input(A) inout(sum)
    void accum (float A[BS], float *sum);
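The slide gives only the task declarations. A minimal sketch of what the three bodies might look like follows, inferred from the function names and the short comments on the calling loops (element-wise add, accumulate into sum, scale by sum). The bodies and the value of BS are assumptions, not the authors' code. The pragma lines are repeated as comments so the block compiles with a plain C compiler.

```c
#include <stddef.h>

#define BS 4  /* block size: assumed value, the slides do not give one */

/* #pragma css task input(A, B) output(C) */
/* Assumed body: element-wise sum of one block, C = A + B. */
void vadd3(float A[BS], float B[BS], float C[BS])
{
    for (size_t i = 0; i < BS; i++)
        C[i] = A[i] + B[i];
}

/* #pragma css task input(sum, A) output(B) */
/* Assumed body: scale one block by a scalar, B = sum * A. */
void scale_add(float sum, float A[BS], float B[BS])
{
    for (size_t i = 0; i < BS; i++)
        B[i] = sum * A[i];
}

/* #pragma css task input(A) inout(sum) */
/* Assumed body: add every element of one block into *sum. */
void accum(float A[BS], float *sum)
{
    for (size_t i = 0; i < BS; i++)
        *sum += A[i];
}
```

Each function touches exactly one block of BS elements, which is what lets the annotations describe a task's inputs and outputs per block.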
    for (i=0; i<N; i+=BS)   // C=A+B
        vadd3 (&A[i], &B[i], &C[i]);
    ...
    for (i=0; i<N; i+=BS)   // sum(C[i])
        accum (&C[i], &sum);
    ...
    for (i=0; i<N; i+=BS)   // B=sum*E
        scale_add (sum, &E[i], &B[i]);
    ...
    for (i=0; i<N; i+=BS)   // A=C+D
        vadd3 (&C[i], &D[i], &A[i]);
    ...
    for (i=0; i<N; i+=BS)   // E=C+F
        vadd3 (&C[i], &F[i], &E[i]);
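In a dependence/data-flow model, the calls above can be reordered or run in parallel only where the input/output/inout annotations allow it. The following is a minimal, conservative sketch of such a check: two tasks conflict when they touch the same block and at least one of them writes it. All type and function names are hypothetical, blocks are identified only by their start address, and a real StarSs runtime may well be less conservative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical access modes, mirroring the input/output/inout clauses. */
typedef enum { ACC_IN, ACC_OUT, ACC_INOUT } AccessMode;

typedef struct {
    const void *addr;  /* start address of the block the task touches */
    AccessMode  mode;
} Access;

static bool writes(AccessMode m)
{
    return m == ACC_OUT || m == ACC_INOUT;
}

/* Returns true if `later` must wait for `earlier`: they share a block and
 * at least one of the two accesses writes it (covers read-after-write,
 * write-after-read and write-after-write). */
bool task_depends(const Access *earlier, size_t n_earlier,
                  const Access *later, size_t n_later)
{
    for (size_t i = 0; i < n_earlier; i++)
        for (size_t j = 0; j < n_later; j++)
            if (earlier[i].addr == later[j].addr &&
                (writes(earlier[i].mode) || writes(later[j].mode)))
                return true;
    return false;
}
```

Under this check, accum on a block of C must wait for the vadd3 that produced that block, while two vadd3 calls that only read a common block stay independent.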
Performance tools
• Analysis of applications at large scale
• Maximize ratio of captured information / emitted data
• Intelligent online data reduction
• Mixed instrumentation and sampling
• Advanced modeling/prediction of sequential computation behavior
• Memory behavior
• Use classification techniques of hardware counter metrics to identify potentially interesting transformations
CPI stack model for sequential computation parts
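A CPI stack splits cycles per instruction into a base component plus one component per stall cause, so the components add up to the total CPI. The sketch below shows only that arithmetic. The counter layout and the three stall categories are hypothetical and do not correspond to any specific hardware counter set or BSC tool.

```c
#include <stddef.h>

#define N_STALL 3  /* hypothetical stall categories, e.g. memory, branch, other */

typedef struct {
    unsigned long long instructions;          /* retired instructions */
    unsigned long long base_cycles;           /* cycles with no stall */
    unsigned long long stall_cycles[N_STALL]; /* cycles lost per stall cause */
} Counters;

/* One layer of the stack: cycles attributed to a cause, per instruction. */
double cpi_component(unsigned long long cycles, unsigned long long instructions)
{
    return instructions ? (double)cycles / (double)instructions : 0.0;
}

/* Total CPI: base cycles plus all stall cycles, per instruction. */
double cpi_total(const Counters *c)
{
    unsigned long long cycles = c->base_cycles;
    for (size_t i = 0; i < N_STALL; i++)
        cycles += c->stall_cycles[i];
    return cpi_component(cycles, c->instructions);
}
```

For 1000 instructions with 500 base cycles and stall cycles of 200, 200 and 100, the stack is 0.5 + 0.2 + 0.2 + 0.1, for a total CPI of 1.0.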