Software Support for Advanced Computing Platforms Ananth Grama Professor, Computer Sciences and Coordinated Systems Lab., Purdue University. [email protected] http://www.cs.purdue.edu/pdsl

Software Support for Advanced Computing Platforms

Jan 13, 2016

Page 1: Software Support for Advanced Computing Platforms

Software Support for Advanced Computing Platforms

Ananth Grama

Professor, Computer Sciences and Coordinated Systems Lab., Purdue University.

[email protected]

http://www.cs.purdue.edu/pdsl

Page 2: Software Support for Advanced Computing Platforms

Building Applications for Next Generation Computing Platforms

• Emerging trends point to two disruptive technologies:

– Architecture innovations from the desktop to scalable systems

– Embedded intelligence and ubiquitous processing

• How do we program these platforms efficiently?

Very little of what we have learned over three decades of parallel programming directly applies here.

Page 3: Software Support for Advanced Computing Platforms

Evolution of Microprocessor Architectures

• Chip-Multiprocessor Architectures

• Scalable Multicore Platforms

• Heterogeneous Multicore Processors

• Transactional Memory

Page 4: Software Support for Advanced Computing Platforms

Multicore Architectures -- An Overview

• The Myth:

– Multicore processors are designed for speed.

• The Reality:

Multicore processors are motivated by power considerations:

– Power is proportional to clock speed

– Power is quadratic in Vdd

– Vdd can be reduced as clock speed is reduced

– Computation speed is generally sublinear in clock speed
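The power argument above can be illustrated with a small calculation. This is a minimal sketch, not a model from the talk: it uses the standard dynamic-power relation (power proportional to Vdd squared times clock frequency) and assumes, purely for illustration, that Vdd can be scaled down linearly with the clock. The function name `relative_power` and its parameters are hypothetical.

```python
def relative_power(freq_scale, cores=1, vdd_scale=None):
    """Dynamic power relative to one core at full clock and full Vdd.

    Uses P ~ cores * Vdd^2 * f. If vdd_scale is omitted, Vdd is assumed
    to scale linearly with the clock (an illustrative assumption).
    """
    if vdd_scale is None:
        vdd_scale = freq_scale
    return cores * vdd_scale ** 2 * freq_scale


single_fast = relative_power(1.0)          # one core, full clock
dual_slow = relative_power(0.5, cores=2)   # two cores, half clock each

print(single_fast, dual_slow)
```

Under these assumptions, two half-speed cores draw a quarter of the power of one full-speed core. Because computation speed is sublinear in clock speed, each slow core retains more than half of its throughput, which is the case for multicore that the slide makes.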

Page 5: Software Support for Advanced Computing Platforms

Multicore Architectures -- An Overview

• Collocate multiple processor cores on a single chip (a special class of chip-multiprocessors)

• Programming model is typically thread-based

• Many microprocessors are hardware compatible with existing motherboards (memory performance?)

• Memory systems vary widely across various vendors (AMD vs. Intel vs. IBM PowerPC/Cell)

Page 6: Software Support for Advanced Computing Platforms

Multicore Architectures -- Trends

• Current generation typically at dual- or quad-core

• Desktops and mobile dual-core variants available

• Scalable multicore: AMD and Intel both plan up to 16 cores in the next two years and up to 64 cores in the medium term.

• Heterogeneous multicore: some of the most commonly used processors today are heterogeneous multicore (network routers, ARM/TI DSPs in cell-phones).

Page 7: Software Support for Advanced Computing Platforms

Memory System Architecture

• Trading off latency and bandwidth (the Cell solution)

• Programmable caches

• Transactional Memory

Page 8: Software Support for Advanced Computing Platforms

Transactional Memory Overview

• Addresses problems of correctness of parallel programs as well as performance.

• Requires hardware support.

• Mitigates many of the problems associated with locks – composability, granularity, mixing correctness and performance.

Page 9: Software Support for Advanced Computing Platforms

Transactional Memory Overview

Thread 1:

begin_transaction
    x = x + 1
    y = y + x
    if (x < 10)
        z = x;
    else
        z = y;
end_transaction

Thread 2:

begin_transaction
    x = x - 1
    y = y - x
    if (x > 10)
        z = x;
    else
        z = y;
end_transaction

Each thread sees either all or none of the other thread's updates.

Basic mechanisms: isolation (conflict detection), versioning (maintain versions), and atomicity (commit or rollback).
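The three mechanisms can be sketched in software. This is a minimal illustration under stated assumptions, not the hardware scheme the talk refers to: `Store`, `atomically`, and the initial values are hypothetical, and conflict detection is done with a single global version counter rather than per-location tracking.

```python
import threading


class Store:
    """Shared variables guarded by a version counter (hypothetical helper)."""

    def __init__(self, **values):
        self._values = dict(values)
        self._version = 0
        self._lock = threading.Lock()

    def snapshot(self):
        # Versioning: hand each transaction a private copy plus its version.
        with self._lock:
            return self._version, dict(self._values)

    def try_commit(self, read_version, new_values):
        # Isolation: a changed version means another transaction committed.
        with self._lock:
            if self._version != read_version:
                return False          # conflict detected, caller rolls back
            self._values = new_values  # atomicity: all updates appear at once
            self._version += 1
            return True


def atomically(store, body):
    """Run body on a private copy; retry until it commits without conflict."""
    while True:
        version, local = store.snapshot()
        body(local)
        if store.try_commit(version, local):
            return


def thread1(s):
    s["x"] += 1
    s["y"] += s["x"]
    s["z"] = s["x"] if s["x"] < 10 else s["y"]


def thread2(s):
    s["x"] -= 1
    s["y"] -= s["x"]
    s["z"] = s["x"] if s["x"] > 10 else s["y"]


store = Store(x=5, y=0, z=0)
workers = [threading.Thread(target=atomically, args=(store, body))
           for body in (thread1, thread2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

final_version, final = store.snapshot()
print(final)
```

Whichever transaction commits first, the result matches one of the two serial orders: `x` ends at 5 and `y` at 1, while `z` is 1 or 5 depending on the order. A partially applied transaction is never visible.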

Page 10: Software Support for Advanced Computing Platforms

Implications for Application Development and Performance

• Fundamental changes in the entire application stack

– Programming paradigms (models of concurrency)

– Software support (compilers, OS)

– Library support (application kernels)

– Runtime systems and performance monitoring (performance bottlenecks and alleviation)

– Analysis techniques (scaling to the extreme)

Page 11: Software Support for Advanced Computing Platforms

Ongoing work at Purdue / collaborators – A Bird's-eye View

(Collaborators: Intel -- Compilers, Libraries, UMN -- Analysis Techniques, EPFL -- Programming Paradigms)

Programming Models: What are appropriate concurrency abstractions?

– When is communication good?

– How do we deal with the spectrum of coherence models seamlessly?

– How do we use transactions in real programs (I/O and networks are not transactional)?

Page 12: Software Support for Advanced Computing Platforms

Programming Models: The Mediera Environment

– Define domains of identical coherence models.

– Build slack into concurrency.

– View other cores as intelligent caches.

– Use an LRU-type strategy to swap out threads across cores.

– Support for algorithmic asynchrony.

A number of important issues need to be resolved relating to mixed models -- messaging overhead associated with swapped out threads, resource bounds, livelock, priority inversion.

Page 13: Software Support for Advanced Computing Platforms

Library Support

• Building optimized multicore libraries for important computational kernels (sparse algebra, quantum scale – MD methods) / Intel MKE.

• Novel algorithms for memory-constrained platforms (excess FLOPS, instead of excess memory accesses).

• Demonstrated application performance (model reduction, nano-scale modeling).

• Comprehensive benchmarking of platforms (DARPA/HPCS pilot study) with a view to identifying performance bottlenecks and desirable application characteristics.

Page 14: Software Support for Advanced Computing Platforms

Analysis Techniques

How do we analyze programs over a large number of cores?

• Isoefficiency metric

– Scaling problem size with number of cores to maintain performance.

• Memory constrained scaling

– Quantifying drop in performance with increase in number of cores while operating at peak memory

• Impact of limited bandwidth

– Increasing number of cores implies lower bandwidth at each core
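The isoefficiency idea can be made concrete with a textbook example that is not taken from the slides: summing n numbers on p cores, with an assumed total parallel overhead of 2 p log2(p). Efficiency is E = W / (W + T_o), so holding E fixed requires the problem size W to grow as K * T_o with K = E / (1 - E). The function names below are hypothetical.

```python
import math


def overhead(p):
    """Assumed total overhead for a p-core reduction: 2 * p * log2(p)."""
    return 2 * p * math.log2(p)


def efficiency(n, p):
    """E = W / (W + T_o), with work W = n unit operations."""
    return n / (n + overhead(p))


def isoefficient_size(p, target):
    """Problem size needed on p cores to hold efficiency at `target`."""
    k = target / (1 - target)
    return k * overhead(p)


for p in (4, 16, 64):
    n = isoefficient_size(p, 0.8)
    print(p, n, efficiency(n, p))
```

With a fixed problem size, efficiency falls as cores are added (compare `efficiency(64, 4)` with `efficiency(64, 16)`). Growing n in proportion to p log2(p) keeps it constant, which is the scaling rule the isoefficiency metric captures.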

Page 15: Software Support for Advanced Computing Platforms

Technical Objective

To develop the next generation software environment for scalable chip-multiprocessor systems, along with library support and validating applications.

Page 16: Software Support for Advanced Computing Platforms

Software Environments for Embedded Systems

Setting of calibration tests

Page 17: Software Support for Advanced Computing Platforms

Programming Scalable Systems

• The traditional approach to distributed programming involves writing “network-enabled” programs for each node

– The program encodes distributed system behavior using complex messaging between nodes

– This paradigm raises several issues and limitations:

    • Program development is time consuming

    • Programs are error prone and difficult to debug

    • Lack of a distributed behavior specification, which precludes verification

    • Limitations with respect to scalability, heterogeneity and performance

Page 18: Software Support for Advanced Computing Platforms

Programming Scalable Systems

• Macroprogramming entails direct specification of the distributed system behavior in contrast to programming individual nodes

• Provides:

– Seamless support for heterogeneity

    • Uniform programming platform

    • Node capability-aware abstractions

    • Performance scaling

– Separating the application from system-level details

– Scalability and adaptability with network & load dynamics

– Validation of behavioral specification

Page 19: Software Support for Advanced Computing Platforms

Technical Objective

To develop a second generation operating system suite that facilitates rapid macroprogramming of efficient self-organized distributed applications for scalable embedded systems

Page 20: Software Support for Advanced Computing Platforms

Ongoing Work: The CosmOS System Suite for Embedded Environments

• CosmOS Components:

– Programming model, compilation techniques

– Device independent node operating system interfaces and implementations

– Network operating system

Page 21: Software Support for Advanced Computing Platforms

CosmOS Programming Model

• Macroprogram consists of:

    • Distributed system behavioral specification

    • Constraints associated with mapping behavioral specification to physical system

• Behavioral Specification

– Functional Components (FCs)

    • Represents a specific data processing function

    • Typed input and output interface

– Interaction Assignment (IA)

    • Directed graph that specifies data flow through FCs

    • Data source and sinks are (logical) device ports

Page 22: Software Support for Advanced Computing Platforms

CosmOS Program Validation

• Statically type-checked interaction assignment

    • The output of a component can be connected to the input of another only if their types match

• Functional components represent a deterministic data processing function

    • The output sequence depends only on the inputs to the FC

• Correctness

    • Given input at each source in the IA the outputs at sinks are deterministically known
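The type-matching rule for an interaction assignment can be sketched as a check over the edges of a directed graph. This is an illustrative sketch only, not the CosmOS compiler: the `Component` class, the edge encoding, and `check_edges` are hypothetical, and the component signatures are borrowed from the example program later in the deck (with `*` read as "accepts any type").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A device port or FC with typed input and output interfaces."""
    name: str
    inputs: tuple   # accepted type per input port; "*" accepts any type
    outputs: tuple  # produced type per output port


def check_edges(components, edges):
    """Return one message per edge whose producer and consumer types differ."""
    errors = []
    for (src, out_port), (dst, in_port) in edges:
        produced = components[src].outputs[out_port]
        accepted = components[dst].inputs[in_port]
        if accepted != "*" and accepted != produced:
            errors.append(
                f"{src}.out[{out_port}] ({produced}) -> "
                f"{dst}.in[{in_port}] ({accepted})"
            )
    return errors


components = {
    "photo": Component("photo", inputs=(), outputs=("raw_t",)),
    "thresh": Component("thresh", inputs=("raw_t",), outputs=("raw_t",)),
    "avg": Component("avg", inputs=("raw_t", "avg_t"), outputs=("avg_t",)),
    "fs": Component("fs", inputs=("*",), outputs=()),
}

good_edges = [
    (("photo", 0), ("thresh", 0)),
    (("thresh", 0), ("avg", 0)),
    (("avg", 0), ("fs", 0)),
    (("avg", 0), ("avg", 1)),     # feedback of avg_t into the second input
]
bad_edges = [(("avg", 0), ("thresh", 0))]  # avg_t fed into a raw_t input

print(check_edges(components, good_edges))
print(check_edges(components, bad_edges))
```

Because the check only inspects declared interface types, it can run before deployment, which is what makes the validation static.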

Page 23: Software Support for Advanced Computing Platforms

CosmOS Functional Components

• Elementary unit of execution

– Isolated from the state of the system and other FCs

– Uses only stack variables and statically assigned state memory

– Asynchronous execution: data flow and control flow handled by CosmOS

• Static memory

– Prevents non-deterministic behavior due to malloc failures

– Leads to a lean memory management system in the OS

• Reusable components

– The only interaction is via typed interfaces

• Dynamically loadable components

– Runtime updates possible

[Figure: the Average FC, with inputs raw_t and avg_t and output avg_t]

Page 24: Software Support for Advanced Computing Platforms

CosmOS Program Specification

• Sections:– Enumerations

– Declarations

– Mapping constraints

– IA Description

Page 25: Software Support for Advanced Computing Platforms

CosmOS Program: An Example

%photo : device = PHOTO_SENSOR, out [ raw_t ];
%fs : device = FILE_DUMP, in [ * ];
%avg : { fcid = FCID_AVG, in [ raw_t, avg_t ], out [ avg_t ] };
%thresh : { fcid = FCID_THRESH, in [ raw_t ], out [ raw_t ] };
@ snode = CAP_PHOTO_SENSOR : photo, thresh;
@ fast_m = CAP_FAST_CPU : avg;
@ server = CAP_FS | CAP_UNIQUE_SERVER : avg, fs;
start_ia
timer(100) photo(1);
photo(1) thresh(2,0,500);
thresh(2,0) avg(3,0,10), avg(4,0,100);
avg(3,0) fs(5) | avg(3,1);
avg(4,0) fs(6) | avg(4,1);
end_ia

[Figure: data-flow graph of the example program. Components: T(t), P(), Threshold(500), Average(10), Average(100), and two FS sinks; edges carry raw_t and avg_t]
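The data flow of the example program can be mimicked with ordinary generators. This is a rough sketch with several assumptions that the slides do not confirm: that the threshold component passes readings above its limit, that the averaging component emits one mean per full block of readings, and that the sample readings are representative. The avg_t feedback edge of the real program is not modeled, and the function names are hypothetical.

```python
def threshold(readings, limit):
    """Pass on only readings above `limit` (direction is an assumption)."""
    return (r for r in readings if r > limit)


def average(readings, window):
    """Emit the mean of each full block of `window` readings (assumption)."""
    block = []
    for r in readings:
        block.append(r)
        if len(block) == window:
            yield sum(block) / window
            block.clear()


# Made-up sensor readings standing in for the photo sensor output.
raw = [400, 600, 700, 450, 800, 900, 550, 650]

passed = list(threshold(raw, 500))
short_window = list(average(passed, 2))   # stands in for Average(10)
long_window = list(average(passed, 3))    # stands in for Average(100)

print(passed)
print(short_window)
print(long_window)
```

Each stage depends only on its input sequence, so the outputs at the sinks are fully determined by the source readings, mirroring the determinism property claimed for functional components.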

Page 26: Software Support for Advanced Computing Platforms

CosmOS: Runtime System

[Figure: the example data-flow graph with components T(t), P(), Threshold(500), Average(10), Average(100), and FS; edges carry raw_t and avg_t]

Page 27: Software Support for Advanced Computing Platforms

CosmOS: Runtime System

• Provides a low-footprint execution environment for CosmOS programs

• Key components

– Data flow and control flow

– Locking and concurrency

– Load conditioning

– Routing primitives

Page 28: Software Support for Advanced Computing Platforms

CosmOS Node Operating System

[Figure: node operating system architecture. Labels: Updateable User space; Static OS Kernel; Platform Independent Kernel; App FC (three instances); Services; HW Drivers; Hardware Abstraction Layer]

Page 29: Software Support for Advanced Computing Platforms

CosmOS: Current Status

• Fully functional implementations for Mica2 and POSIX (on Linux)

• Mica2:

    • Non-preemptive function pointer scheduler

    • Dynamic memory management

• POSIX:

    • Multi-threading using POSIX threads and underlying scheduler

    • The OS exists as library calls and a single management thread

Page 30: Software Support for Advanced Computing Platforms

CosmOS: Current Status

• Comprehensively evaluated and validated

• Alpha releases can be freely downloaded from:

http://www.cs.purdue.edu/~awan/cosmos/

Page 31: Software Support for Advanced Computing Platforms

CosmOS Validation

[Figure: pilot deployment network. Labels: ECN Net; Internet; 802.11b Peer-to-Peer; FM 433MHz]

Laser attached via serial port to Stargate computers

MICA2 motes with ADXL 202

Currently laser readings can be viewed from anywhere over the Internet (conditioned on firewall settings)

Pilot deployment at BOWEN labs

Page 32: Software Support for Advanced Computing Platforms

CosmOS: Ongoing Work

• Semantics of the CosmOS Programming Model

• GUI for Interaction Assignment

• Library of modules

• Large-scale deployment and scalability studies

• Application-specific optimizations.

Page 33: Software Support for Advanced Computing Platforms

Thank you!

For papers and talks on these topics, please visit:

http://www.cs.purdue.edu/pdsl