Carnegie Mellon
Meet Stevephanie, the Computer
Spiral: Automatic Library Generation
Franz Franchetti
Electrical and Computer Engineering
Carnegie Mellon University
UW MSR Institute 2008, August 6th: “The Concurrency Challenge”
High performance is hard even on “simple” architectures
Current Solution
Legions of programmers implement and optimize the same functionality for every platform and whenever a new platform comes out.
Better: Automatic Performance Tuning
Automate (parts of) the implementation or optimization
Research efforts
Linear algebra: Phipac/ATLAS, LAPACK, Sparsity/Bebop/OSKI, Flame
Tensor computations (TCE)
PDE/finite elements: Fenics
Adaptive sorting
Fourier transform: FFTW, UHFFT
Linear transforms: Spiral
…
Proceedings of the IEEE special issue, Feb. 2005
Promising new area but more work needed
In particular for parallelism …
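A minimal sketch of what empirical performance tuning can look like, in Python: time several interchangeable implementations on the target machine and keep the fastest. The function names (`autotune`, `sum_squares_loop`, `sum_squares_builtin`) and the toy kernel are illustrative assumptions, not taken from any of the projects listed above.

```python
import time


def autotune(variants, make_input, repeats=5, timer=time.perf_counter):
    """Time each variant and return (name of the fastest, all best times)."""
    timings = {}
    for name, func in variants.items():
        best = float("inf")
        for _ in range(repeats):
            data = make_input()          # fresh input for every run
            start = timer()
            func(data)
            best = min(best, timer() - start)
        timings[name] = best             # keep the best of `repeats` runs
    fastest = min(timings, key=timings.get)
    return fastest, timings


# Two hypothetical implementations of the same functionality.
def sum_squares_loop(xs):
    total = 0
    for x in xs:
        total += x * x
    return total


def sum_squares_builtin(xs):
    return sum(x * x for x in xs)


VARIANTS = {"loop": sum_squares_loop, "builtin": sum_squares_builtin}

winner, times = autotune(VARIANTS, lambda: list(range(10_000)))
print("fastest variant on this machine:", winner)
```

Which variant wins depends on the machine and the interpreter, which is the reason for measuring rather than guessing.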
Spiral
Library generator for linear transforms (DFT, DCT, DWT, filters, …) and recently more …
Wide range of platforms supported: scalar, fixed point, vector, parallel, Verilog, GPU
Research Goal: “Teach” computers to write fast libraries
Complete automation of implementation and optimization
Conquer the “high” algorithm level for automation
When a new platform comes out: Regenerate a retuned library
When a new platform paradigm comes out (e.g., CPU+GPU): Update the tool rather than rewriting the library
Intel is using Spiral to generate parts of their MKL and IPP libraries
Vision Behind Spiral
[Figure: two pipelines, Current and Future, each going from numerical problem through algorithm selection, implementation, and compilation to computing platform; steps are marked as human effort or automated, with a C program between implementation and compilation.]
C code a singularity: Compiler has no access to high level information
Challenge: conquer the high abstraction level for complete automation
Main Idea: Joint Mathematical Abstraction
Model: common abstraction = spaces of matching formulas
[Figure: the architecture space and the algorithm space are each mapped by abstraction into the common model (symbols ν, p, μ); rewriting defines the space of matching formulas and search picks from it (optimization).]
Architectural parameter: vector length, #processors, …
Kernel: problem size, algorithm choice
How Spiral Works
Spiral: Complete automation of the implementation and optimization task
Basic idea:
Declarative representation of algorithms
Rewriting systems to generate and optimize algorithms
[Figure: Spiral pipeline. Problem specification (transform), then Algorithm Generation and Algorithm Optimization (output: algorithm), then Implementation and Code Optimization (output: C code), then Compilation and Compiler Optimizations (output: fast executable). A Search block controls the stages and receives performance as feedback.]
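The two ingredients named above, a declarative representation of algorithms and a rewriting system that generates them, can be sketched in a few lines of Python. This is an illustration only, not Spiral's implementation: formulas are plain tuples, the single breakdown rule is the Cooley-Tukey FFT written in SPL-like notation (chosen here as a familiar example), and all function names are hypothetical.

```python
def factor_pairs(n):
    """All ordered pairs (m, k) with m * k == n and m, k > 1."""
    return [(m, n // m) for m in range(2, n) if n % m == 0]


def expand(n):
    """Yield every fully expanded formula for DFT(n) under one rule.

    Rule (Cooley-Tukey, SPL-like notation):
        DFT(mk) -> (DFT(m) (x) I(k)) . T(mk,k) . (I(m) (x) DFT(k)) . L(mk,m)
    A DFT whose size has no nontrivial factorization stays a terminal.
    """
    pairs = factor_pairs(n)
    if not pairs:
        yield ("DFT", n)
        return
    for m, k in pairs:
        for left in expand(m):
            for right in expand(k):
                yield ("compose",
                       ("tensor", left, ("I", k)),
                       ("T", n, k),
                       ("tensor", ("I", m), right),
                       ("L", n, m))


def show(f):
    """Render a formula tuple as a readable string."""
    tag = f[0]
    if tag == "compose":
        return "[" + " . ".join(show(g) for g in f[1:]) + "]"
    if tag == "tensor":
        return "(" + show(f[1]) + " (x) " + show(f[2]) + ")"
    return tag + "(" + ",".join(str(a) for a in f[1:]) + ")"


# The space of algorithms that the one rule spans for DFT(16).
for formula in expand(16):
    print(show(formula))
```

Each tuple is one candidate algorithm. A search step could then pick among such candidates based on measured performance.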
Generating, Not Finding Parallelism
Span space of false‐sharing free programs for empirical tuning
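To make "false-sharing free" concrete: two workers share falsely when they write different elements that sit in the same cache line. The Python sketch below checks that property for a given work partition. It assumes the array starts on a cache-line boundary and that `n` is divisible by `p`; `mu` (cache line length in elements) and the function names are assumptions made for illustration.

```python
def block_partition(n, p):
    """Worker i owns one contiguous chunk of n // p elements."""
    size = n // p
    return [range(i * size, (i + 1) * size) for i in range(p)]


def cyclic_partition(n, p):
    """Worker i owns elements i, i + p, i + 2p, ..."""
    return [range(i, n, p) for i in range(p)]


def has_false_sharing(partition, mu):
    """True if some cache line (mu consecutive elements) has two writers."""
    owner = {}
    for worker, indices in enumerate(partition):
        for idx in indices:
            line = idx // mu                      # cache line of this element
            if owner.setdefault(line, worker) != worker:
                return True
    return False


n, p, mu = 64, 4, 4
print("block :", has_false_sharing(block_partition(n, p), mu))
print("cyclic:", has_false_sharing(cyclic_partition(n, p), mu))
```

A generator that only emits partitions passing this check leaves empirical tuning to choose among candidates that are all free of false sharing.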
Going Beyond Transforms
Transform = linear operator with one vector input and one vector output
Key ideas:
Generalize to (possibly nonlinear) operators with several inputs and several outputs
Generalize SPL (including tensor product) to OL (operator language)
Generalize rewriting systems for parallelizations
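As a reminder of what a tensor-product formula in SPL denotes, the Python sketch below checks numerically that the Cooley-Tukey factorization DFT(km) = (DFT(k) ⊗ I(m)) · T · (I(k) ⊗ DFT(m)) · L equals the DFT matrix. The slides do not spell out this formula; it is used here as a standard example, with T the diagonal twiddle matrix and L the stride permutation. Matrices are nested lists and all helper names are hypothetical.

```python
import cmath


def dft(n):
    """DFT matrix with entries w^(r*c), w = exp(-2*pi*i/n)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (r * c) for c in range(n)] for r in range(n)]


def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def twiddle(k, m):
    """Diagonal matrix with entry w^(a*c) at position a*m + c, w = exp(-2*pi*i/(k*m))."""
    n = k * m
    w = cmath.exp(-2j * cmath.pi / n)
    diag = [w ** (a * c) for a in range(k) for c in range(m)]
    return [[diag[r] if r == c else 0 for c in range(n)] for r in range(n)]


def stride_perm(k, m):
    """Permutation that gathers the input at stride k: y[a*m + b] = x[b*k + a]."""
    n = k * m
    perm = [[0] * n for _ in range(n)]
    for a in range(k):
        for b in range(m):
            perm[a * m + b][b * k + a] = 1
    return perm


def cooley_tukey(k, m):
    """(DFT(k) (x) I(m)) . T . (I(k) (x) DFT(m)) . L as an explicit matrix."""
    f = kron(dft(k), identity(m))
    f = matmul(f, twiddle(k, m))
    f = matmul(f, kron(identity(k), dft(m)))
    return matmul(f, stride_perm(k, m))


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# The factorization reproduces the DFT matrix up to rounding error.
for k, m in [(2, 4), (4, 2), (3, 2)]:
    assert max_abs_diff(cooley_tukey(k, m), dft(k * m)) < 1e-9
```

Building dense matrices is only for checking the identity; the point of the formula is that each factor maps to a simple loop or permutation in generated code.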
Expressing Kernels as Operator Formulas
Viterbi Decoder
Matrix-Matrix Multiplication