Models Cache Oblivious Algorithms Cache Oblivious Data Structures Cache Oblivious Algorithms and Data Structures Theory and Practice Hitesh Ballani Department of Computer Science Cornell University CS 612 31 st March Hitesh Ballani Cache Oblivious Algorithms and Data Structures
Cache Oblivious Algorithms and Data Structures Theory and … · 2005. 5. 10. · Cache Oblivious Data Structures Cache Oblivious Algorithms and Data Structures Theory and Practice
Models
Cache Oblivious Algorithms
Cache Oblivious Data Structures
Cache Oblivious Algorithms and Data Structures
Theory and Practice
Hitesh Ballani
Department of Computer Science
Cornell University
CS 612
31st March
Hitesh Ballani Cache Oblivious Algorithms and Data Structures
Motivation
Memory Hierarchy - A Fact of Life (a.k.a. the essence of CS 612)
Multi-level memory hierarchies are omnipresent
Memory speed ∝ (distance from processor)^−1
Good locality is important for achieving high performance
[Figure: memory hierarchy, from registers through L1, L2 and L3 cache and memory to disk, annotated with SPEED and CAPACITY]
Hardware Parameters: it's a jungle out there
Modern hardware is not uniform - many different parameters
In homework 1, we used X-RAY to measure
  CPU speed
  Instruction Latency/Throughput
  Number of registers
  Special Instructions (e.g. fma)
  Cache Stride/Associativity/Capacity/Line-Size/Hit-Latency
Current programs
  ignore the parameters - poor performance
  determine the parameters
A perfectly balanced binary tree
Comparisons: O(log N)
How to minimize the cache misses?
Static Data Structures
How to minimize the cache misses? (Prokop’99)
Choosing the memory layout
Layout: mapping of the nodes of a tree to memory cells
Different kinds of layouts
  In-order
  Breadth-first
  Depth-first
  van Emde Boas
van Emde Boas Layout: Main Idea
Store recursive subtrees in contiguous memory
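To make "layout" concrete, here is a minimal sketch of the three simpler layouts for a complete binary tree. It is an illustration only, under assumptions not stated on the slide: nodes are named by heap index (root 1, children of i are 2i and 2i+1), "depth-first" is taken to mean pre-order, and the function names are hypothetical. Each function returns the nodes in the order they would sit in memory.

```python
def bfs_layout(height):
    """Breadth-first layout: heap indices 1 .. 2**height - 1 in level order."""
    return list(range(1, 2 ** height))


def dfs_layout(height):
    """Depth-first layout, assumed here to be pre-order:
    a node, then its whole left subtree, then its whole right subtree."""
    order = []

    def visit(node, depth):
        order.append(node)
        if depth < height:  # not yet at the leaf level
            visit(2 * node, depth + 1)
            visit(2 * node + 1, depth + 1)

    visit(1, 1)
    return order


def inorder_layout(height):
    """In-order layout: left subtree, then the node, then right subtree."""
    order = []

    def visit(node, depth):
        if depth < height:
            visit(2 * node, depth + 1)
        order.append(node)
        if depth < height:
            visit(2 * node + 1, depth + 1)

    visit(1, 1)
    return order


if __name__ == "__main__":
    for name, layout in [("breadth-first", bfs_layout),
                         ("depth-first", dfs_layout),
                         ("in-order", inorder_layout)]:
        print(name, layout(3))
```

In all three, a root-to-leaf search soon jumps between memory cells that are far apart, which is the problem the van Emde Boas layout targets.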
van Emde Boas layout
Split the tree at the middle level of edges
One top recursive subtree
∼√N bottom recursive subtrees, each of size ∼√N
Recursively lay out the top and the bottom subtrees
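The recursive split can be sketched in a few lines. This is a sketch under assumptions, not the construction from the cited work: the tree is complete, nodes are named by heap index (root 1, children of i are 2i and 2i+1), and when the height is odd the top subtree gets the smaller half. The name `veb_layout` is hypothetical.

```python
def veb_layout(height, root=1):
    """Return the heap indices of the complete binary subtree with `height`
    levels rooted at `root`, in van Emde Boas memory order."""
    if height == 1:
        return [root]

    top_height = height // 2            # assumption: top gets floor(h / 2) levels
    bottom_height = height - top_height

    # Lay out the top subtree first ...
    order = veb_layout(top_height, root)

    # ... then each bottom subtree, left to right. Their roots are the
    # descendants of `root` exactly `top_height` levels below it.
    first_bottom_root = root << top_height
    for bottom_root in range(first_bottom_root,
                             first_bottom_root + (1 << top_height)):
        order.extend(veb_layout(bottom_height, bottom_root))
    return order


if __name__ == "__main__":
    print(veb_layout(4))
```

For a tree of height 4 the top subtree has 3 nodes and is followed by four bottom subtrees of 3 nodes each, so every recursive subtree occupies a contiguous run of memory cells.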
van Emde Boas layout example
Tree Height = 4
How does this help us?
Search complexity
Recursive subtrees of size at most B ⇒ each lies in at most two contiguous blocks
At most two cache misses for each such subtree
Number of cache misses when searching down log n levels:
(2 log n) / log B = 2 log_B n
Is this Divide and Conquer?
The layout is a kind of divide and conquer
The algorithm is the usual tree-search algorithm
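The block-counting argument can be checked with a small simulation. The sketch below is illustrative only and rests on assumptions not in the slides: a complete tree with heap-indexed nodes, an array that starts on a block boundary, a block holding `block_size` nodes, and an ideal cache in which each distinct block on a search path costs one miss. All function names are hypothetical.

```python
def veb_layout(height, root=1):
    """van Emde Boas order of a complete subtree (top gets floor(h / 2) levels)."""
    if height == 1:
        return [root]
    top_height = height // 2
    bottom_height = height - top_height
    order = veb_layout(top_height, root)
    first = root << top_height
    for bottom_root in range(first, first + (1 << top_height)):
        order.extend(veb_layout(bottom_height, bottom_root))
    return order


def blocks_touched(position, leaf, block_size):
    """Count the distinct blocks on the path from `leaf` up to the root."""
    blocks = set()
    node = leaf
    while node >= 1:
        blocks.add(position[node] // block_size)
        node //= 2  # parent in heap numbering
    return len(blocks)


def worst_case_blocks(layout, height, block_size):
    """Maximum number of distinct blocks over all root-to-leaf paths."""
    position = {node: index for index, node in enumerate(layout)}
    leaves = range(2 ** (height - 1), 2 ** height)
    return max(blocks_touched(position, leaf, block_size) for leaf in leaves)


if __name__ == "__main__":
    height, block_size = 12, 16
    bfs = list(range(1, 2 ** height))  # breadth-first layout for comparison
    print("breadth-first:", worst_case_blocks(bfs, height, block_size))
    print("van Emde Boas:", worst_case_blocks(veb_layout(height), height, block_size))
```

Note that the search itself is unchanged: only the mapping from node to memory position differs between the two runs. With these parameters the van Emde Boas layout should touch fewer blocks in the worst case than the breadth-first one, and the gap would be expected to widen as the tree grows.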
Performance in practice
from Kumar’03
Linux/Itanium/2GB/g++ -O3/48 byte nodes
Performance in practice
Cache Oblivious Search Trees via Binary Trees of Small Height, Brodal et al.’02
Linux, Pentium III 1 GHz, 256 KB cache, 1 GB RAM, 4-byte nodes
Another dose of reality!
Take Home Message
One needs to be careful when putting theory into practice
Some other data structures
  Funnels: Prokop’99
  Dynamic Search Tree: Bender et al.’00
  Packed Memory Structure: Bender et al.’00
  Priority Queue: Arge et al.’02
Summary
Cache Oblivious Algorithms and Data Structures
  Abstract away the hardware parameters
  Can handle varying cache specifics and multi-level memory hierarchies while attaining asymptotic efficiency
A lot of CO algorithms have been developed lately
  most are generalizations of previous external memory algorithms
  main techniques: Divide and Conquer, Recursive Layout
Their innate simplicity holds a lot of promise!
A number of issues not addressed by the theoretical model are critical for performance in practical settings