Page 1: Optimized Hybrid Scaled Neural Analog Predictor

Optimized Hybrid Scaled Neural Analog Predictor

Daniel A. Jiménez

Department of Computer Science, The University of Texas at San Antonio

Page 2: Optimized Hybrid Scaled Neural Analog Predictor

Branch Prediction with Perceptrons


Page 3: Optimized Hybrid Scaled Neural Analog Predictor

Branch Prediction with Perceptrons (cont.)


Page 4: Optimized Hybrid Scaled Neural Analog Predictor


SNP/SNAP [St. Amant et al. 2008]

A version of piecewise linear neural prediction [Jiménez 2005]

Based on perceptron prediction

SNAP is a mixed digital/analog version of SNP

Uses an analog circuit for the costly dot-product operation

Enables interesting tricks, e.g. scaling

Page 5: Optimized Hybrid Scaled Neural Analog Predictor


Weight Scaling

Scaling weights by coefficients

Different history positions have different importance!

Page 6: Optimized Hybrid Scaled Neural Analog Predictor


The Algorithm: Parameters and Variables

C – array of scaling coefficients

h – the global history length

H – a global history shift register

A – a global array of previous branch addresses

W – an n × (h + 1) array of small integers

θ – a threshold to decide when to train
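
As a rough illustration, these parameters could be declared as follows in a C software model. The sizes N and GHL below are placeholders, not the tuned configuration of the actual predictor:

```c
#include <stdbool.h>

#define N   512   /* number of weight rows (n); illustrative placeholder */
#define GHL 128   /* global history length (h); illustrative placeholder */

float       C[GHL + 1];      /* scaling coefficients, one per history position */
bool        H[GHL];          /* global history shift register */
unsigned    A[GHL];          /* addresses of the previous GHL branches */
signed char W[N][GHL + 1];   /* n x (h + 1) array of small integer weights */
float       theta;           /* threshold that decides when to train */
```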

Page 7: Optimized Hybrid Scaled Neural Analog Predictor


The Algorithm: Making a Prediction

Weights are selected based on the address of the current branch and the address of the ith most recent branch
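
A minimal sketch of the prediction step, assuming the declarations above. The row hash (pc ^ A[i-1]) % N is illustrative only, and the real design computes the summation in the analog domain rather than in a loop:

```c
#include <stdbool.h>

#define N   512
#define GHL 128

extern float       C[GHL + 1];
extern bool        H[GHL];
extern unsigned    A[GHL];
extern signed char W[N][GHL + 1];

/* Compute the scaled dot product; predict taken if it is nonnegative. */
bool predict(unsigned pc, float *output)
{
    /* Bias weight, selected by the current branch alone. */
    float sum = C[0] * (float)W[pc % N][0];

    /* Each correlating weight is selected by combining the current branch
       address with the address of the ith most recent branch. */
    for (int i = 1; i <= GHL; i++) {
        unsigned row = (pc ^ A[i - 1]) % N;
        float w = H[i - 1] ? (float)W[row][i] : -(float)W[row][i];
        sum += C[i] * w;    /* scale by the position's coefficient */
    }
    *output = sum;
    return sum >= 0.0f;
}
```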

Page 8: Optimized Hybrid  Scaled Neural Analog Predictor

The Algorithm: Training

If the prediction is wrong or |output| ≤ θ then

For the ith correlating weight used to predict this branch:

Increment it if the branch outcome matches the outcome of the ith branch in the history

Decrement it otherwise

Increment the bias weight if branch is taken

Decrement otherwise
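
A minimal sketch of this training rule in the same assumed C model; saturating arithmetic on the small integer weights is elided for brevity:

```c
#include <stdbool.h>
#include <math.h>

#define N   512
#define GHL 128

extern bool        H[GHL];
extern unsigned    A[GHL];
extern signed char W[N][GHL + 1];
extern float       theta;

/* Train only on a misprediction or when |output| <= theta. */
void train(unsigned pc, bool taken, bool predicted, float output)
{
    if (predicted != taken || fabsf(output) <= theta) {
        /* Bias weight: push toward the branch outcome. */
        W[pc % N][0] += taken ? 1 : -1;

        /* Correlating weights: increment on agreement with the ith
           history outcome, decrement otherwise. */
        for (int i = 1; i <= GHL; i++) {
            unsigned row = (pc ^ A[i - 1]) % N;
            W[row][i] += (H[i - 1] == taken) ? 1 : -1;
        }
    }
}
```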


Page 9: Optimized Hybrid Scaled Neural Analog Predictor

SNP/SNAP Datapath


Page 10: Optimized Hybrid Scaled Neural Analog Predictor


Tricks

Use alloyed [Skadron 2000] global and per-branch history

Separate table of local perceptrons

Output from this stage is multiplied by an empirically determined coefficient

Training coefficients vector(s) (sketched after this list)

Multiple vectors initialized to f(i) = 1 / (A + B × i)

Minimum coefficient value determined empirically

Indexed by branch PC

Each vector trained with perceptron-like learning on-line
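
A sketch of the coefficient-vector initialization. A, B, NVECS, and the minimum value are empirically tuned in the real predictor, so the constants below are placeholders; the online perceptron-like training of the vectors is omitted:

```c
#define GHL   128
#define NVECS 8      /* assumed number of coefficient vectors */

float coeff[NVECS][GHL + 1];

/* Initialize every vector to f(i) = 1 / (A + B*i), clamped below by
   MIN_COEFF. All three constants are illustrative placeholders. */
void init_coefficients(void)
{
    const float A = 1.0f, B = 0.5f, MIN_COEFF = 0.05f;
    for (int v = 0; v < NVECS; v++)
        for (int i = 0; i <= GHL; i++) {
            float f = 1.0f / (A + B * (float)i);
            coeff[v][i] = (f > MIN_COEFF) ? f : MIN_COEFF;
        }
}

/* At prediction time, a vector would be chosen by hashing the branch PC,
   e.g. coeff[pc % NVECS]. */
```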

Page 11: Optimized Hybrid Scaled Neural Analog Predictor

Tricks (2)

Branch cache

Highly associative cache with entries for branch information

Each entry contains:

A partial tag for this branch PC

The bias weight for this branch

An “ever taken” bit

A “never taken” bit

The “ever/never” bits avoid needless use of weight resources

The bias weight is protected from destructive interference

LRU replacement

>99% hit rate
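
One way to picture an entry, as a hedged C sketch; the field widths are illustrative, not the actual hardware budget:

```c
#include <stdbool.h>

/* Sketch of one branch cache entry as enumerated above. */
struct branch_cache_entry {
    unsigned short partial_tag;  /* partial tag from the branch PC */
    signed char    bias;         /* bias weight, kept free of interference */
    bool           ever_taken;   /* set once the branch is observed taken */
    bool           never_taken;  /* set while the branch has never been taken */
};
```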


Page 12: Optimized Hybrid Scaled Neural Analog Predictor

Tricks (3)

Hybrid predictor

When the perceptron output is below some threshold (see the sketch after this list):

If a 2-bit counter gshare predictor has high confidence, use it

Else use a 1-bit counter PAs predictor

Multiple θs indexed by branch PC

Each trained adaptively [Seznec 2005]

Ragged array

Not all rows of the matrix are the same size
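
A minimal sketch of the hybrid selection rule; the gshare and PAs helpers are assumed interfaces for illustration, not the paper's actual component names:

```c
#include <stdbool.h>
#include <math.h>

/* Assumed helper predictors. */
extern bool gshare_predict(unsigned pc);
extern bool gshare_confident(unsigned pc);  /* 2-bit counter in a strong state */
extern bool pas_predict(unsigned pc);       /* 1-bit counter PAs predictor */

/* Fall back to the simpler predictors only when the perceptron sum is
   too small to be trusted. */
bool hybrid_predict(unsigned pc, float output, float theta)
{
    if (fabsf(output) >= theta)
        return output >= 0.0f;              /* trust the neural output */
    if (gshare_confident(pc))
        return gshare_predict(pc);
    return pas_predict(pc);
}
```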


Page 13: Optimized Hybrid Scaled Neural Analog Predictor

Benefit of Tricks


Graph shows the effect of each trick in isolation

Training the coefficients yields the most benefit

Page 14: Optimized Hybrid Scaled Neural Analog Predictor


References

Jiménez & Lin, HPCA 2001 (perceptron predictor)

Jiménez & Lin, TOCS 2002 (global/local perceptron)

Jiménez ISCA 2005 (piecewise linear branch predictor)

Skadron, Martonosi & Clark, PACT 2000 (alloyed history)

Seznec 2005 (adaptively trained threshold)

St. Amant, Jiménez & Burger, MICRO 2008 (SNP/SNAP)

McFarling 1993 (gshare)

Yeh & Patt 1991 (PAs)

Page 15: Optimized Hybrid Scaled Neural Analog Predictor


The End

