The Energy Case for Graph Processing on Hybrid Platforms Abdullah Gharaibeh , Lauro Beltrão Costa, Elizeu Santos-Neto and Matei Ripeanu NetSysLab The University of British Columbia http://netsyslab.ece.ubc.ca

Dec 18, 2015

Page 1

The Energy Case for Graph Processing on Hybrid Platforms

Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto and Matei Ripeanu

NetSysLab
The University of British Columbia

http://netsyslab.ece.ubc.ca

Page 2

Graphs are Everywhere

1.4B pages, 6.6B links

1B users, 150B friendships

Page 3

Challenges and Opportunities

Data-dependent memory access patterns

Caches + summary data structures

Large memory footprint as large as 1TB

CPUs

Poor locality

Low compute-to-memory access ratio

Varying degrees of parallelism (both intra- and inter-stage)

Page 4

Challenges and Opportunities

Data-dependent memory access patterns

Caches + summary data structures

Large memory footprint as large as 1TB

CPUs

Poor locality

Assemble a hybrid platform

Massive hardware multithreading

up to 12GB!

GPUs

Low compute-to-memory access ratio

Caches + summary data structures

Varying degrees of parallelism (both intra- and inter-stage)

Page 5

Past Work

Performance Modeling: predicts speedup; intuitive

Totem: a graph processing engine for hybrid systems; applies algorithm-agnostic optimizations

Partitioning Strategies: workload-to-processor matchmaking

A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Graph Processing, Gharaibeh et al., PACT 2012

On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest, Gharaibeh et al., IPDPS 2013

Main outcome: hybrid platforms enable significant performance gains

Page 6

Past Work

Performance Modeling: predicts speedup; intuitive

Totem: a graph processing engine for hybrid systems; applies algorithm-agnostic optimizations

Partitioning Strategies: workload-to-processor matchmaking

Main outcome: hybrid platforms enable significant performance gains

A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Graph Processing, Gharaibeh et al., PACT 2012

On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest, Gharaibeh et al., IPDPS 2013

Focused on time to solution as a success metric

Page 7

Motivating Question

Is it energy efficient to use GPU-accelerated platforms for large-scale graph processing?

Page 8

Evaluation Platform

Characteristic               SandyBridge (Xeon 2650)   Kepler (K20)
Hardware Threads / Proc.     16                        2496
Frequency / Core (MHz)       2000                      705
LLC / Proc. (MB)             20                        2
Main Memory / Proc. (GB)     256                       5
TDP / Proc. (Watts)          95                        225

S = CPU socket, G = GPU

The GPU has more than double the CPU's TDP (225 W vs. 95 W)

Power is measured at the wall AC outlet

Page 9

Evaluation Platform

Characteristic               SandyBridge (Xeon 2650)   Kepler (K20)
Hardware Threads / Proc.     16                        2496
Frequency / Core (MHz)       2000                      705
LLC / Proc. (MB)             20                        2
Main Memory / Proc. (GB)     256                       5
TDP / Proc. (Watts)          95                        225

S = CPU socket, G = GPU

High idle power is due to large DRAM space

The CPU is relatively power efficient at peak utilization!

At peak utilization, RAM consumes as much as dual CPUs!

GPUs are power hungry at peak utilization

Power is measured at the wall AC outlet

Page 10

Challenges and Opportunities

GPU draws a significant amount of power

The workload is irregular and memory-bound

GPU has low idle power (25W)

Offloading to GPU enables faster “race-to-idle”
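
The "race-to-idle" argument can be illustrated with a minimal energy model: over a fixed time window, a configuration draws its active power while working and its idle power afterwards. The function name and all wattages and durations below are hypothetical and chosen only for illustration; they are not measurements from the talk.

```python
def energy_joules(active_watts, idle_watts, busy_seconds, window_seconds):
    """Energy over a fixed window: active power while busy, idle power after."""
    if busy_seconds > window_seconds:
        raise ValueError("work does not finish within the window")
    idle_seconds = window_seconds - busy_seconds
    return active_watts * busy_seconds + idle_watts * idle_seconds


# Hypothetical configurations (assumed numbers, not from the slides).
# "slow" draws less power but is busy for the whole 10 s window.
slow = energy_joules(active_watts=400, idle_watts=200,
                     busy_seconds=10, window_seconds=10)
# "fast" draws more power but finishes in 5 s and idles for the rest.
fast = energy_joules(active_watts=500, idle_watts=200,
                     busy_seconds=5, window_seconds=10)

print(f"slow: {slow} J, fast: {fast} J")
```

Under these assumed numbers the faster configuration uses less energy despite its higher peak draw, because it spends half of the window at idle power.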

Page 11

Evaluation Study

Workloads

Real and synthetic

Large: cannot fit in GPU memory

Workload   |V|      |E|
Twitter    41M      1.5B
UK-Web     105M     3.7B
RMAT27     128M     2.0B
RMAT28     256M     4.0B
RMAT29     512M     8.0B
RMAT30     1,024M   16.0B
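
To see why these workloads cannot fit in GPU memory, the sketch below estimates each graph's footprint against the 5 GB of a K20. It assumes a compressed sparse row (CSR) layout with 8-byte vertex offsets and 4-byte edge IDs, and reads "5 GB" as 5 GiB; the layout, the byte widths, and the helper names are illustrative assumptions, not details given in the slides.

```python
# Assumption: the "5" GB in the platform table is read as 5 GiB.
K20_MEMORY_BYTES = 5 * 2**30


def csr_bytes(num_vertices, num_edges, offset_bytes=8, edge_id_bytes=4):
    """Approximate CSR size: one offset per vertex (+1) and one ID per edge."""
    return (num_vertices + 1) * offset_bytes + num_edges * edge_id_bytes


# (|V|, |E|) as listed in the workload table, with M = 10**6 and B = 10**9.
WORKLOADS = {
    "Twitter": (41 * 10**6, 1_500 * 10**6),
    "UK-Web": (105 * 10**6, 3_700 * 10**6),
    "RMAT27": (128 * 10**6, 2_000 * 10**6),
    "RMAT28": (256 * 10**6, 4_000 * 10**6),
    "RMAT29": (512 * 10**6, 8_000 * 10**6),
    "RMAT30": (1_024 * 10**6, 16_000 * 10**6),
}


def fits_on_gpu(name):
    """True if the estimated footprint fits in a single K20's memory."""
    num_vertices, num_edges = WORKLOADS[name]
    return csr_bytes(num_vertices, num_edges) <= K20_MEMORY_BYTES


for name, (num_vertices, num_edges) in WORKLOADS.items():
    gib = csr_bytes(num_vertices, num_edges) / 2**30
    print(f"{name}: {gib:.1f} GiB, fits on one K20: {fits_on_gpu(name)}")
```

Even before counting per-vertex algorithm state, every listed workload exceeds a single GPU's memory under these assumptions.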

Benchmarks

Breadth-First Search (BFS)

PageRank

Metrics

Raw performance (TEPS)

Raw power (Watts)

Power normalized by processing rate (Watts/TEPS)
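
The metrics above can be sketched as follows. TEPS is traversed edges per second, so Watts/TEPS works out to joules per traversed edge. The function names and the sample run (edge count, duration, and average power) are hypothetical and serve only to show the arithmetic.

```python
def teps(traversed_edges, seconds):
    """Raw performance: traversed edges per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return traversed_edges / seconds


def watts_per_teps(average_watts, traversed_edges, seconds):
    """Power normalized by processing rate; equals joules per traversed edge."""
    return average_watts / teps(traversed_edges, seconds)


# Hypothetical run (assumed numbers): 1.5B edges traversed in 2 s at 400 W.
rate = teps(1_500_000_000, 2.0)
cost = watts_per_teps(400.0, 1_500_000_000, 2.0)

print(f"{rate:.3e} TEPS, {cost:.3e} Watts/TEPS")
```

Lower Watts/TEPS is better: it means less energy is spent for each edge traversed.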

Page 12

Raw Performance

GPU-offloading is useful for large graphs

S = CPU socket, G = GPU

1S1G > 2S

Performance scales with more processors

Page 13

Power Consumption

BFS

1S1G ≤ 2S!

Page 14

Power Consumption

BFS

Load imbalance leads to more variability

Page 15

Power Consumption

BFS PageRank

PageRank draws more power

Page 16

Normalizing by Processing Rate

In most cases, 1S1G > 2S

Page 17

Normalizing by Processing Rate

Energy efficiency scales with more processors

Page 18

Normalizing by Processing Rate

Similar results on PageRank

Page 19

Conclusions

A hybrid configuration is more energy and power efficient than a symmetric one

A “race-to-idle” strategy leads to better energy efficiency

RAM is a major power-consuming component

Page 20

Questions

Code at: netsyslab.ece.ubc.ca

Page 21

Energy-Delay Product (EDP)

Higher relative advantage
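
EDP is conventionally defined as energy multiplied by time to solution, so a faster configuration is rewarded twice: once through lower energy and again through lower delay. The sketch below shows why the relative advantage grows under EDP. The function names and both configurations are hypothetical and are not results from the talk.

```python
def energy(average_watts, seconds):
    """Energy in joules for a run at a given average power."""
    return average_watts * seconds


def edp(average_watts, seconds):
    """Energy-delay product: energy multiplied by time to solution."""
    return energy(average_watts, seconds) * seconds


# Hypothetical configurations as (average watts, seconds); assumed numbers.
baseline = (400.0, 10.0)
faster = (500.0, 5.0)

energy_gain = energy(*baseline) / energy(*faster)
edp_gain = edp(*baseline) / edp(*faster)

print(f"energy advantage: {energy_gain:.1f}x, EDP advantage: {edp_gain:.1f}x")
```

With these assumed numbers, a 2x speedup turns a 1.6x energy advantage into a 3.2x EDP advantage.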