Node Architecture Implications for In-Memory Data Analytics on Scale-in Clusters

Ahsan Javed Awan

Apr 11, 2017

Transcript
Page 1

Node Architecture Implications for In-Memory Data Analytics on Scale-in Clusters

Ahsan Javed Awan

Page 2

Motivation: About me

● Erasmus Mundus Joint Doctoral Fellow at KTH Sweden and UPC Spain.
● Visiting Researcher at the Barcelona Supercomputing Center.
● Speaker at Spark Summit Europe 2016.
● Wrote the Licentiate Thesis “Performance Characterization of In-Memory Data Analytics with Apache Spark”.
● https://www.kth.se/profile/ajawan/

Page 3

Motivation: Why should we care about architecture support?

Page 4

Motivation (cont.)

*Source: SGI

● Exponential increase in core count.
● A mismatch between the characteristics of emerging big data workloads and the underlying hardware.
● Newer promising technologies (Hybrid Memory Cubes, NVRAM, etc.).

Prior characterization studies:
● Clearing the clouds, ASPLOS '12
● Characterizing data analysis workloads, IISWC '13
● Understanding the behavior of in-memory computing workloads, IISWC '14

Page 5

Motivation (cont.)

Scale-in: fewer nodes of powerful machines

*Source: http://navcode.info/2012/12/24/cloud-scaling-schemes/

[Figure: single-node frameworks (Phoenix++, Metis, Ostrich, etc.) vs. scale-out frameworks (Hadoop, Spark, Flink, etc.), the latter being our focus]

Page 6

Which Scale-out Framework?

[Picture courtesy: Amir H. Payberah]

● Tuning of Spark internal parameters (see the configuration sketch below).
● Tuning of JVM parameters (heap size, etc.).
● Micro-architecture level analysis using hardware performance counters.
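A minimal sketch of what this tuning looks like at the application level, assuming a Spark 2.x Scala job: the property names are standard Spark configuration keys, but the specific values are illustrative rather than the settings measured in this work. Hardware performance counters are read outside Spark (e.g. with Linux perf), so they do not appear in the code.

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    object TuningSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("Spark tuning sketch")
          // Spark internal parameters (illustrative values)
          .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
          .set("spark.shuffle.compress", "true")
          // JVM parameters: executor heap size and garbage-collector flags
          .set("spark.executor.memory", "24g")
          .set("spark.executor.extraJavaOptions", "-XX:+UseParallelGC -XX:NewRatio=2")

        val spark = SparkSession.builder().config(conf).getOrCreate()
        // ... run the benchmark workload here ...
        spark.stop()
      }
    }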

Page 7

Which Benchmarks?

Page 8

Multi-core Scalability of Apache Spark?

Page 9

Multicore Scalability of Spark: The Problem of GC?

Page 10

Multicore Scalability of Spark: Impact of NUMA Awareness?

Page 11

Multicore Scalability of Spark: Effectiveness of Hyper-Threading?

Page 12

Multicore Scalability of Spark: Efficacy of Existing Prefetchers?

Page 13

Our Approach: 2D PIM vs. 3D-Stacked PIM

High-bandwidth memories are not required for Spark.

Page 14

Multicore Scalability of Spark: The Problem of File I/O?

Page 15

Our Approach: Use Near-Data Computing Architectures

● Implications of In-Memory Data Analytics with Apache Spark on Near Data Computing Architectures (under submission)

Page 16

Conclusions

● We advise using executors with a memory size of at most 32 GB and restricting each executor to NUMA-local memory (see the configuration sketch after this list).

● We recommend enabling hyper-threading, disabling the next-line L1-D and adjacent-cache-line L2 prefetchers, and lowering the DDR3 speed to 1333 MT/s.

● We also envision processors with 6 hyper-threaded cores per socket, without the L1-D next-line and adjacent-cache-line L2 prefetchers.

● The use of high-bandwidth memories such as Hybrid Memory Cubes is not justified for in-memory data analytics with Spark.
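As a concrete illustration of the first recommendation, here is a minimal sketch assuming a Spark 2.x Scala job; the values are illustrative assumptions, not the exact experimental setup. Capping the executor at 32 GB also keeps the JVM within the compressed-oops range. NUMA-local memory binding is not a Spark setting, so it is noted only as a comment (e.g. launching each executor under numactl); the prefetcher and DDR3 settings are BIOS/firmware-level knobs outside the application's control.

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    object RecommendedConfigSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("NUMA-aware in-memory analytics")
          .set("spark.executor.memory", "32g") // keep each executor at or below 32 GB
          .set("spark.executor.cores", "6")    // illustrative: one task slot per core of the envisioned 6-core socket
        // NUMA-local memory is enforced at the OS level rather than through Spark,
        // e.g. by starting each executor under: numactl --cpunodebind=<N> --membind=<N>

        val spark = SparkSession.builder().config(conf).getOrCreate()
        spark.stop()
      }
    }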

Page 17

THANK YOU.

Email: [email protected]
Web: www.kth.se/profile/ajawan/

Acknowledgements: Mats Brorsson (KTH), Vladimir Vlassov (KTH), Eduard Ayguade (UPC/BSC)