A Deeper Understanding of Spark’s Internals
Aaron Davidson, 07/01/2014
Transcript
Page 1: A deeper-understanding-of-spark-internals

A Deeper Understanding of Spark’s Internals
Aaron Davidson

07/01/2014

Page 2: A deeper-understanding-of-spark-internals

This Talk

•  Goal: Understanding how Spark runs, focus on performance
•  Major core components:
   – Execution Model
   – The Shuffle
   – Caching

Page 4: A deeper-understanding-of-spark-internals

Why understand internals?

Goal: Find number of distinct names per “first letter”

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues(names => names.toSet.size)
  .collect()

Page 5-11: A deeper-understanding-of-spark-internals

Why understand internals?
Goal: Find number of distinct names per “first letter”

(The same slide is built up across these pages, tracing the example data through each step of the program above.)

Input (hdfs:/names):  Andy, Pat, Ahir
After .map(name => (name.charAt(0), name)):  (A, Andy), (P, Pat), (A, Ahir)
After .groupByKey():  (A, [Ahir, Andy]), (P, [Pat])
Inside .mapValues: names.toSet gives (A, Set(Ahir, Andy)), (P, Set(Pat)); .size gives (A, 2), (P, 1)
After .collect():  res0 = [(A, 2), (P, 1)]

Page 12: A deeper-understanding-of-spark-internals

Spark Execution Model

1. Create DAG of RDDs to represent computation
2. Create logical execution plan for DAG
3. Schedule and execute individual tasks

Page 13: A deeper-understanding-of-spark-internals

Step 1: Create RDDs

sc.textFile("hdfs:/names")
map(name => (name.charAt(0), name))
groupByKey()
mapValues(names => names.toSet.size)
collect()

Page 14: A deeper-understanding-of-spark-internals

Step 1: Create RDDs

HadoopRDD
map()
groupBy()
mapValues()
collect()
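
The chain of RDDs that these calls build can be inspected directly; a minimal sketch, assuming a running SparkContext sc (e.g. the spark-shell) and the hypothetical hdfs:/names path from the example:

  // Sketch: print the lineage of RDDs built by the example job.
  val counts = sc.textFile("hdfs:/names")
    .map(name => (name.charAt(0), name))
    .groupByKey()
    .mapValues(names => names.toSet.size)

  // toDebugString lists this RDD and its recursive dependencies
  // (the HadoopRDD, the mapped RDDs, and the ShuffledRDD from groupByKey).
  println(counts.toDebugString)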

Page 15: A deeper-understanding-of-spark-internals

Step 2: Create execution plan

•  Pipeline as much as possible
•  Split into “stages” based on need to reorganize data

[Diagram: the chain HadoopRDD → map() → groupBy() → mapValues() → collect(), with a Stage 1 box drawn around HadoopRDD and map(); the example data flows through it: Andy, Pat, Ahir → (A, Andy), (P, Pat), (A, Ahir) → (A, [Ahir, Andy]), (P, [Pat]) → (A, 2), … → res0 = [(A, 2), ...]]

Page 16: A deeper-understanding-of-spark-internals

Step 2: Create execution plan

•  Pipeline as much as possible
•  Split into “stages” based on need to reorganize data

[Diagram: the same chain, now split into Stage 1 (HadoopRDD, map()) and Stage 2 (groupBy(), mapValues(), collect()); the example data flows through to (A, 2), (P, 1) and res0 = [(A, 2), (P, 1)]]
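
The stage boundary can also be seen programmatically: map() keeps a narrow (one-to-one) dependency on its parent and can be pipelined, while groupByKey() introduces a shuffle dependency that forces a new stage. A small sketch, assuming a SparkContext sc:

  // Sketch: inspect the dependency types at each point in the chain.
  val pairs   = sc.textFile("hdfs:/names").map(name => (name.charAt(0), name))
  val grouped = pairs.groupByKey()

  println(pairs.dependencies)    // e.g. List(OneToOneDependency@...)  -> pipelined within Stage 1
  println(grouped.dependencies)  // e.g. List(ShuffleDependency@...)   -> boundary between Stage 1 and Stage 2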

Page 17: A deeper-understanding-of-spark-internals

Step 3: Schedule tasks

•  Split each stage into tasks
•  A task is data + computation
•  Execute all tasks within a stage before moving on

Page 18: A deeper-understanding-of-spark-internals

Step 3: Schedule tasks

[Table: each task pairs one input split (Data) with the Stage 1 pipeline (Computation):
Task 0: hdfs:/names/0.gz + HadoopRDD, map()
Task 1: hdfs:/names/1.gz + HadoopRDD, map()
Task 2: hdfs:/names/2.gz + HadoopRDD, map()
Task 3: hdfs:/names/3.gz + HadoopRDD, map()]
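
Each of these tasks is one partition of the HadoopRDD plus the Stage 1 computation. A small sketch of how that partition-to-block mapping can be inspected, assuming a SparkContext sc and the four-file layout shown above:

  // Sketch: one partition per input file/block, one Stage 1 task per partition.
  val names = sc.textFile("hdfs:/names")
  println(names.partitions.length)          // e.g. 4 in this example

  // preferredLocations reports the HDFS hosts holding each partition's block;
  // the scheduler tries to run the matching task on one of those hosts.
  names.partitions.foreach { p =>
    println(s"partition ${p.index}: " + names.preferredLocations(p))
  }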

Page 19-28: A deeper-understanding-of-spark-internals

Step 3: Schedule tasks

[Diagram, built up one task at a time across these slides: a timeline of the Stage 1 tasks running on three HDFS nodes. The nodes hold the input blocks (node 1: /names/0.gz, /names/3.gz; node 2: /names/1.gz, /names/2.gz; node 3: /names/2.gz, /names/3.gz). The tasks for /names/0.gz, /names/1.gz, /names/2.gz, and /names/3.gz start one after another, each running the pipelined HadoopRDD and map() computation; the horizontal axis is time.]

Page 29: A deeper-understanding-of-spark-internals

The Shuffle

[Diagram: Stage 1 (HadoopRDD, map()) and Stage 2 (groupBy(), mapValues(), collect()), with the shuffle as the boundary between them]

Page 30: A deeper-understanding-of-spark-internals

The Shuffle

[Diagram: Stage 1 feeding Stage 2 across the shuffle]

•  Redistributes data among partitions
•  Hash keys into buckets
•  Optimizations:
   – Avoided when possible, if data is already properly partitioned
   – Partial aggregation reduces data movement
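
Partial aggregation is what reduceByKey (and aggregateByKey/combineByKey) provide out of the box: values are combined within each map-side partition before the shuffle. A hedged sketch of the difference on the running example, assuming a SparkContext sc:

  // Sketch: both compute counts per first letter, but the amount of shuffled data differs.
  val pairs = sc.textFile("hdfs:/names").map(name => (name.charAt(0), 1))

  val viaGroup  = pairs.groupByKey().mapValues(_.sum)  // ships every (letter, 1) record across the network
  val viaReduce = pairs.reduceByKey(_ + _)             // ships one partial sum per letter per map partition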

Page 31: A deeper-understanding-of-spark-internals

The Shuffle

[Diagram: Stage 1 writes intermediate files to Disk; Stage 2 reads them]

•  Pull-based, not push-based
•  Write intermediate files to disk
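
As a rough illustration of what "intermediate files" means (not a figure from the slides): with the hash-based shuffle Spark used at the time of this talk, each map task writes one file per reduce partition, so a stage with M map tasks feeding R reduce partitions leaves on the order of M × R files on disk; for instance, 4 map tasks and 6 reduce partitions would write about 4 × 6 = 24 files.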

Page 32: A deeper-understanding-of-spark-internals

Execution of a groupBy()

•  Build hash map within each partition
•  Note: Can spill across keys, but a single key-value pair must fit in memory

A => [Arsalan, Aaron, Andrew, Andrew, Andy, Ahir, Ali, …]
E => [Erin, Earl, Ed, …]
…
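
A simplified sketch of what each reduce-side task does conceptually; the real implementation uses Spark's spillable map (ExternalAppendOnlyMap), not a plain HashMap, so this is only an illustration of why one huge key is a problem:

  import scala.collection.mutable

  // Illustration only: build a per-partition map from key to all of its values.
  // A very common key (e.g. 'A') makes its single value buffer grow without bound,
  // and that one key-value pair must fit in memory.
  def groupPartition(records: Iterator[(Char, String)]): Iterator[(Char, Seq[String])] = {
    val buckets = mutable.HashMap.empty[Char, mutable.ArrayBuffer[String]]
    for ((key, value) <- records)
      buckets.getOrElseUpdate(key, mutable.ArrayBuffer.empty[String]) += value
    buckets.iterator.map { case (k, values) => (k, values.toSeq) }
  }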

Page 33: A deeper-understanding-of-spark-internals

Done!

[Diagram: the full plan again, Stage 1 (HadoopRDD, map()) and Stage 2 (groupBy(), mapValues(), collect()), now complete]

Page 34: A deeper-understanding-of-spark-internals

What went wrong?

•  Too few partitions to get good concurrency
•  Large per-key groupBy()
•  Shipped all data across the cluster

Page 35: A deeper-understanding-of-spark-internals

Common issue checklist

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of sorting and large keys in groupBys)
3. Minimize amount of data shuffled
4. Know the standard library

1 & 2 are about tuning number of partitions!

Page 36: A deeper-understanding-of-spark-internals

Importance of Partition Tuning

•  Main issue: too few partitions
   – Less concurrency
   – More susceptible to data skew
   – Increased memory pressure for groupBy, reduceByKey, sortByKey, etc.
•  Secondary issue: too many partitions
•  Need “reasonable number” of partitions
   – Commonly between 100 and 10,000 partitions
   – Lower bound: At least ~2x number of cores in cluster
   – Upper bound: Ensure tasks take at least 100ms
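
A rough sketch of where the partition count can be set explicitly to hit these bounds (the value 400 is purely illustrative, not a recommendation from the slides; assumes a SparkContext sc):

  // Sketch: three places to control the number of partitions.
  val names = sc.textFile("hdfs:/names", minPartitions = 400)  // at read time

  val reshuffled = names.repartition(400)                       // explicit reshuffle to 400 partitions

  val counts = names
    .map(name => (name.charAt(0), 1))
    .reduceByKey(_ + _, 400)                                    // partition count of the shuffle output

  println(counts.partitions.length)                             // verify the resulting count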

Page 37: A deeper-understanding-of-spark-internals

Memory Problems

•  Symptoms:
   – Inexplicably bad performance
   – Inexplicable executor/machine failures (can indicate too many shuffle files too)
•  Diagnosis:
   – Set spark.executor.extraJavaOptions to include
     •  -XX:+PrintGCDetails
     •  -XX:+HeapDumpOnOutOfMemoryError
   – Check dmesg for oom-killer logs
•  Resolution:
   – Increase spark.executor.memory
   – Increase number of partitions
   – Re-evaluate program structure (!)
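
One way to wire up these diagnosis and resolution settings is through SparkConf when the application starts (the 4g heap is an illustrative value, not from the slides); the same properties can also be passed with spark-submit --conf:

  import org.apache.spark.{SparkConf, SparkContext}

  // Sketch: enable the GC diagnostics and raise executor memory at startup.
  val conf = new SparkConf()
    .setAppName("names-per-letter")
    .set("spark.executor.memory", "4g")  // illustrative size; tune per cluster
    .set("spark.executor.extraJavaOptions",
         "-XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError")

  val sc = new SparkContext(conf)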

Page 38: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 39: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .repartition(6)
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 40: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .repartition(6)
  .distinct()
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 41: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .repartition(6)
  .distinct()
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 42: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .distinct(numPartitions = 6)
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.size }
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 43: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .distinct(numPartitions = 6)
  .map(name => (name.charAt(0), 1))
  .reduceByKey(_ + _)
  .collect()

1. Ensure enough partitions for concurrency
2. Minimize memory consumption (esp. of large groupBys and sorting)
3. Minimize data shuffle
4. Know the standard library

Page 44: A deeper-understanding-of-spark-internals

Fixing our mistakes

sc.textFile("hdfs:/names")
  .distinct(numPartitions = 6)
  .map(name => (name.charAt(0), 1))
  .reduceByKey(_ + _)
  .collect()

Original:

sc.textFile("hdfs:/names")
  .map(name => (name.charAt(0), name))
  .groupByKey()
  .mapValues { names => names.toSet.size }
  .collect()
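
Put together as a self-contained program, the fixed version looks roughly like this (a sketch: the app name and local[*] master are illustrative, and hdfs:/names is the hypothetical input path from the slides):

  import org.apache.spark.{SparkConf, SparkContext}

  object DistinctNamesPerLetter {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(
        new SparkConf().setAppName("distinct-names-per-letter").setMaster("local[*]"))

      val res = sc.textFile("hdfs:/names")
        .distinct(numPartitions = 6)       // dedupe names and set parallelism in one shuffle
        .map(name => (name.charAt(0), 1))  // ship only (letter, 1), not the full name
        .reduceByKey(_ + _)                // partial aggregation before the shuffle
        .collect()

      res.foreach(println)
      sc.stop()
    }
  }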

Page 45: A deeper-understanding-of-spark-internals

Questions?