1. PySpark Next generation cloud computing engine using Python Wisely Chen Yahoo! Taiwan Data team 2. Who am I? • Wisely Chen ( [email protected] ) • Sr. Engineer in Yahoo! [Taiwan]…
Vida Ha & Holden Karau - Strata SJ 2015 Everyday I’m Shufflin Tips for Writing Better Spark Jobs Who are we? Holden Karau ● Current: Software Engineer at Databricks.…
A brief intro to Apache Spark - You eat, I talk… Spark Framework • Efficient data processing via in-memory RDD. • A rich data-flow API (Java, Scala and Python).…