Highlights of EARL 2018 Adnan Fiaz Julian Ferry Hannah Frick Dragoș Moldovan-Grünfeld
Highlights of EARL 2018 - London R · EARL London 2018 •5th EARL London Conference •3 Keynote speakers •5 Workshops •3 Streams ... dependencies as a Docker container •liftr

May 20, 2020

Highlights of EARL 2018

Adnan Fiaz

Julian Ferry

Hannah Frick

Dragoș Moldovan-Grünfeld

Agenda

Highlights

Next

Facts

Facts

EARL London 2018

• 5th EARL London Conference

• 3 Keynote speakers

• 5 Workshops

• 3 Streams

• 56 Presentations – lightning talks for the first time

• 1 Panel Discussion

• 2 Evening Networking Events

The Workshops

• R in 6 Hours

• Shiny – Beyond the Basics

• Deep Learning with Keras in R

• A Crash Course in Python for R Users

• Functional Programming with purrr

Attendees

Speakers

Reception

On the way to the IWM

Data Driven Decision-Making

Adnan Fiaz

Data Driven Decision-making

Keynotes:

• Winning in a data-driven world, Edwina Dunn

• Building a Data Driven Company, Rich Pugh

Talks:

• Decision Lead Data Science, Steven Wilkins

• A brief history of Data at Autotrader, Paul Owens

• R – The tool for Screwfix, Gavin Jackson

Have a (data) strategy

“Focus on the data you need rather than the data you have” – Edwina Dunn

“Know how to build the ‘engine’, now it needs to drive the car” – Rich Pugh

Not madmen but math (wo)men

“A key differentiator for businesses…is a culture of continuous learning” – Edwina Dunn

“The key role of data scientists in the coming years is one of educator” – Rich Pugh

Special mention

Finding out what Parliament thinks, Sam Tazzyman (Ministry of Justice)

• Explaining complex topics simply

• Show your code in action (and link to it)

• Why so serious?

Machine Learning

Julian Ferry

Hannah Frick

Balancing model complexity and interpretability

In defence of complexity:

• The power of machine learning in segmenting CRM databases, Jeremy Horne

• The making of a real-world Moneyball – finding undervalued players with h2o, Jo-Fai Chow

In defence of interpretability:

• Understanding your model, Kasia Kulma

• Measuring Marketing Performance, Wojtek Kostelecki

Complex models in CRM segmentation - Jeremy Horne

• How do we identify the customers on a CRM database who are most likely to make a purchase this month?

• Most databases are dominated by lower value segments

Separating low value segments

• Tools used:

– kernlab package

– Boosting to focus on outliers – outcomes that are not ‘normal’

Key takeaway:

Machine learning models can help us differentiate between customers within the same group, where decision-tree type rules fail.

In defence of interpretability – Kasia Kulma

LIME – Local Interpretable Model-Agnostic Explanations
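
A minimal sketch of the idea behind LIME, written in base R rather than with the lime package: perturb the data around one observation, weight the perturbed points by their proximity to it, and fit a weighted linear model as a local surrogate. The black-box function, the kernel width and all variable names are assumptions made for illustration, not material from the talk.

```r
# Sketch of the LIME idea in base R (this is not the lime package API).
set.seed(42)

# Hypothetical "black box": a non-linear function of two features.
black_box <- function(X) {
  as.numeric(sin(X[, 1]) + X[, 2]^2)
}

# The single observation whose prediction we want to explain.
x0 <- c(x1 = 0, x2 = 1)

# 1. Perturb: sample points in a neighbourhood of x0.
n <- 500
perturbed <- data.frame(
  x1 = rnorm(n, mean = x0[["x1"]], sd = 0.3),
  x2 = rnorm(n, mean = x0[["x2"]], sd = 0.3)
)

# 2. Query the black box on the perturbed points.
perturbed$y <- black_box(perturbed[, c("x1", "x2")])

# 3. Weight each point by proximity to x0 (Gaussian kernel, assumed width).
kernel_width <- 0.5
dist2 <- (perturbed$x1 - x0[["x1"]])^2 + (perturbed$x2 - x0[["x2"]])^2
perturbed$w <- exp(-dist2 / kernel_width^2)

# 4. Fit an interpretable surrogate: a weighted linear model.
surrogate <- lm(y ~ x1 + x2, data = perturbed, weights = w)

# The slopes approximate the local effect of each feature near x0.
local_effects <- coef(surrogate)[c("x1", "x2")]
print(local_effects)
```

The slopes of the surrogate should sit close to the local gradient of the black box at x0 (about 1 for x1 and about 2 for x2 in this toy setup), which is the kind of per-observation explanation LIME reports.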

Predicting baseball player performance with h2o, Jo-Fai Chow

• Problem: Finding undervalued baseball players in Major League Baseball (MLB)

End result – Shiny + LIME

The beauty of linear models, Wojtek Kostelecki

• Modelling contributions to mileage

The beauty of linear models, Wojtek Kostelecki

Using a linear model we can extract the individual contribution of each variable to sales
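
As a hedged illustration of that point, the sketch below fits a linear model to synthetic data and splits each fitted value into a baseline plus one contribution per variable (coefficient times observed value). The data, the column names `tv` and `search`, and the coefficients are invented for the example and do not come from the talk.

```r
# Decomposing linear-model predictions into per-variable contributions.
set.seed(1)

# Synthetic marketing data (hypothetical channels and effect sizes).
n <- 200
marketing <- data.frame(
  tv     = runif(n, 0, 100),
  search = runif(n, 0, 50)
)
marketing$sales <- 500 + 3 * marketing$tv + 5 * marketing$search +
  rnorm(n, sd = 20)

fit <- lm(sales ~ tv + search, data = marketing)
betas <- coef(fit)

# Contribution of each variable = estimated coefficient * observed value.
contributions <- data.frame(
  base   = rep(betas[["(Intercept)"]], n),
  tv     = betas[["tv"]] * marketing$tv,
  search = betas[["search"]] * marketing$search
)

# The contributions add up exactly to the fitted values.
adds_up <- isTRUE(all.equal(unname(rowSums(contributions)),
                            unname(fitted(fit))))

# Share of total fitted sales attributed to each component.
shares <- colSums(contributions) / sum(fitted(fit))
print(round(shares, 3))
```

Because the model is additive, the decomposition is exact, which is what makes this kind of attribution easy to explain to a non-technical audience.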

David Smith – Not Hotdog

• Not Hotdog: Image recognition with R and the Custom Vision API

R Code: https://github.com/revodavid/nothotdog

Lars Kjeldgaard - modelgrid

• A ‘caret’-based Framework for Training Multiple Tax Fraud Detection Models

• Framework for creating, managing and training multiple caret models

• Pipe-friendly

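library(modelgrid)
library(dplyr)   # %>%, pull(), select()
library(caret)   # trainControl(), twoClassSummary(), GermanCredit data

data(GermanCredit)

# resampling settings (not shown on the slide; assumed here: 5-fold CV scored by ROC)
tr_control <- trainControl(method = "cv", number = 5,
                           summaryFunction = twoClassSummary, classProbs = TRUE)

# create model grid object
credit_default_models <- model_grid()

# shared settings
credit_default_models <-
  credit_default_models %>%
  share_settings(
    y = GermanCredit %>% pull(Class),
    x = GermanCredit %>% select(-Class),
    metric = "ROC",
    trControl = tr_control
  )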

# add a random forest model
credit_default_models <-
  credit_default_models %>%
  add_model(model_name = "Funky Forest",
            method = "rf",
            tuneGrid = data.frame(mtry = c(10, 20)))

# add an eXtreme gradient boosting model
credit_default_models <-
  credit_default_models %>%
  add_model(model_name = "Big Boost",
            method = "xgbTree",
            nthread = 8)

# train models and evaluate
credit_default_models <- credit_default_models %>%
  train(.)

credit_default_models$model_fits %>%
  resamples(.) %>%
  bwplot(.)

Reproducibility and R in Production

Dragoș Moldovan-Grünfeld

Reproducibility & R in Production

• Keynote:

– RMarkdown: The Bigger Picture, Garrett Grolemund, RStudio

• Talks:

– Beyond Prototypes. A Journey to The Production Land, Omayma Said, Freelance

– Bridging the gap between Data Scientists and Engineers; using R in production, Leanne Fitzpatrick, HelloSoda

Garrett Grolemund (RStudio)

• Reproducibility crisis:

– “We created a cargo cult by confusing math with science. Now we must undo it.”

– “Create maps, not proofs”

– “Reproducibility is an opportunity”

Leanne Fitzpatrick (HelloSoda)

• “Bridging the gap between Data Scientists and Engineers; using R in production”

• Barriers to entry (R in production)

– Engineering

– Infrastructure

– Data science

– Cultural

Overcoming barriers

• Deployment:

– central to the data science process

– Solution: Docker

• Plumbing/integration

– Solution: code as a service with Plumber

• Package and dependency management

– Solution: pacman
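
To make "code as a service" concrete, here is a small sketch of what a plumber file can look like. The plumber package turns R functions annotated with `#*` comments into HTTP endpoints. The file name, the `/segment` route, the `score_customer()` helper and the spend threshold are all hypothetical and were not shown in the talk.

```r
# plumber.R (hypothetical file name)

# Plain R function holding the logic, so it can be tested without a server.
# Query parameters arrive as strings, hence the as.numeric() conversion.
score_customer <- function(spend) {
  spend <- as.numeric(spend)
  list(
    spend   = spend,
    segment = if (spend > 100) "high" else "low"  # illustrative threshold
  )
}

#* Return the value segment for a given customer spend
#* @param spend Total customer spend
#* @get /segment
function(spend) {
  score_customer(spend)
}

# To serve the endpoint (requires the plumber package):
#   plumber::plumb("plumber.R")$run(port = 8000)
# and then request http://localhost:8000/segment?spend=150
```

Keeping the logic in an ordinary function and the endpoint as a thin wrapper is one possible design choice; it lets the same code be called from scripts, tests and the API.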

Overcoming barriers (cont’d)

• Reproducible framework

– Solution: ProjectTemplate http://projecttemplate.net

• Stability & error handling

– Solution: testing & CI

– testthat and usethis

• Scaling

– Solution: Docker

• Culture

– Solution: collaboration
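
For the testing point above, a minimal testthat sketch is shown below. It assumes the testthat package is installed; `safe_ratio()` is a hypothetical function invented for the example, chosen because it covers both a normal result and an explicit error path.

```r
library(testthat)

# Hypothetical function under test: division with an explicit error
# instead of silently returning Inf or NaN.
safe_ratio <- function(x, y) {
  if (y == 0) {
    stop("y must be non-zero")
  }
  x / y
}

test_that("safe_ratio divides and rejects a zero denominator", {
  expect_equal(safe_ratio(6, 3), 2)
  expect_error(safe_ratio(1, 0), "non-zero")
})
```

Tests like this can then be run automatically on every commit by a CI service, which is where the "testing & CI" pairing on the slide comes in.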

Omayma Said (Freelance)

• “Beyond Prototypes. A Journey to The Production Land”

• Challenges: reproducibility, portability, and accessibility

• Docker

• Use/Modify available Dockerfiles

• Use helper packages

Helper packages

• containerit

– Package an R workspace and all dependencies as a Docker container

• liftr

– Containerize R Markdown documents for continuous reproducibility

• rize

– A robust method to automagically dockerize your Shiny application

Special mention

Using R and Shiny to improve hospital operations, Christian Moroy and Jonathan Bruce (Edge Health)

• Predict how long operations take using R

• Recommend free slots that should be filled via Shiny

• Disseminate daily reports via markdown + email (from R)

• Saved a predicted £4m in 2017/18

Next?

EARL US Roadshow

7 November 2018, Seattle, WA

Julia Silge

Data Scientist @ Stack Overflow

Co-author Text Mining with R with David Robinson

Co-author tidytext package

EARL US Roadshow

9 November 2018, Houston, TX

Hadley Wickham

Chief Scientist @ RStudio

Author of numerous books on R

Prolific R package author

Robert Gentleman

Vice President of Computational Biology @ 23andMe

One of the designers of the R programming language

EARL US Roadshow

13 November 2018, Boston, MA

Bob Rudis (@hrbrmstr)

Chief Security Data Scientist @ Rapid7

Prolific tweeter, package author and blogger

The End