Open Source @ IBM R Community Update Augustina Ragwitz, IBM Cognitive Open Tech August, 2017

Oct 03, 2020


Open Source @ IBM
R Community Update
Augustina Ragwitz, IBM Cognitive Open Tech

August, 2017


Today's Agenda

• What is R?

• How was R created?

• Where is the R community?

• Technical overview

• What's the current status of R?

• What is next for the R community?

• R at IBM

• Let's Get Started

• Call to Action


What is R?

• Free Open Source alternative to Stata, Matlab, SPSS, and SAS

• Preferred by students because it is free; they enter the workforce already trained in it (Fast Company, 2014)

• Developed by and for researchers and analysts who do not have a traditional programming background

• Easy to use; low overhead to get up and running

• Extensible through packages via CRAN, Bioconductor, and rOpenSci.

R is a programming environment for statistical analysis + graphics
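
As a rough illustration of what "statistical analysis + graphics" looks like in practice, here is a minimal sketch that uses only base R and the built-in mtcars dataset. The dataset and the choice of a linear model are assumptions made for this example; neither comes from the slides.

```r
# Built-in dataset shipped with R; no packages or downloads needed.
data(mtcars)

# Summary statistics for fuel economy (miles per gallon).
summary(mtcars$mpg)

# Fit a simple linear regression of mpg on vehicle weight.
fit <- lm(mpg ~ wt, data = mtcars)
slope <- coef(fit)[["wt"]]

# Draw a scatter plot with the fitted line when run interactively.
if (interactive()) {
  plot(mtcars$wt, mtcars$mpg,
       xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
  abline(fit)
}
```

The whole analysis, from loading data to plotting, fits in a handful of lines with no setup beyond installing R itself.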


How was R created?

• Ross Ihaka and Robert Gentleman (University of Auckland, NZ) in 1992

• First stable release (version 1.0.0): 2000

• Annual x.y.0 releases in Spring
  • Patches released as needed (x.y.z)
  • Final patch release of previous version just before the new one

• Current major version: 3 (the 3.x series began with release 3.0.0)

• Learn more about R core Internals: https://cran.r-project.org/doc/manuals/r-release/R-ints.html

Created by Statisticians for Statisticians
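
The x.y.z numbering described above can be inspected from inside any R session. The following is a small sketch using base R only; the variable names are made up for illustration.

```r
# Version of the running R session as a comparable object, e.g. 3.4.1.
v <- getRversion()

# Split into the three components of the x.y.z scheme.
parts <- unlist(v)
names(parts) <- c("x", "y", "z")

# An x.y.0 version is an annual release; z > 0 marks a patch release.
is_patch_release <- parts[["z"]] > 0

# Version objects compare correctly against version strings.
at_least_3 <- v >= "3.0.0"

print(R.version.string)
```

Comparing against a version string like this is a common way for scripts to guard features that need a minimum R release.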


Where is the R community?

• CRAN – R Package Repository
  • https://cran.r-project.org/
  • User-submitted R code to extend the R language

• R Foundation
  • https://www.r-project.org/foundation/
  • Support R community

• R Consortium
  • https://www.r-consortium.org/
  • Founded in 2015
  • Bridge Community and Enterprise Interests
  • Platinum Companies include IBM, Microsoft, RStudio
  • IBM: Board + Steering and Marketing Committees


R: Analyst Technical Overview

Data gathering to analysis to publishing streamlined!

• Convert unstructured data into tables (readr, tidyr)

• One-liner statistical analysis (dplyr)

• Easy data visualization (ggplot2)

• Reactive JavaScript app generated from R code (shiny)

• Publish research + code in HTML, PDF, and other formats (rmarkdown, knitr)

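# load the packages used below
library(readr)
library(tidyr)
library(dplyr)

# gather
my_data <- read_csv("my_data.csv")
df <- as_tibble(my_data)  # as_data_frame() is deprecated in current tibble

# summarize: drop missing names, split "last,first", count rows per last name
df <- df %>%
  filter(!is.na(name)) %>%
  separate(name, c("last", "first"), sep = ",") %>%
  group_by(last) %>%
  summarise(total = n())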
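
Because my_data.csv is not included with the slides, the following sketch runs the same summarize pipeline on a small made-up table so the result can be inspected. The names in the data and the variable `totals` are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Made-up input in "last,first" form; one NA to exercise the filter step.
df <- tibble(name = c("Lovelace,Ada", "Lovelace,Augusta", "Hopper,Grace", NA))

totals <- df %>%
  filter(!is.na(name)) %>%                          # drop the missing name
  separate(name, c("last", "first"), sep = ",") %>%  # split into two columns
  group_by(last) %>%                                # one group per last name
  summarise(total = n())                            # count rows in each group

print(totals)
```

The result has one row per last name: Hopper with a total of 1 and Lovelace with a total of 2.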


R: Developer Technical Overview

Integrating R into Production Workflows

Rserve provides a socket interface to existing applications

> install.packages("Rserve")
> library(Rserve)
> Rserve()

Plumber generates API endpoints from R code for Rserve

#* @post /sum
addTwo <- function(a, b){
  as.numeric(a) + as.numeric(b)
}

https://www.rplumber.io/
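
One way the annotated function above might be served is sketched below. The file location, the port number, and the variable names are assumptions for this example. The `plumb()` and `run()` calls come from the plumber package and only execute in an interactive session with plumber installed, since `run()` blocks until interrupted.

```r
# Hypothetical file location; the annotated function mirrors the slide.
api_file <- file.path(tempdir(), "plumber.R")
writeLines(c(
  "#* @post /sum",
  "addTwo <- function(a, b) {",
  "  as.numeric(a) + as.numeric(b)",
  "}"
), api_file)

# Sourcing the file shows the endpoint is ordinary R code underneath.
source(api_file)
result <- addTwo("2", "3")

# Serving the file as an HTTP API requires the plumber package.
if (interactive() && requireNamespace("plumber", quietly = TRUE)) {
  router <- plumber::plumb(api_file)
  router$run(port = 8000)  # port 8000 is an arbitrary choice
}
```

The `as.numeric()` calls matter because request parameters arrive as strings, which is why the direct call above passes "2" and "3" as text.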


What is the current status of R?

Over 11,000 packages on CRAN!

Most popular open source tool in academic research

Top language among industry data scientists
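
The package count quoted on this slide dates from August 2017 and will have changed since. A rough way to check the current figure from an R session is sketched below; the helper name `count_cran_packages` and the choice of mirror are assumptions, and the function returns NA when the repository index cannot be reached.

```r
# Hypothetical helper: number of packages listed in a CRAN mirror's index.
count_cran_packages <- function(repo = "https://cloud.r-project.org") {
  tryCatch(
    nrow(available.packages(repos = repo)),
    error = function(e) NA_integer_,    # e.g. malformed repository URL
    warning = function(w) NA_integer_   # e.g. no network access
  )
}

n_packages <- count_cran_packages()
print(n_packages)
```

Note that `available.packages()` applies its default filters, so the number reflects packages installable on the current R version and platform rather than everything ever published.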


What's next for the R community?

• Improve Tooling and Support
  • Code Coverage
  • RHub (hosted testing + validation of R packages)

• Big Data and Cloud Improvements
  • Unified Framework/API for Distributed Computing
  • Better database integration via DBI
  • Support scalable Spatiotemporal/raster datasets

• Community Training and Outreach
  • Software Carpentry/Data Carpentry workshops
  • Support R User Groups (RUGs)
  • Diversity Initiatives (R-Ladies)
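
For readers unfamiliar with DBI, mentioned under database integration above, the following sketch shows the general shape of the interface. It assumes the RSQLite backend and an in-memory database purely for illustration; the table name and data are made up.

```r
library(DBI)

# In-memory SQLite database; nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a small made-up data frame into a table named "people".
people <- data.frame(last = c("Hopper", "Lovelace", "Lovelace"))
dbWriteTable(con, "people", people)

# Run SQL and get the result back as an ordinary data frame.
res <- dbGetQuery(
  con,
  "SELECT last, COUNT(*) AS total FROM people GROUP BY last ORDER BY last"
)

dbDisconnect(con)
print(res)
```

The same `dbConnect()`, `dbWriteTable()`, and `dbGetQuery()` calls work against other backends by swapping the driver passed to `dbConnect()`.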


R at IBM

• Learn R and Data Science through CognitiveClass.AI
  • Data Science with R: https://cognitiveclass.ai/learn/data-science-r/

• Data Science Experience + R
  • RStudio: https://datascience.ibm.com/docs/content/analyze-data/rstudio-overview.html

• R + Watson Natural Language Understanding (NLU): https://apsportal.ibm.com/exchange/public/entry/view/1015c435b898fb629a7e7523be151aed

• DeveloperWorks Code
  • https://developer.ibm.com/code/patterns/detect-change-points-in-iot-sensor-data/
  • https://developer.ibm.com/code/patterns/category/data-science/


R: Let's Get Started

• Install
  • CRAN - https://cran.r-project.org/
  • RStudio - https://www.rstudio.com/

• Learn
  • Install the swirl package - http://swirlstats.com/
  • R for Data Science by Hadley Wickham - http://r4ds.had.co.nz/
  • Statistics for R course on Coursera (free to audit) - https://www.coursera.org/specializations/statistics

• Explore
  • Big Data Analysis + Machine Learning with R + Apache Spark
    • R4ML: https://www.ibm.com/support/knowledgecenter/en/SSPT3X_4.2.5/com.ibm.swg.im.infosphere.biginsights.tut.doc/doc/tut_Mod_R4ML.html
  • Data Science for Automotive Lab (R in Jupyter Notebook)
    • https://github.com/kurlare/DSforAutomotive
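
Once R is installed, the swirl step listed under Learn might look like the sketch below. The helper `missing_packages` is a made-up name for this example. The install and the `swirl()` call are guarded so they only run in an interactive session, since swirl is an interactive tutorial.

```r
# Hypothetical helper: which of the requested packages are not installed yet?
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("swirl"))

if (interactive()) {
  # Download from CRAN only what is missing.
  if (length(to_install) > 0) install.packages(to_install)

  # Start the interactive lessons in the console.
  library(swirl)
  swirl()
}
```

Checking before installing avoids re-downloading packages that are already present.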


Call to Action

• Find a RUG Meetup
  • https://www.meetup.com/topics/r-project-for-statistical-computing/

• Attend an R conference
  • useR, EARL, RStudio::conf, Open Data Science West/East

• Use Twitter hashtag #rstats

• Join a Mailing List
  • https://www.r-project.org/mail.html

• Submit a community proposal
  • https://www.r-consortium.org/projects/call-for-proposals

• Join a Working Group
  • https://www.r-consortium.org/projects/isc-working-groups

• Subscribe to the IBM Code monthly newsletter: https://www.pages03.net/ibmdeveloperworks/developerWorks-IBMCodeNewsletterSubscriptionPage-secure/

• Subscribe to future Code Tech Talks: https://www.pages03.net/ibmdeveloperworks/developerWorks-IBMCodeTechTalkSubscriptionPage-secure/


Q & A
