Evolving R for Commercial Use David Smith useR! 2010
Evolving R for Commercial Use · 2011-01-12

Jun 29, 2020

Transcript
Page 1:

Evolving R for Commercial Use

David Smith, useR! 2010

Page 2:

R is awesome

• Open Source, Free
• Language
• Graphics
• Statistics
• Cutting-edge methods
• Community
• No Limits

“R is the most powerful statistical computing language on the planet” – Norman Nie (CNET News, June 3 2010)

Page 3:

R at Work

• Windows (on the desktop)
• Developers (not necessarily statisticians)
• Managed by IT, not users
• Production applications and research
• Big data sets
• Deployed as part of a process

Page 4:

Revolution R Enterprise has Open-Source R Engine at the core

• R Engine
• Community Packages
• Technical Support
• Multi-threaded MKL math libraries
• Web-Based GUI
• Web Services API
• Big Data Analysis
• Parallel R
• RPE Developer GUI
• Build Assurance

Legend: Revolution – Proprietary additions; Community – Open Source; Revolution – Forthcoming proprietary additions

www.revolutionanalytics.com/our-vision

Page 5:

Open-Core Software Model

• Open-source “core” platform
• Bundled with proprietary add-ons that operate with core platform
  – Add-ons licensed/sold
  – Mark Radcliffe, OSI General Counsel:
    • http://bit.ly/open-core
  – revolutionanalytics.com/downloads/gpl-sources.php

Page 6:

R for Development

• Researchers prototyping
  – Point-and-click GUI
• Development teams building applications
  – Development environment
• Training
• Support
  – Someone to call for help

Page 7:

Page 8:

R Productivity Environment

Page 9:

IT: Fearing the worst, for you

• Installation (Upgrades)
• Virus checking
• Platform support (RHEL, 64-bit Windows)
• Multiple version control
• Support
  – One throat to choke!
• Contracts and licensing

Page 10:

R for Production Use

• Performance (Speed)
• Use computing resources
  – Clusters, Grids
  – Cloud
• Scale to large data sets
• Validation

Page 11:

Intel MKL Benchmarks (Windows)

Computation                        R 2.9.2    Revo R (1-core)   Revo R (4-core)   Speedup (4-core)

Linear Algebra [1]
  Matrix Multiply                  243 sec    22 sec            5.9 sec           41x
  Cholesky Factorization           23 sec     3.8 sec           1.1 sec           21x
  Singular Value Decomposition     62 sec     13 sec            4.9 sec           12.6x
  Principal Components Analysis    237 sec    41 sec            15.6 sec          15.2x
  Linear Discriminant Analysis     142 sec    49 sec            32.0 sec          4.4x

General R Benchmarks [2]
  R Benchmarks (Matrix Calc)       34 sec     6.6 sec           4.4 sec           7.7x
  R Benchmarks (Matrix Functions)  20 sec     4.4 sec           2.1 sec           9.5x
  R Benchmarks (Program Control)   4.7 sec    4 sec             4.2 sec           0x

[1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
[2] http://r.research.att.com/benchmarks/
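
The speedup column can be checked directly from the timings. The sketch below assumes speedup means the R 2.9.2 time divided by the Revo R time; the slide does not state the definition, and the data frame and column names are illustrative. Under that assumption the 4-core ratios agree with the slide to within rounding for every row except Program Control, where the ratio comes out near 1.1, which is presumably what the slide's "0x" (no meaningful speedup) is meant to convey.

```r
# Timings in seconds, copied from the benchmark table.
bench <- data.frame(
  computation = c("Matrix Multiply", "Cholesky Factorization",
                  "Singular Value Decomposition",
                  "Principal Components Analysis",
                  "Linear Discriminant Analysis",
                  "R Benchmarks (Matrix Calc)",
                  "R Benchmarks (Matrix Functions)",
                  "R Benchmarks (Program Control)"),
  base_r = c(243, 23, 62, 237, 142, 34, 20, 4.7),    # R 2.9.2
  revo_1 = c(22, 3.8, 13, 41, 49, 6.6, 4.4, 4),      # Revo R, 1 core
  revo_4 = c(5.9, 1.1, 4.9, 15.6, 32.0, 4.4, 2.1, 4.2),  # Revo R, 4 cores
  stringsAsFactors = FALSE
)

# Assumed definition: speedup = base R time / Revo R time.
bench$speedup_1 <- round(bench$base_r / bench$revo_1, 1)
bench$speedup_4 <- round(bench$base_r / bench$revo_4, 1)

print(bench[, c("computation", "speedup_1", "speedup_4")])
```

The 1-core column isolates the gain from the MKL libraries alone, while the gap between the 1-core and 4-core columns shows how much each computation benefits from multi-threading.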


Page 12:

Cloud Computing

• foreach replaces for loops
• Minimal code change required
• Parallel processing on CPUs on local machine, cluster, or cloud
• Significant speedups

# Birthday problem simulation run on a 2.4 GHz ThinkPad T500 with a dual-core,
# 64-bit CPU and 3 GB of RAM

birthday <- function(n) {                  # n is the number of people in the room
  m <- 10000                               # m is the number of rooms to simulate
  x <- numeric(m)
  for (i in 1:m) {
    b <- sample(1:365, n, replace = TRUE)  # simulate birthdays for n people
    x[i] <- n - length(unique(b))          # number of matches in this room
  }
  mean(x)                                  # average number of matches over m simulations
}

# run the loop sequentially
system.time(for (j in 1:100) birthday(j))

# Results of sequential test run on 2.4 GHz ThinkPad T500
# Elapsed: 50.94

# run the test with parallelR, two simultaneous workers
library(nws)
require("doNWS")
s <- sleigh(workerCount = 2)
registerDoNWS(s)

system.time(x <- foreach(j = 1:100) %dopar% birthday(j))

# Results of parallel test
# Elapsed: 28.75
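
The nws and doNWS packages used on the slide may not be installable on a current R setup. As a rough sketch of the same idea, the loop can be parallelised with the `parallel` package that ships with R. This swaps in `mclapply()` for the slide's `foreach ... %dopar%`; the `m` argument and the smaller problem size are illustrative changes so the example runs quickly, and the timings will differ from those on the slide.

```r
library(parallel)   # bundled with R since version 2.14.0

# Same simulation as on the slide; m is exposed as an argument
# (an illustrative change) so the sketch finishes quickly.
birthday <- function(n, m = 1000) {
  x <- numeric(m)
  for (i in seq_len(m)) {
    b <- sample(1:365, n, replace = TRUE)  # birthdays for n people
    x[i] <- n - length(unique(b))          # number of matches in this room
  }
  mean(x)
}

# mclapply() forks worker processes. Forking is not available on Windows,
# so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

seq_time <- system.time(x_seq <- lapply(1:50, birthday))
par_time <- system.time(x_par <- mclapply(1:50, birthday, mc.cores = cores))

print(rbind(sequential = seq_time["elapsed"], parallel = par_time["elapsed"]))
```

As with foreach, the body of the loop is unchanged; only the construct that drives the iterations differs.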


Page 13:

revoScaleR Performance

Revolution Confidential

               Mortgage Default Example          Airline Flights Example
Dataset        10M rows & 6 variables            123M rows & 26 variables
Technique      Logistic regression               Linear regression
Machine        2-core laptop                     8-core desktop
Alternative    Bigglm with all data in-memory    Biglm with sequential data chunking

Bar chart values, as extracted:
  -87%: revoAnalytics 80s; Base R + big(g)lm 600s; Base R N/A
  -99.7%: 300s; N/A
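
To illustrate what "sequential data chunking" means for a linear regression, here is a minimal base R sketch that never needs the whole design matrix at once: it accumulates the cross-products X'X and X'y one chunk at a time and solves the normal equations at the end. This is an illustration of the general idea, not the biglm or revoScaleR implementation (biglm is understood to update a QR decomposition instead, which is numerically more robust). The data are simulated and all variable names are hypothetical.

```r
set.seed(42)

# Simulated stand-in data (hypothetical; not the airline data set).
n <- 10000
dat <- data.frame(x1 = rnorm(n), x2 = runif(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n, sd = 0.5)

chunk_size <- 1000
starts <- seq(1, n, by = chunk_size)

xtx <- matrix(0, 3, 3)   # running sum of X'X
xty <- matrix(0, 3, 1)   # running sum of X'y

for (s in starts) {
  rows  <- s:min(s + chunk_size - 1, n)
  chunk <- dat[rows, ]                  # in practice: read the next block from disk
  X <- cbind(1, chunk$x1, chunk$x2)     # design matrix with intercept column
  xtx <- xtx + crossprod(X)             # add this chunk's X'X
  xty <- xty + crossprod(X, chunk$y)    # add this chunk's X'y
}

beta_chunked <- drop(solve(xtx, xty))   # solve the normal equations
names(beta_chunked) <- c("(Intercept)", "x1", "x2")

# Reference fit with all data in memory.
beta_full <- coef(lm(y ~ x1 + x2, data = dat))
print(rbind(chunked = beta_chunked, full = beta_full))
```

Memory use depends on the chunk size and the number of variables rather than the number of rows, which is what makes data sets far larger than RAM tractable.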


Page 14:

Deployed Applications

• R as part of a process
  – Batch mode
  – Reporting
  – Interactive Applications
• Integration
  – With applications, data, and systems
  – Modern standards
  – Reliable (support many users, lots of data)
  – Users & Security
  – Maintenance

Page 15:

Web Services Integration

Page 16:

Community: Inside-R.org

Page 17:

Revolution R Enterprise

Production-Grade Statistical Analysis for Business

• High-performance R for multiprocessor systems
• Statistical Analysis of Terabyte-Class Data Sets
• Deploy R Applications via Web Services
• Easy-to-Use Graphical User Interface
• Parallel Programming on Clusters / Cloud
• Modern Integrated Development Environment
• Validation for use in regulated environments
• Telephone and email technical support
• Training and consulting services

Page 18:

Thank You!

David Smith
david@revolutionanalytics.com

blog.revolutionanalytics.com