Introduction to Microsoft R Services

Jan 08, 2017

Gregg Barrett
Transcript
Page 1: Introduction to Microsoft R Services

An introduction to Microsoft R Services

Microsoft R Open and Microsoft R Server

498 – Show and Tell Gregg Barrett

Page 2: Introduction to Microsoft R Services

Introduction

This presentation will briefly cover the following:

- Why consider MRO and R Server

- R Server

- MRO

- Microsoft R Services/R Server Platform

- DistributedR

- RevoScaleR/ScaleR

- ConnectR

- DevelopR

- DeployR

- Resources

- References

Page 3: Introduction to Microsoft R Services

Why consider MRO and R Server

- You keep the option of working with standard R while gaining the added benefits of Microsoft R Open (MRO) and R Server

- Performance

- MRO is FREE

- R Server is FREE, at least for students, through DreamSpark

Page 4: Introduction to Microsoft R Services

Why consider MRO and R Server

[Diagram: IT Market Clock for Programming Languages, 2015] (Gartner, 2015)

Page 5: Introduction to Microsoft R Services

Why consider MRO and R Server

Definition: Originally released in 1993, R is a mature, domain-specific and open-sourced language for statistical analysis workloads.

Trend Analysis: Gartner client inquiry levels for R remain light and range from exploratory to best-practice adopter themes; however, like MATLAB, the number of inquiries has increased substantially in recent years. External data sources reflect a growth in R usage across the industry as well. We expect inquiry levels to increase consistently through 2017.

Time to Next Market Phase: 2 to 5 years

Business Impact: The significant impact of "big data" analytics and real-time data analysis is driving demand for languages such as R and MATLAB beyond previous entrenched market niches and into increasingly mainstream programming workloads. In particular, adopters are turning to R as a free alternative to platforms such as SAS and SPSS.

User Advice: Consider R as a free and open-source solution for workloads that require advanced statistical computing or data mining capabilities with minimal coding and optimal maintenance costs over more general-purpose languages.

Sample Vendors: Microsoft, Oracle, TIBCO Software, IBM, Wolfram Research (Gartner, 2015)

Page 6: Introduction to Microsoft R Services

Why consider MRO and R Server

(Microsoft, 2016)

Page 7: Introduction to Microsoft R Services

R Server

- Revolution R Enterprise (RRE) was developed by Revolution Analytics

- RRE is intended to offer a fast, cost-effective, enterprise-class big data analytics platform

- Revolution Analytics was acquired by Microsoft

- RRE is now Microsoft R Server

- R Server is free for students and can be obtained through DreamSpark

- Log on or create a profile at DreamSpark using your university credentials: https://www.dreamspark.com/Product/Product.aspx?productid=105

Page 8: Introduction to Microsoft R Services

MRO

- RRE uses an R engine called Revolution R Open

- The Revolution R Open engine is now called Microsoft R Open (MRO)

- MRO is intended to be an enhanced distribution of open source R from Microsoft Corporation. Specifically, Microsoft R Open leverages high-performance, multi-threaded math libraries to deliver performance boosts. This means that R functions that rely on linear algebra, such as matrix multiplication, will run faster out of the box.

- Just like R, Microsoft R Open is open source and free

- You can download MRO here: https://mran.revolutionanalytics.com/download/

- MRO is intended to support a variety of big data statistics, predictive modelling, and machine learning capabilities

- At the time of this writing, the latest version of MRO is version 3.2.5
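To get a feel for the matrix multiplication claim above, the following is a minimal sketch in base R that times a dense matrix product. The matrix size and variable names are arbitrary choices for illustration, and the elapsed time will vary by machine. The last step assumes the RevoUtilsMath package that accompanies MRO with MKL; it is skipped when that package is not installed.

```r
# Time a dense matrix multiplication. On MRO with MKL this call is handled
# by a multi-threaded math library; on standard R it uses the default BLAS.
set.seed(42)
n <- 500                                # illustrative size, not a benchmark standard
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

timing <- system.time(product <- a %*% b)
print(timing["elapsed"])

# Assumption: RevoUtilsMath is only present on MRO installations with MKL.
# When available, report how many threads MKL is using.
if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
  print(RevoUtilsMath::getMKLthreads())
}
```

Running the same script under standard R and under MRO with MKL on the same machine gives a rough, informal comparison.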

Page 9: Introduction to Microsoft R Services

MRO

- It is important to note that R Server uses a different version of MRO

- At the time of this writing, the latest version of MRO for R Server is version 3.2.2

- MRO for R Server can be found here: https://mran.revolutionanalytics.com/download/mro-for-mrs/

- MRO for R Server is a prerequisite for R Server

- After downloading and installing MRO (whether or not it is the version for R Server), download and install MKL

- MKL is the Intel Math Kernel Library

- Important: Install Microsoft R Open before MKL

Page 10: Introduction to Microsoft R Services

Microsoft R Services/R Server Platform

Note: Names have changed following the Microsoft acquisition, with the “Revo” designation falling away, which makes things a little more challenging to follow.

Page 11: Introduction to Microsoft R Services

Microsoft R Services/R Server Platform

Microsoft R Services is positioned as R for the Enterprise.

The feature set provided by the Microsoft R Services software can be categorized as follows:

- Microsoft R Open: High-performance math libraries installed on top of a stable version of open source R

- DistributedR: Parallel and distributed computing framework for Big Data Analytics

- RevoScaleR/ScaleR: High-performance, scalable, parallelized and distributable Big Data Analytics in R

- ConnectR: Data connections for Big Data Analytics

- DevelopR: An integrated development environment (IDE) for R on Windows

- DeployR: A web services software development kit for integrating R with third-party products (including business intelligence, data visualization, rules engines, etc.)

Page 12: Introduction to Microsoft R Services

DistributedR

DistributedR allows you to run the same R script on multiple platforms; you can create a model in one environment such as a workstation and then deploy it on a different environment such as an on-site Microsoft SQL Server, a Teradata platform, or a Hadoop cluster in the cloud. You just need to specify the information about where these computations should be performed and what data should be analyzed.

For information on supported computing environments, look for the “compute contexts” in the RevoScaleR package.
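As a rough illustration of the compute context idea described above, the sketch below sets a local compute context and runs a summary on the sample data that ships with RevoScaleR. It assumes an R Server installation where RevoScaleR and its AirlineDemoSmall sample file are available, and it is skipped otherwise. Pointing the same analysis at SQL Server, Teradata, or Hadoop would mean passing a site-specific compute context object instead of "local"; those settings are not shown because they depend on the environment.

```r
# Only run when RevoScaleR is installed (it ships with R Server, not plain R).
has_revoscaler <- requireNamespace("RevoScaleR", quietly = TRUE)

if (has_revoscaler) {
  library(RevoScaleR)

  # Tell RevoScaleR where computations should run; "local" is the workstation.
  rxSetComputeContext("local")
  print(rxGetComputeContext())

  # The analysis call itself does not mention the platform. The same line
  # would be reused after switching to a remote compute context.
  sample_xdf <- file.path(rxGetOption("sampleDataDir"), "AirlineDemoSmall.xdf")
  print(rxSummary(~ ArrDelay, data = sample_xdf))
} else {
  message("RevoScaleR is not installed; skipping the compute context sketch.")
}
```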

Page 13: Introduction to Microsoft R Services

RevoScaleR

The RevoScaleR/ScaleR package provides efficient, scalable computational power and allows for the development of ready-to-deploy suites of data processing and analytics with R.

To learn more, look for the RevoScaleR “rx” analysis and data manipulation functions and “rxExec” for HPC functionality. If you are computing decision trees, also check out the included RevoTreeView package that allows you to interactively visualize your decision trees.

Or open the package help at the R prompt: ?RevoScaleR

Page 14: Introduction to Microsoft R Services

ConnectR

The RevoScaleR package provides a way for you to connect with the data you may have stored in a variety of formats: SAS, SPSS, Teradata, ODBC, delimited and fixed format text, and Hadoop Distributed File System (HDFS) text files. You have a choice of:

1. keeping the data as is and analyzing it directly with RevoScaleR analysis functions,

2. extracting the data you want to analyze and storing it in the efficient and higher performance .xdf file format provided with the RevoScaleR package, or

3. bringing some or all of your data into memory as an R data frame to use with any R analysis function.

To learn more, look for data sources in the RevoScaleR package.

Note: The RevoScaleR package is included with every distribution of RRE/R Server and is automatically loaded into memory when you start the program, so all of the “rx” functions mentioned are at your fingertips.

You can get information on them by using ? at the command line, for example: ?rxLinMod
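The three options listed above for working with data can be sketched as follows. This is a minimal example, assuming an R Server installation where RevoScaleR and its AirlineDemoSmall.csv sample file are available; it is skipped otherwise. The output file name and the 10,000-row limit are arbitrary choices for illustration.

```r
# Only run when RevoScaleR is installed (it ships with R Server, not plain R).
has_revoscaler <- requireNamespace("RevoScaleR", quietly = TRUE)

if (has_revoscaler) {
  library(RevoScaleR)
  csv_path <- file.path(rxGetOption("sampleDataDir"), "AirlineDemoSmall.csv")

  # Option 1: keep the data as is and analyze the text file directly.
  # The sample file marks missing values with "M".
  text_source <- RxTextData(csv_path, missingValueString = "M",
                            stringsAsFactors = TRUE)
  print(rxSummary(~ ArrDelay, data = text_source))

  # Option 2: import into the .xdf format, then analyze the .xdf file.
  xdf_path <- file.path(tempdir(), "airline_demo.xdf")  # hypothetical file name
  xdf_source <- rxImport(inData = text_source, outFile = xdf_path,
                         overwrite = TRUE)
  print(rxLinMod(ArrDelay ~ DayOfWeek, data = xdf_source))

  # Option 3: bring part of the data into memory as an ordinary data frame,
  # which can then be passed to any R function.
  airline_df <- rxDataStep(inData = xdf_source, numRows = 10000)
  print(mean(airline_df$ArrDelay, na.rm = TRUE))
} else {
  message("RevoScaleR is not installed; skipping the data source sketch.")
}
```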

Page 15: Introduction to Microsoft R Services

DevelopR

Microsoft R Services provides a tool for the R developer to efficiently create sets of R scripts: the R Productivity Environment (RPE).

Working on a Windows workstation with the RPE, the R developer has a full-featured Visual Studio-like integrated development environment for R, including an indispensable visual debugger for R. The RPE has a customizable workspace, including an enhanced Script Editor, an Object Browser, a Solution Explorer, and an R Command Console.

Page 16: Introduction to Microsoft R Services

DeployR

The optional DeployR package provides the tools for integrating R analysis output with a third-party package. It is a full-featured web services software development kit for R that allows programmers to use Java, JavaScript or .NET for this integration.

Accelerators for DeployR are also available. These are starter kits for integrating with tools including:

- Microsoft Excel

- Tableau

- Jaspersoft

- QlikView

Page 17: Introduction to Microsoft R Services

R Server User Interface

Page 18: Introduction to Microsoft R Services

Resources

R Services 2016 Getting Started Guide:

https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf

Webinar “Using Microsoft R Server to Address Scalability Issues in R”: https://channel9.msdn.com/blogs/Cloud-and-Enterprise-Premium/Using-Microsoft-R-Server-to-Address-Scalability-Issues-in-R

Task Views are guides on CRAN that group sets of R packages and functions by type of analysis, fields, or methodologies. You can browse and find packages organized by task view:

https://mran.microsoft.com/taskview/

Page 19: Introduction to Microsoft R Services

Resources

Software available to NU students:

http://www.it.northwestern.edu/software/

https://northwestern.onthehub.com/WebStore/Welcome.aspx

https://www.dreamspark.com/Student/Software-Catalog.aspx

Page 20: Introduction to Microsoft R Services

References

Gartner. (2015). IT Market Clock for Programming Languages, 2015. [Diagram]. Retrieved from https://www.gartner.com/doc/3145117/it-market-clock-programming-languages

Gartner. (2015). IT Market Clock for Programming Languages, 2015. [pdf]. Retrieved from https://www.gartner.com/doc/3145117/it-market-clock-programming-languages

Microsoft. (2016). The Benefits of Multithreaded Performance with Microsoft R Open. [webpage]. Retrieved from https://mran.microsoft.com/documents/rro/multithread/

Microsoft. (2016). R Services 2016 Getting Started Guide. [pdf]. Retrieved from https://packages.revolutionanalytics.com/doc/8.0.0/win/MicrosoftRServices_Getting_Started.pdf