Microsoft and Revolution Analytics: What’s the Add-Value? MARK TABLADILLO PH.D. – MICROSOFT MVP JUNE 29, 2015
Aug 04, 2015
Mark Tab
Consulting
Training
Teaching
Presenting
SQL Server MVP
LinkedIn
@MarkTabNet
Outline
1) an overview of current data science technologies from Microsoft;
2) a description of the R language;
3) a brief review of the add-value for R with Azure Machine Learning, and
4) a description of the performance architecture and demo of the language constructs developed by Revolution Analytics
Current Data Science Technologies
SQL Server Analysis Services Data Mining
• SQL Server License (Win OS)
• Business Intelligence or Enterprise
Excel Data Mining Add-In
• Excel 2007 or Higher
• X64 better
Microsoft Azure Machine Learning
• Free or Paid Tiers
• Any OS
F#
• Open Source
• Mono-Project, Visual Studio
Revolution Analytics
• SQL Server 2016
Built-in advanced analytics
In-database analytics
Built-in to SQL Server
Data Scientist: Interact directly with data
Data Developer/DBA: Manage data and analytics together
Example Solutions
• Fraud detection
• Sales forecasting
• Warehouse efficiency
• Predictive maintenance
Relational Data
Analytic Library
T-SQL Interface
Extensibility
R Integration
Microsoft Azure Machine Learning Marketplace
New R scripts
AML Gallery
ML Studio
SSMS / R
SSRS / CR
Excel / PV
PowerBI.com
Fisher’s Iris flower dataset
machine learning
Description of the R Language
R
RSTUDIO
RATTLE
Growth and Demand for R
R is the highest paid IT skill
Dice.com, Jan 2014
R most-used data science language after SQL
O’Reilly, Jan 2014
R is used by 70% of data miners
Rexer, Sep 2013
R is #15 of all programming languages
RedMonk, Jan 2014
R growing faster than any other data science language
KDnuggets, Aug 2013
More than 2 million users worldwide
R Usage Growth
Rexer Data Miner Survey, 2007-2013
70% of data miners report using R
R is the first choice of more data miners than any other software
Source: www.rexeranalytics.com
R with Azure Machine Learning
Revolution Analytics
2007: The Beginning
2008: Revolutions Blog
R in the News
2009: New York Times: Data Analysts Captivated by R’s Power
2009: Revolution R Enterprise version 3
First R Debugging IDE
2010: User Group Sponsorships
141 R User Groups
2010: Head to Head with SAS
Logistic Regression:
                 SAS*             Revolution R     Revolution R vs. SAS
Rows of data     1 billion        1 billion
Parameters       “just a few”     7                Double
Time             80 seconds       44 seconds       45% less
Data location    In memory        On disk
Nodes            32               5                1/6th
Cores            384              20               5%
RAM              1,536 GB         80 GB            5%
Revolution R is faster on the same amount of data, despite using approximately a 20th as many cores, a 20th as much RAM, a 6th as many nodes, and not pre-loading data into RAM.
Bottom Line: Revolution R Enterprise Performance = Greatly Reduced TCO
*As published by SAS in HPC Wire, April 21, 2011
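The resource ratios quoted for this comparison can be checked with simple arithmetic. The following is a minimal Python sketch that uses only the figures listed for the two systems. The dictionary keys and the helper name `ratios` are illustrative choices and are not part of the talk.

```python
# Figures copied from the SAS vs. Revolution R comparison in this deck.
# The SAS numbers are the ones the slide attributes to HPC Wire (April 21, 2011).
sas = {"time_s": 80, "nodes": 32, "cores": 384, "ram_gb": 1536}
revo = {"time_s": 44, "nodes": 5, "cores": 20, "ram_gb": 80}


def ratios(baseline, other):
    """Return other/baseline for every resource listed in the baseline."""
    return {key: other[key] / baseline[key] for key in baseline}


result = ratios(sas, revo)
for key, value in result.items():
    # Print each Revolution R figure as a percentage of the SAS figure.
    print(f"{key}: {value:.1%} of the SAS figure")
```

Under these figures, cores and RAM each come out at roughly 5% (about a 20th), nodes at roughly 16% (about a 6th), and run time at 55% of the SAS time, which corresponds to the 45% reduction noted on the slide.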
2011: RHadoop
github.com/RevolutionAnalytics/RHadoop
2013: Shaking up the industry
A Gartner Magic Quadrant Visionary
2014: Technical Support for Open Source R
AdviseR™ from Revolution Analytics
Technical support for open source R, from the R experts.
10x5 email and phone support
Support for R, validated packages, and third-party software connections
On-line case management and knowledgebase
Access to technical resources, documentation and user forums
Exclusive on-line webinars from community experts
Guaranteed response times
Also available: expert hands-on and on-line training for R, from Revolution Analytics AcademyR.
http://www.revolutionanalytics.com/adviser
http://revolutionanalytics.com/academyr-training-education
Summary
WATCH FOR SQL SERVER 2016
Abstract
Microsoft has been a leader in the enterprise analytics space for years. By 2014, Microsoft had already created R language functionality within Azure Machine Learning. On April 6, 2015, Microsoft closed on a deal to acquire Revolution Analytics, a company focusing on scalable processing solutions based on the well-known R language. Many data science projects and initial demos do not need high-volume solutions; however, having a high-volume answer for the R language allows for planning or working toward the largest data science solutions.
This presentation describes the add-value for the Revolution Analytics acquisition. The talk covers 1) an overview of current data science technologies from Microsoft; 2) a description of the R language; 3) a brief review of the add-value for R with Azure Machine Learning, and 4) a description of the performance architecture and demo of the language constructs developed by Revolution Analytics. Most of the presentation will be focused on sections two and four. It is anticipated that these technologies will be partially if not fully integrated into SQL Server 2016.