
past.rinfinance.com/agenda/2016/workshop/DougService.pdf



Leveraging Azure From R
Azure Spark and MPI Clusters from R

Doug Service
Stephen Weller
Daniel Hanson
July 3, 2016

Microsoft Machine Learning
Revolution Analytics

©Microsoft 2015 R/Finance 2016


Outline

1. Introduction

2. Azure

3. MPI Cluster

4. Portfolio Optimization Demo


Introduction

Goals
Leverage Azure compute clusters from R to solve compute-parallel or data-parallel finance problems faster

1. Log in to Azure accounts with a $200 spending limit that you can use during and after the presentation

2. Run and review R demos on pre-configured R Server Spark and MPI compute clusters


Azure

Advantages
Eliminates the expense of buying, maintaining, and continually upgrading a data center. You pay only for the resources you use.

Microsoft facility in Quincy, Washington


Azure

Advantages
Build Spark, Hadoop, and MPI compute clusters in the Azure Portal, or with languages such as Bash, PowerShell, node.js, or C#, and access them from R


Azure

Azure is a collection of integrated cloud services

• Compute - virtual machines (VMs)
  • Linux: Ubuntu, Red Hat, CentOS...
  • Windows: Windows Server, Windows Enterprise...
• Networking - connect VMs
  • Internal virtual network
  • Public IP address and domain name
• Database - deploy to VMs
  • Oracle, OrientDB, Redis, SQL Server, MySQL
• Data Analytics - pre-configured
  • HDInsight, Stream Analytics, Cloudera
• Storage


MPI Cluster

Four virtual machines

All Nodes: desktop + worker

• Ubuntu Server 16.04
• Open message passing interface (OpenMPI)
• Open secure shell (OpenSSH)
• Network file system (NFS)
• R plus packages

Desktop node

• Ubuntu Mate Cloudtop desktop
• X remote desktop protocol (XRDP)
• Visual Studio Code editor
• Sublime Text 3 editor


MPI Cluster

R Packages

• foreach
• doMPI
• Rmpi

Gotchas

• rsh (ssh) must work reciprocally between all nodes, which requires both the public and private SSH key files on every node
• Development R scripts must be in the same location on all nodes; the best solution is to export the working directory on the desktop node to the compute nodes via Network File System (NFS)
• The high-performance configuration uses a desktop in the cloud because of its high-speed network connections to the worker nodes


Portfolio Optimization Demo


S&P 500 Portfolio Optimization

Algorithm

• Select top 30% of stocks in each S&P index sector
  (Industrials, Health Care, Information Technology, etc.)
• Form uniformly drawn random portfolios of 30 stocks
• Perform a minimum CVaR analysis on every portfolio
• Select the portfolio with the highest return
• Generate the efficient frontier for the highest-return portfolio
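The first two steps of the algorithm could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the universe is synthetic, the ranking column `score` stands in for whatever criterion defines "top" in the demo (the slides do not say), and the helpers `top_by_sector` and `draw_portfolios` are hypothetical names. The result is stored in `cmbs`, one portfolio per column, to mirror the variable used in the deck's foreach code.

```r
set.seed(42)  # reproducible illustration only

# Hypothetical universe: 500 tickers in 10 sectors with a made-up score.
universe <- data.frame(
  ticker = sprintf("STK%03d", 1:500),
  sector = rep(paste("Sector", 1:10), each = 50),
  score  = rnorm(500),
  stringsAsFactors = FALSE
)

# Step 1: keep the top 30% of stocks in each sector (by the assumed score).
top_by_sector <- function(df, frac = 0.30) {
  keep <- lapply(split(df, df$sector), function(s) {
    n_keep <- max(1, round(frac * nrow(s)))
    s[order(s$score, decreasing = TRUE)[seq_len(n_keep)], ]
  })
  do.call(rbind, keep)
}
selected <- top_by_sector(universe)

# Step 2: draw portfolios uniformly at random, without replacement inside
# a portfolio; each column of the result is one candidate portfolio.
draw_portfolios <- function(tickers, size, n_port) {
  replicate(n_port, sample(tickers, size))
}
cmbs <- draw_portfolios(selected$ticker, size = 30, n_port = 100)

dim(cmbs)  # rows = stocks per portfolio, columns = portfolios
```

Each column of `cmbs` can then be handed to the minimum CVaR step; the number of random portfolios (100 here) is an arbitrary choice for the sketch.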


S&P 500 Portfolio Optimization

Optimization Run Time

Transport  Machines  Threads  Time (mins)  Script
None       1         1        4.3162       RunPortST.sh
MPI        1         4        1.6641       RunPortMT.sh
MPI        4         1        1.4296       RunPortMPI.sh

RunAalysis.sh - Generates analysis report
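To put the run times in perspective, speedup and parallel efficiency can be derived from the reported figures. The short R sketch below uses only the numbers from the run-time table; treating machines × threads as the worker count is an assumption made here for the efficiency calculation.

```r
# Run times in minutes, as reported in the Optimization Run Time table.
runs <- data.frame(
  script   = c("RunPortST.sh", "RunPortMT.sh", "RunPortMPI.sh"),
  machines = c(1, 1, 4),
  threads  = c(1, 4, 1),
  minutes  = c(4.3162, 1.6641, 1.4296)
)

# Assumption: total workers = machines * threads.
runs$workers <- runs$machines * runs$threads

# Speedup relative to the single-threaded run, and efficiency per worker.
runs$speedup    <- runs$minutes[1] / runs$minutes
runs$efficiency <- runs$speedup / runs$workers

print(round(runs[, c("script", "workers", "speedup", "efficiency")], 2)[, -1])
# speedup is about 2.59 for RunPortMT.sh and about 3.02 for RunPortMPI.sh
```

Both parallel runs use four workers, so the roughly 3.0x speedup of the four-machine run corresponds to about 75% efficiency, against about 65% for four threads on one machine.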


S&P 500 Portfolio Optimization

Demo Directory
/nfs/mpidemos/rfinance/RAzureCluster/demo/portfolioOptimization

Demo Files

• PortfolioMPI.R - portfolio optimization
• PortfolioMPIResults.R - generates optimization report


S&P 500 Portfolio Optimization

Using foreach

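# Each foreach iteration handles one node's share of the candidate portfolios.
eres <- foreach(cdx = 1:nnode, .packages = 'fPortfolio') %dopar% {
  # Get the combinations for the current node. drop = FALSE keeps a matrix
  # even when the node receives a single column.
  ncmbs <- cmbs[, rngs[cdx, 1]:rngs[cdx, 2], drop = FALSE]
  ret <- list()
  for (idx in seq_len(ncol(ncmbs))) {
    # Record the combination from this node's subset (ncmbs, not cmbs)
    # together with its minimum CVaR statistics.
    ret <- c(ret, list(list(Cmb = ncmbs[, idx],
                            Stats = calcMinCVaRPort(spxret.ts[, ncmbs[, idx]]))))
  }
  # The value of the block is what foreach collects for this node.
  ret
}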
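The foreach loop relies on `rngs`, a matrix whose row `cdx` holds the first and last column of `cmbs` assigned to node `cdx`. The slides do not show how it is built; one plausible construction, together with a way to flatten the per-node results afterwards, is sketched below. The helper `make_ranges` and the toy `cmbs` are hypothetical, and a sequential `lapply` stands in for `%dopar%` so that the sketch runs without an MPI cluster.

```r
# Hypothetical helper: split n_items columns into n_parts contiguous ranges
# of near-equal size. Returns an n_parts x 2 matrix of (first, last).
make_ranges <- function(n_items, n_parts) {
  stopifnot(n_parts >= 1, n_items >= n_parts)
  sizes <- rep(n_items %/% n_parts, n_parts)
  extra <- n_items %% n_parts
  sizes[seq_len(extra)] <- sizes[seq_len(extra)] + 1  # spread the remainder
  last  <- cumsum(sizes)
  first <- last - sizes + 1
  cbind(first = first, last = last)
}

cmbs  <- matrix(seq_len(3 * 10), nrow = 3)  # toy: 10 portfolios of 3 assets
nnode <- 4
rngs  <- make_ranges(ncol(cmbs), nnode)

# Sequential stand-in for the foreach loop: one list of results per node.
eres <- lapply(seq_len(nnode), function(cdx) {
  ncmbs <- cmbs[, rngs[cdx, 1]:rngs[cdx, 2], drop = FALSE]
  lapply(seq_len(ncol(ncmbs)), function(idx) list(Cmb = ncmbs[, idx]))
})

# Flatten the per-node lists into one list with an entry per portfolio.
all_res <- unlist(eres, recursive = FALSE)
length(all_res)  # one entry for each column of cmbs
```

Contiguous ranges keep each node's slice a simple column subset of `cmbs`, which matches how the loop indexes it; other partitioning schemes would work equally well.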


S&P 500 Portfolio Optimization

Review portfolio optimization output
