Copyright © 2003, SAS Institute Inc. All rights reserved. SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

Parallelization in Action with SAS Analytic Procedures

Robert Cohen, Senior Research Statistician, Linear Models R&D

Dec 13, 2015


Page 2:

Your Rise and Shine Menu

Parallelization adds value to the IVC

Multithreading to provide parallel execution

How do you measure scalability?

Selected demonstrations

Marketing: I should have slept in

Boring: I should have left when I had the chance

Insulting: This guy thinks I’m a 10-year-old

Deceiving: The truth, but not the whole truth

Page 3:

IVC: Parallelization Adds Value

Complete today’s analyses faster

Analyze tomorrow’s problems within today’s time constraints

Multithreaded Procedures

Parallel access to data

Page 4:

The IVC in Action


Page 5:

Changes You Have to Make in Your Legacy Code

TINSTAAFL

There are exceptions

Page 6:

Unthreaded GLM: 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

GLM runs in a single thread

GLM never blocks this thread

GLM work is NOT done in parallel

Page 7:

Unthreaded GLM: 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

CPU Utilization: CPU 1 CPU 2

Page 8:

Unthreaded GLM: 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

Combined CPU Utilization (chart; vertical axis from 0 to 100)

Page 9:

Multithreaded GLM: 1 Active Thread 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

Worker threads used for specific tasks

Invert X’X matrix

GLM thread blocks while a worker thread is active

GLM Thread

GLM does not execute in parallel

Page 10:

Multithreaded GLM: 1 Active Thread 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

CPU Utilization: CPU 1 CPU 2

Page 11:

Multithreaded GLM: 1 Active Thread 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

Combined CPU Utilization (chart; vertical axis from 0 to 100)

Page 12:

Multithreaded GLM: 2 Active Threads 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

GLM thread spawns off worker threads

GLM Thread

Invert X’X matrix

Two independent worker threads per task

Work is done in parallel
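
As a rough illustration of the pattern on this slide (a coordinating thread hands independent pieces of work to worker threads, waits, and then combines the results), here is a minimal Python sketch. It is not SAS code and it does not describe how PROC GLM is implemented. The function names `partial_crossproduct` and `crossproduct_threaded`, the split into row blocks, and the choice to parallelize forming X’X (the slide labels the inversion step) are all assumptions made for illustration. In CPython the global interpreter lock also keeps pure-Python arithmetic from running on two CPUs at once, so the sketch shows the structure of the work, not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_crossproduct(rows):
    """Return X'X for one block of rows (hypothetical worker task)."""
    p = len(rows[0])
    xtx = [[0.0] * p for _ in range(p)]
    for row in rows:
        for i in range(p):
            for j in range(p):
                xtx[i][j] += row[i] * row[j]
    return xtx


def crossproduct_threaded(x, n_threads=2):
    """Split the rows of x into blocks, one per worker, and sum the per-block X'X."""
    p = len(x[0])
    block = (len(x) + n_threads - 1) // n_threads  # ceiling division
    blocks = [x[i:i + block] for i in range(0, len(x), block)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # The calling thread blocks here until every worker has finished.
        partials = list(pool.map(partial_crossproduct, blocks))
    total = [[0.0] * p for _ in range(p)]
    for part in partials:
        for i in range(p):
            for j in range(p):
                total[i][j] += part[i][j]
    return total


# Tiny made-up design matrix, used only to exercise the functions.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(crossproduct_threaded(X, n_threads=2))
```

The per-block results can be summed in any order, which is what makes the blocks independent of one another.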

Page 13:

Multithreaded GLM: 2 Active Threads 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

CPU Utilization: CPU 1 CPU 2

Page 14:

Multithreaded GLM: 2 Active Threads 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

Combined CPU Utilization (chart; vertical axis from 0 to 100)

Page 15:

Multithreaded GLM: 4 Active Threads 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

Page 16:

Threading Comparison

Multithreaded GLM: 2 CPU Box

Thread View: Running Waiting I/O Blocked Exited

Page 17:

Amdahl’s Law

PF = 80%

CPUs   Speedup
1      1.00
2      1.67
4      2.50
8      3.33
16     4.00
32     4.44

Chart labels: Not Scalable, Scalable
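
The speedups on this slide follow from Amdahl's law, S(N) = 1 / ((1 - PF) + PF / N), where PF is the parallelizable fraction and N is the number of CPUs. The short Python sketch below recomputes them for PF = 80%; the function name `amdahl_speedup` is chosen here for illustration and does not come from the slides.

```python
def amdahl_speedup(pf, n_cpus):
    """Predicted speedup on n_cpus when a fraction pf of the work is parallelizable."""
    return 1.0 / ((1.0 - pf) + pf / n_cpus)


# Recompute the slide's table for PF = 80%.
for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:>4}  {amdahl_speedup(0.80, n):.2f}")
```

Each doubling of the CPU count buys less than the one before, because the serial 20% of the work never shrinks.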

Page 18:

Amdahl’s Law

Parallelizable Fraction (one curve each): 100%, 99%, 95%, 90%, 80%, 60%
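
One way to read the curves for these fractions: under Amdahl's law the speedup can never exceed 1 / (1 - PF), no matter how many CPUs are added. The sketch below computes that ceiling for each fraction listed on the slide. The helper name `amdahl_limit` is an assumption made for illustration.

```python
import math


def amdahl_limit(pf):
    """Upper bound on speedup as the CPU count grows without limit."""
    return math.inf if pf >= 1.0 else 1.0 / (1.0 - pf)


for pf in (1.00, 0.99, 0.95, 0.90, 0.80, 0.60):
    print(f"PF = {pf:.0%}: limit = {amdahl_limit(pf):.1f}")
```

Only the 100% case is unbounded; every other fraction flattens out at a fixed ceiling.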

Page 19:

Scalability in PROC REG: Wide Data and Scalar I/O

Speedups

Linear

Amdahl, PF=93%

Test Details

50,000 observations

500 predictors

Stepwise Selection

Scalar I/O

Page 20:

Scalability in PROC REG: Wide Data and Scalar I/O

Speedups

Linear

Amdahl, PF=93%

Test Details

50,000 observations

500 predictors

Stepwise Selection

Scalar I/O

Achieved
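
The slides do not say how the PF value on each chart was chosen. One plausible approach, offered here only as an assumption, is to solve Amdahl's law for PF from a measured speedup: PF = (1 - 1/S) / (1 - 1/N). The Python sketch below does that. The function name `estimate_pf` is hypothetical, and the sample input is taken from the PF = 80% table earlier in the deck (speedup 2.50 on 4 CPUs), so it should recover a value of about 0.8.

```python
def estimate_pf(speedup, n_cpus):
    """Solve Amdahl's law for PF given one measured speedup on n_cpus (> 1)."""
    if n_cpus <= 1:
        raise ValueError("need more than one CPU to estimate PF")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_cpus)


# Input from the deck's Amdahl table: speedup 2.50 on 4 CPUs.
print(round(estimate_pf(2.50, 4), 2))
```

A single measurement gives only a point estimate; comparing estimates at several CPU counts shows how well one Amdahl curve describes the achieved speedups.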

Page 21:

Scalability in PROC REG: Narrow Data, Parallel I/O

Test Details

4 million observations

20 predictors

Parallel I/O

Speedups

Linear

Amdahl, PF=99.9%

Page 22:

Scalability in PROC REG: Narrow Data, Parallel I/O

Test Details

4 million observations

20 predictors

Parallel I/O

Speedups

Linear

Amdahl, PF=99.9%

Achieved

Page 23:

Scalability in PROC DMREG

Speedups

Linear

Amdahl, PF=93%

Test Details

500,000 observations

Predictors: 50 continuous, 15 classification

Logistic model

Parallel I/O

Page 24:

Scalability in PROC DMREG

Speedups

Achieved

Linear

Amdahl, PF=93%

Test Details

500,000 observations

Predictors: 50 continuous, 15 classification

Logistic model

Parallel I/O

Page 25:

Baseline Speedup and Scalability in PROC DMREG

Linear

Amdahl, PF = 93%

Speedups

Achieved

V9/V8 ***

Test Details

500,000 observations

Predictors: 50 continuous, 15 classification

Logistic model

Parallel I/O

Page 26:

Scalability in PROC GLM

Linear

Amdahl, PF = 98%

Speedups

Test Details

6000 observations

4 classification variables

2000 parameters

Page 27:

Scalability in PROC GLM

Linear

Amdahl, PF = 98%

Speedups

Test Details

6000 observations

4 classification variables

2000 parameters

Achieved

Superlinear Scalability!
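
"Superlinear" here means the achieved speedup on N CPUs is greater than N, so the parallel efficiency (speedup divided by N) exceeds 1. The Python sketch below shows one way to compute both quantities from elapsed times. The function names and the timings are hypothetical, made up for illustration, and are not measurements from the presentation.

```python
def speedup(t_one_cpu, t_n_cpus):
    """Ratio of single-CPU elapsed time to elapsed time on several CPUs."""
    return t_one_cpu / t_n_cpus


def efficiency(t_one_cpu, t_n_cpus, n_cpus):
    """Speedup per CPU; 1.0 corresponds to linear scaling."""
    return speedup(t_one_cpu, t_n_cpus) / n_cpus


def is_superlinear(t_one_cpu, t_n_cpus, n_cpus):
    """True when the speedup exceeds the number of CPUs."""
    return efficiency(t_one_cpu, t_n_cpus, n_cpus) > 1.0


# Hypothetical elapsed times in seconds (not from the slides).
t1, t4 = 120.0, 25.0
print(speedup(t1, t4), efficiency(t1, t4, 4), is_superlinear(t1, t4, 4))
```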

Page 28:

Scalability in PROC LOESS

Linear

Amdahl, PF=95%

Speedups

Test Details

4000 observations

18 models evaluated

Confidence limits for selected model

Page 29:

Scalability in PROC LOESS

Linear

Amdahl, PF=95%

Speedups

Test Details

4000 observations

18 models evaluated

Confidence limits for selected model

Achieved

Page 30:

Scalability in PROC LOESS

Linear

Amdahl, PF=99%

Speedups

Test Details

4000 observations

1 model specified

Confidence limits for specified model

Page 31:

Scalability in PROC LOESS

Linear

Amdahl, PF=99%

Speedups

Test Details

4000 observations

1 model specified

Confidence limits for specified model

Achieved

Page 32:

Partially Multithreaded Procedures

Base SAS
• PROC SORT
• PROC SUMMARY
• SQL (GROUP BY, ORDER BY)

Enterprise Miner
• PROC DMDB
• PROC DMREG
• PROC DMINE

SAS/STAT
• PROC GLM
• PROC LOESS
• PROC REG
• PROC ROBUSTREG

NOTE: Not all usages of these procedures are scalable.

Your mileage may vary!

Page 33:

Reading Between the Lines

Parallelization adds value to the IVC

Multithreading to provide parallel execution

How do you measure scalability?

Selected demonstrations

Analyze bigger volumes of data

Not as boring as I feared

Predicting scalability is a subtle task

Some of my jobs will run faster in SAS 9

Page 34:

Questions and hopefully answers