Spring 2011 MATLAB Tutorial Series: Tuning MATLAB for Better Performance
Kadin Tseng, Scientific Computing and Visualization, IS&T, Boston University

Jan 21, 2016


Merryl Lewis
Transcript
Page 1: MATLAB Tutorial Series

Spring 2011

Tuning MATLAB for Better Performance

Kadin Tseng
Scientific Computing and Visualization, IS&T
Boston University

Page 2: Topics Covered

1. Performance Issues
   1.1 Memory Allocations
   1.2 Vector Representations
   1.3 Compiler
   1.4 Other Considerations
2. Multiprocessing with MATLAB

Page 3: 1. Performance Issues

1.1 Memory Access
1.2 Vector Representations
1.3 Compiler
1.4 Other Considerations

Page 4: 1.1 Memory Access

Memory access patterns often affect computational performance. Some effective ways to enhance performance in MATLAB:

• Allocate array memory before using it.
• Order for-loops to follow the array storage order.
• Compute and save arrays in place where applicable.

Page 5: How Does MATLAB Allocate Arrays?

MATLAB arrays are allocated in contiguous address space.

Without pre-allocation:

    x = 1;
    for i=2:4
        x(i) = i;
    end

Memory address   Array element
1                x(1)
...              ...
2000             x(1)
2001             x(2)
2002             x(1)
2003             x(2)
2004             x(3)
...              ...
10004            x(1)
10005            x(2)
10006            x(3)
10007            x(4)

As the addresses illustrate, each time x grows a new contiguous block may have to be found and the existing elements copied into it.

Page 6: How Does MATLAB Allocate Arrays? Examples

MATLAB arrays are allocated in contiguous address space. Pre-allocating arrays enhances performance significantly.

Without pre-allocation:

    n = 5000;
    tic
    for i=1:n
        x(i) = i^2;    % x grows by one element per iteration
    end
    toc

Wallclock time = 0.00046 seconds

With pre-allocation:

    n = 5000; x = zeros(n,1);
    tic
    for i=1:n
        x(i) = i^2;    % x already has its final size
    end
    toc

Wallclock time = 0.00004 seconds

The timing data were recorded on Katana. The actual times may vary depending on the processor.

Page 7: Passing Arrays Into A Function

MATLAB uses pass-by-reference if the passed array is used without changes; a copy is made only if the array is modified. MATLAB calls it "lazy copy." Consider the following example:

    function y = lazyCopy(A, x, b, change)
    if change, A(2,3) = 23; end   % change forces a local copy of A
    y = A*x + b;                  % use x and b directly from calling program
    pause(2)                      % keep memory longer to see it in Task Manager

On Windows, you can use Task Manager to monitor the memory allocation history.

    >> n = 5000; A = rand(n); x = rand(n,1); b = rand(n,1);
    >> y = lazyCopy(A, x, b, 0);
    >> y = lazyCopy(A, x, b, 1);

Page 8: For-loop Ordering

• It is best if the innermost loop runs over the array's left-most index, the next loop over the next index, and so on (column-major order).
• For a multi-dimensional array, x(i,j), the 1D representation of the same array, x(k), follows column-wise order and inherently possesses the contiguous property.

Innermost loop over columns:

    n=5000; x = zeros(n);
    for i=1:n        % rows
        for j=1:n    % columns
            x(i,j) = i+(j-1)*n;
        end
    end

Wallclock time = 0.88 seconds

Innermost loop over rows:

    n=5000; x = zeros(n);
    for j=1:n        % columns
        for i=1:n    % rows
            x(i,j) = i+(j-1)*n;
        end
    end

Wallclock time = 0.48 seconds

Using the 1D index:

    for i=1:n*n
        x(i) = i;
    end

Using the vector form:

    x = 1:n*n;       % 1-by-n^2 row vector; reshape(x,n,n) gives the n-by-n array
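
The relationship between x(i,j) and its 1D representation x(k) can be checked on a small array. The following is a minimal sketch, not part of the original slides; the variable names k, v, and y are chosen here for illustration, and n is kept small so the result is easy to inspect.

```matlab
% Column-major storage: x(i,j) and x(k) with k = i + (j-1)*n are the same element.
n = 4;
x = zeros(n);
for j = 1:n            % columns in the outer loop
    for i = 1:n        % rows in the inner loop (contiguous in memory)
        x(i,j) = i + (j-1)*n;
    end
end
k = sub2ind(size(x), 3, 2);   % linear index of row 3, column 2
v = x(k);                     % same element as x(3,2)
y = reshape(1:n*n, n, n);     % vector form that produces the same n-by-n array
```

Because the fill value was chosen as i+(j-1)*n, each element of x holds its own linear index, which is why the reshape of 1:n*n reproduces the loop result.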

Page 9: Compute In-place

Computing and saving an array in place improves performance (and reduces memory usage).

Result saved to a new array:

    x = rand(5000);
    tic
    y = x.^2;
    toc

Wallclock time = 0.30 seconds

Result saved in place:

    x = rand(5000);
    tic
    x = x.^2;
    toc

Wallclock time = 0.11 seconds

Caveat: May not be worthwhile if it involves a data type or size change …

Page 10: Other Considerations

• Generally, it is better to use a function instead of a script m-file.
  – A script m-file is loaded into memory and evaluated one line at a time. Subsequent uses require reloading.
  – A function m-file is compiled into pseudo-code and is loaded on first application. Subsequent uses of the function will be faster without reloading.
• A function is modular, self-cleaning, and reusable.
• Global variables are expensive and difficult to track.
• Physical memory is much faster than virtual memory.
• Avoid passing large matrices to a function and modifying only a handful of elements.

Page 11: Other Considerations (cont'd)

• load and save are efficient for handling whole data files; textscan is more memory-efficient for extracting text that meets specific criteria.
• Don't reassign an array in a way that changes its data type or shape.
• Limit m-file size and complexity.
• Computationally intensive jobs often require large memory …
• A structure of arrays is more memory-efficient than an array of structures.
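
The last point, structure of arrays versus array of structures, can be illustrated by building both layouts and querying their sizes with whos. This is a sketch under assumptions: the field names x and y, the variable names soa and aos, and n = 1000 are hypothetical, and the byte counts reported depend on the MATLAB version.

```matlab
n = 1000;

% Structure of arrays: one struct whose two fields are n-by-1 arrays
soa.x = zeros(n,1);
soa.y = zeros(n,1);

% Array of structures: n structs, each holding two scalar fields
aos = struct('x', num2cell(zeros(n,1)), 'y', num2cell(zeros(n,1)));

infoSoA = whos('soa');
infoAoS = whos('aos');
fprintf('structure of arrays: %d bytes\n', infoSoA.bytes);
fprintf('array of structures: %d bytes\n', infoAoS.bytes);
```

Both variables hold the same 2n numbers; any difference in the reported bytes is the per-element overhead of storing each scalar as a separate field value.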

Page 12: Memory Management

• Maximize memory availability.
  – 32-bit systems: < 2 or 3 GB
  – 64-bit systems running 32-bit MATLAB: < 4 GB
  – 64-bit systems running 64-bit MATLAB: < 8 TB (93 GB on some Katana nodes)
• Minimize memory usage. (Details to follow …)

Page 13: Minimize Memory Usage

• Use clear, pack, or other memory-saving means when possible. If double precision (the default) is not required, using the 'single' data type could save a substantial amount of memory. For example,

    >> x=ones(10,'single'); y=x+1;   % y inherits single from x

• Use sparse to reduce the memory footprint of sparse matrices.

    >> n=5000; A = zeros(n); A(3,2) = 1; B = ones(n);
    >> C = A*B;
    >> As = sparse(A);
    >> Cs = As*B;                    % it can save time for low density
    >> A2 = sparse(n,n); A2(3,2) = 1;
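
The savings from single and sparse can be measured directly with whos. The sketch below reuses the slide's statements with a smaller n so it runs quickly; the variable names xd, xs, d, s, f, and sp are introduced here for illustration only.

```matlab
xd = ones(10);               % double precision (default): 8 bytes per element
xs = ones(10, 'single');     % single precision: 4 bytes per element
ys = xs + 1;                 % result inherits single from xs

n  = 500;
A  = zeros(n); A(3,2) = 1;   % full storage keeps all n*n entries
As = sparse(A);              % sparse storage keeps only the nonzero entry
B  = ones(n);
C  = A*B;
Cs = As*B;                   % same product computed from the sparse matrix

d = whos('xd');  s = whos('xs');
f = whos('A');   sp = whos('As');
fprintf('double: %d bytes, single: %d bytes\n', d.bytes, s.bytes);
fprintf('full:   %d bytes, sparse: %d bytes\n', f.bytes, sp.bytes);
```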

Page 14: Minimize Memory Usage (Cont'd)

• Starting MATLAB with "matlab -nojvm …" saves a lot of memory, if warranted.
• Memory usage query
  – For Unix:
        Katana% top
  – For Windows:
        >> m = feature('memstats');   % largest contiguous free block
    Use MS Windows Task Manager to monitor memory allocation.
• Distribute memory among multiprocessors via the MATLAB Parallel Computing Toolbox.

Page 15: Special Functions for Real Numbers

MATLAB provides a few functions for processing real, noncomplex data specifically. These functions are more efficient than their generic versions:

• realpow – power for real numbers
• realsqrt – square root for real numbers
• reallog – logarithm for real numbers
• realmin/realmax – smallest normalized / largest finite positive floating-point number

Generic version:

    n = 1000; x = 1:n;
    x = x.^2;
    tic
    x = sqrt(x);
    toc

Wallclock time = 0.00022 seconds

Real-only version:

    n = 1000; x = 1:n;
    x = x.^2;
    tic
    x = realsqrt(x);
    toc

Wallclock time = 0.00004 seconds

• isreal reports whether the array is real
• single/double converts data to single-/double-precision
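
Besides speed, the real-only functions differ from the generic ones in how they treat inputs that would give a complex result. A minimal sketch, with the variable names r1, r2, z, and gotError chosen here for illustration:

```matlab
x  = (1:1000).^2;
r1 = sqrt(x);                         % generic version
r2 = realsqrt(x);                     % real-only version
sameResult = max(abs(r1 - r2)) < 1e-12;

z = sqrt(-4);                         % generic version returns a complex value

gotError = false;
try
    realsqrt(-4);                     % real-only version raises an error instead
catch
    gotError = true;
end
```

For nonnegative input the two functions agree; for negative input sqrt silently switches to complex arithmetic, while realsqrt stops with an error, which can also help catch unexpected negative values early.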

Page 16: Vector Operations

• MATLAB is designed for vector and matrix operations. The use of for-loops, in general, can be expensive, especially if the loop count is large or the loops are nested.
• Without array pre-allocation, extending an array's size inside a for-loop is costly, as shown before.
• From a performance standpoint, in general, vector representation should be used in place of for-loops.

For-loop form:

    i = 0;
    for t = 0:.01:100
        i = i + 1;
        y(i) = sin(t);
    end

Wallclock time = 0.1069 seconds

Vector form:

    t = 0:.01:100;
    y = sin(t);

Wallclock time = 0.0007 seconds

Page 17: Vector Operations of Arrays

    >> A = magic(3)    % define a 3x3 matrix A
    A =
         8     1     6
         3     5     7
         4     9     2
    >> B = A^2;        % B = A * A;
    >> C = A + B;
    >> b = 1:3         % define b as a 1x3 row vector
    b =
         1     2     3
    >> [A, b']         % add b transpose as a 4th column to A
    ans =
         8     1     6     1
         3     5     7     2
         4     9     2     3

Page 18: Vector Operations

    >> [A; b]              % add b as a 4th row to A
    ans =
         8     1     6
         3     5     7
         4     9     2
         1     2     3
    >> A = zeros(3)        % zeros generates 3 x 3 array of 0's
    A =
         0     0     0
         0     0     0
         0     0     0
    >> B = 2*ones(2,3)     % ones generates 2 x 3 array of 1's
    B =
         2     2     2
         2     2     2

Alternatively,

    >> B = repmat(2,2,3)   % matrix replication

Page 19: Vector Operations

    >> y = (1:5)';
    >> n = 3;
    >> B = y(:, ones(1,n))   % B = y(:, [1 1 1]) or B=[y y y]
    B =
         1     1     1
         2     2     2
         3     3     3
         4     4     4
         5     5     5

Again, B can be generated via repmat as

    >> B = repmat(y, 1, 3);

Page 20: Vector Operations

    >> A = magic(3)
    A =
         8     1     6
         3     5     7
         4     9     2
    >> B = A(:, [1 3 2])   % switch 2nd and third columns of A
    B =
         8     6     1
         3     7     5
         4     2     9
    >> A(:, 2) = []        % delete second column of A
    A =
         8     6
         3     7
         4     2

Page 21: Vector Operation Example

For-loop form:

    n = 3000; x = zeros(n);
    for j=1:n
        for i=1:n
            x(i,j) = i+(j-1)*n;
            x(i,j) = x(i,j)^2;
        end
    end

Wallclock time = 0.128 seconds

Vector form:

    n = 3000;
    i = (1:n)'; I = repmat(i,1,n);   % replicate along j
    x = I + (I'-1)*n;
    x = x.^2;

Wallclock time = 0.127 seconds

Notes on the vector form of the computations:

• To eliminate the for-loops, all values of i and j must be made available at once. The ones or repmat utilities can be used to replicate rows or columns. In this special case, J = I' is used to save computations and memory.
• Often, there are trade-offs between efficiency and memory when using the vector form. Here, the creation of I adds to the memory and compute time. However, as more work is leveraged against I, efficiency improves.
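
A quick way to gain confidence in a vectorized rewrite is to run both forms on a small problem and compare the results. The sketch below does that for the example above and also tries bsxfun as one possible way to avoid storing the replicated matrix I; the bsxfun variant is an alternative suggested here, not something taken from the slides, and the names xLoop, xVec, and xBsx are illustrative.

```matlab
n = 50;                               % small size so the check is instant

% For-loop form
xLoop = zeros(n);
for j = 1:n
    for i = 1:n
        xLoop(i,j) = (i + (j-1)*n)^2;
    end
end

% Vector form with repmat, as on the slide
I = repmat((1:n)', 1, n);             % I(i,j) = i, so I'(i,j) = j
xVec = (I + (I'-1)*n).^2;

% Alternative: expand a column and a row without building I explicitly
xBsx = bsxfun(@plus, (1:n)', ((1:n)-1)*n).^2;
```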

Page 22: Functions Useful in Vectorizing

Function   Description
all        Test to see if all elements are of a prescribed value
any        Test to see if any element is of a prescribed value
zeros      Create array of zeros
ones       Create array of ones
repmat     Replicate and tile an array
find       Find indices and values of nonzero elements
diff       Find differences and approximate derivatives
squeeze    Remove singleton dimensions from an array
prod       Find product of array elements
sum        Find the sum of array elements
cumsum     Find cumulative sum
shiftdim   Shift array dimensions
logical    Convert numeric values to logical
sort       Sort array elements in ascending/descending order
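
A few of the functions in the table, applied to one small vector. This is an illustrative sketch; the data and variable names are made up for the example.

```matlab
v = [3 0 7 0 5 9];

idx       = find(v);             % indices of nonzero elements: [1 3 5 6]
hasZero   = any(v == 0);         % true, at least one element equals 0
allNonNeg = all(v >= 0);         % true, every element is >= 0
d         = diff(v);             % successive differences: [-3 7 -7 5 4]
c         = cumsum(v);           % running total: [3 3 10 10 15 24]
s         = sum(v);              % 24
p         = prod(v(idx));        % product of the nonzero elements: 945
vs        = sort(v, 'descend');  % [9 7 5 3 0 0]
```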

Page 23: Laplace Equation

Laplace Equation:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{1}$$

Boundary Conditions:

$$\begin{aligned}
u(x,0) &= \sin(\pi x), & 0 \le x \le 1 \\
u(x,1) &= \sin(\pi x)\,e^{-\pi}, & 0 \le x \le 1 \\
u(0,y) &= u(1,y) = 0, & 0 \le y \le 1
\end{aligned} \tag{2}$$

Analytical solution:

$$u(x,y) = \sin(\pi x)\,e^{-\pi y}, \quad 0 \le x \le 1;\ 0 \le y \le 1 \tag{3}$$

Page 24: Discrete Laplace Equation

Discretizing Equation (1) by centered differences yields:

$$u^{n+1}_{i,j} = \frac{u^{n}_{i+1,j} + u^{n}_{i-1,j} + u^{n}_{i,j+1} + u^{n}_{i,j-1}}{4}, \quad i = 1,2,\ldots,m;\ j = 1,2,\ldots,m \tag{4}$$

where n and n+1 denote the current and the next time step, respectively, while

$$u^{n}_{i,j} = u^{n}(x_i, y_j) = u^{n}(i\,\Delta x,\ j\,\Delta y), \quad i = 0,1,2,\ldots,m+1;\ j = 0,1,2,\ldots,m+1 \tag{5}$$

For simplicity, we take

$$\Delta x = \Delta y = \frac{1}{m+1}$$
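
The update in Equation (4) can be tried directly as a plain Jacobi iteration and compared with the analytical solution (3). The sketch below is an assumption-laden illustration, not the tutorial's ssor2D code: it uses m = 20, a stopping tolerance of 1e-10, and a cap of 5000 iterations, all chosen here, and it stores u at (x_i, y_j) in u(i+1,j+1) because MATLAB indices start at 1.

```matlab
m = 20;                              % interior points per direction
h = 1/(m+1);                         % dx = dy = 1/(m+1)
x = (0:m+1)*h;
y = (0:m+1)*h;

u = zeros(m+2);                      % u(i+1,j+1) holds u at (x_i, y_j)
u(:,1)   = sin(pi*x)';               % u(x,0) = sin(pi*x)
u(:,end) = sin(pi*x)'*exp(-pi);      % u(x,1) = sin(pi*x)*exp(-pi)
                                     % u(0,y) = u(1,y) = 0 stay at zero

i = 2:m+1;  j = 2:m+1;               % interior index ranges
for iter = 1:5000
    unew = u;
    unew(i,j) = 0.25*(u(i+1,j) + u(i-1,j) + u(i,j+1) + u(i,j-1));
    change = max(max(abs(unew - u)));
    u = unew;
    if change < 1e-10, break; end    % stop when the update no longer changes u
end

[X, Y] = ndgrid(x, y);
uExact = sin(pi*X).*exp(-pi*Y);      % analytical solution, Equation (3)
maxErr = max(abs(u(:) - uExact(:)));
fprintf('iterations: %d, max error: %.2e\n', iter, maxErr);
```

The remaining error comes from the centered-difference discretization and shrinks as m grows, at the cost of more iterations.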

Page 25: Computational Domain

$$u^{n+1}_{i,j} = \frac{u^{n}_{i+1,j} + u^{n}_{i-1,j} + u^{n}_{i,j+1} + u^{n}_{i,j-1}}{4}, \quad i = 1,2,\ldots,m;\ j = 1,2,\ldots,m$$

[Figure: the computational domain, with axes x (index i) and y (index j) and the boundary conditions $u(0,y) = 0$, $u(1,y) = 0$, $u(x,0) = \sin(\pi x)$, and $u(x,1) = \sin(\pi x)\,e^{-\pi}$ marked on its four sides.]

Page 26: Five-point Finite-Difference Stencil

[Figure: grid of cells showing the five-point stencil centered on cell (i, j).]

• Interior cells: where the solution of the Laplace equation is sought.
• Exterior cells: green cells denote cells where homogeneous boundary conditions are imposed, while cells with non-homogeneous boundary conditions are colored blue.

Page 27: SOR Update Function

    % original code fragment
    jb = 2; je = n+1; ib = 3; ie = m+1;
    for i=ib:2:ie
        for j=jb:2:je
            up = ( u(i  ,j+1) + u(i+1,j  ) + ...
                   u(i-1,j  ) + u(i  ,j-1) )*0.25;
            u(i,j) = (1.0 - omega)*u(i,j) + omega*up;
            del = del + abs(up-u(i,j));
        end
    end

How to vectorize it?
1. Remove the for loops
2. Define i = ib:2:ie;
3. Define j = jb:2:je;
4. Use sum for del

    % equivalent vector code fragment
    jb = 2; je = n+1; ib = 3; ie = m+1;
    i = ib:2:ie; j = jb:2:je;
    up = ( u(i  ,j+1) + u(i+1,j  ) + ...
           u(i-1,j  ) + u(i  ,j-1) )*0.25;
    u(i,j) = (1.0 - omega)*u(i,j) + omega*up;
    del = del + sum(sum(abs(up-u(i,j))));

More efficient way?

Page 28: Solution Contour Plot

Page 29: SOR Update Function

Wallclock times:

Matrix size m   ssor2Dij (for loops)   ssor2Dji (reverse loops)   ssor2Dv (vector)
128             1.01                   0.98                       0.26
256             8.07                   7.64                       1.60
512             65.81                  60.49                      11.27
1024            594.91                 495.92                     189.05

Page 30: Integration Example

• An integration of the cosine function between 0 and π/2.
• The integration scheme is the mid-point rule, for simplicity.
• Several parallel methods will be demonstrated.

$$\int_a^b \cos(x)\,dx = \sum_{i=1}^{p}\sum_{j=1}^{n}\int_{a_{ij}}^{a_{ij}+h}\cos(x)\,dx \approx \sum_{i=1}^{p}\sum_{j=1}^{n}\cos\!\left(a_{ij}+\tfrac{h}{2}\right)h$$

    a = 0; b = pi/2;        % range
    m = 8;                  % # of increments
    h = (b-a)/m;            % increment
    p = numlabs; n = m/p;   % inc. / worker
    ai = a + (i-1)*n*h;
    aij = ai + (j-1)*h;

[Figure: cos(x) between x=a and x=b, divided into increments of width h, with the function evaluated at the mid-point of each increment; the increments are divided among Worker 1, Worker 2, Worker 3, and Worker 4.]
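
The double sum over workers i and increments j can be tried without a parallel pool by looping over the workers serially. In this sketch p is set to 4 by hand (in a parallel job it would come from numlabs, as in the code above), and partial and total are names introduced here for illustration.

```matlab
a = 0;  b = pi/2;          % range of integration
m = 8;                     % total number of increments
p = 4;                     % number of emulated workers (assumption: set by hand)
h = (b - a)/m;             % increment length
n = m/p;                   % increments per worker

partial = zeros(p,1);
for i = 1:p                            % emulate worker i
    ai  = a + (i-1)*n*h;               % start of worker i's subrange
    aij = ai + (0:n-1)*h;              % left edge of each increment, j = 1..n
    partial(i) = sum(cos(aij + h/2))*h;
end
total = sum(partial);                  % combine the workers' contributions
fprintf('mid-point estimate with m = %d: %.6f (exact value is 1)\n', m, total);
```

Each worker only needs a, h, n, and its own index i, so the partial sums are independent and can be computed in any order before being added.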

Page 31: Integration Example: the Kernel

    function intOut = Integral(fct, a, b, n)
    % function intOut = Integral(fct, a, b, n)
    % performs mid-point rule integration of "fct"
    % fct -- integrand (cos, sin, etc.)
    % a -- starting point of the range of integration
    % b -- end point of the range of integration
    % n -- number of increments
    % Usage example: >> Integral(@cos, 0, pi/2, 500)   % 0 to pi/2

    h = (b - a)/n;            % increment length
    intOut = 0.0;             % initialize integral
    for j=1:n                 % sum integrals
        aij = a + (j-1)*h;    % function is evaluated at mid-interval
        intOut = intOut + fct(aij+h/2)*h;
    end

The loop evaluates

$$\sum_{j=1}^{n}\cos\!\left(a_{ij}+\tfrac{h}{2}\right)h$$

Vector form of the function:

    function intOut = Integral(fct, a, b, n)
    h = (b - a)/n;
    aij = a + (0:n-1)*h;
    intOut = sum(fct(aij+h/2))*h;

Page 32: Integration Example: Serial Integration

    % serial integration
    tic
    m = 10000;
    a = 0; b = pi*0.5;
    intSerial = Integral(@cos, a, b, m);
    toc

Page 33: Integration Example Benchmarks

Increments   For loop   Vector
10000        0.011      0.0002
20000        0.022      0.0004
40000        0.044      0.0008
80000        0.087      0.0016
160000       0.1734     0.0033
320000       0.3488     0.0069

• Timings (seconds) obtained on a quad-core Xeon X5570.
• Computation is linearly proportional to the number of increments.
• FORTRAN and C timings are an order of magnitude faster.
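
A comparison of this kind can be reproduced with tic and toc. The sketch below inlines the mid-point rule for cos rather than calling the Integral function, so that it is self-contained; m = 160000 is one of the increment counts from the table, and the measured times will differ from the table depending on the machine and MATLAB version.

```matlab
a = 0;  b = pi/2;
m = 160000;                       % number of increments
h = (b - a)/m;

tic
intLoop = 0;
for j = 1:m                       % for-loop form of the mid-point rule
    intLoop = intLoop + cos(a + (j-1)*h + h/2)*h;
end
tLoop = toc;

tic
intVec = sum(cos(a + (0:m-1)*h + h/2))*h;   % vector form
tVec = toc;

fprintf('for loop: %.4f s, vector: %.4f s\n', tLoop, tVec);
```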

Page 34: Compiler

• mcc is a MATLAB compiler:
  – It compiles m-files into C codes, object libraries, or stand-alone executables.
  – A stand-alone executable generated with mcc can run on compatible platforms without an installed MATLAB or a MATLAB license.
• Many MATLAB general and toolbox licenses are available. Infrequently, MATLAB access may be denied if all licenses are checked out. Running a stand-alone requires NO licenses and no waiting.

Page 35: Compiler (Cont'd)

• Some compiled codes may run more efficiently than m-files because they are not run in interpretive mode.
• A stand-alone enables you to share it without revealing the source.

www.bu.edu/tech/research/training/tutorials/matlab/vector/miscs/compiler/

Page 36: Compiler (Cont'd)

How to build a standalone executable:

    >> mcc -o ssor2Dijc -m ssor2Dij

How to run ssor2Dijc on Katana:

    Katana% run_ssor2Dijc.sh /usr/local/apps/matlab_2009b 256 256

In the run command, /usr/local/apps/matlab_2009b is the MATLAB root and 256 256 are the input arguments.

Details:
• The m-file is ssor2Dij.m.
• Input arguments to the code are processed as strings by mcc. Convert with str2num if need be. ssor2Dij.m requires 2 inputs, m and n:
      if isdeployed, m=str2num(m); end
• Output cannot be returned; either save to file or display on screen.
• The executable is ssor2Dijc.
• run_ssor2Dijc.sh is the run script generated by mcc.
• None of the SOR codes benefits, in runtime, from mcc.
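
The string-to-number conversion mentioned in the details can be sketched as follows. This is a hypothetical stand-in for the first few lines of a compiled function, not the actual ssor2Dij.m: the inputs are hard-coded as text to mimic what a stand-alone receives, and it tests ischar and converts with str2double, a variation on the slide's isdeployed/str2num check that also works when the same code is called from the MATLAB prompt with numeric arguments.

```matlab
% Mimic the command-line arguments "256 256", which arrive as text
m = '256';
n = '256';

% Convert only if the value arrived as text
if ischar(m), m = str2double(m); end
if ischar(n), n = str2double(n); end

u = zeros(m+2, n+2);     % the numeric values can now be used as array sizes
fprintf('grid size: %d x %d\n', size(u,1), size(u,2));
```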

Page 37: MATLAB Programming Tools

• profile - profiler to identify "hot spots" for performance enhancement.
• mlint - for inconsistencies and suspicious constructs in M-files.
• debug - MATLAB debugger.
• guide - Graphical User Interface design tool.

Page 38: MATLAB profiler

To use the profile viewer, do NOT start MATLAB with the -nojvm option.

    >> profile on -detail 'builtin' -timer 'real'
    >> % run code to be
    >> % profiled here
    >> profile viewer   % view profiling data
    >> profile off      % turn off profiler

The first command turns on the profiler: -detail 'builtin' includes MATLAB built-in functions in the profile; -timer 'real' reports wallclock time.

Profiling example:

    >> profile on
    >> ssor2Dij         % profiling the SOR Laplace solver
    >> profile viewer
    >> profile off

Page 39: How to Save Profiling Data

Two ways to save profiling data:

1. Save into a directory of HTML files
   Viewing is static, i.e., the profiling data displayed correspond to a prescribed set of options. View with a browser.
2. Save as a MAT file
   Viewing is dynamic; you can change the options. Must be viewed in the MATLAB environment.

Page 40: Profiling – save as HTML files

Viewing is static, i.e., the profiling data displayed correspond to a prescribed set of options. View with a browser.

    >> profile on
    >> plot(magic(20))
    >> profile viewer
    >> p = profile('info');
    >> profsave(p, 'my_profile')   % html files in my_profile dir

Page 41: Profiling – save as MAT file

Viewing is dynamic; you can change the options. Must be viewed in the MATLAB environment.

    >> profile on
    >> plot(magic(20))
    >> profile viewer
    >> p = profile('info');
    >> save myprofiledata p
    >> clear p
    >> load myprofiledata
    >> profview(0,p)

Page 42: MATLAB "grammar checker"

• mlint is used to identify coding inconsistencies and to recommend coding changes that improve performance.
• mlint is a standalone utility; it is also an option in profile.
• The MATLAB editor provides this feature.
• Debug mode can also be invoked through the editor.

Page 43: Running MATLAB in Command Line Mode and Batch

    Katana% matlab -nodisplay -nosplash -r "n=4, myfile(n); exit"

• Add -nojvm to save memory if Java is not required.
• For batch jobs on Katana, use the above command in the batch script.
• Visit http://www.bu.edu/tech/research/computation/linux-cluster/katana-cluster/runningjobs/ for instructions on running batch jobs.

Page 44: Comment Out Block Of Statements

On occasion, one wants to comment out an entire block of lines.

If you use the MATLAB editor:
• Select the statement block with the mouse, then
  – press Ctrl R to insert % in column 1 of each line.
  – press Ctrl T to remove % from column 1 of each line.

If you use another editor:
• Enclose the block between %{ and %}:
      %{
      n = 3000;
      x = rand(n);
      %}
• Or enclose the block in if 0 … end:
      if 0
          n = 3000;
          x = rand(n);
      end

Page 45: Multiprocessing With MATLAB

• Explicit parallel operations
  MATLAB Parallel Computing Toolbox Tutorial
  www.bu.edu/tech/research/training/tutorials/matlab-pct/
• Implicit parallel operations
  – Require shared-memory computer architecture (i.e., multicore).
  – Feature on by default. Turn it off with
        katana% matlab -singleCompThread
  – Specify the number of threads with maxNumCompThreads. To be deprecated. Still supported in Version 2010b.
  – Activated by vector operations of applications such as hyperbolic or trigonometric functions, some LAPACK routines, and Level-3 BLAS.
  – See the "Implicit Parallelism" section of the above link.
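
To see how many computational threads implicit parallelism may use in a given session, maxNumCompThreads can be queried before timing a multithreaded operation. This is a rough sketch: the matrix size of 1000 is arbitrary, and whether a particular operation actually runs on several threads depends on the MATLAB version, the operation, and the problem size.

```matlab
nThreads = maxNumCompThreads;     % current limit on computational threads
fprintf('computational threads available: %d\n', nThreads);

A = rand(1000);
tic
B = A*A;                          % dense matrix multiply, handled by Level-3 BLAS
t = toc;
fprintf('1000-by-1000 matrix multiply: %.4f seconds\n', t);
```

Running the same lines in a session started with -singleCompThread gives a single-thread baseline for comparison.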

Page 46: Useful SCV Info

• SCV home page (http://scv.bu.edu/)
• Resource Applications (https://acct.bu.edu/SCF)
• Help
  – Web-based tutorials (http://scv.bu.edu/) (MPI, OpenMP, MATLAB, IDL, Graphics tools)
  – HPC consultations by appointment
    Kadin Tseng ([email protected])
    Doug Sondak ([email protected])
  – [email protected], [email protected]

Please help us do better in the future by participating in a quick survey: http://scv.bu.edu/survey/spring11tut_survey.html