Lecture 12: OpenMP Abhinav Bhatele, Department of Computer Science Introduction to Parallel Computing (CMSC498X / CMSC818X)

Lecture 12: OpenMP · 2020. 10. 23. · Abhinav Bhatele (CMSC498X/CMSC818X)

Dec 30, 2020



Announcements

• Use office hours

• If you foresee being unable to complete an assignment for a valid reason, email me as soon as possible rather than after the deadline


saxpy (single precision a*x+y) example


#pragma omp parallel for
for (int i = 0; i < n; i++) {
    z[i] = a * x[i] + y[i];
}
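As a minimal sketch, the loop above can be wrapped in a function. The name `saxpy` and its signature are assumptions made for illustration, not something taken from the lecture. The pragma only takes effect when the compiler's OpenMP support is enabled (for example `-fopenmp` with GCC); otherwise it is ignored and the loop runs serially with the same result.

```c
/* Hypothetical helper: computes z = a * x + y over n single-precision elements. */
void saxpy(int n, float a, const float *x, const float *y, float *z)
{
    /* Each iteration writes a different z[i], so iterations are independent
       and can be divided among threads. The loop variable i is private to
       each thread automatically. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        z[i] = a * x[i] + y[i];
    }
}
```

The arrays and the scalars `n` and `a` are shared by default here, which is safe because threads only read them or write to disjoint elements of `z`.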


Overriding defaults using clauses

• Specify how data is shared between threads executing a parallel region

• private(list)

• shared(list)

• default(shared | none)

• reduction(operator: list)

• firstprivate(list)

• lastprivate(list)


https://www.openmp.org/spec-html/5.0/openmpsu106.html#x139-5540002.19.4


private clause

• Each thread has its own copy of the variables in the list

• Private variables are uninitialized when a thread starts

• The value of a private variable is unavailable to the master thread after the parallel region has been executed


default clause

• Sets the data-sharing attribute for variables whose attribute would otherwise be determined implicitly
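One common use of this clause is `default(none)`, which forces every variable referenced in the region to be listed explicitly. The sketch below assumes a hypothetical helper `scaled_sum_of_squares`; the names are illustrative only.

```c
/* Hypothetical helper: returns factor * (0^2 + 1^2 + ... + (n-1)^2). */
long scaled_sum_of_squares(int n, long factor)
{
    long total = 0;

    /* With default(none), no variable gets an implicit data-sharing
       attribute. Leaving n, factor, or total out of the clauses below
       would be reported as a compile-time error. The loop variable i is
       still private automatically. */
    #pragma omp parallel for default(none) shared(n, factor) reduction(+: total)
    for (int i = 0; i < n; i++) {
        total += factor * (long)i * (long)i;
    }
    return total;
}
```

Listing every variable is more verbose, but it makes accidental sharing visible at compile time instead of showing up as a race at run time.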


Anything wrong with this example?


val = 5;

#pragma omp parallel for private(val)
for (int i = 0; i < n; i++) {
    ... = val + 1;
}

Each thread's private copy of val is uninitialized, so the value 5 assigned before the loop is not available to threads inside the loop


Anything wrong with this example?


#pragma omp parallel for private(val)
for (int i = 0; i < n; i++) {
    val = i + 1;
}

printf("%d\n", val);

The private copies of val are discarded when the loop ends, so the value assigned inside the loop is not available to the master thread outside the loop


firstprivate clause

• Initializes each thread’s private copy to the value of the master thread’s copy


val = 5;

#pragma omp parallel for firstprivate(val)
for (int i = 0; i < n; i++) {
    ... = val + 1;
}
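A self-contained variant of the snippet above might look like the following sketch. The output array `out` and the function name `fill_from_initial` are assumptions standing in for the elided left-hand side on the slide.

```c
/* Hypothetical helper: stores val + 1 into every element of out. */
void fill_from_initial(int n, int val, int *out)
{
    /* firstprivate gives each thread its own copy of val, initialized
       to the value val had just before the parallel loop. */
    #pragma omp parallel for firstprivate(val)
    for (int i = 0; i < n; i++) {
        out[i] = val + 1;
    }
}
```

Calling it with `val` set to 5 would fill `out` with 6, whereas the `private(val)` version would read an uninitialized copy.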


lastprivate clause

• Writes the value belonging to the thread that executed the last iteration of the loop to the master’s copy

• The last iteration is the one that would run last in sequential order (i = n - 1 here), not the one that happens to finish last


#pragma omp parallel for lastprivate(val)
for (int i = 0; i < n; i++) {
    val = i + 1;
}

printf("%d\n", val);
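The same pattern can be written as a small function, sketched here under the assumption that `n` is at least 1 (with no iterations, no thread executes a last iteration). The name `value_after_loop` is hypothetical.

```c
/* Hypothetical helper: returns the value val holds after the loop.
   Assumes n >= 1. */
int value_after_loop(int n)
{
    int val = 0;

    /* lastprivate copies the private val from the thread that ran the
       sequentially last iteration (i == n - 1) back to the original val. */
    #pragma omp parallel for lastprivate(val)
    for (int i = 0; i < n; i++) {
        val = i + 1;
    }
    return val;
}
```

For `n` equal to 10, the returned value is 10 regardless of which thread executed iteration 9 or when it finished.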


reduction(operator: list) clause

• Reduce values across private copies of a variable

• Operators: +, -, *, &, |, ^, &&, ||, max, min


#pragma omp parallel for reduction(+: val)
for (int i = 0; i < n; i++) {
    val += i;
}

printf("%d\n", val);

https://www.openmp.org/spec-html/5.0/openmpsu107.html#x140-5800002.19.5
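The following sketch shows two of the listed operators, `+` and `max`, in complete functions. The function names are hypothetical, and `max_element` assumes a non-empty array.

```c
/* Hypothetical helper: returns 0 + 1 + ... + (n - 1). */
int sum_indices(int n)
{
    int val = 0;

    /* Each thread accumulates into its own private val (initialized to 0
       for +); the private copies are added to the original at the end. */
    #pragma omp parallel for reduction(+: val)
    for (int i = 0; i < n; i++) {
        val += i;
    }
    return val;
}

/* Hypothetical helper: returns the largest element of a. Assumes n >= 1. */
int max_element(const int *a, int n)
{
    int m = a[0];

    /* With the max operator, the private copies are combined by taking
       the largest of them together with the original value of m. */
    #pragma omp parallel for reduction(max: m)
    for (int i = 1; i < n; i++) {
        if (a[i] > m) {
            m = a[i];
        }
    }
    return m;
}
```

Without the reduction clause, `val` and `m` would be shared and the unsynchronized updates would race.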


User-specified loop scheduling

• Schedule clause

• type: static, dynamic, guided, runtime

• static: iterations are divided as evenly as possible among threads (about #iterations/#threads each)

• A chunk size smaller than #iterations/#threads can be used to interleave iterations across threads

• dynamic: assign a block of chunk-size iterations to each thread

• When a thread is finished, it retrieves the next block from an internal work queue

• Default chunk size = 1


schedule (type[, chunk])
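As a rough illustration of when a non-default schedule helps, the sketch below uses a loop whose iterations get more expensive as `i` grows. The workload and the name `triangular_work` are invented for this example, and the chunk size of 4 is an arbitrary choice.

```c
/* Hypothetical workload: iteration i performs i units of work, so later
   iterations cost more than earlier ones. Returns n * (n - 1) / 2. */
long triangular_work(int n)
{
    long total = 0;

    /* dynamic, 4: each thread takes 4 iterations at a time and fetches
       the next block when it finishes, which evens out the uneven cost.
       A plain static split would leave the thread that owns the last
       block of iterations with most of the work. */
    #pragma omp parallel for schedule(dynamic, 4) reduction(+: total)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            total += 1;
        }
    }
    return total;
}
```

Replacing the clause with `schedule(static, 1)` would instead hand out iterations round-robin, which also interleaves cheap and expensive iterations but without the run-time queue.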


Other schedules

• guided: similar to dynamic, but starts with a large chunk size and gradually decreases it to handle load imbalance between iterations

• auto: the scheduling decision is delegated to the compiler and/or runtime system

• runtime: use the OMP_SCHEDULE environment variable


https://software.intel.com/content/www/us/en/develop/articles/openmp-loop-scheduling.html
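A minimal sketch of the `guided` and `runtime` types follows; both function names are hypothetical and the loop body is a placeholder workload.

```c
/* Hypothetical helper: sums 0..n-1 using a guided schedule. */
long sum_guided(int n)
{
    long total = 0;

    /* Chunks start large and shrink as the remaining work decreases. */
    #pragma omp parallel for schedule(guided) reduction(+: total)
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* Hypothetical helper: same sum, with the schedule chosen at run time. */
long sum_runtime(int n)
{
    long total = 0;

    /* The schedule type and chunk size are read from the OMP_SCHEDULE
       environment variable when the program runs. */
    #pragma omp parallel for schedule(runtime) reduction(+: total)
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}
```

For instance, launching the program with `OMP_SCHEDULE="dynamic,4"` set in the environment would make the second loop use a dynamic schedule with chunk size 4, which allows trying different schedules without recompiling.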


Calculate the value of $\pi = \int_0^1 \frac{4}{1 + x^2}\,dx$

int main(int argc, char *argv[])
{
    ...
    n = 10000;

    h = 1.0 / (double) n;
    sum = 0.0;

    for (i = 1; i <= n; i += 1) {
        x = h * ((double)i - 0.5);
        sum += (4.0 / (1.0 + x * x));
    }
    pi = h * sum;

    ...
}


Calculate the value of $\pi = \int_0^1 \frac{4}{1 + x^2}\,dx$

int main(int argc, char *argv[])
{
    ...
    n = 10000;
    h = 1.0 / (double) n;
    sum = 0.0;

    #pragma omp parallel for firstprivate(h) private(x) reduction(+: sum)
    for (i = 1; i <= n; i += 1) {
        x = h * ((double)i - 0.5);
        sum += (4.0 / (1.0 + x * x));
    }
    pi = h * sum;

    ...
}
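The elided parts of the program above are not reproduced here. As a sketch under that caveat, the core computation can be isolated in a function; the name `compute_pi` is hypothetical, and `x` is declared inside the loop body (which makes it private without a clause) instead of using `private(x)` as on the slide.

```c
/* Hypothetical helper: midpoint-rule approximation of the integral of
   4 / (1 + x^2) over [0, 1] using n subintervals. */
double compute_pi(int n)
{
    double h = 1.0 / (double)n;   /* width of each subinterval */
    double sum = 0.0;

    /* h is read-only, so each thread gets an initialized copy via
       firstprivate; sum is combined across threads with a + reduction. */
    #pragma omp parallel for firstprivate(h) reduction(+: sum)
    for (int i = 1; i <= n; i++) {
        double x = h * ((double)i - 0.5);   /* midpoint of subinterval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}
```

Because floating-point addition is not associative, the result may differ in the last few digits between runs with different thread counts.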


Parallel region

• All threads execute the structured block

• Number of threads can be specified just like the parallel for directive


#pragma omp parallel [clause [clause] ... ]
    structured block
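To make "all threads execute the structured block" concrete, the sketch below has every thread in the team increment a shared counter once. The function name is hypothetical, the request for 4 threads is an arbitrary choice, and the `atomic` directive is used only to keep the shared update safe.

```c
/* Hypothetical helper: returns how many threads executed the region. */
int count_threads_in_region(void)
{
    int count = 0;

    /* num_threads(4) requests a team of 4 threads; the runtime may
       provide fewer. Every thread in the team runs the block once. */
    #pragma omp parallel num_threads(4)
    {
        /* count is shared, so the increment must not be interleaved. */
        #pragma omp atomic
        count++;
    }
    return count;
}
```

Unlike `parallel for`, nothing is divided among threads here: the block is replicated, so the returned count equals the team size (1 if OpenMP is disabled at compile time).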


Synchronization

• Concurrent access to shared data may result in inconsistencies

• Use mutual exclusion to avoid that

• critical directive

• atomic directive

• Library lock routines


https://software.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top/appendix/adding-parallelism-to-your-program/replacing-annotations-with-openmp-code/adding-openmp-code-to-synchronize-the-shared-resources.html
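The first two options can be sketched as follows; the function names and workloads are made up for illustration, and the library lock routines are not shown. `atomic` fits a single simple update of one variable, while `critical` can protect a block that updates several variables together.

```c
/* Hypothetical helper: counts the even values in a[0..n-1]. */
int count_even_atomic(const int *a, int n)
{
    int count = 0;

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (a[i] % 2 == 0) {
            /* atomic protects this single update of the shared count. */
            #pragma omp atomic
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: returns the index of the largest value in a.
   Assumes n >= 1 and distinct values. */
int index_of_max_critical(const int *a, int n)
{
    int best = a[0];
    int best_idx = 0;

    #pragma omp parallel for
    for (int i = 1; i < n; i++) {
        /* critical lets only one thread at a time run the block, so the
           comparison and both assignments happen as one unit. */
        #pragma omp critical
        {
            if (a[i] > best) {
                best = a[i];
                best_idx = i;
            }
        }
    }
    return best_idx;
}
```

Both versions serialize the protected updates, so in a real code a reduction is usually faster when one applies; the point here is only to show how mutual exclusion is expressed.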


Abhinav Bhatele

5218 Brendan Iribe Center (IRB) / College Park, MD 20742

phone: 301.405.4507