Top Banner
OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)
29

OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Dec 31, 2015

Download

Documents

Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP – Introduction*

*UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Page 2: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Outline

• What is OpenMP?– Introduction (Code Structure, Directives, Threads etc.)– Limitations– Data Scope Clauses

• Shared, Private

– Work-sharing constructs– Synchronization

Page 3: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

What is OpenMP?

• An Application Program Interface (API) that may be used to explicitly direct multithreaded, shared memory parallelism

• OpenMP is managed by the nonprofit technology consortium OpenMP Architecture Review Board (or OpenMP ARB), jointly defined by a group of major computer hardware and software vendors, including AMD, IBM, Intel, Cray, HP, Fujitsu, Nvidia, NEC, Red Hat, Texas Instruments, Oracle Corporation, and more.

• Portable & Standardized– API exist both C/C++ and Fortan 90/77– Multi platform Support (Unix, Linux etc.)

Page 4: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Specifications

• Version 3.1, Complete Specifications, July 2011

• Version 3.0, May 2008

• Version 2.5, May 2005 (C/C++ & Fortran)

• Version 2.0– C/C++, March 2002– Fortran, November 2000

• Version 1.0– C/C++, October 1998– Fortran, October 1997

Detailed Info: http://www.openmp.org/wp/openmp-specifications/

Page 5: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Intel & GNU OpenMP

• Intel Compilers– OpenMP 2.5 conforming

– Nested parallelisim

– Workqueuing extension to OpenMP

– Interoperability with POSIX and Windows threads

– OMP_DYNAMIC support

• GNU OpenMP (OpenMP+gcc)– OpenMP 3.0 Support (gcc 4.4 and later)

Page 6: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Programming Model

• Explicit parallelism

• Thread based parallelism; program runs with user specified number of multiple thread

• Uses fork & join model

Synchronization Point (“barrier”, “critical region”, “single processor region”)

Page 7: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

• Shared Memory Model– Each thread must be reach a shared memory (SMP)

Limitations of OpenMP

Page 8: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Terminology and Behavior

• OpenMP Team = Master + Worker

• Parallel Region is a block of code executed by all threads simultaneously (has implicit barrier)– The master thread always has

thread id 0– Parallel regions can be nested– If clause can be used to guard the

parallel region

Page 9: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Terminology and Behavior

• A Work-Sharing construct divides the execution of the enclosed code region among the members of the team. (Loop, Section etc.)

Page 10: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Code Structure

#include <omp.h>main () {int var1, var2, var3;/* Serial code */.../* Beginning of parallel section. Fork a team of threads.Specify variable scoping */

#pragma omp parallel private(var1, var2) shared(var3){Parallel section executed by all threads..All threads join master thread and disband}

/* Resume serial code */..}

#include <omp.h>main () {int var1, var2, var3;/* Serial code */.../* Beginning of parallel section. Fork a team of threads.Specify variable scoping */

#pragma omp parallel private(var1, var2) shared(var3){Parallel section executed by all threads..All threads join master thread and disband}

/* Resume serial code */..}

C/C++

Page 11: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Directives

Format in C/C++:

#pragma omp: Required for all OpenMP C/C++ directives.

directivename: A valid OpenMP directive. Must appear after the pragma and before any clauses.

[clause, ...] : Optional. Clauses can be in any order, and repeated as necessary unless otherwise restricted.

#pragma omp directivename [clause, ...] \

Page 12: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Directives

Example:

General Rules: Directives follow conventions of the C/C++ standards for

compiler directives. Case sensitive Only one directivename may be specified per directive Long directive lines can be "continued" on succeeding lines

by escaping the newline character with a backslash ("\") at the end of a directive line.

#pragma omp parallel default(shared) private(beta,pi)

Page 13: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Directives

PARALLEL Region Construct: A parallel region is a block of code that will be executed by

multiple threads. This is the fundamental OpenMP parallel construct.

#pragma omp parallel [clause ...]

Page 14: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Directives

C/C++ OpenMP structured block definition.

#pragma omp parallel [clause ...]{ structured_block}

Page 15: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

When a thread reaches a PARALLEL directive

• It creates a term of threads and becomes the master of the team

• The master is a member of that team, it has thread number 0 within that team (THREAD ID)

• Starting from the beginning of this parallel region, the code is duplicated and all threads will execute that code

• There is an implied barrier at the end of a parallel section

• Only the master thread continues execution past this point

Page 16: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Lab: Helloworld

Page 17: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Lab: Compiling Helloworld

$ gcc -fopenmp omp_hello.c -o omp_hello

$ export OMP_NUM_THREADS=2

$ ./omp_helloHello World from thread = 0Hello World from thread = 1

Page 18: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Lab: Helloworld

• Set environment variables (export)

• Run your OpenMP compile

bash: $ ./omp_helloHello OpenMP!Hello OpenMP!Hello OpenMP!Hello OpenMP!

bash: $ export OMP_NUM_THREADS=4

Optional Exercise:1 - set OMP_NUM_THREADS to an higher value (such as 10)2- repeat example.

Page 19: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

OpenMP Constructs

Page 20: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Data Scope Attribute Clauses

• SHARED Clause:– It declares variables in its list to be shared to each thread.– Behavior

• The pointer of the object of the same type is declared once for each thread in the team

• All threads reference to the original object

shared (list)

C/C++

Page 21: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Data Scope Attribute Clauses

• PRIVATE Clause:– It declares variables in its list to be private to each thread.– Behavior

• A new object of the same type is declared once for each thread in the team

• All references to the original object are replaced with references to the new object

• Variables declared PRIVATE are uninitialized for each thread (FIRSTPRIVATE can be used for initialization of variables)

private (list)

C/C++

Page 22: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

A work-sharing construct divides the execution of the enclosed code region among the members of team that encounter it.

Must be enclosed in a parallel region otherwise it is simply ignored.

Work-sharing constructs do not launch/create new threads.

There is no implied barrier upon entry to a work-sharing construct. However there is an implicit barrier at the end of a work-sharing construct.

Work-Sharing Constructs

Page 23: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

• Types

Work-Sharing Constructs

Page 24: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

shares iterations of a loop across the team. Represents a type of "data parallelism".

breaks work into separate, discrete sections. Each section is

executed by a thread. Can be used to implement a type of

"functional parallelism".

serializes a section of code

Work-Sharing Constructs

Page 25: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Work-Sharing Constructs

• for directive (C/C++)

#pragma omp for [clause ...] { for_loop}

Page 26: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

• schedule clause: schedule(kind [,chunk_size])– static: less overhead, default on many OpenMP compilers– dynamic & guided: useful for poorly balanced and unpredictable

workload. In guided the size of chunk decreases over time.– runtime: If this schedule is selected, the decision regarding

scheduling kind is made at run time. The schedule and (optional) chunk size are set through the OMP_SCHEDULE environment variable.

Work-Sharing Constructs

Page 27: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

• schedule clause:– describes how iterations of the loop are divided

among the threads in the team

Loop iterations are divided into pieces of size chunk statically

When a thread finishes one chunk, it is dynamically assigned another. The default chunk size is 1.

The chunk size is exponentially reduced with each dispatched piece of the iteration space. The default chunk size is 1.

Work-Sharing Constructs

Page 28: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

• nowait (C/C++) clause:– If specified, then threads do not synchronize at the end of the

parallel loop. Threads proceed directly to the next statements after the loop.

Work-Sharing Constructs

Page 29: OpenMP – Introduction* *UHEM yaz çalıştayı notlarından derlenmiştir. (uhem.itu.edu.tr)

Work-Sharing Lab : nowait