This lesson is being piloted (Beta version)

ACENET Summer School - Directive-Based Parallel Programming with OpenMP and OpenACC: Glossary

Key Points

Introduction
  • Shared-memory parallel programs break up large problems into a number of smaller ones and execute them simultaneously

  • OpenMP programs are limited to a single physical machine

  • OpenMP support is built into all commonly used C, C++, and Fortran compilers

A Parallel Hello World Program
  • OpenMP pragmas tell the compiler what to parallelize and how to parallelize it.

  • The environment variable OMP_NUM_THREADS controls how many threads are used.

  • The order in which parallel elements are executed cannot be guaranteed.

  • A compiler that isn’t aware of OpenMP pragmas will compile a single-threaded program.

OpenMP Work Sharing Constructs
  • Data parallelism refers to the execution of the same task simultaneously on multiple computing cores.

  • Functional parallelism refers to the concurrent execution of different tasks on multiple computing cores.

Parallel Operations with Arrays
  • The parallel for pragma executes the iterations of a loop in parallel.

  • When a shared variable is written by multiple threads, incorrect results can be produced.

  • The private clause gives each thread its own copy of a variable.

Race Conditions with OpenMP
  • Race conditions can be avoided by using the omp critical or the omp atomic directives

  • The best way to parallelize a summation is the reduction clause

OpenMP Tasks
  • OpenMP can manage general parallel tasks

  • Tasks allow parallelizing applications with irregular patterns of parallelism, such as recursive algorithms and pointer-based data structures.

Calculating Many-Body Potentials
  • Writing vectorization-friendly code takes additional time but is usually worth the effort

  • Choosing a different loop schedule can compensate for unbalanced loop iterations

Programming GPUs with OpenMP
  • OpenMP offers a quick path to accelerated computing with less programming effort

Introduction to GPU Programming with OpenACC
  • OpenACC offers a quick path to accelerated computing with less programming effort

Glossary

compiler
A program that translates statements written in a programming language into the machine language used by the CPU.
fork
A system call that creates a new process running concurrently with the calling process.
pointer
A special variable that stores the address of a variable rather than its value.
pragma
The method specified by the C standard for providing additional information to the compiler, beyond what is conveyed in the language itself.
race condition
A race condition (or data race) occurs when two instructions from different threads access the same memory location, at least one of the accesses is a write, and no synchronization mandates a particular order between the accesses.
thread
The smallest unit of computing that can be scheduled by an operating system.