This lesson is being piloted (Beta version)

OpenMP Work Sharing Constructs

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • During parallel code execution, how is the work distributed between threads?

Objectives
  • Learn about the OpenMP constructs for worksharing.

The omp parallel for

...
#pragma omp parallel for
    for (i=0; i < N; i++)
        c[i] = a[i] + b[i];
...

The omp sections

Functional parallelism can be implemented using sections.

...
#pragma omp parallel shared(a,b,c,d) private(i)
  {
#pragma omp sections nowait
    {
#pragma omp section
    for (i=0; i < N; i++)
      c[i] = a[i] + b[i];
#pragma omp section
    for (i=0; i < N; i++)
      d[i] = a[i] * b[i];
    }  /* end of sections */
  }  /* end of parallel region */
...

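nowait - removes the implicit barrier at the end of the sections construct, so a thread that has finished its section continues without waiting for the other threads. In this example the parallel region ends immediately afterwards, and the barrier at the end of the parallel region still synchronizes all threads.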

Using parallel sections with different thread counts

Compile the file sections.c and run it with different numbers of threads. Start with 1 thread:

srun -c1 ./a.out

In this example there are two sections, and the program prints out which thread is handling which section.

  • What happens if the number of threads and the number of sections are different?
  • More threads than sections?
  • Fewer threads than sections?

Solution

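If there are more threads than sections, each section is executed by one thread and the remaining threads have no section to execute. If there are more sections than threads, at least one thread executes more than one section; how the sections are assigned to threads is implementation defined.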

Applying the parallel directive

What will the following code do?

omp_set_num_threads(8);
#pragma omp parallel
for(i=0; i < N; i++){C[i] = A[i] + B[i];}

Answers:

  1. One thread will execute each iteration sequentially.
  2. The iterations will be evenly distributed across 8 threads.
  3. Each of the 8 threads will execute all iterations sequentially, overwriting the values of C.

Solution

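The correct answer is 3. The parallel directive on its own creates a team of threads but does not divide the loop among them, so every thread runs all N iterations and writes the same values to C. To distribute the iterations across the threads, use #pragma omp parallel for. Note also that if i is declared outside the parallel region it is shared by default, so the threads would race on the loop counter unless i is made private.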

Key Points

  • Data parallelism refers to the simultaneous execution of the same task on different parts of the data by multiple computing cores.

  • Functional parallelism refers to the concurrent execution of different tasks on multiple computing cores.