ACENET Summer School - General: Glossary

Key Points

Introduction
  • Compared to sequential computing, parallel computing is much better suited for modelling, simulating and understanding complex, real-world phenomena.

  • Modern computers have several levels of parallelism.

Parallel Computer Architecture
  • Sequential computers, including modern CPU cores, closely resemble von Neumann’s 1945 design.

  • Parallel computers can be characterized as collections of von Neumann CPUs.

  • Parallel computers may be shared-memory, distributed-memory, or both.

Parallel Programming Models
  • There are many layers of parallelism in modern computer systems.

  • An application can implement any or all of vectorization, multithreading, and message passing.
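
As an illustration of the multithreading layer only, here is a minimal sketch using Python's standard `concurrent.futures` module. The function names `partial_sum` and `threaded_sum_of_squares` are hypothetical, chosen for this example. Note that in standard CPython the global interpreter lock limits the speedup of pure-Python, CPU-bound threads, so the sketch shows the structure (split the work, compute partial results, combine them) rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Sum of squares over the half-open range [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def threaded_sum_of_squares(n, workers=4):
    """Split range(n) into contiguous chunks and sum each chunk in a thread."""
    step = max(1, (n + workers - 1) // workers)  # ceiling division, at least 1
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each thread returns a partial result; the main thread combines them.
        return sum(pool.map(partial_sum, chunks))


total = threaded_sum_of_squares(10_000)
print(total)
```

The same split-compute-combine pattern carries over to message passing, where each chunk would live in a separate process and the partial results would be combined with an explicit communication step.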

Performance and Scalability
  • Increasing the number of processors at a fixed problem size usually decreases efficiency.

  • Increasing the problem size at a fixed number of processors usually increases efficiency.

  • A parallel problem can often be solved efficiently by increasing the number of processors and the problem size simultaneously. This is called “weak scaling”.

  • Not every problem is amenable to weak scaling.
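
The scaling trends above can be checked with a toy cost model. The sketch below assumes a hypothetical run time made of a fixed serial part plus work that divides perfectly among processors, with no communication cost. This is a deliberate simplification, not a measurement of any real program. Efficiency is taken as speedup divided by the number of processors.

```python
def run_time(work, procs, serial=1.0):
    """Hypothetical cost model: fixed serial part plus perfectly divided work."""
    return serial + work / procs


def efficiency(work, procs, serial=1.0):
    """Speedup relative to one processor, divided by the processor count."""
    speedup = run_time(work, 1, serial) / run_time(work, procs, serial)
    return speedup / procs


proc_counts = (1, 2, 4, 8, 16)

# Fixed problem size, more processors: efficiency falls.
fixed_size = [efficiency(100.0, p) for p in proc_counts]

# Fixed processor count, larger problem: efficiency rises.
fixed_procs = [efficiency(n, 8) for n in (100.0, 1000.0, 10000.0)]

# Problem size grown in proportion to the processor count: efficiency stays high.
scaled_together = [efficiency(100.0 * p, p) for p in proc_counts]

for name, values in [("fixed size", fixed_size),
                     ("fixed procs", fixed_procs),
                     ("scaled together", scaled_together)]:
    print(name, [round(v, 3) for v in values])
```

In this model the serial part is what drags efficiency down as processors are added; growing the problem makes that part relatively smaller. Real codes also pay communication costs that grow with the processor count, which is one reason not every problem scales this way.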

Independent Tasks and Job Schedulers
Input and Output
Analyzing Performance Using a Profiler
  • Don’t start to parallelize or optimize your code without having used a profiler first.

  • A programmer can easily spend many hours “optimizing” a part of the code, only to speed up the program by a minuscule amount.

  • When viewing the profiler report, look for areas where the largest amounts of CPU time are spent, working your way down.

  • Pay special attention to areas that you didn’t expect to be slow.
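
As one possible way to follow this advice in Python, the sketch below uses the standard-library `cProfile` and `pstats` modules. The functions `slow_sum`, `fast_sum` and `main` are hypothetical stand-ins for real application code; compiled codes would use a different profiler, but the workflow of measuring first and reading the report from the top down is the same.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Sum of squares 0..n-1 with an explicit loop (the expected hot spot)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum(n):
    """Same result from the closed-form formula."""
    return (n - 1) * n * (2 * n - 1) // 6


def main():
    return slow_sum(200_000), fast_sum(200_000)


# Collect timing data only while main() runs.
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Print the five entries with the largest cumulative time.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Reading the report from the top, `slow_sum` should account for nearly all of the time, which suggests it is the only function worth optimizing in this toy example.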

Thinking in Parallel
  • Adapting a sequential code so that it runs efficiently in parallel requires both planning and experimentation.

  • It is vital to first understand both the problem and the sequential algorithm.

  • Shorter independent tasks need more overall communication.

  • Longer tasks and large variations in task length can cause resources to be left unused.

  • Domain decomposition can be used in many cases to reduce communication by processing short-range interactions locally.

  • There are many textbooks and publications that describe different parallel algorithms. Try finding existing solutions for similar problems.
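
The domain decomposition idea can be sketched on a one-dimensional grid. In the example below, all function names are hypothetical, and the subdomains are processed one after another in a single process. In a real parallel code each subdomain would belong to its own process, and the one-cell "halo" copied from each neighbour would be the only data exchanged. The short-range interaction here is a three-point average.

```python
def smooth(values):
    """Three-point average on interior points; the two endpoints are unchanged."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out


def decompose(n, parts):
    """Split range(n) into contiguous [start, stop) blocks of near-equal size."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for rank in range(parts):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def smooth_decomposed(values, parts):
    """Apply smooth() block by block, using one halo cell per neighbour."""
    n = len(values)
    result = []
    for start, stop in decompose(n, parts):
        lo = max(start - 1, 0)  # left halo cell, if a neighbour exists
        hi = min(stop + 1, n)   # right halo cell, if a neighbour exists
        local = smooth(values[lo:hi])
        # Keep only the cells this block owns; halo cells are discarded.
        offset = start - lo
        result.extend(local[offset:offset + (stop - start)])
    return result


data = [float(i * i % 7) for i in range(20)]
print(decompose(10, 3))
print(smooth_decomposed(data, 4) == smooth(data))
```

Each block needs only one value from each neighbour regardless of how many cells it owns, so larger blocks mean less communication relative to computation.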

Glossary
