MPI Performance Analysis and Operating System Jitter

Statistical Analysis of Communication Time on the IBM SP2

Theodore B. Tabe, Janis Hardwick, Quentin F. Stout
University of Michigan

Extended Abstract: For parallel computers, the execution time of communication routines is an important determinant of users' performance. We measured the MPL and MPI performance of the IBM SP2, observing that the higher-level collective communication routines such as MPI_ALLTOALL show a drop in performance as the number of processors involved in the communication increases. While a few others have also studied the SP2's communication performance, they have reported only average performance, and have neither commented on the drop in performance nor determined its causes.

We generated a distribution of times for these routines and developed a simulator in an attempt to recreate the observed distribution. By studying distributions of communication times and by refining the simulator, we were able to discern that the performance decrease is due to the variation in the communication times of the lower-level send-receive primitives upon which the higher-level communication routines are built. This variation is in turn caused by the deleterious effects of interrupts generated by an operating system (AIX) which is not tuned to high-performance parallel computing. The interrupts degrade performance in an additive manner, spreading their effects throughout the system. This behavior is sometimes known as jitter, and its elimination is necessary for systems to use thousands of processors efficiently. Modern systems recognize this as a serious performance issue and aim to reduce the jitter caused by the run-time system.
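The mechanism described above can be sketched with a toy simulation. This is not the simulator from the paper; it is a minimal illustration under assumed parameters. Every name and default here (`simulate_collective`, `interrupt_prob`, `pareto_shape`, `interrupt_scale`) is hypothetical, and the Pareto delay is simply one convenient heavy-tailed choice. Each step of a collective operation finishes only when the slowest process finishes, so a rare interrupt on any one process delays all of them, and the delays add up across steps.

```python
import random


def simulate_collective(num_procs, num_steps, base_time=1.0,
                        interrupt_prob=0.01, pareto_shape=1.5,
                        interrupt_scale=0.5, rng=None):
    """Return the simulated time of one collective operation.

    All parameters are illustrative assumptions, not measured SP2 values.
    Each step costs `base_time` per process; with probability
    `interrupt_prob` a process is also hit by an interrupt whose length
    is drawn from a Pareto distribution (a heavy-tailed choice).
    """
    rng = rng or random.Random()
    total = 0.0
    for _ in range(num_steps):
        slowest = 0.0
        for _ in range(num_procs):
            t = base_time
            if rng.random() < interrupt_prob:
                t += interrupt_scale * rng.paretovariate(pareto_shape)
            slowest = max(slowest, t)
        # The step ends only when the slowest process is done,
        # so one interrupted process holds up everyone.
        total += slowest
    return total


def mean_time_per_step(num_procs, trials=200, num_steps=20, seed=0):
    """Average per-step time over many simulated collective operations."""
    rng = random.Random(seed)
    times = [simulate_collective(num_procs, num_steps, rng=rng)
             for _ in range(trials)]
    return sum(times) / (trials * num_steps)


if __name__ == "__main__":
    for p in (1, 4, 16, 64):
        print(p, round(mean_time_per_step(p), 3))
```

In this sketch the per-process work is identical for every processor count, yet the mean time per step tends to rise as `num_procs` grows, because the chance that at least one process is interrupted in a given step increases with the number of processes.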

Another aspect of this analysis is that the durations of the interrupts follow a heavy-tailed distribution. This was slightly surprising and again is due to the operating system being designed for workstations with a wide variety of system function requirements.
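To illustrate why a heavy tail matters, one can compare the expected maximum of p independent delays under two textbook distributions. These are assumed here purely for illustration and are not the distributions fitted in the paper: an exponential delay with rate \lambda (light tail) and a Pareto delay with scale x_m and shape \alpha > 1 (heavy tail).

```latex
% Light tail: X_i ~ Exponential(\lambda), with H_p the p-th harmonic number
\mathbb{E}\Bigl[\max_{1 \le i \le p} X_i\Bigr]
  = \frac{H_p}{\lambda}
  \approx \frac{\ln p + \gamma}{\lambda}

% Heavy tail: X_i ~ Pareto(x_m, \alpha), \alpha > 1
\mathbb{E}\Bigl[\max_{1 \le i \le p} X_i\Bigr]
  = x_m \, \Gamma\!\Bigl(1 - \tfrac{1}{\alpha}\Bigr)
    \frac{\Gamma(p + 1)}{\Gamma\!\bigl(p + 1 - \tfrac{1}{\alpha}\bigr)}
  \sim x_m \, \Gamma\!\Bigl(1 - \tfrac{1}{\alpha}\Bigr) \, p^{1/\alpha}
```

Under these assumptions the slowest of p processes grows only logarithmically in p for the light-tailed case but polynomially, as p^{1/\alpha}, for the heavy-tailed case, which is one way to see why heavy-tailed interrupt durations hurt more as the processor count rises.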

Our results were obtained for IBM's MPL message-passing library, which is currently the most highly tuned of the communication libraries available. However, other measurements show that the same results hold for the MPI (Message Passing Interface) library.

Keywords: collective communication, performance evaluation, all-to-all, MPI_ALLTOALL, MPI_SEND, benchmarking, message passing, parallel computer, communication overhead, operating system jitter, interrupts, heavy tail distribution

Complete paper. This paper appears in Computing Science and Statistics 27 (1995), pp. 347-351.

Related work

Theoretical limits: A paper analyzing the limits of hypercube performance for all-to-all communication, using a model more powerful than the SP2, is here.
Performance analysis: This work motivated an analysis of synchronization delays caused by stochastic behavior.

My papers in parallel computing, and an overview of my research.
Parallel computing: a brief explanation of parallel computing, a tutorial, Parallel Computing 101, and a list of parallel computing resources.