Theodore B. Tabe
Quentin F. Stout
Computer Science and Engineering, University of Michigan
Abstract: Statistical analysis of traces taken from the NAS Parallel Benchmarks (NPB) can reveal much about the network traffic to expect from scientific applications run on distributed-memory parallel computers. For instance, such applications use relatively few communication library functions, their message lengths vary widely, they use many more short messages than long ones, and within a single application the messages tend to follow simple patterns. Software and hardware designers can use information such as this to optimize their systems for the highest possible performance.
We analyze both the static and dynamic aspects of the MPI procedure calls in these benchmarks, characterizing the routines that are utilized and the message lengths involved. We also provide the communication kernels of these benchmarks, showing that they have loop-based structures that are easily characterized.
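As a rough illustration of the dynamic side of such an analysis, the sketch below tallies which MPI routines appear in a trace and how the message lengths are distributed. The trace records, the `summarize` helper, and the 1024-byte threshold separating short from long messages are all assumptions made for this example; they are not the paper's trace format, tooling, or data.

```python
from collections import Counter

# Hypothetical trace records: (MPI routine name, message length in bytes).
# These values are invented for illustration, not taken from NPB traces.
TRACE = [
    ("MPI_SEND", 8),
    ("MPI_RECV", 8),
    ("MPI_SEND", 65536),
    ("MPI_RECV", 65536),
    ("MPI_ALLREDUCE", 8),
    ("MPI_SEND", 8),
    ("MPI_RECV", 8),
    ("MPI_BCAST", 4),
]


def summarize(trace, short_limit=1024):
    """Count calls per routine and split messages into short and long.

    short_limit is an assumed cutoff in bytes, chosen only for this sketch.
    """
    calls = Counter(name for name, _ in trace)
    lengths = [length for _, length in trace]
    short = sum(1 for length in lengths if length <= short_limit)
    return {
        "calls": dict(calls),
        "distinct_routines": len(calls),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "short_messages": short,
        "long_messages": len(lengths) - short,
    }


summary = summarize(TRACE)
print(summary)
```

A summary of this kind makes the abstract's observations easy to check against a given trace: how many distinct routines are used, how widely the message lengths vary, and how short messages compare in number to long ones.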
Keywords: parallel computing, benchmarks, trace analysis, message-passing, distributed memory parallel computer, communication patterns, MPI, NAS Parallel Benchmarks, performance analysis, performance skeleton, MPI_SEND, MPI_RECV, MPI_BCAST, MPI_ALLREDUCE, MPI_REDUCE, MPI_WAIT
Complete paper. This paper is University of Michigan Computer Science and Engineering technical report CSE-TR-386-99.
An analysis of the performance of MPI collective communication, such as MPI_ALLTOALL, showing the deleterious effects of operating system jitter. Julia Lipman did a follow-up mathematical analysis of this.
A modest explanation of parallel computing, a tutorial, Parallel Computing 101, and a list of parallel computing resources.
An overview of our work, and relevant papers in parallel computing.
Copyright © 2004-2017 Quentin F. Stout.