Why collect MPI tracing data?
Collecting MPI tracing data can help you identify places in an MPI program where a performance problem could be due to MPI calls. Examples of possible performance problems are poor load balancing, synchronization delays, and communication bottlenecks.
How is MPI tracing data collected by the Collector?
The Collector interposes wrapper functions on the real MPI functions to trace calls and the time spent in these calls. The MPI functions that are traced are listed in the following table.
MPI_Allgather | MPI_Allgatherv |
MPI_Allreduce | MPI_Alltoall |
MPI_Alltoallv | MPI_Barrier |
MPI_Bcast | MPI_Bsend |
MPI_Gather | MPI_Gatherv |
MPI_Irecv | MPI_Isend |
MPI_Recv | MPI_Reduce |
MPI_Reduce_scatter | MPI_Rsend |
MPI_Scan | MPI_Scatter |
MPI_Scatterv | MPI_Send |
MPI_Sendrecv | MPI_Sendrecv_replace |
MPI_Ssend | MPI_Wait |
MPI_Waitall | MPI_Waitany |
MPI_Waitsome | MPI_Win_fence |
MPI_Win_lock |
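The wrapper idea described above can be pictured with a small sketch. This is only an analogy written in Python, not the Collector's mechanism, which interposes on the MPI library functions themselves. The names `traced`, `trace`, and `fake_barrier` are hypothetical, and `fake_barrier` is a stand-in that makes no MPI call.

```python
import functools
import time

# Hypothetical in-memory trace store: function name -> [call count, total seconds].
trace = {}

def traced(func):
    """Wrap func so each call is counted and timed, then forwarded unchanged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)  # forward to the "real" function
        finally:
            elapsed = time.perf_counter() - start
            entry = trace.setdefault(func.__name__, [0, 0.0])
            entry[0] += 1        # one more traced call
            entry[1] += elapsed  # time spent inside the call

    return wrapper

# Stand-in for a real MPI function (assumption: it only sleeps briefly).
def fake_barrier():
    time.sleep(0.01)

# Replace the original name with the wrapper, so callers are traced
# without changing their code.
fake_barrier = traced(fake_barrier)

fake_barrier()
fake_barrier()
calls, seconds = trace["fake_barrier"]
print(calls, seconds)
```

Because the wrapper takes over the original name, the calling code stays unchanged, which is the property that lets a tracing tool observe calls without modifying the program source.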
What MPI tracing metrics can I see in the Performance Analyzer?
MPI tracing data is converted into the following metrics:
Metric | Definition |
---|---|
MPI Receives | Number of receive operations in MPI functions that receive data |
MPI Bytes Received | Number of bytes received in MPI functions |
MPI Sends | Number of send operations in MPI functions that send data |
MPI Bytes Sent | Number of bytes sent in MPI functions |
MPI Time | Time spent in all calls to MPI functions |
Other MPI Calls | Number of calls to other MPI functions |
The number of bytes recorded as received or sent is the buffer size given in the call. This might be larger than the actual number of bytes received or sent. For the global and collective communication functions, the number of bytes sent or received is the maximum number, assuming direct interprocessor communication and no optimization of the data transfer or re-transmission of the data.
Similarly, the number of send operations or receive operations might be larger than the actual number of operations performed.
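As a rough illustration of why the recorded byte count can exceed the actual one, the sketch below assumes that the buffer size given in a call is the element count multiplied by the element size. The helper `recorded_bytes` and the datatype sizes in `DATATYPE_SIZE` are assumptions made for this example; real sizes depend on the platform and the MPI implementation.

```python
# Assumed datatype sizes in bytes (illustrative values only).
DATATYPE_SIZE = {"MPI_CHAR": 1, "MPI_INT": 4, "MPI_DOUBLE": 8}

def recorded_bytes(count, datatype):
    """Bytes attributed to a call: buffer element count times element size."""
    return count * DATATYPE_SIZE[datatype]

# A receive posted with room for 1000 doubles...
recorded = recorded_bytes(1000, "MPI_DOUBLE")
# ...that is matched by a message carrying only 100 doubles.
actual = recorded_bytes(100, "MPI_DOUBLE")

print(recorded, actual)  # 8000 800
```

In this case the metric would show 8000 bytes received, although only 800 bytes arrived in the message.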
What are the limitations on MPI tracing data collection?
MPI tracing is supported only for the Sun HPC ClusterTools™ software implementation of MPI.
You cannot collect MPI tracing data from a program that is already running unless the Sampling Collector library, libcollector.so, has been preloaded. See Collecting Performance Data on a Running Process for more information.
For more information, see the Performance Analyzer manual.
See also | |
---|---|
Collecting Performance Data |