COMMTEST IBM p690 Point-to-Point Communication Performance
Performance Studies using COMMTEST
IBM p690 SWAP Performance
(unordered swap using MPI within a node)
(performance measured per processor when all processors in the node participate in a ring shift)
The following figures are concatenations of the estimated bandwidth
graphs for the 8KB, 128KB, and 2MB experiments. While these experiments
were not designed for determining achievable bandwidth for
message sizes other than 8 KBytes, 128 KBytes, or 8 MBytes, they are no more
ill-designed for this purpose than most pingpong and pingping tests.
These figures present one measure of the accuracy of these estimates.
If the bandwidths for the same message sizes agree between the different
experiments, our faith in the validity of the estimates increases. Where
they do not agree, the interpretation of which figure is most accurate depends
on whether a single or small number of repetitions is likely to be more
representative of what would be seen in practice than a larger number
of repetitions. Note that a single
repetition is more difficult to time accurately, at least for the small
message sizes, and may suffer more variability in timings. All of these
experiments were repeated three times, however, and only the minimum times
are presented.
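
As a minimal sketch of how such an estimate and the agreement check might be computed, the following Python takes the minimum elapsed time over three trials, divides by the number of repetitions, and compares the bandwidths that two experiments estimate for the same message size. This is not the COMMTEST code. The function names, the timing values, the convention of 1 MByte = 10^6 bytes, and the choice to count only the bytes sent in one direction per swap are all assumptions made for illustration.

```python
def estimated_bandwidth(message_bytes, repetitions, trial_times):
    """Estimate bandwidth in MBytes/s from the fastest of several trials.

    Each entry of trial_times is the elapsed time in seconds for one
    trial, where a trial performs `repetitions` swaps of `message_bytes`.
    """
    best = min(trial_times)          # only the minimum time is kept
    per_swap = best / repetitions    # seconds for a single swap
    # Assumption: 1 MByte = 10**6 bytes, one direction of the swap counted.
    return message_bytes / per_swap / 1e6


def relative_disagreement(bw_a, bw_b):
    """Relative gap between two estimates for the same message size."""
    return abs(bw_a - bw_b) / max(bw_a, bw_b)


# Hypothetical timings for a 128 KByte message (not measured data):
# one experiment times a single repetition, the other times 100.
few = estimated_bandwidth(131072, 1, [0.00110, 0.00100, 0.00125])
many = estimated_bandwidth(131072, 100, [0.0960, 0.0950, 0.0990])

gap = relative_disagreement(few, many)
print(f"{few:.1f} MBytes/s vs {many:.1f} MBytes/s, gap {gap:.1%}")
```

A small gap between the two estimates supports both of them. A large gap leaves open the question raised above: whether the single-repetition or the many-repetition timing better represents what would be seen in practice.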
unordered swap using nonblocking send
unordered swap using nonblocking receive
unordered swap using nonblocking send and receive
unordered swap using ready send
unordered swap using nonblocking ready send
native sendrecv
unordered swap using nonblocking sync. send
unordered swap using nonblocking receive with sync. send
unordered swap using nonblocking sync. send and receive