
Performance of COMMTEST on the NEC SX-6

NEC SX6 Performance Evaluation

COMMTEST is a suite of codes that measure the performance of MPI interprocessor communication. The codes differ from other similar test suites in their emphasis on evaluating the impact of communication protocol and packet size in the context of "common usage". However, the performance we report here should be similar to that measured using other codes. For more detail on COMMTEST, visit http://www.csm.ornl.gov/~worley/studies/commtest.html.

The first set of graphs describes the observed peak bidirectional bandwidth on the SX-6/8 for four different experiments:

  1. process 0 swaps data with process 1
  2. process i swaps data with process i+1 for i=0,2,...,P-2. For the SX-6/8, four pairs of processes (0-1, 2-3, 4-5, 6-7) are swapping data simultaneously.
  3. process i swaps data with process i+P/2 for i=0,...,P/2-1. For the SX-6/8, four pairs of processes (0-4, 1-5, 2-6, 3-7) are swapping data simultaneously.
  4. process i receives data from process i-1 and sends data to process i+1 in a P-processor ring (indices taken modulo P). For the SX-6/8, we denote this by 0-1-2-3-4-5-6-7-0.
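
The four communication patterns above can be written down as a short Python sketch. This is only an illustration of the process pairings; it is not taken from the COMMTEST sources, and the function names (`single_pair`, `adjacent_pairs`, `half_offset_pairs`, `ring_neighbors`) are hypothetical. It assumes P is even.

```python
# Illustrative sketch of the four COMMTEST communication patterns.
# Function names are hypothetical and not part of COMMTEST itself.


def single_pair():
    """Experiment 1: process 0 swaps data with process 1."""
    return [(0, 1)]


def adjacent_pairs(p):
    """Experiment 2: process i swaps with i+1 for i = 0, 2, ..., p-2."""
    return [(i, i + 1) for i in range(0, p, 2)]


def half_offset_pairs(p):
    """Experiment 3: process i swaps with i + p/2 for i = 0, ..., p/2-1."""
    half = p // 2
    return [(i, i + half) for i in range(half)]


def ring_neighbors(p):
    """Experiment 4: map each rank to (source, destination) in a p-process ring."""
    return {i: ((i - 1) % p, (i + 1) % p) for i in range(p)}


if __name__ == "__main__":
    P = 8  # one SX-6/8 node, as in the text
    print("experiment 1:", single_pair())
    print("experiment 2:", adjacent_pairs(P))
    print("experiment 3:", half_offset_pairs(P))
    print("experiment 4:", ring_neighbors(P))
```

For P=8 the second and third functions reproduce the pairings listed above, and the ring mapping wraps around so that process 7 sends to process 0.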

From the above figure, all four experiments appear to have similar performance, but it is difficult to see differences using a log scale with such a large range. The following three figures describe performance using log-linear plots for three different maximum message sizes.

From these data, the additional coordination required in the ring experiment decreases its performance compared to the other experiments, except for the largest message sizes. In contrast, the first three experiments have identical performance except for the largest message sizes. For messages larger than 32KB, the performance for the three experiments involving all processors falls behind that of the experiment using only two processes. A change of communication protocol can be observed between messages of size 64KB and 128KB for all experiments.

The next set of graphs compares MPI communication performance on the SX-6/8 with that of other SMP nodes used in high-performance scientific computing. As before, the first figure uses a log-log plot, while subsequent figures use log-linear plots and different maximum message sizes.

From these data, the IBM p690 and NEC SX-6/8 exhibit the same performance for messages of size 2KB or less. For larger messages, the SX-6/8 performs significantly better, reaching 7 times the performance of the p690 when swapping 2MB messages.

The final set of figures compares MPI communication performance between the SX-6/8 and the other platforms for the second experiment. For this experiment, the number of process pairs communicating simultaneously varies with the size of the SMP node. It is still a relevant comparison as such simultaneous communications typify global communication operators such as MPI_ALLTOALLV and MPI_ALLREDUCE.
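
To illustrate why simultaneous pairwise swaps resemble global operators, the following sketch generates the exchange schedule of recursive doubling, a textbook scheme sometimes used for allreduce-style operations. This is an assumption made for illustration: nothing here indicates which algorithms the MPI libraries on these systems actually use, and the function name `recursive_doubling_schedule` is hypothetical. The sketch assumes the process count is a power of two.

```python
# Illustrative sketch: pairwise exchange schedule for recursive doubling.
# Assumption: this is a textbook scheme, not necessarily what any MPI
# library measured here implements. The function name is hypothetical.


def recursive_doubling_schedule(p):
    """Return one list of (low, high) rank pairs per communication step."""
    if p < 1 or (p & (p - 1)) != 0:
        raise ValueError("p must be a power of two")
    schedule = []
    distance = 1
    while distance < p:
        # Each rank r exchanges with r XOR distance; list each pair once.
        pairs = [(r, r ^ distance) for r in range(p) if r < (r ^ distance)]
        schedule.append(pairs)
        distance *= 2
    return schedule


if __name__ == "__main__":
    for step, pairs in enumerate(recursive_doubling_schedule(8)):
        print("step", step, pairs)
```

With 8 processes, the first step pairs the processes exactly as in the second experiment (0-1, 2-3, 4-5, 6-7), and the last step pairs them as in the third experiment (0-4, 1-5, 2-6, 3-7). In every step all processes communicate at once.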

From these data, performance on the SX-6/8 is never worse than the performance on the other systems, and is as much as 20 times better than the others when swapping 2MB messages.


URL http://www.csm.ornl.gov/evaluation/SX6/COMMTEST.SX6.html
Updated: Friday, 12-Jul-2002 10:01:57 EDT