PSTSWM AlphaSC-500 Point-to-Point Communication Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC SWAP Performance

(unordered swap of a 2 MB message using MPI within a node)

(performance measured per processor when all processors in the node are communicating)

Date/Person: January 26, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (500 MHz Alpha 21264 with 4MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Communication Library: MPI
SWAP size: 262144 REAL*8 floating point values each direction
Message size: Largest - 262144 REAL*8 floating point values
     Smallest - 256 REAL*8 floating point values
Processors: 0 and 1
     2 and 3
Latency Definition: (T1024-T512)/512
Model Error Range: [1,1024]
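
The latency definition can be checked against the reported numbers. The sketch below assumes that T1024 and T512 denote the time to swap the full 2097152-byte payload as 1024 messages of 2048 bytes and as 512 messages of 4096 bytes. The page does not spell this out, and the function name is illustrative. The timings are the per-size minima from the Runtime Statistics table for 1 iteration with no overlap, so the result is a best-case figure across protocols, not the latency of any single protocol.

```python
def estimated_latency_usec(t_1024_sec, t_512_sec):
    """Latency estimate (T1024 - T512) / 512, in microseconds per message.

    Assumption: T_n is the time in seconds to move the same total payload
    as n equal-sized messages, so the two runs differ by 512 messages.
    """
    extra_messages = 1024 - 512
    return (t_1024_sec - t_512_sec) / extra_messages * 1e6


# Minimum times for 2048-byte and 4096-byte messages
# (1 iteration, no overlap), taken from the Runtime Statistics table.
t_1024 = 9.9639e-02
t_512 = 4.9137e-02
latency = estimated_latency_usec(t_1024, t_512)
print(f"{latency:.2f} usec/msg")
```

Under this reading the estimate comes out near 98.6 usec/msg, which is in the same range as the per-protocol latencies reported under Results.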
Results:

unordered simple swap
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               194.70                    104.27              94.4%
10 iter.              203.48                    104.99              63.5%

unordered swap using nonblocking send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               239.97                    105.08              63.8%
10 iter.              239.55                    102.26              61.8%
1 iter. w/overlap     258.27                    105.96              58.7%
10 iter. w/overlap    237.30                    103.56              62.0%

unordered swap using nonblocking receive
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               282.19                    98.38               50.1%
10 iter.              239.76                    100.68              64.3%
1 iter. w/overlap     285.39                    86.14               37.3%
10 iter. w/overlap    286.10                    82.46               31.6%

unordered swap using nonblocking send and receive
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               278.13                    98.95               47.6%
10 iter.              240.81                    97.30               60.8%
1 iter. w/overlap     284.46                    86.05               34.7%
10 iter. w/overlap    284.53                    84.60               31.5%

unordered swap using ready send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               282.33                    116.60              35.7%
10 iter.              284.92                    120.26              40.0%
1 iter. w/overlap     284.82                    86.89               37.6%
10 iter. w/overlap    286.22                    83.25               32.0%

unordered swap using nonblocking ready send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               282.47                    115.11              35.9%
10 iter.              283.94                    117.08              38.1%
1 iter. w/overlap     268.39                    86.73               40.0%
10 iter. w/overlap    286.23                    84.52               30.1%

native sendrecv
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               283.85                    98.42               49.3%
10 iter.              240.46                    100.94              63.9%

unordered swap using nonblocking sync. send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               281.71                    99.72               40.4%
10 iter.              238.18                    100.43              55.7%
1 iter. w/overlap     230.43                    105.90              57.0%
10 iter. w/overlap    236.09                    108.27              60.4%

unordered swap using nonblocking receive with sync. send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               284.00                    99.07               41.8%
10 iter.              237.60                    98.89               53.2%
1 iter. w/overlap     284.34                    99.47               42.2%
10 iter. w/overlap    284.30                    102.78              41.7%

unordered swap using nonblocking sync. send and receive
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               275.13                    97.62               43.7%
10 iter.              239.56                    97.61               52.9%
1 iter. w/overlap     270.17                    88.52               35.3%
10 iter. w/overlap    285.79                    89.83               33.5%


Protocol Sensitivity Summary for Bidirectional Swap of 2097152 Bytes (1 iteration/no overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  2048   9.9639e-02   9.7304e-05   42.09   0.07   0.03   0.27 
  4096   4.9137e-02   9.5971e-05   85.36   0.10   0.06   0.37 
  8192   2.6687e-02   1.0425e-04   157.17   0.08   0.06   0.23 
  16384   2.1095e-02   1.6480e-04   198.83   0.04   0.03   0.18 
  32768   1.9455e-02   3.0398e-04   215.59   0.07   0.06   0.22 
  65536   1.6660e-02   5.2061e-04   251.76   0.10   0.08   0.29 
  131072   1.5529e-02   9.7059e-04   270.09   0.14   0.14   0.39 
  262144   1.5418e-02   1.9273e-03   272.04   1.67   0.13   15.67 
  524288   1.5129e-02   3.7823e-03   277.23   2.68   0.26   24.76 
  1048576   1.4849e-02   7.4244e-03   282.47   0.69   0.16   5.39 
  2097152   1.4769e-02   1.4769e-02   284.00   0.39   0.01   3.35 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  2048   2   3   6   9   8 
  4096   3   2   6   9   1 
  8192   2   6   1   3   9 
  16384   9   2   3   6   8 
  32768   4   5   9   1   8 
  65536   4   5   2   3   8 
  131072   5   4   9   1   2 
  262144   4   5   8   6   2 
  524288   4   5   8   9   6 
  1048576   5   4   3   2   9 
  2097152   8   6   2   4   7 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  2048    3   6   8 
  4096    3   5   8 
  8192    2   5   10 
  16384    4   8   10 
  32768    2   4   10 
  65536    1   3   9 
  131072    1   2   9 
  262144    1   2   9 
  524288    1   2   4 
  1048576    2   2   8 
  2097152    5   8   8 
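
The derived columns of the Runtime Statistics table above appear to follow from the message size and the min Sec value. The sketch below assumes that every row moves the full 2097152 bytes in each direction, that the bandwidth counts both directions, and that one MByte is 10^6 bytes. These conventions are inferred from the numbers, not stated on the page, and the function name is illustrative.

```python
TOTAL_BYTES = 2097152  # 262144 REAL*8 values in each direction


def derived_columns(msg_size_bytes, min_sec):
    """Return (messages, min sec/msg, bidirectional MBytes/sec)."""
    # Number of messages needed to move the full payload one way.
    messages = TOTAL_BYTES // msg_size_bytes
    sec_per_msg = min_sec / messages
    # Assumption: both directions count, and 1 MByte = 10**6 bytes.
    mbytes_per_sec = 2 * TOTAL_BYTES / min_sec / 1e6
    return messages, sec_per_msg, mbytes_per_sec


# (message size in bytes, min Sec) for three rows of the table above.
rows = [(2048, 9.9639e-02), (65536, 1.6660e-02), (2097152, 1.4769e-02)]
for size, min_sec in rows:
    n, per_msg, bw = derived_columns(size, min_sec)
    print(f"{size:8d}  {n:5d}  {per_msg:.4e}  {bw:7.2f}")
```

The computed values agree with the min Sec/Msg and max MBytes/Sec columns to within the rounding of the printed min Sec values.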


Protocol Sensitivity Summary for Bidirectional Swap of 2097152 Bytes (10 iterations/no overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  2048   9.9656e-02   9.7320e-05   42.09   0.07   0.03   0.27 
  4096   4.9686e-02   9.7043e-05   84.42   0.08   0.03   0.33 
  8192   2.6326e-02   1.0283e-04   159.32   0.08   0.06   0.24 
  16384   1.9682e-02   1.5376e-04   213.11   0.05   0.04   0.18 
  32768   1.8725e-02   2.9257e-04   224.00   0.06   0.05   0.23 
  65536   1.5823e-02   4.9446e-04   265.08   0.11   0.12   0.30 
  131072   1.5055e-02   9.4096e-04   278.59   0.16   0.17   0.38 
  262144   1.4983e-02   1.8729e-03   279.93   0.17   0.19   0.38 
  524288   1.4721e-02   3.6803e-03   284.92   0.25   0.27   0.56 
  1048576   1.4764e-02   7.3822e-03   284.08   0.31   0.32   0.80 
  2097152   1.4772e-02   1.4772e-02   283.94   0.45   0.45   1.29 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  2048   3   9   6   2   8 
  4096   6   3   2   0   9 
  8192   3   2   6   1   9 
  16384   3   6   2   9   8 
  32768   5   4   2   6   8 
  65536   5   4   3   6   2 
  131072   4   5   3   9   1 
  262144   4   5   2   7   8 
  524288   4   5   6   2   8 
  1048576   4   5   3   2   9 
  2097152   5   4   2   8   9 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  2048    1   7   8 
  4096    3   7   8 
  8192    3   4   10 
  16384    4   6   10 
  32768    2   5   10 
  65536    2   2   9 
  131072    1   2   9 
  262144    2   2   9 
  524288    1   2   3 
  1048576    2   2   2 
  2097152    2   2   2 
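
The spread columns, (mean-min)/min, (median-min)/min and (max-min)/min, and the "Within X% of Min" counts can be computed from the per-protocol runtimes at one message size. The sketch below is one plausible reading. The ten runtimes are made-up placeholders, not measurements from this page (which reports only the minimum), the function names are hypothetical, and treating "within X%" as an inclusive bound on (t-min)/min is an assumption.

```python
from statistics import mean, median


def spread_statistics(runtimes):
    """Relative spread of runtimes with respect to the fastest one."""
    t_min = min(runtimes)
    return {
        "mean": (mean(runtimes) - t_min) / t_min,
        "median": (median(runtimes) - t_min) / t_min,
        "max": (max(runtimes) - t_min) / t_min,
    }


def count_within(runtimes, percent):
    """Count runtimes no more than `percent` % slower than the minimum."""
    t_min = min(runtimes)
    return sum(1 for t in runtimes if (t - t_min) / t_min <= percent / 100.0)


# Hypothetical runtimes in seconds for ten protocols (illustrative only).
runtimes = [0.0150, 0.0151, 0.0153, 0.0156, 0.0160,
            0.0165, 0.0172, 0.0180, 0.0190, 0.0210]

stats = spread_statistics(runtimes)
counts = [count_within(runtimes, p) for p in (1, 5, 25)]
print(stats, counts)
```

The fastest protocol always counts toward every threshold, which fits the tables, where no count falls below 1.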


Protocol Sensitivity Summary for Bidirectional Swap of 2097152 Bytes (1 iteration with overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  2048   9.2524e-02   9.0355e-05   45.33   0.07   0.08   0.20 
  4096   4.8275e-02   9.4286e-05   86.88   0.05   0.06   0.17 
  8192   2.6769e-02   1.0457e-04   156.68   0.04   0.02   0.17 
  16384   2.0322e-02   1.5877e-04   206.39   0.05   0.03   0.24 
  32768   1.8103e-02   2.8285e-04   231.70   0.09   0.02   0.32 
  65536   1.6160e-02   5.0499e-04   259.55   0.10   0.11   0.32 
  131072   1.5260e-02   9.5378e-04   274.85   0.11   0.02   0.45 
  262144   1.5053e-02   1.8816e-03   278.64   0.77   0.09   6.81 
  524288   1.4867e-02   3.7167e-03   282.13   2.06   0.20   19.10 
  1048576   1.4697e-02   7.3484e-03   285.39   0.35   0.09   1.70 
  2097152   1.4726e-02   1.4726e-02   284.82   0.25   0.10   1.24 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  2048   2   5   3   4   9 
  4096   5   2   4   3   6 
  8192   2   4   5   6   3 
  16384   3   2   5   8   4 
  32768   2   8   9   3   5 
  65536   4   3   8   2   6 
  131072   3   4   8   2   9 
  262144   8   4   3   2   9 
  524288   3   4   8   2   5 
  1048576   2   3   8   4   9 
  2097152   4   3   8   2   6 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  2048    4   5   10 
  4096    4   5   10 
  8192    4   6   10 
  16384    3   6   10 
  32768    1   6   9 
  65536    3   4   8 
  131072    4   6   9 
  262144    2   4   8 
  524288    3   4   6 
  1048576    3   4   6 
  2097152    4   5   8 


Protocol Sensitivity Summary for Bidirectional Swap of 2097152 Bytes (10 iterations with overlap)
Runtime Statistics
  Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  2048   9.2429e-02   9.0263e-05   45.38   0.09   0.10   0.21 
  4096   4.9287e-02   9.6264e-05   85.10   0.05   0.04   0.14 
  8192   2.6378e-02   1.0304e-04   159.01   0.06   0.06   0.10 
  16384   1.9332e-02   1.5103e-04   216.96   0.05   0.02   0.21 
  32768   1.7485e-02   2.7321e-04   239.88   0.09   0.04   0.34 
  65536   1.5196e-02   4.7488e-04   276.01   0.09   0.02   0.36 
  131072   1.4660e-02   9.1626e-04   286.10   0.11   0.02   0.39 
  262144   1.4841e-02   1.8551e-03   282.62   0.11   0.03   0.39 
  524288   1.4654e-02   3.6635e-03   286.22   0.14   0.01   0.57 
  1048576   1.4654e-02   7.3269e-03   286.23   0.21   0.05   0.89 
  2097152   1.4690e-02   1.4690e-02   285.53   0.28   0.01   1.29 
Five Fastest Protocols
  Msg Size   1st   2nd   3rd   4th   5th
  2048   2   4   3   5   9 
  4096   6   2   4   0   3 
  8192   6   2   4   1   3 
  16384   2   4   9   5   8 
  32768   9   3   5   2   4 
  65536   3   5   2   4   8 
  131072   2   4   8   3   5 
  262144   2   4   8   3   5 
  524288   4   2   5   3   8 
  1048576   5   9   3   4   8 
  2097152   2   5   3   8   4 
Number of Protocols With Runtimes Within X% of Min
  Msg Size   1%   5%   25%
  2048    2   4   10 
  4096    1   6   10 
  8192    1   5   10 
  16384    2   7   10 
  32768    3   6   9 
  65536    4   6   9 
  131072    3   6   9 
  262144    3   6   9 
  524288    5   6   6 
  1048576    2   6   6 
  2097152    5   6   6 

Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:01:55 EDT.