PSTSWM AlphaSC-500 Point-to-Point Communication Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC SWAP Performance

(ordered swap of 128KB message using MPI within a node)

(performance measured per processor when all processors in the node are communicating)

Date/Person: January 26, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (500 MHz Alpha 21264 with 4MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Communication Library: MPI
SWAP size: 16384 REAL*8 floating point values each direction
Message size: Largest - 16384 REAL*8 floating point values
     Smallest - 16 REAL*8 floating point values
Processors: 0 and 1
     2 and 3
Latency Definition: (T1024-T512)/512
Model Error Range: [1,1024]
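
The latency definition can be checked against the published numbers. The sketch below assumes that Tn is the total time to swap the full 131072 bytes using n messages in each direction, so that T1024 and T512 correspond to the 128-byte and 256-byte rows of the Runtime Statistics tables, and that each swap counts as two messages. Neither assumption is stated on this page; both are inferred because they reproduce the simple-swap latencies listed under Results. The function name is illustrative.

```python
def estimated_latency_usec(t_1024_msgs, t_512_msgs):
    """Latency per message (microseconds) from two total swap times (seconds)."""
    per_swap = (t_1024_msgs - t_512_msgs) / 512  # (T1024-T512)/512, seconds per swap
    per_msg = per_swap / 2                       # assumption: one swap = two messages
    return per_msg * 1e6


# Minimum runtimes for 128-byte and 256-byte messages (no overlap),
# copied from the Runtime Statistics tables on this page.
print(round(estimated_latency_usec(3.0943e-02, 1.9109e-02), 2))  # 1 iter.  -> 11.56
print(round(estimated_latency_usec(3.0645e-02, 1.8924e-02), 2))  # 10 iter. -> 11.45
```

Both values agree with the estimated latency reported for the ordered simple swap.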
Results:

ordered simple swap
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               233.31                     11.56               69.1%
10 iter.              264.97                     11.45               69.8%
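
The model error column appears to measure how far a simple latency-plus-bandwidth model strays from the measured runtimes. The sketch below rests on assumptions that are not stated on this page: the model predicts the time to swap 131072 bytes in n messages as n * 2 * latency + 2 * 131072 / peak bandwidth, the error is taken relative to the measured time, and the minimum runtime reported for 512-byte messages is the simple-swap time. Under those assumptions the 512-byte row (n = 256) of the no-overlap Runtime Statistics tables reproduces the 69.1% and 69.8% above. Whether the maximum is actually attained at that size cannot be confirmed from the data shown. All function names are illustrative.

```python
TOTAL_BYTES = 131072  # 16384 REAL*8 values in each direction


def model_time(n_msgs, latency_usec, peak_mbytes_per_sec):
    """Assumed linear model: per-message startup cost plus transfer time (seconds)."""
    startup = n_msgs * 2 * latency_usec * 1e-6
    transfer = 2 * TOTAL_BYTES / (peak_mbytes_per_sec * 1e6)
    return startup + transfer


def relative_error(measured, predicted):
    """Model error relative to the measured runtime."""
    return abs(measured - predicted) / measured


# 256 messages of 512 bytes; measured runtimes copied from the
# no-overlap Runtime Statistics tables on this page.
err_1 = relative_error(2.2801e-02, model_time(256, 11.56, 233.31))
err_10 = relative_error(2.2694e-02, model_time(256, 11.45, 264.97))
print(f"{100 * err_1:.1f}%")   # 1 iter.  -> 69.1%
print(f"{100 * err_10:.1f}%")  # 10 iter. -> 69.8%
```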

ordered swap using nonblocking send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               222.53                     11.17               69.8%
10 iter.              261.19                     11.40               69.9%
1 iter. w/overlap     224.21                     11.39               69.5%
10 iter. w/overlap    253.87                     11.54               70.4%

ordered swap using nonblocking receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               227.32                     12.12               69.0%
10 iter.              259.06                     12.54               68.1%
1 iter. w/overlap     225.25                     12.42               68.1%
10 iter. w/overlap    258.25                     14.75               65.7%

ordered swap using nonblocking send and receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               223.75                     12.46               67.8%
10 iter.              265.13                     12.67               68.1%
1 iter. w/overlap     240.41                     13.09               67.2%
10 iter. w/overlap    259.53                     14.54               66.8%

ordered swap using ready send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               235.49                     27.61               48.0%
10 iter.              258.64                     27.08               48.5%
1 iter. w/overlap     238.14                     12.21               68.9%
10 iter. w/overlap    260.73                     14.41               66.6%

ordered swap using nonblocking ready send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               219.37                     25.31               51.7%
10 iter.              248.70                     25.64               51.3%
1 iter. w/overlap     230.52                     12.83               67.7%
10 iter. w/overlap    259.92                     14.84               66.4%

synchronous
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               235.11                     26.76               48.5%
10 iter.              253.97                     26.61               48.7%

ordered swap using nonblocking sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               235.78                     17.41               60.5%
10 iter.              261.11                     17.72               59.6%
1 iter. w/overlap     226.53                     17.52               59.4%
10 iter. w/overlap    259.32                     18.41               59.4%

ordered swap using nonblocking receive with sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               236.46                     17.84               59.6%
10 iter.              260.14                     17.85               59.6%
1 iter. w/overlap     238.92                     16.95               61.6%
10 iter. w/overlap    255.93                     19.90               59.1%

ordered swap using nonblocking sync. send and receive
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               235.91                     18.35               58.7%
10 iter.              254.85                     18.02               59.5%
1 iter. w/overlap     238.88                     17.50               61.1%
10 iter. w/overlap    257.72                     20.28               58.3%

ordered simple swap using sync. send
Data Statistics
                      unidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)           (usec/msg)          (max. rel. error)
1 iter.               238.53                     18.06               58.9%
10 iter.              261.30                     17.93               59.1%


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (1 iteration/no overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   3.0943e-02   3.0217e-05   8.47   0.44   0.42   1.01 
  256   1.9109e-02   3.7321e-05   13.72   0.36   0.35   0.78 
  512   2.2801e-02   8.9066e-05   11.50   0.12   0.11   0.29 
  1024   1.1686e-02   9.1300e-05   22.43   0.12   0.10   0.28 
  2048   6.0400e-03   9.4375e-05   43.40   0.11   0.07   0.27 
  4096   3.3448e-03   1.0453e-04   78.37   0.09   0.05   0.26 
  8192   1.9860e-03   1.2412e-04   132.00   0.09   0.07   0.21 
  16384   1.4848e-03   1.8560e-04   176.55   0.09   0.08   0.16 
  32768   1.3762e-03   3.4405e-04   190.48   0.04   0.03   0.07 
  65536   1.2892e-03   6.4460e-04   203.34   0.02   0.02   0.09 
  131072   1.0990e-03   1.0990e-03   238.53   0.03   0.01   0.09 
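
The columns of the Runtime Statistics table are related by simple arithmetic. The sketch below assumes that each row moves the full 131072 bytes in each direction, split into 131072 / (Msg Size) messages, that the bandwidth counts both directions of the swap (2 * 131072 bytes per minimum runtime), and that a MByte is 10^6 bytes. These conventions are inferred from the printed values rather than stated on the page, and the function names are illustrative.

```python
TOTAL_BYTES = 131072  # bytes swapped in each direction


def per_message_time(min_sec, msg_size):
    """Minimum runtime divided by the number of messages of the given size."""
    n_msgs = TOTAL_BYTES // msg_size
    return min_sec / n_msgs


def unidirectional_mbytes_per_sec(min_sec):
    """Bandwidth with both directions of the swap counted, in 10^6 bytes/sec."""
    return 2 * TOTAL_BYTES / min_sec / 1e6


# (Msg Size, min Sec) pairs copied from the table above.
for msg_size, min_sec in [(128, 3.0943e-02), (16384, 1.4848e-03), (131072, 1.0990e-03)]:
    print(msg_size,
          f"{per_message_time(min_sec, msg_size):.4e}",
          f"{unidirectional_mbytes_per_sec(min_sec):.2f}")
```

The computed bandwidths match the max MBytes/Sec column. The per-message times can differ from the min Sec/Msg column in the last printed digit, presumably because the table was generated from unrounded runtimes.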
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   2   3   7 
  256   0   1   2   3   10 
  512   0   1   3   2   10 
  1024   1   0   2   3   10 
  2048   1   0   3   2   10 
  4096   0   10   7   1   3 
  8192   0   1   10   7   2 
  16384   0   10   9   3   8 
  32768   0   3   7   1   10 
  65536   10   7   8   9   0 
  131072   10   8   9   7   4 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  128    2   2   4 
  256    1   2   4 
  512    2   4   8 
  1024    2   4   8 
  2048    1   3   9 
  4096    2   6   10 
  8192    1   4   11 
  16384    1   2   11 
  32768    1   9   11 
  65536    5   10   11 
  131072    2   8   11 
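
The three parts of each summary (spread statistics, five fastest protocols, and counts within X% of the minimum) can all be derived from one runtime per protocol at a given message size. The sketch below shows one way to compute them. The per-protocol runtimes are not listed on this page, so the input values are hypothetical and chosen only for illustration; the protocol numbering 0 to 10 and the function name are likewise assumptions. The fastest protocol is counted as within X% of itself, which is consistent with the smallest count in the tables being 1.

```python
from statistics import mean, median


def sensitivity_summary(runtimes):
    """Summarize runtimes (protocol id -> seconds) for one message size."""
    values = list(runtimes.values())
    fastest = min(values)
    spread = {
        "(mean-min)/min": (mean(values) - fastest) / fastest,
        "(median-min)/min": (median(values) - fastest) / fastest,
        "(max-min)/min": (max(values) - fastest) / fastest,
    }
    # Protocol ids ordered from fastest to slowest, truncated to five.
    five_fastest = sorted(runtimes, key=runtimes.get)[:5]
    # Number of protocols whose runtime is within pct percent of the minimum.
    within = {pct: sum(t <= fastest * (1 + pct / 100) for t in values)
              for pct in (1, 5, 25)}
    return spread, five_fastest, within


# Hypothetical runtimes in seconds for 11 protocols; NOT measured data.
example = {0: 1.00, 1: 1.005, 2: 1.03, 3: 1.04, 4: 1.20, 5: 1.22,
           6: 1.30, 7: 1.02, 8: 1.10, 9: 1.15, 10: 1.50}
spread, five_fastest, within = sensitivity_summary(example)
print(five_fastest)  # [0, 1, 7, 2, 3]
print(within)        # {1: 2, 5: 5, 25: 9}
```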


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (10 iterations/no overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   3.0645e-02   2.9927e-05   8.55   0.44   0.41   1.01 
  256   1.8924e-02   3.6961e-05   13.85   0.36   0.32   0.80 
  512   2.2694e-02   8.8647e-05   11.55   0.12   0.10   0.28 
  1024   1.1425e-02   8.9258e-05   22.94   0.12   0.09   0.29 
  2048   5.9738e-03   9.3341e-05   43.88   0.10   0.07   0.25 
  4096   3.2348e-03   1.0109e-04   81.04   0.09   0.04   0.22 
  8192   1.9179e-03   1.1987e-04   136.68   0.07   0.03   0.21 
  16384   1.4147e-03   1.7684e-04   185.30   0.04   0.02   0.12 
  32768   1.3026e-03   3.2565e-04   201.25   0.02   0.02   0.08 
  65536   1.1526e-03   5.7630e-04   227.44   0.03   0.02   0.09 
  131072   9.8874e-04   9.8874e-04   265.13   0.02   0.02   0.07 
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   2   3   7 
  256   0   1   2   3   7 
  512   0   1   2   3   10 
  1024   0   1   2   3   10 
  2048   1   0   2   3   10 
  4096   1   0   7   2   10 
  8192   0   7   1   10   2 
  16384   2   0   3   10   7 
  32768   2   0   10   3   8 
  65536   0   6   2   10   3 
  131072   3   0   10   1   7 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  128    2   2   4 
  256    2   2   4 
  512    2   4   8 
  1024    2   4   8 
  2048    2   4   9 
  4096    2   6   11 
  8192    2   8   11 
  16384    2   8   11 
  32768    3   9   11 
  65536    1   10   11 
  131072    2   10   11 


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (1 iteration with overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   3.0243e-02   2.9534e-05   8.67   0.28   0.12   0.98 
  256   1.8832e-02   3.6781e-05   13.92   0.23   0.10   0.77 
  512   2.2882e-02   8.9381e-05   11.46   0.07   0.04   0.25 
  1024   1.1677e-02   9.1228e-05   22.45   0.07   0.06   0.25 
  2048   6.0758e-03   9.4934e-05   43.15   0.07   0.06   0.26 
  4096   3.2872e-03   1.0273e-04   79.75   0.09   0.07   0.25 
  8192   2.0286e-03   1.2679e-04   129.22   0.07   0.03   0.21 
  16384   1.5064e-03   1.8830e-04   174.02   0.07   0.07   0.20 
  32768   1.3970e-03   3.4925e-04   187.65   0.15   0.10   0.50 
  65536   1.2820e-03   6.4100e-04   204.48   0.09   0.03   0.32 
  131072   1.0904e-03   1.0904e-03   240.41   0.05   0.04   0.17 
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   4   2   3 
  256   0   1   4   2   3 
  512   0   1   2   4   3 
  1024   1   0   4   2   3 
  2048   1   0   5   2   7 
  4096   1   7   0   10   8 
  8192   10   4   1   0   7 
  16384   0   8   4   2   10 
  32768   4   3   10   0   7 
  65536   8   10   7   3   4 
  131072   3   8   9   4   6 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  128    1   2   6 
  256    1   2   6 
  512    2   6   11 
  1024    2   5   10 
  2048    1   5   10 
  4096    1   3   11 
  8192    4   6   11 
  16384    1   3   11 
  32768    1   5   9 
  65536    2   6   9 
  131072    4   6   11 


Protocol Sensitivity Summary for Unidirectional Swap of 131072 Bytes (10 iterations with overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  128   3.0173e-02   2.9465e-05   8.69   0.37   0.25   0.99 
  256   1.8714e-02   3.6550e-05   14.01   0.29   0.21   0.77 
  512   2.3084e-02   9.0173e-05   11.36   0.11   0.10   0.26 
  1024   1.1718e-02   9.1549e-05   22.37   0.10   0.09   0.24 
  2048   6.0805e-03   9.5008e-05   43.11   0.08   0.06   0.22 
  4096   3.2743e-03   1.0232e-04   80.06   0.08   0.07   0.20 
  8192   1.9270e-03   1.2043e-04   136.04   0.07   0.07   0.17 
  16384   1.4085e-03   1.7607e-04   186.11   0.05   0.05   0.12 
  32768   1.3171e-03   3.2927e-04   199.04   0.03   0.01   0.16 
  65536   1.1510e-03   5.7549e-04   227.76   0.03   0.01   0.15 
  131072   9.9284e-04   9.9284e-04   264.03   0.02   0.02   0.04 
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  128   0   1   4   2   3 
  256   0   1   2   4   3 
  512   0   1   10   2   4 
  1024   0   1   10   4   2 
  2048   0   1   10   4   7 
  4096   0   10   1   7   4 
  8192   0   1   7   2   10 
  16384   0   4   2   8   10 
  32768   2   0   10   4   7 
  65536   4   2   6   10   3 
  131072   10   0   4   6   5 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  128    1   2   5 
  256    1   2   6 
  512    1   2   10 
  1024    2   2   11 
  2048    2   4   11 
  4096    1   4   11 
  8192    1   5   11 
  16384    1   4   11 
  32768    5   10   11 
  65536    3   10   11 
  131072    2   11   11 

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:02:32 EDT.