PSTSWM AlphaSC-500 Point-to-Point Communication Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC SWAP Performance

(unordered swap of 8KB message using MPI within a node)

(performance measured per processor when all processors in node communicating)

Date/Person: January 26, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (500 MHz Alpha 21264 with 4MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Communication Library: MPI
SWAP size: 1024 REAL*8 floating point values each direction
Message size: Largest - 1024 REAL*8 floating point values
     Smallest - 1 REAL*8 floating point value
Processors: 0 and 1
     2 and 3
Latency Definition: (T1024-T512)/512
Model Error Range: [1,1024]
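The latency definition can be read as follows, assuming (the page does not state this explicitly) that T1024 and T512 are the total times to swap the same 8192 bytes as 1024 messages of one REAL*8 value and as 512 messages of two REAL*8 values. Both runs move the same volume, so the time difference is attributed to the 512 extra messages. A minimal Python sketch under that assumption; the function name is illustrative, and the inputs are the "min Sec" entries for message sizes 8 and 16 from the 1-iteration/no-overlap summary table:

```python
def estimated_latency_usec(t1024_sec: float, t512_sec: float) -> float:
    """Return (T1024 - T512) / 512, converted to microseconds per message.

    Assumption: both arguments are total times for the same 8192-byte swap,
    split into 1024 and 512 messages respectively.
    """
    return (t1024_sec - t512_sec) / 512 * 1e6


# "min Sec" at message size 8 (1024 messages) and 16 (512 messages),
# taken from the 1-iteration/no-overlap summary table.
t1024 = 2.9397e-02
t512 = 1.3558e-02

latency = estimated_latency_usec(t1024, t512)
print(f"{latency:.2f} usec/msg")
```

With these inputs the sketch gives about 30.94 usec/msg, the same figure reported for the 1-iteration swap using nonblocking send and receive. That is consistent with the assumed reading but does not confirm it.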
Results:

unordered simple swap
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               81.03                     32.28               56.5%
10 iter.              114.46                    33.99               58.2%

unordered swap using nonblocking send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               123.19                    47.06               45.3%
10 iter.              143.07                    46.34               47.8%
1 iter. w/overlap     106.25                    44.28               47.3%
10 iter. w/overlap    151.70                    44.57               50.4%

unordered swap using nonblocking receive
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               108.79                    29.19               60.0%
10 iter.              148.57                    35.22               58.0%
1 iter. w/overlap     118.38                    20.60               65.5%
10 iter. w/overlap    152.89                    24.37               65.7%

unordered swap using nonblocking send and receive
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               105.03                    30.94               59.0%
10 iter.              157.09                    40.00               52.6%
1 iter. w/overlap     121.18                    20.78               66.2%
10 iter. w/overlap    154.77                    24.44               65.4%

unordered swap using ready send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               92.99                     56.15               45.2%
10 iter.              122.93                    56.41               47.2%
1 iter. w/overlap     112.84                    20.27               66.4%
10 iter. w/overlap    163.51                    24.03               66.5%

unordered swap using nonblocking ready send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               91.12                     55.86               46.0%
10 iter.              124.37                    55.40               48.2%
1 iter. w/overlap     102.14                    20.93               63.5%
10 iter. w/overlap    153.21                    23.98               66.0%

native sendrecv
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               105.16                    34.69               59.3%
10 iter.              154.27                    36.49               57.1%

unordered swap using nonblocking sync. send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               114.25                    33.71               58.0%
10 iter.              142.82                    33.77               59.2%
1 iter. w/overlap     97.76                     50.56               43.3%
10 iter. w/overlap    140.39                    37.91               57.7%

unordered swap using nonblocking receive with sync. send
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               99.30                     35.02               55.2%
10 iter.              138.78                    34.82               57.9%
1 iter. w/overlap     104.09                    33.48               57.1%
10 iter. w/overlap    148.22                    34.97               60.1%

unordered swap using nonblocking sync. send and receive
Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               100.89                    34.30               55.5%
10 iter.              147.23                    35.57               57.7%
1 iter. w/overlap     113.78                    31.74               56.2%
10 iter. w/overlap    149.90                    31.91               60.8%


Protocol Sensitivity Summary for Bidirectional Swap of 8192 Bytes (1 iteration/no overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  8   2.9397e-02   2.8708e-05   0.56   0.35   0.23   0.97 
  16   1.3558e-02   2.6481e-05   1.21   0.46   0.35   1.14 
  32   5.6744e-03   2.2166e-05   2.89   0.68   0.57   1.55 
  64   3.2542e-03   2.5423e-05   5.03   0.64   0.63   1.41 
  128   2.0896e-03   3.2650e-05   7.84   0.45   0.36   1.07 
  256   1.1912e-03   3.7225e-05   13.75   0.39   0.39   0.89 
  512   1.5426e-03   9.6413e-05   10.62   0.09   0.05   0.29 
  1024   7.8060e-04   9.7575e-05   20.99   0.09   0.05   0.29 
  2048   3.8800e-04   9.7000e-05   42.23   0.14   0.10   0.35 
  4096   2.0600e-04   1.0300e-04   79.53   0.16   0.12   0.39 
  8192   1.3300e-04   1.3300e-04   123.19   0.22   0.22   0.52 
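The columns of the runtime statistics table appear to be related by simple arithmetic, although the page does not spell the relations out. The sketch below is inferred from the numbers themselves: it assumes "min Sec" is the time to move 8192 bytes in each direction, split into 8192 / (message size) messages, and that one MByte is 10^6 bytes. The function and constant names are illustrative.

```python
TOTAL_BYTES = 8192  # bytes sent in each direction (1024 REAL*8 values)


def derived_columns(msg_size_bytes: int, min_sec: float):
    """Recompute the per-message time and bidirectional bandwidth.

    Assumptions (inferred from the table, not stated on the page):
    - the swap is split into TOTAL_BYTES // msg_size_bytes messages,
    - bandwidth counts both directions, with 1 MByte = 1e6 bytes.
    """
    n_msgs = TOTAL_BYTES // msg_size_bytes
    sec_per_msg = min_sec / n_msgs
    mbytes_per_sec = 2 * TOTAL_BYTES / min_sec / 1e6
    return n_msgs, sec_per_msg, mbytes_per_sec


# (message size, min Sec) pairs copied from the table above.
for size, min_sec in [(8, 2.9397e-02), (512, 1.5426e-03), (8192, 1.3300e-04)]:
    n_msgs, sec_per_msg, bandwidth = derived_columns(size, min_sec)
    print(f"{size:5d}  {n_msgs:5d} msgs  {sec_per_msg:.4e} s/msg  {bandwidth:7.2f} MB/s")
```

Under these assumptions the recomputed values agree with the "min Sec/Msg" and "max MBytes/Sec" columns to the printed precision for the three rows shown.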
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  8   3   2   6   0   7 
  16   3   6   2   0   7 
  32   6   3   2   0   7 
  64   3   6   2   7   8 
  128   2   6   0   3   8 
  256   3   6   2   0   8 
  512   2   3   6   9   1 
  1024   6   3   1   2   8 
  2048   6   2   9   8   1 
  4096   3   2   1   6   7 
  8192   1   7   2   6   3 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  8    1   1   7 
  16    1   2   3 
  32    1   1   2 
  64    1   2   3 
  128    1   1   2 
  256    2   2   3 
  512    1   5   8 
  1024    1   6   8 
  2048    1   2   8 
  4096    1   2   7 
  8192    1   1   7 
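The three summary tables for each configuration can be derived from a single set of per-protocol runtimes at a given message size. The Python sketch below shows one plausible way to do so. The runtimes are hypothetical values made up for illustration, not measurements from this page; the protocol numbers 0 to 9 are placeholders; and "within X% of min" is assumed to mean runtime <= min * (1 + X/100).

```python
from statistics import mean, median


def spread_stats(runtimes):
    """Relative spread of runtimes about the minimum."""
    values = list(runtimes.values())
    lo = min(values)
    return {
        "mean": (mean(values) - lo) / lo,
        "median": (median(values) - lo) / lo,
        "max": (max(values) - lo) / lo,
    }


def five_fastest(runtimes):
    """Protocol ids ordered by increasing runtime, first five only."""
    return sorted(runtimes, key=runtimes.get)[:5]


def count_within(runtimes, pct):
    """Number of protocols whose runtime is within pct percent of the minimum."""
    lo = min(runtimes.values())
    return sum(1 for t in runtimes.values() if t <= lo * (1 + pct / 100))


# HYPOTHETICAL runtimes in microseconds, keyed by placeholder protocol id.
runtimes_usec = {0: 130.0, 1: 100.0, 2: 104.0, 3: 120.0, 4: 150.0,
                 5: 100.5, 6: 110.0, 7: 126.0, 8: 140.0, 9: 160.0}

print(spread_stats(runtimes_usec))
print(five_fastest(runtimes_usec))
print([count_within(runtimes_usec, pct) for pct in (1, 5, 25)])
```

Read this way, a row with small spread values and a large 25% count indicates that the choice of protocol matters little at that message size, while a large (max-min)/min with a small 25% count indicates strong protocol sensitivity.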


Protocol Sensitivity Summary for Bidirectional Swap of 8192 Bytes (10 iterations/no overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  8   3.4459e-02   3.3651e-05   0.48   0.19   0.06   0.66 
  16   1.6425e-02   3.2081e-05   1.00   0.23   0.09   0.73 
  32   7.2166e-03   2.8190e-05   2.27   0.38   0.25   0.98 
  64   4.2025e-03   3.2832e-05   3.90   0.36   0.25   0.86 
  128   2.4817e-03   3.8777e-05   6.60   0.26   0.16   0.67 
  256   1.2762e-03   3.9880e-05   12.84   0.35   0.32   0.76 
  512   1.5711e-03   9.8195e-05   10.43   0.07   0.02   0.25 
  1024   7.8502e-04   9.8127e-05   20.87   0.07   0.03   0.28 
  2048   3.8170e-04   9.5425e-05   42.92   0.10   0.06   0.30 
  4096   1.8768e-04   9.3840e-05   87.30   0.13   0.08   0.41 
  8192   1.0430e-04   1.0430e-04   157.09   0.14   0.10   0.37 
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  8   2   0   7   6   8 
  16   2   3   6   0   7 
  32   2   3   0   7   8 
  64   3   2   7   0   8 
  128   6   2   0   7   8 
  256   3   2   6   0   8 
  512   3   2   7   8   6 
  1024   2   9   6   8   7 
  2048   6   2   3   9   8 
  4096   3   1   6   8   2 
  8192   3   6   2   9   1 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  8    3   5   7 
  16    1   4   7 
  32    1   1   5 
  64    1   1   4 
  128    1   2   7 
  256    1   1   3 
  512    1   8   9 
  1024    1   7   8 
  2048    2   4   8 
  4096    1   2   8 
  8192    1   2   7 


Protocol Sensitivity Summary for Bidirectional Swap of 8192 Bytes (1 iteration with overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  8   2.1110e-02   2.0615e-05   0.78   0.50   0.56   1.43 
  16   1.0583e-02   2.0670e-05   1.55   0.47   0.54   1.40 
  32   5.1064e-03   1.9947e-05   3.21   0.50   0.59   1.61 
  64   3.1606e-03   2.4692e-05   5.18   0.47   0.47   1.60 
  128   1.8228e-03   2.8481e-05   8.99   0.37   0.39   1.36 
  256   1.1432e-03   3.5725e-05   14.33   0.27   0.27   1.10 
  512   1.3558e-03   8.4738e-05   12.08   0.12   0.13   0.27 
  1024   6.7940e-04   8.4925e-05   24.12   0.13   0.16   0.26 
  2048   3.7400e-04   9.3500e-05   43.81   0.08   0.07   0.19 
  4096   1.9300e-04   9.6500e-05   84.89   0.12   0.14   0.34 
  8192   1.3520e-04   1.3520e-04   121.18   0.15   0.16   0.40 
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  8   4   3   2   5   6 
  16   3   5   2   4   6 
  32   4   5   2   3   6 
  64   4   3   2   5   6 
  128   2   4   3   5   6 
  256   3   2   4   6   5 
  512   5   2   3   4   9 
  1024   2   3   4   5   9 
  2048   4   5   3   2   9 
  4096   7   2   3   5   4 
  8192   3   2   9   4   1 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  8    3   4   4 
  16    3   4   4 
  32    1   5   5 
  64    1   4   5 
  128    3   4   5 
  256    2   4   5 
  512    2   4   9 
  1024    1   2   9 
  2048    2   4   10 
  4096    2   3   9 
  8192    1   2   9 


Protocol Sensitivity Summary for Bidirectional Swap of 8192 Bytes (10 iterations with overlap)
Runtime Statistics
Msg Size   min Sec   min Sec/Msg   max MBytes/Sec   (mean-min)/min   (median-min)/min   (max-min)/min
  8   2.4588e-02   2.4011e-05   0.67   0.32   0.42   0.85 
  16   1.2286e-02   2.3995e-05   1.33   0.30   0.37   0.85 
  32   6.0319e-03   2.3562e-05   2.72   0.32   0.39   0.88 
  64   3.4831e-03   2.7212e-05   4.70   0.35   0.28   1.01 
  128   1.9686e-03   3.0760e-05   8.32   0.33   0.32   1.09 
  256   1.2321e-03   3.8502e-05   13.30   0.25   0.21   0.91 
  512   1.4368e-03   8.9798e-05   11.40   0.09   0.10   0.19 
  1024   7.1358e-04   8.9198e-05   22.96   0.10   0.10   0.21 
  2048   3.6968e-04   9.2420e-05   44.32   0.08   0.08   0.22 
  4096   1.9172e-04   9.5860e-05   85.46   0.08   0.07   0.20 
  8192   1.0020e-04   1.0020e-04   163.51   0.11   0.09   0.36 
Five Fastest Protocols
Msg Size   1st   2nd   3rd   4th   5th
  8   4   2   5   3   9 
  16   4   2   3   5   9 
  32   2   5   3   4   9 
  64   4   2   3   5   6 
  128   5   4   2   3   9 
  256   3   4   2   5   6 
  512   3   5   4   2   6 
  1024   2   4   5   3   6 
  2048   4   2   3   5   9 
  4096   6   2   5   3   4 
  8192   4   3   5   2   1 
Number of Protocols With Runtimes Within X% of Min
Msg Size   1%   5%   25%
  8    2   4   4 
  16    2   4   4 
  32    2   4   4 
  64    1   3   4 
  128    3   4   5 
  256    1   4   6 
  512    4   4   10 
  1024    1   4   10 
  2048    1   4   10 
  4096    1   3   10 
  8192    1   1   9 

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:02:17 EDT.