PSTSWM T3E-900 Point-to-Point Communication Performance

Performance Studies using PSTSWM


SGI/Cray Research T3E-900 SWAP Performance

(unordered swap of 2 MB message using SHMEM)

(performance measured per processor when all processors in the node are communicating)

Date/Person: September 30, 1999 / P. Worley
Platform: T3E-900 at National Energy Research Scientific Computing Center (mcurie.nersc.gov):
   532 450-MHz DEC Alpha EV5 RISC processors
Environment: UNICOS/mk 2.0.3.41
   mpt 1.2.1.3
   Cray CF90 Version 3.1.0.3
Communication Library: SHMEM
SWAP size: 262144 REAL*8 floating point values each direction
Message size:
   Largest - 262144 REAL*8 floating point values
   Smallest - 256 REAL*8 floating point values
Processors:
   0 and 2
   1 and 3
Latency Definition: (T1024 - T512)/512
Results:

unordered swap using nonblocking send

Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               554.40                    8.49                 8.7%
10 iter.              566.72                    8.64                10.9%
1 iter. w/overlap     547.77                    7.32                 5.9%
10 iter. w/overlap    565.15                    6.67                 6.9%

unordered swap using nonblocking receive

Data                  Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               422.31                    10.08               12.1%
10 iter.              427.50                    10.10               12.8%
1 iter. w/overlap     419.25                    8.33                 7.7%
10 iter. w/overlap    426.75                    7.78                 7.8%
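
The two sets of results can be compared directly. The following is a minimal sketch in Python, not part of PSTSWM, that transcribes the reported values, converts the stated message sizes from REAL*8 counts to bytes (assuming 8 bytes per REAL*8 value), and computes, for each case, the ratio of peak bandwidths and the difference in estimated latency between the nonblocking send and nonblocking receive variants. All names (NONBLOCKING_SEND, NONBLOCKING_RECV, message_bytes, compare) are hypothetical and chosen for illustration only.

```python
# Values transcribed from the reported results.
# Each entry: (peak bidirectional bandwidth in MByte/sec,
#              estimated latency in usec/msg)
NONBLOCKING_SEND = {
    "1 iter.": (554.40, 8.49),
    "10 iter.": (566.72, 8.64),
    "1 iter. w/overlap": (547.77, 7.32),
    "10 iter. w/overlap": (565.15, 6.67),
}
NONBLOCKING_RECV = {
    "1 iter.": (422.31, 10.08),
    "10 iter.": (427.50, 10.10),
    "1 iter. w/overlap": (419.25, 8.33),
    "10 iter. w/overlap": (426.75, 7.78),
}

BYTES_PER_REAL8 = 8  # assumption: REAL*8 is an 8-byte floating point value


def message_bytes(n_values):
    """Size in bytes of a message holding n_values REAL*8 values."""
    return n_values * BYTES_PER_REAL8


def compare(send, recv):
    """For each case, return (send/recv bandwidth ratio,
    recv latency minus send latency in usec/msg)."""
    result = {}
    for case, (send_bw, send_lat) in send.items():
        recv_bw, recv_lat = recv[case]
        result[case] = (send_bw / recv_bw, recv_lat - send_lat)
    return result


if __name__ == "__main__":
    print("largest message :", message_bytes(262144), "bytes")
    print("smallest message:", message_bytes(256), "bytes")
    for case, (bw_ratio, lat_diff) in compare(
        NONBLOCKING_SEND, NONBLOCKING_RECV
    ).items():
        print(f"{case:20s} bandwidth ratio {bw_ratio:.3f}, "
              f"latency difference {lat_diff:.2f} usec/msg")
```

On these numbers the nonblocking send variant reaches roughly 1.31 to 1.33 times the peak bandwidth of the nonblocking receive variant in every case, and its estimated latency is lower by between 1 and 1.6 usec/msg.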

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:24:18 EDT.