PSTSWM T3E-900 Point-to-Point Communication Performance

Performance Studies using PSTSWM

SGI/Cray Research T3E-900 SWAP Performance

(unordered swap of 8KB message using SHMEM)

(performance measured per processor when all processors in the node are communicating)

Date/Person: September 30, 1999 / P. Worley
Platform: T3E-900 at National Energy Research Scientific Computing Center (mcurie.nersc.gov):
   532 450-MHz DEC Alpha EV5 RISC processors
Environment: UNICOS/mk 2.0.3.41
   mpt 1.2.1.3
   Cray CF90 Version 3.1.0.3
Communication Library: SHMEM
SWAP size: 1024 REAL*8 floating point values each direction
Message size: Largest - 1024 REAL*8 floating point values
   Smallest - 1 REAL*8 floating point value
Processors: 0 and 2
   1 and 3
Latency Definition: (T1024-T512)/512
Results:

unordered swap using nonblocking send
Data Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               261.06                    8.26                31.4%
10 iter.              405.16                    6.85                16.9%
1 iter. w/overlap     243.04                    6.89                10.2%
10 iter. w/overlap    393.93                    6.34                15.3%

unordered swap using nonblocking receive
Data Statistics
                      bidirectional bandwidth   estimated latency   model error
                      (peak MByte/sec)          (usec/msg)          (max. rel. error)
1 iter.               250.57                    7.53                11.5%
10 iter.              307.91                    5.96                29.0%
1 iter. w/overlap     235.81                    6.90                9.9%
10 iter. w/overlap    366.82                    6.00                13.4%
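
The quantities in the tables above can be related to raw swap timings with a little arithmetic. The following is a minimal sketch, not the PSTSWM post-processing itself. It assumes that T1024 and T512 in the latency definition denote measured swap times for 1024-value and 512-value messages, that a bidirectional bandwidth counts the bytes moved in both directions of one swap, and that 1 MByte is 10**6 bytes. The function names and the timings are hypothetical and are not measured data.

```python
BYTES_PER_REAL8 = 8  # a REAL*8 value occupies 8 bytes


def message_bytes(n_values):
    """Size in bytes of a message holding n_values REAL*8 values."""
    return n_values * BYTES_PER_REAL8


def bidirectional_bandwidth(n_values, swap_usec):
    """MByte/sec for one swap, counting the bytes moved in both directions.

    Assumption: 1 MByte = 10**6 bytes, so bytes per microsecond
    is numerically equal to MByte/sec.
    """
    return 2 * message_bytes(n_values) / swap_usec


def estimated_latency(t1024, t512):
    """Apply the stated definition (T1024 - T512) / 512 literally.

    Assumption: t1024 and t512 are swap times for 1024-value and
    512-value messages; the result is in the same time unit as the inputs.
    """
    return (t1024 - t512) / 512


# Hypothetical timings in microseconds, for illustration only.
t512_usec = 30.0
t1024_usec = 50.0

print(message_bytes(1024))
print(bidirectional_bandwidth(1024, t1024_usec))
print(estimated_latency(t1024_usec, t512_usec))
```

With 1024 REAL*8 values the message size works out to 8192 bytes, which matches the 8KB swap size quoted at the top of the page.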

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:24:22 EDT.