PSTSWM Origin2000 Point-to-Point Communication Performance

Performance Studies using PSTSWM


SGI Origin2000 SWAP Performance

(ordered swap of 8KB message using SHMEM)

(performance measured per processor when all processors in the node are communicating)

Date/Person:           September 30, 1999 / P. Worley
Platform:              SGI Origin2000 at Los Alamos National Laboratory:
                       128 250-MHz MIPS R10000 processors
Environment:           IRIX 6.5
                       mpt_1.3.0.0
                       MIPSpro Compilers: Version 7.2.1
Communication Library: SHMEM
SWAP size:             1024 REAL*8 floating point values each direction
Message size:          Largest - 1024 REAL*8 floating point values
                       Smallest - 1 REAL*8 floating point value
Processors:            0 and 2
                       1 and 3
Latency Definition:    (T1024-T512)/512
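
The arithmetic behind the header can be sketched as follows. This is only an illustration, not the PSTSWM code: the function names are hypothetical, and the page does not spell out what the subscripts in T1024 and T512 denote, so the latency function simply applies the stated formula to two timings supplied by the caller and returns the result in the same time unit.

```python
REAL8_BYTES = 8  # a REAL*8 value occupies 8 bytes


def message_bytes(n_values):
    """Size in bytes of a message holding n_values REAL*8 values."""
    return n_values * REAL8_BYTES


def estimated_latency(t1024, t512):
    """Apply the page's latency definition, (T1024-T512)/512.

    t1024 and t512 are the two measured timings; their exact meaning
    is not defined on the page, so no interpretation is assumed here.
    """
    return (t1024 - t512) / 512


if __name__ == "__main__":
    # 1024 REAL*8 values correspond to the 8KB swap named in the title.
    print(message_bytes(1024))
```

With 1024 values per direction, message_bytes gives 8192 bytes, which matches the "8KB message" in the page heading.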
Results:

ordered swap using nonblocking send

Data                  Statistics
                      unidirectional bandwidth  estimated latency  model error
                      (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.               56.73                     6.11               14.9%
10 iter.              242.08                    6.05               35.8%
1 iter. w/overlap     54.76                     6.46               7.9%
10 iter. w/overlap    299.42                    5.34               19.5%

ordered swap using nonblocking receive

Data                  Statistics
                      unidirectional bandwidth  estimated latency  model error
                      (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.               77.58                     6.11               12.4%
10 iter.              340.77                    6.11               25.4%
1 iter. w/overlap     70.87                     6.27               17.0%
10 iter. w/overlap    361.20                    3.69               16.2%

ordered swap using nonblocking send and receive

Data                  Statistics
                      unidirectional bandwidth  estimated latency  model error
                      (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.               55.20                     6.07               7.5%
10 iter.              292.15                    6.09               21.7%
1 iter. w/overlap     55.80                     5.90               7.2%
10 iter. w/overlap    290.91                    6.11               21.7%

ordered swap using ready send

Data                  Statistics
                      unidirectional bandwidth  estimated latency  model error
                      (peak MByte/sec)          (usec/msg)         (max. rel. error)
1 iter.               75.57                     6.13               18.0%
10 iter.              286.03                    6.07               22.6%
1 iter. w/overlap     66.06                     5.98               7.1%
10 iter. w/overlap    317.52                    6.29               24.4%

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:07:49 EDT.