PSTSWM POWER4-1.3 Serial Performance

Performance Studies using PSTSWM


IBM POWER4 Regatta Serial Performance

(comparing performance when running multiple instances)

(fixed memory size per node)

Date/Person: October 18, 2001 / P. Worley
Platform: IBM Regatta H system at Oak Ridge National Laboratory (cheetah.ccs.ornl.gov):
     1 16-way Regatta H SMP node (1.3 GHz POWER4, 2 8-way Multichip Modules)
Environment: AIX 5.1
Code Version: 6.7.4
Make Options: MACH=sp COMM=mpi PRECISION=8 PERF=n WORKSPACE=20000000 MATH=essl
Compilation Options: mpxlf -O3 -qarch=auto -qtune=auto -qcache=auto
Link Options: -bmaxdata:0x70000000
Number of steps:
     T5, T10, T21, T42: 241 or 481
     T85: 49 or 97
     T170: 49 or 97
Number of processors per node: 1, 2, 3, or 4
Notes: using ESSL library routines for Fourier transforms and BLAS (-lessl)

T5 MFlop rates
[Figures: Per Processor Rate, Per Node Rate]

T10 MFlop rates
[Figures: Per Processor Rate, Per Node Rate]

T21 MFlop rates
[Figures: Per Processor Rate, Per Node Rate]

T42 MFlop rates
[Figures: Per Processor Rate, Per Node Rate]

T85 MFlop rates
[Figures: Per Processor Rate, Per Node Rate]

T170 MFlop rates
[Figures: Per Processor Rate, Per Node Rate]

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:22:36 EDT.