PSTSWM Paragon Serial Performance

Performance Studies using

PSTSWM


Intel Paragon Serial Performance

(Using KAI Math Library)

Date/Person: January 16, 1998 / P. Worley
Platform: Intel Paragon XP/S 150 MP at Oak Ridge National Laboratory:
     1024 MP nodes (three 50-MHz i860 XP processors per node)
Environment: Paragon OSF/1 Release 1.0.4 Server 1.4 R1_4_5
f77/Paragon Paragon Version R5.0.3
Code Version: 6.2
Make Options: MACH=paragon-mp COMM=mpi PRECISION=8 PERF=n MATH=kai WORKSPACE=5000000
Compilation Options: if77 -O4 -Mnodepchk -Knoieee -Msafealloc
Number of steps: T42: 241 or 481
     T85: 49 or 97
Notes: using KAI library routines for Fourier transforms and BLAS

MEASURED TIME PER TIMESTEP (SEC)

Problem  L1    L2    L3    L16
T42      .297  .593  .886  4.74
T85      1.85  3.70  5.57

MEASURED MFLOP/SEC RATES

Problem  L1    L2    L3    L16
T42      13.9  13.9  14.0  13.9
T85      13.1  13.1  13.1
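
The two tables can be cross-checked against each other: multiplying time per timestep by the MFLOP/sec rate gives the implied floating-point work per timestep. The following Python sketch does that arithmetic. It assumes that L1, L2, L3, and L16 denote the number of vertical levels, which the tables do not state; the dictionary and function names are illustrative only and are not part of PSTSWM.

```python
# Measured values copied from the two tables above (T85 has no L16 entry).
# Assumption: the column labels L1, L2, L3, L16 are numbers of vertical levels.
seconds_per_step = {
    "T42": {1: 0.297, 2: 0.593, 3: 0.886, 16: 4.74},
    "T85": {1: 1.85, 2: 3.70, 3: 5.57},
}
mflops_rate = {
    "T42": {1: 13.9, 2: 13.9, 3: 14.0, 16: 13.9},
    "T85": {1: 13.1, 2: 13.1, 3: 13.1},
}


def implied_mflop_per_step(problem, levels):
    """Implied work per timestep (MFLOP) = time (sec) * rate (MFLOP/sec)."""
    return seconds_per_step[problem][levels] * mflops_rate[problem][levels]


# Implied work per timestep, and the same quantity divided by the level count.
work = {
    p: {lv: implied_mflop_per_step(p, lv) for lv in seconds_per_step[p]}
    for p in seconds_per_step
}
per_level = {p: {lv: w / lv for lv, w in work[p].items()} for p in work}

for p in work:
    for lv in sorted(work[p]):
        print(f"{p} L{lv}: {work[p][lv]:7.2f} MFLOP/step, "
              f"{per_level[p][lv]:6.2f} MFLOP/step/level")
```

Under that assumption, the implied work per level is nearly constant within each problem size (about 4.1 MFLOP for T42 and about 24.3 MFLOP for T85), which is consistent with the time per timestep growing linearly with the number of levels at a fixed MFLOP/sec rate.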

DISCUSSION

Although each MP node has three 50-MHz i860 processors (one of which is normally dedicated to interprocessor communication), only one processor was active for these measurements. The bus bandwidth between the processors and node memory prevents this code from using more than one processor efficiently for computation. Like many spectral codes (especially those originally developed on vector machines), PSTSWM uses each datum only 1.3 times on average for each load from memory (as measured by the Cray hardware performance monitor).
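
To illustrate why low data reuse ties the achieved rate to memory bandwidth, the following Python sketch converts the measured MFLOP/sec rates into an implied rate of loads from memory. It is a rough back-of-the-envelope model, not a measurement: it assumes that each "use" of a datum corresponds to one floating-point operation, that every operand is 8 bytes (from PRECISION=8), and that the reuse figure of 1.3 measured on a Cray carries over to the Paragon. It ignores stores and cache effects. The function and constant names are hypothetical.

```python
BYTES_PER_WORD = 8    # from PRECISION=8 in the make options
USES_PER_LOAD = 1.3   # reuse figure quoted in the discussion (measured on a Cray)


def implied_load_traffic_mb_per_s(mflops,
                                  uses_per_load=USES_PER_LOAD,
                                  bytes_per_word=BYTES_PER_WORD):
    """Estimate load traffic (MB/sec) implied by a sustained MFLOP/sec rate.

    Assumption: one use of a datum equals one floating-point operation, so
    loads per second = flops per second / uses per load.
    """
    mloads_per_s = mflops / uses_per_load
    return mloads_per_s * bytes_per_word


# Representative measured rates from the MFLOP/sec table.
for problem, rate in (("T42", 13.9), ("T85", 13.1)):
    traffic = implied_load_traffic_mb_per_s(rate)
    print(f"{problem}: {rate} MFLOP/sec -> about {traffic:.1f} MB/sec of loads")
```

In this model the load traffic scales directly with the computation rate, so a second compute processor running the same code would roughly double the demand on the shared path to node memory.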

Patrick H. Worley (worleyph@ornl.gov)
Last Modified Monday, 15-Jul-2002 10:29:06 EDT.