PSTSWM AlphaSC-667 Serial Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC Serial Performance

(Using CXML Math Library)

Date/Person: April 17, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (667 MHz Alpha 21264a with 8MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Code Version: 6.7.2
Make Options: MACH=alpha-sc COMM=mpi PRECISION=8 PERF=n WORKSPACE=22000000 MATH=cxml
Compilation Options: f90 -O4 -assume accuracy_sensitive -math_library accurate -align dcommons -align records -arch host -tune host
    or f90 -O4 -assume noaccuracy_sensitive -math_library fast -align dcommons -align records -arch host -tune host
    or f90 -O5 -assume noaccuracy_sensitive -math_library fast -align dcommons -align records -arch host -tune host
Number of steps: T42: 241 or 481
     T85: 49 or 97
     T170: 49 or 97
Notes: using CXML library routines for Fourier transforms and BLAS

-O4 accurate

MEASURED TIME PER TIMESTEP (SEC)

Problem      L1      L2      L3     L16
T42       0.007   0.015   0.024   0.167
T85       0.043   0.096   0.149   1.235
T170      0.329   0.730   1.004

MEASURED MFLOP/SEC RATES

Problem      L1      L2      L3     L16
T42       564.9   549.9   525.0   394.6
T85       564.6   502.8   488.7   314.0
T170      465.4   419.1   457.2
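
The time and rate tables can be cross-checked against each other. The sketch below assumes that the reported rate is an operation count divided by the measured time, and that the column labels L1, L2, L3, L16 denote 1, 2, 3, and 16 vertical levels; neither assumption is stated on this page. The helper name `mflop_per_level` is hypothetical. It multiplies time by rate and divides by the level count to estimate the work per level per timestep.

```python
# Values copied from the "-O4 accurate" tables; None marks the unreported T170 L16 run.
LEVELS = [1, 2, 3, 16]  # assumed meaning of the L1, L2, L3, L16 columns
TIME_S = {
    "T42":  [0.007, 0.015, 0.024, 0.167],
    "T85":  [0.043, 0.096, 0.149, 1.235],
    "T170": [0.329, 0.730, 1.004, None],
}
MFLOPS = {
    "T42":  [564.9, 549.9, 525.0, 394.6],
    "T85":  [564.6, 502.8, 488.7, 314.0],
    "T170": [465.4, 419.1, 457.2, None],
}

def mflop_per_level(problem):
    """Implied MFLOP per timestep per level (hypothetical helper): time * rate / levels."""
    out = []
    for t, r, n in zip(TIME_S[problem], MFLOPS[problem], LEVELS):
        out.append(None if t is None or r is None else t * r / n)
    return out

for problem in TIME_S:
    vals = mflop_per_level(problem)
    print(problem, [None if v is None else round(v, 1) for v in vals])
```

Under these assumptions the T85 entries all come out near 24 MFLOP per level per timestep and the T170 entries near 153, which is consistent with an operation count that scales linearly with the number of levels. The T42 entries are noisier because the smallest times are reported to only one or two significant digits.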

-O4 fast

MEASURED TIME PER TIMESTEP (SEC)

Problem      L1      L2      L3     L16
T42       0.007   0.015   0.024   0.170
T85       0.044   0.096   0.149   1.247
T170      0.324   0.728   1.014

MEASURED MFLOP/SEC RATES

Problem      L1      L2      L3     L16
T42       555.4   541.5   517.9   388.1
T85       555.6   504.6   488.5   310.9
T170      472.6   420.2   452.9

-O5 fast

MEASURED TIME PER TIMESTEP (SEC)

Problem      L1      L2      L3     L16
T42       0.007   0.015   0.024   0.240
T85       0.043   0.094   0.143   1.675
T170      0.317   0.731   0.986

MEASURED MFLOP/SEC RATES

Problem      L1      L2      L3     L16
T42       552.2   542.5   519.7   275.0
T85       560.6   514.2   508.8   231.6
T170      482.3   418.4   465.6
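
The three compiler settings differ most visibly in the L16 column. As a minimal sketch of that comparison, the snippet below takes the L16 times per timestep from the tables above and expresses each setting as a ratio to the "-O4 accurate" run. The dictionary layout and the helper name `slowdown` are illustrative choices, not part of PSTSWM.

```python
# Time per timestep (seconds) in the L16 column, copied from the tables above.
# T170 is omitted because no L16 measurement is reported for it.
TIME_L16 = {
    "-O4 accurate": {"T42": 0.167, "T85": 1.235},
    "-O4 fast":     {"T42": 0.170, "T85": 1.247},
    "-O5 fast":     {"T42": 0.240, "T85": 1.675},
}

def slowdown(option, baseline="-O4 accurate"):
    """Ratio time(option) / time(baseline) per problem size (hypothetical helper)."""
    base = TIME_L16[baseline]
    return {problem: TIME_L16[option][problem] / base[problem] for problem in base}

for option in TIME_L16:
    ratios = slowdown(option)
    print(option, {problem: round(r, 2) for problem, r in ratios.items()})
```

A ratio above 1 means the setting took longer per timestep than "-O4 accurate" for that problem size. With the figures reported here, "-O5 fast" gives ratios of about 1.44 (T42) and 1.36 (T85), while "-O4 fast" stays within about 2% of the baseline.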

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Wednesday, 06-Dec-2000 12:35:59 EST.