PSTSWM AlphaSC-667 Serial Performance

Performance Studies using PSTSWM


Compaq AlphaServer SC Serial Performance

Date/Person: April 17, 2000 / P. Worley
Platform: Compaq AlphaServer SC at Oak Ridge National Laboratory (colt.ccs.ornl.gov):
     16 ES40 4-way SMP nodes (667 MHz Alpha 21264a with 8MB L2 cache)
Environment: Digital UNIX V5.0;   RMS 2.36
Code Version: 6.7.2
Make Options: MACH=alpha-sc COMM=mpi PRECISION=8 PERF=n WORKSPACE=22000000
Compilation Options: f90 -O4 -assume accuracy_sensitive -math_library accurate -align dcommons -align records -arch host -tune host
    or f90 -O4 -assume noaccuracy_sensitive -math_library fast -align dcommons -align records -arch host -tune host
    or f90 -O5 -assume noaccuracy_sensitive -math_library fast -align dcommons -align records -arch host -tune host
Number of steps: T42: 241 or 481
     T85: 49 or 97
     T170: 49 or 97
Notes: using PSTSWM Fortran routines for Fourier transforms and BLAS

-O4 accurate

MEASURED TIME PER TIMESTEP (SEC)

Problem      L1      L2      L3     L16
T42       0.008   0.016   0.025   0.185
T85       0.048   0.107   0.162   1.410
T170      0.355   0.791   1.110

MEASURED MFLOP/SEC RATES

Problem      L1      L2      L3     L16
T42       527.8   509.7   488.9   357.1
T85       506.1   453.9   449.4   275.1
T170      431.4   386.9   413.7
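
The two tables above can be cross-checked against each other. The following is a minimal sketch, assuming the reported MFLOP/sec rate is an operation count divided by the measured time, so that rate times time approximates the work per timestep in MFLOP. The dictionary and function names are illustrative and are not part of PSTSWM. The times are reported to only three decimal places, so the products are rough estimates, especially for T42.

```python
# Values copied from the "-O4 accurate" tables (L1 column only).
time_per_step_sec = {"T42": 0.008, "T85": 0.048, "T170": 0.355}
mflop_per_sec = {"T42": 527.8, "T85": 506.1, "T170": 431.4}


def implied_mflop_per_step(problem):
    """Approximate work per timestep in MFLOP: rate (MFLOP/sec) * time (sec).

    Assumption: the reported rate is (operation count) / (measured time).
    """
    return mflop_per_sec[problem] * time_per_step_sec[problem]


for problem in ("T42", "T85", "T170"):
    estimate = implied_mflop_per_step(problem)
    print(f"{problem}: roughly {estimate:.1f} MFLOP per timestep")
```

Because the input times are rounded, these estimates should be read as order-of-magnitude checks rather than exact operation counts.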

-O4 fast

MEASURED TIME PER TIMESTEP (SEC)

Problem      L1      L2      L3     L16
T42       0.008   0.016   0.026   0.185
T85       0.049   0.107   0.163   1.405
T170      0.348   0.784   1.112

MEASURED MFLOP/SEC RATES

Problem      L1      L2      L3     L16
T42       522.4   503.2   484.0   356.6
T85       498.9   453.4   447.2   276.0
T170      440.0   390.1   412.9

-O5 fast

MEASURED TIME PER TIMESTEP (SEC)

Problem      L1      L2      L3     L16
T42       0.009   0.018   0.028   0.267
T85       0.051   0.112   0.170   1.903
T170      0.352   0.823   1.125

MEASURED MFLOP/SEC RATES

Problem      L1      L2      L3     L16
T42       474.4   464.4   444.9   247.9
T85       471.7   433.2   427.8   203.8
T170      435.1   371.7   408.0
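
In every measured cell, the -O5 fast MFLOP/sec rate is lower than the corresponding -O4 fast rate. The sketch below is one way to quantify that gap as a ratio per cell. The values are copied from the two MFLOP/sec tables for the "fast" builds. The variable and function names are illustrative, and the missing T170 L16 entry is kept as `None` rather than filled in.

```python
# MFLOP/sec rates copied from the "-O4 fast" and "-O5 fast" tables.
# None marks the T170 / L16 cell, which has no reported value.
columns = ["L1", "L2", "L3", "L16"]
o4_fast = {
    "T42": [522.4, 503.2, 484.0, 356.6],
    "T85": [498.9, 453.4, 447.2, 276.0],
    "T170": [440.0, 390.1, 412.9, None],
}
o5_fast = {
    "T42": [474.4, 464.4, 444.9, 247.9],
    "T85": [471.7, 433.2, 427.8, 203.8],
    "T170": [435.1, 371.7, 408.0, None],
}


def rate_ratio(problem, column):
    """Return the -O5 fast rate divided by the -O4 fast rate, or None if unreported."""
    i = columns.index(column)
    o4, o5 = o4_fast[problem][i], o5_fast[problem][i]
    if o4 is None or o5 is None:
        return None
    return o5 / o4


for problem in o4_fast:
    for column in columns:
        ratio = rate_ratio(problem, column)
        shown = "n/a" if ratio is None else f"{ratio:.2f}"
        print(f"{problem} {column}: -O5/-O4 rate ratio = {shown}")
```

A ratio below 1.0 means the -O5 build ran slower for that problem size and column. The ratio alone does not say why.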

DISCUSSION


Patrick H. Worley (worleyph@ornl.gov)
Last Modified Wednesday, 06-Dec-2000 12:35:59 EST.