PSTSWM on the Cray X1

Code Version Comparisons

These experiments compare the single-processor (MSP) performance of different implementations of PSTSWM.
The following graphs compare the performance of the four different implementations when compiled with agg2 optimization and run with 64 MByte pages. The first three figures plot the performance as a function of horizontal resolution for 1, 18, and 66 vertical levels. The other two figures plot the performance as a function of number of levels for fixed horizontal resolutions of T42 and T85.
These data show that the original code does not perform well on the X1, apparently running primarily on the scalar unit. The version ported and tuned for the SX6 also does not perform as well on the X1 as the version ported and tuned specifically for the X1, a gap that is most obvious at large horizontal resolutions.

Finally, fixing the number of vertical levels at compile time improves performance significantly for small numbers of vertical levels. The current hypothesis is that the compiler attempts to stream over the vertical level loop, which leaves hardware idle if there are fewer than 4 levels. When loop lengths are specified at compile time, the compiler can make more appropriate decisions. This interpretation is consistent with the observation that, for the compile-time cases, performance does not increase much as the number of vertical levels increases. In contrast, performance increases significantly from 1 to 18 vertical levels when the number of vertical levels is specified at runtime.
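The difference between runtime and compile-time loop lengths can be illustrated with a small sketch. This is not PSTSWM code (PSTSWM is written in Fortran); the kernel, the names `update_runtime` and `update_fixed`, and the data layout are all hypothetical, chosen only to show the two ways a vertical level count can reach the compiler. Whether a given compiler actually makes a better decision in the fixed case depends on the compiler and its options.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel, not taken from PSTSWM. The field is stored as
// field[point * nlev + level], so the vertical level loop is innermost.

// Runtime variant: nlev is known only when the program runs, so the
// compiler must pick a loop strategy without knowing the trip count.
void update_runtime(std::vector<double>& field,
                    const std::vector<double>& tend,
                    std::size_t npoints, std::size_t nlev, double dt) {
    assert(field.size() == npoints * nlev);
    assert(tend.size() == field.size());
    for (std::size_t p = 0; p < npoints; ++p) {
        for (std::size_t k = 0; k < nlev; ++k) {
            field[p * nlev + k] += dt * tend[p * nlev + k];
        }
    }
}

// Compile-time variant: NLEV is a template parameter, so the trip count
// of the inner loop is a constant visible to the compiler. It may then
// unroll a short level loop or parallelize the point loop instead.
template <std::size_t NLEV>
void update_fixed(std::vector<double>& field,
                  const std::vector<double>& tend,
                  std::size_t npoints, double dt) {
    assert(field.size() == npoints * NLEV);
    assert(tend.size() == field.size());
    for (std::size_t p = 0; p < npoints; ++p) {
        for (std::size_t k = 0; k < NLEV; ++k) {
            field[p * NLEV + k] += dt * tend[p * NLEV + k];
        }
    }
}
```

Both variants compute the same result; a call such as `update_fixed<18>(field, tend, npoints, dt)` simply requires a separate instantiation (in effect, a recompile) for each level count, which mirrors the trade-off of fixing the number of vertical levels at compile time.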
URL: http://www.csm.ornl.gov/evaluation/PHOENIX/PSTSWM-code.CRAYX1.html
Updated: Saturday, 01-Nov-2003 20:02:08 EST