
Evaluation of Early Systems


Cray X1 Results

Oak Ridge National Laboratory installed a 32-processor Cray X1 on March 18, 2003. The system grew to 256 processors in October 2003 and to 512 processors in May 2004. The following results represent achievable lower bounds, in that the operating system, compilers, communication libraries, and math libraries are all undergoing active development. (The impact of these changes is demonstrated in a number of the following performance studies.) Please note that the performance of a number of applications is significantly better in the more recent talks and papers than in the older ones.

ORNL Papers

Cray X1 Evaluation Status Report (PDF), P. A. Agarwal, et al. in Proceedings of the 46th Cray User Group Conference, Knoxville, TN, May 17-20, 2004.

Adventures in Vectorizing the Community Land Model (PDF), F. M. Hoffman, M. Vertenstein, H. Kitabata, J. B. White III, P. H. Worley, J. B. Drake, M. Cordery, in Proceedings of the 46th Cray User Group Conference, Knoxville, TN, May 17-21, 2004. (186380 bytes)

Experience with the Full CCSM (PDF), J. B. Drake, P. H. Worley, I. Carpenter, M. Cordery, in Proceedings of the 46th Cray User Group Conference, Knoxville, TN, May 17-21, 2004. (192691 bytes)

GYRO: Analyzing New Physics in Record Time (PDF), M. R. Fahey, J. Candy, in Proceedings of the 46th Cray User Group Conference, Knoxville, TN, May 17-20, 2004.

The Performance Evolution of the Parallel Ocean Program on the Cray X1 (PDF), P. H. Worley, J. Levesque, in Proceedings of the 46th Cray User Group Conference, Knoxville, TN, May 17-20, 2004.

Cray X1 Evaluation Status Report (PDF), P. A. Agarwal, R. A. Alexander, E. Apra, S. Balay, A. S. Bland, J. Colgan, E. F. D'Azevedo, J. J. Dongarra, T. H. Dunigan, Jr., M. R. Fahey, R. A. Fahey, A. Geist, M. Gordon, R. J. Harrison, D. Kaushik, M. Krishnakumar, P. Luszczek, B. Messer, A. Mezzacappa, J. A. Nichols, J. Nieplocha, L. Oliker, T. Packwood, M. S. Pindzola, T. C. Schulthess, J. S. Vetter, J. B. White III, T. L. Windus, P. H. Worley, T. Zacharia, ORNL Technical Report ORNL/TM-2004/13, January 2004.

Early Evaluation of the Cray X1 (PDF), T. H. Dunigan, Jr., M. R. Fahey, J. B. White III, P. H. Worley, in Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing (SC03), Phoenix, AZ, November 15-21, 2003.

Early Operations Experience with the Cray X1 at the Oak Ridge National Laboratory Center for Computational Sciences (PDF), A. S. Bland, R. Alexander, S. M. Carter, K. D. Matney, in Proceedings of the 45th Cray User Group Conference, Columbus, OH, May 12-16, 2003.

DOE Ultrascale Evaluation Plan of the Cray X1 (PDF), M. R. Fahey, J. B. White III, in Proceedings of the 45th Cray User Group Conference, Columbus, OH, May 12-16, 2003.

An Optimization Experiment with the Community Land Model on the Cray X1 (PDF), J. B. White III, in Proceedings of the 45th Cray User Group Conference, Columbus, OH, May 12-16, 2003.

Early Evaluation of the Cray X1 at Oak Ridge National Laboratory (PDF), P. H. Worley, T. H. Dunigan, in Proceedings of the 45th Cray User Group Conference, Columbus, OH, May 12-16, 2003.

Cray X1 Evaluation (PDF), A. S. Bland, et al., ORNL Technical Report ORNL/TM-2003/67, March 2003.

ORNL Presentations

GYRO: Analyzing New Physics in Record Time (PDF), M. R. Fahey, J. Candy, 46th Cray User Group Conference, Knoxville Marriott, Knoxville, Tennessee, May 20, 2004.

A Progress Report on the Cray X1 Evaluation by CCS at ORNL (PDF), J. S. Vetter, 46th Cray User Group Conference, Knoxville Marriott, Knoxville, Tennessee, May 18, 2004.

The Performance Evolution of the Parallel Ocean Program on the Cray X1 (HTML, PDF), P. H. Worley, J. Levesque, 46th Cray User Group Conference, Knoxville Marriott, Knoxville, Tennessee, May 18, 2004.

Cray X1 Optimization: A Customer's Perspective (HTML, PDF), P. H. Worley, 46th Cray User Group Conference, Knoxville Marriott, Knoxville, Tennessee, May 18, 2004.

Cray X1 Evaluation: Overview and Scalability Analysis (HTML), P. H. Worley, SIAM Conference on Parallel Processing for Scientific Computing 2004, Hyatt at Fisherman's Wharf, San Francisco, California, February 26, 2004.

(Minor variations of this talk were also given at the ANL/ORNL Site Visit of the National Research Council/National Academies Computer Science and Telecommunications Board, Argonne National Laboratory, Argonne, Illinois, March 2, 2004 and at the Cray X1 Review, Oak Ridge National Laboratory, Oak Ridge, Tennessee, February 10, 2004.)

Early Evaluation of the Cray X1 (HTML), P. H. Worley, SC2003, Phoenix Convention Center, Phoenix, Arizona, November 19, 2003.

Scalable Supercomputer Solving Superconductivity (PDF), J. B. White III, SC2003, Cray Exhibit Booth, Phoenix Convention Center, Phoenix, Arizona, November 19, 2003.

Early Evaluation of the Cray X1 - Part 1.5 (HTML), P. H. Worley, SC2003, Cray Exhibit Booth, Phoenix Convention Center, Phoenix, Arizona, November 18, 2003.

Latest Performance Results from ORNL: Cray X1 and SGI Altix (HTML), P. H. Worley, System and Application Performance Workshop, 2003 LACSI Symposium, Eldorado Hotel, Santa Fe, New Mexico, October 27, 2003. (A minor variation of this talk was also given at the Computer Science Department, University of Tennessee, Knoxville, Tennessee, January 9, 2004.)

CCSM Component Performance Benchmarking and Status of the Cray X1 at ORNL (HTML), P. H. Worley, Computing in the Atmospheric Sciences Workshop 2003, L'Imperial Palace Hotel, Annecy, France, September 10, 2003.

Early Operations Experience with the Cray X1 at the Oak Ridge National Laboratory Center for Computational Sciences (PDF), A. S. Bland, R. Alexander, S. M. Carter, K. D. Matney, 45th Cray User Group Conference, Hyatt on Capital Square, Columbus, Ohio, May 12-16, 2003.

DOE Ultrascale Evaluation Plan of the Cray X1 (PDF), M. R. Fahey, J. B. White III, 45th Cray User Group Conference, Hyatt on Capital Square, Columbus, Ohio, May 12-16, 2003.

An Optimization Experiment with the Community Land Model on the Cray X1 (PDF), J. B. White III, 45th Cray User Group Conference, Hyatt on Capital Square, Columbus, Ohio, May 12-16, 2003.

Early Evaluation of the Cray X1 at Oak Ridge National Laboratory (HTML), P. H. Worley, T. H. Dunigan, 45th Cray User Group Conference, Hyatt on Capital Square, Columbus, Ohio, May 13, 2003.

Other Papers, Presentations, and Data on Cray X1 Performance

Benchmark Studies

  • Tom Dunigan's results, including
  • PSTSWM Experiments (April-June 2003)

    While subject to some interpretation, the PSTSWM results indicate the following.

    1. Running the code without modifying it for vectorization gave poor performance (never more than 400 MFlops/sec, and typically less than 250 MFlops/sec).

    2. Modest modifications were sufficient to achieve 4.0 GFlops/sec in the best case. If the vertical dimension is specified at compile time, the best performance increases to 6.0 GFlops/sec. The Fourier transform (coded in Fortran) achieved only 25% of peak. The performance of the Legendre transform (coded in Fortran) increases with problem size, exceeding 50% of peak for the largest problem sizes when the vertical dimension is specified at compile time. These two transforms are the performance-critical operations.

    3. Performance increases with problem size, both horizontal and vertical resolution.

    4. Running instances of the code on all processors of the SMP node simultaneously showed almost no performance degradation, unlike all other systems for which we have data.

    5. Assigning processes to SSP processors directly instead of allowing the compiler to assign work within an MSP has some performance advantage in some cases, but the compiler does a reasonable job overall. The PSTSWM data does not indicate that using SSPs directly is a useful optimization strategy.

    6. System comparisons using a small climate-size problem resolution (T42L18):
      • a single MSP processor in the Cray X1 is 2.8 times faster than a single processor in an IBM p690 and 10% faster than a single processor in the SX-6.
      • a 4-processor X1 SMP node has 27% less throughput than a 32-processor p690 when making simultaneous serial runs.
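
As a rough cross-check of the PSTSWM figures above, the reported rates can be expressed as fractions of peak. The sketch below assumes a peak of 12.8 GFlops/sec per X1 MSP, a value that is not stated on this page; the function and variable names are illustrative only.

```python
# Assumption (not from this page): peak rate of one Cray X1 MSP in GFlops/sec.
X1_MSP_PEAK_GFLOPS = 12.8


def fraction_of_peak(achieved_gflops, peak_gflops=X1_MSP_PEAK_GFLOPS):
    """Return the achieved rate as a fraction of the given peak rate."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return achieved_gflops / peak_gflops


# Rates quoted in items 1 and 2 of the PSTSWM list, converted to GFlops/sec.
reported_gflops = {
    "unmodified code, upper bound": 0.4,
    "unmodified code, typical upper bound": 0.25,
    "modified code, best case": 4.0,
    "modified code, vertical dimension fixed at compile time": 6.0,
}

for label, gflops in reported_gflops.items():
    percent = 100.0 * fraction_of_peak(gflops)
    print(f"{label}: {gflops:.2f} GFlops/sec is {percent:.1f}% of the assumed peak")
```

Under that assumed peak, the best whole-code figures sit between the 25% quoted for the Fourier transform and the better-than-50% quoted for the Legendre transform at large problem sizes.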


  • POP Experiments (April-June 2003)

    The POP results indicate the following.

    1. Running the code without modifications achieved performance similar to that on the IBM p690 cluster, but this was over 5 times slower than when optimized on the Cray X1.

    2. The optimizations used to port POP to the Earth Simulator work reasonably well on the X1. Using Co-Array Fortran to decrease the latency in latency-sensitive algorithms improves performance further. Modifying the Earth Simulator optimizations to take into account the Cray X1 architecture should also improve performance. The latter work is ongoing.

    3. System comparisons using the one degree benchmark problem that corresponds to how POP is used in coupled climate simulations:
      • POP on the Cray X1 is more than 5 times faster than the fastest of the nonvector systems examined (the IBM p690 cluster) for the same number of processors.
      • POP on the Cray X1 is 7% slower than POP on the Earth Simulator for the same number of processors. As optimization is ongoing, this is likely to change over time.
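
The two POP comparisons above can be combined into an implied bound for the Earth Simulator relative to the p690 cluster. This is only a sketch: it assumes that "7% slower" means the X1 runs at 93% of the Earth Simulator's rate and that "more than 5 times faster" is a lower bound on a rate ratio. Neither reading is confirmed by the source, and the names are hypothetical.

```python
def relative_rate_from_percent_slower(percent_slower):
    """Rate of system A relative to system B when A is `percent_slower`
    percent slower than B (assumed throughput interpretation)."""
    if not 0.0 <= percent_slower < 100.0:
        raise ValueError("percent_slower must be in [0, 100)")
    return 1.0 - percent_slower / 100.0


# Lower bound from the text: X1 is "more than 5 times faster" than the p690 cluster.
x1_vs_p690_lower_bound = 5.0

# From the text: X1 is 7% slower than the Earth Simulator (assumed rate ratio 0.93).
x1_vs_es = relative_rate_from_percent_slower(7.0)

# Implied lower bound on the Earth Simulator rate relative to the p690 cluster.
es_vs_p690_lower_bound = x1_vs_p690_lower_bound / x1_vs_es

print(f"X1 / Earth Simulator rate ratio (assumed reading): {x1_vs_es:.2f}")
print(f"Implied Earth Simulator / p690 lower bound: {es_vs_p690_lower_bound:.2f}")
```

Because optimization on the X1 was ongoing when these numbers were recorded, any such derived ratio is a snapshot rather than a stable property of the systems.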




 
URL: http://www.csm.ornl.gov/evaluation/PHOENIX/index.html
Updated: Friday, 19-Nov-2004 09:40:00 EST
