Summary of DOE SciDAC Workshop on
Porting CCSM to the CRAY X1
A workshop on porting the Community Climate System Model (CCSM) to the CRAY X1 was held on February 6, 2003 at the Center Green Campus of the National Center for Atmospheric Research (NCAR) in Boulder, Colorado. The goals of the meeting were as follows.
- To identify individuals and organizations that are interested or actively engaged in porting one or more of the CCSM component models CAM, POP, CICE, CLM2, and the coupler to vector systems.
- To hear reports of progress and problems in current CCSM vectorization activities.
- To identify gaps or issues in the current efforts.
- To establish lines of communication between the different vectorization efforts and with the NCAR software engineers, in order to coordinate future work and to encourage the sharing of results and code.
- To begin defining requirements and procedures for the adoption of vector-friendly code in future released versions of CCSM.
Over 35 people attended the workshop, including representatives from NCAR, from the DOE Laboratories Los Alamos National Laboratory (LANL), Lawrence Berkeley National Laboratory (LBNL), and Oak Ridge National Laboratory (ORNL), from NASA-Goddard, from the Japanese Institute CRIEPI, and from the computer vendors Cray and NEC. A partial list of the attendees is attached to the end of this document.
The morning was devoted to a vectorization tutorial and informal presentations:
- Introduction to vector architectures
  - James B. (Trey) White III (ORNL)
- CCM/CAM work at NEC
  - Dave Parks (NEC)
- Coding Styles for SX Systems
  - John Michalakes (NCAR)
- CAM on the Cray X1
  - Matthew Cordery (Cray) & Nathan Wichmann (Cray)
- POP on the Cray X1
  - John Levesque (Cray) & Phil Jones (LANL)
- Status of Vectorizing CICE
  - Phil Jones (LANL)
(See also http://www.csm.ornl.gov/meetings/climate2.html.)
The afternoon began with an open discussion of the different vectorization efforts and their relationship with CCSM development. The meeting participants then divided into subgroups, discussing each of the component models in more depth. The subgroups reported on their discussions during the final session of the meeting. The following summarizes the conclusions of the meeting.
- There are three separate (but interrelated) efforts examining the vectorization of the CCSM.
  - A project involving scientists from NCAR, NEC, CRIEPI, and Fujitsu (America) is currently porting the CCSM to the Earth Simulator. This project has specific deadlines and deliverables, and is focused solely on the Earth Simulator. It is also focused solely on the finite volume dynamical core of CAM, in order to support atmospheric chemistry, and is ignoring vectorization of both the Eulerian and Semi-Lagrangian spectral dynamical cores.
  - A project between ORNL and Cray is currently porting CCSM components to the X1. ORNL will use these in an evaluation of the X1 architecture and an assessment of the production-readiness of the X1 for climate change simulations.
  - The Malone/Drake SciDAC project in climate has a responsibility for evaluating new architectures for climate simulations, and is currently tracking both NEC and Cray porting efforts. This SciDAC project involves personnel at ANL, LANL, LBNL, LLNL, NASA-Goddard, NCAR, ORNL, and PNNL.
This list of projects and institutions is somewhat misleading in that all of the efforts are working with the component model developers to some degree. In particular, POP and CICE developers at LANL are working with both NEC and Cray in porting and optimizing the POP and CICE models on the vector systems.
- The number of individuals involved with these porting efforts is small. Both individuals and organizations are interested in sharing results and code, and in working together to the extent possible to accelerate the vectorization work. The primary complication is that the best performance on the NEC SX-6 and the Cray X1 may involve different optimizations. For code to make it into a released version of the CCSM, it should not be highly optimized on one system to the detriment of other systems. Responsibility for identifying a compromise between the different vectorization efforts has been assigned to the NCAR and DOE lab participants, as noted below.
- The desire of all participants is that the work on vectorization make it into the released version of the CCSM. At the moment, vector systems are not among the CCSM target architectures, and vectorization of components is not a high priority for NCAR CCSM software engineers. However, the Change Review Boards (CRBs) for the component models are favorably inclined to adopt modifications to CCSM code that improve vector performance as long as they do not degrade performance on the current target architectures and do not negatively impact the readability and maintainability of the code. This is not specific to vectorization. Changes that satisfy these requirements and improve performance on any high performance computing system that will be used by the CCSM community are encouraged.
- Two CVS branches in the CAM development trunk will be created at NCAR for the Cray and NEC optimization modifications, respectively. These will enable code sharing, as well as make it easier to keep the vector versions of the CAM synchronized with the normal CAM development. At least one CVS branch will be created for the CLM2 vectorization work. The other models (POP, CICE) involve far fewer developers, and will be coordinated with the development teams directly.
- As described above, modifications for vectorization will be submitted to the component model Change Review Boards or appropriate working groups in the same fashion as any other proposed modification. ORNL personnel who are already CAM and CLM developers will be responsible for evaluating and submitting changes proposed by Cray. NCAR personnel working with NEC, CRIEPI, and Fujitsu will have a corresponding responsibility. The expectation is that ORNL and NCAR will coordinate these check-in requests before submitting them to the relevant CRB.
- One important open issue is the involvement of the developers of the next generation coupler (CPL6) in vectorization. This work is partly funded by the SciDAC project, and explicit direction by the SciDAC PIs to port and tune CPL6 on vector systems is required to address this issue in a timely manner.
For additional information, contact Patrick Worley (firstname.lastname@example.org).
A. Status of Component Model Activities.
The following template is used.
- Climate Component
- Code: name of code(s)
- Organizations: participants' organizations. Being listed does not necessarily indicate an organizational commitment. Being listed also does not mean that an individual in the organization has been identified to participate in these activities yet. The expectation of participation in the near future is sufficient for inclusion.
- People: Number of people from each organization. If known, names are included as well. These are conservative counts, and the actual number of participants is likely to be higher. The amount of time that each participant will devote to these activities is not indicated. NOTE: Some of these names are tentative and may yet change.
- What: target platforms
- Issues: list of current activities; list of known problems that must be addressed in order to make progress.
- Responsibilities: Individual or organizational responsibilities in the case that the work has been partitioned.
- SWE: software engineering activities, decisions, and issues
- Code: Community Atmospheric Model (CAM)
- Organizations: Cray, CRIEPI, NASA Goddard, NCAR, NEC, ORNL
- Cray (1+): Matthew Cordery
- NEC (2+): Dave Parks
- NCAR (1): Byron Boville
- ORNL (2): James B. White, Patrick Worley
- What: NEC SX-6 (in particular, the Earth Simulator (ES40)), Cray X1
- Compiler analysis (loop marks), porting, and profiling are ongoing.
- NEC expects to have single node vectorization and optimization for the SX-6 complete by early Fall.
- The radiation routines and cloud scheme are the focus of much of the current work.
- Spectral Eulerian dynamical core (ORNL/Cray);
- Finite Volume dynamical core (NCAR/NEC)
- Two branches in the CAM CVS code repository will be created at NCAR, one for the SX-6 modifications and one for the X1 modifications.
- Vector-friendly modifications will be checked into the development branch as soon as reasonable (subject to Change Review Board approval) using the usual check-in procedures. NCAR (Byron Boville) and ORNL (Patrick Worley) will be responsible for testing and submitting these modifications.
- Code: Community Land Model (CLM2)
- Organizations: NCAR, ORNL, Cray, NEC
- NCAR (1): Mariana Vertenstein
- ORNL (2): Forrest Hoffman, James B. White
- Cray (1): Matthew Cordery
- NEC/Tokyo (1): Hideyuki Kitahara
- What: NEC SX-6 (ES40), Cray X1
- Promising approaches have been identified.
- Currently investigating implementation issues.
- Experiments are currently limited by available resources.
- Vectorization modifications must preserve OpenMP/cache friendliness (via parameters or other knobs).
- Coordination with NEC/Tokyo must still be worked out, as it is important that the same general approach be taken by both the SX-6 and X1 ports.
- At least one branch in the CLM2 CVS code repository will be created at NCAR.
- Vector-friendly modifications will be checked into the development branch as soon as reasonable (subject to Change Review Board approval) using the usual check-in procedures. Initially, ORNL (Forrest Hoffman) will be responsible for testing and submitting these modifications.
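The CLM2 requirement above, that vectorization changes preserve OpenMP and cache friendliness "via parameters or other knobs", can be illustrated with a tunable block length. The following is only a minimal sketch under stated assumptions, not the CLM2 approach: the function `column_sums`, the parameter `chunk_size`, and the data are all hypothetical. The point is that one loop nest can expose long inner loops (large `chunk_size`, suited to vector hardware) or short ones (small `chunk_size`, suited to cache-based processors) without changing the answer.

```python
def column_sums(field, chunk_size):
    """Sum the levels of every column, processing columns in blocks.

    field[c][k] is the value in column c at level k (hypothetical layout).
    chunk_size is the tuning knob: it sets the length of the innermost loop.
    """
    ncols = len(field)
    result = [0.0] * ncols
    for start in range(0, ncols, chunk_size):
        stop = min(start + chunk_size, ncols)
        nlev = len(field[start])
        # Level loop outside, column loop innermost: the innermost loop
        # runs over independent columns, so its length equals the block length.
        for k in range(nlev):
            for c in range(start, stop):
                result[c] += field[c][k]
    return result


# Hypothetical data: 10 columns with 3 levels each.
FIELD = [[float(c + k) for k in range(3)] for c in range(10)]

SHORT = column_sums(FIELD, chunk_size=4)   # short inner loops (cache-style setting)
LONG = column_sums(FIELD, chunk_size=10)   # one long inner loop (vector-style setting)
print(SHORT == LONG)
```

Because each column is accumulated over its levels in the same order whatever the block length, the two settings give identical results in this sketch, so the knob changes only how the work is traversed.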
- Code: Parallel Ocean Program (POP)
- Organizations: Cray, CRIEPI, LANL, NCAR
- LANL (1): Philip Jones
- Cray (1): John Levesque
- NCAR (2): Nancy Norton
- CRIEPI (2): Yoshikatsu Yoshida, Daisuke Tsumune
- What: SX-6 (ES40), Cray X1
- The changes made by CRIEPI and by Cray need to be coordinated.
- Unlike the other components, significant vectorization work is complete, and is highly successful. Parallel algorithm issues, such as latency in the 2D barotropic solver, currently limit performance scalability.
- I/O performance is another concern for both the Earth Simulator and the X1.
- Cray and CRIEPI have both been working with POP1.4.3; POP2.0 vectorization and the role of POP2.0 in the CCSM are not clear.
- Responsibility: Coordination (LANL/Phil Jones)
- SWE: Phil Jones will be responsible for testing and submitting vector-friendly modifications to the Ocean Working Group "CRB".
- Code: CICE
- Organizations: Cray, Fujitsu (America), LANL, NCAR
- Fujitsu (1): Clifford Chen
- LANL (1): Bill Lipscomb
- NCAR (1): Julie Schramm
- What: SX-6 (ES40), Cray X1
- Significant progress has been made, improving performance on both vector (Fujitsu VPP500) and nonvector systems, while retaining bit-for-bit agreement with original code on nonvector systems.
- Further improvements in vector performance are possible.
- The current modifications still need to be reviewed by the Polar Working Group.
- Vectorization and optimization efforts are complicated by the proposed merger of CICE and CSIM. A plan is being developed to coordinate this merger, and the issue should be resolved in the near future.
- There is as yet no advocate/interface within the CCSM development community for CICE vectorization.
- Code: CPL6
- Organizations: ANL, Cray, NEC, NCAR
- ANL (1+): Jay Larson
- Fujitsu (1):
- What: SX-6 (ES40), Cray X1
- The primary coupler responsibility is data routing, which is not a vectorization issue, but is affected by the latency and bandwidth characteristics of the target platforms.
- Remapping is the primary computational task, and is best characterized as sparse matrix multiplication.
- The next generation coupler is CPL6, which will be added to the CCSM in the next few months. This should be the target for vectorization.
- The unit tester for MCT (the infrastructure for CPL6) runs on both the NEC and Fujitsu parallel vector systems. Performance has not been evaluated, but the code was developed and optimized for nonvector systems.
- There is as yet no formal request to the CPL6 developers to optimize the coupler on vector systems.
- SWE: There is as yet no advocate/interface for vectorizing CPL6 within the CCSM development community.
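The CPL6 notes above describe remapping as sparse matrix multiplication. As a rough illustration of what that means, and not a description of the CPL6 or MCT implementation, the sketch below applies interpolation weights stored in compressed sparse row (CSR) form to a source field. The function name `remap`, the array names, and the tiny grid are assumptions chosen for the example.

```python
def remap(row_ptr, col_idx, weights, src):
    """Return dst with dst[i] = sum over j of W[i][j] * src[j].

    W is a sparse weight matrix in CSR form: the nonzeros of row i are
    weights[row_ptr[i]:row_ptr[i + 1]], and col_idx gives the source cell
    that each stored weight multiplies.
    """
    dst = []
    for i in range(len(row_ptr) - 1):
        total = 0.0
        # Only the stored (nonzero) weights of destination cell i are visited.
        # src[col_idx[k]] is an indirect (gather) access, which is the part
        # of this kernel that matters most for vector performance.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += weights[k] * src[col_idx[k]]
        dst.append(total)
    return dst


# Hypothetical example: 4 source cells averaged pairwise onto 2 destination cells.
ROW_PTR = [0, 2, 4]
COL_IDX = [0, 1, 2, 3]
WEIGHTS = [0.5, 0.5, 0.5, 0.5]
SRC = [280.0, 282.0, 290.0, 294.0]

DST = remap(ROW_PTR, COL_IDX, WEIGHTS, SRC)
print(DST)
```

The inner loop is short (one pass over the few source cells that contribute to a destination cell) and indirectly addressed, which suggests why a remap written for nonvector systems may need restructuring before it performs well on vector hardware.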
- Coupled System
- Code: CCSM
- Organizations: NCAR/NEC/CRIEPI/Fujitsu, ORNL/CRAY
- Problem resolution impacts performance. This has already been addressed in the Earth Simulator port project. Problem resolution is important to determine what percentage of vectorization is achievable, and what percentage is needed in order to achieve performance goals.
- Configuration experiments will be required to determine how best to run the coupled model, which will in turn depend on the success in vectorization of the component models and the problem resolution.
- All components in CCSM need to be modified for vectorization, and the modifications need to be validated in the context of the full CCSM.
- The goal is to produce a vector-friendly CCSM that will become part of a future CCSM release.
B. Workshop Participants
A partial list of attendees follows:
Name                  Institution
Byron Boville         NCAR
Lawrence Buja         NCAR
Matthew Cordery       Cray
Tony Craig            NCAR/CGD
John Drake            ORNL
Tom Engel             NCAR/SCD
Mark R. Faney         ORNL
Steve Gombosi         NCAR/SCD
Helen He              LBNL
Tom Henderson         NCAR
Forrest Hoffman       ORNL
Phil Jones            LANL
Brian Kauffman        NCAR
Rory Kelly            NCAR/SCD
William Large         NCAR
John Levesque         Cray
John Michalakes       NCAR/MMM
David Parks           NEC
Bill Putman           NASA-DAO
Julie Schramm         NCAR
Daisuke Tsumune       NCAR/CRIEPI
Vince Wayland         NCAR
Nathan Wichmann       Cray
James B. White III    ORNL
Patrick H. Worley     ORNL
Woo-Sun Yang          LBNL