The Accelerated Climate Prediction Initiative:

Bringing the Promise of Simulation to the

Challenge of Climate Change

June 1998

Dear Reader,

The United States Global Change Research Program and its researchers anticipate that ever more complex and sophisticated models of the Earth System will be required for both research and assessment needs. The proposed Accelerated Climate Prediction Initiative is an ambitious program to dramatically increase the rate of climate simulation model development and application to produce decade-to-century-scale forecasts of climate change with regional resolution. The Department of Energy intends to couple its expertise and capabilities in simulation science to the scientific capabilities in the broader global change research community.

The following document describes the concept for the initiative, as well as the technological and research needs that must be met to achieve the goal of substantially reducing the uncertainties surrounding model-based projections of decade-to-century climate change. I sincerely appreciate the work of the more than twenty scientists from the National Laboratory system, other Federal research institutions, and universities who have contributed to defining this effort. This document is not a final plan. Rather, it is a proposal for a conversation with the stake-holders and constituents in both the climate and computational research communities to address how best to employ the promise of simulation science to the challenge of long-term climate prediction.

In that spirit, we welcome your feedback on this initiative.

Ari Patrinos

Associate Director for Biological and Environmental Research

Office of Energy Research

 

Contents

Executive Summary

 

Policy Needs for Climate Prediction

Climate Prediction: The Science and the Models

Collaborating at a Distance

Speed, Memory, and Parallel Computing: Running the Models

Benchmarking: Ensuring Computational Performance and Usability

Building Regional Climate Collaboration Centers

Institutional Commitment and Interaction

ACPI Implementation Strategy

 

Glossary

References

 

For More Information

 

Executive Summary

Energy and environmental policy development requires greater model certainty at global and regional levels, shorter times to complete model simulations, and more detailed information about climate change effects than is possible given the current states of both science and technology. The United States requires an unprecedented acceleration and extension of the modeling state of the art to reduce existing uncertainties about long-term climate change and provide regional specification in climate change projections to support national and international energy and environmental policies that must be formulated and implemented early next decade.

Toward this end, the Accelerated Climate Prediction Initiative's goals are:

* To accelerate progress in climate simulation model development and application;

* To substantially reduce the uncertainties in decade-to-century model-based projections of climate change; and

* To increase the availability and usability of climate change projections to the broader climate change research and assessment communities.

Progress depends on advances in scientific knowledge (which depend, in part, on knowledge generated by models) and on the physical ability of computers, databases, networks and associated computational infrastructure to process large amounts of data in short amounts of time. Accelerating computer technology development, therefore, is a necessary element in improving climate models.

The ACPI takes an integrated view of the improvements required to accelerate progress in climate simulation and in the projection of climate change. Interrelated activities in model development and evaluation, simulations and projections, and analysis and assessment must be addressed by concurrent improvements in the models themselves, data availability and usefulness, computer speed and memory, collaboration and data management capabilities, and institutional interaction (Figure 1). The implementation strategy for the ACPI includes a partnership with the Accelerated Strategic Computing Initiative within the DOE Strategic Simulation Initiative. This suite of improvements and relationships is collectively essential to building an effective program that delivers projections policymakers can apply to analyze climate-related issues.

Figure 1. Accelerating the US climate prediction capability in the context of an integrated program allows international and national policies to be addressed. Concurrent improvements are required in the models themselves, data availability and usefulness, computer speed and memory, collaboration and data management capabilities, and institutional interaction.

Model Improvements. Breakthroughs in climate simulation require that the modeling community construct models capable of simulating the principal climate components, as well as the dynamical behavior of the fully coupled earth climate system. Full climate models, like the Parallel Climate Model (PCM), are built by coupling separate component models of the global ocean, the atmosphere, sea ice, and the land surface.

Major improvements in climate model simulation and projection come from increasing spatial resolution, that is, representing more details in space, and from improving process parameterizations that describe small-scale dynamic and thermodynamic processes. Better projections of future climate will result from ensembles composed of multiple simulation runs. However, the dominant limiting factor in present-day models (with their present levels of resolution) is the amount of computing time required to run complete multi-decadal or century projections. Therefore, improvements in models must be accompanied by improvements in computing speed.

Data Availability and Usefulness. The results of model runs are evaluated using observational data, which are fragmentary, prone to error, and sometimes difficult to acquire and format for the purpose of evaluating climate models. For example, because current climate simulations typically produce large errors in radiation fluxes, model evaluation needs include surface-based and satellite data on clouds and radiative transfer. Other needs for observational data include those for precipitation, near-surface temperature, sea-surface temperature, air-sea exchange, and dynamic global terrestrial vegetation.

Distance Computing and Data Management Capability. The infrastructure to support the ACPI must provide resources at a scale that enables model development and evaluation, ensemble simulation and projection, and analysis and assessment of the projections. These capabilities are well beyond anything available today or planned for tomorrow outside of the national security arena. Productive use of the advanced resources available to computational scientists requires not only a revolution in computational methods, but also a corresponding revolution in the tools for analyzing, tracking and managing computational experiments, and for managing complex input and output data sets. Scientists will need effective problem-solving environments (PSEs), augmented by intelligent middleware called the DOE Advanced Research Collaboration Fabric (DARC-Fabric), support for shared access to distributed data sources and applications, and the ability to easily understand and manipulate the data. The network element of the DARC-Fabric will need to support interactive collaboration, multiple classes of network traffic, online experimentation, visualization, and so forth. These advances are encompassed in the term "Collaboratory," an ideal state in which computing and communication technologies render separations in time and distance meaningless as groups jointly tackle large-scale problems.

Computing Improvements. To meet ACPI requirements, computational resources able to sustain 10 trillion floating point operations per second (TFLOPS) must be available in the middle of the next decade, assuming that algorithmic improvements can double the codes' computational efficiency. Expecting that the typical efficiency of parallel codes in the 2003 time frame will be 25%, the hardware platform required in mid-2003 will need to perform 40 TFLOPS (peak), and have 20 Tbytes of memory. In addition, the memory bandwidth, network latency, and I/O subsystem must be balanced for effective use. To ensure system usability and effectiveness for code development and results analysis, benchmarking must demonstrate the performance of the system with respect to the needs of climate simulation.
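The sizing arithmetic above can be sketched directly. The figures (10 TFLOPS sustained, 25% expected code efficiency) come from the text; the helper function is ours, for illustration only:

```python
def required_peak_tflops(sustained_tflops, efficiency):
    """Peak hardware rate needed to sustain a target rate at a given code efficiency."""
    return sustained_tflops / efficiency

# ACPI target: sustain 10 TFLOPS with parallel codes expected to run at 25% of peak
print(required_peak_tflops(10.0, 0.25))  # 40.0, the 40 TFLOPS (peak) platform quoted above
```

The same relation explains why balanced memory bandwidth and I/O matter: a lower achieved efficiency pushes the required peak rate up in direct proportion.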

Collaborating Capability. Regional Climate Collaboration Centers (RCCCs) will link the climate simulation community with the assessment and policymaking communities, recognizing that the best projections are useless unless they can be applied to project effects on human systems and to evaluate candidate policies. The RCCCs will have two major functions: service and research. The service function is to provide the tools, information, and expertise required to understand and assess the implications of climate change for the country. The research function is to develop and improve the tools and expertise needed to produce this information.

Institutional Interaction. The ACPI will generate new challenges for its member institutions; these challenges include managing the tensions between institutional missions and ACPI objectives, formulating and strengthening collaborations, and negotiating a balance between program goals and personal rewards. Explicit attention to these issues, supported by institutional science theory and feedback, will allow the ACPI to achieve its potential.

Implementation Partnerships. The ACPI implementation plan will need to be developed to complement the larger Strategic Simulation Initiative, of which it is a part, and in concert with the Accelerated Strategic Computing Initiative (ASCI). The ambitious acceleration of scientific progress, model development and application, and information dissemination requires an equally ambitious acceleration of computational and information technologies. Keeping the program in balance requires that the first priority be to implement the ACPI system of institutions and research collaborations with currently available technology, laying the foundation for future progress.

Conclusions. The ACPI can make a powerful contribution to effective climate change policies through advances in the simulation science on which climate prediction is based. Simultaneous acceleration of hardware and software developments in computing, interacting with increases in knowledge of climate processes, will create a synergistic program in simulation science. Explicit attention to institution-building will enable the collaborations needed to fulfill the ambitious goals of the ACPI.

The Regional Climate Collaboration Centers (RCCCs) are essential for the ACPI to fulfill its goals. The RCCCs provide the connections between global climate science and national and regional assessments and analyses. They are the means by which new knowledge is disseminated to the institutions and people who can use it to mitigate or adapt to climate change.

The final outcome will be the wide availability of data and scientifically credible climate projections in usable formats. This will benefit the scientific community, policymakers, firms, and households seeking to understand climate change and its impacts.

Ancillary benefits from the ACPI are numerous. The collaboration and computational infrastructure that will be developed for ACPI can be readily adapted and implemented for other scientific disciplines, such as health and biology, materials, and fusion energy. Advances in models for long-term climate prediction will enable rapid improvements for both weather forecasting and seasonal prediction. The challenge of building and applying climate models will be a stringent test for new computational technologies, thereby accelerating the pace of production supercomputing evolution.

Policy Needs for Climate Prediction

Over the past three centuries, humans have altered the atmosphere's chemical composition with little understanding of the consequences. The theory documenting the relationship between concentrations of "greenhouse gases" (most notably carbon dioxide) and the temperature of the lower atmosphere is over 100 years old. Only in the past two decades, however, has the concern over the exponential growth of greenhouse gas emissions generated an international scientific consensus that unrestrained future emissions may precipitate an unnatural and potentially disruptive change of the earth's climate. Computer simulation models are the primary tools for integrating knowledge about the workings of the climate system and developing projections of possible climate change. These functions are crucial to building the scientific basis for predicting climate change and characterizing its effects. Improvements to these functions are critically needed in the short and medium terms to prepare for the long term.

Energy and environmental policy development requires greater model certainty at global and regional levels, shorter times to complete model simulations, and more detailed information about climate change effects than is possible given the current states of both science and technology. Currently, three ongoing and related policy formulation programs and processes define the demands for climate change projections over the next several years. The Intergovernmental Panel on Climate Change generates an international scientific assessment every five years, with the next assessment scheduled for 2001. The Framework Convention on Climate Change lays out a process for international cooperation to stabilize greenhouse gas concentrations. Finally, the US government is required to produce periodic national assessments to evaluate domestic systems that will be affected by climate change. The first of these assessments is scheduled to be completed in 1998. More accurate model projections are needed by each of these programs to provide baseline scientific knowledge at a level of certainty sufficient to support decisions and to evaluate candidate policies.

A framework to reduce greenhouse gas emissions requires the definition of an attainable goal for stabilizing the atmosphere's composition of greenhouse gases. Establishing long-term goals provides direction to efforts to mitigate the effects of rising atmospheric greenhouse gas concentrations and gives quantitative expression to the objective of the 1992 Framework Convention on Climate Change (FCCC) as stated in Article 2:

...stabilization of greenhouse gas concentrations in the atmosphere at a level that would prevent dangerous anthropogenic interference with the climate system. Such a level should be achieved within a time-frame sufficient to allow ecosystems to adapt naturally to climate change, to ensure that food production is not threatened and to enable economic development to proceed in a sustainable manner.

To achieve this objective, policymakers must balance the potential for natural adaptation against the provision for sustainable economic development.

While there is little doubt that a long-term strategy is needed to stabilize atmospheric greenhouse gas concentrations to mitigate long-term climate change, the impact of emission reduction strategies is highly uncertain and the subject of intense public debate. Figure 2 shows the estimated total costs to achieve stabilization at various atmospheric equilibrium levels of carbon dioxide associated with three different strategies for technology development. In addition, particular economic sectors, such as energy, agriculture and transportation, have large stakes in even small changes in policies developed to stabilize atmospheric greenhouse gas concentrations.

 

Figure 2. MiniCAM costs of stabilizing the atmospheric carbon dioxide concentration at levels ranging from 450 ppmv to 750 ppmv for three alternative sets of emissions trajectories. Each point along each curve represents an independent set of model projections. (Figure courtesy of Jae Edmonds, PNNL)

 

The Need for an Accelerated Climate Prediction Initiative

Although integrated assessment tools can project costs to achieve stabilization, identifying an acceptable equilibrium level and evaluating the time-dependent climate change along the path to stabilization require a highly detailed understanding and quantitative estimation of climatic effects at regional scales. Under all potential scenarios for stabilizing atmospheric concentrations, the greenhouse gas forcing of the climate system will be stronger than it is presently, resulting in continued climate change. The magnitude of the forcing will be greater than any that has been recorded, understood, and studied with existing, archived weather records. Therefore, climate prediction over decades to centuries will require that ever more complete models of the climate system be continuously developed, evaluated and employed in order to make informed judgments and policies. Because decisions based on model projections can have such potentially profound impacts on the American economy, the climate system must be well understood and accurately simulated; yet existing projections are inadequate for evaluating greenhouse gas reduction strategies.

The United States requires an unprecedented acceleration and extension of the modeling state of the art to reduce existing uncertainties about long-term climate change and provide regional specification in climate change projections to support national and international energy and environmental policies that must be formulated and implemented early next decade.

Toward this end, the Accelerated Climate Prediction Initiative's goals are:

* To accelerate progress in climate simulation model development and application;

* To substantially reduce the uncertainties in decade-to-century model-based projections of climate change; and

* To increase the availability and usability of climate change projections to the broader climate change research and assessment communities.

Simulation Science: The Department of Energy's Vision

Progress depends on advances in scientific knowledge (which depend, in part, on knowledge generated by models) and on the physical ability of computers, databases, networks and associated computational infrastructure to process large amounts of data in short amounts of time. Accelerating computer technology development, therefore, is a necessary element in improving climate models.

The Department of Energy (DOE) recently identified simulation science as one of its principal twenty-first century objectives. The DOE, through its laboratory system and collaborators, intends to apply the power of computational and information technologies to create simulation laboratories for complex systems that cannot be recreated in a physical laboratory setting or user facility. Simulation, as a full partner to scientific experimentation, will be the newest enabling scientific tool of the twenty-first century. Toward this end, the DOE has launched the Strategic Simulation Initiative (SSI), under its science and technology (Energy Research) mission, to build on the investments and capabilities made in simulation for the Science-Based Stockpile Stewardship/Accelerated Strategic Computing Initiative (ASCI), under its Defense mission. Accordingly, the opportunity exists for cooperation among the DOE and its federal agency partners within the US Global Change Research Program (USGCRP), to develop functional and accurate decadal and longer term climate projections. These are the forecasts that will be necessary for evaluating the impact of greenhouse gas reduction strategies on future climate, before any policies can be implemented.

Climate Prediction: The Science and the Models

At the core of the Accelerated Climate Prediction Initiative (ACPI) is the requirement to produce and assess quantifiably predictive simulations of the global and regional climate. According to Lorenz (1975), climate prediction is the process of determining how atmospheric statistics gathered over a given averaging period (typically several years or longer) evolve over longer time frames. As a necessary prerequisite for future projection, a climate model must be able to accurately reproduce the observed climate in both time and space, and must do so by maintaining proper relationships among various parts of the climate system. For example, interannual to multi-decadal variations in the climate system are related to coupled atmosphere-ocean mechanisms. Some of these mechanisms that have been identified are the El Niño-Southern Oscillation (ENSO) phenomenon, the Quasi-biennial Oscillation, monsoonal dynamics, and decadal scale circulation oscillations such as that found in the North Atlantic. If climate models are to both properly simulate the current climate and provide scientifically credible projections of future climate, they must accurately simulate these types of large-scale climate oscillations. Achieving this level of prediction capability will involve:

* development of component models (e.g. ocean, atmosphere) that are evaluated through comparison with observations and experiments;

* incorporation of advanced sub-models, or parameterizations, of climate processes, such as clouds, chemical interactions, and surface and subsurface water transport;

* coupled climate simulations with sufficient resolution to accurately simulate weather systems, ocean eddies, and surface exchange processes that affect climate dynamics;

* climate change projections composed of multiple-realization simulation ensembles, from which probability distributions will be constructed to quantify the statistics of the projections; and

* human-based analysis of simulations initially stored as large data sets.

Developing Climate Simulation Models

Breakthroughs in climate simulation require that the modeling community construct models capable of producing projections of the climate system that closely simulate the real, observed system at both the global level and in regional detail. Models need to include and accurately simulate the principal climate components, as well as the dynamical behavior of the fully coupled earth climate system. Full climate models are built by coupling separate component models of the global ocean, the atmosphere, sea ice, and the land surface. Presently, the land-surface models are closely associated with atmospheric components and the sea-ice models are closely associated with the ocean components. Both the atmosphere and ocean are simulated with three-dimensional general circulation models (GCMs) that essentially describe the fluid motion of the air and water with systems of equations grounded in classical Newtonian fluid dynamical theory. In other words, the atmosphere and ocean are divided into many small fluid volumes, and the changing forces and properties of the fluid in each volume are mathematically calculated as a function of time. On the largest scales, both the atmosphere and oceans can be thought of as statically stable, thin, spherical shells of fluid in a reference frame that rotates with the earth. Motions are driven by thermodynamic processes which are manifested as pressure and density gradients. Within these components are parameterizations (or sub-models) describing important climate processes that influence the thermodynamic properties and composition of the fluid systems, such as vegetation effects, precipitation pathways, glaciers, atmospheric radiative transfer, clouds, etc.

Coupling the component models introduces problems related to the mismatches between the spatial and time scales. From a dynamical systems perspective, the atmosphere responds relatively rapidly to changes in internal forcings and boundary conditions, with a time scale of a few weeks. On the other hand, the ocean circulation encompasses time scales that range from a few weeks in the ocean mixed-layer that is in contact with the atmosphere, to hundreds of years in the deep, slow moving ocean abysses. The persistent, nonlinear interactions among these scales make climate projection a complex simulation exercise with aspects of both initial-value and boundary-value mathematical problems. Small but systematic errors in the models and parameterizations can dramatically influence projections of any scenario associated with anthropogenic changes because of the long periods of model integration.

Despite this complexity, experience in weather prediction and, more recently, the success in ENSO prediction provide optimism for progress in developing longer term climate prediction tools. Model development activities are typically iterative, requiring many numerical simulations to establish the proper choices for the numerous process parameterizations incorporated in a global model. The nonlinear interactions between the various physics components and model dynamics make it impossible to assign a priori values to many of the free parameters in the process parameterizations. Extensive simulation is required at each stage of model development for evaluation against both theory and observations. Numerical experiments can range from 6 months to 15 years of simulated time. The longer experiments are designed to evaluate the low-frequency behavior and interannual response of the model. Scientists employ ensemble modeling to ascertain whether the model produces statistically reliable simulation characteristics of observed climate features. In these runs, several experiments differing only slightly in their initial conditions simulate the same time period, and the resulting statistics provide measures of the simulated climate.
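The ensemble procedure described above can be illustrated with a toy chaotic system standing in for a real climate model. The logistic map, the perturbation size, and all other values here are our stand-ins, chosen only to show how slightly perturbed initial conditions yield a distribution of outcomes rather than a single trajectory:

```python
import random

def logistic_step(x, r=3.9):
    # Toy chaotic system standing in for one climate-model time step
    return r * x * (1.0 - x)

def run_member(x0, nsteps=1000):
    # Integrate a single ensemble member forward in time
    x = x0
    for _ in range(nsteps):
        x = logistic_step(x)
    return x

# Ensemble: identical model, initial conditions perturbed only slightly
random.seed(0)
members = [run_member(0.5 + random.uniform(-1e-6, 1e-6)) for _ in range(20)]
mean = sum(members) / len(members)
spread = (sum((m - mean) ** 2 for m in members) / len(members)) ** 0.5
print(mean, spread)  # individual trajectories diverge; the statistics characterize the climate
```

The point of the exercise is that no single member is a forecast; the ensemble mean and spread are the quantities compared against observed climate statistics.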

In recent years, several groups have engaged in developing both modeling frameworks and fully developed models that hold promise for simulating climate. Several modern codes are described here in further detail to illustrate the computational requirements necessary to reach ACPI goals. The most appropriate models to perform the actual simulations described above have yet to be developed, but they will in most cases build upon the experiences gained with the current generation of models. The following descriptions contain a fair amount of technical detail to establish the basis for assumptions about future computing requirements. The texts by Trenberth (1992), Washington and Parkinson (1986), and Haltiner and Williams (1980) provide excellent descriptions of the methods and modeling techniques mentioned in these sections.

Atmospheric Models

Atmospheric models contain the most complexity. Parameterizations in the atmospheric codes describe the transfer of long- and short-wave radiation in the atmosphere, surface energy exchanges, atmospheric boundary layer processes, small-scale vertical and horizontal transport processes, and the effects of vertically propagating gravity waves. The National Center for Atmospheric Research (NCAR) Community Climate Model, version 3 (CCM3) typifies current state-of-the-art atmospheric models used for climate studies. The dynamical core of the model employs the spectral transform method to integrate equations that describe atmospheric motion. Fluid divergence, vorticity, surface pressure and temperature are predicted using a truncated set of spherical harmonics in wavenumber space and an alias-free transform grid for calculations in physical space. Presently, this model uses a 42-wave triangular truncation (T42) in the horizontal with 18 vertical levels. The transform grid corresponding to the T42 truncation contains 128 longitude points and 64 latitude points, corresponding to a horizontal grid resolution of approximately 300 km in tropical regions. A semi-Lagrangian method is used to evaluate the transport of moisture and other atmospheric constituents, which are also defined on the transform grid. The vertical coordinate system is a hybrid sigma-pressure coordinate that is terrain following at the surface of the earth. Solar forcing includes the provision for a diurnal cycle, where radiative heating rates are evaluated once every hour. Boundary conditions over the ocean surface can either be specified, using prescribed sea-surface temperatures and sea-ice distributions, or can be evaluated using a fully interactive ocean (in the form of a simple slab mixed-layer representation) and thermodynamic sea-ice model. 
Over land surfaces, the CCM3 incorporates a comprehensive land-surface model (LSM) which treats the exchange of energy, momentum, water, and CO2 between the atmosphere and land. It accounts for ecological differences among vegetation types, and hydraulic and thermal differences among soil types, and allows for multiple surface types (including lakes and wetlands) within a grid cell. Snow cover, sea ice and other properties affecting surface albedo are evaluated dynamically.
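The relationship between a triangular truncation and its transform grid can be sketched as follows. The rule of at least 3T+1 longitudes for an alias-free grid (with respect to quadratic terms) is standard spectral-model practice; rounding up to a power of two for the FFTs is our assumption:

```python
def transform_grid(truncation):
    """Smallest alias-free transform grid for a triangular truncation T.

    An alias-free grid for quadratic terms needs at least 3T+1 longitudes;
    we round up to the next power of two for FFT convenience (an assumed,
    common choice). The number of latitudes is half the longitudes.
    """
    nlon_min = 3 * truncation + 1
    nlon = 1
    while nlon < nlon_min:
        nlon *= 2
    return nlon, nlon // 2

print(transform_grid(42))   # (128, 64): the T42 grid quoted for CCM3
print(transform_grid(170))  # (512, 256)
```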

Numerical simulations have been made at a range of horizontal resolutions to examine the role of resolution in the simulated climate. Recent analyses of high-resolution climate simulations suggest that, even at T170 (80 km) resolution, certain aspects of the numerical solutions continue to show improvement when compared to lower resolution results. Thus, higher resolution solutions, i.e., above T170, may provide additional improvements in the quality of simulations. Increases in model resolution can come at a large price in terms of the overall computational demand. For many components of the model framework, this increase is directly proportional to the number of additional spatial degrees of freedom. Increases in resolution also require that smaller time steps be taken, for stability and accuracy reasons, further increasing the computational cost to simulate a specific time-period. Estimating the increased computational burden is complicated by the fact that the operations count for some of the numerical algorithms in global models also scale nonlinearly with resolution. Examples include the Legendre transform portion of the spectral dynamics, which scales quadratically with the number of latitudes, and the longwave radiative transfer calculation, which scales quadratically with the number of vertical levels.
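A crude cost model along the lines of this paragraph might look like the following. The scaling exponents reflect the assumptions in the text (gridpoint work proportional to the number of columns, a time step shrinking with resolution, Legendre transforms one power of T more expensive); the split between gridpoint and transform work is our illustrative parameter:

```python
def relative_cost(t_new, t_old, legendre_fraction=0.0):
    """Rough cost ratio for a spectral model at truncation t_new vs t_old,
    integrated over a fixed simulated period.

    Assumptions (ours, for illustration): grid columns scale as T^2 and the
    stable time step shrinks as 1/T, giving T^3 overall for gridpoint work;
    the Legendre transforms cost an extra factor of T, giving T^4 overall
    for the fraction of per-step time they consume.
    """
    r = t_new / t_old
    gridpoint = (1.0 - legendre_fraction) * r ** 3
    legendre = legendre_fraction * r ** 4
    return gridpoint + legendre

# T170 -> T426 with, say, 20% of per-step time in the Legendre transforms
print(round(relative_cost(426, 170, legendre_fraction=0.2), 1))  # roughly a 20x cost increase
```

Even this rough model shows why the nonlinear terms matter: as resolution rises, the transform fraction grows and the effective scaling steepens beyond T^3.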

These model-to-model nuances make it difficult to quantify the computational requirements as a function of resolution with great certainty. For the purposes of this discussion, we will make hypothetical projections of the performance of CCM3 at a variety of resolutions for current algorithms and with algorithmic changes proposed for the dynamical core. The computational requirements for CCM3 have been estimated as a function of resolution. As a reference point for the present state of computation, a 16-processor C90 can simulate one day of a T170 18 level model configuration in about 12 wall-clock minutes. At this rate, a 15-year integration would require almost seven weeks of elapsed time. To accomplish the same feat in eight hours at the target resolution of 30km (T426 truncation) would require a machine capable of sustaining around 13 TFLOPS.
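The timings quoted above can be checked with back-of-envelope arithmetic; this is our reconstruction, using only the 12-minute figure stated in the text:

```python
# T170/18-level CCM3 on a 16-processor C90, as quoted in the text
minutes_per_sim_day = 12
sim_days = 15 * 365                # a 15-year integration

elapsed_weeks = sim_days * minutes_per_sim_day / 60 / 24 / 7
print(round(elapsed_weeks, 1))     # ~6.5 weeks, i.e. "almost seven weeks"

target_hours = 8
speedup_needed = (sim_days * minutes_per_sim_day / 60) / target_hours
print(round(speedup_needed))       # ~137x faster, before any increase in resolution
```

The jump from this factor of ~137 to the quoted 13 TFLOPS comes from additionally paying the resolution cost of moving from T170 to T426.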

Ocean Models

Nearly all ocean models presently employed for global climate studies are based on the Bryan-Cox-Semtner series of ocean models originally developed at the NOAA Geophysical Fluid Dynamics Laboratory (GFDL). They are all based upon the same set of prognostic and diagnostic equations, and share most of the same assumptions and approximations. Central to these models is the numerical implementation of a form of the primitive equations cast in a three-dimensional Eulerian framework on a fixed coordinate grid. They are frequently referred to as z-coordinate models, because their vertical coordinate is depth. Two ocean models of this class that have been used extensively for climate studies, both as stand-alone component models and coupled to atmospheric GCMs, are the Modular Ocean Model (MOM) maintained by the GFDL and the Parallel Ocean Program (POP), developed at Los Alamos National Laboratory. POP was built specifically for climate simulation on massively parallel and highly scaleable computer systems. Because the code was completely rewritten for newer architectures, the developers were able to include more modern numerical and computational methods to increase both the speed and versatility of the model, without the difficulties associated with making large scale changes to existing codes. For this reason, POP is useful as a reference to determine the future computational requirements for ACPI.

POP has been used to carry out a sequence of near-global ocean simulations with a resolution of 0.28° (30 km) at the equator on a Mercator grid extending from 78°N to 78°S, with 20 depth levels. The simulated variability of sea-surface height fell short of satellite measurements in regions of strong currents like the Gulf Stream, indicating that higher resolution was needed. Because this was only feasible with a basin-scale model, an Atlantic Ocean-only simulation was done with 1/10° (11 km) resolution at the equator and 40 vertical levels. The model domain extended from 20°S to 73°N on a Mercator grid, so the spatial resolution varied from 11 km at the equator to 3 km at 73°N. Observed surface winds for the period 1985-1994 were used to force the model, as was done in the global runs. The behavior of the Gulf Stream is quite realistic, and the simulated sea-surface height variability in the Gulf Stream agrees well with satellite altimeter measurements.

Results from the Atlantic Ocean simulation were extrapolated to a fully global run on a displaced-pole grid with 1/10° resolution. To complete 50 simulated days in 24 hours would require sustained performance of about 30 GFLOPS, which should be achievable in 1999 on the phase-3 Origin 2000 system at Los Alamos. The prospect of improved code efficiency based on research now in progress, better representation of the deep ocean with fewer vertical levels through use of a hybrid coordinate, and more powerful machines should make overnight turnaround of multi-year runs feasible at this resolution. Meeting the objective set above for atmospheric models, 8-hour turnaround for 15-year runs, would require sustained performance of 10 TFLOPS. This is consistent with the 13 TFLOPS estimate obtained above for the CCM3 atmospheric model.
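The same extrapolation applies to the ocean target: scaling the 30 GFLOPS needed for 50 simulated days per 24 hours up to 15 simulated years per 8 hours. A sketch of the arithmetic:

```python
# Scale the global 1/10-degree POP estimate (50 simulated days per
# 24 wall-clock hours at ~30 GFLOPS sustained) to the 15-year/8-hour
# turnaround target used above for the atmospheric models.

baseline_gflops = 30.0
baseline_rate = 50.0 / 24.0          # simulated days per wall-clock hour
target_rate = (15 * 365) / 8.0       # simulated days per wall-clock hour

speedup = target_rate / baseline_rate                 # ~330x
required_tflops = baseline_gflops * speedup / 1000.0
print(f"Required sustained rate: {required_tflops:.1f} TFLOPS")
```

The result is close to the 10 TFLOPS figure quoted above.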

Coupled Models

Recently, a new coupled climate model, the Parallel Climate Model (PCM), was developed from the CCM3 and POP models specifically to run on massively parallel and highly scalable systems. The T42 version of CCM3 was coupled to a 2/3-degree average resolution version of POP. The development history of this model provides a preview of the challenges and opportunities facing the ACPI. The component models were developed through collaborations involving four major groups at three different institutions. Additional computational and numerical expertise came from two other laboratories. The full development cycle took about six years, although only a few scientists were working full time on the project at any given point during this period. In addition, the PCM project was not the primary focus of any of the participants until the last year. Coupling the components poses both physical and computational difficulties. These were overcome through a specialized code, called the flux coupler, that provides the interpolation, averaging, and rescaling necessary to properly specify the transfer of energy and materials across the component model boundaries. Preliminary tests show that this model is capable of attaining 20 GFLOPS performance on a 512-processor Cray T3E computer, roughly 5% of the machine's peak speed. Within the next few months, preliminary experiments on the Origin 2000 and IBM SP2 series of computers will begin.

Running Simulations for Climate Change Projections

After component models are developed, tested, coupled together, then tested again in coupled mode, scientists will design simulation ensembles to produce the probability distribution functions (pdfs) of such quantities as rainfall, temperature, and storm frequency. The projections must be able to reproduce the statistics observed in the recent past and present and then simulate the projected changes with time that would be expected from various combinations of anthropogenic forcings in the future. The models then need to be forced with a variety of realistic or probabilistic anthropogenic forcings in the context of different stabilization scenarios. These scenarios must include "before and after" ensembles to estimate the impact of proposed reductions. Each ensemble might include 10 or more realizations in order to produce stable statistics.
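As a simple illustration of how an ensemble of realizations yields a distribution rather than a single number, the sketch below pools a synthetic 10-member ensemble of annual-mean temperature anomalies; all values are invented for illustration and the variable names are hypothetical:

```python
import random
import statistics

# Hypothetical ensemble: 10 realizations of a 15-year projection, each
# producing one annual-mean temperature anomaly per year (degC). The
# values are synthetic stand-ins for output of the coupled model runs
# described above.
random.seed(0)
ensemble = [[random.gauss(mu=1.5, sigma=0.4) for _ in range(15)]
            for _ in range(10)]

# Pool all realizations to characterize the distribution of the anomaly.
pooled = [x for member in ensemble for x in member]
mean = statistics.mean(pooled)
spread = statistics.stdev(pooled)

# A crude empirical pdf: relative frequency in 0.25 degC bins.
bins = {}
for x in pooled:
    key = round(x / 0.25) * 0.25
    bins[key] = bins.get(key, 0) + 1 / len(pooled)

print(f"ensemble mean anomaly: {mean:.2f} degC, spread: {spread:.2f} degC")
```

With roughly 10 realizations, sampling noise in the ensemble mean is reduced by about a factor of three relative to a single run, which is the motivation for the ensemble sizes mentioned above.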

The information generated will only be useful if it is disseminated to the broad and diverse user base that is concerned with climate change. The projections will be used to estimate a wide range of important climate parameters, including many that cannot be estimated today, such as changes in tropical storm frequency, the number of days of heavy rainfall, and other climatic changes that may accompany a continued build-up of greenhouse gas concentrations. The impact of any mitigation proposals, and the accurate projection of future climate changes at different stabilization levels, will be based on model projections. Models should be able to estimate not only the direct effects of anthropogenic substances on large-scale climate but also a number of related processes, for example, ocean plankton dynamics and the transport and redistribution of phosphates and other nutrients in the ocean. This preliminary analysis of component and coupled models operating at the currently estimated target resolutions and parameterization complexity consistently points to the need for computer systems capable of sustaining ten to forty trillion mathematical calculations per second.

Observation-Based Model Evaluation

Modelers potentially have access to numerous sources of observational data that can be used for climate model evaluation. These data exist on many time scales and represent both point and area observations. However, this wide variety also means that numerous problems must be overcome before the data can be used. Accordingly, a concerted effort is required to assemble and analyze the data and put them in a format that is useful for model evaluation. The development and analysis of these diverse data sets would not only allow robust model evaluations but also greatly improve our understanding of historical climate change.

Different parts of the model development cycle require different observational data for evaluation. Parameterization development often relies on information from process-level experiments. Scientists evaluate component and coupled models through broader-scale comparisons with observed statistics and dynamics. Extensive data sets at both the process and system levels are available, mostly as results from USGCRP observational programs. The accuracy of the observational data needs to be addressed prior to model evaluation. For example, rain gauges do not catch 100% of the rainfall that occurs. Data homogeneity problems exist in most time series of in-situ measurements (Easterling et al. 1996). In a homogeneous time series, variations are purely the result of variations in the climate system; an inhomogeneous time series has both climate-driven variations and variations from, for example, instrument changes or changes at the observing site.

Cloud and radiation parameterizations serve as very good examples of the problems related to linking model development with existing data sources. Current climate model simulations typically produce large errors in radiation fluxes, particularly over high latitudes (Randall et al. 1998). Therefore, a wide variety of radiation observations need to be assembled. Cloud observations are generally available from either surface-based measurement systems or satellites. Again, the time-scale of the variation or feature being examined determines the type of data used in model evaluation. Surface-based observations have a long period of record; however, these are point measurements and have some homogeneity problems. High-quality data from the DOE Atmospheric Radiation Measurement (ARM) Program and the NASA Earth Observing System (EOS) will provide critical new information to evaluate cloud and radiative transfer parameterizations.

On the larger scale, most of the effects of potential climate change will result from changes in precipitation and near-surface temperature. These variables are among the most important in climate model evaluation. Land surface-based climate observations with the greatest spatial coverage and with century-or-longer periods of record are generally those from daily weather observing sites. Long-term climate data suitable for examining trends and multi-decadal variations are currently available in digital form usually only as monthly values. The only parameters that have been observed consistently and reliably over widespread land areas for over a century are near-surface atmospheric temperature, precipitation, and barometric pressure.

Since climate models operate, or soon will operate, on very short time-steps (e.g., 5 minutes or less) and with increasing spatial resolution, data also must be available that allow diagnostics of some processes on time-scales ranging from seasonal down to something approaching the individual model time-step, and on increasingly finer spatial scales. Data from some automated networks are available for short-period totals such as 5 to 15 minutes, hourly, and daily. Most of these networks have data available only for short periods of record; however, a few data sets span a relatively long period (about 50 years). Remotely sensed estimates of precipitation are potentially useful for verifying the spatial extent of rain/no-rain areas. However, quantitative measures of precipitation totals from satellites are still inadequate and must be verified with ground-based measurements. Radar data provide another source of high-spatial-resolution estimates of precipitation.

Marine observations of both the state of the ocean and the atmosphere above it are clearly some of the most crucial data for direct assessment of quasi-periodic oscillations such as ENSO, as well as for identification of century-scale trends in the ocean-atmosphere system. These data include observations of sea surface temperatures (SSTs), surface air pressure, air temperature, winds, sub-surface temperature, and salinity. Satellite altimeters provide high-resolution observations of sea-surface height, which is extremely useful for model evaluation, particularly with the higher-resolution eddy-resolving ocean models. Monitoring and comparison of model-simulated sea ice coverage with surface observations is critical because of the strong ice-albedo feedback at high latitudes.

Three-dimensional data sets of atmospheric temperature, humidity, and pressure are critical for examining climate trends and quasi-periodic oscillations. These data are particularly useful for examining spatial variations in tropospheric and stratospheric temperatures. Gridded, four-dimensional (three spatial dimensions and time) data sets are becoming available to climate researchers through data assimilation efforts. Data assimilation is the process of ingesting observations into a model. The result is a comprehensive and dynamically consistent data set that represents the best estimate of the state of the atmosphere at a given time. The assimilation process fills data voids with model projections and provides a suite of data-constrained estimates of unobserved quantities such as vertical motion, radiative fluxes, and precipitation. The National Centers for Environmental Prediction are conducting a new analysis of historical weather data using models from their operational forecast suite. The Data Assimilation Office at NASA's Goddard Space Flight Center is currently producing a multiyear gridded global atmospheric data set for use in climate research, including tropospheric chemistry applications. The analysis incorporates rawinsonde reports, satellite retrievals of geopotential thickness, cloud-motion winds, and aircraft, ship, and rocketsonde reports. The available output data include all prognostic variables and a large number of diagnostic quantities such as heating rates, precipitation, surface fluxes, cloud fraction, and the height of the planetary boundary layer. All quantities are saved every 6 hours at the full resolution of the assimilating general circulation model. Selected surface quantities are saved every 3 hours.

Other observation-based data sets may be useful for evaluating the more complete models that scientists envision. Dynamic global land cover is only now beginning to be used in climate modeling. Land surface energy exchange is a crucial part of tropospheric thermodynamics and is sensitive to land cover variations. Furthermore, terrestrial ecosystems are major sources and sinks of greenhouse gases. Therefore, models need to be able to reproduce the current surface ecosystems as well as the ecosystem changes associated with climate change. Data are needed, at least initially, on the areal extent of different land surface types and how they have changed through time. This last point is particularly problematic, since much of what is currently known about global distributions of land cover is derived from satellites. Large-scale measurement of vegetation structure, such as Leaf-Area Index (LAI), has recently improved greatly. Soil moisture is widely recognized to exert a major influence on surface energy fluxes, but soil moisture data are available from only limited parts of the world. Snow observations are available from both surface-based and satellite-based observing systems. Most surface observing stations in areas that receive snowfall make observations of snow depth or snow cover, and dedicated snow-observing networks exist in some regions. Observations of the growth and retreat of glaciers provide some long-term point measurements. Since glacier growth and retreat is an integrated function of changes in temperature, precipitation, and radiation parameters (e.g., wind-blown dirt on the radiative surface), glacier data could be useful for future generations of high-resolution climate models.

The Need for New Computing and Information Technology

The immediate technological obstacle to accelerating climate model development is the inadequacy of the computer and networking resources available to improve the capabilities and productivity of climate model developers and users of climate change information. A multi-institutional team working on improving climate models needs an electronic collaboration environment that facilitates communication and enhances scientific interaction. Researchers need access to the data resources required to evaluate the impact of model improvements, as well as overnight turnaround on computers during the model evaluation stage. Many developers currently experience weeklong delays in test simulations and have only limited diagnostic tools available to evaluate the models.

As an example of what is needed, scientists would ideally like to be able to complete the shorter experiments in a period of minutes and the longer experiments overnight (i.e., in under 8 hours of elapsed time). Using the CCM3 at T42 resolution as an example, a dedicated 16-processor C90 is capable of completing a 6-month simulation in about 45 minutes, and a 15-year simulation in about 24 hours. Similarly, a 64 processing element (PE) Origin 2000 with processor speeds of 250 MHz can complete a 15-year simulation in approximately 27 hours. These performance characteristics indicate that the best existing computer technology would need to be enhanced by at least a factor of three to accommodate the existing computational throughput requirements of the atmospheric modeling community for current low-resolution model configurations. Newer models envisioned to be running in the next few years must have both increased resolution and improved process parameterizations to accurately simulate the storm-scale phenomena that are important to climate. This translates into increased computational demands of several orders of magnitude.

The dominant time-limiting factor is the ability to run a complete multi-decade or multi-century projection. Typically, scientists are willing to allow about one month to complete the most sophisticated simulations. Simulations that take longer will not be executed, and models capable of running faster than that rate will be improved until the computational load again approaches the one-month turnaround time. Thus, the ACPI technology strategy must provide both (1) an expansive set of distributed resources to enable model, method, and application development, code verification and evaluation, and analysis and assessment of global and regional climate projections, and (2) a powerful simulation platform to efficiently accomplish the primary simulation tasks.

Collaborating at a Distance

The infrastructure to support the ACPI must provide resources at a scale that enables model development and evaluation, ensemble projection simulation, and analysis and assessment of the projections. These capabilities will be well beyond anything available today or planned for tomorrow, outside of the national security arena. To achieve this state, the ACPI must accelerate its technology acquisition while simultaneously developing the science, models, methods, and software required to make these resources useful in predicting the climate. The spatially distributed user and development groups envisioned in the ACPI necessitate tools that enable these groups to interact easily. The term "collaboratory" (coined by William Wulf in 1989) describes an ideal state in which computing and communication technologies render separations in time and distance meaningless as groups of researchers jointly tackle large-scale problems. Contemporary collaboratory technologies include audio/video conferencing, chat spaces, and shared resources such as whiteboards, file systems, electronic notebooks, and application spaces. However, the unique scale and nature of the ACPI model development and science efforts will likely require significant advances in these technologies, as well as the means to integrate and manage these capabilities. This paradigm shift will require a vigorous computer science research effort, exploration of alternative technologies, and partnerships with industry.

These new capabilities will require substantial engineering development, backed by a strong program of applied computer science research. The research component is essential because the goals of ACPI reach beyond the state of the art in numerous areas. Research advances are required to achieve the full potential of the program. The software infrastructure includes algorithms and numerical methods; integrated tools technology; numerical libraries, tools, visualization and data management; and collaboration technology. Conceptually, the hardware and software technology to enable collaboration may be viewed from the scientists' perspective using the following hierarchy (Figure 3).

Figure 3. Conceptual view of the layers comprising the ACPI collaboration environment.

Understanding and developing appropriate collaboration models and technologies for code development and climate prediction demonstrate the need for problem-solving environments (PSEs). Productive use of the advanced resources available to computational scientists requires not only a revolution in computational methods, but also a corresponding revolution in the tools for analyzing, tracking, and managing computational experiments and for managing complex input and output data sets. A PSE provides all of the computational facilities necessary to solve a target class of problems. These features include advanced solution methods, automatic or semiautomatic selection of solution methods, and ways to easily incorporate novel solution methods, as well as all of the data handling needs, with easy-to-use graphical user interfaces (GUIs). Specific tasks that will be addressed under this initiative include development of interactive computational steering methods, visual analysis tools, data mining techniques, and tools for collaborative access and sharing of data between different user groups and different sites. A particularly critical need that will be met by the PSE is the construction of a framework for developing and testing model parameterizations. With this framework, scientists outside the core model development group can build and evaluate process parameterizations in a setting that resembles the actual model under development. One early prototype of this concept is the single-column version of the CCM3.

The success of the ACPI will not only be based on the use of the fastest computers, but will also depend on another, more dynamic element. Its geographically distributed collaborative nature requires an active set of persistent resources and servers that effectively and dynamically manage the underlying infrastructure supporting the scientific collaboration. This intelligent middleware, the DOE Advanced Research Collaboration Fabric (DARC-Fabric) (Figure 4), will augment the capabilities normally provided in a more static PSE to dynamically create virtual collaborations. The ACPI will create a new level of expectation among scientists with respect to their supporting infrastructure. The DARC-Fabric is the glue that binds the application to the underlying infrastructure; however, it only helps the researcher manage and use that infrastructure. The ACPI must also provide the most advanced computing, visualization, collaboration, and database capabilities if scientists are to succeed in their scientific pursuits.

Figure 4. The collaboration network fabric is essential to link the application and the underlying infrastructure.

 

Database Management

As simulation capability approaches the TFLOPS range, the volume of simulation results to be analyzed will exceed the capacity of standard visualization practices. The ACPI environment must provide scientists, analysts, and code developers the capability to interact with the immense data sets that simulations will produce. In 2003, the primary ACPI platform will have 20 Tbytes of memory. If typical runs are on the order of 100 years and a Tbyte is stored per simulated week, then the data set to be viewed, analyzed, and assessed is 5.2 Pbytes. To complicate the situation, many ACPI scientists will reside outside the DOE Laboratories, thus requiring a distributed solution. Even in a local environment, data may be stored on a variety of devices, ranging from high-volume tape drives to disk farms, and within various forms of data management systems. In moving toward increased collaboration between groups of researchers, a major challenge is to provide the technical software infrastructure to support shared access to distributed data sources and applications. At the foundation, this infrastructure must provide the capability to specify, standardize, and share metadata, that is, information about scientific data sets. Researchers need to understand the content of remote databases, formulate queries, transfer data, and eventually tie together disparate data management systems into more powerful systems. It then becomes feasible to issue data requests or queries that access many different data storage locations simultaneously, while hiding the underlying complexity of the systems accessed. This model extends to sharing of applications, libraries, and software modules.
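The 5.2 Pbyte figure is simply the stated assumptions multiplied out:

```python
# Archive volume for one long run: 1 Tbyte stored per simulated week,
# over a 100-year simulation.
weeks = 100 * 52                 # simulated weeks in a 100-year run
tbytes = weeks * 1.0             # 1 Tbyte per simulated week
pbytes = tbytes / 1000.0
print(f"{pbytes:.1f} Pbytes per 100-year run")   # 5.2 Pbytes
```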

At the center of the DARC-Fabric is a core collection of distributed computing services (operating system, client/server support, authentication, etc.) which supports all other services and applications. These services are within the scope of current technology. The middle layer consists of a suite of services that must be developed to support general classes of applications:

* support services for collaborative data analysis, including visualization

* support for distributed, collaborative data management

* support for application/module sharing (e.g., diagnostic libraries).

To move toward true collaborative sharing of data, a number of software services need to be developed and integrated. The key step is to establish a common, standard form for information, so that it can be shared and properly interpreted outside the local environment. The next step is to register this information, so that remote collaborative users can access it, build standard, shared interfaces, and construct mappings to software and data. This step relies on the development of globally accessible metadata registries, and utilities for building and maintaining such facilities. Programs and query facilities can then be developed to simplify the task of accessing, transferring, and filtering remote information.

Network Layer

The network is the most important part of the fabric in that it actually provides the connectivity necessary between the users and the resources they require for their research. Future networks must face the daunting task of supporting multiple classes of network traffic, with multiple policies, such as production and research or secure and unclassified traffic, in an efficient and effective manner (Aiken et al., 1997). Network researchers will develop and deploy new levels of network services and capabilities. These advances may be achieved through selective enhancements to the Energy Sciences Network (ESNet) capabilities, but will also require increased capabilities and bandwidth to traditionally non-ESnet sites as well as to virtual networks that span non-DOE infrastructure, aggregation points, and other physical networks. Development and deployment of enhanced network management tools to augment these increased network capabilities are paramount to the success of the ACPI. The advanced network, coupled with intelligent middleware, will provide the end user with the capability to effectively utilize the underlying infrastructure. This includes the ability to dynamically specify, build, and manage the virtual networks that will connect colleagues with each other as well as with a wide array of network based resources necessary to form the collaboration. Interactive collaboration, teleconferencing and tele-presence will require the support of multiple virtual networks and channels, some with low latency for control data, others that control jitter for audio and video applications, and others tuned to maximize bandwidth and throughput. Online experimentation, database management, visualization, and distributed computing require some combination of these and other types of virtual network channels.

To properly support the ACPI collaboration across national laboratories, federal research centers, and universities, advanced network technologies will need to be supported not only on DOE Laboratory and wide-area network (e.g., ESnet) infrastructure, but also on university campus infrastructure, peering points, and other wide-area network (WAN) infrastructures. Any such richly interconnected infrastructure will require the development and deployment of advanced network status, management, and debugging tools for both ACPI researchers and network administrators. Secure distributed resource management is essential for combining and integrating all network-based collaboratory components into a finely woven fabric that envelops the researcher and puts all of the computational and collaborative resources at the individual's fingertips.

The chosen architecture and actual distribution of the ACPI collaboratory resources, as well as the number, types, and frequency of concurrent virtual collaborations, will ultimately determine the aggregate and burst/peak bandwidth requirements. The ACPI will generate two major types of network traffic. The majority of the data generated by the large-scale ACPI simulations will be placed on local, large-scale, high-speed storage devices and then copied to the ACPI regional centers as infrequent bulk data transfers (assumed to happen during off-peak hours, perhaps once every 24 hours). A Tbyte of data can be sent from the source to one regional center in 2.2 hours using a Gbit-per-second network (assuming 100% line utilization sustained over the 2.2-hour period). The bulk of the remaining traffic will be generated by remote visualization and manipulation of subsets of the generated data. Filling one workstation screen with animated color graphics of a simulation will require between 100 and 500 Mbits per second, depending on the resolution of the workstation and the quality of the animation. Scheduling and sharing of the network is paramount for the success of the ACPI collaboration. This implies that Gbit-per-second networks will be necessary in the 2000-2002 time frame to support the ACPI collaboratory.
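The 2.2-hour figure is easy to verify, and the same arithmetic applies at other line rates; real transfers would of course see protocol overhead and utilization below 100%:

```python
# Time to move a given volume over a fully utilized link.
def transfer_hours(tbytes: float, gbit_per_s: float) -> float:
    bits = tbytes * 8e12                   # 1 Tbyte = 8e12 bits
    return bits / (gbit_per_s * 1e9) / 3600.0

# 1 Tbyte over a 1 Gbit/s link, as in the text.
print(f"1 Gbit/s: {transfer_hours(1.0, 1.0):.1f} h")          # ~2.2 h

# The same Tbyte over an OC-12 (0.622 Gbit/s) link, for comparison.
print(f"OC-12   : {transfer_hours(1.0, 0.622):.1f} h")
```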

 

Speed, Memory and Parallel Computing: Running the Models

Increases in model resolution can exact a large price in terms of overall computational demand. For many components of the model framework, this increase is directly proportional to the number of additional spatial degrees of freedom. Increases in resolution also require that smaller time steps be taken for stability and accuracy, further increasing the computational cost of simulating a given time period. Estimating the increased computational burden is further complicated because the operation counts of some of the numerical algorithms in global models scale nonlinearly with resolution. Examples include the Legendre transform portion of the spectral dynamics, which scales quadratically with the number of latitudes, and the longwave radiative transfer calculation, which scales quadratically with the number of vertical levels. These model-to-model nuances make it difficult to quantify the computational requirements as a function of resolution with great certainty. Improvements in numerical algorithms and techniques for increasing efficiency on parallel computers will help offset some of the additional computational demand. However, since it is difficult to accurately project the magnitude of these demands on computational performance, resolution has been used as the primary metric to establish resource needs because it establishes a lower bound on the growth of computational requirements.

Climate models, and especially coupled models, place high demands on the performance of all aspects of the computational environment. To meet ACPI requirements, computational resources able to sustain 10 trillion floating point operations per second (TFLOPS) must be available in the middle of the next decade, assuming that algorithmic improvements can double the codes' computational efficiency. Expecting that the typical efficiency of parallel codes in the 2003 time frame will be 25%, the hardware platform required in mid-2003 will need a peak performance of 40 TFLOPS and 20 Tbytes of memory. An aggressive but attainable model development strategy requires that the primary climate prediction platform's speed increase from one TFLOPS in 1999 to forty TFLOPS in mid-2003. The increase should proceed at an even pace: starting off too slowly jeopardizes progress at the end of the period, whereas a too-rapid increase risks inefficiency. The table below provides specific targets for peak speed, memory, and primary system disk storage, as well as estimated ranges for the number of nodes, footprint size (occupied physical floorspace), power, and cost for the systems that should be in place if the ACPI is to follow the accelerated technology path.
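The 40 TFLOPS target and the even pace can be made concrete with a short calculation: dividing the sustained requirement by the assumed parallel efficiency gives the peak figure, and an even (geometric) ramp over the three 18-month intervals between the milestone dates gives the intermediate targets:

```python
# Peak requirement implied by the sustained target and assumed efficiency.
sustained_tflops = 10.0
efficiency = 0.25
peak_tflops = sustained_tflops / efficiency        # 40 TFLOPS

# Even (geometric) ramp from 1 TFLOPS on 1/1/1999 to 40 TFLOPS on
# 7/1/2003, in three 18-month steps (the milestone dates in the table).
start, steps = 1.0, 3
growth_per_step = (peak_tflops / start) ** (1.0 / steps)   # ~3.42x per step
milestones = [start * growth_per_step ** i for i in range(steps + 1)]
print([round(m, 1) for m in milestones])   # [1.0, 3.4, 11.7, 40.0]
```

The rounded milestones reproduce the peak-speed row of the table, confirming that its schedule is a uniform geometric progression.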

In addition to floating point performance, the memory bandwidth, network latency and I/O subsystem must be balanced for effective use of the computer. Any imbalance will become a bottleneck and inhibit the overall performance of the system. Stability of these components is also critical for long runs and tight schedules. Prior experience with climate model scalability on massively parallel computers indicates that fast memory access, low latency, and high bandwidth interconnection networks are critical to effective use of more than 100 processors. As the number of processors increases, faster interconnection networks are required for good parallel efficiencies to be realized.

 

 

                            1/1/1999    7/1/2000    1/1/2002    7/1/2003

PEAK SPEED (TeraFLOPS)         1.0         3.4        11.7          40
MEMORY (TeraBYTES)             0.5         1.7         5.9        20.0
DISK (TeraBYTES)              15.0        51.3       176.0       600.0
NODES                    1000-2000   2000-3000   3500-5000   5000-9000
FOOTPRINT (1000 sq. ft.)   1.0-2.0     2.0-3.5     3.5-6.0    6.0-10.0
POWER (MW)                 1.0-2.0     2.0-3.5     3.0-5.0     5.0-9.0
SYSTEM COST ($ M)            20-30       35-50       60-80     105-135

 

The dominant high-performance computer platform over the next 5-10 years will likely be a cluster of shared-memory multi-processors (SMPs). Each SMP will be made up of dozens to hundreds of individual processors within a single, integrated operating unit. Each individual system will, for the most part, be constructed from commodity components; however, the internal interconnection network will likely be based on a proprietary design. The connections among the clustered SMP systems will be based on one of several communications standards (e.g., HIPPI-800). The system storage should consist of a large local disk farm, with a 30:1 disk to memory size ratio. The local disk farm must be part of an integrated parallel I/O system design, including a high performance disk subsystem, a coherent I/O software library and a single, distributed file system. In addition, the system must provide high-speed external network connections for moving data rapidly to analysis and visualization servers and tertiary storage. A standards-compliant software suite, including operating systems, parallel compilers, and parallel communications libraries, is essential for climate simulation applications to utilize these hardware resources. The system must support high-level, standardized debugging and performance analysis tools to enable rapid and efficient application development. Software for resource management is necessary to allow model codes to scale to cluster dimensions without significant loss of efficiency.
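As a quick consistency check (ours, not the report's), the disk targets in the table above do track the stated 30:1 disk-to-memory size ratio:

```python
# Memory and disk targets from the hardware table (TeraBYTES).
memory_tb = [0.5, 1.7, 5.9, 20.0]
disk_tb   = [15.0, 51.3, 176.0, 600.0]

# Disk-to-memory ratio at each milestone, rounded to one decimal:
ratios = [round(d / m, 1) for d, m in zip(disk_tb, memory_tb)]
print(ratios)  # [30.0, 30.2, 29.8, 30.0]
```

Each milestone sits within a fraction of a percent of the 30:1 design ratio.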

Since the computer will be made from clusters of multi-processors, several groups can simultaneously perform model development and evaluation work. A flexible operating environment will allow assignment of individual SMP systems into unlinked clusters for development and evaluation. Run in this unlinked mode, these clusters need to be very stable and powerful enough to provide overnight turnaround on the component model runs. Linking the clusters provides the performance necessary for coupled model ensemble projections. Proper infrastructure to support scheduling and use of shared I/O facilities must be developed to support this capability.

 

Benchmarking: Ensuring Computational Performance and Usability

The ACPI program will involve both research and development activities and implementation of comprehensive climate modeling simulations. Therefore, its success will depend not only on raw computational performance but also on an effective environment for code development and analysis of model results. These issues suggest that the ACPI benchmarking approach should evaluate both raw computational performance and system usability. The TFLOPS performance level achieved on an application is not a reliable measure of scientific throughput. For example, an algorithm may achieve a very high floating point rate on a certain computer architecture, but may be inefficient at actually solving the problem. Therefore, accurate and useful benchmarks must be based on a variety of numerical methods that have proven themselves efficient for various climate modeling problems. In addition, formal tracking of benchmark results throughout the course of the program will provide an objective measure of progress.

To meet these objectives, ACPI will support benchmarking and performance evaluation in four areas: (1) identify or create scalable, computationally intensive benchmarks that reflect the production computing needs of the climate modeling community, and evaluate the scalability and capability of high performance systems; (2) establish system performance metrics appropriate for climate modeling; (3) establish usability metrics for assessing the entire research/development/analysis environment; and (4) establish a repository of benchmark results for measurement of progress in the initiative and for international comparison.

Identify or create scalable, computationally intensive benchmarks that reflect the needs of the climate modeling community. The purpose of this activity is to gain insight into the computational performance of machines that will be proposed by vendors in response to ACPI.

Benchmarks for scientific computing applications in general have traditionally used either a functional approach or a performance-oriented approach. In the functional approach, a benchmark represents a workload well if it performs the function of the workload, e.g., a climate modeling workload is represented by a subset of climate modeling codes. In the performance-oriented approach, a benchmark represents a workload well if it exhibits the same performance characteristics as the workload, e.g., a climate modeling workload would be represented by a set of kernels, code fragments, and a communication and I/O test. The assumption is that the system performance can be predicted from an aggregate model of the performance kernels.
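The aggregate performance model mentioned above can be sketched in a few lines. All kernel names, rates, and workload fractions below are invented for illustration; the point is only that predicted throughput is the time-weighted (harmonic) combination of per-kernel rates:

```python
# Performance-oriented benchmarking sketch: predict a workload's
# sustained rate from measured kernel rates, weighted by the fraction
# of total floating point work each kernel represents. All numbers
# here are hypothetical.

kernel_rate = {"dynamics": 2.0, "physics": 1.0, "transforms": 4.0}   # GFLOPS
work_fraction = {"dynamics": 0.5, "physics": 0.3, "transforms": 0.2}

def predicted_rate(rates, fractions):
    """Harmonic (time-weighted) mean: total work divided by the sum of
    the time each kernel needs for its share of the work."""
    time_per_flop = sum(f / rates[k] for k, f in fractions.items())
    return 1.0 / time_per_flop

print(round(predicted_rate(kernel_rate, work_fraction), 2))  # 1.67
```

Note that the slowest kernel dominates: the predicted 1.67 GFLOPS is far below a naive average of the three rates, which is exactly why kernel-based aggregates must be time-weighted.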

In the last five years, there has been major progress in benchmarking massively and highly parallel machines for scientific applications. For example, the NAS Parallel Benchmarks (NPB) were a major step toward assessing the computational performance of early MPPs and served well to compare MPPs with parallel vector machines. With increasing industry emphasis on parallelism, a number of other organizations, as part of their procurement process, are designing benchmark suites to evaluate the computational performance of highly parallel systems. ACPI will likely use mainly a functional approach and identify or develop a set of benchmark models: an atmospheric GCM benchmark, an ocean GCM benchmark, and so on. The benchmark suite also must include a set of simulated coupled models. These benchmarks should be developed in the spirit of the NPB: the models should exhibit all the important floating point, memory access, communication, and I/O characteristics of a realistic model, but the code should be scalable, portable, compact, and relatively easy to implement. This is a substantial effort, because many extant benchmark codes for climate modeling are not readily portable, or are not currently well suited or even capable of running on more than a few tens of processors.

Establish system performance metrics appropriate for climate modeling. For climate modeling, the most important metric of scientific throughput is wall clock seconds per model day or model year. One sensitive parameter affecting this metric is resolution: higher resolution increases the amount of parallelism available for exploitation on highly parallel computer systems. ACPI will likely select a few full-scale climate applications and intercompare parallel systems using the wall-clock-seconds-per-model-year metric at different resolutions.
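The metric itself is straightforward to compute from a timed run; the timing numbers in this sketch are hypothetical:

```python
# Wall clock seconds per model year, derived from a timed run.

def seconds_per_model_year(wall_seconds, model_days_simulated):
    """Scale a measured run up to one 365-day model year."""
    return wall_seconds * 365.0 / model_days_simulated

# e.g. a run that simulates 30 model days in 2 hours of wall clock time:
spy = seconds_per_model_year(2 * 3600, 30)
print(spy)                  # 87600.0 seconds per model year
print(round(spy / 3600, 1)) # 24.3 wall clock hours per model year
```

A machine at this (hypothetical) rate would need roughly a day of wall clock time per simulated year, far too slow for century-scale ensembles, which is why the metric is tracked across resolutions and machine generations.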

Establish usability metrics and system attributes for assessing the entire research, development, and analysis environment. These metrics will measure the important system features that contribute to the overall utility of a parallel system. Features include code development tools, batch and interactive management capabilities, I/O performance, performance of the memory hierarchy, compiler speed and availability, operating system speed and stability, networking interface performance, and mean time between failures or interrupts. Many of these features have been weak on current highly parallel computers compared to vector supercomputers. ACPI will likely develop a new set of utility benchmarks and attributes for highly and massively parallel machines as well as clusters of SMPs. The goal of these benchmarks is to derive a set of metrics that measure the overall utility of machines and systems for a productive climate modeling environment. Such an environment would emphasize model flexibility, rapid job turnaround, and effective capabilities for analysis and data access. The system attributes will also specify a set of desired capabilities necessary to administer and manage the system and its resources. At this time no industry standard benchmark addresses these issues; ACPI can drive the state of the art in performance evaluation by initiating such an effort.

Establish a repository of benchmark results for measurement of progress in the initiative and for international comparison. Benchmarking codes, metrics, and the detailed benchmark results must be available to the public and the high performance computing community through frequent publication of reports, Web pages, etc., with the goal of influencing the computer industry. ACPI will immediately benefit from using these benchmarks to acquire the next-generation of production machines. ACPI will create a central repository of benchmark codes and performance data. It will also publish annual reports about the measured performance gains and accomplishments of the initiative, as well as providing international comparison data. Over the duration of the ACPI, the rapid development of new high-performance computing technology will continue, with rapidly changing architectures and new software and tools becoming available. Hence a benchmark must be designed to be able to evaluate systems across several hardware generations, diverse architectures, and changing software environments. In return, with a well-designed benchmark, ACPI can drive the high performance computing industry and create a better focus in the industry on high-end computing.

 

Building Regional Climate Collaboration Centers

The regional collaboratory component of the ACPI will link its predictive capability with the assessment and policymaking communities. The objective of the Regional Climate Collaboration Centers (RCCC) is both to meet the immediate needs of these communities and to advance the suite of capabilities for performing impact and policy assessment. The RCCC will pursue a four-part strategy to accomplish its mission (Figure 5): (1) tool development to assess the local-scale effects of climate change and climate variability and to produce specialized climate projection products related to them; (2) information dissemination via specialized climate projection products, including an archive of quality-assured global-scale climate simulations and other essential information required for impact and policy assessment; (3) expert assistance to the assessment community and government and private-sector decision-makers in understanding the possible effects of climate change and climate variability on human activity and in evaluating measures for mitigating or adapting to these changes; and (4) service to the assessment community and government and private-sector decision-makers by providing the tools, products, and information they require to do their jobs.

Figure 5. Regional Climate Collaboration Centers, through two major functions (service and research), will provide a wealth of tools, information, and databases for users in the research, assessment, and policymaking communities.

The RCCC will translate the findings of research and their implications for human activities to individuals, organizations and government institutions so that they can make intelligent and informed choices. The RCCC will conduct significant research and development activities, but these activities will be driven by the needs of the RCCC's user community.

Linked regional collaboratories will be located at DOE national laboratories; they will develop strong partnerships with research universities and other government laboratories engaged in climate and climate impacts research and information dissemination (e.g., NCAR, GFDL, regional NOAA laboratories and NOAA-sponsored institutes, NASA, the Department of Agriculture, and EPA). Since the immediate effects of climate change and climate variability are inherently local and regional, each collaboratory will focus on problems with regional importance. The implications of these regional effects, however, have national and international significance. In addition, the economic impacts of climate change and climate variability can usually be understood only in the context of a global perspective. Thus, the RCCC will provide this national and global integration.

The RCCC, therefore, will have two major functions, service and research. The service function is to provide the tools, information, and expertise required to understand and assess the implications of climate change for the country. The research function is to develop and improve the tools and expertise needed to produce this information. The service function can begin immediately. Many of the tools and much of the expertise required to assess the regional effects of climate change and climate variability already exist. As the understanding and ability to predict climate improves, however, the questions asked of assessments will become more detailed and sophisticated. The RCCC's research function will ensure that the capabilities for assessment and for supplying more detailed climate and climate-related information keep pace. These functions consist of a Climate Products Division, the RCCC's Information Delivery System, a Computational Tool Library, and an Assessment Products Division.

The products and services of the RCCC must be user-driven. Several classes of users can be envisioned, including the broad spectrum of climate researchers, policy and technical professionals who require climate information, educators, and the general public. Each of these users requires different types of information, access, and support. The RCCC will draw on experience from the ARM program, CDIAC, PCMDI, the NASA DAACs, NCDC, the NOAA regional climate centers, state climatologists' offices, and others, to develop and deliver specialized data packages and access or display formats appropriate for selected user groups. In some cases, specialized data packages may be developed for individual collaborators with the RCCC. Examples of value-added products include filtered data sets, mapped or regridded data, and various averaged variables. In addition, results from specialized regional models such as stream flow, snow pack, and agricultural productivity for specific crops could also be produced and distributed.

A central activity of the RCCC will be the creation and maintenance of general and specialized data products and information for assessing regional effects of climate change. These products will include not only an archive of all global climate simulations produced under the ACPI and pertinent simulations generated by other modeling groups, but also specialized products derived from these simulations as well as other information needed to determine the effects of an altered climate on essential resources and activities. Fast, reliable, and interactive delivery of this information to the user community is essential to ACPI's mission. A possible configuration of the information delivery system consists of a web-based information delivery interface, a user support group, and an educational outreach component. The information delivery interface will provide direct access to the products and tools developed and maintained by the RCCC. The user support group will be the human interface to the users, and the educational outreach component will provide certain specialized educational activities. Chief among these will be education and training of local, state, and regional government and private-sector decision-makers on regional problems associated with climate change and climate variability, and training in the use of appropriate assessment tools. The RCCC will also function as a source of content material for persons interested in developing climate-related educational materials for public schools or for college-level instruction.

As part of its basic activity, the RCCC will maintain a distributed software tool library. The library will maintain and provide user access to the computational methods used to produce the RCCC's products and to specialized capabilities and methods for modeling, understanding, and assessing climate effects. Examples include methods and capabilities for

* assessing effects of climate change and climate variability on water resources, energy generation and use, agricultural production, and other critical activities,

* translating the results of global-scale climate models to the time and space scales required for assessment, and

* evaluating the effects of climate change and climate variability and the effectiveness of actions for mitigating these effects.

 

The RCCC will also provide specific assessment products and specialized expertise to researchers and experts engaged in the assessment and policy formulation processes, including participation in national and international assessments. To meet these obligations, the RCCC must maintain an experienced and diversified staff of researchers and other technical experts to assist national and local decision-makers with climate impact and policy assessment. The RCCC will be charged to develop and promote cooperative efforts in impact assessment among regional universities, DOE laboratories, NOAA-sponsored climate institutes and other relevant organizations.

 

Institutional Commitment and Interaction

Establishing a collaborative activity of the proposed scope and magnitude of the ACPI will present significant institutional challenges to the organizations and individuals involved. The history of major collaborative research successes ranges from the Manhattan Project through recent programs in the field of climate research (such as the ARM and CHAMMP programs). In the first case, institutional interaction and commitment grew from a clear and compelling political imperative (winning the war), justifying a centralized command structure. Other programs have gone through an extended process of ad hoc trial-and-error learning that imposed high costs on dedicated individuals mediating among the interests and priorities of multiple institutions and agencies. The advantage of the command mode is its immediate effectiveness; its downside is backlash. The advantage of longer-term institutional learning is that it builds resilience into programmatic institutions; its disadvantage is that it takes time. The ACPI will improve on these institutional models by incorporating an institutional learning process explicitly into program development and operation.

Many of the scientists, program managers, and institutions engaged in establishing and operating ACPI will be those who have already contributed to the success of previous climate change prediction research programs. They have already developed a fund of invaluable experience that can and will be applied to the development of the ACPI. However, the ACPI is likely to introduce new challenges for institutional collaboration: new partners will be involved, and the scope of the work will be qualitatively transformed. The acquired craft skills and organization-building experience of ACPI participants therefore need to be supplemented with the insights of institutional research and empirical studies of technical collaboration.

Organizational development aspects of the ACPI will be the purview of individuals at existing institutions (for example, GFDL, NCAR) or newly created institutions (for example, the RCCCs) who are assigned the task of coordinating and integrating the various activities involved in generating and analyzing climate projections. A dedicated institutional analyst will support these organization-building efforts by monitoring interactions, decision-making processes, and informal and formal operational practices. The analyst will provide feedback to these individuals and other stakeholders, raising process-oriented issues for discussion and resolution. Two major organizational imperatives will have to be addressed.

First, sustainable success of the ACPI will depend as much on its ability to become embedded in the participating institutions as on its ability to attract those institutions to participate in the first place. Important as it may be, the ACPI is not likely to be or become the sole or even the principal driving force of the larger institutions (such as universities and national laboratories) that participate in it. This means that institutional support will have to be designed and built in such a way that the ACPI is viewed by each of the participating institutions as contributing to and being a part of its broader purpose.

Second, because it will be operating in an inherently uncertain political system subject to nonlinearities comparable to those of the natural systems it will be exploring, the ACPI needs to be designed for institutional sustainability. Institutional analysis suggests that this is best achieved by nurturing pluralism within the institution. This means providing support for multiple viewpoints and approaches so that a broad repertoire of institutional strategies is always available within the institution to deal with unexpected threats and to enable it to take advantage of unforeseen opportunities.

The ACPI organizational development must be grounded in institutional research and guided by results and principles developed by that research. Several of these are particularly relevant to establishing and operating the ACPI:

* Sustainable collaborations are usually based on interests formed in the course of negotiating the relationships among the parties. Although prospective partners may approach collaboration with preconceived interests, these rarely form the basis of sustainable collaborations.

* Effective channels for communication among all concerned parties will reduce instances where participants become concerned that others may be engaging in strategic behavior intended to advance their own interests before those of the collaboration.

* To the extent possible, personal rewards of participating individuals should be aligned with the interests of the collaboration as a whole, so that individual commitments to collaboration-building do not become frustrated by pressures to conform to the narrower goals of their institution of origin.

* The ACPI should build on existing research activities, seeking synergies of institutional and investigator interests that advance the state of the art, rather than develop top-down agendas.

* Organization-building activities should focus on nurturing long-term sustainable relationships at least as much as on achieving immediate or medium term goals. While clearly articulated goals are important spurs for action, an overriding focus on goals is irrational when scientists and program managers are confronting a highly uncertain outcome.

These are examples of systematic insights that should guide the deliberations to design, develop, and operate the ACPI.

 

ACPI Implementation Strategy

The ACPI implementation plan will need to be developed to complement the larger Strategic Simulation Initiative, of which it is a part. The ambitious acceleration of scientific progress, model development and application, and information dissemination requires an equally ambitious acceleration of computational and information technologies. Keeping the program in balance requires that the first priority be to implement the ACPI system of institutions and research collaborations with currently available technology, laying the foundation for future progress. Therefore, the following implementation principles should be followed to maintain program balance and to achieve the ACPI objectives.

* The extensive multi-institutional collaboration that is needed to develop, evaluate and apply climate models must be started immediately. The institutions involved extend beyond the DOE laboratory system and must be linked together immediately by a preliminary version of the DARC-Fabric networking capability.

* The RCCC system should be established quickly and linked to the DARC-Fabric. The RCCC system can begin work immediately to collect and integrate climate change information that is available, but difficult to access. In addition, the human and technology connections among the RCCC and the model collaborations can be constructed.

* A schedule for computer platform acquisition and deployment should be developed in time for early acquisition of the first machine.

* An ambitious grant-based, basic research component encompassing both climate and computational sciences should be started immediately so that creativity and innovation can penetrate the more applied parts of the program.

* A model development program should be initiated that plans for models to be ready as new computers become available. This means at any given time, model development, evaluation and application are proceeding in parallel on successive generations of model codes.

Thus, to achieve its scientific goals on time, ACPI must proceed in concert with the SSI and ASCI. At the same time, the requirements set forth for the ACPI must be met for the SSI to be successful. Effective partnerships and multi-disciplinary teamwork among climate researchers, computational scientists, and technology developers are required for success.

Conclusions

The ACPI can make a powerful contribution to effective climate change policies by advances in the simulation science on which predictions of climate are based. Simultaneous acceleration of hardware and software developments in computing interacting with increases in knowledge of climate processes will create a synergistic program in simulation science. Explicit attention to institution-building will enable the collaborations needed to fulfill the ambitious goals of the ACPI.

The Regional Climate Collaboration Centers (RCCC) are essential for the ACPI to achieve its full potential. The RCCCs provide the connections among global climate science and national and regional assessments and analyses. They are the means by which new knowledge is disseminated to the institutions and people who can use it to mitigate or adapt to climate change.

The final outcome will be the wide availability of data and scientifically credible climate projections in usable formats. This will benefit the scientific community, policymakers, firms, and households seeking to understand climate change and its impacts.

Ancillary benefits from the ACPI are numerous. The collaboration and computational infrastructure that will be developed for ACPI can be readily adapted and implemented for other scientific disciplines, such as health and biology, materials, and fusion energy. Advances in models for long-term climate prediction will enable rapid improvements for both weather forecasting and seasonal prediction. The challenge of building and applying climate models will be a stringent test for new computational technologies, thereby accelerating the pace of production supercomputing evolution.

References

Aiken, R, RA Carlson, IT Foster, TC Kuhfuss, RL Stevens, and L Winkler. January 1997. Architecture of the Multi-Modal Organizational Research and Production Heterogeneous Network (MORPHnet). [Online report]. Available URL: http://www.anl.gov/ECT/Public/research/morphnet.html

Easterling, DR, TC Peterson, and TR Karl. 1996. "On the development and use of homogenized climate data sets." Journal of Climate 9:1429-1434.

Haltiner, GJ, and RT Williams. 1980. Numerical Prediction and Dynamic Meteorology. John Wiley and Sons, New York.

Lorenz, EN. 1975. "Climate Predictability." In The Physical Basis of Climate and Climate Modelling, Global Atmospheric Research Programme Publication No. 16, pp. 132-136. World Meteorological Organization, Geneva.

Randall, D, J Curry, D Battisti, G Flato, R Grumbine, S Hakkinen, D Martinson, R Preller, J Walsh, and J Weatherly. 1998. "Status of and outlook for large-scale modeling of atmosphere-ice-ocean interactions in the Arctic." Bull. Amer. Meteor. Soc. 79:197-219.

Trenberth, KE (Ed.). 1992. Climate System Modeling. Cambridge University Press, Cambridge.

Washington, WM, and CL Parkinson. 1986. An Introduction to Three-Dimensional Climate Modeling. University Science Books, Mill Valley, CA.

 

Glossary

Benchmarking-- measuring the performance of a computer according to a specified standard

Byte-- a data storage measurement unit, typically equal to eight binary digits (or bits) of information

Ensemble-- several experiments differing only slightly in their initial conditions, run as a group to generate probability distributions

Evaluation (of models)-- comparison of simulation model results with various types of observational data to determine whether or not the model represents the climate correctly

Floating point operations per second (FLOPS)-- a measure of how fast a computer performs floating point arithmetic; the computing systems envisioned in this report are rated in trillions of FLOPS (TFLOPS)

General circulation model (GCM)-- a computer modeling code that simulates the global dynamics and thermodynamics of the atmosphere or ocean

Giga-, Tera-, Peta- -- prefixes denoting multiplying factors of, respectively, 10^9, 10^12, and 10^15

Latency-- the delay between a request for data and the arrival of that data (in a memory system or across a network)

Memory bandwidth-- a measure of how fast data can be moved between processors and memory

Network bandwidth-- a measure of volume of data transfer capacity

Parameterization-- a sub-model or algorithm within a climate or weather model that simulates processes that function on spatial scales smaller than those explicitly resolved by the model grid

Problem-solving environment (PSE)-- a suite of software products that can be used collaboratively to solve a target class of problems

Projections-- a collection of model simulations forecasting probabilities of future climate

Resolution-- a measure of model spatial detail, typically expressed as the grid spacing

Scalability-- the property of a parallel system, such as a cluster of shared-memory multi-processors (SMPs), that allows the number of processors to increase without loss of efficiency

Simulation science-- inquiry conducted by means of computer models, especially predictive models; equal to laboratory science in importance for scientific inquiry, used when laboratory experiments are impractical or impossible

Standards-- in computing, accepted practices and methodologies for communications, compiling, etc. promulgated by organizations such as ANSI

Visualization-- the display of static or animated images derived from large data sets

Wall clock time-- the true elapsed time required to complete a task on a computer

For More Information

Dr. Ari Patrinos

ER-70

US Department of Energy

19901 Germantown Road

Germantown, MD 20874