Hydrogeochemistry Application

Now, let us switch gears and look at a hydrogeochemistry system on this same Melton Branch site. The problem we chose to study is a hypothetical uranium mill tailings problem. Watershed-scale studies of hydrogeochemical systems are still rare on the application side, largely because of a lack of data and of computational capacity, in both software and hardware. Collecting and analyzing multispecies samples is extremely expensive, and at the watershed scale we need a large number of them. The hypothetical problem here is partly a proof of concept that watershed-scale hydrogeochemistry can be modeled with ease, and partly an exercise to examine the experimental and computational support that we should anticipate in the real-world application that is part of our ongoing project.

But, first of all, let me recount a bit of the legacy of the Cold War. From the 1940s to the 1960s, the Manhattan Engineer District and the Atomic Energy Commission purchased a lot of uranium ore in this country. The waste resulting from the extraction of uranium, which is at best acidic and radioactive, is the prime pollutant of this study. In the next few slides that I’m going to show you, we are modeling a hydrogeochemical system with 7 component species and 49 derived species. The seven component species are calcium, carbonate, uranium (VI), sulfate, phosphate, iron, and the proton (H+, that is, pH). The reactions involved include aqueous complexation, for example, calcium sulfate; precipitation/dissolution, for example, calcium carbonate; and acid/base reactions, for example, those involving carbonate and hydrogen ions. We need a large amount of CPU time for the chemical equilibrium calculation on a 15K grid with 7 degrees of freedom at each finite element node. So, we chose the parallel groundwater flow code PFEM, originally the 3DFEMWATER code written by Professor Yeh at Penn State and later adapted to the Intel hypercube and Paragon parallel supercomputers by my coauthor Ed D’Azevedo, to calculate transient groundwater flow using boundary and initial conditions common to the study area. This computation was conducted on the Intel Paragons at ORNL. The output from PFEM, including water content and velocity, was then fed into the parallel HYDROGEOCHEM 3D, written by Professor Yeh and parallelized by myself using a High Performance Fortran compiler from the Portland Group. We presented this work at the AGU Fall Meeting last year, and you can find the poster at http://www.ccs.ornl.gov/staff/gwo/agu96fall.html.
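
To give a feel for what a chemical equilibrium calculation does at each node, here is a minimal sketch of the simplest piece of it: the acid/base speciation of carbonate as a function of pH. It is not the HYDROGEOCHEM algorithm, which solves all 7 components and 49 derived species simultaneously. The function name is hypothetical, the pK values are the usual textbook values for dilute water at 25 °C, and activity corrections are ignored; all of these are assumptions made for illustration only.

```python
def carbonate_fractions(pH, pK1=6.35, pK2=10.33):
    """Fraction of dissolved carbonate present as each acid/base form.

    Assumptions (not from the talk): textbook dissociation constants at
    25 degrees C, ideal solution (activities equal concentrations), and a
    fixed pH supplied by the caller.
    """
    h = 10.0 ** (-pH)      # hydrogen ion concentration
    k1 = 10.0 ** (-pK1)    # H2CO3* <-> H+ + HCO3-
    k2 = 10.0 ** (-pK2)    # HCO3-  <-> H+ + CO3--
    denom = h * h + k1 * h + k1 * k2
    return {
        "H2CO3*": h * h / denom,
        "HCO3-": k1 * h / denom,
        "CO3--": k1 * k2 / denom,
    }


if __name__ == "__main__":
    # The two pH values are the ones used in the hypothetical scenario:
    # 7.6 for the initial system and 1.3 for the leaking solution.
    for pH in (7.6, 1.3):
        fractions = carbonate_fractions(pH)
        print(pH, {name: round(value, 4) for name, value in fractions.items()})
```

In this simplified picture bicarbonate dominates at the near-neutral initial pH, while in the acidic leakage essentially all carbonate is in the fully protonated form. The full model has to solve coupled problems of this kind, with many more species, at every node and every time step, which is where the CPU time goes.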

The hypothetical scenario is that we have a tank containing tailings waste to be treated near the southeastern corner of the site, which is at the upper left corner of the isosurface plot here, and the tank is leaking. The system has an initial pH of 7.6, but the leakage is a solution of pH 1.3. The initial concentration of uranium (as UO₂²⁺) is 1.0 × 10⁻⁷, while the leakage has a concentration of 5 × 10⁻⁴. There is constant recharge of water to the watershed from precipitation, and the water discharged from the watershed is to be collected and treated near the bottom of the hill. This isosurface plot, 26.5 days after the tank started leaking, shows a large amount of uranium input right beneath the leaking storage tank and a large increase of pH in the same area. Near the bottom of the watershed, however, discharge of water and reaction with the carbonate resulted in a decrease of pH.

While modeling of the watershed with a previously collected geochemistry database, along with sensitivity and uncertainty analyses, is planned in the research project, our current focus is largely on the optimal implementation of the coupled chemical kinetics-equilibrium and solute transport calculations and on the computational efficiency of the implemented code. For a large-scale application such as the Melton Branch Watershed, one of the first issues that a model user may encounter is the capacity of the computational facility. A complex geochemical system in a large, heterogeneous setting usually requires hundreds of megabytes of memory, if not gigabytes. This could easily overwhelm the most advanced and well-equipped Unix workstations available today. One would usually need a mainframe computer larger than those. And then there is the even more pressing issue of computational efficiency. A conventional single-processor machine executes instructions serially, and it usually shares its resources with other users as well. For many applications in the computational sciences, however, parallelism can be implemented straightforwardly, and the resulting speedup usually allows not only more thorough analysis but also better characterization of uncertainties. So, we believe that until these computational issues are satisfactorily resolved, watershed-scale modeling of hydrogeochemistry may not be very realistic. So, bear with me if you think I’m throwing a computer scientist’s pitch for parallel computing at you, because we believe that the lack of it is one of the major roadblocks to real applications of the currently available hydrogeochemistry codes that the academic world has invested in so heavily over the years.

Parallel algorithms are nothing new to computer scientists, but their recent resurgence owes a lot to the so-called off-the-shelf advantage and to advances in microprocessor technology. In the case of the Intel Paragons, the processors are arranged in a two-dimensional mesh, with each node, shown as gray boxes here, containing two to three Pentium or i860 processors. One of these processors handles the messages piped through the message channel, and the rest do the computation. Each node has its own memory (32 MB, 64 MB, or 128 MB), and message passing is usually sustained at 128 megabits per second. Parallel machines of this architecture are called distributed-memory parallel computers. Their rival siblings are called shared-memory parallel computers, in which every processor on the machine has access to the same memory space. The Cray T3D is an example.

Computationally, it always matters how we map the computational domain onto the architecture of the computer. Because solute transport by its nature communicates state variables such as concentration from one location to another, parallelizing it is not always easy. For example, we may decompose the Melton Branch Watershed into a two-dimensional mesh of subdomains, each containing a certain number of finite elements, and assign the subdomains to the processor nodes. Each processor node worries only about the elements assigned to it, except that the additional internal boundaries created artificially by the decomposition need to be accounted for. To do so, a processor node needs to ask its neighbors for updates of the state variables; hence the message passing. For hydrogeochemical systems with tens or hundreds of species, however, most of the CPU time may be used for the geochemistry calculations. And, fortunately, the geochemistry calculations are usually localized inside the subdomains, so no message passing is necessary. This step, in computer scientists’ jargon, is perfectly parallel. This observation matters a great deal when designing parallel algorithms for parallel computers. But we have to keep in mind that we still have solute transport, which may require a lot of message passing, especially when the groundwater velocity is high and Lagrangian particle tracking is used to calculate the advection of solutes.
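
The idea of subdomains asking their neighbors for boundary values can be sketched in a few lines. What follows is a toy, not the PFEM or HYDROGEOCHEM implementation: it uses a one-dimensional row of nodes instead of a three-dimensional finite element mesh, a simple explicit diffusion update instead of the real transport solver, and plain Python lists standing in for processor nodes, with a list copy standing in for a message. All function names are hypothetical, and every subdomain is assumed to own at least one node.

```python
def diffuse_serial(c, alpha):
    """One explicit diffusion step on the whole domain; end values are held fixed."""
    new = list(c)
    for i in range(1, len(c) - 1):
        new[i] = c[i] + alpha * (c[i + 1] - 2.0 * c[i] + c[i - 1])
    return new


def split_with_halos(c, n_sub):
    """Cut the domain into n_sub pieces, each padded with one halo cell per side."""
    size, extra = divmod(len(c), n_sub)
    subs, start = [], 0
    for rank in range(n_sub):
        stop = start + size + (1 if rank < extra else 0)
        subs.append([0.0] + list(c[start:stop]) + [0.0])
        start = stop
    return subs


def exchange_halos(subs):
    """Stand-in for message passing: copy edge values into the neighbors' halos."""
    for rank in range(len(subs) - 1):
        left, right = subs[rank], subs[rank + 1]
        left[-1] = right[1]    # left piece receives its right neighbor's first node
        right[0] = left[-2]    # right piece receives its left neighbor's last node


def step_decomposed(subs, alpha):
    """Same diffusion step, but each piece touches only its own nodes and halos."""
    exchange_halos(subs)
    last = len(subs) - 1
    out = []
    for rank, s in enumerate(subs):
        new = list(s)
        lo = 2 if rank == 0 else 1                     # skip the fixed global left end
        hi = len(s) - 2 if rank == last else len(s) - 1  # skip the fixed global right end
        for i in range(lo, hi):
            new[i] = s[i] + alpha * (s[i + 1] - 2.0 * s[i] + s[i - 1])
        out.append(new)
    return out


def gather(subs):
    """Reassemble the global field from the owned (non-halo) nodes."""
    return [value for s in subs for value in s[1:-1]]


if __name__ == "__main__":
    field = [0.0] * 20
    field[3] = 1.0  # a concentration spike near one end
    subs = split_with_halos(field, 4)
    for _ in range(10):
        field = diffuse_serial(field, 0.2)
        subs = step_decomposed(subs, 0.2)
    print(max(abs(a - b) for a, b in zip(field, gather(subs))))
```

The transport step needs one exchange per time step so that the decomposed result matches the serial one. A node-by-node chemistry update, by contrast, would loop over the owned nodes of each piece and never call `exchange_halos` at all, which is what makes that part perfectly parallel.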

This slide shows, for an 8-time-step simulation, how computational efficiency changes with an increasing number of processor nodes. The computation time is scaled to the time needed on a single processor node, which is 100%. Of that 100% of CPU time, 93% is used for the chemical equilibrium calculation. Given the localized nature of chemical equilibrium calculations, one would expect that increasing the number of processor nodes reduces the CPU time. This is indeed the case: the CPU time keeps decreasing until the number of processor nodes exceeds 65. At 65 processor nodes, message passing and I/O time have become so large in proportion that adding more processor nodes may actually mean poor utilization of computational resources. Nonetheless, there is still room for improvement, for example: (1) improved hardware message bandwidth will reduce the communication overhead of the model and thereby reduce the CPU time needed for solute transport calculations, and (2) parallel I/O may reduce the inevitably large I/O time of field-scale applications.
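
The 93% figure invites a back-of-the-envelope check with Amdahl's law. The sketch below is not a fit to the timings on the slide. It assumes, pessimistically, that only the chemical equilibrium part scales with the number of processor nodes and that the remaining 7% does not scale at all. The second function adds a purely hypothetical overhead that grows linearly with the number of nodes, to show how communication and I/O can produce a turnaround of the kind seen on the slide; the overhead coefficient is an illustrative assumption, not a measured value.

```python
def amdahl_speedup(parallel_fraction, n_procs):
    """Amdahl's law: speedup when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


def relative_time(parallel_fraction, n_procs, overhead_per_proc=0.0):
    """Run time relative to one node, with a toy linear overhead term.

    overhead_per_proc is a hypothetical cost (as a fraction of the one-node
    run time) added for every node beyond the first, standing in for
    message passing and I/O.
    """
    serial_fraction = 1.0 - parallel_fraction
    return (serial_fraction
            + parallel_fraction / n_procs
            + overhead_per_proc * (n_procs - 1))


if __name__ == "__main__":
    f = 0.93  # share of one-node CPU time spent on chemical equilibrium
    for p in (1, 2, 4, 8, 16, 32, 64, 128):
        print(p,
              round(amdahl_speedup(f, p), 2),
              round(relative_time(f, p, overhead_per_proc=1e-3), 3))
```

Under the no-overhead assumption the speedup can never exceed 1/0.07, or roughly 14, however many nodes are used. Once an overhead term that grows with the node count is included, the run time reaches a minimum and then rises again, which is the qualitative behavior described above.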