In my job in the Scientific Computing group, I work with several different projects to get them up and running at the petascale. While the majority of our users have experience running their codes in parallel, most have never had the opportunity to run on such a large machine or solve such large problems before they gain access to ours. For my projects, I am available to do anything from recommending the best compiler flags or improving their job scripts to becoming an embedded member of their code development team.
In addition, I work on the Joule project, in which four codes are selected for performance improvement over the course of the year, and the results of these efforts are reported to OMB. I coordinate the NCCS side of the effort.
I am currently working with the NUCCOR nuclear physics code team to load balance their code and generally improve its performance. In this case, we have to divide up the work across all the processors in a scalable way. Right now, the code scales up to about 1000 cores, but we need to get it to a point where it can work well on a leadership supercomputer such as Jaguar, ORNL's flagship machine.
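To illustrate what dividing work across processors can look like, here is a minimal sketch of one textbook static load-balancing heuristic, longest-processing-time-first. It is not NUCCOR's actual scheme, which is not described here; the function name `lpt_balance`, the task costs, and the processor count are all hypothetical.

```python
import heapq

def lpt_balance(costs, nprocs):
    """Greedy static load balancing (illustrative only, not NUCCOR's method).

    costs[t] is the estimated work of task t. Tasks are handed out
    largest first, each one to the processor with the least load so far.
    Returns (assignment, loads): the task list and total load per processor.
    """
    heap = [(0.0, p) for p in range(nprocs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(nprocs)]
    for task in sorted(range(len(costs)), key=lambda t: costs[t], reverse=True):
        load, p = heapq.heappop(heap)          # least-loaded processor
        assignment[p].append(task)
        heapq.heappush(heap, (load + costs[task], p))
    loads = [sum(costs[t] for t in tasks) for tasks in assignment]
    return assignment, loads

# Hypothetical task costs spread over three processors.
assignment, loads = lpt_balance([7, 5, 4, 3, 3, 2], nprocs=3)
print(loads)  # [9, 8, 7]
```

A heuristic like this assumes the task costs are known in advance and ignores communication between tasks, both of which matter in a real code.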
I worked on load balancing techniques for the octrees in MADNESS. On a leadership supercomputer, we want to distribute the load evenly over hundreds or even thousands of processors while minimizing the number of broken links. What makes our load balancing requirements unusual is that we use the whole tree for some operations, rather than just the leaf nodes. I came up with some ideas that suited our purposes. For more about our project, see our SciDAC project web page.
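The two competing goals can be made concrete with a small sketch that scores a given tree partition. It assumes that a "broken link" means a parent-child edge whose two nodes live on different processors, and that every node, interior or leaf, carries work. The function `evaluate_partition` and the toy tree are hypothetical and are not part of MADNESS; a binary tree stands in for an octree to keep the example short.

```python
from collections import defaultdict

def evaluate_partition(parent, cost, owner):
    """Score a tree partition (illustrative only, not the MADNESS code).

    parent[n] : parent of node n, or None for the root
    cost[n]   : work attached to node n (interior nodes count too)
    owner[n]  : processor that holds node n
    Returns (load per processor, number of broken links).
    """
    load = defaultdict(float)
    broken = 0
    for node, par in parent.items():
        load[owner[node]] += cost[node]
        # Assumed definition: a link is broken when it crosses processors.
        if par is not None and owner[par] != owner[node]:
            broken += 1
    return dict(load), broken

# Toy 7-node tree: root 0, children 1 and 2, grandchildren 3..6.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
cost = {n: 1.0 for n in parent}

by_subtree = {0: 0, 1: 0, 3: 0, 4: 0, 2: 1, 5: 1, 6: 1}
round_robin = {n: n % 2 for n in parent}

print(evaluate_partition(parent, cost, by_subtree))   # ({0: 4.0, 1: 3.0}, 1)
print(evaluate_partition(parent, cost, round_robin))  # ({0: 4.0, 1: 3.0}, 3)
```

Both partitions balance the load equally well, but keeping subtrees together breaks one link instead of three.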
I am involved in a project modeling the biofuel supply chain infrastructure. The central question is where biorefineries and storage facilities should be located and which crops should be grown. This boils down to a large mixed-integer linear program. My role is to facilitate the solution of this problem at the petascale.
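To show the flavor of such a model, here is a toy siting problem with binary "open a refinery here" decisions, solved by brute-force enumeration rather than by a real MILP solver. It is only a sketch: the actual model also covers storage and crop choices, and the function name `solve_tiny_siting` and all of the costs are hypothetical.

```python
from itertools import product

def solve_tiny_siting(fixed_cost, ship_cost):
    """Brute-force a tiny uncapacitated siting problem (illustrative only).

    fixed_cost[j]   : cost of opening a refinery at candidate site j
    ship_cost[i][j] : cost of shipping farm i's crop to site j
    Returns (minimum total cost, list of sites to open).
    """
    n_sites = len(fixed_cost)
    best_cost, best_open = float("inf"), None
    # Enumerate every 0/1 setting of the binary "open site j" variables.
    for flags in product((0, 1), repeat=n_sites):
        open_sites = [j for j in range(n_sites) if flags[j]]
        if not open_sites:
            continue  # every farm must ship somewhere
        cost = sum(fixed_cost[j] for j in open_sites)
        # With the open set fixed, each farm uses its cheapest open site.
        cost += sum(min(row[j] for j in open_sites) for row in ship_cost)
        if cost < best_cost:
            best_cost, best_open = cost, open_sites
    return best_cost, best_open

# Two candidate sites, two farms, made-up costs.
print(solve_tiny_siting([3, 3], [[1, 9], [9, 1]]))  # (8, [0, 1])
```

The enumeration visits 2^n combinations for n candidate sites, so it is hopeless beyond toy sizes. That exponential growth is why realistic instances need branch-and-bound style solvers and, at this scale, parallel ones.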