# Events

## Workshops and Conferences

**Society for Industrial and Applied Mathematics Annual Meeting**

The SIAM Annual Meeting is the largest applied mathematics conference held each year. Guannan Zhang and Miroslav Stoyanov organized a mini-symposium on "Recent Advances in Numerical Methods for Partial Differential Equations with Random Inputs", a rapidly growing field of great importance to science. There were 12 invited speakers from both national labs and academia, including ORNL, Argonne National Lab, Florida State University, University of California, University of Pittsburgh, Virginia Tech, University of Minnesota, and Auburn University. The mini-symposium was well attended by an even wider variety of researchers and gave participants an opportunity to discuss their current work as well as the future development of the field.

**Durmstrang-2 Review**

The Fall review for the Durmstrang-2 project was held on September 10-11 in Maryland. Durmstrang-2 is a DoD/ORNL collaboration in extreme scale high performance computing. The long-term goal of the project is to support the achievement of sustained exascale processing on applications and architectures of interest to both partners. The Durmstrang-2 project is managed from the Extreme Scale Systems Center (ESSC) of CCSD.

Steve Poole, Chief Scientist of CSMD, presented the overview and general status update at the Fall review. Benchmarks R&D discussion was facilitated by Josh Lothian, Matthew Baker, Jonathan Schrock, and Sarah Powers of ORNL; Languages and Compilers R&D discussion was facilitated by Matthew Baker, Oscar Hernandez, Pavel Shamis, and Manju Venkata of ORNL; I/O and FileSystems R&D discussion was facilitated by Brad Settlemyer of ORNL; Networking R&D discussion was facilitated by Nagi Rao, Susan Hicks, Paul Newman, Neena Imam, and Yehuda Braiman of ORNL; Power Aware Computing R&D discussion was facilitated by Chung-Hsing Hsu of ORNL; System Schedulers R&D discussion was facilitated by Greg Koenig and Sarah Powers of ORNL. A special panel on Lustre was also convened to discuss best practices and path forward. Panelists included both DoD and ORNL members. The topics of discussion during the executive session of the review included continued funding/growth of the program, task progression, and development of performance metrics for the project.

Upcoming events for ESSC include: OpenSHMEM Birds-of-a-Feather session at Supercomputing 2013, OpenSHMEM booth at Supercomputing 2013, and OpenSHMEM Workshop (date to be announced).

Dr. Tiffany Mintz (Computer Science Research Group) is an ORNL R&D staff member who recently joined the ESSC team.

**CAEBAT Annual Review**

The Computer-Aided Engineering for Batteries (CAEBAT) program [1] is funded through the Vehicle Technologies (VT) program office within the DOE Office of Energy Efficiency and Renewable Energy (EERE). This program, led by NREL and including industry and university partners, is developing computational tools for the design and analysis of batteries with improved performance and lower cost. CSMD staff in the Computational Engineering and Energy Science (CEES) and Computer Science (CS) groups are leading development of the shared computational infrastructure used across the program, known as the Open Architecture Software (OAS), as well as defining standards for input and battery "state" representations [2].

On Aug. 27, 2013, the ORNL team (Sreekanth Pannala, Srdjan Simunovic, Wael Elwasif, Sergiy Kalnaus, Jay Jay Billings, Taylor Patterson, and CEES Group Leader John Turner) hosted the CAEBAT Program Manager, Brian Cunningham, at ORNL. This visit served as an annual review for the ORNL CAEBAT effort, and provided a venue for the team to demonstrate progress in simulation capabilities, including an initial demonstration of the use of the NEAMS Integrated Computational Environment (NiCE) with OAS [3].

[1] http://www.nrel.gov/vehiclesandfuels/energystorage/caebat.html

[2] http://energy.ornl.gov/CAEBAT/

[3] http://sourceforge.net/apps/mediawiki/niceproject/index.php?title=CAEBAT

**First OpenSHMEM Workshop: Experiences, Implementations and Tools**

October 23-25

The OpenSHMEM workshop is an annual event dedicated to the promotion and advancement of parallel programming with the OpenSHMEM programming interface and to helping shape its future direction. It is the premier venue to discuss and present the latest developments, implementation technology, tools, trends, recent research ideas and results related to OpenSHMEM and its use in applications. This year's workshop will also emphasize the future direction of OpenSHMEM and related technologies, tools and frameworks. We will also focus on future extensions for OpenSHMEM and hybrid programming on platforms with accelerators. Although this is an OpenSHMEM-specific workshop, we welcome ideas used for other PGAS languages/APIs that may be applicable to OpenSHMEM.

Topics of interest for the workshop include (but are not limited to):

- Experiences in OpenSHMEM applications in any domain
- Extensions to and shortcomings of current OpenSHMEM specification
- Hybrid heterogeneous or many-core programming with OpenSHMEM and other languages or APIs (e.g. OpenCL, OpenACC, CUDA, OpenMP)
- Experiences in implementing OpenSHMEM on new architectures
- Low level communication layers to support OpenSHMEM or other PGAS languages/APIs
- Performance evaluation of OpenSHMEM or OpenSHMEM-based applications
- Power/energy studies of OpenSHMEM
- Static analysis and verification tools for OpenSHMEM
- Modeling and performance analysis tools for OpenSHMEM and/or other PGAS languages/APIs
- Auto-tuning or optimization strategies for OpenSHMEM programs
- Runtime environments and schedulers for OpenSHMEM
- Benchmarks and validation suites for OpenSHMEM

http://www.csm.ornl.gov/workshops/openshmem2013/

**Fourth Workshop on Data Mining in Earth System Science**

June 5-7

CSMD researcher Forrest Hoffman organized the Fourth Workshop on Data Mining in Earth System Science (DMESS 2013; http://www.climatemodeling.org/workshops/dmess2013/) with co-conveners Jitendra Kumar (ORNL), J. Walter Larson (Australian National University, AUSTRALIA), and Miguel D. Mahecha (Max Planck Institute for Biogeochemistry, GERMANY). This workshop was held in conjunction with the 2013 International Conference on Computational Sciences (ICCS 2013; http://www.iccs-meeting.org/iccs2013/) in Barcelona, Spain, on June 5-7, 2013, and was chaired by J. Walter Larson. Richard T. Mills and Brian Smith of ORNL both presented papers in the DMESS 2013 session. These papers were published in volume 18 of Procedia Computer Science and are available at http://dx.doi.org/10.1016/j.procs.2013.05.411 and http://dx.doi.org/10.1016/j.procs.2013.05.408

**Special Symposium on Phenology**

April 14-18

CSMD researcher Forrest Hoffman co-organized a Special Symposium on Phenology with Bill Hargrove and Steve Norman (USDA Forest Service) and Joe Spruce (NASA Stennis Space Center) at the 2013 U.S.-International Association for Landscape Ecology Annual Symposium (US-IALE 2013; http://www.usiale.org/austin2013/), which was held April 14-18, 2013, in Austin, Texas. Hoffman also gave an oral presentation in this symposium. Titled "Developing Phenoregion Maps Using Remotely Sensed Imagery", this presentation described the application of a data mining algorithm to the entire record of MODIS satellite NDVI for the conterminous U.S. at 250 m resolution to delineate annual maps of phenological regions. In addition, Hoffman was a co-author on four other oral presentations at the US-IALE Symposium, including one by Jitendra Kumar (ORNL) that described an imputation technique for estimating tree suitability from sparse measurements.

**SOS 17 Conference**

March 25-28

Successful workshop on Big Data and High Performance Computing hosted by ORNL in Jekyll Island, Georgia

SOS is an invitation-only, two-and-a-half-day meeting held each year by Sandia National Laboratories, Oak Ridge National Laboratory, and the Swiss Technical Institute. This year it was hosted by ORNL in Jekyll Island, Georgia, on March 25-28, 2013.

The theme this year was "The intersection of High Performance Computing and Big Data." There were 40 speakers and panelists from around the world representing views from industry, academia, and national laboratories. The first day focused on the gaps between big computing and big data and the challenges of turning science data into knowledge. On the second day the talks and panels focused on where HPC and big data intersect and the state of big-data analysis software. The morning of the third day focused on the politics of big data including the issues of data ownership.

Findings of the meeting include the fact that large experimental facilities, such as CERN's Large Hadron Collider and the new telescopes coming online, already generate prodigious amounts of scientific data. The volume and speed at which data is generated require that the data be analyzed on the fly and only a tiny fraction be kept. The amount kept still amounts to many petabytes. The attendees stressed how important provenance is to the use of the archived data by other researchers around the world. The majority of today's scientific data is only of value to the original researcher, because the data lacks the metadata required for others to use it. The talks and panels clearly showed the intersection of high performance computing and big data. They also showed that the converse is not necessarily true, i.e. big data (as defined by Google and Amazon) does not require high performance computing. These vendors and their customers are able to get their work done on large, distributed networks of independent PCs. The meeting was filled with lively discussion and provocative questions.

For those wanting to know more, the agenda and talks are posted on the SOS17 website: http://www.csm.ornl.gov/workshops/SOS17/

**SIAM SEAS 2013 Annual Meeting**

March 22-24

On March 22-24, Oak Ridge National Laboratory and the University of Tennessee hosted the 37th annual meeting of the SIAM Southeastern Atlantic Section. The meeting included approximately 160 registered participants, of which roughly 60 were students and 20 were from ORNL. There were four plenary talks, 24 mini-symposium sessions, seven contributed sessions, and a poster session. Awards were given to students for Best Paper and Best Poster presentations. Attendees were also given guided tours of the Graphite Reactor, the Spallation Neutron Source, and the National Center for Computational Science. The meeting was organized by Chris Baker (ORNL), Cory Hauck (ORNL), Jillian Trask (UT), Lora Wolfe (ORNL), and Yulong Xing (ORNL/UT).

**Durmstrang-2**

March 18-19

The semi-annual review for the Durmstrang-2 project was held on March 18-19 in Maryland. Durmstrang-2 is a DoD/ORNL collaboration in extreme scale high performance computing. The long-term goal of the project is to support the achievement of sustained exascale processing on applications and architectures of interest to both partners. The Durmstrang-2 project is managed from the Extreme Scale Systems Center (ESSC) of CCSD.

Steve Poole, Chief Scientist of CSMD, presented the overview and general status update at the March review. Benchmarks R&D discussion was facilitated by Josh Lothian, Matthew Baker, Jonathan Schrock, and Sarah Powers of ORNL; Languages and Compilers R&D discussion was facilitated by Matthew Baker, Oscar Hernandez, Pavel Shamis, and Manju Venkata of ORNL; I/O and FileSystems R&D discussion was facilitated by Brad Settlemyer of ORNL; Networking R&D discussion was facilitated by Nagi Rao, Susan Hicks, Paul Newman, and Steve Poole of ORNL; Power Aware Computing R&D discussion was facilitated by Chung-Hsing Hsu of ORNL; System Schedulers R&D discussion was facilitated by Greg Koenig and Sarah Powers of ORNL. The topics of discussion during the executive session of the review included continued funding/growth of the program and development of performance metrics for the project.

**APS 2013 March Meeting**

March 18-22

The American Physical Society (APS) March Meeting is the largest physics meeting in the world, focusing on research from industry, universities, and major labs. Participation in this year's meeting, held in Baltimore, MD (March 18-22, 2013), by staff members of the Computational Chemical and Materials Sciences (CCMS) Group included 24 different talks (bold names are from CCMS).

**Monojoy Goswami, Bobby G. Sumpter**, "Morphology and Dynamics of Ion Containing Polymers using Coarse Grain Molecular Dynamics Simulation", Talk in Session T32: Charged and Ion Containing Polymers (March 21, 2013) APS National Meeting, Baltimore.

Debapriya Banerjee, Kenneth S. Schweizer, **Bobby G. Sumpter**, Mark D. Dadmun, "Dispersion of small nanoparticles in random copolymer melts", Talk in Session F32: Polymer Nanocomposites II (March 19, 2013) APS National Meeting, Baltimore.

**Rajeev Kumar, Bobby G. Sumpter**, S. Michael Kilbey II, "Charge regulation and local dielectric function in planar polyelectrolyte brushes", Talk in Session U32: Charged Polymers and Ionic Liquids (March 21, 2013) APS National Meeting, Baltimore.

Alamgir Karim, David Bucknall, Dharmaraj Raghavan, **Bobby Sumpter**, Scott Sides, "In-situ Neutron Scattering Determination of 3D Phase-Morphology Correlations in Fullerene-Polymer Organic Photovoltaic Thin Films", Talk in Session Y33: Organic Electronics and Photonics-Morphology and Structure I (March 22, 2013) APS National Meeting, Baltimore.

Geoffrey Rojas, P. Ganesh, Simon Kelly, **Bobby G. Sumpter**, John Schlueter, Petro Maksymovych, "Molecule/Surface Interactions and the Control of Electronic Structure In Epitaxial Charge Transfer Salts", Talk in Session U35: Search for New Superconductors III (March 21, 2013) APS National Meeting, Baltimore.

Geoffrey A. Rojas, P. Ganesh, Simon Kelly, **Bobby G. Sumpter**, John A. Schlueter, Petro Maksymovych, "Density Functional Theory studies of Epitaxial Charge Transfer Salts", Talk in Session N35: Search for New Superconductors III (March 20, 2013) APS National Meeting, Baltimore.

Arthur P. Baddorf, Qing Li, Chengbo Han, J. Bernholc, Humberto Terrones, **Bobby G. Sumpter, Miguel Fuentes-Cabrera**, Jieyu Yi, Zheng Gai, Peter Maksymovych, Minghu Pan, "Electron Injection to Control Self-Assembly and Disassembly of Phenylacetylene on Gold", Talk in Session C33: Organic Electronics and Photonics - Interfaces and Contacts (March 18, 2013) APS National Meeting, Baltimore.

Mina Yoon, Kai Xiao, Kendal W. Clark, An-Ping Li, David Geohegan, **Bobby G. Sumpter**, Sean Smith, "Understanding the growth of nanoscale organic semiconductors: the role of substrates", Talk in Session Z33: Organic Electronics and Photonics - Morphology and Structure II (March 22, 2013) APS National Meeting, Baltimore.

Chengbo Han, Wenchang Lu, Jerry Bernholc, **Miguel Fuentes-Cabrera**, Humberto Terrones, **Bobby G. Sumpter**, Jieyu Yi, Zheng Gai, Arthur P. Baddorf, Qing Li, Peter Maksymovych, Minghu Pan, "Computational Study of Phenylacetylene Self-Assembly on Au(111) Surface", Talk in Session C33: Organic Electronics and Photonics - Interfaces and Contacts (March 18, 2013) APS National Meeting, Baltimore.

Jaron Krogel, **Jeongnim Kim**, David Ceperley, "Prospects for efficient QMC defect calculations: the energy density applied to Ge self-interstitials", Talk in Session J24: Quantum Many-Body Systems and Methods I (March 19, 2013) APS National Meeting, Baltimore.

Kendal Clark, **Xiaoguang Zhang**, Ivan Vlassiouk, Guowei He, Gong Gu, Randall Feenstra, An-Ping Li, "Mapping the Electron Transport of Graphene Boundaries Using Scanning Tunneling Potentiometry", Talk in Session G6: CVD Graphene - Doping and Defects (March 19, 2013) APS National Meeting, Baltimore.

Gregory Brown, **Donald M. Nicholson**, Markus Eisenbach, **Kh. Odbadrakh**, "Wang-Landau or Statistical Mechanics", Talk in Session G6: Equilibrium Statistical Mechanics, Followed by GSNP Student Speaker Award (March 18, 2013) APS National Meeting, Baltimore.

**Don Nicholson, Kh. Odbadrakh**, German Samolyuk, G. Malcolm Stocks, "Calculated magnetic structure of mobile defects in Fe", Session Y16: Magnetic Theory II (March 22, 2013) APS National Meeting, Baltimore.

**Khorgolkhuu Odbadrakh, Don Nicholson**, Aurelian Rusanu, German Samolyuk, Yang Wang, Roger Stoller, **Xiaoguang Zhang**, George Stocks, "Coarse graining approach to First principles modeling of structural materials", Session A43: Multiscale modeling--Coarse-graining in Space and Time I (March 18, 2013) APS National Meeting, Baltimore.

**M. G. Reuter** & P. D. Williams, "The Information Content of Conductance Histogram Peaks: Transport Mechanisms, Level Alignments, and Coupling Strengths", Talk in Session R43: Electron Transfer, Charge Transfer and Transport Session (March 20, 2013) APS National Meeting, Baltimore.

**Paul R. C. Kent**, Panchapakesan Ganesh, Jeongnim Kim, Mina Yoon, Fernando Reboredo, "Binding and Diffusion of Li in Graphite: Quantum Monte Carlo Benchmarks and validation of Van der Waals DFT" Talk in Session A5: Van der Waals Bonding in Advanced Materials – Materials Behavior, (March 18, 2013) APS National Meeting, Baltimore.

Peter Staar, **Thomas Maier**, Thomas Schulthess, "DCA+: Incorporating self-consistently a continuous momentum self-energy in the Dynamical Cluster Approximation" Talk in Session N24, APS National Meeting, Baltimore.

**Thomas Maier**, Peter Hirschfeld, Douglas Scalapino, Yan Wang, Andreas Kreisel, "Pairing strength and gap functions in multiband superconductors: 3D effects", Talk in Session G37: Electronic Structure Methods II (March 20, 2013) APS National Meeting, Baltimore.

**Thomas Maier**, Yan Wang, Andreas Kreisel, Peter Hirschfeld, Douglas Scalapino, "Spin fluctuation theory of pairing in AFe2As2", Talk in Session G37: Electronic Structure Methods II (March 20, 2013), APS National Meeting, Baltimore.

Peter Hirschfeld, Andreas Kreisel, Yan Wang, Milan Tomic, Harald Jeschke, Anthony Jacko, Roser Valenti, **Thomas Maier**, Douglas Scalapino, "Pressure dependence of critical temperature of bulk FeSe from spin fluctuation theory" Talk in Session G37: Electronic Structure Methods II (March 20, 2013), APS National Meeting, Baltimore.

Markus Eisenbach, Junqi Yin, **Don M. Nicholson**, Ying Wai Li, "First principles calculation of finite temperature magnetism in Ni", Talk in Session C17: Magnetic Theory I (March 18, 2013), APS National Meeting, Baltimore.

Madhusudan Ojha, **Don M. Nicholson**, Takeshi Egami, "Ab-initio atomic level stresses in Cu-Zr crystal, liquid and glass phases", Talk in Session G42: Focus Session: Physics of Glasses and Viscous Liquids I (March 19, 2013), APS National Meeting, Baltimore.

Junqi Yin, Markus Eisenbach, **Don Nicholson**, "Spin-lattice coupling in BCC iron", Talk in Session T39: Metals Alloys and Metallic Structures (March 21, 2013), APS National Meeting, Baltimore.

German Samolyuk, Yuri Osetsky, Roger Stoller, **Don Nicholson**, George Malcolm Stocks, "The modification of core structure and Peierls barrier of 1/2$<111>$ screw dislocation in bcc Fe in presence of Cr solute atoms", Talk in Session T39: Metals Alloys and Metallic Structures (March 21, 2013), APS National Meeting, Baltimore.

**SIAM-CSE13**

February 25 - March 1

The CSMD had a strong showing at SIAM-CSE13, with over 25 presentations from division staff members. SIAM-CSE is a leading conference in computer science and mathematics, drawing thousands of researchers from across the globe and supported jointly by NSF and DOE. Division scientists organized eight different mini-symposiums with close to a hundred invited speakers in areas of modern libraries (Christopher Baker), climate (Kate Evans), nuclear simulations (Bobby Philip), kinetic theory (Cory Hauck), hybrid architecture linear algebra (Ed D'Azevedo), UQ and stochastic inverse problems (Clayton Webster), and Structural Graph Theory, Sparse Linear Algebra, and Graphical Models (Blair Sullivan).

## Seminars

**April 7, 2014** - Tom Scogland: *Runtime Adaptation for Autonomic Heterogeneous Computing*

Heterogeneity is increasing at all levels of computing, certainly with the rise in general purpose computing with GPUs in everything from phones to supercomputers. More quietly it is increasing with the rise of NUMA systems, hierarchical caching, OS noise, and a myriad of other factors. As heterogeneity becomes a fact of life at every level of computing, efficiently managing heterogeneous compute resources is becoming a critical task. In order to make the problem tractable we must develop methods and systems to allow software to adapt to the hardware it finds within a given node at runtime. The goal is to make the complex functions of heterogeneous computing autonomic, handling load balancing, memory coherence and other performance critical factors in the runtime. This talk will discuss my research into this area, including the design of a work-sharing construct for CPU and GPU resources in OpenMP and automated memory reshaping/re-mapping for locality.

Dr. Scogland is a candidate for a postdoctoral position with the Computer Science Research Group.

**April 4, 2014** - Alex McCaskey: *Effects of Electron-Phonon Coupling in Single-Molecule Magnet Transport Junctions Using a Hybrid Density Functional Theory and Model Hamiltonian Approach*

Recent experiments have shown that junctions consisting of individual single-molecule magnets (SMMs) bridged between two electrodes can be fabricated in three-terminal devices, and that the characteristic magnetic anisotropy of the SMMs can be affected by electrons tunneling through the molecule. Vibrational modes of the SMM can couple to electronic charge and spin degrees of freedom, and this coupling also influences the magnetic and transport properties of the SMM. The effect of electron-phonon coupling on transport has been extensively studied in small molecules, but not yet for junctions of SMMs. The goals of this talk are two-fold: to present a novel approach for studying the effects of this electron-phonon coupling in transport through SMMs that utilizes both density functional theory calculations and model Hamiltonian construction and analysis, and to present a software framework based on this hybrid approach for the simulation of transport across user-defined SMMs. The results of these simulations indicate a characteristic suppression of the current at low energies that is strongly dependent on the overall electron-phonon coupling strength and the number of molecular vibrational modes considered.

Mr. McCaskey is a candidate for a graduate position in the Computer Science Research Group.

**March 26, 2014** - Steven Wise: *Convergence of a Mixed FEM for a Cahn-Hilliard-Stokes System*

Co-Authors: Amanda Diegel and Xiaobing Feng

Abstract: In this talk I will describe a mixed finite element method for a modified Cahn-Hilliard equation coupled with a non-steady Darcy-Stokes flow that models phase separation and coupled fluid flow in immiscible binary fluids and di-block copolymer melts. I will focus both on numerical implementation issues for the scheme as well as the convergence analysis. The time discretization is based on a convex splitting of the energy of the equation. I will show that our scheme is unconditionally energy stable with respect to a spatially discrete analogue of the continuous free energy of the system and unconditionally uniquely solvable. We can show, in addition, that the phase variable is bounded in $L^\infty(0,T;L^\infty)$ and the chemical potential is bounded in $L^\infty(0,T;L^2)$, unconditionally in both two and three dimensions, for any finite final time $T$. In fact the bounds in such estimates grow only (at most) linearly in $T$. I will prove that these variables converge with optimal rates in the appropriate energy norms in both two and three dimensions. Finally, I will discuss some extensions of the scheme to approximate solutions for diffuse interface flow models with large differences in density.

Bio:

Steven Wise is an associate professor of mathematics at the University of Tennessee. He specializes in fast adaptive nonlinear algebraic solvers for numerical PDE, numerical analysis, and scientific computing more broadly. Before coming to the University of Tennessee, he was a postdoc and visiting assistant professor of mathematics and biomedical engineering at the University of California, Irvine. He earned a PhD in engineering physics from the University of Virginia in 2003.

**March 18, 2014** - Zhiwen Zhang: *A Dynamically Bi-Orthogonal Method for Time-Dependent Stochastic Partial Differential Equation*

We propose a dynamically bi-orthogonal method (DyBO) to study time dependent stochastic partial differential equations (SPDEs). The objective of our method is to exploit some intrinsic sparse structure in the stochastic solution by constructing the sparsest representation of the stochastic solution via a bi-orthogonal basis. It is well-known that the Karhunen-Loeve expansion minimizes the total mean squared error and gives the sparsest representation of stochastic solutions. However, the computation of the KL expansion could be quite expensive since we need to form a covariance matrix and solve a large-scale eigenvalue problem. In this talk, we derive an equivalent system that governs the evolution of the spatial and stochastic basis in the KL expansion. Unlike other reduced model methods, our method constructs the reduced basis on-the-fly without the need to form the covariance matrix or to compute its eigen-decomposition. We further present an adaptive strategy to dynamically remove or add modes, perform a detailed complexity analysis, and discuss various generalizations of this approach. Several numerical experiments will be provided to demonstrate the effectiveness of the DyBO method.

Bio:

Zhiwen Zhang is a postdoctoral scholar in the Department of Computing and Mathematical Sciences, California Institute of Technology. He graduated from the Department of Mathematical Sciences, Tsinghua University in 2011, where he was awarded the degree of Ph.D. in Applied Mathematics. From 2008 to 2009, he studied at the University of Wisconsin at Madison as a visiting student. His research interests lie in the applied analysis and numerical computation of problems arising from quantum chemistry, wave propagation, porous media, cell evolution, Bayesian updating, stochastic fluid dynamics and random heterogeneous media.
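For readers unfamiliar with the covariance-based Karhunen-Loeve (KL) computation that the abstract above describes as expensive, the following is a minimal Python sketch of that baseline step only. It is not the DyBO method. The synthetic random field, the function names `kl_expansion` and `synthetic_samples`, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def kl_expansion(samples, num_modes):
    """Truncated KL expansion of sampled fields.

    samples has shape (n_samples, n_points); each row is one realization.
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Forming the covariance matrix is the step that becomes costly on fine grids.
    cov = centered.T @ centered / samples.shape[0]
    # eigh returns eigenvalues in ascending order, so reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:num_modes]
    modes = eigvecs[:, order]   # spatial basis with orthonormal columns
    coeffs = centered @ modes   # stochastic coefficients, one row per sample
    return mean, eigvals[order], modes, coeffs


def synthetic_samples(n_samples=2000, n_points=64, seed=0):
    """Hypothetical random field built from three sine modes (illustration only)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    basis = np.stack([np.sin((k + 1) * np.pi * x) for k in range(3)])
    amplitudes = np.array([1.0, 0.5, 0.1])
    xi = rng.standard_normal((n_samples, 3))
    return (xi * amplitudes) @ basis


samples = synthetic_samples()
mean, lam, modes, coeffs = kl_expansion(samples, num_modes=2)
approx = mean + coeffs @ modes.T
mse = np.mean(np.sum((samples - approx) ** 2, axis=1))
total_variance = np.mean(np.sum((samples - mean) ** 2, axis=1))
print("retained variance fraction:", lam.sum() / total_variance)
print("mean squared truncation error:", mse)
```

In this sketch the mean squared truncation error equals the total variance minus the sum of the retained eigenvalues, which is the sense in which the KL basis is optimal for a given number of modes.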

**March 4, 2014** - David Seal: *Beyond the Method of Lines Formulation: Building Spatial Derivatives into the Temporal Integrator*

Abstract: High-order solvers for hyperbolic conservation laws often fall under two disparate categories. On one hand, the method of lines formulation starts by discretizing the spatial variables, and then a system of ODEs is solved using an appropriate time-integrator. On the other hand, Lax-Wendroff discretizations immediately convert Taylor series in time to discrete spatial derivatives. In this talk, we present generalizations of these methods including high-order discontinuous Galerkin (DG) methods based on multiderivative time-integrators, as well as high-order finite difference weighted essentially non-oscillatory (WENO) methods based on the Picard Integral Formulation (PIF) of the conservation law. Multiderivative time integrators are extensions of Runge-Kutta and Taylor methods. They reduce the overall storage required for a Runge-Kutta method, and they introduce flexibility to the Taylor series in time methods by allowing for new coefficients to be used at various stages. In the multiderivative DG method, "modified fluxes" are used to define high-order Riemann problems, which are similar to those defined in the generalized Riemann problem solvers incorporated in the Arbitrary DERivative (ADER) methods. The finite difference WENO method is based on a Picard Integral Formulation of the PDE, where we first integrate in time, and then work on discretizing the temporal integral. The present formulation is automatically mass conservative, and therefore it introduces the possibility of modifying finite difference fluxes for the purpose of accomplishing tasks such as positivity preservation, or reducing the number of expensive non-linear WENO reconstructions. For now, we present results for a single-step version of the PIF-WENO method which lends itself to incorporating adaptive mesh refinement technology. Results for one- and two-dimensional conservation laws are presented, and they indicate that the new methods compete well with current state of the art technology.
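The two classical categories contrasted in the abstract above can be illustrated on the linear advection equation $u_t + a u_x = 0$ with periodic boundaries. The sketch below is only a low-order illustration, not the speaker's multiderivative DG or PIF-WENO methods: it assumes a first-order upwind spatial discretization with Heun's method for the method of lines, and the textbook Lax-Wendroff update for the Taylor-series approach. All function names and parameter values are hypothetical.

```python
import numpy as np


def upwind_rhs(u, a, dx):
    """Semi-discrete right-hand side: first-order upwind for a > 0, periodic."""
    return -a * (u - np.roll(u, 1)) / dx


def mol_step(u, a, dx, dt):
    """Method of lines: discretize in space first, then apply Heun's RK2 in time."""
    k1 = upwind_rhs(u, a, dx)
    k2 = upwind_rhs(u + dt * k1, a, dx)
    return u + 0.5 * dt * (k1 + k2)


def lax_wendroff_step(u, a, dx, dt):
    """Lax-Wendroff: time derivatives in the Taylor series become spatial differences."""
    c = a * dt / dx
    up, um = np.roll(u, -1), np.roll(u, 1)
    return u - 0.5 * c * (up - um) + 0.5 * c * c * (up - 2.0 * u + um)


def advect(step, n=200, a=1.0, cfl=0.5, t_final=1.0):
    """Advect a sine wave for one period and return the max-norm error."""
    x = np.arange(n) / n
    dx = 1.0 / n
    steps = int(round(t_final / (cfl * dx / a)))
    dt = t_final / steps
    u0 = np.sin(2.0 * np.pi * x)
    u = u0.copy()
    for _ in range(steps):
        u = step(u, a, dx, dt)
    # After one full period the exact solution equals the initial data.
    return np.max(np.abs(u - u0))


mol_err = advect(mol_step)
lw_err = advect(lax_wendroff_step)
print("method of lines (upwind + Heun) error:", mol_err)
print("Lax-Wendroff error:", lw_err)
```

The accuracy gap here reflects the first-order spatial stencil chosen for the method-of-lines variant, not a property of the method-of-lines approach in general.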

**February 21, 2014** - Zhou Li: *Harnessing high-resolution mass spectrometry and high-performance supercomputing for quantitative characterization of a broad range of protein post-translational modifications in a natural microbial community*

Microbial communities populate and shape diverse ecological niches within natural environments. The physiology of organisms in natural consortia has been studied with community proteomics. However, little is known about how free-living microorganisms regulate protein activities through post-translational modifications (PTMs). Here, we harnessed high-performance mass spectrometry and supercomputing for identification and quantification of a broad range of PTMs (including hydroxylation, methylation, citrullination, acetylation, phosphorylation, methylthiolation, S-nitrosylation, and nitration) in microorganisms. Using an E. coli proteome as a benchmark, we identified more than 5,000 PTM events of diverse types and a large number of modified proteins that carried multiple types of PTMs. We applied this demonstrated approach to profiling PTMs in two growth stages of a natural microbial community growing in the acid mine drainage environment. We found that the multi-type, multi-site protein modifications are highly prevalent in free-living microorganisms. A large number of proteins involved in various biological processes were dynamically modified during the community succession, indicating that dynamic protein modification might play an important role in organismal response to changing environmental conditions. Furthermore, we found closely related, but ecologically differentiated bacteria harbored remarkably divergent PTM patterns between their orthologous proteins, implying that PTM divergence could be a molecular mechanism underlying their phenotypic diversities. We also quantified fractional occupancy for thousands of PTM events. The findings of this study should help unravel the role of PTMs in microbial adaptation, evolution and ecology.

**February 14, 2014** - Celia E. Shiau: *Probing fish-microbe interface for environmental assessment of clean energy*

To preserve wildlife and natural resources for future generations, we face the grand challenge of effectively assessing and predicting the impact of current and future energy use. My overall goal is to probe the microbiome and host-microbe interface of fish populations, in order to evaluate environmental stress on aquatic life and resources. Current understanding of aquatic microbes in fresh and salt water is centered on free-living bacteria (independent of a host). I will discuss my work on the experimentally tractable fish model (Danio rerio) that can be applied to investigate the interaction between microbiota, host health, and environmental toxicants (such as mercury and other metalloids), and the aims of my Liane Russell fellowship research program. The findings will provide a framework for studies of other fish species, leveraging advanced imaging, metagenomics, bioinformatics, and neutron scattering. The proposed study promises to inform the potential use of fish microbes to solve energy and environmental challenges, thereby providing means for critical assessment of global energy impact.

**February 6, 2014** - Susan Janiszewski: *3-connected, claw-free, generalized net-free graphs are hamiltonian*

Given a family $\mathcal{F} = \{H_1, H_2, \dots, H_k\}$ of graphs, we say that a graph $G$ is $\mathcal{F}$-free if $G$ contains no subgraph isomorphic to any $H_i$, $i = 1,2,\dots, k$. The graphs in the set $\mathcal{F}$ are known as *forbidden subgraphs*. The main goal of this dissertation is to further classify pairs of forbidden subgraphs that imply a 3-connected graph is hamiltonian. First, the number of possible forbidden pairs is reduced by presenting families of graphs that are 3-connected and not hamiltonian. Of particular interest is the graph $K_{1,3}$, also known as the *claw*, as we show that it must be included in any forbidden pair. Secondly, we show that 3-connected, $\{K_{1,3}, N_{i,j,0}\}$-free graphs are hamiltonian for $i,j \ne 0, i+j \le 9$ and 3-connected, $\{K_{1,3}, N_{3,3,3}\}$-free graphs are hamiltonian, where $N_{i,j,k}$, known as the *generalized net*, is the graph obtained by rooting vertex-disjoint paths of length $i$, $j$, and $k$ at the vertices of a triangle. These results, combined with previously known results, give a complete classification of generalized nets such that claw-free, net-free implies a 3-connected graph is hamiltonian.

**January 30, 2014** - Wei Guo: *High order Semi-Lagrangian Methods for Transport Problems with Applications to Vlasov Simulations and Global Transport*

The semi-Lagrangian (SL) scheme for transport problems is gaining popularity in the computational science community because of its attractive properties. For example, compared with the Eulerian approach, the SL scheme allows extra-large time steps by incorporating a characteristics-tracing mechanism, and hence achieves great computational efficiency. In this talk, we will introduce a family of dimensional-splitting high-order SL methods coupled with high-order finite difference weighted essentially non-oscillatory (WENO) procedures and finite element discontinuous Galerkin (DG) methods. By performing dimensional splitting, the multi-dimensional problem is decoupled into a sequence of 1-D problems, which are much easier to solve numerically in the SL setting. The proposed SL schemes are applied to the Vlasov model arising from plasma physics and to global transport problems based on the cubed-sphere geometry from the operational climate model. We further introduce the integral deferred correction (IDC) framework to reduce the dimensional splitting errors. The proposed algorithms have been extensively tested and benchmarked with classical problems in plasma physics such as Landau damping, two-stream instability, and Kelvin-Helmholtz instability, and with global transport problems on the cubed-sphere. This is joint work with Andrew Christlieb, Maureen Morton, Ram Nair and Jing-Mei Qiu.
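The characteristics-tracing idea can be illustrated with a minimal sketch for the 1-D constant-coefficient advection equation $u_t + a u_x = 0$ on a periodic grid. This is not the speaker's high-order WENO or DG scheme: it assumes first-order linear interpolation, and the function name `semi_lagrangian_step` and all parameters are hypothetical. It only shows why a time step with CFL number larger than one stays stable.

```python
import math

def semi_lagrangian_step(u, a, dt, dx):
    """Advance u_t + a*u_x = 0 by one step on a periodic grid.

    Each grid point x_i is traced back along its characteristic to
    x_i - a*dt, and the old solution is linearly interpolated there.
    (Linear interpolation is an assumption made for brevity.)
    """
    n = len(u)
    shift = a * dt / dx              # displacement in grid cells (the CFL number)
    new_u = [0.0] * n
    for i in range(n):
        foot = i - shift             # departure point in index space
        j = math.floor(foot)         # grid index just left of the departure point
        w = foot - j                 # interpolation weight in [0, 1)
        new_u[i] = (1.0 - w) * u[j % n] + w * u[(j + 1) % n]
    return new_u

n = 100
dx = 1.0 / n
u0 = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]

# CFL number a*dt/dx = 2.5, beyond the usual explicit Eulerian limit of 1.
u1 = semi_lagrangian_step(u0, a=1.0, dt=2.5 * dx, dx=dx)
```

Because every new value is a convex combination of two old values, the update cannot amplify the solution regardless of the time step. A dimensional-splitting method would apply such 1-D steps direction by direction.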

**January 28, 2014** - Jeff Haack: *Applications of computational kinetic theory*

Kinetic theory describes the evolution of a complex system of a large number of interacting particles. These models are used to describe systems where the characteristic scales for interaction between particles and characteristic length scales are similar. In this talk, I will discuss numerical computation of several applications of kinetic theory, including rarefied gas dynamics with applications towards re-entry, kinetic models for plasmas, and a biological model for swarm behavior. As kinetic models often involve a high dimensional phase space as well as an integral operator modeling particle interactions, simulations have been impractical in many settings. However, recent advances in massively parallel computing are very well suited to solving kinetic models, and I will discuss how these resources are used in computing kinetic models and new difficulties that arise when computing on these architectures.

**January 24, 2014** - Roman Lysecky: *Data-driven Design Methods and Optimization for Adaptable High-Performance Systems*

Research has demonstrated that runtime optimization and adaptation methods can achieve performance improvement over design-time optimization system implementations. Furthermore, modern computing applications require a large degree of configurability and adaptability to operate on a variety of data inputs where the characteristic of the data inputs may change over time. In this talk, we highlight two runtime optimization methods for adaptable computing systems. We first highlight the use of runtime profiling and system-level performance and power estimation methods for estimating the speedup and power consumption of dynamically reconfigurable systems. We evaluate the accuracy and fidelity of the online estimation framework for dynamic configuration of computational kernels with goals of both maximizing performance and minimizing system power consumption. We further present an overview of the design framework and runtime reconfiguration methods supporting data-adaptable reconfigurable systems. Data-adaptable reconfigurable systems enable a flexible runtime implementation in which a system can transition the execution of tasks between different execution modalities, e.g., hardware and software implementations, while simultaneously continuing to process data during the transition.

Bio:

Roman Lysecky is an Associate Professor of Electrical and Computer Engineering at the University of Arizona. He received his B.S., M.S., and Ph.D. in Computer Science from the University of California, Riverside in 1999, 2000, and 2005, respectively. His research interests focus on embedded systems, with emphasis on embedded system security, non-intrusive system observation methods for in-situ analysis of complex hardware and software behavior, runtime optimization methods, and design methods for precisely timed systems with applications in safety-critical and mobile health systems. He was awarded the Outstanding Ph.D. Dissertation Award from the European Design and Automation Association (EDAA) in 2006 for New Directions in Embedded Systems. He received a CAREER award from the National Science Foundation in 2009 and four Best Paper Awards from the ACM/IEEE International Conference on Hardware-Software Codesign and System Synthesis (CODES+ISSS), the ACM/IEEE Design Automation and Test in Europe Conference (DATE), the IEEE International Conference on Engineering of Computer-Based Systems (ECBS), and the International Conference on Mobile Ubiquitous Computing, Systems, Services (UBICOMM). He has coauthored five textbooks on VHDL, Verilog, C, C++, and Java programming. He is an inventor on one US patent. In 2008 and 2013, he received an award for Excellence at the Student Interface from the College of Engineering and the University of Arizona.

**January 21, 2014** - Tuoc Van Phan: *Some Aspects in Nonlinear Partial Differential Equations and Nonlinear Dynamics*

This talk contains two parts:

Part I: We discuss the Shigesada-Kawasaki-Teramoto system of cross-diffusion equations of two competing species in population dynamics. We show that if there is self-diffusion in one species and no cross-diffusion in the other, then the system has a unique smooth solution for all time in bounded domains of any dimension. We obtain this result by deriving global $W^{1,p}$-estimates of Calderón-Zygmund type for a class of nonlinear reaction-diffusion equations with self-diffusion. These estimates are achieved by employing the Caffarelli-Peral perturbation technique together with a new two-parameter scaling argument.

Part II: We study a class of nonlinear Schrödinger equations in one spatial dimension with a symmetric double-well potential. We derive and justify a normal form reduction of the nonlinear Schrödinger equation for a general pitchfork bifurcation of the symmetric bound state. We prove persistence of normal form dynamics for both supercritical and subcritical pitchfork bifurcations in the time-dependent solutions of the nonlinear Schrödinger equation over long but finite time intervals.

The talk is based on my joint work with Luan Hoang (Texas Tech University), Truyen Nguyen (University of Akron), and Dmitry Pelinovsky (McMaster University).

**January 17, 2014** - John Dolbow: *Recent advances in embedded finite element methods*

This seminar will present recent advances in an emerging class of embedded finite element methods for evolving interface problems in mechanics. By embedded, we refer to methods that allow for the interface geometry to be arbitrarily located with respect to the finite element mesh. This relaxation between mesh and geometry obviates the need for remeshing strategies in many cases and greatly facilitates adaptivity in others. The approach shares features with finite-difference methods for embedded boundaries, but within a variational setting that facilitates error and stability analysis.

We focus attention on a weighted form of Nitsche's method that allows interfacial conditions to be robustly enforced. Classically, Nitsche's method provides a means to weakly impose boundary conditions for Galerkin-based formulations. With regard to embedded interface problems, some care is needed to ensure that the method remains well behaved in varied settings ranging from interfacial configurations resulting in arbitrarily small elements to problems exhibiting large contrast. We illustrate how the weighting of the interfacial terms can be selected to both guarantee stability and to guard against ill-conditioning. Various benchmark problems for the method are then presented.

**January 16, 2014** - Aziz Takhirov: *Numerical analysis of the flows in Pebble Bed Geometries*

Flows in complex geometries intermediate between free flows and porous media flows occur in pebble bed reactors and other industrial processes. The Brinkman models have consistently shown that for simplified settings accurate prediction of essential flow features depends on the impossible problem of meshing the pores. We discuss a new model to understand the flow and its properties in these geometries.

**January 13, 2014** - Pablo Seleson: *Bridging Scales in Materials with Mesoscopic Models*

Complex systems are often characterized by processes occurring at different spatial and temporal scales. Accurate predictions of quantities of interest in such systems are often feasible only through multiscale modeling. In this talk, I will discuss the use of mesoscopic models as a means to bridge disparate scales in materials. Examples of mesoscopic models include nonlocal continuum models, based on integro-differential equations, that generalize classical continuum models based on partial differential equations. Nonlocal models possess length scales, which can be controlled for multiscale modeling. I will present two nonlocal models, peridynamics and nonlocal diffusion, and demonstrate how the inherent length scales in these models make it possible to bridge scales in materials.

**January 9, 2014** - Gung-Min Gie: *Motion of fluids in the presence of a boundary*

In most practical applications of fluid mechanics, it is the interaction of the fluid with the boundary that is most critical to understanding the behavior of the fluid. Physically important parameters, such as the lift and drag of a wing, are determined by the sharp transition the air makes from being at rest on the wing to flowing freely around the airplane near the wing. Mathematically, the behavior of such flows at small viscosity is modeled by the Navier-Stokes equations. In this talk, we discuss some recent results on the boundary layers of the Navier-Stokes equations under various boundary conditions.

**January 6, 2014** - Christine Klymko: *Centrality and Communicability Measures in Complex Networks: Analysis and Algorithms*

Complex systems are ubiquitous throughout the world, both in nature and within man-made structures. Over the past decade, large amounts of network data have become available and, correspondingly, the analysis of complex networks has become increasingly important. One of the fundamental questions in this analysis is to determine the most important elements in a given network. Measures of node importance are usually referred to as node centrality and measures of how well two nodes are able to communicate with each other are referred to as the communicability between pairs of nodes. Many measures of node centrality and communicability have been proposed over the years. Here, we focus on the analysis and computation of centrality and communicability measures based on matrix functions. First, we examine a node centrality measure based on the notion of total communicability, defined in terms of the row sums of the exponential of the adjacency matrix of the network. We argue that this is a natural metric for ranking nodes in a network, and we point out that it can be computed very rapidly even in the case of large networks. Furthermore, we propose a measure of the total network communicability, based on the total sum of node communicabilities, as a useful measure of the connectivity of the network as a whole. Next, we compare various parameterized centrality rankings based on the matrix exponential and matrix resolvent with degree and eigenvector centrality. The centrality measures we consider are exponential and resolvent subgraph centrality (defined in terms of the diagonal entries of the matrix exponential and matrix resolvent, respectively), total communicability, and Katz centrality (defined in terms of the row sums of the matrix resolvent). 
We demonstrate an analytical relationship between these rankings and the degree and subgraph centrality rankings, which helps to explain the observed robustness of these rankings on many real-world networks, even though the scores produced by the centrality measures are not stable.
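As a small illustration of the definitions above, the following sketch computes total communicability (row sums of the exponential of the adjacency matrix) and exponential subgraph centrality (its diagonal entries) for a tiny star graph. It assumes a dense truncated Taylor series for the matrix exponential, which is only sensible for toy examples and is not the fast method referred to in the abstract. The helper names `mat_mul` and `mat_exp` are hypothetical.

```python
def mat_mul(a, b):
    """Dense product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30):
    """Truncated Taylor series exp(A) = I + A + A^2/2! + ... (toy sizes only)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Star graph: node 0 is the hub, nodes 1 to 3 are leaves.
adjacency = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

exp_a = mat_exp(adjacency)
total_communicability = [sum(row) for row in exp_a]      # row sums of exp(A)
subgraph_centrality = [exp_a[i][i] for i in range(4)]    # diagonal of exp(A)
network_communicability = sum(total_communicability)     # sum over all nodes
```

Both measures rank the hub above the leaves here, as expected. For large sparse networks one would avoid forming the full exponential and instead approximate its action on the vector of all ones.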

**December 19, 2013** - Adam Larios: *New Techniques for Large-Scale Parallel Turbulence Simulations at High Reynolds Numbers*

Two techniques have recently been developed to handle large-scale simulations of turbulent flows. The first is a nonlinear, LES-type viscosity, which is based on the numerical violation of the local energy balance of the Navier-Stokes equations. This technique enjoys a numerical dissipation which remains vanishingly small in regions where the solution is smooth, only damping the flow in regions of numerical shock, allowing for increased accuracy at reduced computational cost. The second is a direction-splitting technique for projection methods, which unlocks new parallelism previously unexploited in fluid flows, and enables very fast, large-scale turbulence simulations.

**December 16, 2013** - Tuoc Van Phan: *Some Aspects in Nonlinear Partial Differential Equations and Nonlinear Dynamics*

This talk contains two parts:

Part I: We discuss the Shigesada-Kawasaki-Teramoto system of cross-diffusion equations of two competing species in population dynamics. We show that if there is self-diffusion in one species and no cross-diffusion in the other, then the system has a unique smooth solution for all time in bounded domains of any dimension. We obtain this result by deriving global $W^{1,p}$-estimates of Calderón-Zygmund type for a class of nonlinear reaction-diffusion equations with self-diffusion. These estimates are achieved by employing the Caffarelli-Peral perturbation technique together with a new two-parameter scaling argument.

Part II: We study a class of nonlinear Schrödinger equations in one spatial dimension with a symmetric double-well potential. We derive and justify a normal form reduction of the nonlinear Schrödinger equation for a general pitchfork bifurcation of the symmetric bound state. We prove persistence of normal form dynamics for both supercritical and subcritical pitchfork bifurcations in the time-dependent solutions of the nonlinear Schrödinger equation over long but finite time intervals.

The talk is based on my joint work with Luan Hoang (Texas Tech University), Truyen Nguyen (University of Akron), and Dmitry Pelinovsky (McMaster University).

**December 13, 2013** - Rich Lehoucq: *A Computational Spectral Graph Theory Tutorial*

My presentation considers the research question of whether existing algorithms and software for the large-scale sparse eigenvalue problem can be applied to problems in spectral graph theory. I first provide an introduction to several problems involving spectral graph theory. I then provide a review of several different algorithms for the large-scale eigenvalue problem and briefly introduce the Anasazi package of eigensolvers.

**December 10, 2013** - Jingwei Hu: *Fast algorithms for quantum Boltzmann collision operators*

The quantum Boltzmann equation describes the non-equilibrium dynamics of a quantum system consisting of bosons or fermions. The most prominent feature of the equation is a high-dimensional integral operator modeling particle collisions, whose nonlinear and nonlocal structure poses a great challenge for numerical simulation. I will introduce two fast algorithms for the quantum Boltzmann collision operator. The first one is a quadrature based solver specifically designed for the collision operator in reduced energy space. Compared to cubic complexity of direct evaluation, our algorithm runs in only linear complexity (optimal up to a logarithmic factor). The second one accelerates the computation of the full phase space collision operator. It is a spectral algorithm based on a special low-rank decomposition of the collision kernel. Numerical examples including an application to semiconductor device modeling are presented to illustrate the efficiency and accuracy of proposed algorithms.

**December 6, 2013** - Jeongnim Kim: *Analysis of QMC Applications on Petascale Computers*

Continuum Quantum Monte Carlo (QMC) has proved to be an invaluable tool for predicting the properties of matter from fundamental principles. The multiple forms of parallelism afforded by QMC algorithms and high compute-to-communication ratio make them ideal candidates for acceleration in the multi/many-core paradigm, as demonstrated by the performance of QMCPACK on various high-performance computing (HPC) platforms including Titan (Cray XK7) and Mira (IBM BlueGene Q).

The changes expected on future architectures - orders of magnitude higher parallelism, hierarchical memory and communication, and heterogeneous nodes - pose great challenges to application developers but also present opportunities to transform them to tackle new classes of problems. This talk presents core QMC algorithms and their implementations in QMCPACK on the HPC systems of today. The speaker will discuss the performance of typical QMC workloads to elucidate the critical issues to be resolved for QMC to fully exploit increasing computing powers of forthcoming HPC systems.

**December 3, 2013** - Terry Haut: *Advances on an asymptotic parallel-in-time method for highly oscillatory PDEs*

In this talk, I will first review a recent time-stepping algorithm for nonlinear PDEs that exhibit fast (highly oscillatory) time scales. PDEs of this form arise in many applications of interest, and in particular describe the dynamics of the ocean and atmosphere. The scheme combines asymptotic techniques (which are inexpensive but can have insufficient accuracy) with parallel-in-time methods (which, alone, can yield minimal speedup for equations that exhibit rapid temporal oscillations). Examples are presented on the (1D) rotating shallow water equations in a periodic domain, which demonstrate significant parallel speedup is achievable.

In order to implement this time-stepping method for general spatial domains (in 2D and 3D), a key component involves applying the exponential of skew-Hermitian operators. To this end, I will next present a new algorithm for doing so. This method can also be used for solving wave propagation problems, which is of independent interest. This scheme has several advantages over standard methods, including the absence of any stability constraints in relation to the spatial discretization, and the ability to parallelize the computation in the time variable over as many characteristic wavelengths as resources permit (in addition to any spatial parallelization). I will also present examples on the linear 2D shallow water equations, as well as the 2D (variable coefficient) wave equation. In these examples, this method (in serial) is 1-2 orders of magnitude faster than both RK4 and the use of Chebyshev polynomials.
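The absence of a stability constraint can be seen in a toy setting. The exponential of a skew-Hermitian operator is unitary, so applying it exactly preserves the norm of the state for any step size. The sketch below is not the speaker's algorithm: it assumes the simplest real skew-symmetric generator $L = \begin{pmatrix} 0 & -\omega \\ \omega & 0 \end{pmatrix}$, whose exponential is a plane rotation, and compares it with forward Euler. All names and parameter values are hypothetical.

```python
import math

def apply_exponential(state, omega, dt):
    """Apply exp(dt * L) for L = [[0, -omega], [omega, 0]], a rotation by omega*dt."""
    c, s = math.cos(omega * dt), math.sin(omega * dt)
    x, y = state
    return (c * x - s * y, s * x + c * y)

def forward_euler(state, omega, dt):
    """Apply (I + dt * L), which amplifies the norm by sqrt(1 + (omega*dt)^2)."""
    x, y = state
    return (x - dt * omega * y, y + dt * omega * x)

omega, dt, steps = 50.0, 0.1, 100   # omega*dt = 5: a deliberately large step
exact = euler = (1.0, 0.0)
for _ in range(steps):
    exact = apply_exponential(exact, omega, dt)
    euler = forward_euler(euler, omega, dt)

exact_norm = math.hypot(*exact)     # stays at 1 up to rounding
euler_norm = math.hypot(*euler)     # grows without bound
```

The exact propagator keeps the oscillation bounded for any step size, while the explicit update blows up unless the step resolves the fast time scale.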

**December 3, 2013** - Galen Shipman: *The Compute and Data Environment for Science (CADES)*

In this talk I will discuss ORNL's Compute and Data Environment for Science. The Compute and Data Environment for Science (CADES) provides R&D with a flexible and elastic compute and data infrastructure. The initial deployment consists of over 5 petabytes of high-performance storage, nearly half a petabyte of scalable NFS storage, and over 1000 compute cores integrated into a high performance ethernet and InfiniBand network. This infrastructure, based on OpenStack, provides a customizable compute and data environment for a variety of use cases including large-scale omics databases, data integration and analysis tools, data portals, and modeling/simulation frameworks. These services can be composed to provide end-to-end solutions for specific science domains.

Galen Shipman is the Data Systems Architect for the Computing and Computational Sciences Directorate and Director of the Compute and Data Environment for Science at Oak Ridge National Laboratory (ORNL). He is responsible for defining and maintaining an overarching strategy and infrastructure for data storage, data management, and data analysis spanning from research and development to integration, deployment and operations for high-performance and data-intensive computing initiatives at ORNL. His current work includes addressing many of the data challenges of major facilities such as those of the Spallation Neutron Source (Basic Energy Sciences) and major data centers focusing on Climate Science (Biological and Environmental Research).

**December 2, 2013** - Wei Ding: *Klonos: A Similarity Analysis-Based Tool for Software Porting in High-Performance Computing*

Porting applications to a new system is a nontrivial job in the HPC field. It is a very time-consuming, labor-intensive process, and the quality of the results will depend critically on the experience of the experts involved. In order to ease the porting process, a methodology is proposed to address an important aspect of software porting that receives little attention, namely, planning support. When a scientific application consisting of many subroutines is to be ported, the selection of key subroutines greatly impacts the productivity and overall porting strategy, because these subroutines may represent a significant feature of the code in terms of functionality, code structure, or performance. They may also serve as indicators of the difficulty and amount of effort involved in porting a code to a new platform. The proposed methodology is based on the idea that a set of similar subroutines can be ported with similar strategies and result in ports of similar quality. By viewing subroutines as data and operator sequences, analogous to DNA sequences, various bioinformatics techniques may be used to conduct the similarity analysis of subroutines while avoiding the NP-complete complexities of other approaches. Other code metrics and cost-model metrics have been adapted for similarity analysis to capture internal code characteristics. Based on those similarity analyses, "Klonos," a tool for software porting, has been created. Experiments show that Klonos is very effective at providing a systematic porting plan that guides users to reuse similar porting strategies for similar code regions.
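The idea of comparing subroutines as sequences can be sketched with a standard global sequence alignment (Needleman-Wunsch) score. The abstract does not say which bioinformatics technique Klonos uses, so this is only an assumed stand-in. The operator sequences, the subroutine names, and the scoring values below are all hypothetical.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of two token sequences."""
    prev = [j * gap for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap]                          # align a[:i] against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

# Hypothetical operator sequences extracted from three subroutines.
saxpy = ["load", "load", "mul", "add", "store"]
scale = ["load", "mul", "store"]
reduce_ = ["load", "add", "branch"]

score_scale = alignment_score(saxpy, scale)
score_reduce = alignment_score(saxpy, reduce_)
most_similar = "scale" if score_scale > score_reduce else "reduce_"
```

A higher score means the two subroutines share more of their operator sequence in order, so under the methodology's premise they would be candidates for the same porting strategy. The dynamic program runs in time proportional to the product of the sequence lengths.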

**November 20, 2013** - Chao Yang: *Numerical Algorithms for Solving Nonlinear Eigenvalue Problems in Electronic Structure Calculation*

The Kohn-Sham density functional theory (KSDFT) is the most widely used theory for studying electronic properties of molecules and solids. The main computational problem in KSDFT is a nonlinear eigenvalue problem in which the matrix Hamiltonian is a function of a number of eigenvectors associated with smallest eigenvalues. The problem can also be formulated as a constrained energy minimization problem or a nonlinear equation in which the unknown ground state electron density satisfies a fixed point map. Significant progress has been made in the last few years on understanding the mathematical properties of this class of problems. Efficient and reliable numerical algorithms have been developed to accelerate the convergence of nonlinear solvers. New methods have also been developed to reduce the computational cost in each step of the iterative solver. We will review some of these developments and discuss additional challenges in large-scale electronic structure calculations.

**November 15, 2013** - Christian Straube: *Simulation of HPDC Infrastructure Attributes*

High Performance Distributed Computing (HPDC) infrastructures use several data centers, High Performance Computing (HPC) and distributed systems, each built from manifold (often heterogeneous) compute, storage, interconnect, and other specialized sub components to provide their capabilities, i.e. well-defined functionality that is exposed to a user or application. Capabilities' quality can be described by attributes, e.g., performance, energy efficiency, or reliability. Hardware-related modifications, such as clock rate adaptation or interconnect throughput improvement, often induce two groups of effects onto these attributes: the (by definition) positive intended effects and the mostly negative but unavoidable side effects. For instance, increasing a typical HPDC infrastructure's redundancy to address short-time breakdown and to improve reliability (positive intended effect), simultaneously increases energy consumption and degrades performance due to redundancy overhead (negative side effects).

In this talk, I present Predictive Modification Effect Analysis (PMEA) that aims at avoiding harmful execution and costly but spare modification exploration by investigating in advance whether the (negative) side effects on attributes will outweigh the (positive) intended effects. The talk covers the fundamental concepts and basic ideas of PMEA and presents its underlying model. The model is straightforward and fosters fast development, even for complex HPDC infrastructures; it handles individual and open sets of attributes and their calculations, and it addresses effect cascading through the entire HPC infrastructure. Additionally, I will present a prototype of a simulation tool and describe some selected features in detail.

Bio:

Christian Straube is a Computer Science Ph.D. student at the Ludwig-Maximilians-University (LMU) in Munich, Germany since January 2012. His research interests include HPDC infrastructure and data center analysis, in particular planning, modification justification, as well as effect outweighing and cascading. During his time as Ph.D. student, he worked several months at the Leibniz Supercomputing Center, which operates the SuperMUC, a three Petaflop/s system that applies warm-water cooling. Prior to joining LMU as a Ph.D. student, Christian worked for several years in industry and academia as software engineer and project manager. He ran his own software engineering company for 10 years, and was (co-) founder of several IT related start-ups. He received a best paper award for a conference contribution to INFOCOMP 2012 and was subsequently invited as technical program member of INFOCOMP 2013. Christian holds a Diploma with Distinction in Computer Science from Ludwig-Maximilians-University in Munich with a minor in Medicine.

**November 12, 2013** - Surya R. Kalidindi: *Data Science and Cyberinfrastructure Enabled Development of Advanced Materials*

Materials with enhanced performance characteristics have served as critical enablers for the successful development of advanced technologies throughout human history, and have contributed immensely to the prosperity and well-being of various nations. Although the core connections between the material's internal structure (i.e. microstructure), its evolution through various manufacturing processes, and its macroscale properties (or performance characteristics) in service are widely acknowledged to exist, establishing this fundamental knowledge base has proven effort-intensive, slow, and very expensive for a number of candidate material systems being explored for advanced technology applications. It is anticipated that the multi-functional performance characteristics of a material are likely to be controlled by a relatively small number of salient features in its microstructure. However, cost-effective validated protocols do not yet exist for fast identification of these salient features and establishment of the desired core knowledge needed for the accelerated design, manufacture and deployment of new materials in advanced technologies. The main impediment arises from lack of a broadly accepted framework for a rigorous quantification of the material's microstructure, and objective (automated) identification of the salient features in the microstructure that control the properties of interest.

Microstructure Informatics focuses on the development of data science algorithms and computationally efficient protocols capable of mining the essential linkages in large microstructure datasets (both experimental and modeling), and building robust knowledge systems that can be readily accessed, searched, and shared by the broader community. Given the nature of the challenges faced in the design and manufacture of new advanced materials, this emerging interdisciplinary field is ideally positioned to produce a major transformation in the current practices used by materials scientists and engineers. The novel data science tools produced by this emerging field promise to significantly accelerate the design and development of new advanced materials through their increased efficacy in gleaning and blending the disparate knowledge and insights hidden in "big data" gathered from multiple sources (including both experiments and simulations). This presentation outlines specific strategies for data science enabled development of advanced materials, and illustrates key components of the proposed overall strategy with examples.

**November 11, 2013** - Hermann Härtig: *A fast and fault tolerant microkernel-based system for exa-scale computing (FFMK)*

FFMK is a recently started project funded by DFG's Exascale-Software program. It addresses three key scalability obstacles expected in future exa-scale systems: the vulnerability to system failures due to transient or permanent failures, the performance losses due to imbalances and the noise due to unpredictable interactions between HPC applications and the operating system. To this end, we adapt and integrate well-proven technologies including:

- Microkernel-based operating systems (L4) to eliminate operating system noise impacts of feature-heavy all-in-one operating systems and to make kernel influences more deterministic and predictable,
- Erasure-code protected on-node checkpointing to provide a fast checkpoint and restart mechanism capable of keeping up with worsening mean-time between failures (MTBF), and
- Mathematically sound management system and load balancing algorithms (Mosix) to adjust the system to the highly dynamic and wide variety of requirements for today's and future HPC applications.

FFMK will combine Linux running in a light-weight virtual machine with a special-purpose component for MPI, both running side by side on L4. The objective is to build a fluid self-organizing platform for applications that require scaling up to exa-scale performance. The talk will explain assumptions and overall architecture of FFMK and continue with presenting a number of design decisions the team is currently facing. FFMK is a cooperation between Hebrew University's MosiX team, the HPC centers of Berlin and Dresden (ZIB, ZIH) and TU Dresden's operating systems group.

Bio:

After having received his PhD from Karlsruhe University on an SMP-related topic, Hermann Härtig led a team at the German National Research Center (GMD) to build BirliX, a Unix lookalike designed to address high security requirements. He then moved to TU Dresden to lead the operating systems chair. His team was among the pioneers in building microkernels of the L4 family (Fiasco, Nova) and systems based on L4 (L4RE, DROPS, NIZZA). L4RE and Fiasco form the OS basis of the SIMKO 3 smart phone. Hermann Härtig is now PI for FFMK.

**October 17, 2013** - Marta D'Elia: *Fractional differential operators on bounded domains as special cases of nonlocal diffusion operators*

We analyze a nonlocal diffusion operator having as special cases the fractional Laplacian and fractional differential operators that arise in several applications, e.g. jump processes. In our analysis, a nonlocal vector calculus is exploited to define a weak formulation of the nonlocal problem. We demonstrate that the solution of the nonlocal equation converges to the solution of the fractional Laplacian equation on bounded domains as the nonlocal interactions become infinite. We also introduce Galerkin finite element discretizations of the nonlocal weak formulation and we derive a priori error estimates. Through several numerical examples we illustrate the theoretical results and we show that by solving the nonlocal problem it is possible to obtain accurate approximations of the solutions of fractional differential equations circumventing the problem of treating infinite-volume constraints.

**October 15, 2013** - Tommy Janjusic: *Framework for Evaluating Dynamic Memory Allocators including a new Equivalence Class based Cache-Conscious Dynamic Memory Allocator*

Software applications' performance is hindered by a variety of factors, most notably by the well-known CPU-memory speed gap (often known as the memory wall). This results in the CPU sitting idle while it waits for data to be brought from memory to processor caches. The addressing used by caches causes non-uniform accesses to various cache sets. The non-uniformity has several causes, including how different objects are accessed by the code and how the data objects are located in memory. Memory allocators determine where dynamically created objects are placed, thus defining addresses and their mapping to cache locations. It is important to evaluate how different allocators behave with respect to the localities of the created objects. Most allocators use a single attribute of an object, its size, in making allocation decisions. Additional attributes, such as the placement with respect to other objects or a specific cache area, may lead to better use of cache memories. This talk discusses a framework that allows for the development and evaluation of new memory allocation techniques. At the root of the framework is a memory tracing tool called Gleipnir, which provides very detailed information about every memory access and relates it back to source-level objects. Using the traces from Gleipnir, we extended a commonly used cache simulator to generate detailed cache statistics (per function, per data object, per cache line) and to identify specific data objects that conflict with each other. The utility of the framework is demonstrated with a new memory allocator known as an equivalence class allocator. The new allocator allows users to specify the cache sets, in addition to object size, where the objects should be placed. We compare this new allocator with two well-known allocators, viz., Doug\_Lea and Pool allocators.
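
How an object's address determines its cache set can be sketched in a few lines. The cache geometry below (64-byte lines, 64 sets) and the function names are assumptions chosen for illustration; they are not taken from Gleipnir or from the allocator described in the talk.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # number of cache sets, e.g. a 32 KiB 8-way cache (assumed)

def cache_set(address):
    """Index of the cache set that a byte address maps to."""
    return (address // LINE_SIZE) % NUM_SETS

def set_histogram(addresses):
    """Count how many of the given addresses fall into each cache set."""
    counts = [0] * NUM_SETS
    for address in addresses:
        counts[cache_set(address)] += 1
    return counts

# Objects placed 4096 bytes apart all compete for the same set ...
conflicting = set_histogram(i * 4096 for i in range(64))
# ... while padding each object by one cache line spreads them over all sets.
spread = set_histogram(i * (4096 + LINE_SIZE) for i in range(64))
```

An allocator that only looks at object size can easily produce the first layout; one that is aware of cache sets can steer objects toward the second.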

**October 8, 2013** - Sophie Blondel: *NAT++: An analysis software for the NEMO experiment*

The NEMO 3 detector aims to prove that the neutrino is a Majorana particle (i.e. identical to the antineutrino). It is mainly composed of a calorimeter and a wire chamber, the former measuring the time and energy of a particle, and the latter reconstructing its track. NEMO 3 has taken data for 5 effective years with an event trigger rate of ~5 Hz, resulting in a total of 10e8 events to analyze. A C++-based software, called NAT++, was created to calibrate and analyze these events. The analysis is mainly based on a time of flight calculation which will be the focus of this presentation. Supplementing this classic analysis, a new tool named gamma-tracking has been developed in order to improve the reconstruction of the gamma energy deposits in the detector. The addition of this tool in the analysis pipeline leads to an increase of 30% of statistics in certain desired channels.

**September 30, 2013** - Eric Barton: *Fast Forward Storage and Input/Output (I/O)*

Conflicting pressures drive the requirements for I/O and Storage at Exascale. On the one hand, an explosion is anticipated, not only in the size of scientific data models but also in their complexity and in the volume of their attendant metadata. These models require workflows that integrate analysis and visualization and new object-oriented I/O Application Programming Interfaces (APIs) to make application development tractable and allow compute to be moved to the data or data to the compute as appropriate. On the other hand, economic realities driving the architecture and reliability of the underlying hardware will push the limits on horizontal scale, introduce unavoidable jitter and make failure the norm. The I/O system will have to handle these as transparently as possible while providing efficient, sustained and predictable performance. This talk will describe the research underway in the Department of Energy (DOE) Fast Forward Project to prototype a complete Exascale I/O stack including at the top level, an object-oriented I/O API based on HDF5, in the middle, a Burst Buffer and data layout optimizer based on PLFS (A Checkpoint Filesystem for Parallel Applications) and at the bottom, DAOs (Data Access Objects) - transactional object storage based on Lustre.

**September 25, 2013** - James Beyer: *OPENMP vs OPENACC*

A brief introduction to two accelerator programming directive sets with a common heritage: OpenACC 2.0 and OpenMP 4.0. After introducing the two directive sets, a side by side comparison of available features along with code examples will be presented to help developers understand their options as they begin programming for both Nvidia and Intel accelerated machines.

**September 25, 2013** - Michael Wolfe: *OPENACC 2.X AND BEYOND*

The OpenACC API is designed to support high-level, performance portable, programming across a range of host+accelerator target systems. This presentation will start with a short discussion of that range, which provides a context for the features and limitations of the specification. Some important additions that were included in OpenACC 2.0 will be highlighted. New features currently under discussion for future versions of the OpenACC API and a summary of the expected timeline will be presented.

**September 23, 2013** - Jun Jia: *Accelerating time integration using spectral deferred correction*

In this talk, we illustrate how to use the spectral deferred correction (SDC) to improve the time integration for scientific simulations. The SDC method combines a Picard integral formulation of the error equation, spectral integration and a user chosen low-order time marching method to form stable methods with arbitrarily high formal order of accuracy in time. The method could be either explicit or implicit, and it also provides the ability to adopt operator splitting while maintaining high formal order. At the end of the talk, we will show some applications using this technique.
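
A minimal sketch of one SDC time step may help make the description concrete. It assumes a scalar ODE, three equispaced nodes per step (which coincide with the three-point Gauss-Lobatto nodes), and explicit Euler as the low-order marching method; the function name `sdc_step` and these choices are illustrative, not the speaker's implementation.

```python
import math

def sdc_step(f, t0, y0, h, sweeps):
    """Advance y' = f(t, y) from t0 to t0 + h with explicit-Euler-based SDC."""
    ts = [t0, t0 + 0.5 * h, t0 + h]
    # Predictor: explicit Euler from node to node.
    y = [y0, 0.0, 0.0]
    for m in range(2):
        y[m + 1] = y[m] + 0.5 * h * f(ts[m], y[m])
    for _ in range(sweeps):
        F = [f(t, v) for t, v in zip(ts, y)]
        # Spectral integration: node-to-node integrals of the quadratic
        # interpolant of f through the three nodes.
        integrals = [h * (5 * F[0] + 8 * F[1] - F[2]) / 24,
                     h * (-F[0] + 8 * F[1] + 5 * F[2]) / 24]
        # Correction sweep: Euler applied to the error equation in Picard form.
        y_new = [y0, 0.0, 0.0]
        for m in range(2):
            y_new[m + 1] = (y_new[m]
                            + 0.5 * h * (f(ts[m], y_new[m]) - F[m])
                            + integrals[m])
        y = y_new
    return y[2]

def integrate(f, y0, t_end, steps, sweeps):
    """Apply sdc_step repeatedly over [0, t_end]."""
    h, t, y = t_end / steps, 0.0, y0
    for _ in range(steps):
        y = sdc_step(f, t, y, h, sweeps)
        t += h
    return y

decay = lambda t, y: -y
error_euler = abs(integrate(decay, 1.0, 1.0, 10, sweeps=0) - math.exp(-1.0))
error_sdc = abs(integrate(decay, 1.0, 1.0, 10, sweeps=3) - math.exp(-1.0))
```

With `sweeps=0` the step reduces to plain explicit Euler on the sub-intervals; each correction sweep is expected to raise the formal order by one until the accuracy of the underlying quadrature is reached.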

**September 19, 2013** - Kenny Gross: *Energy Aware Data Center (EADC) Innovations: Save Energy, Boost Performance*

The global electricity consumption for enterprise and high-performance computing data centers continues to grow much faster than Moore's Law as data centers push into emerging markets, and as developed countries see explosive growth in computing demand as well as supraexponential growth in demand for exabyte (and now zettabyte) storage systems. The US DOE reported that data centers now consume 38 gigawatts of electricity worldwide, a number that is growing exponentially even during times of global economic slowdowns. Oracle has developed a suite of novel algorithmic innovations that can be applied nonintrusively to any IT server and substantially reduce the energy usage and thermal dissipation for the IT assets (saving additional energy for the data center HVAC systems), while significantly boosting performance (and hence Return-On-Assets) for the IT assets, thereby avoiding additional server purchases (that would consume more energy). The key enabler for this suite of algorithmic innovations is Oracle's Intelligent Power Monitoring (IPM) telemetry harness (implemented in software, with no hardware modifications anywhere in the data center). IPM, when coupled with advanced pattern recognition, identifies and quantifies three significant nonlinear (heretofore 'invisible') energy-wastage mechanisms that are present in all enterprise and HPC computing assets today, including in low-PUE high-efficiency data centers: 1) leakage power in the CPUs (grows exponentially with CPU temperature), 2) aggregate fan-motor power inside the servers (grows with the cubic power of fan RPMs), and 3) substantial degradation of server energy efficiency by low-level ambient vibrations in the data center racks.
This presentation shows how continuous system internal telemetry, coupled with advanced pattern recognition technology that was developed for nuclear reactor applications by the presenter and his team at Argonne National Lab in the 1990s, is significantly cutting energy utilization while boosting performance for enterprise and HPC computing assets.
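
The cubic dependence of fan power on fan speed mentioned above is worth a quick calculation. The sketch below assumes an ideal cubic law and made-up rated values (10,000 RPM, 20 W); the function name and the numbers are illustrative and do not come from the talk.

```python
def fan_power(rpm, rated_rpm, rated_watts):
    """Fan-motor power under an assumed ideal cubic law: P is proportional to RPM^3."""
    return rated_watts * (rpm / rated_rpm) ** 3

# Hypothetical fan rated at 20 W when spinning at 10,000 RPM.
full_speed = fan_power(10_000, 10_000, 20.0)
reduced = fan_power(8_000, 10_000, 20.0)
saving_fraction = 1.0 - reduced / full_speed
```

Under this assumption, slowing a fan by 20% cuts its power draw roughly in half, which is why fan speed is such a sensitive control knob.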

Speaker Bio:

Kenny Gross is a Distinguished Engineer for Oracle and team leader for the System Dynamics Characterization and Control team in Oracle's Physical Sciences Research Center in San Diego. Kenny specializes in advanced pattern recognition, continuous system telemetry, and dynamic system characterization for improving the reliability, availability, and energy efficiency of enterprise computing systems and for the datacenters in which the systems are deployed. Kenny has 220 US patents issued and others pending, 180 scientific publications, and was awarded a 1998 R&D 100 Award for one of the top 100 technological innovations of that year, for an advanced statistical pattern recognition technique that was originally developed for nuclear plant applications and is now being used for a variety of applications to improve the quality-of-service, availability, and optimal energy efficiency for enterprise and HPC computer servers. Kenny earned his Ph.D. in nuclear engineering from the U. of Cincinnati in 1977.

**September 17, 2013** - Damien Lebrun-Grandie: *Simulation of thermo-mechanical contact between fuel pellets and cladding in UO2 nuclear fuel rods*

As the fission process heats up the fuel rods, UO2 pellets stacked on top of each other swell both radially and axially, while the surrounding Zircaloy cladding creeps down, so that cladding and pellet eventually come into contact. This exacerbates chemical degradation of the protective cladding, and the resulting stresses may enable rapid propagation of cracks and thus threaten the integrity of the clad. Accordingly, pellet-cladding interaction has established itself as a major concern in fuel rod design and reactor core operation in light water reactors. Accurately modeling fuel behavior is challenging because the mechanical contact problem strongly depends on the temperature distribution, and the coupled pellet-cladding heat transfer problem, in turn, is affected by changes in geometry induced by deformation of the bodies and by stresses generated at the contact interface.

Our work focuses on active set strategies to determine the actual contact area in high-fidelity coupled physics fuel performance codes. The approach consists of two steps. In the first, we determine the boundary region on conventional finite element meshes where the contact conditions shall be enforced to prevent objects from occupying the same space. For this purpose, we developed and implemented an efficient parallel search algorithm for detecting mesh inter-penetration and vertex/mesh overlap. The second step deals with solving the mechanical equilibrium while accounting for the contact conditions computed in the first step. To do so, we developed a modified version of the multi-point constraint (MPC) strategy. While the original algorithm was restricted to the Jacobi preconditioned conjugate gradient method, our MPC algorithm works with any Krylov solver (and thus frees us from the symmetry requirement). Furthermore, it does not place any restriction on the preconditioner used.

The multibody thermo-mechanical contact problem is tackled using modern numerics, with higher-order finite elements and a Newton-based monolithic strategy to handle both nonlinearities (arising from the contact condition as well as from the temperature dependence of the fuel thermal conductivity, for instance) and coupling between the various physics components (gap conductance sensitive to the clad-pellet distance, thermal expansion coefficient or Young's modulus affected by temperature changes, etc.).

We will provide numerical examples for single-body and multibody contact problems to demonstrate how the method performs.

**September 5, 2013** - Jared Saia: *How to Build a Reliable System Out of Unreliable Components*

The first part of this talk will survey several decades of work on designing distributed algorithms that boost reliability. These algorithms boost reliability in the sense that they enable the creation of a reliable system from unreliable components. We will discuss practical successes of these algorithms, along with drawbacks. A key drawback is scalability: significant redundancy of resources is required in order to tolerate even one node fault. The second part of the talk will introduce a new class of distributed algorithms for boosting reliability. These algorithms are self-healing in the sense that they dynamically adapt to failures, requiring additional resources only when faults occur.

We will discuss two such self-healing algorithms. The first enables self-healing in an overlay network, even when an omniscient adversary repeatedly removes carefully chosen nodes. Specifically, the algorithm ensures that the shortest path between any pair of nodes never increases by more than a logarithmic factor, and that the degree of any node never increases by more than a factor of 3. The second algorithm enables self-healing with Byzantine faults, where an adversary can control t < n/8 of the n total nodes in the network. This algorithm enables point-to-point communication with an expected number of message corruptions that is O(t(log* n)^2). Empirical results show that this algorithm reduces bandwidth and computation costs by up to a factor of 70 when compared to previous work.
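
The bound O(t(log* n)^2) uses the iterated logarithm log* n, which counts how many times a logarithm must be applied before the value drops to 1 or below. A short sketch (taking base 2, a common convention that the abstract does not specify) shows how slowly it grows:

```python
import math

def log_star(n):
    """Iterated base-2 logarithm: number of log2 applications until the value is <= 1."""
    count = 0
    value = float(n)
    while value > 1.0:
        value = math.log2(value)
        count += 1
    return count

# Even for astronomically large networks the factor stays in the single digits.
small_network = log_star(16)
large_network = log_star(2 ** 64)
```

For any practical network size, (log* n)^2 is therefore a small constant, so the expected number of corrupted messages is essentially proportional to the number of faulty nodes t.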

**August 21, 2013** - Hank Childs: *Hybrid Parallelism for Visualization and Analysis*

Many of today's parallel visualization and analysis programs are designed for distributed-memory parallelism, but not for the shared-memory parallelism available on GPUs or multi-core CPUs. However, architectural trends on supercomputers point to more and more cores per node, whether through the presence of GPUs or through more cores per CPU. To make the best use of such hardware, we must evaluate the benefits of hybrid parallelism - parallelism that blends distributed- and shared-memory approaches - for the data-intensive workloads of visualization and analysis. With this talk, Hank explores the fundamental challenges and opportunities for hybrid parallelism with visualization and analysis, and discusses recent results that measure its benefit.

Speaker Bio:

Hank Childs is an assistant professor at the University of Oregon and a computer systems engineer at Lawrence Berkeley National Laboratory. His research focuses on scientific visualization, high-performance computing, and the intersection of the two. He received the Department of Energy Career award in 2012 to research explorative visualization use cases on exascale machines. Additionally, Hank is one of the founding members of the team that developed the VisIt visualization and analysis software. He received his Ph.D. from UC Davis in 2006.

**August 13, 2013** - Rodney O. Fox: *Quadrature-Based Moment Methods for Kinetics-Based Flow Models*

Kinetic theory is a useful theoretical framework for developing multiphase flow models that account for complex physics (e.g., particle trajectory crossings, particle size distributions, etc.) [1]. For most applications, direct solution of the kinetic equation is intractable due to the high-dimensionality of the phase space. Thus a key challenge is to reduce the dimensionality of the problem without losing the underlying physics. At the same time, the reduced description must be numerically tractable and possess the favorable attributes of the original kinetic equation (e.g. hyperbolic, conservation of mass/momentum, etc.).

Starting from the seminal work of McGraw [2] on the quadrature method of moments (QMOM), we have developed a general closure approximation referred to as quadrature-based moment methods [3, 4, 5]. The basic idea behind these methods is to use the local (in space and time) values of the moments to reconstruct a well-defined local distribution function (i.e. non-negative, compact support, etc.). The reconstructed distribution function is then used to close the moment transport equations (e.g. spatial fluxes, nonlinear source terms, etc.).
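
The reconstruction step can be illustrated in its simplest form: recovering a two-node quadrature (two weights and two abscissas) from the first four moments of a one-dimensional distribution. The closed-form solve below is a sketch of that special case only; the function name is hypothetical, and the methods cited in the abstract rely on more general moment-inversion algorithms.

```python
import math

def two_node_quadrature(m0, m1, m2, m3):
    """Weights and abscissas (w1, w2), (x1, x2) matching moments m0..m3.

    The abscissas are the roots of the monic quadratic x^2 + c1*x + c0 that is
    orthogonal to 1 and x with respect to the distribution.
    """
    det = m1 * m1 - m0 * m2
    c1 = (m0 * m3 - m1 * m2) / det
    c0 = (m2 * m2 - m1 * m3) / det
    root = math.sqrt(c1 * c1 - 4.0 * c0)
    x1, x2 = (-c1 - root) / 2.0, (-c1 + root) / 2.0
    w1 = (m1 - m0 * x2) / (x1 - x2)
    w2 = m0 - w1
    return (w1, w2), (x1, x2)

# Moments of a distribution with weight 0.3 at x = 1 and weight 0.7 at x = 3.
moments = [0.3 * 1 ** k + 0.7 * 3 ** k for k in range(4)]
weights, nodes = two_node_quadrature(*moments)

# Any unclosed integral, e.g. a higher moment, can now be approximated as a sum.
m4_closure = sum(w * x ** 4 for w, x in zip(weights, nodes))
```

The moment set is realizable for this inversion only if the determinant is nonzero and the discriminant is positive, which is one face of the realizability question the seminar addresses.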

In this seminar, I will present the underlying theoretical and numerical issues associated with quadrature-based reconstructions. The transport of moments in real space, and its numerical representation in terms of fluxes, plays a critical role in determining whether a moment set is realizable. Using selected examples, I will introduce recent work on realizable high-order flux reconstructions developed specifically for finite-volume schemes [6].

References

[1] MARCHISIO, D. L. & FOX, R. O. 2013 Computational Models for Polydisperse Particulate and Multiphase Systems, Cambridge University Press.

[2] MCGRAW, R. 1997 Description of aerosol dynamics by the quadrature method of moments. Aerosol Science and Technology 27, 255–265.

[3] DESJARDINS, O., FOX, R. O. & VILLEDIEU, P. 2008 A quadrature-based moment method for dilute fluid-particle flows. Journal of Computational Physics 227, 2514–2539.

[4] YUAN, C. & FOX, R. O. 2011 Conditional quadrature method of moments for kinetic equations. Journal of Computational Physics 230, 8216–8246.

[5] YUAN, C., LAURENT, F. & FOX, R. O. 2012 An extended quadrature method of moments for population balance equations. Journal of Aerosol Science 51, 1–23.

[6] VIKAS, V., WANG, Z. J., PASSALACQUA, A. & FOX, R. O. 2011 Realizable high-order finite-volume schemes for quadrature-based moment methods. Journal of Computational Physics 230, 5328–5352.

**August 12, 2013** - Lucy Nowell: *ASCR: Funding/ Data/ Computer Science*

Dr. Lucy Nowell is a Computer Scientist and Program Manager for the Advanced Scientific Computing Research (ASCR) program office in the Department of Energy's (DOE) Office of Science. While her primary focus is on scientific data management, analysis and visualization, her portfolio spans the spectrum of ASCR computer science interests, including supercomputer architecture, programming models, operating and runtime systems, and file systems and input/output research. Before moving to DOE in 2009, Dr. Nowell was a Chief Scientist in the Information Analytics Group at Pacific Northwest National Laboratory (PNNL). On detail from PNNL, she held a two-year assignment as a Program Director for the National Science Foundation's Office of Cyberinfrastructure, where her program responsibilities included Sustainable Digital Data Preservation and Access Network Partners (DataNet), Community-based Data Interoperability Networks (INTEROP), Software Development for Cyberinfrastructure (SDCI) and Strategic Technologies for Cyberinfrastructure (STCI). At PNNL, her research centered on applying her knowledge of visual design, perceptual psychology, human-computer interaction, and information storage and retrieval to problems of understanding and navigating in very large information spaces, including digital libraries. She holds several patents in information visualization technologies.

Dr. Nowell joined PNNL in August 1998 after a career as a professor at Lynchburg College in Virginia, where she taught a wide variety of courses in Computer Science and Theatre. She also headed the Theatre program and later chaired the Computer Science Department. While pursuing her Master of Science and Doctor of Philosophy degrees in Computer Science at Virginia, she worked as a Research Scientist in the Digital Libraries Research Laboratory and also interned with the Information Access team at IBM's T. J. Watson Research Laboratories in Hawthorne, NY. She also has a Master of Fine Arts degree in Drama from the University of New Orleans and Master of Arts and Bachelor of Arts degrees in Theatre from the University of Alabama.

**August 8, 2013** - Carlos Maltzahn: *Programmable Storage Systems*

With the advent of open source parallel file systems a new usage pattern emerges: users isolate subsystems of parallel file systems and put them in contexts not foreseen by the original designers, e.g., an object-based storage back end gets a new REST-ful front end to become Amazon Web Service's S3 compliant key value store, or a data placement function becomes a placement function for customer accounts. This trend shows a desire for the ability to use existing file system services and compose them to implement new services. We call this ability "programmable storage systems".

In this talk I will argue that designing programmability into storage systems has the following benefits: (1) we achieve greater separation of storage performance engineering from storage reliability engineering, making it possible to optimize storage systems in a wide variety of ways without risking years of investments into code hardening; (2) we create an environment that encourages people to create a new stack of storage systems abstractions, both domain-specific and across domains, including sophisticated optimizers that rely on machine learning techniques; (3) we inform commercial parallel file system vendors on the design of low-level APIs for their products so that they match the versatility of open source storage systems without having to release their entire code into open source; and (4) we can use this historical opportunity to leverage the tension between the versatility of open source storage systems and the reliability of proprietary systems to lead the community of storage system designers.

I will illustrate programmable storage with an overview of programming abstractions that we have found useful so far, and if time permits, talk about "scriptable storage systems" and the interesting new possibilities of truly data-centered software engineering it enables.

Bio: Carlos Maltzahn is an Associate Adjunct Professor at the Computer Science Department of the Jack Baskin School of Engineering, Director of the UCSC Systems Research Lab and Director of the UCSC/Los Alamos Institute for Scalable Scientific Data Management at the University of California at Santa Cruz. Carlos Maltzahn's current research interests include scalable file system data and metadata management, storage QoS, data management games, network intermediaries, information retrieval, and cooperation dynamics.

Carlos Maltzahn joined UC Santa Cruz in December 2004 after five years at Network Appliance. He received his Ph.D. in Computer Science from the University of Colorado at Boulder in 1999, his M.S. in Computer Science in 1997, and his Univ. Diplom Informatik from the University of Passau, Germany in 1991.

**August 7, 2013** - Tiffany M. Mintz: *Toward Abstracting the Communication Intent in Applications to Improve Portability and Productivity*

Programming with communication libraries such as the Message Passing Interface (MPI) obscures the high-level intent of the communication in an application and makes static communication analysis difficult. Compilers are unaware of communication libraries' specifics, leading to the exclusion of communication patterns from any automated analysis and optimizations. To overcome this, communication patterns can be expressed at higher levels of abstraction and incrementally added to existing MPI applications. In this paper, we propose the use of directives to clearly express the communication intent of an application in a way that is not specific to a given communication library. Our communication directives allow programmers to express communication among processes in a portable way, giving hints to the compiler on regions of computations that can be overlapped with communication and relaxing communication constraints on the ordering, completion and synchronization of the communication imposed by specific libraries such as MPI. The directives can then be translated by the compiler into message passing calls that efficiently implement the intended pattern and be targeted to multiple communication libraries. Thus far, we have used the directives to express point-to-point communication patterns in C, C++ and Fortran applications, and have translated them to MPI and SHMEM.

**August 2, 2013** - Alberto Salvadori: *Multi-scale and multi-physics modeling of Li-ion batteries: a computational homogenization approach*

There is great interest in developing the next generation of lithium-ion batteries with higher capacity and longer cycle life, in order to meet the significantly more demanding energy storage requirements of humanity's existing and future inventories of power-generation and energy-management systems. Industry and academia are looking for alternative materials, and Si is one of the most promising candidates for the active material because it has the highest theoretical specific energy capacity. It has emerged that very large mechanical stresses associated with huge volume changes during Li intercalation/deintercalation are responsible for poor cyclic behavior and quick fading of electrical performance. The present work aims to provide scientific contributions in this vibrant context.

The computational homogenization scheme is here tailored to model the coupling between electrochemistry and mechanical phenomena that coexist during battery charging and discharging cycles. At the macro-scale, diffusion-advection equations model the electrochemistry of the whole cell, whereas the micro-scale models the multi-component porous electrode, diffusion and intercalation of lithium in the active particles, and the swelling and fracturing of the latter. The scale transitions are formulated by tailoring the well-established first-order computational homogenization scheme for mechanical and thermal problems.

**August 2, 2013** - Michela Taufer: *The effectiveness of application-aware self-management for scientific discovery in volunteer computing systems*

**July 24, 2013** - Catalin Trenchea: *Improving time-stepping numerics for weakly dissipative systems*

In this talk I will address the stability and accuracy of the CNLF time-stepping scheme, and propose a modification of Robert-Asselin time filters for numerical models of weakly diffusive evolution systems. This is motivated by the vast number of applications, e.g., the meteorological equations, and coupled systems with dominating skew-symmetric coupling (groundwater-surface water).

In contemporary numerical simulations of the atmosphere, evidence suggests that time-stepping errors may be a significant component of total model error, on both weather and climate time-scales. After a brief review, I will suggest a simple but effective method for substantially improving the time-stepping numerics at no extra computational expense.

The most common time-stepping method is the leapfrog scheme combined with the Robert-Asselin (RA) filter. This method is used in many atmospheric models: ECHAM, MAECHAM, MM5, CAM, MESO-NH, HIRLAM, KMCM, LIMA, SPEEDY, IGCM, PUMA, COSMO, FSU-GSM, FSU-NRSM, NCEP-GFS, NCEP-RSM, NSEAM, NOGAPS, RAMS, and CCSR/NIES-AGCM. Although the RA filter controls the time-splitting instability in these models (successfully suppresses the spurious computational mode associated with the leapfrog time stepping scheme), it also weakly suppresses the physical mode, introduces non-physical damping, and reduces the accuracy.

This presentation proposes a simple modification to the RA filter (mRA) [Y. Li, CT 2013].

The modification is analyzed and compared with the RAW filter (Williams 2009, 2011).

The mRA increases the numerical accuracy to O(Δt^4) amplitude error and at least O(Δt^2) phase-speed error for the physical mode. The mRA filter requires the same storage factors as RAW, and one more than the RA filter does. When used in conjunction with the leapfrog scheme, the RAW filter eliminates the non-physical damping and increases the amplitude accuracy by two orders, yielding third-order accuracy, the phase accuracy remaining second-order. The mRA and RAW filters can easily be incorporated into existing models, typically via the insertion of just a single line of code. Better simulations are obtained at no extra computational expense.
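
The "single line of code" remark can be made concrete for the RA and RAW filters on a test oscillator dx/dt = iωx. The sketch below uses the commonly cited form of the two filters, in which a parameter `alpha` splits the filter displacement between two time levels (`alpha = 1` gives RA, and a value slightly above 0.5 is the usual choice for RAW). The mRA filter is not implemented because the abstract does not give its formula; the function name and parameter values are illustrative assumptions.

```python
import cmath

def leapfrog_filtered(omega, dt, steps, nu=0.2, alpha=1.0):
    """Leapfrog for dx/dt = i*omega*x with an RA (alpha=1) or RAW (alpha<1) filter."""
    x_prev = 1.0 + 0.0j                    # filtered value at step n-1
    x_curr = cmath.exp(1j * omega * dt)    # exact value used to start the scheme
    for _ in range(steps):
        x_next = x_prev + 2.0 * dt * (1j * omega * x_curr)      # leapfrog step
        d = 0.5 * nu * (x_prev - 2.0 * x_curr + x_next)         # filter displacement
        x_prev = x_curr + alpha * d                             # RA part of the filter
        x_curr = x_next - (1.0 - alpha) * d                     # the extra RAW line
    return x_curr

# The exact solution has amplitude 1 for all time.
amp_ra = abs(leapfrog_filtered(1.0, 0.1, 1000, alpha=1.0))
amp_raw = abs(leapfrog_filtered(1.0, 0.1, 1000, alpha=0.53))
```

In this setup the RA run loses a visible fraction of its amplitude while the RAW run stays much closer to 1, which is the non-physical damping the abstract refers to.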

**June 28, 2013** - Yuri Melnikov: *A surprising connection between Green's functions and the infinite product representation of elementary functions*

Some standard as well as innovative approaches will be reviewed for the construction of Green's functions for elliptic PDEs. Based on that, a surprising technique is proposed for obtaining infinite product representations of some trigonometric, hyperbolic, and special functions. The technique compares alternative expressions of Green's functions constructed by different methods. This allows us not only to obtain the classical Euler formulas but also to come up with a number of new representations.
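
As a reminder of what such a representation looks like, the classical Euler product for the sine, sin x = x ∏ (1 - x²/(k²π²)) over k = 1, 2, ..., can be checked numerically. The sketch below only illustrates this well-known formula; it is not the Green's-function technique of the talk, and the function name is hypothetical.

```python
import math

def sin_product(x, terms):
    """Partial Euler product x * prod_{k=1..terms} (1 - x^2 / (k^2 * pi^2))."""
    value = x
    for k in range(1, terms + 1):
        value *= 1.0 - (x * x) / (k * k * math.pi * math.pi)
    return value

approximation = sin_product(1.0, 10_000)
difference = abs(approximation - math.sin(1.0))
```

The partial products converge slowly, with an error that shrinks roughly like 1/terms, so the value of such formulas is analytical rather than computational.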

**June 27, 2013** - Kimmy Mu: *Performance, accuracy and power tradeoff for scientific processes using workflow in high performance computing*

Power is becoming more important than ever in high performance computing as we move toward exascale computing. A transition is needed from the old style, which considers performance and accuracy, to a new style that accounts for performance, accuracy and power. In high performance computing, a workflow is composed of a large number of tasks, such as simulation, analysis and visualization. However, users have no guidance on which task allocations and placements on nodes and clusters are good for performance or power under a given accuracy requirement. In this presentation, I will talk about power optimization for reconfigurable embedded systems, which dynamically choose kernels to run on hardware co-processors in response to application behavior at runtime. Because these systems share many commonalities with HPC, we are going to explore applying the method to dynamic workflows in high performance computing, including task placement, under performance, power and accuracy constraints.

**June 26, 2013** - Matthew Causley: *A fast implicit Maxwell field solver for plasma simulations*

**June 25, 2013** - Jeff Haack: *Conservative Spectral Method for Solving the Boltzmann Equation*

We present a conservative spectral scheme for Boltzmann collision operators. This formulation is derived from the weak form of the Boltzmann equation, which can represent the collisional term as a weighted convolution in Fourier space. The weights contain all of the information of the collision mechanics and can be precomputed. I will present some results for isotropic (in angle) interactions, such as hard spheres and Maxwell molecules. We have recently extended the method to take into account anisotropic scattering mechanisms arising from potential interactions between particles, and we use this method to solve the Boltzmann equation with screened Coulomb potentials. In particular, we study the rate of convergence of the Fourier transform for the Boltzmann collision operator in the grazing collisions limit to the Fourier transform for the limiting Landau collision operator. We show that the decay rate to equilibrium depends on the parameters associated with the collision cross section, and specifically study the differences between the classical Rutherford scattering angular cross section, which has logarithmic error, and an artificial one with a linear error. I will also present recent work extending this method to multispecies gases and gases with internal degrees of freedom, which introduces new challenges for conservation and introduces inelastic collisions to the system.

**June 17, 2013** - Megan Cason: *Analytic Utility Of Novel Threading Models In Distributed Graph Algorithms*

Current analytic methods for judging distributed algorithms rely on communication abstractions that characterize performance assuming purely passive data movement and access. This assumption complicates the analysis of certain algorithms, such as graph analytics, which have behavior that is very dependent on data movement and modifying shared variables. This presentation will discuss an alternative model for analyzing theoretic scalability of distributed algorithms written with the possibility of active data movement and access. The mobile subjective model presented here confines all communication to 1) shared memory access and 2) executing thread state which can be relocated between processes, i.e., thread migration. Doing so enables a new type of scalability analysis, which calculates the number of thread relocations required, and whether that communication is balanced across all processes in the system. This analysis also includes a model for contended shared data accesses, which is used to identify serialization points in an algorithm. This presentation will show the analysis for a common distributed graph algorithm, and illustrate how this model could be applied to a real world distributed runtime software stack.

**June 14, 2013** - Jeff Carver: *Applying Software Engineering Principles to Computational Science*

The increase in the importance of Computational Science software motivates the need to identify and understand which software engineering (SE) practices are appropriate. Because of the uniqueness of the computational science domain, existing SE tools and techniques developed for the business/IT community are often not efficient or effective. Appropriate SE solutions must account for the salient characteristics of the computational science development environment. To identify these solutions, members of the SE community must interact with members of the computational science community. This presentation will discuss the findings from a series of case studies of CSE projects and the results of an ongoing workshop series. First, a series of case studies of computational science projects was conducted as part of the DARPA High Productivity Computing Systems (HPCS) project. The main goal of these studies was to understand how SE principles were and were not being applied in computational science, along with some of the reasons why. The studies resulted in nine lessons learned about computational science software that are important to consider moving forward. Second, the Software Engineering for Computational Science and Engineering workshop brings together software engineers and computational scientists. The outcomes of this workshop series provide interesting insight into potential future trends.

**June 12, 2013** - Hans-Werner van Wyk: *Multilevel Quadrature Methods*

Stochastic sampling methods are arguably the most direct and least intrusive means of incorporating parametric uncertainty into numerical simulations of partial differential equations with random inputs. However, to achieve an overall error that is within a desired tolerance, a large number of sample simulations may be required (to control the sampling error), each of which may need to be run at high levels of spatial fidelity (to control the spatial error). Multilevel methods aim to achieve the same accuracy as traditional sampling methods, but at a reduced computational cost, through the use of a hierarchy of spatial discretization models. Multilevel algorithms coordinate the number of samples needed at each discretization level by minimizing the computational cost, subject to a given error tolerance. They can be applied to a variety of sampling schemes, exploit nesting when available, can be implemented in parallel, and can be used to inform adaptive spatial refinement strategies. We present an introduction to multilevel quadrature in the context of stochastic collocation methods, and demonstrate its effectiveness theoretically and by means of numerical examples.
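
The cost-minimizing choice of samples per level can be sketched for the simplest case. The formula below is the standard multilevel Monte Carlo allocation, which minimizes the total cost Σ N_l C_l subject to the sampling-variance constraint Σ V_l / N_l ≤ ε². The collocation-based quadrature in the talk may use a different rule, and the per-level variances and costs are hypothetical numbers chosen for illustration.

```python
import math

def optimal_samples(variances, costs, tol):
    """Samples per level minimizing sum(N_l * C_l) subject to sum(V_l / N_l) <= tol**2."""
    total = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [math.ceil(math.sqrt(v / c) * total / tol**2)
            for v, c in zip(variances, costs)]

# Hypothetical hierarchy: correction variance shrinks and cost grows with refinement.
variances = [1.0, 0.25, 0.0625]
costs = [1.0, 4.0, 16.0]
samples = optimal_samples(variances, costs, tol=0.1)

# Check that the allocation meets the sampling-error budget.
achieved_variance = sum(v / n for v, n in zip(variances, samples))
```

Most samples land on the coarse, cheap level, and only a few are needed on the fine, expensive level, which is where the cost savings come from.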

**June 7, 2013** - Xuechen Zhang: *Scibox: Cloud Facility for Sharing On-Line Data*

Collaborative science demands global sharing of scientific data, but it cannot leverage universally accessible cloud-based infrastructures, like DropBox, as those offer limited interfaces and inadequate levels of access bandwidth. In this talk, I will present the Scibox cloud facility for online sharing of scientific data. It uses standard cloud storage solutions, but offers a usage model in which high end codes can write/read data to/from the cloud via the same ADIOS APIs they already use for their I/O actions, thereby naturally coupling data generation with subsequent data analytics. Scibox extends current ADIOS I/O methods: data upload/download volumes are controlled via Data Reduction (DR) functions stated by end users and applied at the data source, before data is moved, with further gains in efficiency obtained by combining DR-functions to move exactly what is needed by current data consumers.

**June 6, 2013** - Yuan Tian: *Taming Scientific Big Data with Flexible Organizations for Exascale Computing*

Fast-growing High Performance Computing systems enable scientists to simulate scientific processes of great complexity, often producing complex data sets that are also growing exponentially in size. However, growth within the computing infrastructure is significantly imbalanced: dramatically increasing computing power is accompanied by slowly improving storage systems. Such discordant progress among computing power, storage, and data has led to a severe Input/Output (I/O) bottleneck that requires novel techniques to address big data challenges in the scientific domain.

This talk will identify the prevalent characteristics of scientific data and storage systems as a whole, and explore opportunities to drive I/O performance for petascale computing and prepare it for the exascale. To this end, a set of flexible data organization and management techniques are introduced and evaluated to address the aforementioned concerns. Four key techniques are designed to exploit the capability of the back-end storage system for processing and storing scientific big data with fast and scalable I/O performance: visualization space filling curve-based data reorganization, system-aware chunking, spatial and temporal aggregation, and in-node staging with compression. The experimental results demonstrated more than 60x speedup for a mission critical climate application during data post-processing.
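
The idea behind space filling curve-based reorganization can be sketched with a Z-order (Morton) index, which maps multi-dimensional cell coordinates to a linear order that tends to keep spatial neighbours close together on storage. The abstract does not say which curve is used, so the choice of Z-order and the helper name `morton_encode_2d` are assumptions made for illustration.

```python
def morton_encode_2d(x, y, bits=16):
    """Interleave the bits of x and y into a Z-order (Morton) index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits occupy even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits occupy odd positions
    return code

# Reorder the cells of a 4x4 grid from row-major order into Z-order.
cells = [(x, y) for y in range(4) for x in range(4)]
ordered = sorted(cells, key=lambda c: morton_encode_2d(*c))
```

After sorting, each aligned 2x2 block of cells occupies four consecutive positions, so reading a small spatial region touches fewer distinct stretches of the file than in row-major order.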

**May 31, 2013** - Pablo Seleson: *Multiscale Material Modeling with Peridynamics*

Multiscale modeling has been recognized in recent years as an important research field to achieve feasible and accurate predictions of complex systems. Peridynamics, a nonlocal reformulation of continuum mechanics based on integral equations, is able to resolve microscale phenomena at the continuum level. As a nonlocal model, peridynamics possesses a length scale which can be controlled for multiscale modeling. For instance, classical elasticity has been presented as a limiting case of a peridynamic model. In this talk, I will introduce the peridynamics theory and show analytical and numerical connections of peridynamics to molecular dynamics and classical elasticity. I will also present multiscale methods to concurrently couple peridynamics and classical elasticity, demonstrating the capabilities of peridynamics towards multiscale material modeling.
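
For orientation, the standard bond-based peridynamic equation of motion is reproduced below from the general literature (it is not quoted from the talk). The horizon δ is the length scale referred to above: forces act between material points x and x' separated by at most δ.

```latex
\rho(\mathbf{x})\,\ddot{\mathbf{u}}(\mathbf{x},t)
  = \int_{\mathcal{H}_{\mathbf{x}}}
      \mathbf{f}\bigl(\mathbf{u}(\mathbf{x}',t) - \mathbf{u}(\mathbf{x},t),\; \mathbf{x}' - \mathbf{x}\bigr)\, dV_{\mathbf{x}'}
    + \mathbf{b}(\mathbf{x},t),
\qquad
\mathcal{H}_{\mathbf{x}} = \{\, \mathbf{x}' : \lVert \mathbf{x}' - \mathbf{x} \rVert \le \delta \,\}
```

Here ρ is the mass density, u the displacement field, f the pairwise force function, and b a body force density.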

Dr. Seleson is a Postdoctoral Fellow in the Institute for Computational Engineering and Sciences at The University of Texas at Austin. He obtained his Ph.D. in Computational Science from Florida State University in 2010. He holds an M.S. degree in Physics from the Hebrew University of Jerusalem (2006), and a double B.S. degree in Physics and Philosophy, also from the Hebrew University of Jerusalem (2002).

**May 29, 2013** - Ryan McMahan: *The Effects of System Fidelity for Virtual Reality Applications*

Virtual reality (VR) has developed from Ivan Sutherland's inception of an "ultimate display" to a realized field of advanced technologies. Despite evidence supporting the use of VR for various benefits, the level of system fidelity required for such benefits is often unknown. Modern VR systems range from high-fidelity simulators that incorporate many technologies to lower-fidelity, desktop-based virtual environments. In order to identify the level of system fidelity required for certain beneficial uses, research has been conducted to better understand the effects of system fidelity on the user. In this talk, a series of experiments evaluating the effects of interaction fidelity and display fidelity will be presented. Future directions of system fidelity research will also be discussed.

Dr. Ryan P. McMahan is an Assistant Professor of Computer Science at the University of Texas at Dallas, where his research focuses on the effects of system fidelity for virtual reality (VR) applications. Using an immersive VR system comprised of a wireless head-mounted display (HMD), a real-time motion tracking system, and Wii Remotes as 3D input devices, his research determines the effects of system fidelity by varying components such as stereoscopy, field of view, and degrees of freedom for interactions. Currently, he is using this methodology to investigate the effects of fidelity on learning for VR training applications. Dr. McMahan received his Ph.D. in Computer Science in 2011 from Virginia Tech, where he also received his B.S. and M.S. in Computer Science in 2004 and 2007.

**May 28, 2013** - Adrian Sandu: *Data Assimilation and the Adaptive Solution of Inverse Problems*

The task of providing an optimal analysis of the state of the atmosphere requires the development of novel computational tools that facilitate an efficient integration of observational data into models. In this talk, we will introduce variational and statistical estimation approaches to data assimilation. We will discuss important computational aspects including the construction of efficient models for background errors, the construction and analysis of discrete adjoint models, new approaches to estimate the information content of observations, and hybrid variational-ensemble approaches to assimilation. We will also present some recent results on the solution of inverse problems using space and time adaptivity, and a priori and a posteriori error estimates for the optimal solution.
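
As a point of reference, the variational approach is usually posed as the minimization of a cost function. The strong-constraint 4D-Var form below is the standard textbook statement and is not taken from the talk itself:

```latex
J(\mathbf{x}_0) = \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}^{b})^{T}\,\mathbf{B}^{-1}\,(\mathbf{x}_0 - \mathbf{x}^{b})
  + \tfrac{1}{2}\sum_{k=0}^{N}
    \bigl(\mathcal{H}_k(\mathbf{x}_k) - \mathbf{y}_k\bigr)^{T}\,\mathbf{R}_k^{-1}\,
    \bigl(\mathcal{H}_k(\mathbf{x}_k) - \mathbf{y}_k\bigr),
\qquad
\mathbf{x}_k = \mathcal{M}_{0 \to k}(\mathbf{x}_0)
```

Here x^b is the background state with error covariance B, y_k are the observations at time k with error covariance R_k, H_k is the observation operator, and M is the model. The gradient of J with respect to x_0 is what the adjoint model provides.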

**May 24, 2013** - Satoshi Matsuoka: *The Futures of Tsubame Supercomputer and the Japanese HPCI Towards Exascale*

HPCI is the Japanese High Performance Computer Infrastructure, which encompasses the national operations of major supercomputers, such as the K supercomputer and Tsubame2.0, much like XSEDE in the United States and PRACE in Europe. Recently it was announced that the Japanese Ministry of Education, Culture, Sports, Science and Technology intends to initiate a project towards an exascale supercomputer to be deployed around 2020. However, the workshop report that recommended the project also calls for a comprehensive infrastructure in which a flagship machine is supplemented with leadership machines that complement its abilities. Although it is still early, I will attempt to discuss the current status of the Tsubame2.0 evolution to 2.5 and 3.0 in this context, as well as the activities in Japan to initiate an exascale effort, with collaborative elements with the US Department of Energy partners in system software development.

**May 17, 2013** - Jon Mietling and Tony McCrary: *Bling3D: a new game development toolset from l33t Labs*

Bling3D is a forthcoming game development toolset from l33t labs.

A fusion of Eclipse 4 with game development technologies, Bling allows both programmers and designers to create compelling interactive experiences from within one powerful tool.

In this talk, you will be introduced to some of Bling's exciting features, including:

- GPU Powered UI - A revolutionary new user interface for Eclipse, which uses shader programs to render widgets directly on the GPU.
- BYOE (Bring Your Own Engine) - Bling is designed as a universal tools platform for game technologies. You can use our game engine or integrate your own!
- Ultimate Toolset - Use the power of Bling's interface and Eclipse's extensibility to create mind blowing tools and plugins.
- Designers Love It - Intuitive visual tools that allow you to create new worlds and artificial realities with ease.
- Transform Your Assets - Easily create new ways to process raw assets (geometry, images, etc) into materials suitable for runtime use.

Jon Mietling and Tony McCrary are representatives of l33t labs LLC, a technology startup from the Detroit, Michigan region.

**May 10, 2013** - Xiao Chen: *A Modular Uncertainty Quantification Framework for Multi-physics Systems*

This talk presents a modular uncertainty quantification (UQ) methodology for multi-physics applications in which each physics module can be independently embedded with its internal UQ method (intrusive or non-intrusive). This methodology offers the advantage of "plug-and-play" flexibility (i.e., UQ enhancements to one module do not require updates to the other modules) without losing the "global" uncertainty propagation property. (This means that, by performing UQ in this modular manner, all inter-module uncertainty and sensitivity information is preserved.) In addition, using this methodology one can also track the evolution of global uncertainties and sensitivities at the grid point level, which may be useful for model improvement. We demonstrate the utility of such a framework for error management and Bayesian inference on a practical application involving a multi-species flow and reactive transport in randomly heterogeneous porous media.

**May 2, 2013** - Kenley Pelzer: *Quantum Biology: Elucidating Design Principles from Photosynthesis*

Recent experiments suggest that quantum mechanical effects may play a role in the efficiency of photosynthetic light harvesting. However, much controversy exists about the interpretation of these experiments, in which light harvesting complexes are excited by a femtosecond laser pulse. The coherence in such laser pulses raises the important question of whether these quantum mechanical effects are significant in biological systems excited by incoherent light from the sun. In our work, we apply frequency-domain Green's function analysis to model a light-harvesting complex excited by incoherent light. By modeling incoherent excitation, we demonstrate that the evidence of long-lived quantum mechanical effects is not purely an artifact of peculiarities of the spectroscopy. This data provides a new perspective on the role of noisy biological environments in promoting or destroying quantum transport in photosynthesis.

**April 23, 2013** - Kirk W. Cameron: *Power-Performance Modeling, Analyses and Challenges*

The power consumption of supercomputers ultimately limits their performance. The current challenge is not whether we can build an exaflop system by 2018, but whether we can do it in less than 20 megawatts. The SCAPE Laboratory at Virginia Tech has been studying the tradeoffs between performance and power for over a decade. We've developed an extensive tool chain for monitoring and managing power and performance in supercomputers. We will discuss our power-performance modeling efforts and the implications of our findings for exascale systems, as well as some research directions ripe for innovation.
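
The energy efficiency implied by that target follows from simple arithmetic: one exaflop per second delivered within a 20 megawatt budget requires

```latex
\frac{10^{18}\ \text{flop/s}}{20 \times 10^{6}\ \text{W}}
  = 5 \times 10^{10}\ \text{flop/s per watt}
  = 50\ \text{Gflop/s per watt}
```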

**April 23, 2013** - Jordan Deyton: *Tor Bridge Distribution Powered by Threshold RSA*

Since its inception, Tor has offered anonymity for internet users around the world. Tor now offers bridges to help users evade internet censorship, but the primary distribution schemes that provide bridges to users in need have come under attack. This talk explores how threshold RSA can help strengthen Tor's infrastructure while also enabling more powerful bridge distribution schemes. We implement a basic threshold RSA signature system for the bridge authority and a reputation-based social network design for bridge distribution. Experimental results are obtained showing the possibility of quick responses to requests from honest users while maintaining both the secrecy and the anonymity of registered clients and bridges.
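
The core idea of sharing an RSA signing key can be sketched in its simplest form: an additive 2-of-2 split of the private exponent, where neither party alone can sign. This is a toy illustration with textbook RSA and tiny primes. A real t-of-n threshold scheme uses more involved secret sharing, and the scheme actually used in the talk is not specified here. All names and values in the sketch are illustrative assumptions.

```python
import secrets

# Toy textbook RSA key (tiny primes for illustration only, never for real use).
p, q, e = 61, 53, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)                      # private exponent (requires Python 3.8+)

def split_exponent(d, phi):
    """Additively share d between two parties so that d = d1 + d2 (mod phi)."""
    d1 = secrets.randbelow(phi)
    d2 = (d - d1) % phi
    return d1, d2

def partial_sign(m, share, n):
    """Each party raises the message to its own share of the exponent."""
    return pow(m, share, n)

def combine(partials, n):
    """Multiply the partial signatures to obtain the full signature."""
    sig = 1
    for s in partials:
        sig = (sig * s) % n
    return sig

d1, d2 = split_exponent(d, phi)
message = 65                             # stands in for a hashed, padded message
signature = combine([partial_sign(message, d1, n),
                     partial_sign(message, d2, n)], n)
verified = pow(signature, e, n) == message
```

The combined signature equals the one the undivided key would produce, so verifiers need only the ordinary public key (n, e).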

**April 19, 2013** - Maria Avramova and Kostadin Ivanov: *OECD LWR UAM and PSBT/BFBT benchmarks and their relation to Advanced LWR Simulations*

From 1987 to 1995, the Nuclear Power Engineering Corporation (NUPEC) in Japan performed a series of void measurement tests using full-size mock-ups for both BWRs and PWRs. Void fraction measurements and departure from nucleate boiling (DNB) tests were performed at NUPEC under steady-state and transient conditions. The workshop will provide an overview of the OECD/NEA/NRC PWR Subchannel and Bundle Tests (PSBT) and OECD/NEA/NRC BWR Full-size Fine-mesh Bundle Tests (BFBT) benchmarks based on the NUPEC data. The benchmarks were designed to provide a data set for evaluating the abilities of existing subchannel, system, and computational fluid dynamics (CFD) thermal-hydraulics codes to predict void distribution and DNB in LWRs under steady-state and transient conditions. The first part of the seminar summarizes the PSBT and BFBT benchmark databases, specifications, definitions of benchmark exercises, and comparative analysis of obtained results, and makes the case for how these benchmarks can be used for verification, validation, and uncertainty quantification of thermal-hydraulic tools developed for advanced LWR simulations.

The second part of the seminar will provide an overview of the OECD/NEA benchmark for LWR Uncertainty Analysis in Modeling (UAM), with emphasis on the Exercises of Phase I and Phase II of the benchmark and a discussion of Phase III, which is directly related to coupled multi-physics advanced LWR simulations. A series of well-defined problems with complete sets of input specifications and reference experimental data will be introduced, with the objective of determining the uncertainty in LWR calculations at all stages of coupled reactor physics/thermal hydraulics calculation. The full chain of uncertainty propagation will be discussed, starting from basic data and engineering uncertainties, across different scales (multi-scale) and physics phenomena (multi-physics), as well as how this propagation is tested on a number of benchmark exercises. Input, output, and assumptions for each Exercise will be given, and the procedures to calculate the output and propagated uncertainties in each step will be described, supplemented by results of benchmark participants.

Bio of Dr. Maria Avramova

Dr. Maria Avramova is an Assistant Professor in the Mechanical and Nuclear Engineering Department at the Pennsylvania State University. She is currently the Director of the Reactor Dynamics and Fuel Management Group (RDFMG). Her expertise and experience are in the area of developing methods and computer codes for multi-dimensional reactor core analysis. Her background includes development, verification, and validation of thermal-hydraulics sub-channel, porous media, and CFD models and codes for reactor core design, transient, and safety computational analysis. She has led and coordinated the OECD/NRC BFBT and PSBT benchmarks and is currently coordinating Phase II of the OECD LWR UAM benchmark. Her latest research efforts have been focused on high-fidelity multi-physics simulations (involving coupling of reactor physics, thermal-hydraulics, and fuel performance models) as well as on uncertainty and sensitivity analysis of reactor design and safety calculations. Dr. Avramova has published over 15 refereed journal papers and over 40 refereed conference proceedings articles.

Bio of Dr. Kostadin Ivanov

Dr. Kostadin Ivanov is a Distinguished Professor in the Mechanical and Nuclear Engineering Department at the Pennsylvania State University. He is currently Graduate Coordinator of the Nuclear Engineering Program. His research developments include computational methods, numerical algorithms and iterative techniques, nuclear fuel management and reloading optimization techniques, reactor kinetics and core dynamics methods, cross-section generation and modeling algorithms for multi-dimensional steady-state and transient reactor calculations, and coupling three-dimensional (3-D) kinetics models with thermal-hydraulic codes. He has also led the development of multi-dimensional neutronics, in-core fuel management and coupled 3-D kinetics/thermal-hydraulic computer code benchmarks, multi-dimensional reactor transient and safety analysis methodologies as well as integrated analysis of safety-related parameters, system transient modeling of power plants, and in-core fuel management analyses.

Examples of such benchmarks are the OECD/NRC PWR MSLB benchmark, the OECD/NRC BWR TT benchmark, and the OECD/DOE/CEA VVER-1000 CT benchmark. He is currently chair and coordinator of the Scientific Board and Technical Program Committee of the OECD LWR UAM benchmark.

**April 18, 2013** - Sparsh Mittal: *MASTER: A Technique for Improving Energy Efficiency of Caches in Multicore Processors*

The large power consumption of modern processors has been identified as the most severe constraint in scaling their performance. Further, in recent CMOS technology generations, leakage energy has been dramatically increasing, and hence the leakage energy consumption of large last-level caches (LLCs) has become a significant source of processor power consumption.

This talk first highlights the need for power management in LLCs in modern multi-core processors and then presents MASTER, a micro-architectural cache leakage energy saving technique using dynamic cache reconfiguration. MASTER uses dynamic profiling of LLCs to predict the energy consumption of running programs at multiple LLC sizes. Using these estimates, suitable cache quotas are allocated to different programs using a cache-coloring scheme, and the unused LLC space is turned off to save energy. The implementation overhead of MASTER is small: even for 4-core systems, its overhead is only 0.8% of the L2 cache size. Simulations have been performed using an out-of-order x86-64 simulator and 2-core and 4-core multi-programmed workloads from the SPEC2006 suite. Further, MASTER has been compared with two energy saving techniques, namely decay cache and way-adaptable cache. The results show that MASTER gives the highest saving in energy and does not harm performance or cause unfairness.

Finally, this talk briefly shows an extension of MASTER for multicore QoS systems. Simulation results confirm that a large amount of energy is saved while meeting the QoS requirement of most of the workloads.

**April 17, 2013** - Okwan Kwon: *Automatic Scaling of OpenMP Applications Beyond Shared Memory*

We present the first fully automated compiler-runtime system that successfully translates and executes OpenMP shared-address-space programs on laboratory-size clusters, for the complete set of regular, repetitive applications in the NAS Parallel Benchmarks. We introduce a hybrid compiler-runtime translation scheme. This scheme features a novel runtime data flow analysis and compiler techniques for improving data affinity and reducing communication costs. We present and discuss the performance of our translated programs, and compare them with the performance of the MPI, HPF and UPC versions of the benchmarks. The results show that our translated programs achieve 75% of the performance of the hand-coded MPI programs, on average.

**April 17, 2013** - Michael S. Murillo: *Molecular Dynamics Simulations of Charged Particle Transport in High Energy-Density Matter*

High energy-density matter is now routinely produced at large laser facilities. Producing fusion energy at such facilities challenges our ability to model collisional plasma processes that transport energy among the plasma species and across spatial scales. While the most accurate computational method for describing collisional processes is molecular dynamics, there are numerous challenges associated with using molecular dynamics to model very hot plasmas. However, recent advances in high performance computing have allowed us to develop methods for simulating a wide variety of processes in hot, dense plasmas. I will review these developments and describe our recent results that involve simulating fast particle stopping in dense plasmas. Using the simulation results, implications for theoretical modeling of charged-particle stopping will be given.

**April 12, 2013** - Vivek K. Pallipuram: *Exploring Multiple Levels Of Performance Modeling For Heterogeneous Systems*

One of the major challenges faced by the High-Performance Computing (HPC) community today is user-friendly and accurate heterogeneous performance modeling. Although performance prediction models exist to fine-tune applications, they are seldom easy to use and do not address multiple levels of design space abstraction. Our research aims to bridge the gap between reliable performance model selection and user-friendly analysis. We propose a straightforward and accurate multi-level performance modeling suite for multi-GPGPU systems that addresses multiple levels of design space abstraction. The multi-level performance modeling suite primarily targets synchronous iterative algorithms (SIAs) using our synchronous iterative GPGPU execution (SIGE) model and addresses two levels of design space abstraction: 1) low-level, where partial details of the implementation are present along with system specifications, and 2) high-level, where implementation details are minimal and only high-level system specifications are known. The low-level abstraction of the modeling suite employs statistical techniques for runtime prediction, whereas the high-level abstraction utilizes existing analytical and quantitative modeling tools to predict the application runtime. Our initial validation efforts for the low-level abstraction yield high runtime prediction accuracy, with less than 10% error rate for several tested GPGPU cluster configurations and case studies. The development of high-level abstraction models is underway. The end goal of our research is to offer the scientific community a reliable and user-friendly performance prediction framework that allows them to optimally select a performance prediction strategy for the given design goals and system architecture characteristics.

**April 11, 2013** - Jeff Young: *Commodity Global Address Spaces - How Can We Scale Out Accelerator and Memory Performance for Tomorrow's Clusters?*

Current Top 500 systems like Titan, Stampede, and Tianhe-1A have started to embrace the use of off-chip accelerators, such as GPUs and x86 coprocessors, to dramatically improve their overall performance and efficiency numbers. At the same time, these systems also make very specific assumptions about the availability of highly optimized interconnects and software stacks that are used to mitigate the effects of running large applications across multiple nodes and their accelerators. This talk focuses on the gap in networking between high-performance computing clusters and data centers and proposes that future clusters should be built around commodity-based networks and managed global address spaces to improve the performance of data movement between host memory and accelerator memory. This thesis is supported by previous research into converged commodity interconnects and ongoing research on the Oncilla managed GAS runtime to support aggregated memory for data warehousing applications. In addition, we will speculate on how commodity-based networks and memory management for clusters of accelerators might be affected by the advent of 3D stacking and fused CPU/GPU architectures.

**April 9, 2013** - Cong Liu: *Towards Efficient Real-Time Multicore Computing Systems*

Current trends in multicore computing are towards building more powerful, intelligent, yet space- and power-efficient systems. A key requirement in correctly building such intelligent systems is to ensure real-time performance, i.e., "make the right move at the right time in a predictable manner." Current research on real-time multicore computing has been limited to simple systems for which complex application runtime behaviors are ignored; this limits the practical applicability of such research. In practice, complex but realistic application runtime behaviors often exist, such as I/O operations, data communications, parallel execution segments, and critical sections. Such runtime behaviors are currently dealt with by over-provisioning systems, which is an economically wasteful practice. I will present predictable real-time multicore computing system design, analysis, and implementation methods that can efficiently support common types of application runtime behaviors. I will show that the proposed methods are able to avoid over-provisioning systems and to reduce the number of needed hardware components to the extent possible while providing timing correctness guarantees.

In the second part of the talk, I will present energy-efficient workload mapping techniques for heterogeneous multicore CPU/GPU systems. Through both algorithmic analysis and prototype system implementation, I will show that the proposed techniques are able to achieve better energy efficiency while guaranteeing response time performance.

**April 9, 2013** - Frank Mueller: *On Determining a Viable Path to Resilience at Exascale*

Exascale computing is projected to feature billion-core parallelism. At such large processor counts, faults will become more commonplace. Current techniques to tolerate faults focus on reactive schemes for recovery and generally rely on a simple checkpoint/restart mechanism. Yet, they have a number of shortcomings. (1) They do not scale and require complete job restarts. (2) Projections indicate that the mean-time-between-failures is approaching the overhead required for checkpointing. (3) Existing approaches are application-centric, which increases the burden on application programmers and reduces portability.

To address these problems, we discuss a number of techniques and their level of maturity (or lack thereof). These include (a) scalable network overlays, (b) on-the-fly process recovery, (c) proactive process-level fault tolerance, (d) redundant execution, (e) the effect of SDCs on IEEE floating point arithmetic, and (f) resilience modeling. In combination, these methods aim to pave the path to exascale computing.

**April 5, 2013** - Sarat Sreepathi: *Optimus: A Parallel Metaheuristic Optimization Framework With Environmental Engineering Applications*

Optimus (Optimization Methods for Universal Simulators) is a parallel optimization framework for coupling computational intelligence methods with a target scientific application. Optimus includes a parallel middleware component, PRIME (Parallel Reconfigurable Iterative Middleware Engine), for scalable deployment on emergent supercomputing architectures. PRIME provides a lightweight communication layer to facilitate periodic inter-optimizer data exchanges. A parallel search method, COMSO (Cooperative Multi-Swarm Optimization), was designed and tested on various high-dimensional mathematical benchmark problems. Additionally, this work presents a novel technique, TAPSO (Topology Aware Particle Swarm Optimization), for network-based optimization problems. Empirical studies demonstrate that TAPSO achieves better convergence than standard PSO for Water Distribution Systems (WDS) applications. Scalability analysis of Optimus was performed on the Cray XK6 supercomputer (Jaguar) at Oak Ridge Leadership Computing Facility for the leak detection problem in WDS. For a weak scaling scenario, we achieved 84.82% of baseline performance at 200,000 cores, relative to performance at 1,000 cores.

**March 20, 2013** - J.W. Banks: *Stable Partitioned Solvers for Compressible Fluid-Structure Interaction Problems*

In this talk, we discuss recent work concerning the development and analysis of stable, partitioned solvers for fluid-structure interaction problems. In a partitioned approach, the solvers for each fluid or solid domain are isolated from each other and coupled only through the interface. This is in contrast to fully-coupled monolithic schemes where the entire system is advanced by a single unified solver, typically by an implicit method. Added-mass instabilities, common to partitioned schemes, are addressed through the use of a newly developed interface projection technique. The overall approach is based on imposing the exact solution to local fluid-solid Riemann problems directly in the numerical method. Stability of the FSI coupling is discussed using normal-mode stability theory, and the new scheme is shown to be stable for a wide range of material parameters. For the rigid body case, the approach is shown to be stable even for bodies of no mass or rotational inertia. This difficult limiting case exposes interesting subtleties concerning the notion of added mass in fluid-structure problems at the continuous level.

**March 13, 2013** - Travis Thompson: *Navier-Stokes equations to Describe the Motion of Fluid Substances*

The Navier-Stokes equations describe the motion of fluid substances; the equations are widely utilized to model many physical phenomena such as weather patterns, ocean currents, turbulent fluid flow and magneto-hydrodynamics. Despite their wide utilization a comprehensive theoretical understanding remains an open question; the equations offer a venue for challenges at the forefront of both theoretical and computational knowledge. My work at Texas A&M has focused, primarily, on two topics: aspects of hyperbolic conservation laws, specifically mass conservation for incompressible Navier-Stokes, and computational investigation of an LES model based on a new eddy-viscosity; both embody appeal to highly-parallel scientific computing albeit in differing ways.

With respect to hyperbolic conservation laws: on the computational side I have implemented a one-step artificial compression term in a numerical code which counteracts an entropy-viscosity regularization term. This is an innovative approach; canonical methods for interface tracking are two-step or adaptive procedures. In addition the implementation utilizes a splitting approach, originally designed for use in a highly-parallel momentum equation variant, as an approximation operator in the time-stepping scheme; this approach imbues the algorithm with additional parallelism. On the theoretical side a distinct approach towards the analysis of dispersion error, utilizing a commutator expression, has been investigated for particular finite element spaces; the approach offers a computational segue into investigating consistency error and moves away from the canonical, tedious, expansion-based methodology of analysis.

With respect to large eddy simulations (LES): Computational investigations of an eddy-viscosity model based on the entropy-viscosity of Guermond & Popov have been underway for the last six months; in collaboration with Dr. Larios, a post-doc here at Texas A&M, an analysis of the qualitative and statistical attributes of high Reynolds number, turbulent flow is being conducted. We will compare our results to the Smagorinsky-Lilly turbulence model and attempt to verify basic tenets of isotropic turbulence theory; namely the Kolmogorov -5/3 law and predictions regarding the uncorrelated nature of velocity structure functions.

**March 1, 2013** - Bob Salko: *Development, Improvement, and Validation of Reactor Thermal-Hydraulic Analysis Tools*

As a result of the need for continual development, qualification, and application of computational tools relating to the modeling of nuclear systems, the Reactor Dynamics and Fuel Management Group (RDFMG) at the Pennsylvania State University has maintained an active involvement in this area. This presentation will highlight recent RDFMG work relating to thermal-hydraulic modeling tools. One such tool is the COolant Boiling in Rod Arrays - Two Fluids (COBRA-TF) computer code, capable of modeling the independent behavior of continuous liquid, vapor, and droplets using the sub-channel methodology. Work has been done to expand the modeling capabilities from the in-vessel region only, which COBRA-TF has been developed for, to the coolant-line region by developing a dedicated coolant-line-analysis package that serves as an add-on to COBRA-TF. Additional COBRA-TF work includes development of a pre-processing tool for faster, more user-friendly creation of COBRA-TF input decks, implementation of post-processing capabilities for visualization of simulation results, and optimization of the source code for significant improvements in simulation speed and memory management. Of equal importance to these development activities is the validation of the resulting tools for their intended applications. The code capability to capture rod-bundle thermal-hydraulic behavior during prototypical PWR operating conditions will be demonstrated through comparison of predicted and experimental results for the New Experimental Studies of Thermal-Hydraulics of Rod Bundles (NESTOR) tests. Due to the growing usage of Computational Fluid Dynamics (CFD) tools in this area, modeling results predicted by the STAR-CCM+ CFD tool will also be presented for these tests.

**February 23, 2013** - Thomas L. Lewis: *Finite Difference and Discontinuous Galerkin Numerical Methods for Fully Nonlinear Second Order PDEs with Applications to Stochastic Optimal Control*

In this talk I will discuss a convergence framework for directly approximating the viscosity solutions of fully nonlinear second order PDE problems. The main focus will be the introduction of a set of sufficient conditions for constructing convergent finite difference (FD) methods. The conditions given are meant to be easier to realize and implement than those found in the current literature. The given FD methodology will then be shown to generalize to a class of discontinuous Galerkin (DG) methods. The proposed DG methods are high order and allow for increased flexibility when choosing a computational mesh. Numerical experiments will be presented to gauge the performance of the proposed DG methods. An overview of the PDE theory of viscosity solutions will also be given. The presented ideas are part of a larger project concerned with efficiently and accurately approximating the Hamilton-Jacobi-Bellman equation from stochastic optimal control.

**February 22, 2013** - Charles K. Garrett: *Numerical Integration of Matrix Riccati Differential Equations with Solution Singularities*

A matrix Riccati differential equation (MRDE) is a quadratic ODE of the form

$$X' = A_{21} + A_{22}X - XA_{11} - XA_{12}X.$$

It is well known that MRDEs may have singularities in their solution. In this presentation, both the theory and practice of numerically integrating MRDEs past solution singularities will be analyzed. In particular, it will be shown how to create a black box numerical MRDE solver, which accurately solves an MRDE with or without singularities.
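One standard way to step past a solution singularity, shown here only as an illustration and not necessarily the approach taken in the talk, is to write X = V U⁻¹ and integrate the linear system U' = A11 U + A12 V, V' = A21 U + A22 V instead; X blows up exactly where U becomes singular, while U and V stay smooth. The sketch below is a minimal scalar (1×1) instance under that assumption. The function name `integrate_linearized`, the RK4 integrator, and the test problem x' = 1 + x² are choices made for this example.

```python
import math

def integrate_linearized(a11, a12, a21, a22, x0, t_end, steps=2000):
    """Integrate x' = a21 + a22*x - x*a11 - x*a12*x (scalar case) via x = v/u.

    The pair (u, v) obeys the linear system u' = a11*u + a12*v,
    v' = a21*u + a22*v, which has no finite-time blow-up, so classical
    RK4 can step straight through points where x itself is singular.
    """
    def rhs(u, v):
        return a11 * u + a12 * v, a21 * u + a22 * v

    u, v = 1.0, x0          # u(0) = 1, v(0) = x0 gives x(0) = x0
    h = t_end / steps
    for _ in range(steps):
        k1u, k1v = rhs(u, v)
        k2u, k2v = rhs(u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = rhs(u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = rhs(u + h * k3u, v + h * k3v)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return v / u

# Example: a21 = 1, a12 = -1, a11 = a22 = 0 gives x' = 1 + x^2, x(0) = 0,
# whose exact solution tan(t) is singular at t = pi/2. Integrate to t = 2.
x_at_2 = integrate_linearized(0.0, -1.0, 1.0, 0.0, 0.0, 2.0)
print(x_at_2, math.tan(2.0))
```

Integrating x' = 1 + x² directly would fail near t = π/2, whereas here u = cos t and v = sin t remain bounded and the quotient is only formed at the end.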

**February 21, 2013** - Giacomo Dimarco: *Asymptotic Preserving Implicit-Explicit Runge-Kutta Methods For Non-Linear Kinetic Equations*

In this talk, we will discuss Implicit-Explicit (IMEX) Runge-Kutta methods which are particularly adapted to stiff kinetic equations of Boltzmann type. We will consider both the case of easily invertible collision operators and the challenging case of Boltzmann collision operators. We give sufficient conditions for such methods to be asymptotic preserving and asymptotically accurate. Their monotonicity properties are also studied. In the case of the Boltzmann operator the methods are based on the introduction of a penalization technique for the collision integral. This reformulation of the collision operator permits the construction of penalized IMEX schemes which work uniformly for a wide range of relaxation times, avoiding the expensive implicit resolution of the collision operator. Finally we show some numerical results which confirm the theoretical analysis.

**February 20, 2013** - Tom Berlijn: *Effects of Disorder on the Electronic Structure of Functional Materials*

Doping is one of the most powerful ways to tune the properties of functional materials such as thermoelectrics, photovoltaics and superconductors. Besides carriers and chemical pressure, the dopants insert disorder into the materials. In this talk I will present two case studies of doped Fe based superconductors: Fe vacancies in KxFeySe2 [1] and Ru substitutions in Ba(Fe1-xRux)2As2 [2]. With the use of a recently developed first principles method [3], non-trivial disorder effects are found that are not only interesting scientifically, but also have potential implications for materials technology. Open questions for further research will be discussed.

[1] TB, P.J. Hirschfeld, W. Ku, PRL 109 (2012)

[2] L. Wang, TB, C.-H. Lin, Y. Wang, P.J. Hirschfeld, W. Ku, PRL 110 (2013)

[3] TB, D. Volja, W. Ku, PRL 106 (2011)

**February 19, 2013** - Joshua D. Carmichael: *Seismic Monitoring of the Western Greenland Ice Sheet: Response to Early Lake Drainage*

In 2006, the drainage of a supraglacial lake through hydrofracture on the Greenland Ice Sheet was directly observed for the first time. This event demonstrated that surface-to-bed hydrological connections can be established through 1 km of cold ice and thereby allow surficial forcing of a developed subglacial drainage system by surface meltwater. In a changing-climate scenario, supraglacial lakes on the Western Greenland Ice Sheet are expected to drain earlier each summer and form new lakes at higher elevations. The ice sheet response to these earlier drainages in the near future is of glaciological concern. We address the response of the Western Greenland Ice Sheet to an observed early lake drainage using a synthesis of seismic and GPS monitoring near an actively draining lake. This experiment demonstrates that (1) seismic activity precedes the drainage event by several days and is likely coincident with crack coalescence, (2) seismic multiplet locations are coincident with the uplift of the ice during drainage, and (3) a diurnal seismic response of the ice sheet follows after the ice surface settles to pre-drainage elevation a week later. These observations are consistent with a model in which the subglacial drainage system is likely distributed, highly pressurized and with low hydraulic conductivity at drainage initiation. It also demonstrates that an early lake drainage likely reduces basal normal stress for order-week time scales by storing water subglacially. We conclude with recommendations for future long-range lake drainage detection.

**February 18, 2013** - Mili Shah: *Calculating a Symmetry Preserving Singular Value Decomposition*

The symmetry preserving singular value decomposition (SPSVD) produces the best symmetric (low rank) approximation to a set of data. These symmetric approximations are characterized via an invariance under the action of a symmetry group on the set of data. The symmetry groups of interest consist of all the non-spherical symmetry groups in three dimensions. This set includes the rotational, reflectional, dihedral, and inversion symmetry groups. In order to calculate the best symmetric (low rank) approximation, the symmetry of the data set must be determined. Therefore, matrix representations for each of the non-spherical symmetry groups have been formulated. These new matrix representations lead directly to a novel reweighting iterative method to determine the symmetry of a given data set by solving a series of minimization problems. Once the symmetry of the data set is found, the best symmetric (low rank) approximation can be established by using the SPSVD. Applications of the SPSVD to protein dynamics problems as well as facial recognition will be presented.

**February 14, 2013** - Zheng (Cynthia) Gu: *Efficient and Robust Message Passing Schemes for Remote Direct Memory Access (RDMA)-Enabled Clusters*

While significant effort has been made in improving Message Passing Interface (MPI) performance, existing work has mainly focused on eliminating software overhead in the library and delivering raw network performance to applications. The current MPI implementations such as MPICH2, MVAPICH2, and Open MPI still suffer from performance issues such as unnecessary synchronizations, communication progress problems, and lack of communication-computation overlaps. The root cause of these problems is the mismatch between the communication protocols/algorithms and the communication scenarios. In my PhD research, I will develop efficient and robust message passing schemes for both point-to-point and collective communications for RDMA-enabled clusters. Unlike existing approaches for optimizing MPI performance, our approach will allow different communication protocols/algorithms for different communication scenarios. The idea is to use the most appropriate communication scheme for each communication so as to remove the mismatches, which will eliminate unnecessary synchronizations, improve communication progress, and maximize communication-computation overlaps during a communication operation. This prospectus will describe the background of this research, present our preliminary research, and summarize the proposed future work.

**February 8, 2013** - Taylor Patterson: *Simulation of Complex Nonlinear Elastic Bodies Using Lattice Deformers*

Lattice deformers are a popular option in computer graphics for modeling the behavior of elastic bodies as they avoid the need for conforming mesh generation, and their regular structure offers significant opportunities for performance optimizations. This talk will present work that expands the scope of current grid-based elastic deformers, adding support for a number of important simulation features. The approach to be described accommodates complex nonlinear, optionally anisotropic materials while using an economical one-point quadrature scheme. The formulation fully accommodates near-incompressibility by enforcing accurate nonlinear constraints, supports implicit integration for large time steps, and is not susceptible to locking or poor conditioning of the discrete equations. Additionally, this technique increases the solver accuracy by employing a novel high-order quadrature scheme on lattice cells overlapping with the embedded model boundary, which are treated at sub-cell precision. This accurate boundary treatment can be implemented at a minimal computational premium over the cost of a voxel-accurate discretization. Finally, this talk will present part of the expanding feature set of this approach that is currently under development.

**February 6, 2013** - Makhan Virdi: *Modeling High-resolution Soil Moisture to Estimate Recharge Timing and Experiences with Geospatial Analyses*

The timing of groundwater recharge after a rainfall event is poorly understood because of its dependence on non-linear soil characteristics and variability in antecedent soil conditions. Movement of water in variably saturated soil can be described by Richards' equation, a non-linear partial differential equation without a closed-form analytical solution, which is difficult to approximate. To develop a simple recharge model using a minimum number of soil parameters, high-resolution soil moisture data from a soil column in controlled laboratory conditions were analysed to understand the wetting front propagation at a finer temporal scale. Findings from a series of simulations using an existing Finite Element model, run with varying soil properties and depth to water table, were used to propose a simple model that uses only the most significant representative soil properties and antecedent soil matrix state. In other separate geospatial analyses, satellite imagery was used to determine landslide risk cost in order to develop an algorithm for safest and shortest route planning in hilly areas susceptible to landslides; the effects of decadal climate extremes on lake-groundwater exchanges were studied; and the effects of phosphate mining on a regional scale were studied using hydrological models and geospatial analysis of LiDAR-derived DEM and watershed data.

**February 5, 2013** - Roshan J. Vengazhiyil and C. F. Jeff Wu: *Experimental Design, Model Calibration, and Uncertainty Quantification*

We will start the talk with a newly developed space-filling design, called minimum energy design (MED). The key ideas involved in constructing the MED are the visualization of each design point as a charged particle inside a box, and minimization of the total potential energy of these particles. It is shown through theoretical arguments and simulations that, under regularity conditions and proper choice of the charge function, the MED can asymptotically generate any arbitrary probability density function. This new design technique has important applications in Bayesian computation and uncertainty quantification. The second part of the talk will focus on model calibration. The commonly used Kennedy and O'Hagan (KO) approach treats the computer model as a black box, and therefore the statistically calibrated models lack physical interpretability. We propose a new framework that opens up the black box and introduces statistical models inside the computer model. This approach leads to simpler models that are physically more interpretable. Then, we will present some theoretical results concerning the convergence properties of calibration parameter estimation in the KO formulation of the model calibration problem. The KO calibration is shown to be asymptotically inconsistent. A new approach, called L2 distance calibration, is shown to be consistent and asymptotically efficient in estimating the calibration parameters.

**February 4, 2013** - Li-Shi Luo: *Kinetic Methods for CFD*

Computational fluid dynamics (CFD) is based on direct discretizations of the Navier-Stokes equations. The traditional approach of CFD is now being challenged as new multi-scale and multi-physics problems have begun to emerge in many fields -- in nanoscale systems, the scale separation assumption does not hold; macroscopic theory is therefore inadequate, yet microscopic theory may be impractical because it requires computational capabilities far beyond our present reach. Methods based on mesoscopic theories, which connect the microscopic and macroscopic descriptions of the dynamics, provide a promising approach. Besides their connection to microscopic physics, kinetic methods also have certain numerical advantages due to the linearity of the advection term in the Boltzmann equation. Dr. Luo will discuss two mesoscopic methods: the lattice Boltzmann equation and the gas-kinetic scheme, their mathematical theory and their applications to simulate various complex flows. Examples include incompressible homogeneous isotropic turbulence, hypersonic flows, and micro-flows.

**January 23, 2013** - Tarek Ali El Moselhy: *New Tools for Uncertainty Quantification and Data Assimilation in Complex Systems*

In this talk, Dr. Tarek Ali El Moselhy will present new tools for forward and inverse uncertainty quantification (UQ) and data assimilation.

In the context of forward UQ, Dr. Moselhy will briefly summarize a new scalable algorithm particularly suited for very high-dimensional stochastic elliptic and parabolic PDEs. The algorithm relies on computing a compact separated representation of the stochastic field of interest. The separated representation is computed iteratively and adaptively via a greedy optimization algorithm. The algorithm has been successfully applied to problems of flow and transport in stochastic porous media, handling “real world” levels of spatial complexity and providing orders of magnitude reduction in computational time compared to state-of-the-art methods.

In the context of inverse UQ, Dr. Moselhy will present a new algorithm for the Bayesian solution of inverse problems. The algorithm explores the posterior distribution by finding a *transport map* from a reference measure to the posterior measure, and therefore does not require any Markov chain Monte Carlo sampling. The map from the reference to the posterior is approximated using polynomial chaos expansion and is computed via stochastic optimization. Existence and uniqueness of the map are guaranteed by results from the optimal transport literature. The map approach is demonstrated on a variety of problems, ranging from inference of permeability fields in elliptic PDEs to benchmark high-dimensional spatial statistics problems such as inference in log-Gaussian Cox point processes.

In addition to its computational efficiency and parallelizability, advantages of the map approach include: providing clear convergence criteria and error measures, providing analytical expressions for posterior moments, evaluating at no additional computational cost the marginal likelihood/evidence (thus enabling model selection), the ability to generate independent uniformly-weighted posterior samples without additional model evaluations, and the ability to efficiently propagate posterior information to subsequent computational modules (thus enabling stochastic control).

In the context of data assimilation, Dr. Moselhy will present an optimal map algorithm for filtering of nonlinear chaotic dynamical systems. Such an algorithm is suited for a wide variety of applications including prediction of weather and climate. The main advantage of the algorithm is that it inherently avoids issues of sample impoverishment common to particle filters, since it explicitly represents the posterior as the push forward of a reference measure rather than with a set of samples.

**December 13, 2012** - Russell Carden: *Automating and Stabilizing the Discrete Empirical Interpolation Method for Nonlinear Model Reduction*

The Discrete Empirical Interpolation Method (DEIM) is a technique for model reduction of nonlinear dynamical systems. It is based upon a modification to proper orthogonal decomposition, which is designed to reduce the computational complexity for evaluating the reduced order nonlinear term. The DEIM approach is based upon an interpolatory projection and only requires evaluation of a few selected components of the original nonlinear term. Thus, implementation of the reduced order nonlinear term requires a new code to be derived from the original code for evaluating the nonlinearity. Dr. Carden will describe a methodology for automatically deriving a code for the reduced order nonlinearity directly from the original nonlinear code. Although DEIM has been effective on some very difficult problems, it can under certain conditions introduce instabilities in the reduced model. Dr. Carden will present a problem that has proved helpful in developing a method for stabilizing DEIM reduced models.

**December 12, 2012** - Charlotte Kotas: *Bringing Real-Time Array Signal Processing to the NVIDIA Tesla*

Underwater acoustic detection of hostile targets at range requires increasingly computationally advanced algorithms as adversaries become quieter. This seminar will discuss the mathematics behind one such algorithm and some of the challenges associated with modifying it to work in a real-time networked environment. The algorithm was modified from a sequential MATLAB formulation to a parallel CUDA FORTRAN formulation designed to run on an NVIDIA Tesla C2050 processor. Speedups of greater than 50× were observed over comparable computational sections.

**December 6, 2012** - Shuaiwen "Leon" Song: *Power, Performance and Energy Models and Systems for Emergent Architectures*

Massive parallelism combined with complex memory hierarchies and heterogeneity in high-performance computing (HPC) systems form a barrier to efficient application and architecture design. The performance achievements of the past must continue over the next decade to address the needs of scientific simulations. However, building an exascale system by 2022 that uses less than 20 megawatts will require significant innovations in power and performance efficiency. Prior to this work, the fundamental relationships between power and performance were not well understood. Our analytical modeling approach allows users to quantify the relationship between power and performance at scale by enabling study of the effects of machine and application dependent characteristics on system energy efficiency. Our model helps users isolate root causes of energy or performance inefficiencies and develop strategies for scaling systems to maintain or improve efficiency. I will also show how this methodology can be extended and applied to model power and performance in heterogeneous GPU-based architectures.

Shuaiwen "Leon" Song is a PhD candidate in the Computer Science department of Virginia Tech. His primary research interests fall broadly within the area of High Performance Computing (HPC) with a focus on power and performance analysis and modeling for large scale homogeneous and heterogeneous parallel architectures and runtime systems. He is a recipient of the 2011 Paul E. Torgersen Award for Graduate Student Research Excellence and in 2011 was an Institute for Scientific Computing Research (ISCR) Scholar at Lawrence Livermore National Laboratory. His work has been published in conferences and journals including IPDPS, IEEE Cluster, PACT, MASCOTS, IEEE TPDS, and IJHPCA.

**December 6, 2012** - Miroslav Stoyanov: *Gradient Based Dimension Reduction Approach for Stochastic Partial Differential Equations*

A dimension reduction approach is considered for uncertainty quantification, in which we use gradient information to partition the uncertainty domain into “active” and “passive” subspaces; the “passive” subspace is characterized by near-zero variance of the quantity of interest. We present a way to project the model onto the low-dimensional “active” subspace and solve the resulting problem using conventional techniques. We derive rigorous error bounds for the projection algorithm and show convergence in the $L^1$ norm.

**December 5, 2012** - Barbara Chapman: *Enabling Exascale Programming: The Intranode Challenge*

As we continue to debate the best way to program emerging generations of leadership-class hardware, it is imperative that we do not ignore the more traditional paths.

Dr. Chapman's presentation considers some of the ways in which today's intranode programming models may help us migrate legacy application code.

**December 5, 2012** - Andrew Christlieb: *An Implicit Maxwell Solver Based on Method of Lines Transpose*

Fast summation methods have been successfully used in a range of plasma applications. However, in the case of moving point charges, direct application of fast summation methods in the time domain requires the use of retarded potentials. In practice, this means that every time a point charge moves in a simulation, it leaves behind an image charge that becomes a source term for all time. Hence, at each time step the number of points in the simulation grows with the number of particles being simulated.

In this talk, Dr. Christlieb will present a new approach to Maxwell's equations based on the method of lines transpose. The method starts by expressing Maxwell’s equations in second order form, and then the time operator is discretized. The resulting implicit system is then solved using integral methods. This process is known as the method of lines transpose. This approach pushes the time history into a volume integral, which does not grow in complexity with time. To efficiently solve the boundary integral, Dr. Christlieb will explain the developed ADI method, which is combined with an $O(N)$ solver for the 1D boundary integrals and is competitive with explicit time stepping methods. Because the new method is implicit, this approach does not have a CFL restriction. Further, because the approach is based on an integral formulation, the new method easily encompasses complex geometry with no special modification. Dr. Christlieb will present preliminary results of this method applied to wave propagation and some basic Maxwell examples.
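As a rough illustration of "discretize time first, then solve by integral methods", consider the scalar wave equation as a stand-in for the second-order form of Maxwell's equations. The centered implicit time discretization, the wave speed $c$, the time step $\Delta t$, and the parameter $\alpha$ are assumptions introduced for this sketch and are not taken from the talk:

$$
u_{tt} = c^2 \nabla^2 u
\;\Longrightarrow\;
\frac{u^{n+1} - 2u^{n} + u^{n-1}}{\Delta t^{2}} = c^{2}\,\nabla^{2} u^{n+1}
\;\Longrightarrow\;
\left(\nabla^{2} - \alpha^{2}\right) u^{n+1} = -\alpha^{2}\left(2u^{n} - u^{n-1}\right),
\qquad \alpha = \frac{1}{c\,\Delta t}.
$$

Each time step is then a modified Helmholtz problem. In one dimension the operator $\partial_{xx} - \alpha^{2}$ has the free-space Green's function $G(x,y) = -e^{-\alpha|x-y|}/(2\alpha)$, so $u^{n+1}$ can be written as an integral of the previous two time levels against $G$, plus boundary terms.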

**November 27, 2012** - Charles Jackson: *Metrics for Climate Model Validation*

A “valid” model is a model that has been tested for its intended purpose. In the Bayesian formulation, the “log-likelihood” is a test statistic for selecting, weeding, or weighting climate model ensembles with observational data. This statistic has the potential to synthesize the physical and data constraints on quantities of interest. One of the thorny issues in formulating the log-likelihood is how one should account for biases because not all biases affect predictions of quantities of interest. Dr. Jackson makes use of a 165-member ensemble of CAM3.1/slab ocean climate models with different parameter settings to think through the issues that are involved with predicting each model’s sensitivity to greenhouse gas forcing given what can be observed from the base state. In particular, Dr. Jackson uses multivariate empirical orthogonal functions to decompose the differences that exist among this ensemble to discover what fields and regions matter to the model’s sensitivity. What is found is that the differences that matter can be a small fraction of the total discrepancy. Moreover, weighting members of the ensemble using this knowledge does a relatively poor job of adjusting the ensemble mean toward the known answer. Dr. Jackson will discuss the implications of this result.

**November 15, 2012** - Erich Foster: *Finite Elements for the Quasi-Geostrophic Equations of the Ocean*

Erich Foster will present a conforming finite element (FE) discretization of the pure stream function form of the quasi-geostrophic equations (QGE), which are a commonly used model for the large-scale wind-driven ocean circulation. The pure stream function form of the QGE is a fourth-order PDE and therefore requires a $C^1$ FE discretization to be conforming. Thus, the Argyris finite element, a $C^1$ FE with 21 degrees of freedom, was chosen for the FE discretization of the QGE. Optimal error estimates for the pure stream function form of the QGE will be presented. The QGE is a simplified model of the ocean; however, it can be computationally expensive to resolve all scales, so numerical methods such as the two-level method are indispensable for time-sensitive projects. A two-level method and an optimal error estimate for that method, applied to the conforming FE discretization of the pure stream function form of the QGE, will be presented, and computational efficiency will be demonstrated.

**October 25, 2012** - Shi Jin: *Asymptotic-Preserving Schemes for Boltzmann Equation and Related Problems with Stiff Sources*

Dr. Shi Jin will propose a general framework to design asymptotic-preserving schemes for the Boltzmann kinetic and related equations. Numerically solving these equations is challenging due to the nonlinear stiff collision (source) terms induced by a small mean free path or relaxation time. Dr. Jin will propose to penalize the nonlinear collision term by a BGK-type relaxation term, which can be solved explicitly even if discretized implicitly in time. Moreover, the BGK-type relaxation operator helps to drive the density distribution toward the local Maxwellian, and thus naturally imposes an asymptotic-preserving scheme in the Euler limit. The scheme so designed does not need any nonlinear iterative solver or the use of Wild Sum. It is uniformly stable in terms of the (possibly small) Knudsen number, and can capture the macroscopic fluid dynamic (Euler) limit even if the small scale determined by the Knudsen number is not numerically resolved. Dr. Jin will show how this idea can be applied to other collision operators, such as the Landau-Fokker-Planck operator and the Uehling-Uhlenbeck model, and to the kinetic-fluid model of disperse multiphase flows.

**October 24, 2012** - Shi Jin: *Semiclassical Computation of High Frequency Waves in Heterogeneous Media*

Dr. Shi Jin will introduce semiclassical Eulerian methods that are efficient in computing high-frequency waves through heterogeneous media. The method is based on the classical Liouville equation in phase space, with discontinuous Hamiltonians due to barriers or material interfaces. Dr. Jin will provide physically relevant interface conditions consistent with the correct transmissions and reflections, and then build the interface conditions into the numerical fluxes. This method allows the resolution of high-frequency waves without numerically resolving the small wavelengths, and captures the correct transmissions and reflections at the interface. This method can also be extended to deal with diffraction and quantum barriers. Dr. Jin will also discuss an Eulerian Gaussian beam formulation, which can compute caustics more accurately.

**October 09, 2012** - Christian Ringhofer: *Charged Particle Transport in Narrow Geometries under Strong Confinement*

Kinetic transport in narrow tubes and thin plates, involving scattering of particles with a background, is modeled by classical and quantum mechanical sub-band type macroscopic equations for the density of particles (ions). On large time scales, the results are diffusion equations with the projection of the (asymptotically conserved) energy tensor on the confined directions as an additional free variable. Classical transport of ions through protein channels and quantum transport in thin films are discussed as examples of the application of this methodology.

**October 05, 2012** - Amilcare Porporato: *Stochastic soil moisture dynamics: from soil-plant biogeochemistry and land-atmosphere interactions to sustainable use of soil and water*

The soil-plant-atmosphere system is characterized by a large number of interacting processes with a high degree of unpredictability and nonlinearity. These elements of complexity, while making a full modeling effort extremely daunting, are also responsible for the emergence of characteristic behaviors. Researchers at Duke University model these processes by means of minimalist models, which describe the main deterministic components of the system and surrogate the high-dimensional ones (i.e., hydroclimatic variability and rainfall in particular) with suitable stochastic terms. The solution of the stochastic soil water balance allows us to describe probabilistically several ecohydrological processes, including ecosystem response, plant productivity, and soil organic matter and nutrient cycling dynamics. Dr. Porporato will also discuss how such an approach can be extended to include land-atmosphere feedbacks and their related impact on convective precipitation. Dr. Porporato will conclude with a brief discussion of how these methods can be employed to address quantitatively the sustainable management of water and soil resources, including optimal irrigation and fertilization, phytoremediation, and soil salinization risk.