Some algorithms achieve optimal arithmetic complexity with low arithmetic intensity (flops/Byte), or possess high arithmetic intensity but lack optimal complexity, while some hierarchical algorithms, such as Fast Multipole and its H-matrix algebraic generalizations, realize a combination of optimal complexity and high intensity. Implemented with task-based dynamic runtime systems, such methods also have potential for relaxed synchrony, which is important for future energy-austere architectures, since there may be significant nonuniformity in processing rates of different cores even if task sizes can be controlled. We describe modules of KAUST's Hierarchical Computations on Manycore Architectures (HiCMA) software toolkit that illustrate these features and are intended as building blocks of more sophisticated applications, such as matrix-free higher-order methods in optimization. HiCMA's target is hierarchical algorithms on emerging architectures, which have hierarchies of their own that generally do not align well with those of the algorithm. Some modules of this open source project have been adopted in the software libraries of major vendors. We describe what is currently available and some motivating applications.
David Keyes is the director of the Extreme Computing Research Center at King Abdullah University of Science and Technology, where he was a founding dean in 2009, and adjunct professor of applied mathematics at Columbia University. Keyes earned his BSE in Aerospace and Mechanical Engineering from Princeton and his PhD in Applied Mathematics from Harvard. He works at the algorithmic interface between parallel computing and the numerical analysis of partial differential equations. He is a Fellow of SIAM and AMS and has received the AMC Gordon Bell Prize, the IEEE Sidney Fernbach Award, and the SIAM Prize for Distinguished Service to the Profession.