Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA)

held in conjunction with

SC14: The International Conference on High Performance Computing, Networking, Storage and Analysis

in cooperation with ACM SIGHPC

November 17, 2014, New Orleans, LA, USA

https://www.csm.ornl.gov/srt/conferences/Scala/2014/

Novel scalable scientific algorithms are needed to enable key science applications to exploit the computational power of large-scale systems. This is especially true for the current tier of leading petascale machines and on the road to exascale computing, as HPC systems continue to scale up in compute node and processor core counts. These extreme-scale systems require novel scientific algorithms that hide network and memory latency, achieve very high computation/communication overlap, keep communication minimal, and have no synchronization points.

Scientific algorithms for multi-petaflop and exaflop systems also need to be fault tolerant and fault resilient, since the probability of faults increases with scale. Resilience at the system software level and at the algorithmic level is needed as a crosscutting effort. Finally, with the advent of heterogeneous compute nodes that employ standard processors as well as GPGPUs, scientific algorithms need to match these architectures to extract the most performance. This includes different system-specific levels of parallelism as well as co-scheduling of computation. Key science applications require novel mathematical models and system software that address the scalability and resilience challenges of current- and future-generation extreme-scale HPC systems.

Submission Guidelines

Authors are invited to submit manuscripts in English, structured as technical papers not exceeding 8 letter-size (8.5x11 inch) pages, including figures, tables, and references, using the IEEE format for conference proceedings. Submissions not conforming to these guidelines may be returned without review. Reference style files are available at http://www.ieee.org/conferences_events/conferences/publishing/templates.html.

All manuscripts will be reviewed and judged on correctness, originality, technical strength, significance, quality of presentation, and interest and relevance to the workshop attendees. Submitted papers must represent original unpublished research that is not currently under review for any other conference or journal. Papers not following these guidelines will be rejected without review, and further action may be taken, including (but not limited to) notifications sent to the heads of the institutions of the authors and sponsors of the conference. Submissions that are received after the due date, exceed the length limit, or are not appropriately structured may also not be considered. At least one author of an accepted paper must register for and attend the workshop. Authors may contact the workshop program chair for more information. Papers should be submitted electronically at: https://www.easychair.org/conferences/?conf=scala20140.

Full papers will be published with the SC'14 workshop proceedings in the IEEE and ACM digital libraries. Selected papers will be invited for an extended version in a special issue of the Journal of Computational Science (JoCS).

Important Dates

Full paper submission: 7 September, 2014
Notification of acceptance: 23 September, 2014
Final paper submission (firm): 6 October, 2014

Topics

Topics of interest include, but are not limited to:

Novel scientific algorithms that improve performance, scalability, resilience, and power efficiency
Porting scientific algorithms and applications to many-core and heterogeneous architectures
Performance and resilience limitations of scientific algorithms and applications at scale
Crosscutting approaches (system software and applications) in addressing scalability challenges
Scientific algorithms that can exploit extreme concurrency (e.g. 1 billion for exascale by 2020)
Naturally fault tolerant, self-healing, or fault oblivious scientific algorithms
Programming model and system software support for algorithm scalability and resilience

Workshop Chairs

Prof. Vassil Alexandrov, Barcelona Supercomputing Center, Spain
Al Geist, Oak Ridge National Laboratory, USA

Workshop Program Chair

Dr. Christian Engelmann, Oak Ridge National Laboratory, USA

Program Committee

Prof. Vassil Alexandrov, Barcelona Supercomputing Center, Spain
Dr. Rick Archibald, Oak Ridge National Laboratory, USA
Dr. David E. Bernholdt, Oak Ridge National Laboratory, USA
Dr. Greg Bronevetsky, Lawrence Livermore National Laboratory, USA
Dr. Michael Heroux, Sandia National Laboratories, USA
Dr. Mark Hoemmen, Sandia National Laboratories, USA
Prof. Marian Bubak, AGH University of Science and Technology, Krakow, Poland and University of Amsterdam, The Netherlands
Prof. Zizhong Chen, University of California, Riverside, USA
Dr. Christian Engelmann, Oak Ridge National Laboratory, USA
Dr. Kirk E. Jordan, IBM T.J. Watson Research, USA
Prof. Dieter Kranzlmueller, Ludwig-Maximilians-University Munich, Germany
Prof. Ron Perrott, University of Oxford, UK
Dr. Nageswara Rao, Oak Ridge National Laboratory, USA

Venue

Room 283-84-85, New Orleans Ernest N. Morial Convention Center, 900 Convention Center Blvd, New Orleans, LA 70130, USA

Program

09:00-10:05 Session 1
- 09:00-09:05 Opening
- 09:05-09:45 Keynote 1: "Performance Analytics for Large Scale Computations," Jesus Labarta (Barcelona Supercomputing Center, Spain)
- 09:45-10:05 Paper 1: "Scaling Parallel 3-D FFT with Non-Blocking MPI Collectives," Sukhyun Song and Jeffrey K. Hollingsworth
10:05-10:30 Coffee break (coffee provided)
10:30-12:10 Session 2
- 10:30-11:10 Keynote 2: "Fault Tolerance in Numerical Library Routines," Jack Dongarra (University of Tennessee, Knoxville, USA)
- 11:10-11:30 Paper 2: "Exploiting Data Representation for Fault Tolerance," James Elliott, Mark Hoemmen and Frank Mueller
- 11:30-11:50 Paper 3: "VCube: A Provably Scalable Distributed Diagnosis Algorithm," Elias P. Duarte Jr., Luis C.E. Bona and Vinicius K. Ruoso
- 11:50-12:10 Paper 4: "TX: Algorithmic Energy Saving for Distributed Dense Matrix Factorizations," Li Tan and Zizhong Chen (presented by Panruo Wu)
12:10-13:30 Lunch break (lunch on your own)
13:30-15:10 Session 3
- 13:30-14:10 Keynote 3: "Unite and Conquer Approach for Large Scale Numerical Computing," Nahid Emad (University of Versailles, France)
- 14:10-14:30 Paper 5: "CholeskyQR2: A Simple and Communication-Avoiding Algorithm for Computing a Tall-Skinny QR Factorization on a Large-Scale Parallel System," Takeshi Fukaya, Yuji Nakatsukasa, Yuka Yanagisawa and Yusaku Yamamoto
- 14:30-14:50 Paper 6: "Deflation Strategies to Improve the Convergence of Communication-Avoiding GMRES," Ichitaro Yamazaki, Stanimire Tomov, and Jack Dongarra
- 14:50-15:10 Paper 7: "A Framework for Parallel Genetic Algorithms for Distributed Memory Architectures," Dobromir Georgiev, Emanouil Atanassov and Vassil Alexandrov
15:10-15:30 Coffee break (coffee provided)
15:30-17:30 Session 4
- 15:30-16:10 Keynote 4: "Dynamic Big Data Applications," Craig C. Douglas (University of Wyoming, USA)
- 16:10-16:30 Paper 8: "The Anatomy of Mr. Scan: A Dissection of Performance of an Extreme Scale GPU-Based Clustering Algorithm," Benjamin Welton and Barton Miller
- 16:30-16:50 Paper 9: "Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads Across Accelerators, Coprocessors, and Multicore Processors," Chongxiao Cao, Mark Gates, Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki and Jack Dongarra
- 16:50-17:10 Paper 10: "A Hierarchical Tridiagonal System Solver for Heterogenous Supercomputers," Xinliang Wang, Yangtong Xu and Wei Xue
- 17:10-17:30 Closing

Proceedings

IEEE Computer Society Digital Library: 2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems