Participating Institutions:

Oak Ridge National Laboratory
North Carolina State University
 

Transient Data Recovery in HPC Centers



 




Description.






People.






Research Publications.

  1. C. Wang, Z. Zhang, X. Ma, S. S. Vazhkudai, F. Mueller, "Improving the Availability of Supercomputer Job Input Data Using Temporal Replication", Proceedings of Int'l Supercomputing Conference (ISC-09), Hamburg, Germany, June 2009.
  2. C. Wang, Z. Zhang, S. S. Vazhkudai, X. Ma, F. Mueller, "On-the-fly Recovery of Job Input Data in Supercomputers", Proceedings of 37th Int'l Conference on Parallel Processing (ICPP-08), Portland, Oregon, September 2008.
  3. Z. Zhang, C. Wang, S. Vazhkudai, X. Ma, G. Pike, F. Mueller, J.W. Cobb, "Optimizing Center Performance through Coordinated Data Staging, Scheduling and Recovery", Proceedings of Supercomputing 2007 (SC07): Int'l Conference on High Performance Computing, Networking, Storage and Analysis, Reno, Nevada, November 2007.
  4. S. Vazhkudai, X. Ma, "Recovering Transient Data: Automated On-demand Data Reconstruction and Offloading on Supercomputers", ACM SIGOPS Operating Systems Review: Special Issue on File and Storage Systems, Vol. 41, No. 1, pp. 14-18, January 2007.
  5. S. Vazhkudai, X. Ma, M. Vilayannur, "Data Availability for Service Availability: Automated On-demand Data Reconstruction and Offloading on Supercomputers", ORNL Tech Report 003174, September 2006.

Talks.

  1. Sudharshan Vazhkudai, "IO Virtualization: Robust Storage Management in the Machine-Room and Beyond", Virtualization in HPC, Nashville, TN, September 2006.






Job Opportunities.






URL http://www.csm.ornl.gov/~vazhkuda/TransientDataRecovery.html
Updated: Friday, 27-Aug-2010 14:18:04 EDT