Oak Ridge National Laboratory
The University of British Columbia
North Carolina State University
FreeLoader: Distributed Storage Scavenging
Increasingly, scientific discoveries are driven by analyses of massively distributed bulk data. This has led to the proliferation of high-end mass storage systems, storage area clusters, and data centers as storage fabric elements for supercomputing. These offer an excellent price/performance ratio and good storage speeds, but at increasing maintenance and administrative cost. A promising alternative is to harness the collective storage potential of individual workstations, much as we harness idle CPU cycles: aggregating commodity storage is economical, and the ratio of used to available space on desktops is low. However, such aggregated commodity storage is prone to volatility and machine failures, and raises performance and trust concerns.
The FreeLoader project is an effort to aggregate space and I/O bandwidth contributions from commodity desktop storage within a domain to provide a shared cache/scratch space for large, immutable data sets. The FreeLoader architecture comprises contributing benefactor nodes that are aggregated into pools and steered by a management layer. Collectively, these entities provide services such as reliability, high performance, availability, and load balancing.
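The benefactor/pool/manager split described above can be illustrated with a small in-memory sketch. This is not FreeLoader's implementation: the class names `Benefactor` and `Manager`, the fixed `chunk_size`, and the "most free space first" placement rule are all assumptions chosen for illustration, and real benefactors would be remote workstations rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Benefactor:
    """A workstation donating idle disk space (hypothetical model)."""
    name: str
    capacity: int                                # donated space, in bytes
    used: int = 0
    chunks: dict = field(default_factory=dict)   # (dataset, index) -> bytes

    def free(self) -> int:
        return self.capacity - self.used

    def store(self, key, data: bytes) -> None:
        self.chunks[key] = data
        self.used += len(data)


class Manager:
    """Management layer: tracks the pool and where each chunk lives."""

    def __init__(self, pool, chunk_size=4):
        self.pool = list(pool)
        self.chunk_size = chunk_size             # assumed fixed chunk size
        self.placement = {}                      # dataset -> owner per chunk

    def put(self, dataset: str, data: bytes) -> None:
        # Split the immutable dataset into chunks and place each one on the
        # benefactor with the most free space (a simple load-balancing rule).
        owners = []
        for index, start in enumerate(range(0, len(data), self.chunk_size)):
            chunk = data[start:start + self.chunk_size]
            target = max(self.pool, key=lambda b: b.free())
            if target.free() < len(chunk):
                raise RuntimeError("pool is out of donated space")
            target.store((dataset, index), chunk)
            owners.append(target)
        self.placement[dataset] = owners

    def get(self, dataset: str) -> bytes:
        # Reassemble the dataset by fetching chunks in order from their owners.
        return b"".join(
            owner.chunks[(dataset, index)]
            for index, owner in enumerate(self.placement[dataset])
        )
```

For example, `Manager([Benefactor("a", 8), Benefactor("b", 8)])` followed by `put("ds", b"01234567")` spreads two chunks over both benefactors, and `get("ds")` returns the original bytes. The sketch deliberately omits the harder concerns named above (volatility, failures, trust), which would require replication and failure detection on top of this placement map.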
The following Supercomputing 2005 paper, ORNL-TR-P05-123435, reflects our current thinking.
FreeLoader Primary Use Case Picture
Other FreeLoader Use Cases Picture
Check out Dilbert's view on user desktop sharing... :) (comic strip 11/28/2004).
- Sudharshan Vazhkudai (ORNL)
- Matei Ripeanu (UBC)
- Xiaosong Ma (NCSU/ORNL)
- Zhe Zhang (PhD Student, NCSU)
- Samer Al Kiswany (PhD Student, UBC)
- Vincent Freeh (NCSU)
- Tyler Simon (US Army Corp. HPC)
- Nandan Tammineedi (MS, NCSU; First Hop: Yahoo!)
- Jonathan Strickland (First Hop: Nortel Networks)
- X. Ma, S.S. Vazhkudai, Z. Zhang, "Improving Data Availability for Better Access Performance: A Study on Caching Scientific Data on Distributed Workstations", Journal of Grid Computing, Special Issue on Volunteer Computing and Desktop Grids, July 2009. pdf
- S.A. Kiswany, A. Bahramshahry, H. Ghasemi, M. Ripeanu, S.S. Vazhkudai, "A High-Performance GridFTP Server at Desktop Cost", Poster in Supercomputing 2007 (SC07): Int'l Conference on High Performance Computing, Networking, Storage and Analysis, Reno, Nevada, November 2007. pdf
- S. Vazhkudai, D. Thain, X. Ma, V. Freeh, "Positioning Dynamic Storage Caches for Transient Data", in Proceedings of the International Workshop on High Performance I/O Techniques and Deployment of Very Large Scale I/O Systems (HiperIO'06 at IEEE Cluster Computing), Barcelona, Spain, September 2006. pdf
- X. Ma, S. Vazhkudai, V. Freeh, T.A. Simon, T. Yang, S.L. Scott, "Coupling Prefix Caching and Collective Downloads for Remote Data Access", in Proceedings of the 20th ACM International Conference on Supercomputing, pp. 229-238, Cairns, Australia, June 2006. pdf Talk
- S. Vazhkudai, X. Ma, V. Freeh, J. Strickland, N. Tammineedi, T.A. Simon, S.L. Scott, "Constructing Collaborative Desktop Storage Caches for Large Scientific Datasets", To appear in the ACM Transactions on Storage (TOS), 2006. pdf
- V.W. Freeh, X. Ma, S. Vazhkudai, J. Strickland, "Controlling Impact While Aggressively Scavenging Idle Resources", To appear in the Journal of Cluster Computing, 2006.
- S. Vazhkudai, X. Ma, V. Freeh, J. Strickland, N. Tammineedi, S.L. Scott, "FreeLoader: Scavenging Desktop Storage Resources for Scientific Data", Proceedings of Supercomputing 2005 (SC'05): Int'l Conference on High Performance Computing, Networking and Storage, Seattle, Washington, November 2005. pdf Talk
- V. Freeh, X. Ma, J. Strickland, S. Vazhkudai, "Synergetic Resource Stealing: We Promise It Will Not Hurt Much", Tech Report P05-123434, Oak Ridge National Laboratory, May 2005. pdf
- J. Strickland, V. Freeh, X. Ma, S. Vazhkudai, "Governor: Autonomic Throttling for Aggressive Idle Resource Scavenging", Proceedings of 2nd IEEE International Conference on Autonomic Computing (ICAC 2005), Seattle, WA, June 2005. pdf Talk
- N. Tammineedi, "Design of the management component in a scavenged storage environment", Masters Thesis Report, Computer Science Department, North Carolina State University, May 2005. pdf
- S. Vazhkudai, X. Ma, V. Freeh, J. Strickland, N. Tammineedi, S.L. Scott, "FreeLoader: Distributed Storage Infrastructure Using Scavenging", ORNL Booth Poster at Supercomputing 2004. poster JPG
- S. Vazhkudai, "On-demand Grid Storage using Scavenging", Proceedings of the Session on New Trends in Distributed Data Access, Las Vegas, Nevada, June 2004. ps Talk
- Sudharshan Vazhkudai, "Optimizing End-User Data Delivery Using Storage Virtualization", Systems Group Seminar, Department of Computer Science and Engineering, Ohio State University, October 2006. Talk
- Xiaosong Ma, "FreeLoader: Lightweight Data Management", IBM Almaden, September 2004. Talk
Current Testbed at ORNL.
- Testbed snapshot
- Over a dozen scavenger workstations, each with 100 Mb/s Ethernet
- Aggregate storage of over 400 GB
- hsi access to HPSS
- GridFTP access to GPFS, PVFS