Computer Sciences Group Research Highlights

Recent highlights include:

Latest results of our PVM research include studies of cross-platform cluster computing.
Netlib has had 390,000 requests for PVM (second only to Lapack).
We have recently released a beta version of PVM that allows a single virtual machine to contain Unix, NT, and even Windows 95 hosts. Over 1000 copies of this cross-platform version have been downloaded. Feedback from users is being incorporated into the next official release of PVM (version 3.4), due out this summer. Version 3.4 will also be a testbed for ideas we are working on to create the next-generation heterogeneous distributed computing environment -- the Generalized Plug-in Machine.
SUN recently announced it will supply and support PVM (and MPI) on its workstations.
IBM recently announced that it will use the ORNL version of PVM as its supplied and supported PVM version, abandoning its own PVMe version.
ORNL's Distributed Computing Research was awarded one of ten AMSE Technology Achievement Awards on April 24, 1997.
CUMULVS is a key integrating technology in the DOE 2000 ACTS Toolkit
The 1997 ACTS Toolkit research has a goal of developing a scientific template library (SciTL) to aid in the writing of large applications. The research is divided into three components: Frameworks, Numerics, and Runtime. Cumulvs appears in the runtime portion of the proposal, but ORNL has been working with the Sandia group involved in the numerics component to integrate Cumulvs into the PD++ application code. Recently, ORNL has been discussing with LANL how Cumulvs can interface to the POOMA framework and provide steering and fault tolerance to PAWS applications. By spanning all three components and including collaborations with Sandia and LANL, Cumulvs plays a key role in the SciTL research.
Cumulvs has been chosen as one of two ACTS toolkit highlights to be demonstrated to the DOE R&D council in July 1997.
Electronic Notebook installed by Materials MicroCharacterization Collaboratory
The Materials MicroCharacterization Collaboratory (MMC) is one of two DOE 2000 pilot projects. This research is a collaboration between ANL, LBNL, NIST, ORNL, and the University of Illinois. The first collaborative tool installed by MMC is a set of electronic notebooks for sharing research ideas and results. These notebooks (based on a prototype developed at ORNL) include two public notebooks, a committee notebook, and two private notebooks. Combined, the shared notebooks already have over a hundred pages in them. Feedback has been quite positive.
The Diesel Combustion Collaboratory, the other DOE 2000 pilot, is evaluating the notebook and has provided feedback on additional features.
Over 30 groups around the world have installed our prototype electronic notebook.
NetSolve beta release aims to create a virtual software library
NetSolve is a network-enabled solver that allows users access to computational resources -- both hardware and software -- distributed across a network. Its development was motivated by a need for easy-to-use, efficient mechanisms for remotely accessing computational resources. Ease of use is provided by four different interfaces: Fortran, C, MATLAB, and Java.

A strong point of NetSolve is that users can reach remote hardware platforms from within their own programs by making calls through NetSolve to various software components, so the location of the software is transparent. The result is a virtual library: a software library that a person can use remotely even though it does not exist on their machine. Library resources can therefore be managed centrally, so the most up-to-date version is always available and system administrators no longer have to maintain software packages on a variety of different machines.

The NetSolve system has three components: the client, which can be either a user program or a user interacting with one of the NetSolve interfaces; the NetSolve agent; and the pool of NetSolve resources. The entry point into the NetSolve system is the client sending a problem request to the agent. The agent analyzes this request and chooses a computational resource. The problem and its input data are then sent to the chosen NetSolve resource. The problem is solved by the appropriate scientific package on some hardware platform and the result is sent back to the client. This system can be deployed on the Internet or on a local intranet.
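
The request flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NetSolve code: the names `Resource`, `Agent`, and `client_request`, the load-based selection rule, and the `"dot"` problem are all hypothetical, and everything runs in one process rather than over a network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Resource:
    """Hypothetical computational resource: a host offering named solvers."""
    name: str
    load: float  # assumed workload metric; lower means less busy
    solvers: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def solve(self, problem: str, *args):
        # Run the requested problem with the locally installed package.
        return self.solvers[problem](*args)


class Agent:
    """Hypothetical agent: picks a resource for each problem request."""

    def __init__(self, pool: List[Resource]):
        self.pool = pool

    def choose(self, problem: str) -> Resource:
        # Keep only resources that offer the problem, then take the least loaded.
        candidates = [r for r in self.pool if problem in r.solvers]
        if not candidates:
            raise LookupError(f"no resource offers {problem!r}")
        return min(candidates, key=lambda r: r.load)


def client_request(agent: Agent, problem: str, *args):
    """Client side: ask the agent for a resource, send the data, return the result."""
    resource = agent.choose(problem)
    return resource.name, resource.solve(problem, *args)


def dot(x, y):
    """Stand-in for a scientific package routine: a vector dot product."""
    return sum(a * b for a, b in zip(x, y))


pool = [
    Resource("hostA", load=0.9, solvers={"dot": dot}),
    Resource("hostB", load=0.2, solvers={"dot": dot}),
    Resource("hostC", load=0.1, solvers={}),  # idle, but lacks the solver
]
agent = Agent(pool)
chosen, result = client_request(agent, "dot", [1, 2, 3], [4, 5, 6])
```

Here the agent skips `hostC` because it does not offer the requested problem, and prefers `hostB` over `hostA` because of its lower assumed load. The criteria a real agent uses may differ.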

Currently, NetSolve can be enabled on any Unix-based machine. Its mechanism manages and exploits heterogeneity throughout, without the user having to deal with the complexities and hassles of network programming. Traditionally, a user who wanted access to a given subroutine or function would write a call to it, passing the input and output arguments. With NetSolve, you still call a routine and pass the parameters, but the executable software can be anywhere on the net. You simply call NetSolve and pass the arguments to it, and it figures out the most suitable computational device. It then sends your problem to that device for computation, retrying if necessary for fault tolerance, and returns the answers to your program.
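
The retry behavior can be illustrated with a small Python sketch. It assumes that a retry means resubmitting the same problem to the next candidate resource; the text does not say how NetSolve retries, so that policy, the `ResourceFailure` exception, and the function names below are hypothetical.

```python
from typing import Callable, Sequence


class ResourceFailure(Exception):
    """Raised by a hypothetical resource that cannot complete a request."""


def call_with_retry(servers: Sequence[Callable[..., object]], *args):
    """Try each candidate server in order; return the first successful result."""
    errors = []
    for server in servers:
        try:
            return server(*args)
        except ResourceFailure as exc:
            errors.append(exc)  # remember the failure and move to the next server
    raise ResourceFailure(f"all {len(servers)} servers failed: {errors}")


def flaky_server(x, y):
    """Stand-in for a resource that is down."""
    raise ResourceFailure("host unreachable")


def healthy_server(x, y):
    """Stand-in for a resource that completes the computation."""
    return x + y


# The caller sees only the answer; the failed first attempt is hidden.
answer = call_with_retry([flaky_server, healthy_server], 2, 3)
```

From the calling program's point of view this is still an ordinary function call, which is the transparency the paragraph above describes.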

Billion-particle molecular dynamics simulation is the largest in the world
Researchers Ed D'Azevedo and Charles Romine from the Department of Energy's (DOE) Oak Ridge National Laboratory (ORNL) have set a new world record for system size in molecular dynamics. They are the first researchers to be able to simulate over 1 billion atoms. D'Azevedo and Romine have demonstrated that their new shared-memory library can be used to run very large-scale molecular dynamics on Intel's 1,024-node Paragon supercomputer.

Molecular dynamics, which models the interactions between the atoms in a chemical, biological or solid state system, is a cornerstone of applications ranging from the study of DNA-protein interactions to the design of new materials.

In a record-breaking demonstration run, D'Azevedo and Romine performed molecular dynamics simulations for systems of one billion particles, with each simulation step taking about 280 seconds. This result is a major step forward over recently reported simulations of 600 million particles on a 1,024-node Thinking Machines CM-5 and 400 million particles on the 1,024-node Paragon supercomputer at Beaverton performed by another team at ORNL.
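
To put these figures in perspective, the short calculation below derives a few rates from them. It assumes exactly one billion particles, exactly 280 seconds per step, and that the run used all 1,024 nodes; the variable names are illustrative only.

```python
particles = 1_000_000_000   # assumed: exactly one billion particles
nodes = 1024                # assumed: the full 1,024-node Paragon was used
seconds_per_step = 280.0    # "about 280 seconds" per simulation step

# Work held by each node, and aggregate throughput per second.
particles_per_node = particles / nodes
updates_per_second = particles / seconds_per_step
updates_per_node_second = updates_per_second / nodes

# System size relative to the earlier simulations mentioned above.
ratio_vs_cm5 = particles / 600_000_000
ratio_vs_paragon = particles / 400_000_000

print(f"particles per node:           {particles_per_node:,.0f}")
print(f"particle updates per second:  {updates_per_second:,.0f}")
print(f"updates per node per second:  {updates_per_node_second:,.0f}")
print(f"size vs. 600M-particle run:   {ratio_vs_cm5:.2f}x")
print(f"size vs. 400M-particle run:   {ratio_vs_paragon:.2f}x")
```

Under these assumptions each node holds a little under a million particles, and the machine as a whole advances roughly 3.6 million particles per second of wall-clock time.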

The MD code was modified from SOTON_PAR and was developed by Ed D'Azevedo and Charles Romine. Details of the implementation can be found in their ORNL Tech Report. The shared-memory emulation library DOLIB (Distributed Object Library) was developed by the same authors, with support from PICS (Partnership in Computational Sciences), to simplify parallel programming on distributed-memory multiprocessors such as the Intel MP Paragon. DOLIB uses the IPX message system developed by Ron Peierls and Bob Marr at Brookhaven National Laboratory; IPX is in turn written on top of PVM.

The ability to run larger MD simulations brings many new problems within reach.

MPI Interoperability Forum Started
The Interoperable Message-Passing Interface (IMPI) forum has been organized by NIST and includes MPI developers as well as MPI interoperability developers (ORNL). Al Geist has been invited to be a member of this exclusive committee. While MPI was designed not to preclude heterogeneity, none of the half-dozen MPI implementations can interoperate with each other. Thus a user cannot presently create a heterogeneous cluster and use the vendor's optimized MPI implementation on each different host, because a "send" from one MPI cannot be "received" by another MPI. The goal of this forum is to define a standard binding for MPI that will allow heterogeneous computing.
For further information, contact:
Al Geist CS Group Leader
http://www.epm.ornl.gov/~geist/
gst@ornl.gov
(423)574-3153
http://www.epm.ornl.gov/msr/msrcs.html
Oak Ridge National Laboratory / (webmaster@www.epm.ornl.gov)
Last modified: April 29, 1997