Bill Hargrove and Forrest Hoffman
 
Bill Hargrove, left, and Forrest Hoffman use computer clusters to conduct environmental research at ORNL. The scientists write computer codes that enable the old PCs to perform parallel-processing tasks much as supercomputers do.

Computer Clusters

A mix of old computers is joined into clusters in an ORNL research lab.

from Knoxville News-Sentinel
October 21, 2002
 
original URL: http://www.knoxnews.com/kns/ornl/article/0,1406,KNS_4257_1484543,00.html

Analytical horsepower


Two scientists at Oak Ridge among pioneers in harnessing humble PCs into data dynamos

By Frank Munger, News-Sentinel senior writer
October 21, 2002

Cluster computing is like harnessing a team of horses to pull a heavy load. That's an analogy offered by Bill Hargrove and Forrest Hoffman, who were among the first to effectively turn groups of PCs into research workhorses.

Before assembling their first cluster of computers in 1996, the Oak Ridge scientists were drowning in ecological data. They were desperate for a quick way to analyze many sources and types of information.

"Now we just want more data," Hoffman said during a visit to their makeshift computer room on the first floor of ORNL's environmental sciences building. Computers were stacked everywhere, in racks and on the floor.

Their initial venture was dubbed the Stone SouperComputer, taking its name from the fable about a wanderer who promises to feed a village by boiling water and adding a stone. Village skeptics ultimately turn into volunteers, contributing vegetables and enough fixings to make a hearty soup, thus proving that good things come from meager beginnings through cooperation.

At ORNL, Hargrove and Hoffman built their cluster by mixing and matching dozens of castoff PCs from the lab's "swap shop" and gifts from cooperative secretaries who upgraded from 486s to better machines.

With virtually no capital investment, the Stone SouperComputer grew to 128 nodes (single processors) and was able to accomplish parallel-processing tasks equivalent to a small supercomputer.

"It really comes down to your codes," said Hoffman, a computer specialist in the environmental sciences division. "We invested a lot of time writing programs for dynamic load-balancing. That assigns more work to the faster nodes and less work to the slower nodes, but together they all combine to solve the problem."

The parallel PCs helped the ORNL team produce detailed maps of U.S. "eco-regions," with areas color-coded according to their environmental similarities. The computer nodes evaluated and correlated ecological data with 25 or more variables, everything from nitrogen levels in the soil to the humidity in April.
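The article does not name the statistical method behind the maps. One common way to group map cells by similarity across many variables is k-means clustering, sketched below in Python with NumPy on made-up data. Only the count of 25 variables comes from the article; the cell count, number of regions, and random data are illustrative assumptions.

# A minimal sketch of grouping map cells into "eco-regions" by similarity
# across many environmental variables, using k-means clustering on fake data.
import numpy as np

rng = np.random.default_rng(0)
cells = rng.random((10_000, 25))            # 10,000 map cells x 25 variables

# Standardize each variable so soil nitrogen and April humidity are comparable.
cells = (cells - cells.mean(axis=0)) / cells.std(axis=0)

k = 8                                       # number of eco-regions to find
centers = cells[rng.choice(len(cells), k, replace=False)]

for _ in range(20):                         # a few k-means iterations
    # Assign each cell to its nearest center in 25-dimensional space.
    dists = np.linalg.norm(cells[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Move each center to the mean of the cells assigned to it.
    for j in range(k):
        if np.any(labels == j):
            centers[j] = cells[labels == j].mean(axis=0)

print("cells per eco-region:", np.bincount(labels, minlength=k))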

Hargrove, a landscape ecologist, said, "We've been able to tackle bigger problems with higher resolution than people would even think of doing before."

The Stone SouperComputer is now partially disassembled, and Hoffman and Hargrove no longer rely on junk computers to do their work. They have a budget to buy new processors needed for projects that support the laboratory's work on climate change.

The cluster they use now is a Beowulf, a design named for the one developed at NASA, and cost about $50,000 to build. It has nine nodes, each with two Intel Pentium III processors running at 1.0 GHz, 1 GB of memory (RAM), 18 GB of disk space, and two Fast Ethernet interfaces for file-sharing and log-in access.

"While this system is relatively small, its capacity exceeds that of the Stone SouperComputer when we shut it down," Hoffman said.

"The great thing about these Beowulf-style clusters is that we can add to them when we need more capacity and upgrade them slowly over time as projects and problems necessitate."

A science fad of the 1990s is now a reliable, everyday tool at research institutions big and small.

That's due, in part, to software development at ORNL that makes it relatively simple for scientists everywhere to hook up a bunch of PCs and operate them as a single parallel computer.

"What makes it so popular is the cost," said Al Geist of ORNL's Center for Computational Sciences.

Instead of investing millions of dollars in a high-performance supercomputer, a university can build a powerful research cluster for $150,000.

Scientifically useful groupings can be had for a lot less. The cost of setting up a cluster is about $1,000 per node.

Even the smallest of colleges are using them to teach students how to do parallel computing and to do their academic research, Geist said.

The Oak Ridge lab heads a national consortium that developed OSCAR (Open Source Cluster Application Resources), the most popular cluster-management software program. According to Geist, it's "brain-dead easy" to use. The software comes on a CD and walks users through connecting the machines and making them work together.

"We're very proud of the fact that we're making it possible for scientists all over the world to group PCs into clusters for good use," he said.

Though not officially classified as supercomputers, big clusters can be formidable research machines.

In order to do research on software and other cluster developments, ORNL's computing center has assembled its own 64-processor system known as Extreme TORC. When it's not engaged in cluster development, the experimental system is available to staff researchers.

Cluster computing is highly individualized. Hoffman and Hargrove develop their own research tools, implementing software from ORNL or elsewhere. They also use computer clusters to prepare some projects to be run on IBM supercomputers available at the lab.

"If the problem is small enough, we'll do it here on the cluster because we don't have to wait in line," Hoffman said. "If it's a really enormous problem - if we can't run it in two days or less - then we'll do the prep work here and migrate the data over to the IBM, do the runs there, bring the data back and do the hard-core analysis here."

Hoffman writes a monthly column for Linux Magazine called "Extreme Linux," in which he discusses how to build and run these machines and how to write programs to maximize their capabilities.

Geist said cluster computers are now used in almost all fields, from oil exploration to airplane design.

"The biggest one in the world is probably run by Google, which has about 6,000 processors," he said. "But they don't do any interesting science. They're just using it to regurgitate URLs."

It's not unheard of for a research institution to have a cluster sitting right next to its supercomputer, Geist said. But, even at their best, cluster systems can't interact as well as a high-end supercomputer specifically designed for many processors to work together, he said.

And, while clusters can boost a scientist's research capabilities, connecting four or more PCs at home would probably offer few benefits to Average Joe or Average Jane.

"Most folks who use their computer to surf the Web and check their e-mail already have a computer that's probably 10 times more powerful than they need," Geist said.

Even video games, the single biggest power draw on most home computers, wouldn't benefit much from a cluster because those games typically are set up to run on a single processor, he said.

Copyright 2002, KnoxNews. All Rights Reserved.
Mirrored with permission.

