This is an early release of SSS-OSCAR, a slightly modified version of the
standard OSCAR cluster installation that includes the Scalable System
Software (SSS) components. The release is still in its early phases, and
many defaults have been chosen to simplify installation via OSCAR.

Enjoy!


INSTALLATION:

 - For further details please see the "Installation Guide" included in
   the doc/ directory.

 - Quick summary

     # where VERSION is the software release version
     root# tar -zxf sss-oscar-VERSION.tar.gz -C /tmp
     root# cd /tmp/sss-oscar-VERSION
     root# ./configure
     root# make install
     root# cd /opt/oscar
     root# ./install_cluster eth0


RELEASE NOTES:

 - The version string indicates both the SSS version and the OSCAR
   version used for the release. For example, "sss-oscar-0.2a1-v3.0"
   is sss alpha v0.2a1 and oscar stable v3.0.

 - Occasionally on the first invocation of the 'install_cluster' script,
   an error occurs related to the initialization of the OSCAR DAtabase
   (ODA) that causes the script to stop. If this occurs, simply re-run
   the script and it should start up properly.

 - Some tests have stalled/hung during 'Step 3: Install OSCAR Server
   Packages' when trying to start NFS. Starting/restarting the 'portmap'
   service fixes the problem, e.g.,

     [root@headnode]# service portmap restart

 - If standard manual pages are not available, use the following to
   extend the MANPATH (this is due to a problem with the
   modules/env-switcher shipped in oscar-3.0).

     BASH users
       [root@headnode]# export MANPATH="$MANPATH:"

     CSH users
       [root@headnode]# setenv MANPATH "${MANPATH}:"

 - Due to some differences between standard PBS and the Bamboo & friends
   tools used with SSS-OSCAR, some of the test scripts are SKIPPED. More
   specifically, any OSCAR Package test that uses a 'test_user' script,
   which makes use of the 'pbs_test' helper script, will be flagged as
   SKIPPED. This will be fixed in a future release. (This issue is known
   to affect LAM/MPI, MPICH & PVM.)

 - The following packages were removed from the stock OSCAR package set:

     maui
     pbs
     lam
     switcher

   Each was removed either because SSS-OSCAR supplies an alternate
   version or because of conflicts/errors. In the case of switcher, this
   release includes a newer version than the one in OSCAR-3.0.

 - During "Step 7: 'Complete Cluster Setup'" some services that are
   restarted print usage errors when stopping the service. This is
   generally not a problem and can be ignored. Example,

     Stopping Event Manager: cat: /var/run/sss_em.pid: No such file or directory
     kill: usage: kill [-s sigspec | -n signum | -sigspec] [pid | job]... or kill -l [sigspec]
     done

 - Warehouse: If the test script fails, try manually restarting
   Warehouse's client and server services by running the following
   commands (in this order):

     [root@headnode]# /etc/init.d/warehouse_SysMon stop \
                       && cexec -p /etc/init.d/warehouse_node stop

     [root@headnode]# cexec -p /etc/init.d/warehouse_node start \
                       && /etc/init.d/warehouse_SysMon start

 - If trying to work directly from CVS, the 'make_dist.pl' script should
   be helpful in pulling together the necessary files. It creates a
   tarball that can be used for testing.


RESOURCES:

 - Scalable System Software (SSS) Project
     http://www.scidac.org/ScalableSystems

 - OSCAR Homepage
     http://www.OpenClusterGroup.org/OSCAR

 - SSS-OSCAR project page
     http://sss-oscar.sourceforge.net


# $Id: README,v 1.12 2005/03/24 22:02:31 naughtont Exp $
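

EXAMPLE:

 - The version-string convention described in the release notes can be
   illustrated with a small shell sketch. The helper name
   'split_sss_oscar_version' is hypothetical and is not shipped with
   SSS-OSCAR; it assumes the string always has the form
   "sss-oscar-<SSS version>-v<OSCAR version>", as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of SSS-OSCAR): split a release name into
# its SSS and OSCAR version parts.
# Assumes the form "sss-oscar-<SSS version>-v<OSCAR version>".
split_sss_oscar_version() {
    rest=${1#sss-oscar-}         # drop the leading "sss-oscar-"
    sss_version=${rest%-v*}      # text before the last "-v"
    oscar_version=${rest##*-v}   # text after the last "-v"
    echo "sss=$sss_version oscar=$oscar_version"
}

split_sss_oscar_version "sss-oscar-0.2a1-v3.0"
```

   The function uses only POSIX parameter expansion, so it should behave
   the same under sh and bash. Note that it sets the variables 'rest',
   'sss_version' and 'oscar_version' in the calling shell.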