This is a slightly modified version of the standard OSCAR cluster
installation toolkit that includes the Scalable System Software (SSS)
components. The SSS-OSCAR releases are still in the early phases, and many
defaults have been chosen to simplify the installation via OSCAR.

Enjoy!


INSTALLATION:

 - For further details please see the "Installation Guide" included in
   the doc/ directory.

 - Quick summary (VERSION is the software release version):

     root# tar -zxf sss-oscar-VERSION.tar.gz -C /tmp
     root# cd /tmp/sss-oscar-VERSION
     root# ./configure
     root# make install
     root# cd /opt/oscar
     root# ./install_cluster eth0


RELEASE NOTES:

 - The version string indicates both the SSS version and the OSCAR
   version used for the release. For example, "sss-oscar-1.1-v3.0" is
   SSS v1.1 and OSCAR stable v3.0.

 - Some tests have stalled/hung during 'Step 3: Install OSCAR Server
   Packages' when trying to start NFS. Starting/restarting the 'portmap'
   service fixes the problem, e.g.,

     [root@headnode]# service portmap restart

 - Due to some differences between standard PBS and the Bamboo & friends
   tools used with SSS-OSCAR, some of the test scripts are SKIPPED. More
   specifically, any OSCAR Package test that uses a 'test_user' script,
   which makes use of the 'pbs_test' helper script, will be flagged as
   SKIPPED. This will be fixed in a future release. (This issue is known
   to affect LAM/MPI, MPICH & PVM.)

 - The following packages were removed from the stock OSCAR package set:

     maui
     pbs
     lam
     switcher

   Each was removed either because SSS-OSCAR supplies an alternate
   version or because of conflicts/errors. In the case of switcher, this
   release includes a newer version than the one in OSCAR-3.0.

 - Occasionally, on the first invocation of the 'install_cluster' script,
   an error related to the initialization of the OSCAR Database (ODA)
   causes the script to stop. If this occurs, simply re-run the script
   and it should start up properly.

 - During "Step 7: 'Complete Cluster Setup'" some of the services that
   are restarted print usage errors when they are stopped. This is
   generally not a problem and can be ignored. Example:

     Stopping Event Manager: cat: /var/run/sss_em.pid: No such file or directory
     kill: usage: kill [-s sigspec | -n signum | -sigspec] [pid | job]... or kill -l [sigspec]
     done

 - Warehouse: due to an ordering issue and limitations with this release
   of OSCAR, the top-level 'post_install' script has been modified to run
   Warehouse's "post_install" API script again after all other scripts in
   this phase of the install.

 - Warehouse: to manually restart Warehouse's client and server services,
   type the following commands in the order shown:

     [root@headnode]# /etc/init.d/warehouse_SysMon stop \
                        && cexec -p /etc/init.d/warehouse_node stop

     [root@headnode]# cexec -p /etc/init.d/warehouse_node start \
                        && /etc/init.d/warehouse_SysMon start

 - If you are working directly from CVS, the 'make_dist.pl' script should
   be helpful for pulling together the necessary files. It creates a
   tarball that can be used for testing.


RESOURCES:

 - Scalable System Software (SSS) Project
     http://www.scidac.org/ScalableSystems

 - OSCAR Homepage
     http://www.OpenClusterGroup.org/OSCAR

 - SSS-OSCAR project page
     http://sss-oscar.sourceforge.net


# $Id: README,v 1.14 2005/07/08 17:54:51 naughtont Exp $
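

EXAMPLE (re-running 'install_cluster' after a first-run failure):

The release notes say that if 'install_cluster' stops on its first
invocation because of the ODA initialization error, it is enough to re-run
it. A minimal sketch of that advice follows. The helper name
'run_with_one_retry' is hypothetical and not part of SSS-OSCAR, and the
sketch assumes the script exits with a non-zero status when it stops.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SSS-OSCAR): run a command and,
# if it fails, run the same command exactly one more time.
# Assumption: the command signals failure with a non-zero exit status.
run_with_one_retry() {
    "$@" && return 0
    echo "First attempt failed; retrying: $*" >&2
    "$@"
}

# Example use on the head node, as root (left commented out so that
# sourcing this file has no side effects):
#   cd /opt/oscar && run_with_one_retry ./install_cluster eth0
```

If the second attempt also fails, the helper returns that failure status,
and the error output should be inspected rather than retried again.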