High Availability OSCAR Summary

OSCAR (Open Source Cluster Application Resources) is a snapshot of the best-known methods for building, programming, and using clusters. It consists of a fully integrated, easy-to-install cluster software stack designed for high-performance cluster computing. HA-OSCAR (High Availability Open Source Cluster Application Resources) is also an open source project; it leverages OSCAR to extend High Performance Computing (HPC) clustering toward non-stop services by combining High Availability and High Performance Computing solutions.

HA-OSCAR’s goal is to enhance a Beowulf cluster so that it can support mission-critical and downtime-sensitive HPC applications. To achieve high availability, it uses component redundancy to eliminate single points of failure, specifically at the head node. HA-OSCAR also incorporates a self-healing mechanism, failure detection and recovery, and automatic failover and fail-back.

HA-OSCAR is presently at release version 1.1. This release provides an Active/Hot-Standby mode of high availability for the head node. In this mode, the primary master head node is active, and the secondary (standby) master continually checks that the primary is up and active via a heartbeat mechanism. Should the secondary master detect that the primary has become inactive, the secondary assumes the role of the primary head node until the original primary again becomes available for processing. This activity is fully automated and takes place without the need for human interaction.
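
The Active/Hot-Standby behavior described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not HA-OSCAR's actual implementation: the class name StandbyMonitor, the function simulate, and the max_missed threshold of three consecutive missed heartbeats are all hypothetical, and the heartbeat itself is modeled as a simple True/False probe result rather than real network traffic.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StandbyMonitor:
    """Hypothetical standby-side monitor (illustration only, not HA-OSCAR code)."""

    probe: Callable[[], bool]  # returns True when the primary answers a heartbeat
    max_missed: int = 3        # assumed threshold; the source does not specify one
    missed: int = 0
    active_head: str = "primary"
    events: List[str] = field(default_factory=list)

    def tick(self) -> str:
        """Run one heartbeat check and return which node is acting as head."""
        if self.probe():
            self.missed = 0
            if self.active_head == "standby":
                # The original primary is available again: fail back to it.
                self.active_head = "primary"
                self.events.append("fail-back")
        else:
            self.missed += 1
            if self.active_head == "primary" and self.missed >= self.max_missed:
                # Too many consecutive misses: the standby takes over.
                self.active_head = "standby"
                self.events.append("failover")
        return self.active_head


def simulate(heartbeats: List[bool], max_missed: int = 3):
    """Feed a recorded list of heartbeat results through the monitor."""
    results = iter(heartbeats)
    monitor = StandbyMonitor(probe=lambda: next(results), max_missed=max_missed)
    roles = [monitor.tick() for _ in heartbeats]
    return roles, monitor.events


if __name__ == "__main__":
    roles, events = simulate([True, False, False, False, False, True])
    print(roles)
    print(events)
```

Requiring several consecutive missed heartbeats before failing over is a common way to avoid reacting to a single dropped packet; the value used here is an arbitrary choice for the sketch.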