Rollback versus Restart…
Rollback Recovery:
- Only Replace Failed Tasks, Roll Back the Rest.
- Monitor ALL Communication for Restart Notifications.
- Unroll Program Stack, Reset Comm & Files…
- Necessary for High Overhead Restart Cases.
Restart Recovery:
- Kill Everything & Restart All Tasks.
- Simple Approach, No Additional Instrumentation.
- Recovery Is Not as Efficient in Some Cases…
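
A minimal sketch of the contrast between the two strategies, assuming a toy in-memory task model. The `Task` class, its `checkpoint` field, and both function names are hypothetical and not taken from any real fault-tolerance library. It also assumes that both strategies resume from a previously saved checkpoint, which the slide does not state. The only difference shown is how many tasks must be respawned, which is where rollback pays off when restarting a task is expensive.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for one task of a parallel job."""
    name: str
    state: int = 0        # current progress (arbitrary units)
    checkpoint: int = 0   # last saved consistent state (assumed to exist)
    alive: bool = True


def rollback_recovery(tasks):
    """Replace only the failed tasks; roll every task back to its checkpoint.

    Returns the number of tasks that had to be respawned.
    """
    respawned = 0
    for task in tasks:
        if not task.alive:
            task.alive = True          # replace the failed task
            respawned += 1
        task.state = task.checkpoint   # survivors roll back in place
    return respawned


def restart_recovery(tasks):
    """Kill everything and restart all tasks from the checkpoint.

    Returns the number of tasks that had to be respawned (all of them).
    """
    respawned = 0
    for task in tasks:
        task.alive = True              # every task is killed and started anew
        respawned += 1
        task.state = task.checkpoint
    return respawned


def make_tasks():
    """Build four tasks past their checkpoint, with one of them failed."""
    tasks = [Task(f"t{i}", state=10, checkpoint=7) for i in range(4)]
    tasks[2].alive = False
    return tasks


if __name__ == "__main__":
    print("rollback respawns:", rollback_recovery(make_tasks()))
    print("restart respawns: ", restart_recovery(make_tasks()))
```

In this sketch the rollback path touches the process table only for the failed task, but a real implementation would also need the instrumentation listed above (communication monitoring, stack unwinding, resetting communication and files), which the restart path avoids.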