Thread: 28 messages, 5 authors, 2009-03-18

Re: [PATCH 1/3] powerpc: bare minimum checkpoint/restart implementation

From: Cedric Le Goater <hidden>
Date: 2009-03-17 06:55:48

> Again, how would 'cr' obtain exit status for these tasks, and how would
> it distinguish failure from normal operation?

Here's our solution to this issue.

mcr maintains in its kernel container object an exitcode attribute for 
the mcr-restart process. This process is detached from the fork tree of 
the restarted application.  

When the restart is finished, an mcr-wait command can be called to reap
this exitcode. This makes it possible to distinguish an exit of the
application process from an exit of the mcr-restart process.
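
As a rough illustration of the idea only, here is a minimal user-space
model. It is not mcr code: the Container class, its field names, and both
helper functions are hypothetical, and the "reap once" behaviour is an
assumption about what reaping means here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    """Hypothetical stand-in for the kernel container object."""
    # Exit code of the restart helper process; kept apart from the application.
    restart_exitcode: Optional[int] = None
    # Exit code of the restarted application itself.
    app_exitcode: Optional[int] = None


def record_restart_exit(container: Container, code: int) -> None:
    """Store the restart helper's exit code on the container."""
    container.restart_exitcode = code


def reap_restart_exitcode(container: Container) -> Optional[int]:
    """Return the stored restart exit code and clear it (assumed reap-once).

    Returns None if no restart exit code has been recorded.
    """
    code = container.restart_exitcode
    container.restart_exitcode = None
    return code
```

Because the two codes live in separate fields, a caller such as a batch
manager could tell a failed restart (non-zero restart code) apart from an
application that was restarted fine and later exited on its own.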

This is a must-have for batch managers in an HPC environment. 

Cheers,

C.