[SCore-users] Job suspension

Holger Berger hberger at ess.nec.de
Fri Aug 2 01:03:31 JST 2002


Dear Score-experts,

Is it possible to suspend a SCore job and reuse the nodes
used by the suspended job?

Example:

$ scrun -nodes=4,file=hfscore ./hello_score
SCOUT: Spawning done.
SCore-D 5.0.1 connected.
<0:0> SCORE: 4 nodes (2x2) ready.
Hi, i'm process 3 out of 4 running on necslave01
Hi, i'm process 2 out of 4 running on necslave01
Hi, i'm process 1 out of 4 running on necmaster
Hi, i'm process 0 out of 4 running on necmaster

[1]+  Stopped                 scrun -nodes=4,file=hfscore ./hello_score
$ scrun -nodes=4,file=hfscore ./hello_score
SCOUT: Failed to lock MessageBoard.

Suspension is done using ^Z, but could also come from a batch system.
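
As a rough illustration of what a batch system would do here, the sketch
below stops a child process, leaves room for a second job, and then
continues the first one. It is only a sketch under assumptions: a sleeping
Python child stands in for scrun, and SIGSTOP is used where ^Z would
deliver SIGTSTP, because SIGSTOP cannot be caught or ignored. Nothing in it
is SCore-specific, and the function names are made up for the example.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Send SIGSTOP and wait until the child is reported as stopped."""
    os.kill(pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid return for a stopped (not only exited) child.
    _, status = os.waitpid(pid, os.WUNTRACED)
    return os.WIFSTOPPED(status)


def resume(pid):
    """Send SIGCONT and wait until the child is reported as continued."""
    os.kill(pid, signal.SIGCONT)
    # WCONTINUED makes waitpid return once the child has been continued.
    _, status = os.waitpid(pid, os.WCONTINUED)
    return os.WIFCONTINUED(status)


# Assumption: this sleeping child is a placeholder for the long-running job.
job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

was_stopped = suspend(job.pid)
# ... the short, higher-priority job would run here ...
was_resumed = resume(job.pid)

# Clean up the placeholder job.
job.terminate()
job.wait()
```

Stopping the process this way does not by itself free anything the job
holds, which matches the lock failure seen in the transcript above.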


The batch system will make sure that the suspended task is only resumed
once the second job has finished.
Would it be possible to allow this?
In this case, the nodes should no longer be locked but free after suspension,
and scrun should check their state again when it is resumed; they should be
free by that time.
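
To make the requested behaviour concrete, here is a minimal sketch of the
lock handling I have in mind. It is not how SCore locks its nodes; the
NodeLock class, the lock file and the use of flock() are all assumptions
chosen only to show the order of events: release on suspension, re-check
on resume.

```python
import fcntl
import os
import tempfile


class NodeLock:
    """Hypothetical stand-in for a node lock; not SCore's real mechanism."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def try_acquire(self):
        """Try to take the lock without blocking; return True on success."""
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            os.close(fd)
            return False
        self.fd = fd
        return True

    def release(self):
        """Give the lock up, e.g. when the job is suspended."""
        if self.fd is not None:
            fcntl.flock(self.fd, fcntl.LOCK_UN)
            os.close(self.fd)
            self.fd = None


lock_path = os.path.join(tempfile.mkdtemp(), "node.lock")
first_job = NodeLock(lock_path)
second_job = NodeLock(lock_path)

events = []
events.append(first_job.try_acquire())   # first job starts and locks the node
events.append(second_job.try_acquire())  # second job is refused while it runs
first_job.release()                      # first job is suspended, node freed
events.append(second_job.try_acquire())  # second job can now use the node
events.append(first_job.try_acquire())   # resuming too early is refused
second_job.release()                     # second job finishes
events.append(first_job.try_acquire())   # first job resumes, node is free
first_job.release()
```

The fourth step is the check on resume: if the batch system does its job,
that refusal never happens, because the first job is only continued after
the second one has finished.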

Is it possible to implement this feature?
This would help a lot in integrating SCore with
commercial batch systems that offer preemptive
scheduling, which is a very nice feature users love,
as they can interrupt long-running jobs during the daytime
for short-running jobs.

Thanks for any hints, and regards,
	Holger Berger
	
-- 
Holger Berger                                   hberger at ess.nec.de
NEC European Supercomputer Systems, European HPC Technology Center
Stuttgart, Germany   phone: +49-711-68770-35  fax: +49-711-6877145



