[SCore-users-jp] [SCore-users] Job suspension

Holger Berger hberger @ ess.nec.de
Fri Aug 2 01:03:31 JST 2002


Dear SCore experts,

Is it possible to suspend a SCore job and reuse the nodes
used by the suspended job?

Example:

$ scrun -nodes=4,file=hfscore ./hello_score
SCOUT: Spawning done.
SCore-D 5.0.1 connected.
<0:0> SCORE: 4 nodes (2x2) ready.
Hi, i'm process 3 out of 4 running on necslave01
Hi, i'm process 2 out of 4 running on necslave01
Hi, i'm process 1 out of 4 running on necmaster
Hi, i'm process 0 out of 4 running on necmaster

[1]+  Stopped                 scrun -nodes=4,file=hfscore ./hello_score
$ scrun -nodes=4,file=hfscore ./hello_score
SCOUT: Failed to lock MessageBoard.

Suspension is done using ^Z, but could also come from a batch system.
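
To make the batch-system case concrete, here is a minimal sketch of how a
scheduler could suspend and later resume a job with signals. It is only an
illustration under assumptions: ^Z delivers SIGTSTP from the terminal, while
this sketch sends SIGSTOP, which cannot be caught; a short Python sleep stands
in for the scrun process; and suspend(), resume() and demo() are hypothetical
helper names, not part of SCore or of any batch system. It needs a
POSIX system.

```python
import os
import signal
import subprocess
import sys


def suspend(proc):
    """Stop the job and wait until the kernel reports it as stopped."""
    os.kill(proc.pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid return for a stopped (not only exited) child.
    _, status = os.waitpid(proc.pid, os.WUNTRACED)
    return os.WIFSTOPPED(status)


def resume(proc):
    """Let a previously stopped job continue."""
    os.kill(proc.pid, signal.SIGCONT)


def demo():
    # Stand-in for the parallel job (assumption: a real scheduler would
    # target the scrun process instead of this short sleep).
    job = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(0.5)"]
    )
    stopped = suspend(job)
    # ... the scheduler would run the short, higher-priority job here ...
    resume(job)
    return stopped, job.wait()


if __name__ == "__main__":
    print(demo())
```

Stopping the process this way does nothing about the node lock, which is
exactly the problem shown in the transcript above.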


The batch system would make sure that the suspended job is resumed only
after the second job has finished.
Would it be possible to allow this?
In this case, the nodes should no longer be locked but free after suspension,
and scrun should check their state again when it is resumed; they should be
free by then.
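
The behaviour I am asking for could look roughly like the sketch below. It
does not show how scrun or SCore-D work internally; it is a generic pattern
under assumptions. A lock file held with flock stands in for the MessageBoard
lock, and NodeLock and make_suspend_handler are hypothetical names. The
handler releases the lock on SIGTSTP, stops the process, and checks the lock
again once the process is continued. It needs a POSIX system.

```python
import fcntl
import os
import signal
import tempfile


class NodeLock:
    """Hypothetical exclusive lock on a set of nodes, modelled as a lock file."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self):
        """Try to take the lock without blocking; return True on success."""
        fd = os.open(self.path, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(fd)
            return False
        self.fd = fd
        return True

    def release(self):
        if self.fd is not None:
            os.close(self.fd)  # closing the descriptor drops the flock
            self.fd = None


def _stop_self():
    # SIGSTOP cannot be caught, so the process really stops here.
    os.kill(os.getpid(), signal.SIGSTOP)


def make_suspend_handler(lock, stop=_stop_self):
    """Build a SIGTSTP handler that frees the nodes while the job is stopped."""

    def handler(signum, frame):
        lock.release()
        stop()  # execution continues from here after SIGCONT
        if not lock.acquire():
            raise RuntimeError("nodes are still in use after resume")

    return handler


if __name__ == "__main__":
    lock = NodeLock(os.path.join(tempfile.mkdtemp(), "nodes.lock"))
    if lock.acquire():
        signal.signal(signal.SIGTSTP, make_suspend_handler(lock))
        # ... the job would run here; ^Z now frees the lock before stopping ...
        lock.release()
```

The stop argument exists only so that the handler can be exercised without
actually stopping the calling process.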

Is it possible to implement this feature?
This would help a lot in integrating SCore with
commercial batch systems that offer preemptive
scheduling, which is a very nice feature that users love,
as they can interrupt long-running jobs during the daytime
to run short jobs.

Thanks for any hints, and regards,
	Holger Berger
	
-- 
Holger Berger                                   hberger @ ess.nec.de
NEC European Supercomputer Systems, European HPC Technology Center
Stuttgart, Germany   phone: +49-711-68770-35  fax: +49-711-6877145

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users


