[SCore-users-jp] [SCore-users] Resource problem

Nick Birkett nrcb @ streamline-computing.com
Mon Mar 10 22:24:10 JST 2003


Hi - I am getting an odd resource problem on a machine that has been running
SCore 5.0.1 for 200 days.

I can run rpmtest and scstest between the 2 nodes.
I can run a parallel job on each node separately,
but when I try to run on both hosts together I get a "Resource unavailable"
error.

A Myrinet line card was replaced on Friday. Could it be that a cable is
in the wrong port (but surely rpmtest and scstest would not work then)?

I have tried restarting scoreboard and msgbserv and rebooting comp29 and comp30.

Does anyone have an idea about this?

------------------------------------------------------------------------

[nrcb@saturn mpi]$ cat hosts
comp29.ex.ac.uk
comp30.ex.ac.uk  

[nrcb@saturn mpi]$ scout -wait -F hosts -e scrun -nodes=2 ./jacobi_mpi
SCOUT: Spawning done.          
SCore-D 5.0.1 connected (jid=70).
<0:0> SCORE: 2 nodes (1x2) ready.
  Running with nprocs= 2
  Array size nxg,nyg =  1024 1024
  Iteration count    =  1024
  Running with nprocs= 2
 cpus= 2: Iteration =  10  8.66808374E+12
 cpus= 2: Iteration =  20  8.61407852E+12

WORKS

[nrcb@saturn mpi]$ scout -wait -F hosts -e scrun -nodes=4 ./jacobi_mpi
SCOUT: Spawning done.          
FEP:ERROR SCore-D Login failed: Resource unavailable.
SCOUT: Session done.

DOESN'T WORK

[nrcb@saturn mpi]$ cat hosts
comp30.ex.ac.uk        
comp29.ex.ac.uk


[nrcb@saturn mpi]$ scout -wait -F hosts -e scrun -nodes=2 ./jacobi_mpi
SCOUT: Spawning done.          
SCore-D 5.0.1 connected (jid=72).
<0:0> SCORE: 2 nodes (1x2) ready.
  Running with nprocs= 2
  Array size nxg,nyg =  1024 1024
  Iteration count    =  1024
  Running with nprocs= 2
 cpus= 2: Iteration =  10  8.66808374E+12
 cpus= 2: Iteration =  20  8.61407852E+12
 cpus= 2: Iteration =  30  8.57295514E+12

WORKS

[nrcb@saturn mpi]$ scout -wait -F hosts -e scrun -nodes=4 ./jacobi_mpi
SCOUT: Spawning done.          
FEP:ERROR SCore-D Login failed: Resource unavailable.
SCOUT: Session done.

DOESN'T WORK

The jacobi_mpi application is a standard one that works with up to 64 processes.

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users


