[SCore-users] Score limitations?

Amik St-Cyr CFD Lab amik at cfdlab.mcgill.ca
Wed Sep 18 07:20:35 JST 2002


Hi Gents,

	Something strange is happening with our new
(not yet fully functional) Beowulf cluster.
We have a 128-node (dual-processor) cluster connected by both a Myrinet 2000
switch and a 100 Mb/s Ethernet network.
	The only problem (so far) is that when I try a very simple
program, scrun/mpirun hangs forever when the number of
nodes used is greater than 64.

Example:
--------

| amik at stokes 17:49:23 amik> scout -g cn -n 64
SCOUT: Spawning done.                
SCOUT: session started.
| amik at stokes 18:09:08 amik> scrun -nodes=64x1 ./myname 
SCore-D 5.0.1 connected.
<0:0> SCORE: 64 nodes (64x1) ready.
My name is cn13.clumeq.mcgill.ca
...etc.

WORKS FINE

then:

| amik at stokes 18:09:30 amik> scrun -nodes=64x2 ./myname 
SCore-D 5.0.1 connected.
<0:0> SCORE: 128 nodes (64x2) ready.
My name is cn1.clumeq.mcgill.ca
...etc.

ALSO WORKS FINE

Now booting with 65 nodes:

| amik at stokes 18:09:41 amik> exit
SCOUT: Session done.
| amik at stokes 18:11:09 amik> scout -g cn -n 65
SCOUT: Spawning done.                
SCOUT: session started.
| amik at stokes 18:11:23 amik> scrun -nodes=65x1 ./myname 

Hangs forever...

| amik at stokes 18:11:52 amik> scrun -nodes=65x2 ./myname 

Hangs forever...

| amik at stokes 18:12:39 amik> scrun -nodes=64x2 ./myname 

Hangs forever...

| amik at stokes 18:13:18 amik> scrun -nodes=64x1 ./myname 

Hangs forever...



It seems that once we start the scout session with 65 nodes, even the
64-node case no longer works.
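
The runs above can be repeated without waiting on each hang by wrapping
scrun in a timeout. What follows is only a minimal sketch, assuming Python 3
is available on the front end. The helper names run_with_timeout and
probe_layouts and the 120-second limit are made up for illustration; the
only scrun arguments used are the ones shown in the transcript above. It has
not been checked whether killing scrun locally also cleans up the remote
processes.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=120.0):
    """Run cmd and classify the result.

    Returns ("ok", 0) on success, ("fail", rc) on a non-zero exit,
    ("hang", None) if the command exceeds timeout_s, and
    ("missing", None) if the executable cannot be found.
    """
    try:
        # Output is discarded so that leftover child processes holding a
        # pipe open cannot block this call after the timeout fires.
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return ("hang", None)
    except FileNotFoundError:
        return ("missing", None)
    if proc.returncode == 0:
        return ("ok", 0)
    return ("fail", proc.returncode)


def probe_layouts(layouts, program="./myname", timeout_s=120.0):
    """Try each scrun layout (for example "64x1") and collect the outcomes."""
    results = {}
    for layout in layouts:
        cmd = ["scrun", "-nodes=" + layout, program]
        results[layout] = run_with_timeout(cmd, timeout_s)
    return results
```

Inside a scout session, probe_layouts(["64x1", "64x2", "65x1", "65x2"])
would then return one status per layout, so the 64-node and 65-node
sessions can be compared side by side.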

Do you folks think that:

1) the problem is Myrinet 2000 related?
2) the driver for the Myrinet switch has a problem?
   We are using:
	/* PM/Myrinet */
	myrinet2k   type=myrinet2k \
		    -firmware:file=/opt/score/share/lanai/lanaiM2k.mcp \
		    -config:file=/opt/score/etc/pm-myrinet.conf
3) the Myrinet connections are bad?
4) node 65 is misconfigured?
5) SCore is the culprit (maybe because of a misconfiguration)?


Best regards,


Amik St-Cyr
-- 
_____________________________________________________
Dr. A. St-Cyr
Research Associate, CFD Lab
Department of Mechanical Engineering
McGill University
688 Sherbrooke Street West, 7th floor
Montreal, Qc, Canada H3A 2S6
Tel: +1 (514) 398-1710, Admin. Fax : 2203
amik at cfdlab.mcgill.ca
_____________________________________________________