[SCore-users-jp] [SCore-users] Copper Myrinet pm problems
Nick Birkett
nrcb @ streamline-computing.com
Sun Feb 2 03:17:25 JST 2003
Hi, we have just upgraded one of our older clusters from SCore 4.x to SCore
5.0.1 (I think the old version was the first SCore to support Myrinet 2000,
from 18 months ago, on a RedHat 6.2 distribution).
The cluster was working more or less ok under the old SCore system.
The entire system has been re-installed as RedHat 7.2 + SCore 5.0.1.
Hardware: copper-based Myrinet2k (May 2001) and dual Pentium III 866 MHz
SuperMicro 1U Superservers.
I have run rpmtest and the scstest -network myrinet2k for many hours over all
compute nodes without problems.
I have also run GM 1.6.3 codes (e.g. PMB) and they work fine.
SCore PM codes are having problems over Myrinet, e.g. when running PMB:
<6:0> SCORE:WARNING MPICH/SCore [buffer=0x8951498, type=1025, from=11,
size=262144, offset=189520]
<6:0> SCORE:WARNING MPICH/SCore: receive-message-queue:
<6:0> SCORE:WARNING MPICH/SCore (empty)
<6:0> SCORE:WARNING MPICH/SCore: received-fragment:
<6:0> SCORE:WARNING MPICH/SCore [buffer=0x40066180, type=1025, from=11,
size=262144, fragment_size=8240, offset=189521]
<6:0> SCORE:WARNING MPICH/SCore: queued-message:
<6:0> SCORE:WARNING MPICH/SCore [buffer=0x8951498, type=1025, from=11,
size=262144, offset=189520]
<6:0> SCORE:WARNING MPICH/SCore: received an invalid fragment (mismatched
offset)
<6:0> SCORE:PANIC MPICH/SCore: critical error on message transfer
<6:0> Trying to attach GDB (DISPLAY=localhost:10.0): PANIC
SCORE: Program aborted.
SCOUT: Session done.
Lots of buffer mismatch errors. The same binary runs fine over ethernet or
gigabit on the same hardware (i.e. if I add the -network=ethernet option then
all is ok, so it is a Myrinet problem).
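
To make the mismatch in the log above easier to see, here is a minimal Python
sketch that pulls the key=value fields out of the queued-message and
received-fragment records and compares their offsets. The helper name
parse_fields is hypothetical (it is not part of SCore or MPICH), and reading
the queued-message offset as the offset the receiver expected next is an
assumption based only on the "mismatched offset" warning text.

```python
import re

# Matches key=value pairs inside the bracketed part of a warning record.
# Values are either hex (0x...) or plain decimal integers.
FIELD_RE = re.compile(r"(\w+)=(0x[0-9a-fA-F]+|\d+)")


def parse_fields(record):
    """Hypothetical helper: return the key=value pairs of one record as ints."""
    # int(value, 0) accepts both "0x..." hex and plain decimal strings.
    return {key: int(value, 0) for key, value in FIELD_RE.findall(record)}


# Records copied from the log above, with the wrapped lines rejoined.
queued = parse_fields(
    "[buffer=0x8951498, type=1025, from=11, size=262144, offset=189520]"
)
fragment = parse_fields(
    "[buffer=0x40066180, type=1025, from=11, size=262144, "
    "fragment_size=8240, offset=189521]"
)

# Assumption: the queued-message offset is what the receiver expected next,
# so the difference shows how far off the incoming fragment is.
delta = fragment["offset"] - queued["offset"]
print(f"queued offset {queued['offset']}, "
      f"fragment offset {fragment['offset']}, delta {delta}")
```

For the records shown, the two offsets differ by a single byte (189520 versus
189521), while type, from and size all agree.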
We would like to keep SCore as the cluster has some new Xeon Gigabit nodes
but will have to convert to GM if we cannot resolve this.
Looks like a hardware problem (the same code runs fine over SCore 5.0.1 and
fibre optic Myrinet 2k on Intel Xeon systems).
Thanks,
Nick
_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users