[SCore-users] Adding an additional computing node

Jure Jerman jure.jerman at rzs-hm.si
Wed May 14 18:58:37 JST 2003


Hi,

I checked: the SCore version is 5.4.0 and the
binaries in /opt/score/deploy/* are the same
everywhere.
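
One way to make the "same everywhere" check repeatable is to reduce the whole deploy tree to a single checksum per host and compare the values. The following is only a minimal sketch: the helper name tree_digest is hypothetical, it is not part of SCore, and it assumes Python 3 is available on the compute hosts.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return one SHA-256 hex digest covering every regular file under root.

    Hypothetical helper, not part of SCore. Both the relative file names
    and the file contents feed the digest, so a missing, renamed or
    modified binary changes the result.
    """
    root = Path(root)
    outer = hashlib.sha256()
    # Sort so the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        outer.update(str(path.relative_to(root)).encode())
        outer.update(b"\0")
        outer.update(hashlib.sha256(path.read_bytes()).digest())
    return outer.hexdigest()
```

Calling tree_digest("/opt/score/deploy") on each host and comparing the printed values would flag a host whose binaries differ. How the script is launched on every host (for example through scout) is left open here.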

The behaviour of node number 13 is getting stranger.
After rebooting SCore I was able to run the hello
program, but not every time, as can be seen from
the output below:

[root at tuba0 root]# scrun -nodes=28 /home/jure/hello
SCore-D 5.4.0 connected (jid=7).
_pmEthernetAttachContext(9, 0x40017ebc): _pmEthernetMapContext(9, 0x40017ebc): 135088608
<13:0> SCORE:ERROR _pmEthernetAttachContext(9, 0x40017ebc): _pmEthernetMapContext(9, 0x40017ebc): 135088608
pmAttachContext(type=ethernet,fd=9)=135088608
<13:1> SCORE:ERROR pmAttachContext(type=ethernet,fd=9)=135088608
<0:0> SCORE: 28 nodes (14x2) ready.
hello, world (from node 12)
hello, world (from node 10)
hello, world (from node 14)
hello, world (from node 4)
hello, world (from node 23)
hello, world (from node 2)
hello, world (from node 8)
hello, world (from node 6)
hello, world (from node 15)
hello, world (from node 21)
hello, world (from node 22)
hello, world (from node 13)
hello, world (from node 19)
hello, world (from node 17)
hello, world (from node 18)
hello, world (from node 16)
hello, world (from node 20)
hello, world (from node 11)
hello, world (from node 5)
hello, world (from node 3)
hello, world (from node 9)
hello, world (from node 7)
hello, world (from node 1)
hello, world (from node 25)
hello, world (from node 24)
hello, world (from node 0)
[root at tuba0 root]# scrun -nodes=28 /home/jure/hello
SCore-D 5.4.0 connected (jid=8).
<13> SCORE: Program signaled (SIGSEGV).


Note that the behaviour is quite random: hello sometimes
runs without any problem, whereas cpi fails every time.
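
To put numbers on "quite random", captured scrun transcripts can be classified and counted. This is a rough sketch only: the function names are hypothetical, and the patterns are taken from the hello output and the SCORE messages shown above, so they would need adjusting for cpi or any other program.

```python
import re
from collections import Counter

# Patterns copied from the transcript above; other programs print other lines.
HELLO_RE = re.compile(r"hello, world \(from node (\d+)\)")
SIGNAL_RE = re.compile(r"SCORE: Program signaled \((SIG[A-Z0-9]+)\)")


def summarize_run(output, expected):
    """Summarize one captured transcript of `scrun -nodes=<expected> hello`."""
    seen = {int(n) for n in HELLO_RE.findall(output)}
    signal = SIGNAL_RE.search(output)
    return {
        # Ranks in 0..expected-1 that never printed their greeting.
        "missing": sorted(set(range(expected)) - seen),
        # Number of lines carrying a SCORE:ERROR marker.
        "errors": output.count("SCORE:ERROR"),
        # Name of the reported signal, or None if the run was not signaled.
        "signal": signal.group(1) if signal else None,
    }


def tally(runs, expected):
    """Count outcomes ("ok", "incomplete" or a signal name) over many runs."""
    counts = Counter()
    for text in runs:
        summary = summarize_run(text, expected)
        if summary["signal"]:
            counts[summary["signal"]] += 1
        elif summary["missing"] or summary["errors"]:
            counts["incomplete"] += 1
        else:
            counts["ok"] += 1
    return counts
```

Applied to the first run above with expected=28, the "missing" list comes out as [26, 27]: only ranks 0 to 25 printed a greeting. Whether those two ranks are the two processes on host 13 depends on how ranks are assigned to hosts, which is an assumption here.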

On the other hand, pm tests run without any problem.

Any further ideas on where to look would be much appreciated.

Thanks, Jure

kameyama at pccluster.org wrote:
> In article <1052889899.3ec1d32baf153 at webmail.xenya.si> jure.jerman at rzs-hm.si wrote:
> 
>>I got the pmAttachContext error just once and I was not able to
>>reproduce it anymore.
>>
>>If I export PM_DEBUG=1, I just get
>>
>>[root at tuba0 jure]# export PM_DEBUG=1
>>[root at tuba0 jure]# scrun -nodes=28 ./hello
>>FEP: Unable to connect with SCore-D (tuba0)
>>FEP:WARNING checkpoint option is ignored in single-user mode.
>>SCore-D 5.4.0 connected.
>><13> SCORE: Program signaled (SIGILL).
> 
> 
> All scored binaries (/opt/score/deploy/bin.*/scored*.exe) must be the same.
> Is SCore 5.4 installed on your cluster?
> 
> To check the SCore version on all compute hosts,
> please issue:
>     % scout cat /opt/score/etc/version
> 
>                        from Kameyama Toyohisa
> _______________________________________________
> SCore-users mailing list
> SCore-users at pccluster.org
> http://www.pccluster.org/mailman/listinfo/score-users
> 
> 


-- 
--------------------------------------------------------------
Jure Jerman                       Email: jure.jerman at rzs-hm.si
Environmental Agency of Slovenia  tel:   xx 386 1 478 41 43
Meteorological office             fax:   xx 386 1 478 40 54
Vojkova 1b
SI-1001 Ljubljana
SLOVENIA
--------------------------------------------------------------



