[SCore-users-jp] Re: [SCore-users] Adding an additional computing node

Jure Jerman jure.jerman @ rzs-hm.si
Wed May 14 18:58:37 JST 2003


Hi,

I checked: the SCore version is 5.4.0, and the
binaries in /opt/score/deploy/* are the same
everywhere.
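
A matching version string does not by itself prove the binaries are
identical, so comparing checksums is a stricter check. A minimal sketch,
assuming the checksums have already been collected from every host into
lines of the form "host checksum path" (how they are collected is left
open here, and find_odd_binaries is a hypothetical helper name, not part
of SCore):

```shell
# Read "host checksum path" lines on stdin and print "host path" for every
# host whose checksum differs from the most common checksum for that path.
find_odd_binaries() {
    awk '
        {
            host[NR] = $1; sum[NR] = $2; path[NR] = $3
            count[$3 SUBSEP $2]++
        }
        END {
            # Pick the most frequent checksum per path (first seen wins ties).
            for (i = 1; i <= NR; i++) {
                k = path[i] SUBSEP sum[i]
                if (count[k] > best[path[i]]) {
                    best[path[i]] = count[k]
                    major[path[i]] = sum[i]
                }
            }
            # Report every host that deviates from that majority.
            for (i = 1; i <= NR; i++)
                if (sum[i] != major[path[i]])
                    print host[i], path[i]
        }'
}
```

An empty result means every host agrees on every listed file; any line
printed names a host and file worth inspecting by hand.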

The behaviour of node number 13 is getting stranger.
After rebooting SCore I was able to run the hello
program, but not every time, as can be seen from
the output below:

[root @ tuba0 root]# scrun -nodes=28 /home/jure/hello
SCore-D 5.4.0 connected (jid=7).
_pmEthernetAttachContext(9, 0x40017ebc): _pmEthernetMapContext(9, 0x40017ebc): 135088608
<13:0> SCORE:ERROR _pmEthernetAttachContext(9, 0x40017ebc): _pmEthernetMapContext(9, 0x40017ebc): 135088608
pmAttachContext(type=ethernet,fd=9)=135088608
<13:1> SCORE:ERROR pmAttachContext(type=ethernet,fd=9)=135088608
<0:0> SCORE: 28 nodes (14x2) ready.
hello, world (from node 12)
hello, world (from node 10)
hello, world (from node 14)
hello, world (from node 4)
hello, world (from node 23)
hello, world (from node 2)
hello, world (from node 8)
hello, world (from node 6)
hello, world (from node 15)
hello, world (from node 21)
hello, world (from node 22)
hello, world (from node 13)
hello, world (from node 19)
hello, world (from node 17)
hello, world (from node 18)
hello, world (from node 16)
hello, world (from node 20)
hello, world (from node 11)
hello, world (from node 5)
hello, world (from node 3)
hello, world (from node 9)
hello, world (from node 7)
hello, world (from node 1)
hello, world (from node 25)
hello, world (from node 24)
hello, world (from node 0)
[root @ tuba0 root]# scrun -nodes=28 /home/jure/hello
SCore-D 5.4.0 connected (jid=8).
<13> SCORE: Program signaled (SIGSEGV).


Note that the behaviour is quite random: hello sometimes even runs
without any problem, which is not the case with cpi, which fails
every time.
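
Since the failure is intermittent, a single run says little; counting
failures over many runs gives a number that can be compared before and
after a change. A minimal sketch, assuming a run counts as failed when
its output contains one of the markers seen in the logs above
("SCORE:ERROR" or "Program signaled"). count_failures is a hypothetical
helper name, and the exit status of scrun is deliberately not relied on:

```shell
# Run a command N times and count the runs whose combined output contains
# one of the failure markers seen in the logs above.
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Read the whole output so the command is never cut off early.
        if "$@" 2>&1 | grep -E 'SCORE:ERROR|Program signaled' >/dev/null; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed/$runs runs failed"
}
```

For example, "count_failures 20 scrun -nodes=28 /home/jure/hello" would
report how many of 20 runs showed one of those markers.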

On the other hand, the PM tests run without any problem.

Any additional ideas on where to look would be much appreciated.

Thanks, Jure

kameyama @ pccluster.org wrote:
> In article <1052889899.3ec1d32baf153 @ webmail.xenya.si> jure.jerman @ rzs-hm.si wrote:
> 
>>I got the pmAttachContext error just once and I was not able to
>>reproduce it anymore.
>>
>>If I export PM_DEBUG=1, I just get
>>
>>[root @ tuba0 jure]# export PM_DEBUG=1
>>[root @ tuba0 jure]# scrun -nodes=28 ./hello
>>FEP: Unable to connect with SCore-D (tuba0)
>>FEP:WARNING checkpoint option is ignored in single-user mode.
>>SCore-D 5.4.0 connected.
>><13> SCORE: Program signaled (SIGILL).
> 
> 
> All scored binaries (/opt/score/deploy/bin.*/scored*.exe) must be the same.
> Is SCore 5.4 installed on your cluster?
> 
> If you want to check the SCore version on all compute hosts,
> please issue:
>     % scout cat /opt/score/etc/version
> 
>                        from Kameyama Toyohisa
> _______________________________________________
> SCore-users mailing list
> SCore-users @ pccluster.org
> http://www.pccluster.org/mailman/listinfo/score-users
> 
> 


-- 
--------------------------------------------------------------
Jure Jerman                       Email: jure.jerman @ rzs-hm.si
Environmental Agency of Slovenia  tel:   xx 386 1 478 41 43
Meteorological office             fax:   xx 386 1 478 40 54
Vojkova 1b
SI-1001 Ljubljana
SLOVENIA
--------------------------------------------------------------