[SCore-users-jp] SCRUN: Unauthorized connection

vqm_mp vqm_mp @ yahoo.co.jp
Fri Sep 22 10:48:49 JST 2006


This is Suzuki.

I recompiled right away and ran hello.cc.

In both the Single-User and Multi-User environments, -nodes=1 and
-nodes=1x2, which had been working correctly until now, succeed
only about 1 time in 10; the other 9 times in 10 they fail with
the following error.

[root @ server test]# scrun -nodes=1 ./a.out
SCore-D 5.8.3 connected.
<0> SCore-D:ERROR Unable to recover checkpoint file (single host).
<0> SCORE-D:ERROR Killing user process due to the error.
SCORE: Program killed by user.

In the Multi-User environment, running hello.cc with -nodes=2x2
gives the same error as in my mailing-list post of
Thu, 21 Sep 2006 11:11:59.

[root @ server test]# scrun -scored=comp002,nodes=2x2 ./a.out
SCore-D 5.8.3 connected (jid=1,reconnect=33053).
SCRUN: Unauthorized connection (INET).

[root @ server test]# scored_dev
SYSLOG: /opt/score/deploy/scored_dev
SYSLOG: SCore-D 5.8.3 $Id: init.cc,v 1.74 2005/02/24 07:47:54 hori Exp $
SYSLOG: Compile option(s): DEVELOPMENT ULT_DO_TRACE SCORE_DO_TRACE
SYSLOG: SCore-D network: ethernet/ethernet
SYSLOG: Cluster[0]: (0..1)x2.i386-fedoracore3-linux2_6.penD.2800
SYSLOG:   Memory: 4055[MB], Swap: 1984[MB], Disk: 148135[MB]
SYSLOG:   Network[0]: ethernet/ethernet
SYSLOG: Scheduler initiated: Timeslice = 200 [msec]
SYSLOG:   Queue[0] activated, exclusive scheduling
SYSLOG:   Queue[1] activated, time-sharing scheduling
SYSLOG:   Queue[2] activated, time-sharing scheduling
SYSLOG: Session ID: 0
SYSLOG: Server Host: comp002.pccluster.org
SYSLOG: Backup Host: comp001.pccluster.org @ 0
SYSLOG: Operated by: root
SYSLOG: Recovery canceled by SCore-D: root @ root@server.pccluster.org:33037, JID: 1
SYSLOG: --------- SCore-D (5.8.3) bootup --------
SYSLOG: Login request: root @ server.pccluster.org:33053
SYSLOG: Login accepted: root @ server.pccluster.org:33053, JID: 1, Hosts: 4(2x2)@0, Priority: 1, Command: ./a.out
<1> ULT: Exception Signal (11)
 
<1> Attaching GDB: Exception signal
Using host libthread_db library "/lib/tls/libthread_db.so.1".
`shared object read from target memory' has disappeared; keeping its symbols.
`shared object read from target memory' has disappeared; keeping its symbols.
`shared object read from target memory' has disappeared; keeping its symbols.
0x0819cddd in wait ()
#0  0x0819cddd in wait ()
#1  0x080db5f8 in score_attach_debugger (
    message=0x44e2 <Address 0x44e2 out of bounds>, exno=11) at ../message.c:289
#2  0x080d58c1 in ult_exception (sig=11, code=51, sc=0x0,
    addr=0x7b <Address 0x7b out of bounds>) at ../mpcrt.c:124
#3  <signal handler called>
#4  0x0804a145 in get_job_netset (job_gp=
          {gval = {gp = {pe = 1, addr = {laddr = 0x8602010, naddr = 140517392, b32s = {d1 = 140517392, d2 = 0}, b8s = {d1 = 16 '\020', d2 = 32 ' ', d3 = 96 '`', d4 = 8 '\b', d5 = 0 '\0', d6 = 0 '\0', d7 = 0 '\0', d8 = 0 '\0'}}, size = 7308}}}, c=1762000896, netset_gp=
          {gval = {gp = {pe = 0, addr = {laddr = 0x85fcb40, naddr = 140495680, b32s = {d1 = 140495680, d2 = 0}, b8s = {d1 = 64 '@', d2 = 203 '?', d3 = 95 '_', d4 = 8 '\b', d5 = 0 '\0', d6 = 0 '\0', d7 = 0 '\0', d8 = 0 '\0'}}, size = 1}}})
    at ../cluster.cc:959
#5  0x08086628 in _sinvoker3<int, GlobalPtr<Job>, int, GlobalPtr<char> >::invoke () at mpcxx_mttl.h:1701
#6  0xb7df5dc4 in ?? ()
#7  0x0804a114 in reallocate_job () at ../cluster.cc:955
#8  0x080d8976 in ult_get_messages () at ../recv.c:106
#9  0x080d900d in ult_dequeue () at ../ultlib.c:45
#10 0xb7e96488 in ?? ()
#11 0x00000024 in ?? ()
#12 0xb7dd1dc4 in ?? ()
#13 0xb7df5ef8 in ?? ()
#14 0x080862ec in invoke<int, GlobalPtr<FEP>, GlobalPtr<char> > (retval=Cannot access memory at address 0xfffffbb0
)
    at mpcxx_mttl.h:3829
/opt/score/deploy/score.gdb:1: Error in sourced command file:
Previous frame inner to this frame (corrupt stack?)






