[SCore-users] Problem starting multi-user

Shinji Sumimoto s-sumi at flab.fujitsu.co.jp
Mon Apr 15 18:42:01 JST 2002


Hi.

Could you try running scored with the -reset option?  It seems some
checkpoint files are on node 4 (comp5).
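
For readers wondering why node 4 corresponds to comp5: a minimal sketch, assuming that SCore-D numbers nodes in the order of the active (uncommented) hosts and that the cluster consists of comp0 through comp5. With comp3 commented out, the later hosts shift down by one. The host list and the helper name node_from_error are illustrative assumptions, not part of SCore.

```python
import re

# Assumed host list: comp0..comp5, with comp3 commented out of scorehosts.db.
all_hosts = ["comp0", "comp1", "comp2", "comp3", "comp4", "comp5"]
disabled = {"comp3"}

# Assumption: node numbers follow the order of the remaining active hosts.
active_hosts = [h for h in all_hosts if h not in disabled]


def node_from_error(line):
    """Return the node number from a '<N> SCore-D:ERROR ...' line, or None."""
    m = re.match(r"<(\d+)>\s+SCore-D:ERROR", line)
    return int(m.group(1)) if m else None


error_line = "<4> SCore-D:ERROR Unable to continue session-0."
node = node_from_error(error_line)
print(node, active_hosts[node])  # prints "4 comp5"
```

Under these assumptions the error prefix <4> points at comp5, which matches the observation that the session starts once comp5 is also removed.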

Shinji.

From: Nick Birkett <nrcb at streamline-computing.com>
Subject: [SCore-users] Problem starting multi-user
Date: Mon, 15 Apr 2002 10:15:04 +0100
Message-ID: <0204151015040K.17042 at pecan.comlab.ox.ac.uk>

nrcb> Hi, we had a problem with one of our cluster nodes (comp3), so we
nrcb> commented it out of scorehosts.db.
nrcb> 
nrcb> I have restarted scoreboard and msgbserv.
nrcb> 
nrcb> Now when starting scored with sc_watch I get the following error: 
nrcb> 
nrcb> 
nrcb> -----------------------------------------------------------------------------
nrcb> SCOUT: Spawning done.
nrcb> 15/Apr/2002 10:03:42 SYSLOG: /opt/score/deploy/scored
nrcb> 15/Apr/2002 10:03:42 SYSLOG: SCore-D 4.2 $Id: init.cc,v 1.63 2001/09/07 09:10:26 hori Exp $
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Compile option(s): 
nrcb> 15/Apr/2002 10:03:42 SYSLOG: SCore-D network: myrinet2k/myrinet2k
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Cluster[0]: (0..4)x1.i386-redhat7-linux2_4.i686.500
nrcb> 15/Apr/2002 10:03:42 SYSLOG:   Memory: 880[MB], Swap: 2048[MB], Disk: 2023[MB]
nrcb> 15/Apr/2002 10:03:42 SYSLOG:   Network[0]: myrinet2k/myrinet2k
nrcb> 15/Apr/2002 10:03:42 SYSLOG:   Network[1]: ethernet/ethernet
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Scheduler initiated: Timeslice = 500 [msec]
nrcb> 15/Apr/2002 10:03:42 SYSLOG:   Queue[0] activated, exclusive scheduling
nrcb> 15/Apr/2002 10:03:42 SYSLOG:   Queue[1] activated, time-sharing scheduling
nrcb> 15/Apr/2002 10:03:42 SYSLOG:   Queue[2] activated, time-sharing scheduling
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Session ID: 0
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Server Host: comp5.ph.bham.ac.uk
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Backup Host: comp1.ph.bham.ac.uk
nrcb> <4> SCore-D:ERROR Unable to continue session-0.
nrcb> 
nrcb> ------------------------------------------------------------------------------
nrcb> 
nrcb> If I also comment comp5 out of scorehosts.db and try again it works (using 
nrcb> comp0,1,2,4).
nrcb> 
nrcb> I have done an rpmtest (ping and point to point) between comp5 and all other
nrcb> available hosts and this seems fine (10 us round trip).
nrcb> 
nrcb> I have encountered this problem before when removing hosts, but have not been
nrcb> able to find what the problem is.
nrcb> 
nrcb> Has anybody else seen this?
nrcb> 
nrcb> Regards,
nrcb> 
nrcb> Nick
nrcb> 
------
Shinji Sumimoto, Fujitsu Labs


