[SCore-users] Problem starting multi-user
Shinji Sumimoto
s-sumi at flab.fujitsu.co.jp
Mon Apr 15 18:42:01 JST 2002
Hi.
Could you try running scored with the -reset option? It seems some
checkpoint files are still on node 4 (comp5).
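
The "<4>" prefix on the error line is what points at node 4. A minimal
sketch of that lookup follows, assuming node numbers follow the order of
the hosts still active in scorehosts.db (comp3 commented out). The names
ACTIVE_HOSTS and failing_hosts are hypothetical, for illustration only,
and are not part of SCore.

```python
import re

# Hosts still listed in scorehosts.db after comp3 was commented out.
# Assumption: nodes are numbered from 0 in this order.
ACTIVE_HOSTS = ["comp0", "comp1", "comp2", "comp4", "comp5"]

# Matches lines such as "<4> SCore-D:ERROR Unable to continue session-0."
ERROR_LINE = re.compile(r"<(\d+)>\s+SCore-D:ERROR\s+(.*)")


def failing_hosts(log_text, hosts=ACTIVE_HOSTS):
    """Return (node_number, host_name, message) for each SCore-D error line."""
    results = []
    for line in log_text.splitlines():
        match = ERROR_LINE.search(line)
        if match:
            node = int(match.group(1))
            # Leave the host as None if the node number is out of range.
            host = hosts[node] if node < len(hosts) else None
            results.append((node, host, match.group(2).strip()))
    return results


if __name__ == "__main__":
    sample = "<4> SCore-D:ERROR Unable to continue session-0."
    print(failing_hosts(sample))
```

Run against the error line from the log, this prints
[(4, 'comp5', 'Unable to continue session-0.')].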
Shinji.
From: Nick Birkett <nrcb at streamline-computing.com>
Subject: [SCore-users] Problem starting multi-user
Date: Mon, 15 Apr 2002 10:15:04 +0100
Message-ID: <0204151015040K.17042 at pecan.comlab.ox.ac.uk>
nrcb> Hi, we had a problem with one of our cluster nodes (comp3), so I commented it out
nrcb> of scorehosts.db.
nrcb>
nrcb> I have restarted scoreboard and msgbserv.
nrcb>
nrcb> Now when starting scored with sc_watch I get the following error:
nrcb>
nrcb>
nrcb> -----------------------------------------------------------------------------
nrcb> SCOUT: Spawning done.
nrcb> 15/Apr/2002 10:03:42 SYSLOG: /opt/score/deploy/scored
nrcb> 15/Apr/2002 10:03:42 SYSLOG: SCore-D 4.2 $Id: init.cc,v 1.63 2001/09/07 09:10:26 hori Exp $
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Compile option(s):
nrcb> 15/Apr/2002 10:03:42 SYSLOG: SCore-D network: myrinet2k/myrinet2k
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Cluster[0]: (0..4)x1.i386-redhat7-linux2_4.i686.500
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Memory: 880[MB], Swap: 2048[MB], Disk: 2023[MB]
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Network[0]: myrinet2k/myrinet2k
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Network[1]: ethernet/ethernet
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Scheduler initiated: Timeslice = 500 [msec]
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Queue[0] activated, exclusive scheduling
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Queue[1] activated, time-sharing scheduling
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Queue[2] activated, time-sharing scheduling
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Session ID: 0
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Server Host: comp5.ph.bham.ac.uk
nrcb> 15/Apr/2002 10:03:42 SYSLOG: Backup Host: comp1.ph.bham.ac.uk
nrcb> <4> SCore-D:ERROR Unable to continue session-0.
nrcb>
nrcb> ------------------------------------------------------------------------------
nrcb>
nrcb> If I also comment comp5 out of scorehosts.db and try again it works (using
nrcb> comp0,1,2,4).
nrcb>
nrcb> I have done an rpmtest (ping and point to point) between comp5 and all other
nrcb> available hosts and this seems fine (10 us round trip).
nrcb>
nrcb> I have encountered this problem before when removing hosts, but have not been
nrcb> able to find what the problem is.
nrcb>
nrcb> Has anybody else seen this?
nrcb>
nrcb> Regards,
nrcb>
nrcb> Nick
------
Shinji Sumimoto, Fujitsu Labs