From hori @ swimmy-soft.com Fri Aug 1 10:16:01 2003
From: hori @ swimmy-soft.com (Atsushi HORI)
Date: Fri, 1 Aug 2003 10:16:01 +0900
Subject: [SCore-users-jp] Re: [SCore-users] Re: SCore-users digest, Vol 1 #256 - 4 msgs
In-Reply-To: References: <20030731030001.24940.65938.Mailman@www.pccluster.org>
Message-ID: <3142577761.hori0000@swimmy-soft.com>

Hi,

>Thank you for your reply. I have talked to our key MPI-IO user and he
>insists that his code was running perfectly without errors with the
>previous version of SCore/MPICH/Redhat 7.2. And the MPI-IO users are
>mathematicians, so they're pretty handy with numbers ;-)

Here I would like to clarify the facts.

FACT: The mathematicians' program was working well before Streamline upgraded SCore.

Obviously the problem comes from the "upgrade." Assuming that the other programs are running well on the upgraded system, I believe we can focus on the MPICH/SCore part, especially MPI-IO. The only difference between the original MPICH and MPICH/SCore is the part related to communication, and we did not change (or even touch) the MPI-IO part at all.

I am not sure which version you were using, but the MPICH/SCore in SCore 5.4 is based on MPICH 1.2.4, and the one in SCore 5.2 is based on MPICH 1.2.0. So, if you were using SCore 5.0 or older, then I think the problem comes from MPICH 1.2.4, not MPICH 1.2.0.

>Assuming that they know their code is giving correct results, is there
>any way they can bypass these error messages that they are getting now?
>If they can do that, then they can verify the results for themselves.
>If it's not possible to do that, is there any other way around this issue
>that you can think of? (which allows them to use a shared filesystem)

Fortunately, MPICH 1.2.0 is still supported by SCore 5.4, and you (or Streamline) can find it in the SCore 5.4 distribution. You can then test whether the problem still appears with MPICH 1.2.0 or not.

----
Atsushi HORI
Swimmy Software, Inc.
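The test Hori suggests can be sketched as shell steps. This is a sketch only: the MPICH 1.2.0 install location shown below is an assumption (site installs differ), and only the `scrun` invocation mirrors commands used elsewhere in this thread.

```shell
# Sketch only: the MPICH 1.2.0 location under the SCore prefix is an
# assumed, site-specific path -- locate the actual mpich-1.2.0 tree in
# your SCore 5.4 installation first.
MPICH_120_BIN=/opt/score/mpi/mpich-1.2.0/i386-redhat7-linux2_4/bin   # assumption
$MPICH_120_BIN/mpicc -o mpiio_test mpiio_test.c   # rebuild against MPICH 1.2.0
scrun -nodes=4 ./mpiio_test                       # rerun the failing MPI-IO case
```

If the MPI-IO errors disappear under MPICH 1.2.0, that narrows the regression to the MPICH 1.2.4 base rather than the SCore communication layer.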
_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From h035102m @ mbox.nagoya-u.ac.jp Fri Aug 1 18:12:53 2003
From: h035102m @ mbox.nagoya-u.ac.jp (Naoshi Ueda)
Date: Fri, 1 Aug 2003 18:12:53 +0900
Subject: [SCore-users-jp] About the error
In-Reply-To: <20030730.173459.304102105.s-sumi@flab.fujitsu.co.jp>
References: <200307291935.GAI82364.2360NI40@mbox.nagoya-u.ac.jp> <20030729.210613.596527866.s-sumi@flab.fujitsu.co.jp> <200307301732.AAI60452.3400N26I@mbox.nagoya-u.ac.jp> <20030730.173459.304102105.s-sumi@flab.fujitsu.co.jp>
Message-ID: <200308011812.ICB65115.6320IN40@mbox.nagoya-u.ac.jp>

This is Ueda from Nagoya University.

Thank you for your reply. I have a question about the suggestion below.

>Sorry for the trouble, but could you try it with the SMP kernel?

How exactly do I try the SMP kernel? For example, does that mean changing how the nodes boot? This may be a basic question, but I would appreciate your advice.

===============================================
Graduate School of Engineering, Nagoya University
Master's program
Naoshi Ueda
E-MAIL: h035102m @ mbox.nagoya-u.ac.jp
===============================================

From s-sumi @ flab.fujitsu.co.jp Fri Aug 1 21:52:25 2003
From: s-sumi @ flab.fujitsu.co.jp (Shinji Sumimoto)
Date: Fri, 01 Aug 2003 21:52:25 +0900 (JST)
Subject: [SCore-users-jp] About the error
In-Reply-To: <200308011812.ICB65115.6320IN40@mbox.nagoya-u.ac.jp>
References: <200307301732.AAI60452.3400N26I@mbox.nagoya-u.ac.jp> <20030730.173459.304102105.s-sumi@flab.fujitsu.co.jp> <200308011812.ICB65115.6320IN40@mbox.nagoya-u.ac.jp>
Message-ID: <20030801.215225.653247641.s-sumi@flab.fujitsu.co.jp>

Dear Ueda-san,

This is Sumimoto from Fujitsu Labs.

From: Naoshi Ueda
Subject: Re: [SCore-users-jp] About the error
Date: Fri, 1 Aug 2003 18:12:53 +0900
Message-ID: <200308011812.ICB65115.6320IN40 @ mbox.nagoya-u.ac.jp>

h035102m> This is Ueda from Nagoya University.
h035102m>
h035102m> Thank you for your reply. I have a question about the suggestion below.
h035102m>
h035102m> >Sorry for the trouble, but could you try it with the SMP kernel?
h035102m>
h035102m> How exactly do I try the SMP kernel?
h035102m> For example, does that mean changing how the nodes boot?
h035102m> This may be a basic question, but I would appreciate your advice.

You change which kernel is booted. Install on each node either the kernel contained in kernel-smp-2.4.19-1SCORE.i686.rpm on the SCore CD-ROM, or the one from

http://www.pccluster.org/score/dist/pub/score-5.4.0/rpm.redhat7.3.i386/kernel-smp-2.4.19-1SCORE.i686.rpm

and then update the boot configuration so that this kernel is the one that gets booted.

h035102m> ===============================================
h035102m> Graduate School of Engineering, Nagoya University
h035102m> Master's program
h035102m> Naoshi Ueda
h035102m> E-MAIL: h035102m @ mbox.nagoya-u.ac.jp
h035102m> ===============================================
h035102m> _______________________________________________
h035102m> SCore-users-jp mailing list
h035102m> SCore-users-jp @ pccluster.org
h035102m> http://www.pccluster.org/mailman/listinfo/score-users-jp

------
Shinji Sumimoto, Fujitsu Labs

From m-kawaguchi @ pst.fujitsu.com Sat Aug 2 17:00:10 2003
From: m-kawaguchi @ pst.fujitsu.com (Mitsugu Kawaguchi)
Date: Sat, 2 Aug 2003 17:00:10 +0900
Subject: [SCore-users-jp] SCore in an LDAP environment
Message-ID: <000e01c358cc$18a53a00$570aa8c0@Globus>

Dear SCore team,

This is Kawaguchi from Fujitsu Prime Software Technologies. Thank you as always for your help.

When we try to run SCore in an LDAP environment, scrun fails with a "Segmentation Fault". Only the user information (passwd, shadow, group) is managed by LDAP; other information such as host names comes from files or NIS. Jobs run normally for users registered in NIS.

We do not believe SCore is LDAP-aware in any way, so the problem may lie in our LDAP setup, but: is there any track record of SCore running in an LDAP environment? And can you think of any possible reason the jobs fail to run?
- LDAP versions: openldap-2.0.11-13, openldap-clients-2.0.11-13, openldap-servers-2.0.11-13, nss_ldap-172-2
- rsh-all and scout work normally for LDAP-registered users, but scrun fails:

  $ scrun ./a.out
  Segmentation fault

Thank you in advance.
---
Kawaguchi
mail => m-kawaguchi @ pst.fujitsu.com

From kate @ pfu.fujitsu.com Sun Aug 3 12:42:31 2003
From: kate @ pfu.fujitsu.com (KATAYAMA Yoshio)
Date: Sun, 03 Aug 2003 12:42:31 +0900 (JST)
Subject: [SCore-users-jp] PM/UDP does not work
Message-ID: <20030803.124231.59484660.kate@pfu.fujitsu.com>

This is Katayama from PFU. Thank you for your support.

When I start a SCore program over PM/UDP, it hangs right after printing

  SCore-D 5.4.0 connected.

rpmtest and scstest both work fine.

The attached file is the log from running

  scrun -network=udp /opt/score/demo/bin/mandel

with PM_DEBUG set to 8.

According to the log, it seems to be trying to open a socket at /tmp/pmagent0/device, but there is no sign that /tmp/pmagent0 was ever created on either the server host or the compute hosts.
# The mtime of /tmp does not change.
Instead, the mtime of the /tmp/pm directory on the compute nodes is updated.

What could be the cause, or what should I look into?

SCore 5.4.0 was installed from source on RedHat 8.0.
--
PFU Ltd., 2nd System Division, Linux System Dept.
KATAYAMA Yoshio  Tel 044-520-6617  Fax 044-556-1022

---- attached file: LOG ----
ethernet_open_device(): -config /var/scored/scoreboard/clust.0000B3002Jkt
pmEthernetOpenDevice: Library version $Id: pm_ethernet.c,v 1.64 2002/03/04 09:44:42 s-sumi Exp $
pmEthernetReadConfig(0x85a8cc8, unit, 0): set unit number "0" (MAX: 4).
pmEthernetReadConfig(0x85a8cc8, maxnsend, 24): set maxnsend "24".
pmEthernetReadConfig(0x85a8cc8, backoff, 2400): set backoff "2400" usec.
pmEthernetReadConfig(0x85a8cc8, checksum, 0): set checksum "0" off.
pmEthernetGetNodeByNumber(0x85a8cc8, 0, 0xbffff1f8): not found pmEthernetGetNodeByNumber(0x85a8cc8, 0, 0xbffff1f4): not found store host 00:02:B3:B2:EE:46: dev{0] mac 0x00000002b3b2ee46 pmEthernetGetNodeByNumber(0x85a8cc8, 1, 0xbffff1f8): not found pmEthernetGetNodeByNumber(0x85a8cc8, 1, 0xbffff1f4): not found store host 00:02:B3:B2:EE:FC: dev{0] mac 0x00000002b3b2eefc pmEthernetOpenDevice("/var/scored/scoreboard/clust.0000B3002Jkt", 0xbffff6b4): pmEthernetMapEthernet(0, 0xbffff508): 0 Ethernet(0): fd=512 self comp3.beri.or.jp n 1 of 2 nodes ethernet_open_device(): -config /var/scored/scoreboard/clust.0000B3002Jkt pmEthernetOpenDevice: Library version $Id: pm_ethernet.c,v 1.64 2002/03/04 09:44:42 s-sumi Exp $ pmEthernetReadConfig(0x85a8cc8, unit, 0): set unit number "0" (MAX: 4). pmEthernetReadConfig(0x85a8cc8, maxnsend, 24): set maxnsend "24". pmEthernetReadConfig(0x85a8cc8, backoff, 2400): set backoff "2400" usec. pmEthernetReadConfig(0x85a8cc8, checksum, 0): set checksum "0" off. pmEthernetGetNodeByNumber(0x85a8cc8, 0, 0xbffff1f8): not found pmEthernetGetNodeByNumber(0x85a8cc8, 0, 0xbffff1f4): not found store host 00:02:B3:B2:EE:46: dev{0] mac 0x00000002b3b2ee46 pmEthernetGetNodeByNumber(0x85a8cc8, 1, 0xbffff1f8): not found pmEthernetGetNodeByNumber(0x85a8cc8, 1, 0xbffff1f4): not found store host 00:02:B3:B2:EE:FC: dev{0] mac 0x00000002b3b2eefc pmEthernetOpenDevice("/var/scored/scoreboard/clust.0000B3002Jkt", 0xbffff6b4): pmEthernetMapEthernet(0, 0xbffff508): 0 Ethernet(0): fd=512 self comp2.beri.or.jp n 0 of 2 nodes pm_ethernetCalibrateTimer(): loop t:3.792354e+07, vt: 1.896300e-02 pm_ethernetCalibrateTimer(): loop t:3.796258e+07, vt: 1.898400e-02 pm_ethernetCalibrateTimer(): loop t:3.994558e+07, vt: 1.997400e-02 pm_ethernetCalibrateTimer(): end loop t:3.994558e+07, vt: 1.997400e-02 pm_ethernetCalibrateTimer(): d0:1.999870e+09, d1:1.999879e+09 pm_ethernetCalibrateTimer(): clk:1999, clock 1.999875e+03 pmEthernetOpenDevice: Driver version $Id: pm_ethernet_dev.c,v 
1.4 2003/02/03 03:11:42 kameyama Exp $ ethernet_open_device(): success pmAgentConnectSocket(/tmp/pmagent0/device): connect: No such file or directory pmAgentInvoke(0): invoking: /opt/score5.4.0/deploy/pmaudp -s 7 -u 0 pm_ethernetCalibrateTimer(): loop t:3.993891e+07, vt: 1.998000e-02 pm_ethernetCalibrateTimer(): end loop t:3.993891e+07, vt: 1.998000e-02 pm_ethernetCalibrateTimer(): d0:1.999714e+09, d1:1.998944e+09 pm_ethernetCalibrateTimer(): clk:1999, clock 1.999329e+03 pmEthernetOpenDevice: Driver version $Id: pm_ethernet_dev.c,v 1.4 2003/02/03 03:11:42 kameyama Exp $ ethernet_open_device(): success pmAgentConnectSocket(/tmp/pmagent0/device): connect: No such file or directory pmAgentInvoke(0): invoking: /opt/score5.4.0/deploy/pmaudp -s 7 -u 0 [0](0) PM/Ethernet CTX map Ctx(0x85aa008): send=0x4003a000, recv=0x4005a000, shared=0x40000000 success [1](0)pmEthernetRegisterProc(): proc 2232(2232), tid 2232 [1](0) pmEthernetAssociateNodes(0x85aa008, 0x823a3e0, 2):ndev=1 [1](0) pmEthernetBindChannel(0x85aa008, 0, 15): called [0](0) PM/Ethernet CTX map Ctx(0x85aa008): send=0x4003a000, recv=0x4005a000, shared=0x40000000 success [0](0)pmEthernetRegisterProc(): proc 2126(2126), tid 2126 [0](0) pmEthernetAssociateNodes(0x85aa008, 0x823a3e0, 2):ndev=1 [0](0) pmEthernetBindChannel(0x85aa008, 0, 15): called SCore-D 5.4.0 connected. [1](0) pmEthernetCloseContext(0x85aa008): called [1](0) _pmEthernetUnmapContext(0x85aa008): called [1] pmEthernetCloseDevice(0x85a8f28): called [0](0) pmEthernetCloseContext(0x85aa008): called [0](0) _pmEthernetUnmapContext(0x85aa008): called [0] pmEthernetCloseDevice(0x85a8f28): called ----Next_Part(Sun_Aug__3_12:42:31_2003_968)---- From hchen @ Mdl.ipc.pku.edu.cn Mon Aug 4 02:31:39 2003 From: hchen @ Mdl.ipc.pku.edu.cn (Chen Hao) Date: Mon, 4 Aug 2003 01:31:39 +0800 Subject: [SCore-users-jp] [SCore-users] pbs_mom error Message-ID: <001901c359e5$1f05ea10$9101a8c0@lazy> Dear All: I install score 5.4 on my rh8 liux boxes. 
But when I use sc_qsub to submit a task, the pbs_mom daemon reports:

08/04/2003 00:41:24;0080; pbs_mom;Fil;sys_copy;command: /bin/cp -r /var/scored/pbs/spool/22.dogman.ER /datadisk/people/hchen/benchmark/MD/charmm/mbco/charmm.e22 status=12, try=4
08/04/2003 00:41:45;0004; pbs_mom;Fil;22.dogman.ER;Unable to copy file 22.dogman.ER to dogman:/datadisk/people/hchen/benchmark/MD/charmm/mbco/charmm.e22
08/04/2003 00:41:45;0001; pbs_mom;Svr;pbs_mom;Permission denied (13) in req_cpyfile, Unable to rename /var/scored/pbs/spool/22.dogman.ER to /var/scored/pbs/undelivered/22.dogman.ER

What's the matter?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From M.Newiger @ deltacomputer.de Mon Aug 4 09:29:19 2003
From: M.Newiger @ deltacomputer.de (Martin Newiger)
Date: Mon, 4 Aug 2003 02:29:19 +0200
Subject: [SCore-users-jp] [SCore-users] ext2-Filesystem on compute nodes
Message-ID:

Hi,

is there any reason why the compute nodes get an ext2 filesystem when they are installed by EIT? Is there any argument against using ext3 by default in the next SCore version?

Regards
Martin Newiger

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From kameyama @ pccluster.org Mon Aug 4 10:18:30 2003
From: kameyama @ pccluster.org (kameyama @ pccluster.org)
Date: Mon, 04 Aug 2003 10:18:30 +0900
Subject: [SCore-users-jp] PM/UDP does not work
In-Reply-To: Your message of "Sun, 03 Aug 2003 12:42:31 JST." <20030803.124231.59484660.kate@pfu.fujitsu.com>
Message-ID: <20030804011330.8392212894C@neal.il.is.s.u-tokyo.ac.jp>

This is Kameyama.

In article <20030803.124231.59484660.kate @ pfu.fujitsu.com> KATAYAMA Yoshio writes:
> When I start a SCore program over PM/UDP,

PM/agent/UDP has not been supported since SCore 4.1...

> SCore-D 5.4.0 connected.
>
> and it hangs right after that is printed. rpmtest and scstest both
> work fine.

It is a bit puzzling that rpmtest works and yet

> According to the log, it seems to be trying to open a socket at
> /tmp/pmagent0/device, but there is no sign that /tmp/pmagent0 was
> ever created on either the server host or the compute hosts.

This directory does not actually appear to be used. It can be used depending on the pmaudp options, but when pmaudp is started as

pmAgentInvoke(0): invoking: /opt/score5.4.0/deploy/pmaudp -s 7 -u 0

the program and the agent communicate over a pipe (file descriptor 7 on the agent side).

> Instead, the mtime of the /tmp/pm directory on the compute nodes is
> updated.

That one is used by PM/Ethernet.

The hang may have a different cause. Does it work if you leave out PM/Agent/UDP?

from Kameyama Toyohisa

From kameyama @ pccluster.org Mon Aug 4 11:27:10 2003
From: kameyama @ pccluster.org (kameyama @ pccluster.org)
Date: Mon, 04 Aug 2003 11:27:10 +0900
Subject: [SCore-users-jp] Re: [SCore-users] pbs_mom error
In-Reply-To: Your message of "Mon, 04 Aug 2003 01:31:39 JST." <001901c359e5$1f05ea10$9101a8c0@lazy>
Message-ID: <20030804022210.57E0C12894C@neal.il.is.s.u-tokyo.ac.jp>

In article <001901c359e5$1f05ea10$9101a8c0 @ lazy> "Chen Hao" writes:
> I install score 5.4 on my rh8 liux boxes. But when I use sc_qsub to submit task,
> pbs_mom daemon report:
> 08/04/2003 00:41:24;0080; pbs_mom;Fil;sys_copy;command: /bin/cp -r /var/scored
> /pbs/spool/22.dogman.ER /datadisk/people/hchen/benchmark/MD/charmm/mbco/charmm.e22 status=12, try=4

status 12 says "out of memory". Please check the memory usage on the host.
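A generic way to check that (a sketch, not an SCore-specific tool) is to read the kernel's memory counters on the compute host that pbs_mom reported the failure from:

```shell
# Show total and free memory/swap on this host, in kilobytes.
# Generic Linux: the fields come straight from /proc/meminfo.
grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree):' /proc/meminfo
```

If MemFree and SwapFree are both near zero while the job's output files are being copied back, the status=12 from /bin/cp is consistent with the host simply running out of memory.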
from Kameyama Toyohisa

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From kate @ pfu.fujitsu.com Mon Aug 4 11:48:39 2003
From: kate @ pfu.fujitsu.com (KATAYAMA Yoshio)
Date: Mon, 04 Aug 2003 11:48:39 +0900 (JST)
Subject: [SCore-users-jp] PM/UDP does not work
In-Reply-To: <20030804011330.8392212894C@neal.il.is.s.u-tokyo.ac.jp>
References: <20030803.124231.59484660.kate@pfu.fujitsu.com> <20030804011330.8392212894C@neal.il.is.s.u-tokyo.ac.jp>
Message-ID: <20030804.114839.126580087.kate@pfu.fujitsu.com>

This is Katayama from PFU. Thank you for your answer.

From: kameyama @ pccluster.org
Subject: Re: [SCore-users-jp] PM/UDP does not work
Date: Mon, 04 Aug 2003 10:18:30 +0900

> > When I start a SCore program over PM/UDP,
>
> PM/agent/UDP has not been supported since SCore 4.1...

Until now I had only ever installed with bininstall, which did not install pmaudp, so I had never tried PM/agent/UDP. This time I installed from source for the first time, and since pmaudp was installed, I gave PM/agent/UDP a try.

Does "not supported" mean that it is acceptable for SCore (or rather, that nothing can be done) if PM/agent/UDP does not work?
# It is not that we actually need PM/agent/UDP...

> > SCore-D 5.4.0 connected.
> >
> > and it hangs right after that is printed. rpmtest and scstest both
> > work fine.
>
> It is a bit puzzling that rpmtest works and yet
>
> > According to the log, it seems to be trying to open a socket at
> > /tmp/pmagent0/device, but there is no sign that /tmp/pmagent0 was
> > ever created on either the server host or the compute hosts.
>
> This directory does not actually appear to be used.

I ran scstest with PM_DEBUG=8; it printed the same messages and then started working, so the error message about opening /tmp/pmagent0/device can apparently be ignored.

When I made udp the only network defined in scorehosts.db, the "SCore-D 5.4.0 connected." message stopped appearing as well. In that state, with PM_DEBUG set to 3 or higher, aborting scrun with Ctrl-C leaves the compute hosts locked, and I have to exit scout to release them. With PM_DEBUG at 2 or lower, aborting scrun with Ctrl-C does release the compute hosts, but no PM_DEBUG messages are printed either.

> The hang may have a different cause.
> Does it work if you leave out PM/Agent/UDP?
With ethernet, it works fine.

However, in multi-user mode, starting sc_console causes a SIGSEGV on the compute node acting as the server, and sc_console commands are not accepted. I am still looking into this problem and will report back once I have investigated a little more.
--
PFU Ltd., 2nd System Division, Linux System Dept.
KATAYAMA Yoshio  Tel 044-520-6617  Fax 044-556-1022

From kameyama @ pccluster.org Mon Aug 4 12:31:29 2003
From: kameyama @ pccluster.org (kameyama @ pccluster.org)
Date: Mon, 04 Aug 2003 12:31:29 +0900
Subject: [SCore-users-jp] PM/UDP does not work
In-Reply-To: Your message of "Mon, 04 Aug 2003 11:48:39 JST." <20030804.114839.126580087.kate@pfu.fujitsu.com>
Message-ID: <20030804032629.1A2D3128950@neal.il.is.s.u-tokyo.ac.jp>

This is Kameyama.

In article <20030804.114839.126580087.kate @ pfu.fujitsu.com> KATAYAMA Yoshio writes:
> From: kameyama @ pccluster.org
> Subject: Re: [SCore-users-jp] PM/UDP does not work
> Date: Mon, 04 Aug 2003 10:18:30 +0900
>
> > > When I start a SCore program over PM/UDP,
> >
> > PM/agent/UDP has not been supported since SCore 4.1...
>
> Until now I had only ever installed with bininstall, which did not
> install pmaudp, so I had never tried PM/agent/UDP. This time I
> installed from source for the first time, and since pmaudp was
> installed, I gave PM/agent/UDP a try.

pmaudp is needed only on the compute hosts, so when you install from rpm, it is not installed on the server.

It should, however, at least be installed on the compute hosts...

> Does "not supported" mean that it is acceptable for SCore (or rather,
> that nothing can be done) if PM/agent/UDP does not work?

Yes.

from Kameyama Toyohisa

From kate @ pfu.fujitsu.com Mon Aug 4 18:43:14 2003
From: kate @ pfu.fujitsu.com (KATAYAMA Yoshio)
Date: Mon, 04 Aug 2003 18:43:14 +0900 (JST)
Subject: [SCore-users-jp] PM/UDP does not work
In-Reply-To: <20030804032629.1A2D3128950@neal.il.is.s.u-tokyo.ac.jp>
References: <20030804.114839.126580087.kate@pfu.fujitsu.com> <20030804032629.1A2D3128950@neal.il.is.s.u-tokyo.ac.jp>
Message-ID: <20030804.184314.108750701.kate@pfu.fujitsu.com>

This is Katayama from PFU. Thank you as always.

From: kameyama @ pccluster.org
Subject: Re: [SCore-users-jp] PM/UDP does not work
Date: Mon, 04 Aug 2003 12:31:29 +0900

> pmaudp is needed only on the compute hosts, so when you install from
> rpm, it is not installed on the server.
>
> It should, however, at least be installed on the compute hosts...

I checked a system installed from RPM, and pmaudp was indeed installed on the compute hosts. It seems I was simply mistaken. Sorry for the noise.
--
PFU Ltd., 2nd System Division, Linux System Dept.
KATAYAMA Yoshio  Tel 044-520-6617  Fax 044-556-1022

From nick @ streamline-computing.com Thu Aug 7 15:25:26 2003
From: nick @ streamline-computing.com (Nick Birkett)
Date: Thu, 7 Aug 2003 07:25:26 +0100
Subject: [SCore-users-jp] Re: [SCore-users] MPI-IO
In-Reply-To: <200307300828.26514.nick@streamline-computing.com>
References: <200307300828.26514.nick@streamline-computing.com>
Message-ID: <200308070725.27070.nick@streamline-computing.com>

On Wednesday 30 July 2003 08:28, Nick Birkett wrote:
> Hi Score team. We seem to be having an MPI-IO problem with Score 5.4 x86
> RedHat 7.3.
> The users' files are NFS mounted from the front end (RedHat 7.3 x86)
>
> ---------- Forwarded Message ----------
>
> Here is an issue that one of our users discovered, which has only started
> occurring with the new version of SCore.
>
> <0:0> SCORE: 32 nodes (16x2) ready.
> File locking failed in ADIOI_Set_lock. If the file system is NFS, you

I think the ROMIO package needs to have the nfslock service (rpc.statd) running on each NFS client and on the NFS server (e.g. the front end and each compute node) when user directories are NFS mounted. As far as I can see, RedHat 7.2 or later already uses NFS version 3 by default.

nfslock is contained in the nfs-utils rpm package, so nfs-utils needs to be added and the nfslock service started. nfs-utils was included in the SCore 5.0 release but not in the SCore 5.4 release.

Can nfs-utils be added to later SCore releases please? Thanks.
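The fix described above can be sketched as follows. This is a sketch only: it assumes Red Hat 7.3-era SysV service tooling, and the RPM path is a placeholder for wherever the nfs-utils package actually lives; the commands would need to be run on the front end and (e.g. via scout) on every compute node.

```shell
# Install nfs-utils, which provides rpc.statd, then enable NFS file
# locking so ADIOI_Set_lock can take fcntl locks over NFS.
# The RPM path below is an assumption -- adjust to your media/mirror.
rpm -ivh /mnt/cdrom/RedHat/RPMS/nfs-utils-*.rpm
chkconfig nfslock on        # start the lock service at boot
service nfslock start       # start it now
```

With rpc.statd running on both the NFS server and every client, the ROMIO "File locking failed in ADIOI_Set_lock" error should no longer be triggered by missing lock support.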
Nick

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From nick @ streamline-computing.com Thu Aug 7 15:39:13 2003
From: nick @ streamline-computing.com (Nick Birkett)
Date: Thu, 7 Aug 2003 07:39:13 +0100
Subject: [SCore-users-jp] [SCore-users] EIT partitions
In-Reply-To: <200307300828.26514.nick@streamline-computing.com>
References: <200307300828.26514.nick@streamline-computing.com>
Message-ID: <200308070739.13043.nick@streamline-computing.com>

Dear Score developers,

We have some suggestions to improve EIT.

Currently the EIT-generated kickstart files do not specify exact disk partitions (e.g. hda3, sdb2, etc). The problem is that the compute node /etc/fstab files are not identical even when the hardware is the same. Usually this means that the swap, /tmp and /var partitions end up on different partitions on different compute nodes, which makes global changes (e.g. adding extra mount points or mount options) difficult.

Also, some of our users require 2 hard disks in each compute node, with the second disk set up in a special way (e.g. striped or mirrored with the first disk). Currently the EIT kickstart files will add system partitions to all available hard disks (e.g. / on disk 1, /var on disk 2, etc). It would be nice to be able to specify that system partitions go only on, say, the first disk. I think RedHat kickstart has an option to specify disks; I don't know whether this is easy to add to the EIT Anaconda script?
Thanks,

Nick

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From hori @ swimmy-soft.com Thu Aug 7 16:03:45 2003
From: hori @ swimmy-soft.com (Atsushi HORI)
Date: Thu, 7 Aug 2003 16:03:45 +0900
Subject: [SCore-users-jp] Re: [SCore-users] MPI-IO
In-Reply-To: <200308070725.27070.nick@streamline-computing.com>
References: <200307300828.26514.nick@streamline-computing.com>
Message-ID: <3143117025.hori000f@swimmy-soft.com>

Hi,

>nfslock is contained in the nfs-utils rpm package. nfs-utils needs
>to be added and the nfslock service started.

Mount the RedHat CD-ROM and copy the RPM file into some NFS region. Then install the package in a SCOUT environment so that the package is installed on every host. I believe this procedure is not so hard to do.

----
Atsushi HORI
Swimmy Software, Inc.

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From nick @ streamline-computing.com Tue Aug 12 09:30:24 2003
From: nick @ streamline-computing.com (Nick Birkett)
Date: Tue, 12 Aug 2003 01:30:24 +0100
Subject: [SCore-users-jp] [SCore-users] pm gigabit
Message-ID: <200308120130.24698.nick@streamline-computing.com>

We have some new motherboards with the 82546EB Gigabit Ethernet Controller. This required adding the Intel 5.1.11 e1000 driver to the SCore 2.4.19-1SCORE kernel.

The system has some pm problems. With pm-ethernet (gigabit) configured as

unit 0
maxnsend 16
backoff 2048
checksum 1

scstest gives pm timeouts. When running an application we get:

<6> SCore-D:WARNING PM gigaethernet/ethernet device already opened.
<6> SCore-D:ERROR No PM device opened.

Do I just need to increase the backoff value, or is there a driver/hardware problem? Our previous motherboards with the same chip (an older revision) work fine. I can supply the exact hardware specification if needed.
Regards,

Nick

--
Dr Nick Birkett
Technical Director
Streamline Computing Ltd
The Innovation Centre
Warwick Technology Park
Gallows Hill
Warwick CV34 6UW
Tel : +44 (0)1926 623130
Fax : +44 (0)1926 623140
Mobile : +44 (0)7890 246662
Email : nrcb @ streamline-computing.com
Support : support @ streamline-computing.com
Web : http://www.streamline-computing.com

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From kameyama @ pccluster.org Tue Aug 12 10:24:38 2003
From: kameyama @ pccluster.org (kameyama @ pccluster.org)
Date: Tue, 12 Aug 2003 10:24:38 +0900
Subject: [SCore-users-jp] Re: [SCore-users] pm gigabit
In-Reply-To: Your message of "Tue, 12 Aug 2003 01:30:24 JST." <200308120130.24698.nick@streamline-computing.com>
Message-ID: <20030812011913.88AD312894C@neal.il.is.s.u-tokyo.ac.jp>

In article <200308120130.24698.nick @ streamline-computing.com> Nick Birkett writes:
> The system has some pm problems: pm-ethernet (gigabit):
>
> unit 0
> maxnsend 16
> backoff 2048
> checksum 1
>
> gives pm timeouts using scstest.
> When running application we get :
>
> <6> SCore-D:WARNING PM gigaethernet/ethernet device already opened.
> <6> SCore-D:ERROR No PM device opened.

Is there any other process using PM/ethernet on compute host 6? If no other process is running, please set the PM_DEBUG environment variable to 1 and re-run the application.

from Kameyama Toyohisa

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From kate @ pfu.fujitsu.com Tue Aug 12 22:46:13 2003
From: kate @ pfu.fujitsu.com (KATAYAMA Yoshio)
Date: Tue, 12 Aug 2003 22:46:13 +0900 (JST)
Subject: [SCore-users-jp] scored (PM/UDP does not work)
In-Reply-To: <20030804.114839.126580087.kate@pfu.fujitsu.com>
References: <20030803.124231.59484660.kate@pfu.fujitsu.com> <20030804011330.8392212894C@neal.il.is.s.u-tokyo.ac.jp> <20030804.114839.126580087.kate@pfu.fujitsu.com>
Message-ID: <20030812.224613.03998136.kate@pfu.fujitsu.com>

This is Katayama from PFU. Thank you for your support.

From: KATAYAMA Yoshio
Subject: Re: [SCore-users-jp] PM/UDP does not work
Date: Mon, 04 Aug 2003 11:48:39 +0900 (JST)

> However, in multi-user mode, starting sc_console causes a SIGSEGV on
> the compute node acting as the server, and sc_console commands are
> not accepted. I am still looking into this problem and will report
> back once I have investigated a little more.

The attached file is the log from starting scored with PM_DEBUG=8 and then immediately running sc_console. Judging from the messages, a SIGSEGV seems to occur.

The environment is RedHat 8.0 + SCore 5.4. When building SCore, configure was run with --prefix=/opt/score5.4.0.

I am stuck, not knowing what to look into next.
--
PFU Ltd., 2nd System Division, Linux System Dept.
KATAYAMA Yoshio  Tel 044-520-6617  Fax 044-556-1022

---- attached file: LOG ----
ethernet_open_device(): -config /var/scored/scoreboard/clust.0000B3002Jk7
ethernet_open_device(): -config /var/scored/scoreboard/clust.0000B3002Jk7
pmEthernetOpenDevice: Library version $Id: pm_ethernet.c,v 1.64 2002/03/04 09:44:42 s-sumi Exp $
pmEthernetReadConfig(0x85a8cb8, unit, 0): set unit number "0" (MAX: 4).
pmEthernetReadConfig(0x85a8cb8, maxnsend, 24): set maxnsend "24". pmEthernetReadConfig(0x85a8cb8, backoff, 2400): set backoff "2400" usec. pmEthernetReadConfig(0x85a8cb8, checksum, 0): set checksum "0" off. pmEthernetGetNodeByNumber(0x85a8cb8, 0, 0xbffff158): not found pmEthernetGetNodeByNumber(0x85a8cb8, 0, 0xbffff154): not found store host 00:02:B3:B2:EE:FC: dev{0] mac 0x00000002b3b2eefc pmEthernetGetNodeByNumber(0x85a8cb8, 1, 0xbffff158): not found pmEthernetGetNodeByNumber(0x85a8cb8, 1, 0xbffff154): not found store host 00:02:B3:B2:EE:46: dev{0] mac 0x00000002b3b2ee46 pmEthernetOpenDevice("/var/scored/scoreboard/clust.0000B3002Jk7", 0xbffff614): pmEthernetMapEthernet(0, 0xbffff468): 0 Ethernet(0): fd=512 self comp2.beri.or.jp n 0 of 2 nodes pm_ethernetCalibrateTimer(): loop t:3.805141e+07, vt: 1.902900e-02 pmEthernetOpenDevice: Library version $Id: pm_ethernet.c,v 1.64 2002/03/04 09:44:42 s-sumi Exp $ pmEthernetReadConfig(0x85a8cb8, unit, 0): set unit number "0" (MAX: 4). pmEthernetReadConfig(0x85a8cb8, maxnsend, 24): set maxnsend "24". pmEthernetReadConfig(0x85a8cb8, backoff, 2400): set backoff "2400" usec. pmEthernetReadConfig(0x85a8cb8, checksum, 0): set checksum "0" off. 
pmEthernetGetNodeByNumber(0x85a8cb8, 0, 0xbffff158): not found pmEthernetGetNodeByNumber(0x85a8cb8, 0, 0xbffff154): not found store host 00:02:B3:B2:EE:FC: dev{0] mac 0x00000002b3b2eefc pmEthernetGetNodeByNumber(0x85a8cb8, 1, 0xbffff158): not found pmEthernetGetNodeByNumber(0x85a8cb8, 1, 0xbffff154): not found store host 00:02:B3:B2:EE:46: dev{0] mac 0x00000002b3b2ee46 pmEthernetOpenDevice("/var/scored/scoreboard/clust.0000B3002Jk7", 0xbffff614): pmEthernetMapEthernet(0, 0xbffff468): 0 Ethernet(0): fd=512 self comp3.beri.or.jp n 1 of 2 nodes pm_ethernetCalibrateTimer(): loop t:3.800225e+07, vt: 1.900700e-02 pm_ethernetCalibrateTimer(): loop t:3.995259e+07, vt: 1.998100e-02 pm_ethernetCalibrateTimer(): end loop t:3.995259e+07, vt: 1.998100e-02 pm_ethernetCalibrateTimer(): d0:1.999382e+09, d1:1.999529e+09 pm_ethernetCalibrateTimer(): clk:1999, clock 1.999456e+03 pmEthernetOpenDevice: Driver version $Id: pm_ethernet_dev.c,v 1.4 2003/02/03 03:11:42 kameyama Exp $ ethernet_open_device(): success [0](0) PM/Ethernet CTX map Ctx(0x85aa008): send=0x4003a000, recv=0x4005a000, shared=0x40000000 success [1](0)pmEthernetRegisterProc(): proc 1676(1676), tid 1676 [1](0) pmEthernetAssociateNodes(0x85aa008, 0x823a3e0, 2):ndev=1 [1](0) pmEthernetBindChannel(0x85aa008, 0, 15): called pm_ethernetCalibrateTimer(): loop t:3.995405e+07, vt: 1.998100e-02 pm_ethernetCalibrateTimer(): end loop t:3.995405e+07, vt: 1.998100e-02 pm_ethernetCalibrateTimer(): d0:1.999654e+09, d1:1.999602e+09 pm_ethernetCalibrateTimer(): clk:1999, clock 1.999628e+03 pmEthernetOpenDevice: Driver version $Id: pm_ethernet_dev.c,v 1.4 2003/02/03 03:11:42 kameyama Exp $ ethernet_open_device(): success [0](0) PM/Ethernet CTX map Ctx(0x85aa008): send=0x4003a000, recv=0x4005a000, shared=0x40000000 success [0](0)pmEthernetRegisterProc(): proc 1767(1767), tid 1767 [0](0) pmEthernetAssociateNodes(0x85aa008, 0x823a3e0, 2):ndev=1 [0](0) pmEthernetBindChannel(0x85aa008, 0, 15): called SYSLOG: /opt/score5.4.0/deploy/scored 
SYSLOG: SCore-D 5.4.0 $Id: init.cc,v 1.68 2002/10/31 08:43:01 kameyama Exp $
SYSLOG: Compile option(s):
SYSLOG: SCore-D network: ethernet1/ethernet
SYSLOG: Cluster[0]: (0..1)x1.i386-redhat8-linux2_4.pentium4.2000
SYSLOG: Memory: 1002[MB], Swap: 2048[MB], Disk: 5042[MB]
SYSLOG: Network[0]: ethernet1/ethernet
SYSLOG: Scheduler initiated: Timeslice = 500 [msec]
SYSLOG: Queue[0] activated, exclusive scheduling
SYSLOG: Queue[1] activated, time-sharing scheduling
SYSLOG: Queue[2] activated, time-sharing scheduling
SYSLOG: Session ID: 0
SYSLOG: Server Host: comp3.beri.or.jp
SYSLOG: Backup Host: comp2.beri.or.jp
SYSLOG: Operated by: root
SYSLOG: ========= SCore-D (5.4.0) bootup in SECURE MODE ========
<1> ULT: Exception Signal (11)

From PH1303 @ gmx.de Fri Aug 15 15:10:41 2003
From: PH1303 @ gmx.de (PH1303 @ gmx.de)
Date: Fri, 15 Aug 2003 08:10:41 +0200 (MEST)
Subject: [SCore-users-jp] [SCore-users] Job can't be killed
Message-ID: <1633.1060927841@www47.gmx.net>

Hi,

I installed SCore 5.4.0 on a 16-node SMP Red Hat 7.3 cluster with GE and shmem. Everything works fine. I started a multi-user environment with

sc_syslog mysyslogserver /my/syslogfile
sc_watch -g mygroup -l /my/message -f /my/sc_watch.log scored -sysmon mysysmonserver -syslog mysyslogserver

A user submitted a job, then his shell crashed (he was logged out before the job finished), and now the job can't be killed, aborted, or otherwise controlled via sc_console commands.
Here the scored.messages: --------------------------- 14/Aug/2003 18:51:06 SYSLOG: Login accepted: user01 @ comp01.domain.de:33488, JID: 13, Hosts: 1(1x1)@0, Priority: 1, Command: /baw/daten01/user01/benchmarks/telemac/cas21870_tmp/out21870 14/Aug/2003 18:51:45 SYSLOG: Logout: user01 @ comp01.domain.de:33488, JOB-ID: 13, CPU Time: 37.93[S] 14/Aug/2003 18:52:51 CONSOLE: >> info job 14/Aug/2003 18:52:51 CONSOLE: 13 user01 @ comp01:33488 1(1x1)@0.IRL - z 37.93[S]/105.0[S] 89.25[MB] 1[MB] out21870 14/Aug/2003 18:52:51 CONSOLE: 1 jobs. 14/Aug/2003 18:52:57 CONSOLE: >> abort 13 14/Aug/2003 18:52:57 CONSOLE: ERROR: job (13) not found. 14/Aug/2003 18:53:00 CONSOLE: >> kill 13 14/Aug/2003 18:53:26 CONSOLE: >> kill all 14/Aug/2003 18:53:43 CONSOLE: >> info queue 14/Aug/2003 18:53:43 CONSOLE: Queue[0] activated, 0 running exclusively, 0 waiting 14/Aug/2003 18:53:43 CONSOLE: Queue[1] activated, 0 running time-shared 14/Aug/2003 18:53:43 CONSOLE: Queue[2] activated, 0 running time-shared 14/Aug/2003 18:53:43 CONSOLE: 0 job(s) suspended 14/Aug/2003 18:53:43 CONSOLE: 0 job(s) aborted 14/Aug/2003 18:53:43 CONSOLE: 0 job(s) waiting for login 15/Aug/2003 07:44:18 CONSOLE: >> info all 15/Aug/2003 07:44:18 CONSOLE: SCore-D 5.4.0 SCORE_NOT_SECURE 15/Aug/2003 07:44:18 CONSOLE: Cluster[0]: (0..15)x2.i386-redhat7-linux2_4.pentium-iv.2800 15/Aug/2003 07:44:18 CONSOLE: Memory: 504[MB], Swap: 2048[MB], Disk: 5040[MB] 15/Aug/2003 07:44:18 CONSOLE: Network[0]: ethernet/ethernet 15/Aug/2003 07:44:18 CONSOLE: Hostname Load Memory Swap Disk 15/Aug/2003 07:44:18 CONSOLE: comp01.domain.de @ 0 0.00 1375.95[MB]/2527.11[MB] 2047.96[MB]/2047.96[MB] 7076.59[MB]/10079.13[MB] 15/Aug/2003 07:44:18 CONSOLE: comp02.domain.de @ 1 0.00 102.40[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4562.91[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp03.domain.de @ 2 0.00 102.38[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.50[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp04.domain.de @ 3 0.00 102.63[MB]/503.20[MB] 
2047.96[MB]/2047.96[MB] 4563.47[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp05.domain.de @ 4 0.02 102.35[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.50[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp06.domain.de @ 5 0.00 102.59[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.50[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp07.domain.de @ 6 0.01 102.59[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.51[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp08.domain.de @ 7 0.00 102.84[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.51[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp09.domain.de @ 8 0.00 102.81[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.76[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp10.domain.de @ 9 0.00 102.65[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.80[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp11.domain.de @ 10 0.00 102.37[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.78[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp12.domain.de @ 11 0.00 102.74[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.78[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp13.domain.de @ 12 0.00 102.58[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.79[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp14.domain.de @ 13 0.02 102.59[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4563.79[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp15.domain.de @ 14 0.00 103.55[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4565.97[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: comp16.domain.de @ 15 0.04 137.74[MB]/503.20[MB] 2047.96[MB]/2047.96[MB] 4565.81[MB]/5039.53[MB] 15/Aug/2003 07:44:18 CONSOLE: no device. 
15/Aug/2003 07:44:18 CONSOLE: Queue[0] activated, 0 running exclusively, 0 waiting 15/Aug/2003 07:44:18 CONSOLE: Queue[1] activated, 0 running time-shared 15/Aug/2003 07:44:18 CONSOLE: Queue[2] activated, 0 running time-shared 15/Aug/2003 07:44:18 CONSOLE: 0 job(s) suspended 15/Aug/2003 07:44:18 CONSOLE: 0 job(s) aborted 15/Aug/2003 07:44:18 CONSOLE: 0 job(s) waiting for login 15/Aug/2003 07:44:18 CONSOLE: Cluster Nodes Memory Disk #Jobs 15/Aug/2003 07:44:18 CONSOLE: [0] 0..15 504[MB] 5040[MB] (none) 15/Aug/2003 07:44:18 CONSOLE: Queue Time Remain Memory Disk Group Min Max 15/Aug/2003 07:44:18 CONSOLE: [0] (none) (none) (none) (none) (none) (none) (none) 15/Aug/2003 07:44:18 CONSOLE: [1] (none) (none) (none) (none) (none) (none) (none) 15/Aug/2003 07:44:18 CONSOLE: [2] (none) (none) (none) (none) (none) (none) (none) 15/Aug/2003 07:44:18 CONSOLE: JID:13 CPU time limit: 0.0[m] 15/Aug/2003 07:44:18 CONSOLE: JID:13 Cluster[0].i386-redhat7-linux2_4 Memory limit: (none) Disk limit: (none) 15/Aug/2003 07:44:18 CONSOLE: 13 user01 @ comp01:33488 1(1x1)@0.IRL - z 37.93[S]/12.88[H] 89.25[MB] 1[MB] out21870 15/Aug/2003 07:44:18 CONSOLE: 1 jobs. ---------------------- sctop command after 13 hours (!) shows following output: ----------------------- 16 Hosts, comp01.domain.de .. comp16.domain.de Up 15.34[H], Load Average: 0.00, 1 Jobs logged in, 13 Jobs accumulated Host#: 0---------1----1 0123456789012345 #Jobs: 0000000000000000 JID User @ Host:Port Resource P S TIME(CPU/Elps) Memory Disk Command 13 user01 @ comp01:33488 1(1x1)@0.IRL - z 37.93[S]/13.14[H] 89.25[MB] 1[MB] out21870 ------------------------ What can I do to get rid of this job? Best regards, Patrick -- COMPUTERBILD 15/03: Premium-e-mail-Dienste im Test -------------------------------------------------- 1. GMX TopMail - Platz 1 und Testsieger! 2. GMX ProMail - Platz 2 und Preis-Qualitätssieger! 3. Arcor - 4. web.de - 5. T-Online - 6. freenet.de - 7. daybyday - 8. 
e-Post _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From hori @ swimmy-soft.com Fri Aug 15 17:16:39 2003 From: hori @ swimmy-soft.com (Atsushi HORI) Date: Fri, 15 Aug 2003 17:16:39 +0900 Subject: [SCore-users-jp] Re: [SCore-users] Job can't be killed In-Reply-To: <1633.1060927841@www47.gmx.net> References: <1633.1060927841@www47.gmx.net> Message-ID: <3143812599.hori0001@swimmy-soft.com> Hi, >A user submitted a job, his shell crashed (logged out before job was >finished) and now the job can't be killed/aborted or anything else >by sc_console >command. Here the scored.messages: Well, this sounds like a situation in which the job is "almost" killed but is waiting for the TCP connection between SCore-D and the scrun process to be closed. This can happen when a machine crashes. I suppose the job will be gone when the host is rebooted. >JID User @ Host:Port Resource P S TIME(CPU/Elps) Memory Disk > Command > 13 user01 @ comp01:33488 1(1x1)@0.IRL - z 37.93[S]/13.14[H] 89.25[MB] 1[MB] >out21870 The simplest way to get rid of this job is to kill SCore-D by typing ^C in the scout environment. ---- Atsushi HORI Swimmy Software, Inc. _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From nick @ streamline-computing.com Fri Aug 15 22:21:28 2003 From: nick @ streamline-computing.com (Nick Birkett) Date: Fri, 15 Aug 2003 14:21:28 +0100 Subject: [SCore-users-jp] [SCore-users] re: pm gigabit Message-ID: <200308120130.24698.nick@streamline-computing.com> >We have some new motherboards with 82546EB Gigabit Ethernet Controller. >This required adding the Intel 5.1.11 e1000 driver to the Score 2.4.19-1SCORE >kernel. >The system has some pm problems: pm-ethernet (gigabit): >unit 0 >maxnsend 16 >backoff 2048 >checksum 1 >gives pm timeouts using scstest.
> > When running application we get : > > <6> SCore-D:WARNING PM gigaethernet/ethernet device already opened. > <6> SCore-D:ERROR No PM device opened. OK, I have changed the e1000 driver back to version 4.4.19 and now it seems fine on older motherboards. However, some of the newer motherboards with onboard e1000 we are using do not work at all with the 4.4.19 driver. Does the 4.4.19 driver in the 2.4.19-2SCORE kernel have any modifications, or is this driver the unmodified one from Intel ? Thanks Nick -- Dr Nick Birkett Technical Director Streamline Computing Ltd The Innovation Centre Warwick Technology Park Gallows Hill Warwick CV34 6UW Tel : +44 (0)1926 623130 Fax : +44 (0)1926 623140 Mobile : +44 (0)7890 246662 Email : nrcb @ streamline-computing.com Support : support @ streamline-computing.com Web : http://www.streamline-computing.com _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From pccc @ ics-inc.co.jp Mon Aug 18 14:08:37 2003 From: pccc @ ics-inc.co.jp (pccc) Date: Mon, 18 Aug 2003 14:08:37 +0900 Subject: [SCore-users-jp] Announcement of the Scientific System Research Forum HPC Forum Message-ID: Dear SCore-users-jp members, this is an announcement from a regular member of the PC Cluster Consortium. PC Cluster Consortium Secretariat ************************************* "Announcement of the SS-ken HPC Forum 2003" ************************************* The HPC Forum (HPCF) of the Scientific System Research Forum (below, "SS-ken") holds meetings that take up state-of-the-art case studies of applications, mainly in the HPC field. This time, distinguished researchers in various kinds of simulation will give talks, covering a wide and interesting range of topics, from the realities of analyzing financial and economic data to cell simulation and multiscale simulation. From overseas, we will also have a talk by Horst Simon, director of NERSC, the Department of Energy supercomputer center in Berkeley, California, who is well known as a development member of the NAS Parallel Benchmarks and as a judge for the Gordon Bell Prize. Anyone (*1), not only SS-ken members, may attend HPCF2003. Please join us, together with your colleagues. *1: Computer vendors are kindly asked to refrain from attending. ● Theme/Program (honorifics omitted) "New Areas for HPC Applications" http://www.ssken.gr.jp/bun/hpcm/program2003.html 1. Analysis and simulation techniques for high-frequency financial data (Hideki Takayasu, Sony Computer Science Laboratories) 2. E-CELL Project: computer simulation of the cell (Masaru Tomita, Institute for Advanced Biosciences, Keio University) 3.
Recent Progress in Computational Science at NERSC (Dr. Horst D. Simon, Director, NERSC Center and Computational Research Division) 4. OCTA Project: multiscale simulation of materials (Yuichi Masubuchi, Department of Computational Science and Engineering, Graduate School of Engineering, Nagoya University) 5. Atomic-scale simulation for nanodevice development (Chihoko Kaneta, Silicon Technology Laboratories, Fujitsu Laboratories Ltd.) * Please see the web page for the talk abstracts. ● Date/Venue Date: Friday, October 3, 2003, 10:30-17:20 (reception 17:30-19:30) Venue: Fujitsu main conference room, 24F, Shiodome City Center (Tokyo) Map: http://jp.fujitsu.com/facilities/shiodome/ ● Registration deadline: Friday, September 12 http://www.ssken.gr.jp/bun/hpcm/program2003.html#reg * Registration closes as soon as capacity is reached. ● Speaker profiles etc.: http://www.ssken.gr.jp/bun/hpcm/sshpcf2003leaflet.pdf Please feel free to contact us with any questions. Contact: Ishiwata +++++++++++++++++++++++++++++++++++++++++++++ Scientific System Research Forum Secretariat Tel. 03-3778-8215 Fax. 03-3778-8238 Email. ssken @ ssken.gr.jp http://www.ssken.gr.jp/ +++++++++++++++++++++++++++++++++++++++++++++ From uebayasi @ pultek.co.jp Tue Aug 19 17:15:16 2003 From: uebayasi @ pultek.co.jp (Masao Uebayashi) Date: Tue, 19 Aug 2003 17:15:16 +0900 (JST) Subject: [SCore-users-jp] [SCore-users] MPICH-SCore on heterogeneous environments Message-ID: <20030819.171516.879472766.uebayasi@pultek.co.jp> Hello. Does MPICH-SCore work in an environment composed of * a cluster built only with Alpha machines (== homogeneous cluster) * an i386 server * an i386 remote host ? Masao _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From kameyama @ pccluster.org Tue Aug 19 17:44:49 2003 From: kameyama @ pccluster.org (=?iso-2022-jp?b?a2FtZXlhbWEgGyRCIXcbKEIgcGNjbHVzdGVyLm9yZw==?=) Date: Tue, 19 Aug 2003 17:44:49 +0900 Subject: [SCore-users-jp] Re: [SCore-users] MPICH-SCore on heterogeneous environments In-Reply-To: Your message of "Tue, 19 Aug 2003 17:15:16 JST."
<20030819.171516.879472766.uebayasi@pultek.co.jp> Message-ID: <20030819083901.9A90712894C@neal.il.is.s.u-tokyo.ac.jp> In article <20030819.171516.879472766.uebayasi @ pultek.co.jp> Masao Uebayashi writes: > Hello. > > Does MPICH-SCore work in an environment composed of > > * a cluster built only with Alpha machines (== homogeneous cluster) Note that Alpha machines are not supported after SCore 4.1. > * an i386 server > * an i386 remote host If you want to run in that environment, the MPI program must be compiled on both the Alpha machine and an i386 machine, so you must set up a compile environment on the Alpha machine. Please look at the following document: http://www.pccluster.org/score/dist/score/html/en/reference/scored/hetero.html from Kameyama Toyohisa _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From pvenka @ yahoo.com Tue Aug 19 18:21:14 2003 From: pvenka @ yahoo.com (parthasarathy venkataraman) Date: Tue, 19 Aug 2003 02:21:14 -0700 (PDT) Subject: [SCore-users-jp] [SCore-users] output files Message-ID: <20030819092114.4706.qmail@web12304.mail.yahoo.com> Hi, Is it possible to view the output file while the job is running? I have a job which runs for 1 hr 45 minutes and the output is sent to the output file only after the job finishes. venkat __________________________________ Do you Yahoo!? Yahoo!
SiteBuilder - Free, easy-to-use web site design software http://sitebuilder.yahoo.com _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From uebayasi @ pultek.co.jp Tue Aug 19 21:15:56 2003 From: uebayasi @ pultek.co.jp (Masao Uebayashi) Date: Tue, 19 Aug 2003 21:15:56 +0900 (JST) Subject: [SCore-users-jp] Re: [SCore-users] MPICH-SCore on heterogeneous environments In-Reply-To: <20030819083901.9A90712894C@neal.il.is.s.u-tokyo.ac.jp> References: <20030819.171516.879472766.uebayasi@pultek.co.jp> <20030819083901.9A90712894C@neal.il.is.s.u-tokyo.ac.jp> Message-ID: <20030819.211556.1005186341.uebayasi@pultek.co.jp> One point was that MPICH-SCore supports only homogeneous clusters. I wanted to know whether a cluster composed of an i386 remote host + an i386 server host + non-i386 computing hosts counts as homogeneous for MPICH-SCore. > If you want to run that environment, > The MPI program must compile on Alpha machine and i386 machine. > So you must setup compile environment on alpha machine. I see. Thank you very much. Masao _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From nick @ streamline-computing.com Wed Aug 20 22:01:29 2003 From: nick @ streamline-computing.com (Nick Birkett) Date: Wed, 20 Aug 2003 14:01:29 +0100 Subject: [SCore-users-jp] [SCore-users] IA64 SMP Message-ID: <200308201401.29078.nick@streamline-computing.com> Hi there, I have Score 5.4 running on a 4 cpu itanium system. However I cannot get things to run on more than 2 cpus: nickb @ itanium00 la3d]$ scout -F hosts -e scrun -nodes=4 ~/bin/ia64/hyd-3.04.027.par_op done. FEP:ERROR SCore-D Login failed: Resource unavailable. SCOUT: Session done. scout -F hosts -e scrun -nodes=1x4 ~/bin/ia64/hyd-3.04.027.par_op done. FEP:ERROR SCore-D Login failed: Resource unavailable.
The hosts file has a single entry itanium00.streamline which is a 4 cpu system. 2 cpu jobs work fine: [nickb @ itanium00 la3d]$ scout -F hosts -e scrun -nodes=1x2 ~/bin/ia64/hyd-3.04.027.par_op done. SCore-D 5.4.0 connected. <0:0> SCORE: 2 nodes (1x2) ready. (*001)OPlus rev 1.07: good start (*001)********************************************* (*001)*** *** This is my shmem section of the scorehosts.db file:
## /* PM/SHMEM */
shmem0 type=shmem -node=0
shmem1 type=shmem -node=1
shmem2 type=shmem -node=2
shmem3 type=shmem -node=3
##
#include "/opt/score//etc/ndconf/0"
##
#define MSGBSERV msgbserv=(itanium00.streamline:8764)
itanium00.streamline HOST_0 network=ethernet,shmem0,shmem1,shmem2,shmem3 group=_scoreall_,SHMEM smp=4 MSGBSERV
I have attached a tar file of the score/etc directory. If anyone can tell me what is wrong I would be grateful. Many thanks, Nick -- Dr Nick Birkett Technical Director Streamline Computing Ltd The Innovation Centre Warwick Technology Park Gallows Hill Warwick CV34 6UW Tel : +44 (0)1926 623130 Fax : +44 (0)1926 623140 Mobile : +44 (0)7890 246662 Email : nrcb @ streamline-computing.com Support : support @ streamline-computing.com Web : http://www.streamline-computing.com -------------- next part -------------- A non-text attachment was scrubbed... Name: ia64.etc.tgz Type: application/x-tgz Size: 6544 bytes Desc: none URL: From kameyama @ pccluster.org Thu Aug 21 09:21:08 2003 From: kameyama @ pccluster.org (=?iso-2022-jp?b?a2FtZXlhbWEgGyRCIXcbKEIgcGNjbHVzdGVyLm9yZw==?=) Date: Thu, 21 Aug 2003 09:21:08 +0900 Subject: [SCore-users-jp] Re: [SCore-users] IA64 SMP In-Reply-To: Your message of "Wed, 20 Aug 2003 14:01:29 JST." <200308201401.29078.nick@streamline-computing.com> Message-ID: <20030821001515.6445012894C@neal.il.is.s.u-tokyo.ac.jp> In article <200308201401.29078.nick @ streamline-computing.com> Nick Birkett writes: > Hi there, I have Score 5.4 running on a 4 cpu itanium system.
> This is my shmem section of the scorehosts.db file: I think your scorehosts.db is OK, but scoreboard does not read this file. (scoreboard does not re-read scorehosts.db until it is restarted or a SIGHUP is caught.) Please use the following command to check the scoreboard: % scbinfo -t v -n itanium00.streamline -a network This command outputs the list of networks. If this output does not include shmem2 and shmem3, please restart or reload the scoreboard. from Kameyama Toyohisa _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From hori @ swimmy-soft.com Thu Aug 21 10:21:20 2003 From: hori @ swimmy-soft.com (Atsushi HORI) Date: Thu, 21 Aug 2003 10:21:20 +0900 Subject: [SCore-users-jp] Re: [SCore-users] IA64 SMP In-Reply-To: <200308201401.29078.nick@streamline-computing.com> References: <200308201401.29078.nick@streamline-computing.com> Message-ID: <3144306080.hori0001@swimmy-soft.com> Hi, >Hi there, I have Score 5.4 running on a 4 cpu itanium system. > >However I cannot get things to run on more than 2 cpus: Kameyama-san has already answered, but I suspect that not all the CPUs are working. Check /proc/cpuinfo. Or, send the output of scored invoked in multiuser mode. ---- Atsushi HORI Swimmy Software, Inc.
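Hori's /proc/cpuinfo check boils down to counting "processor" entries. A minimal sketch (not from the original thread; the here-document stands in for /proc/cpuinfo so the snippet is self-contained — on a real node you would read the file directly):

```shell
# Count CPUs the same way "grep -c '^processor' /proc/cpuinfo" would.
# The sample below is illustrative data, not output from the thread's machine.
cpuinfo_sample() {
cat <<'EOF'
processor : 0
processor : 1
processor : 2
processor : 3
EOF
}
ncpus=$(cpuinfo_sample | grep -c '^processor')
echo "detected $ncpus cpus"
```

On the 4-CPU Itanium node discussed above, `grep -c '^processor' /proc/cpuinfo` should report 4 if all CPUs are visible to the kernel.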
_______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From nick @ streamline-computing.com Thu Aug 21 15:14:39 2003 From: nick @ streamline-computing.com (Nick Birkett) Date: Thu, 21 Aug 2003 07:14:39 +0100 Subject: [SCore-users-jp] Re: [SCore-users] IA64 SMP In-Reply-To: <20030821001515.6445012894C@neal.il.is.s.u-tokyo.ac.jp> References: <20030821001515.6445012894C@neal.il.is.s.u-tokyo.ac.jp> Message-ID: <200308210714.39124.nick@streamline-computing.com> On Thursday 21 August 2003 01:21, kameyama @ pccluster.org wrote: > In article <200308201401.29078.nick @ streamline-computing.com> Nick Birkett wrotes: > > Hi there, I have Score 5.4 running on a 4 cpu itanium system. > > > > This is my shmem section of the scorehosts.db file: > > I think your scorehostss.db is OK, but scoreboard dose not read this file. > (scoreboard dose not read scorehostss.db until restart or SIGHUP is > cached.) > > Please check following command to check scoreboard: > % scbinfo -t v -n itanium00.streamline -a network > This command output list of network. > If this output is not include shmem2 and shmem3, please restart or reload > scoreboard. 
[root @ itanium00 nickb]# /etc/init.d/scoreboard stop Shutting down scoreboard services: [ OK ] [root @ itanium00 nickb]# /etc/init.d/scoreboard start Starting scoreboard services: [ OK ] [root @ itanium00 nickb]# scbinfo -t v -n itanium00.streamline -a network ethernet shmem0 shmem1 [root @ itanium00 nickb]# cat /proc/cpuinfo processor : 0 vendor : GenuineIntel arch : IA-64 family : Itanium 2 model : 1 revision : 5 archrev : 0 features : branchlong cpu number : 0 cpu regs : 4 cpu MHz : 1296.473997 itc MHz : 1296.473997 BogoMIPS : 1941.96 processor : 1 vendor : GenuineIntel arch : IA-64 family : Itanium 2 model : 1 revision : 5 archrev : 0 features : branchlong cpu number : 0 cpu regs : 4 cpu MHz : 1296.473997 itc MHz : 1296.473997 BogoMIPS : 1941.96 processor : 2 vendor : GenuineIntel arch : IA-64 family : Itanium 2 model : 1 revision : 5 archrev : 0 features : branchlong cpu number : 0 cpu regs : 4 cpu MHz : 1296.473997 itc MHz : 1296.473997 BogoMIPS : 1941.96 processor : 3 vendor : GenuineIntel arch : IA-64 family : Itanium 2 model : 1 revision : 5 archrev : 0 features : branchlong cpu number : 0 cpu regs : 4 cpu MHz : 1296.473997 itc MHz : 1296.473997 BogoMIPS : 1941.96 top 7:09am up 6 days, 12:22, 1 user, load average: 0.00, 0.00, 0.00 59 processes: 56 sleeping, 1 running, 0 zombie, 2 stopped CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle CPU1 states: 0.0% user, 0.1% system, 0.0% nice, 99.9% idle CPU2 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle CPU3 states: 0.2% user, 0.4% system, 0.0% nice, 99.4% idle Mem: 16656416K av, 3333616K used, 13322800K free, 0K shrd, 337712K buff Swap: 2040208K av, 0K used, 2040208K free 1988608K cached PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND 4705 nickb I've restarted scoreboard and tested I have 4 cpu's. [nickb @ itanium00 falcon]$ scout -F hosts -e scrun -nodes=4 ~/bin/ia64/hyd-3.04.027.par_op done. FEP:ERROR SCore-D Login failed: Resource unavailable. SCOUT: Session done. 
Perhaps there is something wrong with my binaries ? It does work fine using 2 cpus, but scbinfo only shows shmem0,shmem1 Is there anything else I need to check ? -- Dr Nick Birkett Technical Director Streamline Computing Ltd The Innovation Centre Warwick Technology Park Gallows Hill Warwick CV34 6UW Tel : +44 (0)1926 623130 Fax : +44 (0)1926 623140 Mobile : +44 (0)7890 246662 Email : nrcb @ streamline-computing.com Support : support @ streamline-computing.com Web : http://www.streamline-computing.com _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From kameyama @ pccluster.org Thu Aug 21 15:28:57 2003 From: kameyama @ pccluster.org (=?iso-2022-jp?b?a2FtZXlhbWEgGyRCIXcbKEIgcGNjbHVzdGVyLm9yZw==?=) Date: Thu, 21 Aug 2003 15:28:57 +0900 Subject: [SCore-users-jp] Re: [SCore-users] IA64 SMP In-Reply-To: Your message of "Thu, 21 Aug 2003 07:14:39 JST." <200308210714.39124.nick@streamline-computing.com> Message-ID: <20030821062302.A56F912894C@neal.il.is.s.u-tokyo.ac.jp> In article <200308210714.39124.nick @ streamline-computing.com> Nick Birkett writes: > [root @ itanium00 nickb]# /etc/init.d/scoreboard stop > Shutting down scoreboard services: [ OK ] > [root @ itanium00 nickb]# /etc/init.d/scoreboard start > Starting scoreboard services: [ OK ] > [root @ itanium00 nickb]# scbinfo -t v -n itanium00.streamline -a network > ethernet shmem0 shmem1 Probably, this is a scoreboard problem. 0. Please check the SCBDSERV environment variable. 1. Please execute the following command: % scoreboard -test This command dumps the scoreboard database. 2. Please check /etc/init.d/scoreboard. If this script specifies a scoreboard file, please check that file.
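Step 0 of the checklist above can be sketched as a quick shell probe. This is purely illustrative: the hostname is the one used in this thread, and both values would normally come from your own environment rather than being assigned in the script.

```shell
# Sketch: verify that SCBDSERV names the intended scoreboard host before
# running scout/scrun. Both values here are illustrative stand-ins.
SCBDSERV="itanium00.streamline"   # what the environment currently has
expected="itanium00.streamline"   # the host the scoreboard actually runs on
if [ "$SCBDSERV" = "$expected" ]; then
    echo "SCBDSERV ok: $SCBDSERV"
else
    echo "SCBDSERV mismatch: got '$SCBDSERV', want '$expected'"
fi
```

As the follow-up message in this thread shows, a SCBDSERV value pointing at the wrong cluster name is exactly what caused the 4-CPU login failure.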
from Kameyama Toyohisa _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From nick @ streamline-computing.com Thu Aug 21 15:40:50 2003 From: nick @ streamline-computing.com (Nick Birkett) Date: Thu, 21 Aug 2003 07:40:50 +0100 Subject: [SCore-users-jp] Re: [SCore-users] IA64 SMP In-Reply-To: <20030821062302.A56F912894C@neal.il.is.s.u-tokyo.ac.jp> References: <20030821062302.A56F912894C@neal.il.is.s.u-tokyo.ac.jp> Message-ID: <200308210740.50830.nick@streamline-computing.com> On Thursday 21 August 2003 07:28, you wrote: > Probably, this is scoreboard probrem. > 0. Please check SCBDSERV environment variable. > Many thanks - it seems SCBDSERV was set to the wrong cluster name. I have changed it to itanium00.streamline and now I can run 4 cpu jobs. -- Dr Nick Birkett Technical Director Streamline Computing Ltd The Innovation Centre Warwick Technology Park Gallows Hill Warwick CV34 6UW Tel : +44 (0)1926 623130 Fax : +44 (0)1926 623140 Mobile : +44 (0)7890 246662 Email : nrcb @ streamline-computing.com Support : support @ streamline-computing.com Web : http://www.streamline-computing.com _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From nick @ streamline-computing.com Fri Aug 22 01:27:01 2003 From: nick @ streamline-computing.com (Nick Birkett) Date: Thu, 21 Aug 2003 17:27:01 +0100 Subject: [SCore-users-jp] [SCore-users] IA32/IA64 Message-ID: <200308211727.01481.nick@streamline-computing.com> I need some help getting an Intel Xeon / Itanium cluster to work. I have a 16 node dual i386 cluster and a single 4 cpu Itanium system, both working with Score 5.4.0 with gigabit network / shmem. I have 2 binaries compiled and working for both architectures.
There is a .wrapper link : nickb @ cserver ]$ ls -al ~/bin/hyd-3.04.027.par_op lrwxrwxrwx 1 nickb sc 8 Aug 21 12:00 /users/nickb/bin/hyd-3.04.027.par_op -> .wrapper The binaries are /opt/score/bin/bin.i386-redhat7-linux2_4/hyd-3.04.027.par_op.exe /opt/score/bin/bin.ia64-redhat-linux2_4/hyd-3.04.027.par_op.exe I can run the wrapper program hyd-3.04.027.par_op on both machines successfully. scout works across both platforms [nickb @ cserver falcon]$ cat hosts comp00.streamline itanium00.streamline [nickb @ cserver falcon]$ scout -F hosts SCOUT: Spawning done. SCOUT: session started. [nickb @ cserver falcon]$ scout hostname [comp00]: comp00.streamline [itanium00]: itanium00.streamline I have compiled scored.exe and scored_dev.exe to include ia64 symbols on the i386 front end server. [nickb @ cserver bin.i386-redhat7-linux2_4]$ nm scored.exe | grep ia64 081a657c t elf64_ia64_size_dynamic_sections 081a7440 t elf64_ia64_unwind_entry_compare 082e945c b elf64_ia64_unwind_entry_compare_bfd 08252c00 r ia64_howto_table 081bac40 t ia64coff_object_p I have both an i386 and ia64 bin and deploy directories on the front end: [nickb @ cserver deploy]$ ls -ald bin.i* drwxr-xr-x 2 root root 4096 Aug 21 16:48 bin.i386-redhat7-linux2_4 drwxr-xr-x 2 root root 4096 Aug 21 16:22 bin.ia64-redhat-linux2_4 This is the error I get when the hosts file has comp00 (i386) and itanium00 (ia64) nickb @ cserver falcon]$ scout -F hosts -e scrun -nodes=4 ~/bin/hyd-3.04.027.par_op SCOUT: Spawning done. <1> ULT:ERROR Unable to open binary file (/opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe)=0 However the file /opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe is readable [nickb @ cserver falcon]$ ls -al /opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe -rwxr-xr-x 1 root root 17489790 May 26 23:19 /opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe Any help to make this work would be appreciated. 
Thanks, Nick -- Dr Nick Birkett Technical Director Streamline Computing Ltd The Innovation Centre Warwick Technology Park Gallows Hill Warwick CV34 6UW _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From hori @ swimmy-soft.com Fri Aug 22 14:29:04 2003 From: hori @ swimmy-soft.com (Atsushi HORI) Date: Fri, 22 Aug 2003 14:29:04 +0900 Subject: [SCore-users-jp] Re: [SCore-users] IA32/IA64 In-Reply-To: <200308211727.01481.nick@streamline-computing.com> References: <200308211727.01481.nick@streamline-computing.com> Message-ID: <3144407344.hori0002@swimmy-soft.com> Hi, again, >However the file >/opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe is readable > >[nickb @ cserver falcon]$ ls -al >/opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe >-rwxr-xr-x 1 root root 17489790 May 26 23:19 >/opt/score/deploy/bin.ia64-redhat-linux2_4/scored.exe What happens if you type the following commands? % scout -F hosts % scout ls -l /opt/score/deploy/bin.ia64-redhat-linux2_4/ In the message "<1> ULT:ERROR Unable to open binary file", the leading "<1>" means that the error happened on host number 1. So scored.exe for ia64 must be present on comp00.streamline too. ---- Atsushi HORI Swimmy Software, Inc. _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From O.Weihe @ deltacomputer.de Fri Aug 22 16:49:31 2003 From: O.Weihe @ deltacomputer.de (Oliver Weihe) Date: Fri, 22 Aug 2003 09:49:31 +0200 Subject: [SCore-users-jp] [SCore-users] new SCore version Message-ID: Hi there, do you have any idea when the new SCore version (supporting the Myrinet interfaces with the Lanai XP chip) will be released ?
regards Oliver Weihe _______________________________________________ SCore-users mailing list SCore-users @ pccluster.org http://www.pccluster.org/mailman/listinfo/score-users From kameyama @ pccluster.org Fri Aug 22 20:34:14 2003 From: kameyama @ pccluster.org (=?iso-2022-jp?b?a2FtZXlhbWEgGyRCIXcbKEIgcGNjbHVzdGVyLm9yZw==?=) Date: Fri, 22 Aug 2003 20:34:14 +0900 Subject: [SCore-users-jp] Re: [SCore-users] IA32/IA64 In-Reply-To: Your message of "Thu, 21 Aug 2003 17:27:01 JST." <200308211727.01481.nick@streamline-computing.com> Message-ID: <20030822112816.B18F912894C@neal.il.is.s.u-tokyo.ac.jp> In article <200308211727.01481.nick @ streamline-computing.com> Nick Birkett writes: > nickb @ cserver falcon]$ scout -F hosts -e scrun -nodes=4 ~/bin/hyd-3.04.027.pa > r_op > SCOUT: Spawning done. > <1> ULT:ERROR Unable to open binary file (/opt/score/deploy/bin.ia64-redhat-l > inux2_4/scored.exe)=0 Please apply the following patch (this patch changes files under the score-src/SCore/mttl-ult/ult directory). Note that I have not verified that this patch is correct. I tested it only on a Red Hat Advanced Workstation 2.1 IA64 host (gcc 2.96) and a Red Hat 8 x86 host (gcc 3.2). But there is no C++ name-mangling compatibility between those C++ compilers, so I did not execute scored on the above environment.
from Kameyama Toyohisa ---------------------------------------cut here--------------------------------- Index: score-src/SCore/mttl-ult/ult/Makefile =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/Makefile,v retrieving revision 1.17 retrieving revision 1.18 diff -u -r1.17 -r1.18 --- score-src/SCore/mttl-ult/ult/Makefile 5 Dec 2002 06:14:48 -0000 1.17 +++ score-src/SCore/mttl-ult/ult/Makefile 26 Jun 2003 09:56:19 -0000 1.18 @@ -1,4 +1,4 @@ -# $Id: Makefile,v 1.17 2002/12/05 06:14:48 kameyama Exp $ +# $Id: Makefile,v 1.18 2003/06/26 09:56:19 kameyama Exp $ # $PCCC_Release$ # $PCCC_Copyright$ # @@ -110,8 +110,9 @@ ### default targets BFD_TARGET = `case $(host_canonical) in \ - i386-unknown-linux) echo elf32-i386;; \ - alpha-unknown-linux) echo elf64-alpha;; \ + i386-*-linux) echo elf32-i386;; \ + alpha-*-linux) echo elf64-alpha;; \ + ia64-*-linux) echo elf64-ia64-little;; \ esac` OBJDIR_RULE = lib Index: score-src/SCore/mttl-ult/ult/heterosetup.c =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/heterosetup.c,v retrieving revision 1.21 retrieving revision 1.24 diff -u -r1.21 -r1.24 --- score-src/SCore/mttl-ult/ult/heterosetup.c 24 Jan 2003 10:09:37 -0000 1.21 +++ score-src/SCore/mttl-ult/ult/heterosetup.c 22 Aug 2003 10:46:42 -0000 1.24 @@ -1,10 +1,10 @@ -static char rcsid[] = "$Id: heterosetup.c,v 1.21 2003/01/24 10:09:37 hori Exp $"; +static char rcsid[] = "$Id: heterosetup.c,v 1.24 2003/08/22 10:46:42 kameyama Exp $"; /* * $PCCC_Release$ * $PCCC_Copyright$ */ /* - * $Id: heterosetup.c,v 1.21 2003/01/24 10:09:37 hori Exp $ + * $Id: heterosetup.c,v 1.24 2003/08/22 10:46:42 kameyama Exp $ * * $RWC_Release: SCore Release 4.2.1 of SCore Cluster System Software (2001/11/13) $ * @@ -78,6 +78,14 @@ extern void mpcxx_FLreg(void*, char*); extern pmContext *score_pmnet[]; +#ifdef __ia64__ +struct ia64_functable { + void *addr; + 
void *gp; +}; +extern struct ia64_functable *ia64_functable; +extern int ia64_numfuncs; +#endif mpcxx_nodeinfo *mpcxx_allNodeInfo; mpcxx_nodecom myNodeCom; @@ -136,6 +144,7 @@ u_short stmp; u_int ltmp; + ULT_DEBUG( 3, "<< recv_host_info()", host_info_count); while( 1 ) { while( 1 ) { st = pmReceive( score_pmnet[ULT_NETSET_NO], (caddr_t*) &buff, &size ); @@ -174,7 +183,7 @@ if( st != PM_SUCCESS ) ULT_ERROR( "pmReleaseReceiveBuffer()=%d", st ); host_info_count ++; - ULT_DEBUG( 0, "recv_host_info(%d)", node ); + ULT_DEBUG( 0, ">>recv_host_info(%d)", node ); return( 0 ); } @@ -189,9 +198,12 @@ if( send_host_info( node ) == 0 ) node ++; if( host_info_count < numNode ) (void) recv_host_info(); } + ULT_DEBUG( 3, "<< exchange_host_info(): node = %d", node); while( host_info_count < numNode ) { (void) recv_host_info(); + ULT_DEBUG( 3, "<< exchange_host_info(): host_info_count = %d", host_info_count); } + ULT_TRACE( 3, ">> setup_host_info()"); } static void setup_host_info( void ) { @@ -278,8 +290,16 @@ #endif /* defined(__NetBSD__) || defined(sunos) */ #if defined(linux) +#ifdef FALSE +#undef FALSE +#endif #include +#ifndef FALSE static boolean dynamic = false; +#else +/* For redhat 9 */ +static bfd_boolean dynamic = FALSE; +#endif static char *target = MPCXX_TARGET; static void @@ -326,6 +346,27 @@ } } } +#ifdef __ia64__ +#define GP_TABLE_NAME ".opd" + { + asection *section; + int section_size; + + section = bfd_get_section_by_name(abfd, GP_TABLE_NAME); + if(!section) { + ULT_ERROR("Unable to read IA64 function data (%s)", filename); + } + section_size = bfd_section_size(abfd, section); + ia64_functable = malloc(section_size); + if (!ia64_functable) { + ULT_ERROR("Unable to allocate function data (%d bytes)", + section_size); + } + bfd_get_section_contents(abfd, section, (char *)ia64_functable, + 0, section_size); + ia64_numfuncs = section_size / sizeof(struct ia64_functable); + } +#endif bfd_close(abfd); } #endif /* defined(linux) */ Index: 
score-src/SCore/mttl-ult/ult/idle.c =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/idle.c,v retrieving revision 1.16 retrieving revision 1.18 diff -u -r1.16 -r1.18 --- score-src/SCore/mttl-ult/ult/idle.c 21 Jan 2003 09:30:28 -0000 1.16 +++ score-src/SCore/mttl-ult/ult/idle.c 3 Apr 2003 03:14:26 -0000 1.18 @@ -2,9 +2,9 @@ * $PCCC_Release$ * $PCCC_Copyright$ */ -static char rcsid[] = "$Id: idle.c,v 1.16 2003/01/21 09:30:28 hori Exp $"; +static char rcsid[] = "$Id: idle.c,v 1.18 2003/04/03 03:14:26 kameyama Exp $"; /* - * $Id: idle.c,v 1.16 2003/01/21 09:30:28 hori Exp $ + * $Id: idle.c,v 1.18 2003/04/03 03:14:26 kameyama Exp $ * * $RWC_Release: SCore Release 4.2.1 of SCore Cluster System Software (2001/11/13) $ * @@ -70,7 +70,7 @@ ult_switch_thread( ult_dequeue() ); /* suspend */ } -void score_setup_sysycall( void ) { +void score_setup_syscall( void ) { extern void score_set_syscall_hook( void(*)(void), void(*)(Syscall*) ); @@ -94,7 +94,7 @@ #else -void score_setup_sysycall( void ) { +void score_setup_syscall( void ) { return; } Index: score-src/SCore/mttl-ult/ult/mpcrt.c =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/mpcrt.c,v retrieving revision 1.32 retrieving revision 1.33 diff -u -r1.32 -r1.33 --- score-src/SCore/mttl-ult/ult/mpcrt.c 24 Jan 2003 10:09:37 -0000 1.32 +++ score-src/SCore/mttl-ult/ult/mpcrt.c 2 Apr 2003 06:34:25 -0000 1.33 @@ -2,9 +2,9 @@ * $PCCC_Release$ * $PCCC_Copyright$ */ -static char rcsid[] = "$Id: mpcrt.c,v 1.32 2003/01/24 10:09:37 hori Exp $"; +static char rcsid[] = "$Id: mpcrt.c,v 1.33 2003/04/02 06:34:25 hori Exp $"; /* - * $Id: mpcrt.c,v 1.32 2003/01/24 10:09:37 hori Exp $ + * $Id: mpcrt.c,v 1.33 2003/04/02 06:34:25 hori Exp $ * * $RWC_Release: SCore Release 4.2.1 of SCore Cluster System Software (2001/11/13) $ * @@ -64,8 +64,8 @@ void score_set_signal_handler( int, void(*)() ); void 
ult_setup_comm( void ); void ult_exit( void ); -void score_setup_sysycall( void ); void mpcxxHeteroEnvSetup( void ); +void score_setup_syscall( void ); int ult_ignore_sync_error_flag = 0; @@ -127,7 +127,7 @@ mpcNullSyncCell.naddr = 0; ult_setup_comm(); /* numNode, myNode are set */ mpcxxHeteroEnvSetup(); /* must be called after ult_setup_comm() */ - score_setup_sysycall(); /* must be called after HeteroEnvSetup() */ + score_setup_syscall(); /* must be called after HeteroEnvSetup() */ } static void non_spmd_initialize( void ) { Index: score-src/SCore/mttl-ult/ult/mpcxx.h =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/mpcxx.h,v retrieving revision 1.8 retrieving revision 1.9 diff -u -r1.8 -r1.9 --- score-src/SCore/mttl-ult/ult/mpcxx.h 14 Dec 2001 07:30:22 -0000 1.8 +++ score-src/SCore/mttl-ult/ult/mpcxx.h 27 Mar 2003 06:15:14 -0000 1.9 @@ -3,7 +3,7 @@ * $PCCC_Copyright$ */ /* - * $Id: mpcxx.h,v 1.8 2001/12/14 07:30:22 kameyama Exp $ + * $Id: mpcxx.h,v 1.9 2003/03/27 06:15:14 hori Exp $ * * $RWC_Release: SCore Release 4.2.1 of SCore Cluster System Software (2001/11/13) $ * @@ -58,9 +58,9 @@ #include #include -#ifdef __cplusplus - #define STACKSIZE(SS) int ult_stack_size=SS + +#ifdef __cplusplus #include Index: score-src/SCore/mttl-ult/ult/mpcxx_mttl.h =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/mpcxx_mttl.h,v retrieving revision 1.28 retrieving revision 1.29 diff -u -r1.28 -r1.29 --- score-src/SCore/mttl-ult/ult/mpcxx_mttl.h 14 Dec 2001 07:30:22 -0000 1.28 +++ score-src/SCore/mttl-ult/ult/mpcxx_mttl.h 26 Jun 2003 09:56:19 -0000 1.29 @@ -3,7 +3,7 @@ * $PCCC_Copyright$ */ /* - * $Id: mpcxx_mttl.h,v 1.28 2001/12/14 07:30:22 kameyama Exp $ + * $Id: mpcxx_mttl.h,v 1.29 2003/06/26 09:56:19 kameyama Exp $ * * $RWC_Release: SCore Release 4.2.1 of SCore Cluster System Software (2001/11/13) $ * @@ -4138,7 +4138,9 @@ char 
*argp; char buf[MPCXX_MARSHAL_BUFSIZE]; + ULT_DEBUG( 10, "invoke(%p)", (void*)_voidsinvoker0::invoke2); argp = mpcxxFuncMarshal(buf, (void*)_voidsinvoker0::invoke2); + ULT_DEBUG( 10, "invoke(%p)", (void*)f); argp = mpcxxFuncMarshal(argp, (void*) f); mpcRemoteSyncInvoke(pe, 0, buf, argp - &buf[0], (char*) &dummy); } else if(MPCXX_NOTHING_SAME) { Index: score-src/SCore/mttl-ult/ult/nlist.c =================================================================== RCS file: /develop/cvsroot/score-src/SCore/mttl-ult/ult/nlist.c,v retrieving revision 1.7 retrieving revision 1.9 diff -u -r1.7 -r1.9 --- score-src/SCore/mttl-ult/ult/nlist.c 14 Dec 2001 07:30:22 -0000 1.7 +++ score-src/SCore/mttl-ult/ult/nlist.c 22 Aug 2003 10:46:42 -0000 1.9 @@ -1,10 +1,10 @@ -static char rcsid[] = "$Id: nlist.c,v 1.7 2001/12/14 07:30:22 kameyama Exp $"; +static char rcsid[] = "$Id: nlist.c,v 1.9 2003/08/22 10:46:42 kameyama Exp $"; /* * $PCCC_Release$ * $PCCC_Copyright$ */ /* - * $Id: nlist.c,v 1.7 2001/12/14 07:30:22 kameyama Exp $ + * $Id: nlist.c,v 1.9 2003/08/22 10:46:42 kameyama Exp $ * * $RWC_Release: SCore Release 4.2.1 of SCore Cluster System Software (2001/11/13) $ * @@ -74,6 +74,14 @@ funclist fl_name2addr[BUCKET_SIZE]; funclist fl_addr2name[BUCKET_SIZE]; +#ifdef __ia64__ +struct ia64_functable { + void *addr; + void *gp; +}; +struct ia64_functable *ia64_functable; +int ia64_numfuncs; +#endif void * mpcxxFLN2A(char *cp) @@ -81,11 +89,24 @@ int ent; char *sp = cp + 1; struct funclist *flp; +#ifdef __ia64__ + struct ia64_functable *ft; + int i; +#endif N2A_HASH(ent, sp); for (flp = &fl_name2addr[ent]; flp; flp = flp->next) { if (strcmp(flp->symp, cp) == 0) { +#ifdef __ia64__ + for(ft = ia64_functable, i = 0; i < ia64_numfuncs; + i++, ia64_functable ++) { + if(flp->addr == ft->addr) + return((void *)ft); + } + return 0; +#else return flp->addr; +#endif /* ia64 */ } } return 0; @@ -103,6 +124,16 @@ return flp->symp; } } +#ifdef __ia64__ + /* If this is a function, try again */ + addr = 
*(void **)addr; + A2N_HASH(ent, addr); + for (flp = &fl_addr2name[ent]; flp; flp = flp->next) { + if (flp->addr == addr) { + return flp->symp; + } + } +#endif return 0; }
---------------------------------------cut here---------------------------------
_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From loos @ rz.uni-potsdam.de Mon Aug 25 15:56:02 2003
From: loos @ rz.uni-potsdam.de (Steffen Loos)
Date: Mon, 25 Aug 2003 08:56:02 +0200 (MET DST)
Subject: [SCore-users-jp] [SCore-users] problems with Netgear GA621
In-Reply-To: <20030823030001.10133.70232.Mailman@www.pccluster.org>
Message-ID: 

Hi SCore team,

when I try to attach the GA621 with /etc/rc.d/init.d/pm-ethernet, the output of dmesg shows something like:

...
etherpm1: 16 contexts using 4096KB MEM, maxunit=4, maxnodes=512, mtu=1468, eth1.
etherpm1: Interrupt Reaping on eth1, irq 0
...

whereas the other interfaces come up correctly. When I test the performance with rpmtest (latency and bandwidth), the results are very poor! Without Interrupt Reaping the results are better, but still not really good (latency = 200 microseconds, bandwidth = 42 Mbit).

Can anyone give me some hints on what is going wrong here?

As the driver we use ns83820.o.

Thanks

Steffen Loos

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From s-sumi @ flab.fujitsu.co.jp Mon Aug 25 21:31:43 2003
From: s-sumi @ flab.fujitsu.co.jp (Shinji Sumimoto)
Date: Mon, 25 Aug 2003 21:31:43 +0900 (JST)
Subject: [SCore-users-jp] Re: [SCore-users] problems with Netgear GA621
In-Reply-To: 
References: <20030823030001.10133.70232.Mailman@www.pccluster.org>
Message-ID: <20030825.213143.640925921.s-sumi@flab.fujitsu.co.jp>

Hi.
From: Steffen Loos
Subject: [SCore-users] problems with Netgear GA621
Date: Mon, 25 Aug 2003 08:56:02 +0200 (MET DST)
Message-ID: 

loos> hi SCore-team,
loos>
loos> when I try to attach the GA621 with /etc/rc.d/init.d/pm-ethernet the
loos> output of dmesg shows something like:
loos> ...
loos> etherpm1: 16 contexts using 4096KB MEM, maxunit=4, maxnodes=512, mtu=1468,
loos> eth1.
loos> etherpm1: Interrupt Reaping on eth1, irq 0
loos> ...

Could you check whether the GA621 works well or not?

The IRQ number 0 is wrong: IRQ 0 is reserved for the system timer. Maybe the NIC's hardware interrupts are not working.

Could you send us the output of the "cat /proc/interrupts" command?

Shinji.

loos> whereas the other interfaces come up correctly.
loos> When I test the performance with rpmtest (latency and bandwidth),
loos> the results are very poor!
loos> Without Interrupt Reaping the results are better, but still not really
loos> good (latency = 200 microseconds, bandwidth = 42 Mbit).
loos>
loos> Can anyone give me some hints on what is going wrong here?
loos>
loos> As the driver we use ns83820.o.
loos>
loos> Thanks
loos>
loos> Steffen Loos
loos>
loos> _______________________________________________
loos> SCore-users mailing list
loos> SCore-users @ pccluster.org
loos> http://www.pccluster.org/mailman/listinfo/score-users

------
Shinji Sumimoto, Fujitsu Labs

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From loos @ rz.uni-potsdam.de Mon Aug 25 23:02:53 2003
From: loos @ rz.uni-potsdam.de (Steffen Loos)
Date: Mon, 25 Aug 2003 16:02:53 +0200 (MET DST)
Subject: [SCore-users-jp] Re: [SCore-users] problems with Netgear GA621
In-Reply-To: <20030825.213143.640925921.s-sumi@flab.fujitsu.co.jp>
Message-ID: 

hi.

On Mon, 25 Aug 2003, Shinji Sumimoto wrote:

>
> Could you check whether the GA621 works well or not?
>
> The IRQ number 0 is wrong: IRQ 0 is reserved for the system timer.
> Maybe the NIC's hardware interrupts are not working.
>
> Could you send us the output of the "cat /proc/interrupts" command?

[steffen @ ophelia]# cat /proc/interrupts
           CPU0
  0:    9805444   XT-PIC  timer
  1:       1004   XT-PIC  keyboard
  2:          0   XT-PIC  cascade
  3:    9230956   XT-PIC  eth2
  5:       9759   XT-PIC  aic7xxx
  7:     245407   XT-PIC  eth1
  8:          1   XT-PIC  rtc
  9:    1580244   XT-PIC  eth0
 10:         15   XT-PIC  aic7xxx
 12:         15   XT-PIC  PS/2 Mouse
 14:          0   XT-PIC  ide0
NMI:          0
ERR:          0

[Both of the other 100Mbit devices work without any problems (normal or with trunking).]

The effect seen with the Interrupt Reaping option could be caused by a feature of the driver called interrupt holdoff. My problem is that I don't know how I can disable it. Maybe someone with experience of this can help.

Steffen Loos

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users

From s-sumi @ flab.fujitsu.co.jp Tue Aug 26 09:03:21 2003
From: s-sumi @ flab.fujitsu.co.jp (Shinji Sumimoto)
Date: Tue, 26 Aug 2003 09:03:21 +0900 (JST)
Subject: [SCore-users-jp] Re: [SCore-users] problems with Netgear GA621
In-Reply-To: 
References: <20030825.213143.640925921.s-sumi@flab.fujitsu.co.jp>
Message-ID: <20030826.090321.884033697.s-sumi@flab.fujitsu.co.jp>

Hi.

From: Steffen Loos
Subject: Re: [SCore-users] problems with Netgear GA621
Date: Mon, 25 Aug 2003 16:02:53 +0200 (MET DST)
Message-ID: 

loos> hi.
loos>
loos> On Mon, 25 Aug 2003, Shinji Sumimoto wrote:
loos>
loos> >
loos> > Could you check whether the GA621 works well or not?
loos> >
loos> > The IRQ number 0 is wrong: IRQ 0 is reserved for the system timer.
loos> > Maybe the NIC's hardware interrupts are not working.
loos> >
loos> > Could you send us the output of the "cat /proc/interrupts" command?
loos>
loos> [steffen @ ophelia]# cat /proc/interrupts
loos>            CPU0
loos>   0:    9805444   XT-PIC  timer
loos>   1:       1004   XT-PIC  keyboard
loos>   2:          0   XT-PIC  cascade
loos>   3:    9230956   XT-PIC  eth2
loos>   5:       9759   XT-PIC  aic7xxx
loos>   7:     245407   XT-PIC  eth1
loos>   8:          1   XT-PIC  rtc
loos>   9:    1580244   XT-PIC  eth0
loos>  10:         15   XT-PIC  aic7xxx
loos>  12:         15   XT-PIC  PS/2 Mouse
loos>  14:          0   XT-PIC  ide0
loos> NMI:          0
loos> ERR:          0
loos>
loos> [Both of the other 100Mbit devices work without any problems (normal or
loos> with trunking).]
loos>
loos> The effect seen with the Interrupt Reaping option could be caused by a
loos> feature of the driver called interrupt holdoff. My problem is that I
loos> don't know how I can disable it. Maybe someone with experience of this
loos> can help.

Set the flag in /etc/init.d/pm_ethernet as follows:

INTERRUPT_REAPING=off

$ cat
====================================
#!/bin/sh
#
# pm_ethernet: Starts the PM Ethernet driver
#
# Version:      @(#) /etc/rc.d/init.d/pm_ethernet 1.00
#
# Author:       Shinji Sumimoto (Real World Computing Partnership)
# chkconfig: 345 90 18
# description: PM Ethernet driver
# probe: true

IF=eth4
UNIT=0
INTERRUPT_REAPING=on
# INTERRUPT_REAPING=off

# Source function library.
. /etc/rc.d/init.d/functions

# check module
module=`modprobe -l pm_ethernet_dev.o | grep -v Note:`

# See how we were called.
case "$1" in
====================================

Shinji.

------
Shinji Sumimoto, Fujitsu Labs

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users
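[Editor's note] The two troubleshooting steps discussed in this thread — check /proc/interrupts to see whether the NIC got a real hardware IRQ (IRQ 0 belongs to the system timer), then turn interrupt reaping off in /etc/init.d/pm_ethernet — can be sketched as a short shell session. This is an illustrative sketch only, not part of the SCore distribution: it runs against local sample files copied from the messages above rather than the live system, and the helper name `irq_of` is invented for the example.

```shell
#!/bin/sh
# Sketch of the checks from the thread. The INTERRUPT_REAPING flag and the
# pm_ethernet layout are quoted from Sumimoto's message; the *.sample files
# are local stand-ins so the demo is self-contained.

# 1. Find the IRQ assigned to an interface. IRQ 0 is the system timer, so a
#    NIC reported on irq 0 means hardware interrupts are not being delivered.
irq_of() {
    awk -v ifname="$1" '$NF == ifname { sub(":$", "", $1); print $1 }'
}

# Sample data copied from the /proc/interrupts output posted above.
cat > interrupts.sample <<'EOF'
  0:    9805444   XT-PIC  timer
  3:    9230956   XT-PIC  eth2
  7:     245407   XT-PIC  eth1
  9:    1580244   XT-PIC  eth0
EOF

irq_of eth1 < interrupts.sample    # prints 7: eth1 has a real hardware IRQ

# 2. Disable interrupt reaping by flipping the flag in (a local copy of the
#    relevant lines of) the init script.
cat > pm_ethernet.sample <<'EOF'
IF=eth4
UNIT=0
INTERRUPT_REAPING=on
EOF
sed 's/^INTERRUPT_REAPING=on$/INTERRUPT_REAPING=off/' pm_ethernet.sample > pm_ethernet.new
grep '^INTERRUPT_REAPING' pm_ethernet.new    # prints INTERRUPT_REAPING=off
```

On a live node one would instead run `irq_of eth1 < /proc/interrupts`, edit /etc/init.d/pm_ethernet itself, and restart the pm_ethernet service.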