[SCore-users] Races inside SCore(?)

Shinji Sumimoto s-sumi at bd6.so-net.ne.jp
Tue Dec 10 01:19:05 JST 2002


Hi.

Thank you for your information.

From: Richard Guenther <rguenth at tat.physik.uni-tuebingen.de>
Subject: Re: [SCore-users] Races inside SCore(?)
Date: Mon, 9 Dec 2002 16:36:44 +0100 (CET)
Message-ID: <Pine.LNX.4.33.0212091617450.10722-100000 at bellatrix.tat.physik.uni-tuebingen.de>

rguenth> On Sun, 8 Dec 2002, Shinji Sumimoto wrote:
rguenth> 
rguenth> > Hi.
rguenth> >
rguenth> > Have you tried to execute the program with the mpi_zerocopy=on option?
rguenth> > If it works, we will have to add some workaround to MPI_Iprobe.
rguenth> 
rguenth> Maybe related: I get (sometimes; I cannot easily check whether it's correlated)
rguenth> the following messages from the kernel:
rguenth> pmm_mem_read copy failed 0x1 (tsk=f2674000, addr=0x261adc30,
rguenth> ctx=0xf76c0000) err 0x0
rguenth> pmm_mem_read copy failed 0x0 (tsk=f3340000, addr=0x261e5dd0,
rguenth> ctx=0xf7700000) err 0x0

Which network are you using now?

This message 

pmm_mem_read copy failed 0x1 (tsk=f2674000, addr=0x261adc30,
ctx=0xf76c0000) err 0x0

appears to be output by PM/Ethernet when mpi_zerocopy=on is set. Is that correct?
The message means that PM/Ethernet failed to read data from user memory.

In the PM/Ethernet case, SCore 4.2 has a known problem with mpi_zerocopy=on, and
the problem sometimes still occurs on newer versions of SCore. I am currently
re-writing this feature.

What about the PM/Myrinet case? For example:

scrun -network=myrinet,,,,,
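
A complete invocation might look like the line below. This is only a sketch: the
comma-separated option syntax, the mpi_zerocopy setting, and the program name are
assumptions based on the command forms quoted elsewhere in this thread, not a
verified command line.

scrun -network=myrinet,mpi_zerocopy=on ./your-program-binary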

rguenth> To answer your question, we use mpi_zerocopy as an argument to scrun;
rguenth> exchanging that for mpi_zerocopy=on doesn't change things, but specifying
rguenth> mpi_zerocopy=off leads to immediate failure (but again, the failure occurs
rguenth> only with Nx2 setups, not with Nx1; also 1x2 seems to be ok).
rguenth> 
rguenth> Is there an option in SCore to allow tracing/logging of the MPI functions
rguenth> called? Maybe we can see a pattern in the problem.

MPICH/SCore also has an -mpi_log option, like MPICH. However, if the program
runs for more than a couple of minutes, the log becomes huge and difficult to analyze.

rguenth> If I change the computation/communication order in my program I cannot
rguenth> reproduce the failures. But in this mode requests never accumulate, so
rguenth> it's just a lot less hammering on the MPI backend.

Could you tell us more about your runtime environment?
Are you seeing the problems with PM/Myrinet or with PM/Ethernet?
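
For reference, the polling loop from your earlier mail, with the sched_yield()
workaround inserted after MPI_Iprobe(), would look roughly like the sketch below.
This is only an illustration: the function name, tag, and communicator are
placeholders, not the actual cheetah code.

  /* Minimal sketch of the polling loop with the sched_yield() workaround.
     The tag and communicator passed in are placeholders. */
  #include <mpi.h>
  #include <sched.h>   /* sched_yield() */

  static void wait_for_message(int tag, MPI_Comm comm, MPI_Status *status)
  {
      int flag = 0;
      do {
          MPI_Iprobe(MPI_ANY_SOURCE, tag, comm, &flag, status);
          if (!flag)
              sched_yield();  /* let the other local process run */
      } while (!flag);
  }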

Shinji.

rguenth> > Shinji.
rguenth> >
rguenth> > From: Shinji Sumimoto <s-sumi at bd6.so-net.ne.jp>
rguenth> > Subject: Re: [SCore-users] Races inside SCore(?)
rguenth> > Date: Thu, 05 Dec 2002 00:03:25 +0900 (JST)
rguenth> > Message-ID: <20021205.000325.74756171.s-sumi at bd6.so-net.ne.jp>
rguenth> >
rguenth> > s-sumi> Hi.
rguenth> > s-sumi>
rguenth> > s-sumi> From: Richard Guenther <rguenth at tat.physik.uni-tuebingen.de>
rguenth> > s-sumi> Subject: [SCore-users] Races inside SCore(?)
rguenth> > s-sumi> Date: Tue, 3 Dec 2002 12:23:18 +0100 (CET)
rguenth> > s-sumi> Message-ID: <Pine.LNX.4.33.0212031214090.19216-100000 at bellatrix.tat.physik.uni-tuebingen.de>
rguenth> > s-sumi>
rguenth> > s-sumi> rguenth> I experience problems using SCore (version 4.2.1 with 100MBit
rguenth> > s-sumi> rguenth> and 3.3.1 with Myrinet) in
rguenth> > s-sumi> rguenth> conjunction with the cheetah (v1.1.4) library used by POOMA.
rguenth> > s-sumi> rguenth> The problem appears if I use an nx2 processor setup and does
rguenth> > s-sumi> rguenth> not appear in nx1 mode: all processes end up spinning
rguenth> > s-sumi> rguenth> in kernel space (>90% system time) and no progress is made
rguenth> > s-sumi> rguenth> anymore. Does this sound familiar to anyone?
rguenth> > s-sumi>
rguenth> > s-sumi> rguenth> Now to elaborate some more. Cheetah presents a sort of one-sided
rguenth> > s-sumi> rguenth> communication interface to the user and at certain points polls
rguenth> > s-sumi> rguenth> for messages with a construct like this (very simplified):
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> rguenth>  do {
rguenth> > s-sumi> rguenth>     MPI_Iprobe(MPI_ANY_SOURCE, tag, comm, &flag, &status);
rguenth> > s-sumi> rguenth>  } while (!flag);
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> rguenth> Now, if I insert a sched_yield() or a usleep(100) after the
rguenth> > s-sumi> rguenth> MPI_Iprobe(), the problem goes away (well, not completely, but
rguenth> > s-sumi> rguenth> it is a lot harder to reproduce). SCore usually does not
rguenth> > s-sumi> rguenth> detect any sort of deadlock, but occasionally it does.
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> rguenth> Now the question: could this be a race condition somewhere in the
rguenth> > s-sumi> rguenth> SCore code that handles multiple processors on one node? Where
rguenth> > s-sumi> rguenth> should I start looking to fix the problem?
rguenth> > s-sumi>
rguenth> > s-sumi> We have not seen such a situation. Some race condition or
rguenth> > s-sumi> scheduling problem may be occurring.  MPI_Iprobe only checks the message
rguenth> > s-sumi> queue. When a user program is in an MPI infinite loop, SCore detects the
rguenth> > s-sumi> loop as a deadlock. However, in your case, the program seems to be
rguenth> > s-sumi> working well.
rguenth> > s-sumi>
rguenth> > s-sumi> When the deadlock has occurred, is the other process also running or
rguenth> > s-sumi> sleeping?  Could you attach to and test the program using gdb with its
rguenth> > s-sumi> process number?
rguenth> > s-sumi>
rguenth> > s-sumi> such as
rguenth> > s-sumi>
rguenth> > s-sumi> % gdb your-program-binary process-number
rguenth> > s-sumi>
rguenth> > s-sumi> and test the program using backtrace and step execution.
rguenth> > s-sumi>
rguenth> > s-sumi> Shinji.
rguenth> > s-sumi>
rguenth> > s-sumi> rguenth> Thanks for any hints,
rguenth> > s-sumi> rguenth>    Richard.
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> rguenth> PS: please CC me, I'm not on the list.
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> rguenth> --
rguenth> > s-sumi> rguenth> Richard Guenther <richard.guenther at uni-tuebingen.de>
rguenth> > s-sumi> rguenth> WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> rguenth>
rguenth> > s-sumi> -----
rguenth> > s-sumi> Shinji Sumimoto    E-Mail: s-sumi at bd6.so-net.ne.jp
rguenth> > s-sumi>
rguenth> > -----
rguenth> > Shinji Sumimoto    E-Mail: s-sumi at bd6.so-net.ne.jp
rguenth> >
rguenth> 
rguenth> --
rguenth> Richard Guenther <richard.guenther at uni-tuebingen.de>
rguenth> WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
rguenth> The GLAME Project: http://www.glame.de/
rguenth> 
-----
Shinji Sumimoto    E-Mail: s-sumi at bd6.so-net.ne.jp


