[SCore-users-jp] Re: [SCore-users] Problems with MPI when running CHARMM

kameyama @ pccluster.org kameyama @ pccluster.org
Fri Apr 1 20:27:26 JST 2005


In article <40721.131.130.40.20.1112351563.squirrel @ mail.fh-stpoelten.ac.at> si011015 @ fh-stpoelten.ac.at wrote:
> I compiled CHARMM, which is used in our group (molecular dynamics
> simulations), and linked it with the /opt/score/mpi/mpich-1.2.5/ library.
> 
> I start CHARMM with the following command:
> scrun -nodes=$n /opt/c32a1_MPICH_CMPI_GENCOMM/exec/gnu/charmm < charmm.inp > test.$n
> 
> If $n=1 --> I run on the local machine and 1 cpu
> everything works out fine.
> 
> If $n=1x2 --> I run on 2 machines with 1 cpu

scrun -nodes=1x2 means 1 machine with 2 CPUs. See:
   http://www.pccluster.org/score/dist/score/html/en/man/man1/scrun.html

> some communication is done, but something is going wrong because the results
> are bad, and the program aborts with the following error message on
> stderr:
> 
> <0:0> SCORE: 2 nodes (1x2) ready.
> <0:0>SCore: *** SIGNAL EXCEPTION eip=0x08698790, cr2=0x 37daf88 ***
> <0:0>SCore: gs=0x0000, fs=0x0000, es=0x002b, ds=0x002b
> <0:0>SCore: edi=0x037daf88, esi=0x80000001, ebp=0xbfffda08, esp=0xbfffd6b0
> <0:0>SCore: ebx=0xffffffff, edx=0x0d5c4d60, ecx=0x0d5dbd78, eax=0x0d5dbd78
> <0:0>SCore: trapno=0x0000000e, err=0x00000004, eip=0x08698790, cs=0x0023
> <0:0>SCore: esp_at_signal=0xbfffd6b0, ss=0x002b, oldmask=0x00000000,
> cr2=0x037daf88
> <0:0> Trying to attach GDB (DISPLAY=localhost:11.0): Exception signal
> (SIGSEGV)
> <1:1>SCore: *** SIGNAL EXCEPTION eip=0x08698790, cr2=0x 37daf88 ***
> <1:1>SCore: gs=0x0000, fs=0x0000, es=0x002b, ds=0x002b
> <1:1>SCore: edi=0x037daf88, esi=0x80000001, ebp=0xbfffda08, esp=0xbfffd6b0
> <1:1>SCore: ebx=0xffffffff, edx=0x0d5c4518, ecx=0x0d5db530, eax=0x0d5db530
> <1:1>SCore: trapno=0x0000000e, err=0x00000004, eip=0x08698790, cs=0x0023
> <1:1>SCore: esp_at_signal=0xbfffd6b0, ss=0x002b, oldmask=0x00000000,
> cr2=0x037daf88
> <0:1> Trying to attach GDB (DISPLAY=localhost:11.0): Exception signal
> (SIGSEGV)
> SCORE: Program aborted.

The program aborted at address 0x08698790 with SIGSEGV.
If your program is compiled and linked with the -g option,
please run it with the debug option of scrun:
    % env DISPLAY= scrun -nodes=1x2,debug /opt/c32a1_MPICH_CMPI_GENCOMM/exec/gnu/charmm < charmm.inp

If the DISPLAY environment variable is set,
scrun with the -debug option executes "xterm -e gdb (pid_of_the_program)"
on the compute hosts to attach to the program.
But localhost:11 cannot be reached from the compute hosts.
If the DISPLAY environment variable is not set, scrun runs the gdb bt
subcommand on the compute hosts instead.
So you must either set a correct DISPLAY variable or unset it.
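
As a side note, the register dump quoted above already carries some information before gdb is attached. The following is a minimal Python sketch, not an SCore tool: it assumes the dump format is exactly as in the quoted log, and the function names (parse_registers, describe_fault, addr2line_command) are hypothetical helpers introduced only for this illustration. It decodes the x86 page-fault error code (trapno 0x0e) and builds a binutils addr2line command line for the faulting eip, which is only useful if the binary was built with -g.

```python
import re

# x86 page-fault error-code bits (valid when trapno == 0x0e).
PF_PRESENT = 0x1  # 0: page not present, 1: protection violation
PF_WRITE = 0x2    # 0: read access,      1: write access
PF_USER = 0x4     # 0: kernel mode,      1: user mode


def parse_registers(dump: str) -> dict:
    """Collect name=0x... pairs from a signal-exception dump.

    The optional space after "0x" covers the "cr2=0x 37daf88" form
    that appears in the quoted log.
    """
    regs = {}
    for name, value in re.findall(r"\b([a-z_0-9]+)=0x ?([0-9a-fA-F]+)", dump):
        regs[name] = int(value, 16)
    return regs


def describe_fault(regs: dict) -> str:
    """Describe a page fault from trapno, err and cr2."""
    if regs.get("trapno") != 0x0E:
        return "not a page fault"
    err = regs["err"]
    cause = "protection violation" if err & PF_PRESENT else "page not present"
    access = "write" if err & PF_WRITE else "read"
    mode = "user" if err & PF_USER else "kernel"
    return f"{mode}-mode {access} of 0x{regs['cr2']:08x}: {cause}"


def addr2line_command(executable: str, regs: dict) -> list:
    """Build (but do not run) an addr2line call for the faulting eip."""
    return ["addr2line", "-f", "-e", executable, f"0x{regs['eip']:08x}"]


# Three lines copied from the quoted log for node <0:0>.
sample = """\
<0:0>SCore: *** SIGNAL EXCEPTION eip=0x08698790, cr2=0x 37daf88 ***
<0:0>SCore: edi=0x037daf88, esi=0x80000001, ebp=0xbfffda08, esp=0xbfffd6b0
<0:0>SCore: trapno=0x0000000e, err=0x00000004, eip=0x08698790, cs=0x0023
"""

regs = parse_registers(sample)
print(describe_fault(regs))
print(" ".join(addr2line_command("charmm", regs)))
```

For the quoted dump this reports a user-mode read of the unmapped address 0x037daf88, which is also the value held in edi. Both processes fault at the same eip, so the addr2line output, if debug information is present, would point at a single source location to inspect.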

                       from Kameyama Toyohisa
_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users


