[SCore-users-jp] Re: [SCore-users] MPI and PM at the same time
Bogdan Costescu
bogdan.costescu@iwr.uni-heidelberg.de
Thu, 17 Oct 2002 23:05:06 JST
On Thu, 17 Oct 2002, Atsushi HORI wrote:
> This should not happen if the number of nodes is greater than one.
It does! Or maybe I don't know how to obtain the data. I've included at
the end of this message a sample program, along with the output that I
obtain when running here with SCore configured to use both Myrinet and
shared memory. (If the text is too mangled to be useful, I can send it as
an attachment or make it available on a web site.)
> Define the number of network sets with the RESOURCE MACRO, like this.
>
> SCORE_RSRC_NUM_NETS(N)
I've already tried setting the score_num_pmnet variable directly, which is
mentioned in the score_initialize() man page, and after MPI_Init() the
number of contexts is always 1. When using this macro, the compiler (with
-Wall) warns about an "unused variable `score_resource_num_netsets'", and
the result is still only 1 context.
But the real problem is that I can't use this method. The ARMCI library
has to be initialized *after* MPI, so that all processes are already up
and running. That's why I asked how to obtain another context starting
from the one used by MPI.
In order to have another context, I was trying to get the device used by
the MPI context, so that I could call pmOpenContext and obtain a second
context on that device - that's where I discovered that ->device was NULL,
so of course I couldn't use it in the pmOpenContext call. I also tried to
get ->device for the "children" contexts attached to real devices, which
in one case is only shmem and in the other only myrinet (I've also tried,
on a larger number of nodes, to have both shmem and myrinet at the same
time, but the output becomes long - available on request).
Is there any other way of getting another context? How about using
pmSaveContext/pmRestoreContext to get a copy of the first context (as we
want the same connectivity)?
What is pmAttachContext used for? The documentation for pmCreateAttachFd
says that the fd obtained there can be used in pmAttachContext. But what
for? If I have a context, I attach an fd to it so that I can use
select(2); but then I would use this fd and a context type to create
another context?
Another strange thing: in the Myrinet case, the number of nodes returned
in pmContextConfig.nodes is 1 when I run on 2 nodes as 2x1 (but becomes 4
when I run on 4 nodes as 4x1). I haven't investigated this further,
though, so there might be a logical explanation for it...
---------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <sc.h>
#include <score.h>
#include <errno.h>
#include <string.h>
#include <score_resource.h>

pmContext *mpic, *pc;
pmDevice *pd;

void fatal(char *s, int err)
{
    printf("#%d: %s %s\n", score_self_node, s, pmErrorString(err));
    fflush(stdout);
    exit(1);
}

int main(int argc, char **argv)
{
    int err, i;
    pmContextConfig cc;
    pmContext *allc[PM_MAX_NODE];
    int allnr[PM_MAX_NODE];

    SCORE_RSRC_NUM_NETS(2);
    MPI_Init(&argc, &argv);
    if (score_num_pmnet < 1) {
        printf("No context !!!\n");
        return 1;
    }
    else
        printf("#%d: Nr. of contexts: %d\n", score_self_node, score_num_pmnet);
    mpic = score_pmnet[0];
    if ((err = pmGetContextConfig(mpic, &cc)) != PM_SUCCESS)
        fatal("pmGetContextConfig", err);
    printf("#%d:C: device=%p, parent=%p, ref_count=%d, use_count=%d, size=%d\n",
           score_self_node, mpic->device, mpic->parent, mpic->ref_count,
           mpic->use_count, mpic->size);
    printf("#%d:CC: type=%s, nr=%d, nodes=%d, mtu=%d, size=%d, opt=%ld\n",
           score_self_node, cc.type, cc.number, cc.nodes, cc.mtu, cc.size,
           cc.option);
    for (i = 0; i < cc.nodes; i++) {
        /* pmExtractNode does not work for the node itself! */
        if (i == score_self_node)
            continue;
        if ((err = pmExtractNode(mpic, i, &allc[i], &allnr[i])) != PM_SUCCESS)
            fatal("pmExtractNode", err);
        if ((err = pmGetContextConfig(allc[i], &cc)) != PM_SUCCESS)
            fatal("pmGetContextConfig", err);
        printf("#%d:C: me=%d, device=%p, parent=%p, ref_count=%d, use_count=%d, size=%d\n",
               score_self_node, i, allc[i]->device, allc[i]->parent,
               allc[i]->ref_count, allc[i]->use_count, allc[i]->size);
        printf("#%d:CC: me=%d, type=%s, nr=%d, nodes=%d, mtu=%d, size=%d, opt=%ld\n",
               score_self_node, i, cc.type, cc.number, cc.nodes, cc.mtu,
               cc.size, cc.option);
        fflush(stdout);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    fflush(stdout);
    MPI_Finalize();
    return 0;
}
And the output:
[bogdan@node203 ~/tmp]$ scrun -nodes=1x2 ./z
SCore-D 4.2.1 connected (jid=257).
<0:0> SCORE: 2 nodes (1x2) ready.
#0: Nr. of contexts: 1
#0:C: device=(nil), parent=(nil), ref_count=1, use_count=0, size=8484
#0:CC: type=composite, nr=0, nodes=2, mtu=8192, size=65952, opt=68
#0:C: me=1, device=(nil), parent=0x8530148, ref_count=2, use_count=2, size=276
#0:CC: me=1, type=shmem, nr=21, nodes=2, mtu=8192, size=65568, opt=68
#1: Nr. of contexts: 1
#1:C: device=(nil), parent=(nil), ref_count=1, use_count=0, size=8484
#1:CC: type=composite, nr=0, nodes=2, mtu=8192, size=65952, opt=68
#1:C: me=0, device=(nil), parent=0x8530148, ref_count=2, use_count=2, size=276
#1:CC: me=0, type=shmem, nr=21, nodes=2, mtu=8192, size=65568, opt=68
[bogdan@node203 ~/tmp]$
[bogdan@node203 ~/tmp]$ scrun -nodes=2x1 ./z
SCore-D 4.2.1 connected (jid=256).
<0:0> SCORE: 2 nodes (2x1) ready.
#0: Nr. of contexts: 1
#0:C: device=(nil), parent=(nil), ref_count=1, use_count=0, size=8484
#0:CC: type=composite, nr=0, nodes=1, mtu=8256, size=164240, opt=94
#1: Nr. of contexts: 1
#1:C: device=(nil), parent=(nil), ref_count=1, use_count=0, size=8484
#1:CC: type=composite, nr=0, nodes=1, mtu=8256, size=164240, opt=94
#1:C: me=0, device=(nil), parent=0x8530148, ref_count=2, use_count=2, size=272
#1:CC: me=0, type=myrinet, nr=0, nodes=2, mtu=8256, size=163856, opt=127
[bogdan@node203 ~/tmp]$
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De
_______________________________________________
SCore-users mailing list
SCore-users@pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users
SCore-users-jp mailing list information