[Mckernel-users 24] Re: Some questions about large scale tests with mckernel

Balazs Gerofi bgerofi at riken.jp
Wed May 3 14:17:28 JST 2017


Hello Jeremie,

On Tue, May 2, 2017 at 7:14 AM, FINIEL, JEREMIE <jeremie.finiel at atos.net>
wrote:

> I am sorry but it is not possible to access our machines.
>
> It looks like the problem is in the init_fpu() function (in the
> mckernel/arch/x86/kernel/cpu.c file). When enabling debug traces, it looks
> like xgetbv(0) is the issue (it is the call just before
> dkprintf("init_fpu(): xsave_mask = 0x%016lX\n", xsave_mask);).
>
>
>
> I think that the processor doesn’t have the xgetbv instruction available.
> Do I need to undefine ENABLE_SSE?
>

I am not quite sure about this, I will have to investigate it first. You
could meanwhile try it though!
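
One quick way to test this hypothesis from the Linux side is to check
whether the kernel reports the xsave CPU flag, since xgetbv is only usable
on processors with XSAVE support. The following is a minimal sketch, assuming
a Linux host that exposes /proc/cpuinfo; the helper name has_cpu_flag is
made up for illustration and is not part of McKernel or IHK.

```python
def has_cpu_flag(cpuinfo_text, flag):
    """Return True if `flag` is listed on a 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the "flags : ..." lines and compare whole words,
        # so that e.g. "xsaveopt" is not mistaken for "xsave".
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not a Linux host, or /proc is unavailable
    print("xsave reported by Linux:", has_cpu_flag(text, "xsave"))
```

If the flag is missing on the E5540 node, that would be consistent with the
failure at xgetbv(0), although it does not by itself show what the right fix
in init_fpu() is.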


> I have checked out the performance branch, but I get an error at compile
> time. In mckernel/executer/kernel/mcctrl/driver.c, the IHK_OS_AUX_PERF_*
> symbols are not declared.
>
>
>
> Grep shows that these symbols are used in
> mckernel/executer/kernel/mcctrl/driver.c and
> ./executer/kernel/mcctrl/control.c, but never declared.
>

Please pull the ihk repository as well, I think it's due to the mismatch
between the mckernel and ihk repository versions.
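
To confirm the mismatch after pulling, one option is to search the IHK
checkout for the missing prefix. This is only a sketch: the directory name
"ihk" is an assumed checkout location, files_mentioning is a hypothetical
helper, and I am not stating here which IHK file declares these symbols.

```python
import os


def files_mentioning(root, needle):
    """Return sorted paths (relative to `root`) of files containing `needle`."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata so only the working tree is searched.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    if needle in f.read():
                        hits.append(os.path.relpath(path, root))
            except OSError:
                continue  # unreadable file, ignore it
    return sorted(hits)


if __name__ == "__main__":
    # "ihk" is an assumed path to the IHK checkout; adjust as needed.
    for path in files_mentioning("ihk", "IHK_OS_AUX_PERF_"):
        print(path)
```

An empty result would suggest that the IHK tree is older than the mckernel
performance branch expects.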

Balazs



> Thank you for your help.
>
>
>
> Best regards,
>
> Jérémie Finiel
>
>
>
>
>
> *From:* Balazs Gerofi [mailto:bgerofi at riken.jp]
> *Sent:* Saturday, April 29, 2017 1:01 AM
>
> *To:* FINIEL, JEREMIE
> *Cc:* mckernel-users at pccluster.org; LAFERRIERE, Christophe; WELTERLEN,
> BENOIT; Olivier Gruber
> *Subject:* Re: Some questions about large scale tests with mckernel
>
>
>
> Hello Jeremie,
>
>
>
> On Fri, Apr 28, 2017 at 6:16 AM, FINIEL, JEREMIE <jeremie.finiel at atos.net>
> wrote:
>
> Here is the new dmesg file in attachment, seems like nothing has changed.
>
>
>
> That indeed looks the same as previously. Is there any way I could access
> your machine?
>
>
>
> Otherwise you could try to comment out the call to x86_init_perfctr() in
> init_cpu() in the file mckernel/arch/x86/kernel/cpu.c, that is where the
> next kmsg is supposed to show up so perhaps something goes wrong there.
> This change would disable the performance counter initialization code.
>
>
>
> I’m now trying to use InfiniBand. As the driver bypasses the OS, no syscall
> should be triggered and no offloading should be necessary. Right?
>
> I tried to execute IB test commands like ib_read_bw (from perftest), but
> the command hangs when executed in McKernel, and puts the machine in a
> strange state (commands like ps hang).
>
> So I tried a simpler application which uses the ibverbs library (found
> here:
> https://blog.zhaw.ch/icclab/infiniband-an-introduction-simple-ib-verbs-program-with-rdma-write/).
>
> But when I execute it in McKernel, the application hangs too.
>
>
>
> IB works on our machines, but I am a little confused now, are you using
> another machine for the IB tests? Are you using real HW or a VM?
>
>
>
> As I am trying to reduce the number of syscalls in a communication-intensive
> MPI application, I have started to read some IHK/McKernel code, in
> particular the parts regarding syscall catching and syscall offloading.
>
>
>
> I have seen that a syscall tracking functionality is available (by
> defining TRACK_SYSCALLS and using “track_syscalls”). I guess that’s what
> you used to evaluate syscall offloading time in the “Exploring the Design
> Space of Combining Linux with Lightweight Kernels for Extreme Scale
> Computing” article, isn’t it?
>
> I have tried to define TRACK_SYSCALLS, but now McKernel doesn’t start.
> Please find the dmesg_track_syscalls.log file in attachment.
>
> Could you tell me how I can enable this feature (if this is a feature)? Is
> it deprecated?
>
>
>
> TRACK_SYSCALLS is a deprecated feature in the master branch. There is
> another branch called "performance" that provides much more elaborate
> profiling code. If you are interested in trying it, please check out that
> branch (in mckernel: git fetch; git checkout performance). You can then
> pass --profile to mcexec, for example: mcexec --profile ls -ls
> When it returns, you can look at the kmsg and see some log output about
> various syscalls/kernel events.
>
>
>
> I think it would be helpful to be able to debug an application under
> McKernel. Do you have a tool to do so?
>
>
>
> GDB is half-way supported. You could give it a try, but it lacks many
> features. Catching segfaults in single-threaded apps sort of works.
>
> For example, you would run it as: mcexec gdb --args ls -ls
>
>
>
> When I activate debug traces in McKernel, a lot of information is written
> with kprintf, more than ihkosctl kmsg is able to print. How can I extend
> this buffer size? Is there a way to follow kmsg in real time (like
> dmesg -w)?
>
>
>
> There is a macro called IHK_KMSG_SIZE that controls the size of the buffer.
> A dmesg -w like usage model doesn't quite work yet, I believe.
>
>
>
> Balazs
>
>
>
> Thank you in advance for your help.
>
>
>
> Best regards,
>
>
>
> Jérémie Finiel
>
>
>
> *From:* Balazs Gerofi [mailto:bgerofi at riken.jp]
> *Sent:* Sunday, April 23, 2017 2:01 AM
> *To:* FINIEL, JEREMIE
> *Cc:* mckernel-users at pccluster.org; LAFERRIERE, Christophe; WELTERLEN,
> BENOIT; Olivier Gruber
>
>
> *Subject:* Re: Some questions about large scale tests with mckernel
>
>
>
> Hello Jeremie,
>
>
>
> I would suggest that you use MVAPICH or Intel MPI, but let's focus on the
> boot bug first. I've added a change to make sure we can see the full kmsg
> when boot fails.
>
> Could you please pull the IHK repository, retry the boot script, and send
> me the new dmesg log?
>
>
>
> Thanks,
>
> Balazs
>
>
>
>
>
> On Thu, Apr 20, 2017 at 6:37 AM, FINIEL, JEREMIE <jeremie.finiel at atos.net>
> wrote:
>
> Hello Balazs,
>
> Thank you for your quick reply.
>
>
>
> I captured a set of hwloc and mpi_hello_world executions in the attached
> ‘execution’ file.
> It contains my console output with hwloc-ls executed in both Linux and
> McKernel, and the execution of mpi_hello_world with and without pinning.
>
> As you can see (Line 161), when pinning is not explicit, the
> command prints an error and then hangs.
>
> When pinning is explicitly disabled (Line 217), the error does not appear,
> but the command hangs too.
>
>
>
> Here is my environment:
>
> 1 node
>
> CPU Xeon E5-2680, 2 sockets, 12 cores per socket, 1 thread per core
>
> uname -r: 3.10.0-327.el7.x86_64
>
> OpenMPI version: 2.0.0
>
>
>
> Best regards,
>
>
>
> Jérémie Finiel
>
>
>
>
>
>
>
> *From:* Balazs Gerofi [mailto:bgerofi at riken.jp]
> *Sent:* Thursday, April 20, 2017 6:11 AM
> *To:* FINIEL, JEREMIE; mckernel-users at pccluster.org
> *Cc:* LAFERRIERE, Christophe; WELTERLEN, BENOIT; Olivier Gruber
> *Subject:* Re: Some questions about large scale tests with mckernel
>
>
>
> Hello Jeremie,
>
>
>
> I have added the mckernel-users mailing list to CC so that we all see your
> messages, please keep it in the loop!
>
>
>
> On Wed, Apr 19, 2017 at 8:07 AM, FINIEL, JEREMIE <jeremie.finiel at atos.net>
> wrote:
>
> We had difficulties launching McKernel on a Nehalem processor (E5540 in
> this case). When starting mcreboot.sh, we just got this error twice:
> “error: booting”. Please find the dmesg log attached.
>
>
>
> I haven't had a chance so far to try McKernel on Nehalem, but I would
> expect it's more related to your Linux kernel version, what is the version
> you are running?  Also, I looked at the dmesg but unfortunately the root
> cause of the failure is not visible, I will need to adjust how the kmsg is
> printed to make that part visible. Let me try to get a patch done this
> week, I'll contact you again.
>
>
>
> We tried to launch an MPI program with OpenMPI, but we got an error about
> hwloc pinning functions (hwloc_set_cpubind returned "Error" for bitmap "0").
>
> As written in your paper, we tried with mvapich2, which works perfectly.
>
> Could you let me know what specific development you had to do to have
> mvapich working with McKernel? I'm trying to see how hard it would be to
> have OpenMPI working too.
>
>
>
> Are you trying to run OpenMPI on another host? Could you let me know the
> platform and the configuration you use to boot McKernel?
>
> Hwloc is generally supported. What do you get for mcexec hwloc-ls? Does
> that work? Another thing you could try is to disable binding in OpenMPI,
> just as a test.
>
>
>
> Now we would like to do some tests at a larger scale. So we developed a
> script to launch McKernel on each machine, but we ran into difficulties
> with access rights. mcreboot.sh must be launched with privileged access, but
> mcexec can be executed by a user only if /dev/mcos0 is accessible by this
> user. For the moment, in order to avoid using root for every execution, I
> change the owner of /dev/mcos0 as a workaround.
>
> Furthermore, when executing “./mcexec mpirun …” we noticed that execution
> on the other machine is done on the Linux side and not in McKernel, but
> “mpirun ./mcexec …” seems to do the job. I would be interested to know how
> you launch your tests at large scale. If, by any chance, you have a
> template script, it would be helpful.
>
>
>
> Ahh, yes, we are aware of this. One way to get around it is to set your
> umask to 0002 and use sudo to run the script (I mean not directly by root).
>
>
>
> Best,
>
> Balazs
>
>
>
> Thank you in advance.
>
>
>
> Best regards,
>
> Jérémie Finiel
>
>
>
>
>
>
>
>
>