<div dir="ltr">Hello Jeremie,<div><br></div><div class="gmail_extra"><div class="gmail_quote">On Thu, May 4, 2017 at 6:58 AM, FINIEL, JEREMIE <span dir="ltr"><<a href="mailto:jeremie.finiel@atos.net" target="_blank">jeremie.finiel@atos.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="FR">
<div class="gmail-m_-3642148976276023264WordSection1">
<p class="MsoNormal"><span style="font-family:calibri,sans-serif;font-size:11pt">I’m interested in the changes you have made to MPI and McKernel for the Thread Private Shared Library (“Toward Operating System Support for Scalable Multithreaded
Message Passing”).</span><br></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Is it already integrated into McKernel or do you have a branch for this? How can I try this functionality?</span></p></div></div></blockquote><div><br></div><div>This functionality is implemented in another branch, which is currently not on the git server. It's a little outdated compared to the master branch. I can send you a tarball, but it assumes the IB network is working.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="FR"><div class="gmail-m_-3642148976276023264WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I’m also interested in the article “Revisiting RDMA Buffer Registration in the Context of Lightweight Multi-kernels”. I have tried to download it from your web
page, but it looks like the link is down.<br>
</span><span style="font-size:11pt;font-family:calibri,sans-serif"><a href="http://www-sys-aics.riken.jp/Members/bgerofi/papers/bgerofi-eurompi2016.pdf" target="_blank">http://www-sys-aics.riken.jp/<wbr>Members/bgerofi/papers/<wbr>bgerofi-eurompi2016.pdf</a></span></p></div></div></blockquote><div><br></div><div>Sorry, I fixed the link, you should be able to access the paper now.</div><div> <span style="font-family:calibri,sans-serif;font-size:11pt"> </span></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="FR"><div class="gmail-m_-3642148976276023264WordSection1"><p class="MsoNormal"><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">IHK was developed to be able to host LWKs, right?
</span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">McKernel is one instance of an LWK, but did you try others? Which ones?</span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Is McKernel a simple LWK meant as a proof of concept for IHK, or does it have specific assets besides the TPSL capability?</span></p></div></div></blockquote><div><br></div><div>McKernel is mostly about memory management and scalability. We have a paper submission to SC this year; if it gets into the program, you will be able to see more details.<br></div><div>As for the "other LWKs" question, it is primarily McKernel at the moment, but we had some versions of it that were considerably different, for example the one with the thread private shared library feature. </div><div>I haven't tried to port another OS (e.g., Kitten or MINIX) on top of IHK, but it shouldn't be that difficult. You could give it a try! :)</div><div><br></div><div>Thanks,</div><div>Balazs</div><div> <span style="font-family:calibri,sans-serif;font-size:11pt"> </span></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div lang="FR"><div class="gmail-m_-3642148976276023264WordSection1"><span class="gmail-">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Thank you in advance.</span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Best regards,</span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><span lang="EN-US"><u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Jérémie Finiel</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
</span><p class="MsoNormal"><b><span style="font-size:10pt;font-family:tahoma,sans-serif">From:</span></b><span style="font-size:10pt;font-family:tahoma,sans-serif"> Balazs Gerofi [mailto:<a href="mailto:bgerofi@riken.jp" target="_blank">bgerofi@riken.jp</a>]
<br>
<b>Sent:</b> Wednesday, May 03, 2017 7:17 AM</span></p><div><div class="gmail-h5"><br>
<b>To:</b> FINIEL, JEREMIE<br>
<b>Cc:</b> <a href="mailto:mckernel-users@pccluster.org" target="_blank">mckernel-users@pccluster.org</a>; LAFERRIERE, Christophe; WELTERLEN, BENOIT; Olivier Gruber<br>
<b>Subject:</b> Re: Some questions about large scale tests with mckernel<u></u><u></u></div></div><p></p><div><div class="gmail-h5">
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Hello Jeremie,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal">On Tue, May 2, 2017 at 7:14 AM, FINIEL, JEREMIE <<a href="mailto:jeremie.finiel@atos.net" target="_blank">jeremie.finiel@atos.net</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">I am sorry but it is not possible to access our machines.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">It looks like the problem is in the init_fpu() function (in the mckernel/arch/x86/kernel/cpu.c file).
With debug traces enabled, xgetbv(0) appears to be the issue (it is the call just before dkprintf("init_fpu(): xsave_mask = 0x%016lX\n", xsave_mask);).</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I think that the processor doesn’t have the xgetbv instruction available. Do I need to undefine
ENABLE_SSE?</span><u></u><u></u></p>
</div>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I am not quite sure about this; I will have to investigate it first. In the meantime, you could try it, though!<u></u><u></u></p>
</div>
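<div>
<p class="MsoNormal">One way to check the xgetbv hypothesis from the Linux side: XGETBV is only usable on CPUs that implement XSAVE, and Linux reports that feature as the xsave flag in /proc/cpuinfo. The following is a minimal sketch in Python, assuming an x86 Linux host; the helper names are hypothetical and are not part of McKernel or IHK.</p>
<pre>
```python
import os


def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_xsave(cpuinfo_text):
    """True if the CPU advertises XSAVE, which XGETBV depends on."""
    return "xsave" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # assumption: running on the Linux side of the node
    if os.path.exists(path):
        with open(path) as f:
            print("xsave supported:", has_xsave(f.read()))
```
</pre>
<p class="MsoNormal">If this reports False, the xgetbv(0) call in init_fpu() cannot work on that CPU. Whether undefining ENABLE_SSE is the right way to avoid the call is not something this check answers.</p>
</div>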
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">I have checked out the performance branch, but I get an error at compile time. In mckernel/executer/kernel/<wbr>mcctrl/driver.c,
the IHK_OS_AUX_PERF_* identifiers are not declared.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Grep shows that these variables are used in mckernel/executer/kernel/<wbr>mcctrl/driver.c and ./executer/kernel/mcctrl/<wbr>control.c,
but never declared.</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Please pull the ihk repository as well; I think it's due to a mismatch between the mckernel and ihk repository versions.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Balazs<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Thank you for your help.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Best regards,<br>
<br>
Jérémie Finiel</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><b><span style="font-size:10pt;font-family:tahoma,sans-serif">From:</span></b><span style="font-size:10pt;font-family:tahoma,sans-serif"> Balazs Gerofi [mailto:<a href="mailto:bgerofi@riken.jp" target="_blank">bgerofi@riken.jp</a>]
<br>
<b>Sent:</b> Saturday, April 29, 2017 1:01 AM</span><u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><br>
<b>To:</b> FINIEL, JEREMIE<br>
<b>Cc:</b> <a href="mailto:mckernel-users@pccluster.org" target="_blank">mckernel-users@pccluster.org</a>; LAFERRIERE, Christophe; WELTERLEN, BENOIT; Olivier Gruber<br>
<b>Subject:</b> Re: Some questions about large scale tests with mckernel<u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<p class="MsoNormal">Hello Jeremie,<u></u><u></u></p>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<div>
<p class="MsoNormal">On Fri, Apr 28, 2017 at 6:16 AM, FINIEL, JEREMIE <<a href="mailto:jeremie.finiel@atos.net" target="_blank">jeremie.finiel@atos.net</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">The new dmesg file is attached; it seems like nothing has changed.</span><u></u><u></u></p>
</div>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">That indeed looks the same as previously. Is there any way I could access your machine?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Otherwise, you could try commenting out the call to x86_init_perfctr() in init_cpu() in the file mckernel/arch/x86/kernel/cpu.<wbr>c. That is where the next kmsg is supposed to show up,
so perhaps something goes wrong there. This change would disable the performance counter initialization code.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I’m now trying to use InfiniBand. As the driver bypasses the OS, no syscall should be triggered and no
offloading should be necessary, right?</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I tried to run IB test commands such as ib_read_bw (from perftest), but the command hangs when executed
in McKernel and puts the machine in a strange state (commands like ps hang).</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">So I tried a simpler application that uses the ibverbs library (found here:
</span><span style="font-size:11pt;font-family:calibri,sans-serif"><a href="https://blog.zhaw.ch/icclab/infiniband-an-introduction-simple-ib-verbs-program-with-rdma-write/" target="_blank"><span lang="EN-US">https://blog.zhaw.ch/icclab/<wbr>infiniband-an-introduction-<wbr>simple-ib-verbs-program-with-<wbr>rdma-write/</span></a></span><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">).</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">But when I execute it in McKernel, the application hangs too.</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">IB works on our machines, but I am a little confused now: are you using another machine for the IB tests? Are you using real HW or a VM?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">As I am trying to reduce the number of syscalls in a communication-intensive MPI application, I have started
to read some IHK/McKernel code, in particular the parts on syscall catching and syscall offloading.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I have seen that a syscall tracking functionality is available (by defining TRACK_SYSCALLS and using
“track_syscalls”). I guess that’s what you used to evaluate syscall offloading time in the “Exploring the Design Space of Combining Linux with Lightweight Kernels for Extreme Scale Computing” article, isn’t it?</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I have tried defining TRACK_SYSCALLS, but now McKernel doesn’t start. Please find the dmesg_track_syscalls.log
file attached.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Could you tell me how I can enable this feature (if this is a feature)? Is it deprecated?</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">TRACK_SYSCALLS is a deprecated feature in the master branch. There is another branch called "performance" that provides much more elaborate profiling code; if you are interested
in trying it, please check out that branch (in mckernel: git fetch; git checkout performance). You can then pass --profile to mcexec, for example: mcexec --profile ls -ls. When it returns, you can look at the kmsg and see a log of various
syscalls/kernel events.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">I think it would be helpful to be able to debug an application under McKernel. Do you have a tool
to do so?</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">GDB is partially supported; you could give it a try, but it lacks many features. Catching segfaults in single-threaded apps sort of works.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">For example, you would run it as: mcexec gdb --args ls -ls<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">When I activate debug traces in McKernel, a lot of information is written with kprintf, more than
ihkosctl kmsg is able to print. How can I extend this buffer size? Is there a way to follow kmsg in real time (like dmesg -w)?</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">There is a macro called IHK_KMSG_SIZE that controls the size of the buffer. A dmesg -w like usage model doesn't quite work yet, I believe.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Balazs<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif">Thank you in advance for your help.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Best regards,</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Jérémie Finiel</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><b><span style="font-size:10pt;font-family:tahoma,sans-serif">From:</span></b><span style="font-size:10pt;font-family:tahoma,sans-serif"> Balazs Gerofi [mailto:<a href="mailto:bgerofi@riken.jp" target="_blank">bgerofi@riken.jp</a>]
<br>
<b>Sent:</b> Sunday, April 23, 2017 2:01 AM<br>
<b>To:</b> FINIEL, JEREMIE<br>
<b>Cc:</b> <a href="mailto:mckernel-users@pccluster.org" target="_blank">mckernel-users@pccluster.org</a>; LAFERRIERE, Christophe; WELTERLEN, BENOIT; Olivier Gruber</span><u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><br>
<b>Subject:</b> Re: Some questions about large scale tests with mckernel<u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<p class="MsoNormal">Hello Jeremie,<u></u><u></u></p>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">I would suggest that you use MVAPICH or Intel MPI, but let's focus on the boot bug first. I've added a change to make sure we can see the full kmsg when boot fails.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Could you please pull the IHK repository, retry the boot script, and send me the new dmesg log?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Thanks,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Balazs<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<p class="MsoNormal">On Thu, Apr 20, 2017 at 6:37 AM, FINIEL, JEREMIE <<a href="mailto:jeremie.finiel@atos.net" target="_blank">jeremie.finiel@atos.net</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US">Hello Balazs,<br>
<br>
Thank you for your quick reply.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">I captured a set of hwloc and mpi_hello_world executions in the attached ‘execution’ file.<br>
It contains my console output with hwloc-ls executed in both Linux and McKernel, and mpi_hello_world executed with and without pinning.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">As you can see (Line 161), when pinning is not explicit, the command prints an error and then hangs.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">With pinning explicitly disabled (Line 217), the error does not appear, but the command hangs too.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">Here is my environment:</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">1 node</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">CPU Xeon E5-2680, 2 sockets, 12 cores per socket, 1 thread per core</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">uname -r: 3.10.0-327.el7.x86_64</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">OpenMPI version: 2.0.0</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">Best regards,</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US">Jérémie Finiel</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif;color:rgb(31,73,125)"> </span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal"><b><span style="font-size:10pt;font-family:tahoma,sans-serif">From:</span></b><span style="font-size:10pt;font-family:tahoma,sans-serif"> Balazs Gerofi [mailto:<a href="mailto:bgerofi@riken.jp" target="_blank">bgerofi@riken.jp</a>]
<br>
<b>Sent:</b> Thursday, April 20, 2017 6:11 AM<br>
<b>To:</b> FINIEL, JEREMIE; <a href="mailto:mckernel-users@pccluster.org" target="_blank">
mckernel-users@pccluster.org</a><br>
<b>Cc:</b> LAFERRIERE, Christophe; WELTERLEN, BENOIT; Olivier Gruber<br>
<b>Subject:</b> Re: Some questions about large scale tests with mckernel</span><u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<p class="MsoNormal">Hello Jeremie,<u></u><u></u></p>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">I have added the mckernel-users mailing list to CC so that we all see your messages, please keep it in the loop!<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<p class="MsoNormal">On Wed, Apr 19, 2017 at 8:07 AM, FINIEL, JEREMIE <<a href="mailto:jeremie.finiel@atos.net" target="_blank">jeremie.finiel@atos.net</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">We had difficulty launching McKernel on a Nehalem processor (E5540 in this case). When starting mcreboot.sh, we
just got this error twice: “error: booting”. Please find the dmesg log attached.</span><u></u><u></u></p>
</div>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">I haven't had a chance so far to try McKernel on Nehalem, but I would expect it's more likely related to your Linux kernel version. Which version are you running? Also, I looked
at the dmesg, but unfortunately the root cause of the failure is not visible; I will need to adjust how the kmsg is printed to make that part visible. Let me try to get a patch done this week, and I'll contact you again.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">We tried to launch an MPI program with OpenMPI, but we got an error about the hwloc pinning functions (hwloc_set_cpubind
returned "Error" for bitmap "0").</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">As described in your paper, we tried MVAPICH2, which works perfectly.
</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Could you let me know what specific development was needed to get MVAPICH working with McKernel? I'm trying
to see how hard it would be to get OpenMPI working too.</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Are you trying to run OpenMPI on another host? Could you let me know the platform and the configuration you use to boot McKernel?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Hwloc is generally supported. What do you get for mcexec hwloc-ls? Does that work? Another thing you could try is to disable binding in OpenMPI, just as a test.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Now we would like to do some tests at a larger scale. We developed a script to launch McKernel on each machine,
but we ran into difficulties with access rights. mcreboot.sh must be launched with privileged access, but mcexec can be executed by a user only if /dev/mcos0 is accessible to this user. For the moment, in order to avoid using root for every execution, I change
the owner of /dev/mcos0 as a workaround. </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Furthermore, when executing “./mcexec mpirun …” we noticed that execution on the other machine is done on the Linux
side and not in McKernel, but “mpirun ./mcexec …” seems to do the job. I would be interested to know how you launch your tests at large scale. If, by any chance, you have a template script, it would be helpful.</span><u></u><u></u></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Ahh, yes, we are aware of this. One way to get around it is to set your umask to 0002 and use sudo to run the script (that is, not directly as root).<u></u><u></u></p>
</div>
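<div>
<p class="MsoNormal">On the launch order: with “mpirun ./mcexec …”, mpirun starts one mcexec per rank on every node, so each rank runs in McKernel, whereas “./mcexec mpirun …” only wraps the local launcher. The following is a minimal Python sketch of building such a command line. The function name and the rank count are hypothetical, and it assumes the launcher accepts -np, as the OpenMPI and MVAPICH2 mpirun commands do.</p>
<pre>
```python
import shlex


def mcexec_mpi_command(num_ranks, mcexec_path, app_args):
    """Build an argv in which mpirun starts mcexec once per rank."""
    return ["mpirun", "-np", str(num_ranks), mcexec_path] + list(app_args)


if __name__ == "__main__":
    # Placeholder values: 4 ranks and the mpi_hello_world test binary.
    argv = mcexec_mpi_command(4, "./mcexec", ["./mpi_hello_world"])
    # Print the command rather than running it; pass argv to subprocess.run to launch.
    print(shlex.join(argv))
```
</pre>
</div>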
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Best,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Balazs<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0cm 0cm 0cm 6pt;margin:5pt 0cm 5pt 4.8pt">
<div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Thank you in advance.</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Best regards,</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif">Jérémie Finiel</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11pt;font-family:calibri,sans-serif"> </span><u></u><u></u></p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</div></div></div>
</div>
</blockquote></div><br></div></div>