[Mckernel-users 35] Re: [Problem with installation of mckernel again]
Balazs Gerofi
bgerofi at riken.jp
Mon May 29 14:02:22 JST 2017
Hello Minwoo,
There are a few issues with how you are trying to run MPI jobs, but before
getting to those, could you send me your ssh public key?
I will give you access to our code repository so that you can access the
latest McKernel version.
Once you have that, we can move on to the MPI topic.
Best,
Balazs
On Thu, May 25, 2017 at 21:18 안민우 <mwahn402 at gmail.com> wrote:
> To be specific, the Linpack benchmark fails with:
>
> kernel:BUG: soft lockup - CPU#0 stuck for 23s! [mcexec:63572]
> Terminate by signal 28
>
> 2017-05-26 12:52 GMT+09:00 안민우 <mwahn402 at gmail.com>:
>
>> Hello Balazs,
>>
>> I have solved all the problems I wrote to you about before.
>>
>> I compared the elapsed time of each workload in NPB-MPI, and the results
>> were quite strange.
>>
>> My Intel Xeon server has 2 NUMA nodes with 6 cores each. I assigned 6 cores
>> from NUMA1 and 2 cores from NUMA0 (8 cores in total) to McKernel and ran NPB.
>> For comparison, I ran NPB on CentOS using the same 8 cores.
>>
>> According to my results, the elapsed time of NPB on McKernel and on Linux is
>> similar, but for every workload the elapsed time on Linux is slightly
>> shorter. Is this the expected result?
>>
>>
>> Moreover, I get a runtime error when running the "Linpack" benchmark. Linpack
>> stops partway through its run.
>>
>> I run it with the mpiexec command as $mcexec mpiexec -np 8 <executable>.
>>
>> Is there any problem with this? Also, is there a way to see the McKernel
>> kmsg before my server reboots itself?
>>
>>
>> Thank you for reading my email.
>>
>> Best regards,
>>
>> Minwoo Ahn.
>>
>> 2017-05-17 11:18 GMT+09:00 안민우 <mwahn402 at gmail.com>:
>>
>>> Hello Balazs,
>>>
>>> I'm an MS student from Korea who sent emails earlier about installing
>>> McKernel (which I installed successfully before). I have 3 questions.
>>>
>>> 1.
>>>
>>> I had some problems with my Linux installation, so I reinstalled CentOS
>>> and tried to build and boot McKernel again.
>>>
>>> I succeeded in building McKernel (the ihk+mckernel.tgz you gave me), but
>>> when I tried to run the mcreboot.sh script in the "install/sbin" directory,
>>> my whole server rebooted itself.
>>>
>>> I'm sure that I disabled irqbalance and set SELINUX=disabled.
>>>
>>> How can I fix this problem?
>>>
>>> 2.
>>>
>>> Moreover, where can I find documentation (or a manual) for the McKernel
>>> source code?
>>>
>>> I need it in order to change the number of cores McKernel uses. On my
>>> server, without any modification of the configuration, McKernel was
>>> installed on 6 cores. Can I change it to 8 cores?
>>>
>>> 3.
>>>
>>> I tried to run the MPI version of the NAS Parallel Benchmarks (ft.B.16),
>>> and it caused a page fault. What is the problem? Is it right to run the
>>> "ft.B.16" executable with the command
>>> "MPI_NUM_THREAD=16 mcexec {PATH_TO_EXECUTABLE}/ft.B.16"?
>>>
>>>
>>>
>>> Thank you for reading my email.
>>>
>>> Best regards,
>>>
>>> Minwoo Ahn.
>>>
>>
>>
>