Re: Floating point register corruption on ARM Cortex A57 (ARMv8) with RT_PREEMPT linux
From: Anup Pemmaiah <hidden>
Date: 2018-10-08 12:46:05
Some more observations with RT_PREEMPT configs enabled.
1) I re-ran the tests with all crypto options disabled, including the
NEON related crypto, and with the EFI kernel config options disabled.
I still see floating point registers getting corrupted at random.
2) I noticed that when I run the tests with an RT scheduling policy
and an RT priority, e.g. "chrt -f 5 ./test_float" or "chrt -r 5
./test_float", I am not able to reproduce the corruption issue. But
when I run the test without any RT policy or priority (just
./test_float, i.e. SCHED_OTHER), I can easily reproduce the issue.
I also tried disabling PREEMPT_LAZY with "echo NO_PREEMPT_LAZY >
/sys/kernel/debug/sched_features". It did not help, and I am still
able to reproduce the problem.
3) I have another ARM Cortex A57 system from a different vendor (I
cannot name the vendors for proprietary reasons) running Linux kernel
version 4.9.38 with RT_PREEMPT enabled. I do not see any floating
point corruption on it, whether I run the test as SCHED_OTHER or with
real time settings. So that tells me moving to 4.18 may not help. What
do you think?
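
For anyone who wants to compare against observation 2, the idea behind
this kind of test can be sketched roughly as below. This is only an
illustrative sketch and not the actual test_float source: the names
fp_workload, fp_count_mismatches and policy_name are made up here, and
whether a plain C loop keeps its values live in the FP/SIMD registers
long enough to catch a corruption depends on the compiler and the
optimization level.

```c
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: printable name for a scheduling policy. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/*
 * Deterministic floating point workload. The volatile read of the seed
 * keeps the compiler from folding repeated calls into a single one.
 */
static double fp_workload(unsigned long iters)
{
    static volatile double seed = 1.0;
    double a = seed, b = 0.5, c = 0.25, d = 0.125;

    assert(iters > 0);
    for (unsigned long i = 0; i < iters; i++) {
        a = a * 1.0000001 + b;
        b = b * 0.9999999 + c;
        c = c * 1.0000003 - d;
        d = d * 0.9999997 + 1e-9;
    }
    return a + b + c + d;
}

/*
 * Run the workload repeatedly and count the results that differ
 * bitwise from the first run. On a healthy system this returns 0.
 */
static unsigned long fp_count_mismatches(unsigned long rounds,
                                         unsigned long iters)
{
    const double ref = fp_workload(iters);
    unsigned long bad = 0;

    for (unsigned long r = 0; r < rounds; r++) {
        const double got = fp_workload(iters);
        if (memcmp(&got, &ref, sizeof(got)) != 0)
            bad++;
    }
    return bad;
}

#ifdef FP_CHECK_STANDALONE
int main(void)
{
    /* 0 means "the calling process". */
    int policy = sched_getscheduler(0);
    unsigned long bad;

    printf("policy: %s\n", policy_name(policy));
    bad = fp_count_mismatches(1000, 1000000);
    printf("mismatches: %lu\n", bad);
    return bad != 0;
}
#endif
```

Built with -DFP_CHECK_STANDALONE, the binary prints the policy it is
running under, so the same run can be repeated directly and under
"chrt -f 5" or "chrt -r 5" to compare the SCHED_OTHER and RT cases.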
Thanks
Anup
On Sun, Oct 7, 2018 at 9:58 AM Anup Pemmaiah [off-list ref] wrote:
> > nope, should work by default. Do you have NEON related crypto code
> > or EFI enabled?
>
> Sebastian,
>
> Thank you for the comments. I have NEON related crypto code enabled
> right now, but I remember disabling it and it did not make a
> difference. I will disable it again and will give it a try.
>
> In the mean time, when I disabled the following 4 lines from the
> config file and re-compiled the kernel, the test code works fine
> without the issue described earlier related to floating point. Are
> you suspecting that NEON related crypto interferes with real time
> kernel and not with non-rt kernel?
>
> # CONFIG_PREEMPT_RT_BASE=y
> # CONFIG_HAVE_PREEMPT_LAZY=y
> # CONFIG_PREEMPT_LAZY=y
> # CONFIG_PREEMPT_RT_FULL=y
>
> > Could you please try the latest v4.18? I believe it is fixed there
> > and needs just backporting. Could you please try?
>
> I will try it as a last resort because I am not sure if the board BSP
> supports v4.18. Right now, I am trying to figure out, why it works
> fine with non-rt kernel and only see the issue when the above four
> RT_PREEMPT config options are turned on.
>
> On Fri, Oct 5, 2018, 9:55 AM Sebastian Andrzej Siewior [off-list ref] wrote:
> > On 2018-10-04 19:12:53 [-0700], Anup Pemmaiah wrote:
> > > 1) Is there any floating point related kernel setting that I
> > > should set in the RT_PREEMPT kernel? I have set eagerfpu=on (even
> > > though it is on by default)
> >
> > nope, should work by default. Do you have NEON related crypto code
> > or EFI enabled?
> >
> > > 2) Was reading about "Lazy Stacking" for Cortex-M4, but not sure
> > > if it applies for Cortex A57
> > >
> > > Any comments will be greatly appreciated.
> >
> > Could you please try the latest v4.18? I believe it is fixed there
> > and needs just backporting. Could you please try?
> >
> > Sebastian