Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined
From: Nick Desaulniers <hidden>
Date: 2019-09-30 22:19:22
Also in:
lkml
On Sun, Sep 29, 2019 at 11:00 PM Masahiro Yamada [off-list ref] wrote:
KernelCI reports that bcm2835_defconfig is no longer booting since
commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
forcibly"):
https://lkml.org/lkml/2019/9/26/825
I also received a regression report from Nicolas Saenz Julienne:
https://lkml.org/lkml/2019/9/27/263
This problem has cropped up on arch/arm/configs/bcm2835_defconfig
because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
to prefer not inlining functions with -Os. I was able to reproduce
it with other boards and defconfig files by manually enabling
CONFIG_CC_OPTIMIZE_FOR_SIZE.
The __get_user_check() specifically uses r0, r1, r2 registers.
Yep, that part is obvious, but...
So, uaccess_save_and_enable() and uaccess_restore() must be inlined in order to avoid those registers being overwritten in the callees.
Right, r0, r1, and r2 are caller-saved, meaning that __get_user_check() must save and restore them around any function call it makes. So uaccess_save_and_enable() and uaccess_restore() should either be made into macros (macros and typecheck(), see include/linux/typecheck.h, go together like peanut butter and chocolate), or be called at points in the function where those register variables are no longer in use.
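To make the macro suggestion concrete, here is a minimal sketch of what a type-checked macro form might look like. The typecheck() definition has the same shape as the one in include/linux/typecheck.h (written with __typeof__ so it also builds outside the kernel); fake_dacr, UA_SAVE_AND_ENABLE(), UA_RESTORE() and ua_round_trip() are hypothetical names invented for illustration and are not the kernel's uaccess implementation. It relies on GNU C statement expressions, so it assumes gcc or clang.

```c
#include <assert.h>

/*
 * Same shape as typecheck() in include/linux/typecheck.h: comparing the
 * addresses of two dummies makes the compiler warn when the types differ.
 */
#define typecheck(type, x)			\
({	type __dummy;				\
	__typeof__(x) __dummy2;			\
	(void)(&__dummy == &__dummy2);		\
	1;					\
})

/* Hypothetical stand-in for the state the real helpers save and restore. */
static unsigned int fake_dacr = 0x55;

/*
 * Hypothetical macro form: it expands in place, so no function call is
 * emitted and no callee gets a chance to clobber caller-saved registers.
 */
#define UA_SAVE_AND_ENABLE()			\
({	unsigned int __old = fake_dacr;		\
	fake_dacr = 0x51;			\
	__old;					\
})

#define UA_RESTORE(flags)			\
do {						\
	typecheck(unsigned int, flags);		\
	fake_dacr = (flags);			\
} while (0)

/* Runs one save/restore round trip and returns the restored state. */
static unsigned int ua_round_trip(void)
{
	unsigned int flags = UA_SAVE_AND_ENABLE();

	assert(fake_dacr == 0x51);	/* "enabled" while inside the window */
	UA_RESTORE(flags);
	return fake_dacr;
}
```

Passing a flags variable of any type other than unsigned int to UA_RESTORE() would draw a "comparison of distinct pointer types" warning at the call site, which is the kind of checking a plain macro otherwise loses compared with an inline function.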
Prior to commit 9012d011660e ("compiler: allow all arches to enable
CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
inlining functions, except on x86.

Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
So, __always_inline is now the only guaranteed way of forcible inlining.

I want to keep as much compiler's freedom as possible about the inlining
decision. So, I changed the function call order instead of adding
__always_inline around.

Call uaccess_save_and_enable() before assigning the __p ("r0"), and
uaccess_restore() after evacuating the __e ("r0").

Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Reported-by: "kernelci.org bot" <redacted>
Reported-by: Nicolas Saenz Julienne <redacted>
Signed-off-by: Masahiro Yamada <redacted>
---
 arch/arm/include/asm/uaccess.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 303248e5b990..559f252d7e3c 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
 #define __get_user_check(x, p)						\
 	({								\
 		unsigned long __limit = current_thread_info()->addr_limit - 1; \
+		unsigned int __ua_flags = uaccess_save_and_enable();	\
 		register typeof(*(p)) __user *__p asm("r0") = (p);	\
 		register __inttype(x) __r2 asm("r2");			\
 		register unsigned long __l asm("r1") = __limit;		\
 		register int __e asm("r0");				\
What does it mean for there to be two different local variables pinned to the same register? I.e., it looks like __e and __p are both defined to live in r0. Would having one variable and an explicit cast result in differing storage?
-		unsigned int __ua_flags = uaccess_save_and_enable();	\
+		unsigned int __err;					\
 		switch (sizeof(*(__p))) {				\
 		case 1:							\
 			if (sizeof((x)) >= 8)				\
@@ -223,9 +224,10 @@ extern int __get_user_64t_4(void *);
 			break;						\
 		default: __e = __get_user_bad(); break;			\
^ I think this assignment to __e should be replaced with an assignment to __err? We no longer need the register at this point and could skip the assignment of x.
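For readers following the ordering argument, the effect of the reordering can be modelled in portable C. This is only a simulation under stated assumptions: regs[] stands in for r0..r2, helper_not_inlined() stands in for an out-of-line uaccess_save_and_enable()/uaccess_restore() that happens to scribble on caller-saved registers, and fake_get_user_asm() stands in for the __get_user_*() assembly helper. None of these names exist in the kernel, and a real compiler is not obliged to clobber the registers in exactly this way.

```c
#include <assert.h>

/* Simulated caller-saved registers r0..r2 (a model, not real hardware). */
static long regs[3];

/*
 * Stand-in for an out-of-line uaccess helper: as an ordinary function it
 * is free to overwrite caller-saved registers.
 */
static unsigned int helper_not_inlined(void)
{
	regs[0] = -1;	/* clobber r0 */
	regs[1] = -1;	/* clobber r1 */
	return 0;
}

/*
 * Stand-in for the assembly helper: takes the address in r0, leaves the
 * loaded value in r2 and the error code in r0.
 */
static void fake_get_user_asm(void)
{
	long addr = regs[0];

	regs[2] = addr * 2;		/* pretend this is the loaded value */
	regs[0] = (addr < 0) ? -14 : 0;	/* -EFAULT for a bogus address */
}

/* Old order: helpers are called while r0 still holds live data. */
static int get_user_old_order(long p, long *x)
{
	regs[0] = p;			/* __p pinned to r0 */
	helper_not_inlined();		/* save_and_enable clobbers __p */
	fake_get_user_asm();		/* now works on a garbage address */
	helper_not_inlined();		/* restore clobbers __e in r0 */
	*x = regs[2];
	return (int)regs[0];
}

/* New order: helpers run only while r0 holds nothing of interest. */
static int get_user_new_order(long p, long *x)
{
	int err;

	helper_not_inlined();		/* before __p is loaded into r0 */
	regs[0] = p;
	fake_get_user_asm();
	err = (int)regs[0];		/* evacuate __e from r0 */
	*x = regs[2];
	helper_not_inlined();		/* r0 is dead by this point */
	return err;
}
```

In this model, get_user_new_order(21, &x) returns 0 with x set to 42, while get_user_old_order(21, &x) returns a nonzero error because both the address and the error code were overwritten by the helper.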
} \
- uaccess_restore(__ua_flags); \
+ __err = __e; \
x = (typeof(*(p))) __r2; \
- __e; \
+ uaccess_restore(__ua_flags); \
+ __err; \
})
#define get_user(x, p) \
--
2.17.1

--
Thanks,
~Nick Desaulniers

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel