Re: [PATCH v2] madvise: MADV_SOFT_OFFLINE requests can return -EBUSY
From: Luis Claudio R. Goncalves <hidden>
Date: 2024-12-04 20:45:26
On Wed, Dec 04, 2024 at 09:35:20PM +0100, Alejandro Colomar wrote:
> Hi Luis, Tyonnchie,
>
> On Fri, Nov 29, 2024 at 06:43:39PM -0300, Luis Claudio R. Goncalves wrote:
> > On Thu, Nov 28, 2024 at 12:35:48PM +0100, Alejandro Colomar wrote:
> > > Hi Tyonnchie,
> > >
> > > On Tue, Nov 26, 2024 at 11:12:03AM -0500, tyberry@redhat.com wrote:
> > > > If the page could not be offlined madvise will return -EBUSY.
> > > > This might occur if the page is currently in use or locked.
> > >
> > > Could you show this in a small example program (if possible)?
> > > Like 30 lines or so.  If not, it's okay.
> >
> > Hi Alejandro!
> >
> > Given the ongoing holidays, let me take the liberty of giving some
> > context in order to keep the conversation going.
> >
> > We received reports of failed LTP madvise11[1] tests. The errors looked
> > like this:
> >
> >     madvise11.c:409: TINFO: Spawning 4 threads, with a total of 640 memory pages
> >     madvise11.c:132: TFAIL: madvise failed: EBUSY (16)
> >     madvise11.c:163: TINFO: Thread [0] returned 16, failed.
> >     madvise11.c:191: TFAIL: thread [0] - exited with errors
> >     madvise11.c:163: TINFO: Thread [2] returned 0, succeeded.
> >     madvise11.c:163: TINFO: Thread [3] returned 0, succeeded.
> >     madvise11.c:163: TINFO: Thread [1] returned 0, succeeded.
> >     madvise11.c:361: TINFO: Restore 629 Soft-offlined pages
> >     madvise11.c:290: TWARN: write(3,0x7ffce114b8a0,8) failed: EBUSY (16)
> >
> > Clearly the problem had to do with -EBUSY being returned by a madvise()
> > operation. The bug was initially reported on kernels with PREEMPT_RT
> > enabled but we soon observed that the problem also happened with the
> > stock kernel, though requiring more repetitions to trigger issue.
> >
> > After debug and investigation we observed that the -EBUSY return was a
> > valid case in the kernel code and was not being handled by the test. A
> > fix was sent to the LTP project by Li Wang[2], specifically for the
> > madvise11 test.
> >
> > In this process, we noticed that the man pages did not mention -EBUSY as
> > a possible result of a failed offlining operation, as described by
> > Tyonnchie.
> >
> > I hope this helps!
>
> Thanks!  I've applied the patch, with some tweaks:
> <https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=3205359a3a7079d9d40a50388e851874729a827a>
>
> I added an Acked-by on your behalf, Luis.
Thank you! You have all my respect for the great work you and many others
do with the man pages!

Luis
> Have a lovely night!
> Alex
>
> > Best regards,
> > Luis
> >
> > [1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise11.c
> > [2] https://lists.linux.it/pipermail/ltp/2024-May/038310.html
> >
> > > Have a lovely day!
> > > Alex
> > >
> > > > Signed-off-by: Tyonnchie Berry <redacted>
> > > > ---
> > > > diff --git a/man/man2/madvise.2 b/man/man2/madvise.2
> > > > index 4f2210ee2..c10dcd599 100644
> > > > --- a/man/man2/madvise.2
> > > > +++ b/man/man2/madvise.2
> > > > @@ -702,6 +702,13 @@ The map exists, but the area maps something that isn't a file.
> > > >  .BR MADV_COLLAPSE )
> > > >  Could not charge hugepage to cgroup: cgroup limit exceeded.
> > > >  .TP
> > > > +.B EBUSY
> > > > +(for
> > > > +.B MADV_SOFT_OFFLINE )
> > > > +If any pages within the add+length range could not be offlined,
> > > > +madvise will return -EBUSY.
> > > > +This might occur if the page is currently in use or locked.
> > > > +.TP
> > > >  .B EFAULT
> > > >  .I advice
> > > >  is
> > >
> > > --
> > > <https://www.alejandro-colomar.es/>
> >
> > ---end quoted text---
>
> --
> <https://www.alejandro-colomar.es/>
---end quoted text---
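
For reference, the kind of small example program asked for earlier in the
thread could look roughly like the sketch below. It is only an
illustration under stated assumptions, not the code from the LTP fix or
from the man-pages commit: the names classify_offline_errno() and
soft_offline_one_page() are hypothetical, and the fallback value for
MADV_SOFT_OFFLINE is assumed to be the asm-generic one. The point it shows
is that -EBUSY from MADV_SOFT_OFFLINE is a "page was in use or locked"
outcome that a caller may want to handle separately from hard errors.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback for older libc headers. 101 is the asm-generic value;
 * this is an assumption for any architecture that defines it differently. */
#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101
#endif

/* Hypothetical result codes, used only by this sketch. */
enum offline_result {
    OFFLINE_OK,           /* page was soft-offlined */
    OFFLINE_BUSY,         /* page in use or locked; not a hard failure */
    OFFLINE_UNSUPPORTED,  /* missing privilege or kernel support */
    OFFLINE_ERROR         /* anything else */
};

/* Hypothetical helper: map an errno value from madvise() to a result.
 * An err of 0 means the call succeeded. */
enum offline_result classify_offline_errno(int err)
{
    switch (err) {
    case 0:
        return OFFLINE_OK;
    case EBUSY:
        return OFFLINE_BUSY;
    case EPERM:   /* MADV_SOFT_OFFLINE requires CAP_SYS_ADMIN */
    case EINVAL:  /* advice not valid, e.g. no CONFIG_MEMORY_FAILURE */
        return OFFLINE_UNSUPPORTED;
    default:
        return OFFLINE_ERROR;
    }
}

/* Hypothetical demo: map one anonymous page, touch it so that it is
 * backed by physical memory, then ask the kernel to soft-offline it. */
enum offline_result soft_offline_one_page(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return OFFLINE_ERROR;

    size_t len = (size_t)pagesize;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return OFFLINE_ERROR;

    memset(p, 0xa5, len);  /* fault the page in */

    /* Save errno right away, before any other library call can change it. */
    int err = (madvise(p, len, MADV_SOFT_OFFLINE) == 0) ? 0 : errno;

    if (err == EBUSY)
        fprintf(stderr, "madvise: page busy (EBUSY), could not offline\n");
    else if (err != 0)
        fprintf(stderr, "madvise: %s\n", strerror(err));

    munmap(p, len);
    return classify_offline_errno(err);
}
```

A caller would invoke soft_offline_one_page() from main() and treat
OFFLINE_BUSY as something to retry or skip rather than as a test failure.
Since the EBUSY case was harder to trigger on a stock kernel than with
PREEMPT_RT, a single run is unlikely to show it; calling the function in a
loop or from several threads, as madvise11 does, would be needed to
observe it.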