Thread (29 messages) 29 messages, 5 authors, 2024-10-28

Re: [PATCH v3 3/5] mm: madvise: implement lightweight guard page mechanism

From: Lorenzo Stoakes <hidden>
Date: 2024-10-28 12:43:44
Also in: linux-alpha, linux-arch, linux-kselftest, linux-mips, linux-mm, lkml

On Fri, Oct 25, 2024 at 11:35:03PM +0100, Lorenzo Stoakes wrote:
On Fri, Oct 25, 2024 at 11:56:52PM +0200, Vlastimil Babka wrote:
quoted
On 10/25/24 19:12, Lorenzo Stoakes wrote:
quoted
On Wed, Oct 23, 2024 at 05:24:40PM +0100, Lorenzo Stoakes wrote:
quoted
Implement a new lightweight guard page feature, that is regions of userland
virtual memory that, when accessed, cause a fatal signal to arise.
<snip>

Hi Andrew - Could you apply the below fix-patch? I realise we must handle
fatal signals and conditional rescheduling in the vector_madvise() special
case.

Thanks!

----8<----
From 546d7e1831c71599fc733d589e0d75f52e84826d Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <redacted>
Date: Fri, 25 Oct 2024 18:05:48 +0100
Subject: [PATCH] mm: yield on fatal signal/cond_sched() in vector_madvise()

While we have to treat -ERESTARTNOINTR specially here as we are looping
through a vector of operations and can't simply restart the entire
operation, we mustn't hold up fatal signals or RT kernels.
For plain madvise() syscall returning -ERESTARTNOINTR does the right thing
and checks fatal_signal_pending() before returning, right?
I believe so. But now you've caused me some doubt so let me double check
and make absolutely sure :)
quoted
Uh actually can we be just returning -ERESTARTNOINTR or do we need to use
restart_syscall()?
Yeah I was wondering about that, but restart_syscall() seems to set
TIF_SIGPENDING, and I wondered if that was correct... but then I saw other
places that seemed to use it direct so it seemed so.

Let's eliminiate doubt, will check this next week and make sure.
Yeah looks like we do actually, as the function is handled by
arch_do_signal_or_restart():

do_syscall_64()
  -> sycall_exit_to_user_mode_work()
    -> __sycall_exit_to_user_mode_work()
      -> exit_to_user_mode_prepare()
        -> exit_to_user_mode_loop()
	  -> arch_do_signal_or_restart()
	    -> (possibly) get_signal()

And arch_do_signal_or_restart() is only invoked by exit_to_user_mode_loop()
if _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL is set:

		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
			arch_do_signal_or_restart(regs);

This is set by restart_syscall():

static inline int restart_syscall(void)
{
	set_tsk_thread_flag(current, TIF_SIGPENDING);
	return -ERESTARTNOINTR;
}

It's a nop if no signal is actually pending too, and it handily also deals
with signals...
quoted
quoted
---
 mm/madvise.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 48eba25e25fe..127aa5d86656 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1713,8 +1713,14 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
 		 * we have already rescinded locks, it should be no problem to
 		 * simply try again.
 		 */
-		if (ret == -ERESTARTNOINTR)
+		if (ret == -ERESTARTNOINTR) {
+			if (fatal_signal_pending(current)) {
+				ret = -EINTR;
+				break;
+			}
+			cond_resched();
Should be unnecessary as we're calling an operation that takes a rwsem so
there are reschedule points already. And with lazy preempt hopefully
cond_resched()s will become history, so let's not add more only to delete later.
Ack will remove on respin.
quoted
quoted
 			continue;
+		}
 		if (ret < 0)
 			break;
 		iov_iter_advance(iter, iter_iov_len(iter));
--
2.47.0
For simplicitly with your other comments too I think I'll respin this next
week.
So will respin to use restart_syscall() correctly (+ fix up your other points too).

Have tested and confirmed that failing to use restart_syscall() causes the
-ERESTARTINTR to be returned and no restart, but using it works.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help