Re: [PATCH 8/8] powerpc/rtas: consume retry statuses in sys_rtas()

From: Nathan Lynch <hidden>
Date: 2023-03-23 13:41:14

Michael Ellerman [off-list ref] writes:

Nathan Lynch via B4 Relay [off-list ref] writes:

From: Nathan Lynch <redacted>

The kernel can handle retrying RTAS function calls in response to
-2/990x in the sys_rtas() handler instead of relaying the intermediate
status to user space.

This looks good in general.

One query ...

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 47a2aa43d7d4..c330a22ccc70 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c

@@ -1798,7 +1798,6 @@ static bool block_rtas_call(int token, int nargs,
 /* We assume to be passed big endian arguments */
 SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 {
-	struct pin_cookie cookie;
 	struct rtas_args args;
 	unsigned long flags;
 	char *buff_copy, *errbuf = NULL;

@@ -1866,20 +1865,25 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 
 	buff_copy = get_errorlog_buffer();
 
-	raw_spin_lock_irqsave(&rtas_lock, flags);
-	cookie = lockdep_pin_lock(&rtas_lock);
+	do {
+		struct pin_cookie cookie;
 
-	rtas_args = args;
-	do_enter_rtas(&rtas_args);
-	args = rtas_args;
+		raw_spin_lock_irqsave(&rtas_lock, flags);
+		cookie = lockdep_pin_lock(&rtas_lock);
 
-	/* A -1 return code indicates that the last command couldn't
-	   be completed due to a hardware error. */
-	if (be32_to_cpu(args.rets[0]) == -1)
-		errbuf = __fetch_rtas_last_error(buff_copy);
+		rtas_args = args;
+		do_enter_rtas(&rtas_args);
+		args = rtas_args;
 
-	lockdep_unpin_lock(&rtas_lock, cookie);
-	raw_spin_unlock_irqrestore(&rtas_lock, flags);
+		/*
+		 * Handle error record retrieval before releasing the lock.
+		 */
+		if (be32_to_cpu(args.rets[0]) == -1)
+			errbuf = __fetch_rtas_last_error(buff_copy);
+
+		lockdep_unpin_lock(&rtas_lock, cookie);
+		raw_spin_unlock_irqrestore(&rtas_lock, flags);
+	} while (rtas_busy_delay(be32_to_cpu(args.rets[0])));

rtas_busy_delay_early() has the successive_ext_delays case that will
break out eventually. But if we keep getting plain RTAS_BUSY back from
RTAS I *think* this loop will never terminate?
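
To make the termination question concrete, here is a small userspace model of the retry decision. It is not the kernel code. It assumes the usual RTAS status conventions: -2 is RTAS_BUSY, and 9900 through 9905 are extended delay hints of 10^(status - 9900) ms. Every `model_*` name, the firmware stubs, and the `cap` parameter are hypothetical and exist only so the demonstration can stop.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RTAS_BUSY      (-2)
#define MODEL_EXT_DELAY_MIN  9900
#define MODEL_EXT_DELAY_MAX  9905

/* Suggested delay in ms for a retry status, or 0 if the status is final. */
static unsigned int model_busy_delay_ms(int status)
{
	unsigned int ms = 1;

	if (status == MODEL_RTAS_BUSY)
		return 1;
	if (status < MODEL_EXT_DELAY_MIN || status > MODEL_EXT_DELAY_MAX)
		return 0;
	/* 9900 -> 1 ms, 9901 -> 10 ms, ... 9905 -> 100000 ms */
	for (int order = status - MODEL_EXT_DELAY_MIN; order > 0; order--)
		ms *= 10;
	return ms;
}

/* Hypothetical firmware stub: returns the status of one call. */
typedef int (*model_fw_call)(void *ctx);

/*
 * Same shape as the do/while loop in the patch: call, then retry while
 * the status is a retry status. 'cap' exists only so this demo can stop;
 * the loop in the patch has no such limit.
 * Returns the number of calls made; *status receives the last status.
 */
static size_t model_retry_loop(model_fw_call call, void *ctx, size_t cap,
			       int *status)
{
	size_t calls = 0;

	do {
		*status = call(ctx);
		calls++;
	} while (model_busy_delay_ms(*status) != 0 && calls < cap);

	return calls;
}

/* Firmware that never makes progress. */
static int model_fw_always_busy(void *ctx)
{
	(void)ctx;
	return MODEL_RTAS_BUSY;
}

/* Firmware that reports busy for *ctx calls, then returns status 0. */
static int model_fw_busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return MODEL_RTAS_BUSY;
	}
	return 0;
}
```

With `model_fw_always_busy`, the loop stops only because of `cap`. Without it, nothing in the loop condition would ever become false.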

Yes, but if this happens, then there is a serious bug in Linux or
RTAS. The only time I've seen something like that on PowerVM is when
Linux corrupted internal RTAS state by not serializing calls correctly.

rtas_busy_delay_early() has a bail-out heuristic, not for RTAS_BUSY, but
for extended delay statuses (990x), which I suspect happen rarely (if
ever) that early. That's there in order to allow boot to proceed and
hopefully get useful messages out in a truly unexpected circumstance.
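
As an illustration of the heuristic described above, the following userspace sketch counts consecutive extended delay statuses and gives up past a limit, while plain RTAS_BUSY resets the counter and always retries. This is a model, not the kernel's rtas_busy_delay_early(): the struct, the `sketch_*` names, and the limit of 1000 are assumptions chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_RTAS_BUSY        (-2)
#define SKETCH_EXT_DELAY_MIN    9900
#define SKETCH_EXT_DELAY_MAX    9905
#define SKETCH_EXT_DELAY_LIMIT  1000 /* assumed threshold */

struct sketch_early_state {
	size_t successive_ext_delays;
};

/* Returns true if the caller should retry the call. */
static bool sketch_busy_delay_early(struct sketch_early_state *st, int status)
{
	if (status >= SKETCH_EXT_DELAY_MIN && status <= SKETCH_EXT_DELAY_MAX) {
		st->successive_ext_delays++;
		if (st->successive_ext_delays > SKETCH_EXT_DELAY_LIMIT) {
			/* Looks stuck: tell the caller to stop so boot can proceed. */
			st->successive_ext_delays = 0;
			return false;
		}
		return true;
	}

	/* Any other status breaks the run of extended delays. */
	st->successive_ext_delays = 0;

	/* Plain busy has no bail-out in this model: it always retries. */
	return status == SKETCH_RTAS_BUSY;
}
```

The asymmetry is the point: a run of 990x statuses is bounded, a run of -2 statuses is not.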

That said...

To avoid that, and just as good manners, I think we should have a
fatal_signal_pending() check, and if that returns true we bail out of
the syscall with -EINTR?

That probably makes sense. In its current state, I could see
this patch preventing or delaying OS shutdown in situations where it
wouldn't have occurred before.

I think I would want the bailout condition in this case to be
(fatal_signal_pending() && retries > some_threshold), to reduce the
likelihood of non-"stuck" operations being left unfinished. And it
should dump a stack trace.
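
A rough userspace sketch of the bail-out condition proposed above, under stated assumptions: fatal_signal_pending() is modeled by a plain boolean, the stack trace by a flag, and the threshold value of 16 is a placeholder for "some_threshold". All `demo_*` names are hypothetical; none of this is taken from an actual patch.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_RTAS_BUSY        (-2)
#define DEMO_EXT_DELAY_MIN    9900
#define DEMO_EXT_DELAY_MAX    9905
#define DEMO_RETRY_THRESHOLD  16 /* placeholder for "some_threshold" */

struct demo_ctx {
	int busy_calls_left;  /* busy responses left; negative means forever */
	bool fatal_signal;    /* stands in for fatal_signal_pending() */
	unsigned int retries; /* retries performed by the loop */
	bool dumped_stack;    /* stands in for dumping a stack trace */
};

static bool demo_is_retry_status(int status)
{
	return status == DEMO_RTAS_BUSY ||
	       (status >= DEMO_EXT_DELAY_MIN && status <= DEMO_EXT_DELAY_MAX);
}

/* Stub firmware call driven by the context. */
static int demo_fw_call(struct demo_ctx *c)
{
	if (c->busy_calls_left < 0)
		return DEMO_RTAS_BUSY;
	if (c->busy_calls_left > 0) {
		c->busy_calls_left--;
		return DEMO_RTAS_BUSY;
	}
	return 0;
}

/* Returns 0 when the call reached a final status, -EINTR on bail-out. */
static int demo_sys_rtas_loop(struct demo_ctx *c, int *status)
{
	for (;;) {
		*status = demo_fw_call(c);
		if (!demo_is_retry_status(*status))
			return 0;

		c->retries++;

		/* Bail out only when the task is dying AND the call looks stuck. */
		if (c->fatal_signal && c->retries > DEMO_RETRY_THRESHOLD) {
			c->dumped_stack = true;
			return -EINTR;
		}
	}
}
```

Requiring both conditions means a killed task whose call finishes within the threshold still completes normally. A call that stays busy forever with no fatal signal pending would still spin in this model.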
