Thread (26 messages) 26 messages, 5 authors, 2024-02-28

Re: [PATCH 2/2] fbcon: Defer console takeover for splash screens to first switch

From: Hans de Goede <hidden>
Date: 2024-02-27 13:47:20
Also in: dri-devel, lkml

Hi,

On 2/27/24 02:06, Daniel van Vugt wrote:
On 27/2/24 02:23, Hans de Goede wrote:
quoted
Hi All,

On 2/2/24 09:53, Daniel van Vugt wrote:
quoted
Until now, deferred console takeover only meant defer until there is
output. But that risks stepping on the toes of userspace splash screens,
as console messages may appear before the splash screen. So check for the
"splash" parameter (as used by Plymouth) and if present then extend the
deferral until the first switch.
Daniel, thank you for your patch but I do not believe that this
is the right solution. Deferring fbcon takeover further then
after the first text is output means that any errors about e.g.
a corrupt initrd or the kernel erroring out / crashing will not
be visible.
That's not really correct. If a boot failure has occurred after the splash then
pressing escape shows the log.
Hmm, I guess this is with the latest plymouth which has a builtin terminal
emulator for kernels without VT support ? Pressing ESC does not to a VC
switch and AFAICT that is what you are triggering on to allow fbcon takeover
after this patches.
If a boot failure has occurred before the splash
then it can be debugged visually by rebooting without the "splash" parameter.
Which requires the user to know this and requires the user to know how to
edit kernel cmdline parameters in e.g. grub. This is not a good user
experience. We want inexperienced users to just be able to point
a phone camera at the screen and take a picture of the errors.

quoted
When the kernel e.g. oopses or panics because of not finding
its rootfs (I tested the latter option when writing the original
deferred fbcon takeover code) then fbcon must takeover and
print the messages from the dying kernel so that the user has
some notion of what is going wrong.
Indeed, just reboot without the "splash" parameter to do that.
Again not something beginning Linux users will be able to do,
what happened to "Ubuntu: Linux for Human Beings" ?
quoted
And since your patch seems to delay switching till the first
vc-switch this means that e.g. even after say gdm refusing
to start because of issues there still will be no text
output. This makes debugging various issues much harder.
I've debugged many gdm failures and it is never useful to use the console for
those. Reboot and get the system journal instead.
But users will not see any errors now, meaning they don't
even know where to begin with troubleshooting ...
quoted
Moreover Fedora has been doing flickerfree boot for many
years without needing this.
I believe Fedora has a mostly working solution, but not totally reliable, as
mentioned in the commit message:

"even systems whose splash exists in initrd may not be not immune because they
 still rely on racing against all possible kernel messages that might
 trigger the fbcon takeover"
Only very serious kernel errors like oopses or panics will
trigger the takeover and that is *exactly* what we want.

There is a race where plymouth may hide such vary serious
messages, if plymouth does manage to start before the errors,
but that is actually an existing issue which we don't want
to make bigger by *always* hiding such errors.
quoted
The kernel itself will be quiet as long as you set
CONFIG_CONSOLE_LOGLEVEL_QUIET=3 Ubuntu atm has set this
to 4 which means any kernel pr_err() or dev_err()
messages will get through and since there are quite
a few false positives of those Ubuntu really needs
to set CONFIG_CONSOLE_LOGLEVEL_QUIET=3 to fix part of:
https://bugs.launchpad.net/bugs/1970069
Incorrect. In my testing some laptops needed log level as low as 2 to go quiet.
And the Ubuntu kernel team is never going to fix all those for non-sponsored
devices.
Notice that atm Ubuntu's kernel is using the too high
CONFIG_CONSOLE_LOGLEVEL_QUIET=4 with
CONFIG_CONSOLE_LOGLEVEL_QUIET=3 getting any errors logged
to the console should be very very rare.

The only thing I can think of is if the kernel oopses
/ WARN()s early on but the cause is innocent enough
that the boot happily continues.

In that case actually showing the oops/WARN() is a good
thing.

For all the years Fedora has had flickerfree boot I have
seen zero bug reports about this. If you have examples
of this actually being a problem please file bugs for
them (launchpad or bugzilla.kernel.org is fine) and
then lets take a look at those bugs and fix them.

These should be so rare that I'm not worried about this
becoming a never ending list of bugs (unlike pr_err() /
dev_err() messages of which there are simply too many).
quoted
After that it is "just" a matter of not making userspace
output anything unless it has errors to report.

systemd already is quiet by default (only logging
errors) when quiet is on the kernel commandline.
Unfortunately not true for Ubuntu. We carry a noisy systemd patch which I'm
told we can't remove in the short term:

https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/1970069/comments/39
Well then make the patch less noisy? Suppressing non
error message unless in debug mode should be easy
even with a downstream patch.
Thanks for your input, but I respectfully disagree and did consider these
points already.
Sorry, but your real problem here seems to be your
noisy downstream systemd patch. I'm not going to ack
a kernel patch which I consider a bad idea because
Ubuntu has a non standard systemd patch which is
to trigger happy with spamming the console.

So this is still a NACK from me.

Regards,

Hans




quoted
quoted
Closes: https://bugs.launchpad.net/bugs/1970069
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Daniel van Vugt <redacted>
---
 drivers/video/fbdev/core/fbcon.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 63af6ab034..5b9f7635f7 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -76,6 +76,7 @@
 #include <linux/crc32.h> /* For counting font checksums */
 #include <linux/uaccess.h>
 #include <asm/irq.h>
+#include <asm/cmdline.h>
 
 #include "fbcon.h"
 #include "fb_internal.h"
@@ -146,6 +147,7 @@ static inline void fbcon_map_override(void)
 
 #ifdef CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER
 static bool deferred_takeover = true;
+static int initial_console = -1;
 #else
 #define deferred_takeover false
 #endif
@@ -3341,7 +3343,7 @@ static void fbcon_register_existing_fbs(struct work_struct *work)
 	console_unlock();
 }
 
-static struct notifier_block fbcon_output_nb;
+static struct notifier_block fbcon_output_nb, fbcon_switch_nb;
 static DECLARE_WORK(fbcon_deferred_takeover_work, fbcon_register_existing_fbs);
 
 static int fbcon_output_notifier(struct notifier_block *nb,
@@ -3358,6 +3360,21 @@ static int fbcon_output_notifier(struct notifier_block *nb,
 
 	return NOTIFY_OK;
 }
+
+static int fbcon_switch_notifier(struct notifier_block *nb,
+				 unsigned long action, void *data)
+{
+	struct vc_data *vc = data;
+
+	WARN_CONSOLE_UNLOCKED();
+
+	if (vc->vc_num != initial_console) {
+		dummycon_unregister_switch_notifier(&fbcon_switch_nb);
+		dummycon_register_output_notifier(&fbcon_output_nb);
+	}
+
+	return NOTIFY_OK;
+}
 #endif
 
 static void fbcon_start(void)
@@ -3370,7 +3387,14 @@ static void fbcon_start(void)
 
 	if (deferred_takeover) {
 		fbcon_output_nb.notifier_call = fbcon_output_notifier;
-		dummycon_register_output_notifier(&fbcon_output_nb);
+		fbcon_switch_nb.notifier_call = fbcon_switch_notifier;
+		initial_console = fg_console;
+
+		if (cmdline_find_option_bool(boot_command_line, "splash"))
+			dummycon_register_switch_notifier(&fbcon_switch_nb);
+		else
+			dummycon_register_output_notifier(&fbcon_output_nb);
+
 		return;
 	}
 #endif
@@ -3417,8 +3441,10 @@ void __exit fb_console_exit(void)
 {
 #ifdef CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER
 	console_lock();
-	if (deferred_takeover)
+	if (deferred_takeover) {
 		dummycon_unregister_output_notifier(&fbcon_output_nb);
+		dummycon_unregister_switch_notifier(&fbcon_switch_nb);
+	}
 	console_unlock();
 
 	cancel_work_sync(&fbcon_deferred_takeover_work);
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help