Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
From: Mike Rapoport <rppt@kernel.org>
Date: 2025-11-17 21:05:43
Also in: linux-doc, linux-fsdevel, linux-mm, lkml
On Mon, Nov 17, 2025 at 01:29:47PM -0500, Pasha Tatashin wrote:
> On Sun, Nov 16, 2025 at 2:16 PM Mike Rapoport [off-list ref] wrote:
> > On Sun, Nov 16, 2025 at 09:55:30AM -0500, Pasha Tatashin wrote:
> > > On Sun, Nov 16, 2025 at 7:43 AM Mike Rapoport [off-list ref] wrote:
> > > > > +static int __init liveupdate_early_init(void)
> > > > > +{
> > > > > +	int err;
> > > > > +
> > > > > +	err = luo_early_startup();
> > > > > +	if (err) {
> > > > > +		pr_err("The incoming tree failed to initialize properly [%pe], disabling live update\n",
> > > > > +		       ERR_PTR(err));
> > > >
> > > > How do we report this to the userspace? I think the decision what
> > > > to do in this case belongs there. Even if it's down to choosing
> > > > between plain kexec and full reboot, it's still a policy that
> > > > should be implemented in userspace.
> > >
> > > I agree that policy belongs in userspace, and that is how we
> > > designed it. In this specific failure case (ABI mismatch or corrupt
> > > FDT), the preserved state is unrecoverable by the kernel. We cannot
> > > parse the incoming data, so we cannot offer it to userspace.
> > >
> > > We report this state by not registering the /dev/liveupdate device.
> > > When the userspace agent attempts to initialize, it receives ENOENT.
> > > At that point, the agent exercises its policy:
> > >
> > > - Check dmesg for the specific error and report the failure to the
> > >   fleet control plane.
> >
> > Hmm, this is not nice. I think we still should register
> > /dev/liveupdate and let userspace discover this error via
> > /dev/liveupdate ABIs.
>
> Not registering the device is the correct approach here for two
> reasons:
>
> 1. This follows the standard Linux driver pattern. If a driver fails
>    to initialize its underlying resources (hardware, firmware, or in
>    this case, the incoming FDT), it does not register a character
>    device.
>
> 2. Registering a "zombie" device that exists solely to return errors
>    adds significant complexity. We would need to introduce a specific
>    "broken" state to the state machine and add checks to IOCTLs to
>    reject commands with a specific error code.
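For reference, the userspace side of the "device is not registered" scheme described above could be sketched roughly as follows. This is only an illustration: luo_probe() and the LUO_PROBE_* names are hypothetical and not part of the patch set or of any kernel ABI; the only thing it relies on is that open() of a missing device node fails with ENOENT.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of any kernel ABI. */
enum luo_probe_result {
	LUO_PROBE_AVAILABLE,	/* device node opened successfully */
	LUO_PROBE_ABSENT,	/* open() failed with ENOENT */
	LUO_PROBE_ERROR,	/* any other failure; errno is left as set by open() */
};

/* Try to open the device node at @path and classify the outcome. */
enum luo_probe_result luo_probe(const char *path)
{
	int fd = open(path, O_RDWR);

	if (fd >= 0) {
		close(fd);
		return LUO_PROBE_AVAILABLE;
	}

	return errno == ENOENT ? LUO_PROBE_ABSENT : LUO_PROBE_ERROR;
}
```

An agent would call luo_probe("/dev/liveupdate") at startup. Note that LUO_PROBE_ABSENT only says the node is missing; the errno by itself does not tell the agent why.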
You can avoid that complexity if you register the device with a different
fops, but that's a technicality.

Your point about treating the incoming FDT as an underlying resource that
failed to initialize makes sense, but userspace nevertheless needs a
reliable way to detect it, and parsing dmesg is not something we should
rely on.
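To illustrate the "different fops" idea without touching the state machine, here is a small userspace model, not kernel code. struct demo_fops is a simplified, hypothetical stand-in for struct file_operations, only an ioctl hook is modelled, and the choice of -EIO as the error is an assumption made for the sketch.

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for the kernel's struct file_operations. */
struct demo_fops {
	long (*ioctl)(unsigned int cmd);
};

/* Normal path: a real handler would dispatch on cmd; this one just succeeds. */
static long normal_ioctl(unsigned int cmd)
{
	(void)cmd;
	return 0;
}

/* Failed path: every command is rejected with one fixed error (assumed -EIO). */
static long failed_ioctl(unsigned int cmd)
{
	(void)cmd;
	return -EIO;
}

static const struct demo_fops normal_fops = { .ioctl = normal_ioctl };
static const struct demo_fops failed_fops = { .ioctl = failed_ioctl };

/* Pick the table once, based on the result of early initialization. */
const struct demo_fops *demo_select_fops(int early_err)
{
	return early_err ? &failed_fops : &normal_fops;
}
```

The point of the model is that the normal handlers never need a "broken" check: the decision is made once, when the table is selected.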
> Pasha
-- 
Sincerely yours,
Mike.