Thread (14 messages), 3 authors, 2025-09-06

Re: [PATCH V0 0/2] Fix CONFIG_HYPERV and vmbus related anamoly

From: Mukesh R <hidden>
Date: 2025-09-05 21:41:58
Also in: linux-arch, linux-fbdev, linux-hyperv, linux-input, linux-pci, linux-scsi, lkml, virtualization

On 9/5/25 13:08, Nuno Das Neves wrote:
On 9/4/2025 11:18 AM, Mukesh R wrote:
On 9/4/25 09:26, Michael Kelley wrote:
From: Mukesh R <redacted> Sent: Wednesday, September 3, 2025 7:17 PM
On 9/2/25 07:42, Michael Kelley wrote:
From: Mukesh Rathor <redacted> Sent: Wednesday, August 27, 2025 6:00 PM
At present, drivers/Makefile will subst =m to =y for CONFIG_HYPERV for hv
subdir. Also, drivers/hv/Makefile replaces =m to =y to build in
hv_common.c that is needed for the drivers. Moreover, vmbus driver is
built if CONFIG_HYPERV is set, either loadable or builtin.
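
For readers unfamiliar with the construct described above, here is a minimal sketch of the kbuild lines in question. It is paraphrased for illustration and is not a verbatim copy of the tree; only the CONFIG_HYPERV symbol, the hv subdir, and hv_common come from the message, and the hv_vmbus.o object name is an assumption.

```make
# drivers/Makefile (sketch): descend into drivers/hv/ even when
# CONFIG_HYPERV=m, by rewriting "m" to "y" in the obj- suffix.
obj-$(subst m,y,$(CONFIG_HYPERV))	+= hv/

# drivers/hv/Makefile (sketch): the VMBus driver follows CONFIG_HYPERV,
# so it is a module for =m and built in for =y.
# (hv_vmbus.o is an assumed object name.)
obj-$(CONFIG_HYPERV)			+= hv_vmbus.o

# hv_common.o is forced built-in with the same substitution, since it
# is needed by the drivers.
obj-$(subst m,y,$(CONFIG_HYPERV))	+= hv_common.o
```

With CONFIG_HYPERV=m, $(subst m,y,m) expands to y, so hv_common.o lands in obj-y (built in) while hv_vmbus.o lands in obj-m (a module). That is why .config can say =m while some drivers/hv symbols still end up in the kernel image.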

This is not a good approach. CONFIG_HYPERV is really an umbrella config that
encompasses builtin code and various other things and not a dedicated config
option for VMBUS. Vmbus should really have a config option just like
CONFIG_HYPERV_BALLOON etc. This small series introduces CONFIG_HYPERV_VMBUS
to build VMBUS driver and make that distinction explicit. With that
CONFIG_HYPERV could be changed to bool.
Separating the core hypervisor support (CONFIG_HYPERV) from the VMBus
support (CONFIG_HYPERV_VMBUS) makes sense to me. Overall the code
is already mostly in separate source files, though there's some
entanglement in the handling of VMBus interrupts, which could be
improved later.

However, I have a compatibility concern. Consider this scenario:

1) Assume running in a Hyper-V VM with a current Linux kernel version
    built with CONFIG_HYPERV=m.
2) Grab a new version of kernel source code that contains this patch set.
3) Run 'make olddefconfig' to create the .config file for the new kernel.
4) Build the new kernel. This succeeds.
5) Install and run the new kernel in the Hyper-V VM. This fails.

The failure occurs because CONFIG_HYPERV=m is no longer legal,
so the .config file created in Step 3 has CONFIG_HYPERV=n. The
newly built kernel has no Hyper-V support and won't run in a
Hyper-V VM.
It surprises me a little that =m doesn't get 'fixed up' to =y in this case.
I guess any invalid value turns to =n, which makes sense most of the time.
As a second issue, if in Step 1 the current kernel was built with
CONFIG_HYPERV=y, then the .config file for the new kernel will have
CONFIG_HYPERV=y, which is better. But CONFIG_HYPERV_VMBUS
defaults to 'n', so the new kernel doesn't have any VMBus drivers
and won't run in a typical Hyper-V VM.

The second issue could be fixed by assigning CONFIG_HYPERV_VMBUS
a default value, such as whatever CONFIG_HYPERV is set to. But
I'm not sure how to fix the first issue, except by continuing to
allow CONFIG_HYPERV=m.
I'm wondering, is there a path for this change, then? Are there any
intermediate steps we could take to minimize the problem?
To a certain extent, imo, users are expected to check config files
for changes when moving to new versions/releases, so it would be a
one-time burden.
I'm not so sanguine about the impact. For those of us who work with
Hyper-V frequently, yes, it's probably not that big of an issue -- we can
figure it out. But a lot of Azure/Hyper-V users aren't that familiar with
the details of how the Kconfig files are put together. And the issue occurs
with no error messages that something has gone wrong in building
the kernel, except that it won't boot. Just running "make olddefconfig"
has worked in the past, so some users will be befuddled and end up
generating Azure support incidents. I also wonder about breaking
automated test suites for new kernels, as they are likely to be running
"make olddefconfig" or something similar as part of the automation.
CONFIG_HYPERV=m is just broken imo, as one sees that
in .config but symbols in drivers/hv are magically in the kernel.
I agree that's not ideal. But note that some Hyper-V code and symbols
like ms_hyperv_init_platform() and related functions show up when
CONFIG_HYPERVISOR_GUEST=y, even if CONFIG_HYPERV=n. That's
the code in arch/x86/kernel/cpu/mshyperv.c and it's because Hyper-V
is one of the recognized and somewhat hardwired hypervisors (like
VMware, for example).

Finally, there are about a dozen other places in the kernel that use
the same Makefile construct to make some code built-in even though
the CONFIG option is set to "m". That may not be enough occurrences
to make it standard practice, but Hyper-V guests are certainly not the
only case.

In my mind, this is a judgment call with no absolute right answer. What
do others think about the tradeoffs?
Wei had said in a private message that he agrees this is a good idea. Nuno
said earlier above:

"FWIW I think it's a good idea, interested to hear what others think."
That was before Michael pointed out the potential issues which I was
unaware of. Let's see if there's a path that is smoother for all the
downstream users who may be compiling with CONFIG_HYPERV=m.
Ok, we've already thought about it for some time and have not been able
to come up with any. IMO, it's a minor hiccup, not a major one. This is
stalling upcoming iommu and other patches which will use CONFIG_HYPERV and
add more dependencies, and it would be much harder to straighten
out then. So I hope you guys can come up with some solution sooner rather
than later; I can't think of any.

Thanks,
-Mukesh
