Thread: 14 messages, 3 authors, 2021-03-05

Re: [PATCH] pci-driver: Add driver load messages

From: Prarit Bhargava <hidden>
Date: 2021-01-26 13:44:02
Also in: linux-pci


On 1/26/21 8:14 AM, Leon Romanovsky wrote:
On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
quoted
  Leon Romanovsky [off-list ref] wrote:
quoted
On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
quoted
There are two situations where driver load messages are helpful.

1) Some drivers silently load on devices and debugging driver or system
failures in these cases is difficult.  While some drivers (networking
for example) may not completely initialize when the PCI driver probe() function
has returned, it is still useful to have some idea of driver completion.
Sorry, it is probably me, but I don't understand this use case.
Are you adding a global kernel command-line boot argument to debug
what, and when?

During boot:
If the device succeeds, you will see it in /sys/bus/pci/[drivers|devices]/*.
If the device fails, you should get an error from that device (fix the
device to return an error), or something immediately won't work and
you won't see it in sysfs.
What if there is a panic during boot?  There's no way to get to sysfs.
That's the case where this is helpful.
How? If you have a kernel panic, you have a much worse problem
than an unsupported device. If the kernel panic was caused by the driver,
you will see a call trace related to it. If the kernel panic was caused by
something else, supported/not supported won't help here.
I still have no idea *WHICH* device it was that the panic occurred on.
quoted
quoted
During run:
We have many other ways to get debug prints at run time, for example
tracing, which can be toggled dynamically.

Right now, my laptop would print 34 messages on boot and an endless number
during day-to-day usage.

➜  kernel git:(rdma-next) ✗ lspci |wc -l
34
quoted
2) Storage and Network device vendors have relatively short lives for
some of their hardware.  Some devices may continue to function but are
problematic due to out-of-date firmware or other issues.  Maintaining
a database of the hardware is out-of-the-question in the kernel as it would
require constant updating.  Outputting a message in the log would allow
different OSes to determine if the problem hardware was truly supported or not.
And rely on some dmesg output as the true source of supported/not supported,
making this an ABI that needs a knob on the command line?
Yes.  The console log being saved would work as a true source of load
messages to be interpreted by an OS tool.  But I see your point about the
knob below...
You will need a much stronger claim than the above if you want to proceed
down the ABI path through dmesg prints.
See my answer below.  I agree with you on the ABI statement.
quoted
quoted
quoted
Add optional driver load messages from the PCI core that indicate which
driver was loaded, on which slot, and on which device.
Why don't you add a simple pr_debug(..) without any knob? You will be able
to enable/disable it through the dynamic debug facility.
Good point.  I'll wait for more feedback and submit a v2 with pr_debug.
Just to be clear, none of this can be ABI, and any kernel print can
be changed or removed at any time without any announcement.
Yes, that's absolutely the case and I agree with you that nothing can guarantee
ABI of those pr_debug() statements.  They are *debug* after all.
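
The pr_debug idea discussed above can be modelled in user space. This is only a
sketch, not the v2 patch and not kernel code: load_msg_enabled,
format_load_msg() and the message format are hypothetical names chosen for
illustration. The boolean stands in for the kernel's dynamic debug switch, and
the message carries the three items the patch description lists (driver, slot,
device).

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical runtime switch. In the kernel, pr_debug() call sites are
 * toggled through the dynamic debug facility instead of a plain variable.
 */
static bool load_msg_enabled = false;

/*
 * Format a driver-load message into buf when the switch is on.
 * Returns the snprintf() result when enabled, or 0 when disabled.
 * The message layout is an assumption, not the format used by the patch.
 */
static int format_load_msg(char *buf, size_t len, const char *driver,
                           const char *slot, unsigned vendor, unsigned device)
{
    if (!load_msg_enabled)
        return 0; /* silent by default, like a disabled pr_debug() */

    return snprintf(buf, len, "pci %s: driver %s bound to device [%04x:%04x]",
                    slot, driver, vendor, device);
}
```

With the switch off nothing is produced, which avoids the 34-lines-per-boot
concern; with it on, each bind yields one line naming the driver, slot and
device. Because the output is debug-only, its wording carries no ABI promise.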

P.
Thanks
quoted
P.
quoted
Thanks
  