Thread: 13 messages, 4 authors, 2020-05-26

Re: devlink interface for asynchronous event/messages from firmware?

From: Jacob Keller <jacob.e.keller@intel.com>
Date: 2020-05-21 20:22:36


On 5/20/2020 5:16 PM, Jakub Kicinski wrote:
> On Wed, 20 May 2020 17:03:02 -0700 Jacob Keller wrote:
>> Hi Jiri, Jakub,
>>
>> I've been asked to investigate using devlink as a mechanism for
>> reporting asynchronous events/messages from firmware including
>> diagnostic messages, etc.
>>
>> Essentially, the ice firmware can report various status or diagnostic
>> messages which are useful for debugging internal behavior. We want to be
>> able to get these messages (and relevant data associated with them) in a
>> format beyond just "dump it to the dmesg buffer and recover it later".
>>
>> It seems like this would be an appropriate use of devlink. I thought
>> maybe this would work with devlink health:
>>
>> i.e. we create a devlink health reporter, and then when firmware sends a
>> message, we use devlink_health_report.
>>
>> But when I dug into this, it doesn't seem like a natural fit. The health
>> reporters expect to see an "error" state, and don't seem to really fit
>> the notion of "log a message from firmware" notion.
>>
>> One of the issues is that the health reporter only keeps one dump, when
>> what we really want is a way to have a monitoring application get the
>> dump and then store its contents.
>>
>> Thoughts on what might make sense for this? It feels like a stretch of
>> the health interface...
>>
>> I mean basically what I am thinking of having is using the devlink_fmsg
>> interface to just send a netlink message that then gets sent over the
>> devlink monitor socket and gets dumped immediately.
>
> Why does user space need a raw firmware interface in the first place?
>
> Examples?

So the ice firmware can optionally send diagnostic debug messages via
its control queue. The current solutions we've used internally
essentially hex-dump the binary contents to the kernel log, and then
these get scraped and converted into a useful format for human consumption.
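
For illustration only, the hex-dump step amounts to something like the
following userspace C sketch. The function name fw_msg_hexdump and the
flat byte-buffer layout are made up for this example and are not taken
from the ice driver; in the kernel the driver would print the bytes to
the log rather than format them into a caller-supplied string.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format len bytes of buf as space-separated hex
 * into out. Returns the number of characters written (excluding the
 * terminating NUL), or -1 if out is too small.
 */
int fw_msg_hexdump(const uint8_t *buf, size_t len, char *out, size_t out_len)
{
	size_t pos = 0;
	size_t i;

	if (out_len == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < len; i++) {
		/* Two hex digits plus room for a separator or the final NUL. */
		if (pos + 3 > out_len)
			return -1;
		snprintf(out + pos, out_len - pos, "%02x", buf[i]);
		pos += 2;
		if (i + 1 < len)
			out[pos++] = ' ';
	}
	out[pos] = '\0';
	return (int)pos;
}
```

The output carries no meaning on its own, which is why a separate
scraping and decoding step is needed afterwards.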

I'm not 100% sure of the format, but I know it's based on a decoding file
that is specific to a given firmware image, and thus attempting to tie
this into the driver is problematic.

There is also a plan to provide a simpler interface for some of the
diagnostic messages, where there is a simple one-to-one mapping from a
code to a message for a handful of events, such as when the link engine
can detect a known reason why it wasn't able to get link. I suppose these
could be translated and immediately printed by the driver without a
special interface.
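
As a rough sketch of that one-code-to-one-message idea, a lookup table in
C might look like the following. The event codes, enum names, and message
strings are hypothetical placeholders, not actual ice firmware values,
and fw_event_to_str is an invented helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event codes; the real firmware values are not given here. */
enum fw_link_event {
	FW_LINK_EVT_NO_MEDIA = 1,
	FW_LINK_EVT_UNSUPPORTED_MODULE = 2,
	FW_LINK_EVT_AUTONEG_FAILED = 3,
};

struct fw_event_desc {
	uint16_t code;
	const char *msg;
};

/* One entry per known code: each code maps to exactly one message. */
static const struct fw_event_desc fw_event_table[] = {
	{ FW_LINK_EVT_NO_MEDIA, "no media detected" },
	{ FW_LINK_EVT_UNSUPPORTED_MODULE, "unsupported module" },
	{ FW_LINK_EVT_AUTONEG_FAILED, "autonegotiation failed" },
};

/* Return the message for a code, or NULL if the code is unknown. */
const char *fw_event_to_str(uint16_t code)
{
	size_t i;

	for (i = 0; i < sizeof(fw_event_table) / sizeof(fw_event_table[0]); i++) {
		if (fw_event_table[i].code == code)
			return fw_event_table[i].msg;
	}
	return NULL;
}
```

Because the table is fixed and small, it would not depend on a
per-firmware-image decoding file the way the full debug messages do.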

-Jake