Mon, Apr 02, 2018 at 02:30:45PM CEST, rahul.lakkireddy@chelsio.com wrote:
On Monday, April 04/02/18, 2018 at 14:41:43 +0530, Jiri Pirko wrote:
Fri, Mar 30, 2018 at 08:42:00PM CEST, ebiederm@xmission.com wrote:
Rahul Lakkireddy [off-list ref] writes:
On Friday, March 03/30/18, 2018 at 16:09:07 +0530, Jiri Pirko wrote:
Sat, Mar 24, 2018 at 11:56:33AM CET, rahul.lakkireddy@chelsio.com wrote:
Add a new module crashdd that exports the /sys/kernel/crashdd/
directory in the second kernel, containing collected hardware/firmware
dumps.
The sequence of actions performed by device drivers to append their
device-specific hardware/firmware logs to the /sys/kernel/crashdd/
directory is as follows:
1. During probe (before hardware is initialized), device drivers
register with the crashdd module (via crashdd_add_dump()), providing a
callback function along with the buffer size and log name needed for
firmware/hardware log collection.
2. Crashdd creates a driver's directory under
/sys/kernel/crashdd/<driver>. Then, it allocates the buffer with
This smells. I need to identify the exact ASIC instance that produced
the dump. Identifying by driver name does not help me if I have multiple
instances of the same driver. This looks wrong to me. This looks like
a job for devlink, where you have one devlink instance per ASIC instance.
Please see:
http://patchwork.ozlabs.org/project/netdev/list/?series=36524
I believe that the solution in the patchset could be used for
your use case too.
The sysfs approach proposed here has been dropped in favour of exporting
the dumps as ELF notes in /proc/vmcore.
I will be posting the new patches soon.
The concern was actually how you identify which device the dump came
from. Where you read the identifier changes, but whether it is sysfs or
/proc/vmcore, the concern remains valid.
Yeah. I still don't see how you link the dump and the device.
In our case, the dump and the device are being identified by the
driver’s name followed by its corresponding pci bus id. I’ve posted an
example in my v3 series:
https://www.spinics.net/lists/netdev/msg493781.html
Here’s an extract from the link above:
# readelf -n /proc/vmcore

Displaying notes found at file offset 0x00001000 with length 0x04003288:
  Owner                        Data size   Description
  VMCOREDD_cxgb4_0000:02:00.4  0x02000fd8  Unknown note type: (0x00000700)
  VMCOREDD_cxgb4_0000:04:00.4  0x02000fd8  Unknown note type: (0x00000700)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  CORE                         0x00000150  NT_PRSTATUS (prstatus structure)
  VMCOREINFO                   0x0000074f  Unknown note type: (0x00000000)

Here, for my two devices, the dump names are
VMCOREDD_cxgb4_0000:02:00.4 and VMCOREDD_cxgb4_0000:04:00.4.
It’s really up to the callers to write their own unique name for the
dump. The name is appended to the “VMCOREDD_” string.
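
To illustrate the naming scheme described above, here is a minimal Python
sketch (not code from the patch series) that builds such a note name and
splits one back into its parts. The helper names make_dump_name and
parse_vmcoredd_owner are hypothetical, and the sketch assumes the caller
chose the "<driver>_<pci bus id>" form shown in the example; since callers
are free to pick any unique name, other drivers may not follow it.

```python
PREFIX = "VMCOREDD_"


def make_dump_name(driver, pci_addr):
    """Build a note name in the form used by the cxgb4 example."""
    return f"{PREFIX}{driver}_{pci_addr}"


def parse_vmcoredd_owner(owner):
    """Split a note owner into (driver, device), or return None.

    Assumption: the device identifier contains no underscore, so the
    last underscore separates it from the driver name. This keeps
    driver names that themselves contain underscores intact.
    """
    if not owner.startswith(PREFIX):
        return None
    driver, sep, device = owner[len(PREFIX):].rpartition("_")
    if not sep or not driver or not device:
        return None
    return driver, device


if __name__ == "__main__":
    name = make_dump_name("cxgb4", "0000:02:00.4")
    print(name)
    print(parse_vmcoredd_owner(name))
```

Splitting on the last underscore is a design choice for this sketch only;
it works for PCI addresses but would need revisiting for device names that
contain underscores.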
Rahul, did you look at the patchset I pointed out?
For devlink, I think the dump name would be identified by
bus_type/device_name; i.e. “pci/0000:02:00.4” for my example.
Is my understanding correct?
Yes.
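
As a rough sketch of how the two identifiers discussed here relate, the
following Python snippet derives a devlink-style bus_type/device_name
handle from a VMCOREDD note name. The function name
devlink_handle_from_owner is hypothetical, and the bus type is an
assumption passed in by the caller (defaulting to "pci"), because the note
name in the example does not encode it.

```python
def devlink_handle_from_owner(owner, bus_type="pci"):
    """Map e.g. VMCOREDD_cxgb4_0000:02:00.4 to pci/0000:02:00.4.

    Assumptions: the note name has the "<driver>_<device>" form from
    the cxgb4 example, and the bus type is supplied by the caller.
    """
    prefix = "VMCOREDD_"
    if not owner.startswith(prefix):
        raise ValueError(f"not a VMCOREDD note name: {owner!r}")
    # The text after the last underscore is taken as the device name.
    _driver, sep, device = owner[len(prefix):].rpartition("_")
    if not sep or not device:
        raise ValueError(f"no device identifier in: {owner!r}")
    return f"{bus_type}/{device}"


if __name__ == "__main__":
    print(devlink_handle_from_owner("VMCOREDD_cxgb4_0000:02:00.4"))
```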
Thanks,
Rahul