Re: [PATCH for-next 4/4] devlink: add health command support
From: Jiri Pirko <jiri@resnulli.us>
Date: 2019-02-11 10:50:44
Sun, Feb 10, 2019 at 07:28:49PM CET, ayal@mellanox.com wrote:
This patch adds support for the following commands:
devlink health show [DEV reporter REPORTER_NAME]
devlink health recover DEV reporter REPORTER_NAME
devlink health diagnose DEV reporter REPORTER_NAME
devlink health dump show DEV reporter REPORTER_NAME
devlink health dump clear DEV reporter REPORTER_NAME
devlink health set DEV reporter REPORTER_NAME NAME VALUE
* show: Devlink health show displays status and configuration info for a
specific reporter on a device, or dumps the info for all reporters on all
devices.
* recover: Devlink health recover enables the user to initiate a
recovery on a reporter. This operation will increment the recoveries
counter displayed in the show command.
* diagnose: Devlink health diagnose enables the user to retrieve diagnostics data
from a reporter on a device. The command's output is free-form text defined
by the reporter.
* dump show: Devlink health dump show displays the last saved dump. Devlink
health saves a single dump. If no dump is already stored by
devlink for this reporter, devlink generates a new dump. The
dump can be generated automatically when a reporter reports
an error, or manually at the user's request.
Dump output is defined by the reporter.
* dump clear: Devlink health dump clear deletes the last saved dump file.
* set: Devlink health set enables the user to configure:
1) grace_period [msec]: time interval between auto recoveries.
2) auto_recover [true/false]: whether devlink should execute
an automatic recovery on error.
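The interaction between grace_period, auto_recover, and the counters that show prints can be sketched as below. This is only an illustrative model, not the kernel implementation: the Reporter class and its fields are hypothetical, and it assumes that an automatic recovery is skipped when the previous recovery happened less than grace_period milliseconds ago.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reporter:
    """Hypothetical model of one health reporter (not the kernel structure)."""
    name: str
    grace_period_ms: int = 600
    auto_recover: bool = True
    errors: int = 0
    recoveries: int = 0
    last_recovery_ms: Optional[int] = None

    def recover(self, now_ms: int) -> None:
        # Any recovery, manual or automatic, bumps the recoveries counter
        # that the show command displays.
        self.recoveries += 1
        self.last_recovery_ms = now_ms

    def report_error(self, now_ms: int) -> bool:
        """Record an error; return True if an automatic recovery was run."""
        self.errors += 1
        if not self.auto_recover:
            return False
        # Assumption: auto recovery is skipped while the grace period
        # since the previous recovery has not yet elapsed.
        if (self.last_recovery_ms is not None
                and now_ms - self.last_recovery_ms < self.grace_period_ms):
            return False
        self.recover(now_ms)
        return True
```

Under this model, two errors reported 100 msec apart with grace_period 600 lead to a single automatic recovery; the second error is only counted.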
Examples:
$devlink health show pci/0000:00:09.0 reporter tx
pci/0000:00:09.0:
name tx
state healthy #err 0 #recover 1 last_dump_ts N/A
parameters:
grace period 600 auto_recover true
$devlink health diagnose pci/0000:00:09.0 reporter tx
SQs:
sqn: 4283 HW state: 1 stopped: false
sqn: 4288 HW state: 1 stopped: false
sqn: 4293 HW state: 1 stopped: false
sqn: 4298 HW state: 1 stopped: false
sqn: 4303 HW state: 1 stopped: false
$devlink health dump show pci/0000:00:09.0 reporter tx
TX dump data
$devlink health dump clear pci/0000:00:09.0 reporter tx
$devlink health set pci/0000:00:09.0 reporter tx grace_period 3500
$devlink health set pci/0000:00:09.0 reporter tx auto_recover false
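For scripting these commands, a small helper that assembles the argument vector may be handy. The sketch below only builds the argv lists for the command forms listed in this patch description; the function name health_argv is hypothetical, and nothing is executed here.

```python
def health_argv(action, dev=None, reporter=None, **params):
    """Build argv for 'devlink health ACTION [DEV reporter NAME] [NAME VALUE]'."""
    # Only "show" may omit the device/reporter pair (it then lists everything).
    if (dev is None) != (reporter is None):
        raise ValueError("dev and reporter must be given together")
    if dev is None and action != "show":
        raise ValueError(f"'{action}' requires dev and reporter")

    # "dump show" and "dump clear" are two words on the command line.
    argv = ["devlink", "health", *action.split()]
    if dev is not None:
        argv += [dev, "reporter", reporter]
    # NAME VALUE pairs for "set"; booleans are spelled true/false.
    for name, value in params.items():
        argv += [name, str(value).lower()]
    return argv
```

The resulting list can be handed to subprocess.run() on a system whose devlink binary includes this patch, for example health_argv("set", "pci/0000:00:09.0", "tx", grace_period=3500).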
Signed-off-by: Aya Levin <redacted>
Reviewed-by: Moshe Shemesh <redacted>
---
devlink/devlink.c | 551 ++++++++++++++++++++++++++++++++++++++++++-
include/uapi/linux/devlink.h | 23 ++
man/man8/devlink-health.8 | 176 ++++++++++++++
man/man8/devlink.8 | 7 +-
4 files changed, 755 insertions(+), 2 deletions(-)

755 lines is too much for one patch. For easier review, please split this patch into a separate patchset, preferably one patch per command.