Re: [PATCH 08/14] peci: Add device detection
From: Zev Weiss <hidden>
Date: 2021-07-29 21:30:21
Also in:
linux-arm-kernel, linux-aspeed, linux-doc, linux-hwmon, lkml, openbmc
On Thu, Jul 29, 2021 at 01:55:19PM CDT, Winiarska, Iwona wrote:
On Tue, 2021-07-27 at 17:49 +0000, Zev Weiss wrote:quoted
On Mon, Jul 12, 2021 at 05:04:41PM CDT, Iwona Winiarska wrote:quoted
Since PECI devices are discoverable, we can dynamically detect devices that are actually available in the system. This change complements the earlier implementation by rescanning PECI bus to detect available devices. For this purpose, it also introduces the minimal API for PECI requests. Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com> Reviewed-by: Pierre-Louis Bossart <redacted> --- drivers/peci/Makefile | 2 +- drivers/peci/core.c | 13 ++++- drivers/peci/device.c | 111 ++++++++++++++++++++++++++++++++++++++++ drivers/peci/internal.h | 15 ++++++ drivers/peci/request.c | 74 +++++++++++++++++++++++++++ drivers/peci/sysfs.c | 34 ++++++++++++ 6 files changed, 246 insertions(+), 3 deletions(-) create mode 100644 drivers/peci/device.c create mode 100644 drivers/peci/request.cdiff --git a/drivers/peci/Makefile b/drivers/peci/Makefile index 621a993e306a..917f689e147a 100644 --- a/drivers/peci/Makefile +++ b/drivers/peci/Makefile@@ -1,7 +1,7 @@# SPDX-License-Identifier: GPL-2.0-only # Core functionality -peci-y := core.o sysfs.o +peci-y := core.o request.o device.o sysfs.o obj-$(CONFIG_PECI) += peci.o # Hardware specific bus driversdiff --git a/drivers/peci/core.c b/drivers/peci/core.c index 0ad00110459d..ae7a9572cdf3 100644 --- a/drivers/peci/core.c +++ b/drivers/peci/core.c@@ -31,7 +31,15 @@ struct device_type peci_controller_type = {int peci_controller_scan_devices(struct peci_controller *controller) { - /* Just a stub, no support for actual devices yet */ + int ret; + u8 addr; + + for (addr = PECI_BASE_ADDR; addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; addr++) { + ret = peci_device_create(controller, addr); + if (ret) + return ret; + } + return 0; }@@ -106,7 +114,8 @@ EXPORT_SYMBOL_NS_GPL(peci_controller_add, PECI);static int _unregister(struct device *dev, void *dummy) { - /* Just a stub, no support for actual devices yet */ + peci_device_destroy(to_peci_device(dev)); + return 0; }diff --git a/drivers/peci/device.c b/drivers/peci/device.c new file mode 100644 index 000000000000..1124862211e2 --- /dev/null +++ b/drivers/peci/device.c@@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) 2018-2021 Intel Corporation + +#include <linux/peci.h> +#include <linux/slab.h> + +#include "internal.h" + +static int peci_detect(struct peci_controller *controller, u8 addr) +{ + struct peci_request *req; + int ret; + + req = peci_request_alloc(NULL, 0, 0); + if (!req) + return -ENOMEM; +Might be worth a brief comment here noting that an empty request happens to be the format of a PECI ping command (and/or change the name of the function to peci_ping()).I'll add a comment: "We are using PECI Ping command to detect presence of PECI devices."
Well, what I was more aiming to get at was that to someone not intimately familiar with the PECI protocol it's not immediately obvious from the code that it in fact implements a ping (there's no 'msg->cmd = PECI_CMD_PING' or anything), so I was hoping for something that would just make that slightly more explicit.
quoted
quoted
+ mutex_lock(&controller->bus_lock); + ret = controller->xfer(controller, addr, req); + mutex_unlock(&controller->bus_lock); + + peci_request_free(req); + + return ret; +} + +static bool peci_addr_valid(u8 addr) +{ + return addr >= PECI_BASE_ADDR && addr < PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX; +} + +static int peci_dev_exists(struct device *dev, void *data) +{ + struct peci_device *device = to_peci_device(dev); + u8 *addr = data; + + if (device->addr == *addr) + return -EBUSY; + + return 0; +} + +int peci_device_create(struct peci_controller *controller, u8 addr) +{ + struct peci_device *device; + int ret; + + if (WARN_ON(!peci_addr_valid(addr))) + return -EINVAL;Wondering about the necessity of this check (and the peci_addr_valid() function) -- as of the end of this patch series, there's only one caller of peci_device_create(), and it's peci_controller_scan_devices() looping from PECI_BASE_ADDR to PECI_BASE_ADDR + PECI_DEVICE_NUM_MAX, so checking that the address is in that range seems a bit redundant. Do we anticipate that we might gain additional callers in the future that could run a non-zero risk of passing a bad address?It's just a sanity check to avoid any surprises if the code changes in the future.quoted
quoted
+ + /* Check if we have already detected this device before. */ + ret = device_for_each_child(&controller->dev, &addr, peci_dev_exists); + if (ret) + return 0; + + ret = peci_detect(controller, addr); + if (ret) { + /* + * Device not present or host state doesn't allow successful + * detection at this time. + */ + if (ret == -EIO || ret == -ETIMEDOUT) + return 0;Do we really want to be ignoring EIO here? From a look at aspeed_peci_xfer(), it looks like the only path that would produce that is the non-timeout, non-CMD_DONE case, which I guess happens on contention or FCS errors and such. Should we maybe have some automatic (limited) retry loop for cases like those?Yes, we want to ignore EIO here. It may be returned when we get "Bad Write FCS", after we try to ping non- existing PECI device.quoted
quoted
+ + return ret; + } + + device = kzalloc(sizeof(*device), GFP_KERNEL); + if (!device) + return -ENOMEM; + + device->controller = controller; + device->addr = addr; + device->dev.parent = &device->controller->dev; + device->dev.bus = &peci_bus_type; + device->dev.type = &peci_device_type; + + ret = dev_set_name(&device->dev, "%d-%02x", controller->id, device-quoted
addr);+ if (ret) + goto err_free; + + ret = device_register(&device->dev); + if (ret) + goto err_put; + + return 0; + +err_put: + put_device(&device->dev); +err_free: + kfree(device); + + return ret; +} + +void peci_device_destroy(struct peci_device *device) +{ + device_unregister(&device->dev); +} + +static void peci_device_release(struct device *dev) +{ + struct peci_device *device = to_peci_device(dev); + + kfree(device); +} + +struct device_type peci_device_type = { + .groups = peci_device_groups, + .release = peci_device_release, +};diff --git a/drivers/peci/internal.h b/drivers/peci/internal.h index 80c61bcdfc6b..6b139adaf6b8 100644 --- a/drivers/peci/internal.h +++ b/drivers/peci/internal.h@@ -9,6 +9,21 @@struct peci_controller; struct attribute_group; +struct peci_device; +struct peci_request; + +/* PECI CPU address range 0x30-0x37 */ +#define PECI_BASE_ADDR 0x30 +#define PECI_DEVICE_NUM_MAX 8 + +struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len); +void peci_request_free(struct peci_request *req); + +extern struct device_type peci_device_type; +extern const struct attribute_group *peci_device_groups[]; + +int peci_device_create(struct peci_controller *controller, u8 addr); +void peci_device_destroy(struct peci_device *device); extern struct bus_type peci_bus_type; extern const struct attribute_group *peci_bus_groups[];diff --git a/drivers/peci/request.c b/drivers/peci/request.c new file mode 100644 index 000000000000..78cee51dfae1 --- /dev/null +++ b/drivers/peci/request.c@@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) 2021 Intel Corporation + +#include <linux/export.h> +#include <linux/peci.h> +#include <linux/slab.h> +#include <linux/types.h> + +#include "internal.h" + +/** + * peci_request_alloc() - allocate &struct peci_request with buffers withgiven lengths + * @device: PECI device to which request is going to be sent + * @tx_len: requested TX buffer length + * @rx_len: requested RX buffer length + * + * Return: A pointer to a newly allocated &struct peci_request on success or NULL otherwise. + */ +struct peci_request *peci_request_alloc(struct peci_device *device, u8 tx_len, u8 rx_len) +{ + struct peci_request *req; + u8 *tx_buf, *rx_buf; + + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return NULL; + + req->device = device; + + /* + * PECI controllers that we are using now don't support DMA, this + * should be converted to DMA API once support for controllers that do + * allow it is added to avoid an extra copy. + */ + if (tx_len) { + tx_buf = kzalloc(tx_len, GFP_KERNEL); + if (!tx_buf) + goto err_free_req; + + req->tx.buf = tx_buf; + req->tx.len = tx_len; + } + + if (rx_len) { + rx_buf = kzalloc(rx_len, GFP_KERNEL); + if (!rx_buf) + goto err_free_tx; + + req->rx.buf = rx_buf; + req->rx.len = rx_len; + } +As long as we're punting on DMA support, could we do the whole thing in a single allocation instead of three? It'd add some pointer arithmetic, but would also simplify the error-handling/deallocation paths a bit. Or, given that the one controller we're currently supporting has a hardware limit of 32 bytes per transfer anyway, maybe just inline fixed-size rx/tx buffers into struct peci_request and have callers keep them on the stack instead of kmalloc()-ing them?I disagree on error handling (it's not complicated) - however, one argument for doing a single alloc (or moving the buffers as fixed-size arrays inside struct peci_request) is that single kzalloc is going to be faster than 3. But I don't expect it to show up on any perf profiles for now (since peci-wire interface is not a speed demon). I wanted to avoid defining max size for TX and RX in peci-core. Do you have a strong opinion against multiple alloc? If yes, I can go with fixed-size arrays inside struct peci_request.
As is it's certainly not terribly complicated in an absolute sense, but comparatively speaking the cleanup path for a single allocation is still simpler, no? Making it more efficient would definitely be a nice benefit too (perhaps a more significant one) -- in a typical deployment I'd guess this code path will see roughly socket_count + total_core_count executions per second? On a big multi-socket system that could end up being a reasonably large number (>100), so while it may not end up as a major hot spot in a system-wide profile, it seems like it might be worth having it do 1/3 as many allocations if it's reasonably easy to do. (And while I don't think the kernel is generally at fault for this, from what I've seen of OpenBMC as a whole I think it might benefit from a bit more overall frugality with CPU cycles.) As for a fixed max request size and inlined buffers, I definitely understand not wanting to put a cap on that in the generic PECI core -- and actually, looking at the peci-npcm code from previous iterations of the PECI patchset, it looks like the Nuvoton hardware has significantly larger size limits (127 bytes if I'm reading things right) that might be a bit bulky for on-stack allocation. So while that's appealing efficiency-wise and (IMO) aesthetically, perhaps it's not ultimately real viable. Hmm, though (thinking out loud) I suppose we could also get down to a zero-allocation common case by having the driver hold on to a request struct and reuse it across transfers, given that they're all serialized by a mutex anyway?
Thanks -Iwonaquoted
quoted
+ return req; + +err_free_tx: + kfree(req->tx.buf); +err_free_req: + kfree(req); + + return NULL; +} +EXPORT_SYMBOL_NS_GPL(peci_request_alloc, PECI); + +/** + * peci_request_free() - free peci_request + * @req: the PECI request to be freed + */ +void peci_request_free(struct peci_request *req) +{ + kfree(req->rx.buf); + kfree(req->tx.buf); + kfree(req); +} +EXPORT_SYMBOL_NS_GPL(peci_request_free, PECI);diff --git a/drivers/peci/sysfs.c b/drivers/peci/sysfs.c index 36c5e2a18a92..db9ef05776e3 100644 --- a/drivers/peci/sysfs.c +++ b/drivers/peci/sysfs.c@@ -1,6 +1,8 @@// SPDX-License-Identifier: GPL-2.0-only // Copyright (c) 2021 Intel Corporation +#include <linux/device.h> +#include <linux/kernel.h> #include <linux/peci.h> #include "internal.h"@@ -46,3 +48,35 @@ const struct attribute_group *peci_bus_groups[] = {&peci_bus_group, NULL }; + +static ssize_t remove_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct peci_device *device = to_peci_device(dev); + bool res; + int ret; + + ret = kstrtobool(buf, &res); + if (ret) + return ret; + + if (res && device_remove_file_self(dev, attr)) + peci_device_destroy(device); + + return count; +} +static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0200, NULL, remove_store); + +static struct attribute *peci_device_attrs[] = { + &dev_attr_remove.attr, + NULL +}; + +static const struct attribute_group peci_device_group = { + .attrs = peci_device_attrs, +}; + +const struct attribute_group *peci_device_groups[] = { + &peci_device_group, + NULL +}; -- 2.31.1