Thread (34 messages) 34 messages, 4 authors, 2018-10-28

Re: [PATCH 2/2] virtio: fix PCI config err handling

From: Luca Boccassi <hidden>
Date: 2018-08-16 10:27:36

On Thu, 2018-08-16 at 14:46 +0800, Tiwei Bie wrote:
On Wed, Aug 15, 2018 at 10:50:57AM +0100, Luca Boccassi wrote:
quoted
On Wed, 2018-08-15 at 11:11 +0800, Tiwei Bie wrote:
quoted
On Tue, Aug 14, 2018 at 03:30:35PM +0100, Luca Boccassi wrote:
quoted
From: Brian Russell <redacted>

In virtio_read_caps, rte_pci_read_config returns the number of
bytes
read from PCI config or < 0 on error.
If less than the expected number of bytes are read then log the
failure and return rather than carrying on with garbage.
Is this a fix or an improvement?
Or did you see anything broken without this patch?
If so, we may need a fixes line and Cc stable.
It is a fix, as it was creating problems in production due to the
constant flux of errors in the logs.
Could you be a bit more specific about which errors
were logged if possible?

If my understanding is correct, you mean the errors
were logged because less than the required amount of
bytes were read?
Yes - rte_pci_read_config on Linux will return not just 0/-1, but the
actual number of bytes read. If it's less than the required amount, the
code then goes on and reads garbage, which causes errors later in the
execution. Checking that we actually got the amount of data we need
fixes this issue.
quoted
But given patch 1/2 is effectively doing a small change in the BSD
bus
API, and it's a requirement for 2/2, I don't think we can include
it in
the stable releases unfortunately.
If it's a fix, we need a fixes line.
Sure, will send a v2.
quoted
quoted
quoted
[...]
quoted
quoted
quoted
@@ -567,16 +567,18 @@ virtio_read_caps(struct rte_pci_device
*dev,
struct virtio_hw *hw)
 	}
 
 	ret = rte_pci_read_config(dev, &pos, 1,
PCI_CAPABILITY_LIST);
-	if (ret < 0) {
-		PMD_INIT_LOG(DEBUG, "failed to read pci
capability list");
+	if (ret != 1) {
+		PMD_INIT_LOG(DEBUG,
+			     "failed to read pci capability
list, ret %d", ret);
 		return -1;
 	}
 
 	while (pos) {
 		ret = rte_pci_read_config(dev, &cap,
sizeof(cap), pos);
-		if (ret < 0) {
-			PMD_INIT_LOG(ERR,
-				"failed to read pci cap at
pos: %x", pos);
+		if (ret != sizeof(cap)) {
Above code has to successfully read a full virtio
PCI capability during each read, otherwise it will
give up reading other capabilities and may fallback
to the legacy mode. In which case it will fail to
read the requested amount of bytes? Should we try
to read the generic PCI fields first?
I do not know what exactly causes less than required bytes to be read,
but we have seen it happen in production (not 100% of the times though
- so I think it's worth keeping the structure as-is). As you said in
that case it falls back to legacy mode which, in our experience in
production deployments, then succeeds. That's why the error level print
is undesired - because the code will actually work via the fallback,
but the customers will see scary errors in the logs and open
escalations :-)
Besides, you also need to update other calls to
rte_pci_read_config(), e.g.:

https://github.com/DPDK/dpdk/blob/76b9d9de5c7d/drivers/net/virtio/vir
tio_pci.c#L696

Thanks
Sure I will apply the same changes in v2.

-- 
Kind regards,
Luca Boccassi
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help