Thread: 6 messages, 3 authors, 2023-02-09

RE: [PATCH net-next 1/1] hv_netvsc: Check status in SEND_RNDIS_PKT completion message

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: 2023-02-09 19:10:26
Also in: linux-hyperv, lkml

-----Original Message-----
From: Michael Kelley (LINUX) <redacted>
Sent: Thursday, February 9, 2023 12:11 PM
To: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan
[off-list ref]; wei.liu@kernel.org; Dexuan Cui
[off-list ref]; davem@davemloft.net; edumazet@google.com;
kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux-
hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: RE: [PATCH net-next 1/1] hv_netvsc: Check status in
SEND_RNDIS_PKT completion message

From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Thursday, February 9,
2023 5:49 AM
-----Original Message-----
From: Michael Kelley (LINUX) <redacted>
Sent: Wednesday, February 8, 2023 6:50 PM
To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
[off-list ref]; wei.liu@kernel.org; Dexuan Cui
[off-list ref]; davem@davemloft.net; edumazet@google.com;
kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux-
hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
Cc: Michael Kelley (LINUX) <redacted>
Subject: [PATCH net-next 1/1] hv_netvsc: Check status in SEND_RNDIS_PKT
completion message

Completion responses to SEND_RNDIS_PKT messages are currently processed
regardless of the status in the response, so that resources associated
with the request are freed.  While this is appropriate, code bugs that
cause sending a malformed message, or errors on the Hyper-V host, go
undetected. Fix this by checking the status and outputting a message
if there is an error.

Signed-off-by: Michael Kelley <redacted>
---
 drivers/net/hyperv/netvsc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 661bbe6..caf22e9 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -813,6 +813,7 @@ static void netvsc_send_completion(struct net_device *ndev,
 	u32 msglen = hv_pkt_datalen(desc);
 	struct nvsp_message *pkt_rqst;
 	u64 cmd_rqst;
+	u32 status;

 	/* First check if this is a VMBUS completion without data payload */
 	if (!msglen) {
@@ -884,6 +885,22 @@ static void netvsc_send_completion(struct net_device *ndev,
 		break;

 	case NVSP_MSG1_TYPE_SEND_RNDIS_PKT_COMPLETE:
+		if (msglen < sizeof(struct nvsp_message_header) +
+		    sizeof(struct nvsp_1_message_send_rndis_packet_complete)) {
+			netdev_err(ndev, "nvsp_rndis_pkt_complete length too small: %u\n",
+				   msglen);
+			return;
+		}
+
+		/* If status indicates an error, output a message so we know
+		 * there's a problem. But process the completion anyway so the
+		 * resources are released.
+		 */
+		status = nvsp_packet->msg.v1_msg.send_rndis_pkt_complete.status;
+		if (status != NVSP_STAT_SUCCESS)
+			netdev_err(ndev, "nvsp_rndis_pkt_complete error status: %x\n",
+				   status);
+
Could you add a rate limit to this error, so that if it happens frequently, the
errors won't fill up dmesg?

Or even better, add a counter for this.
I thought about rate limiting.  But my assumption is that such errors are
very rare, and that it would be better to see all occurrences instead of
potentially filtering some out due to rate limiting.  If that assumption
proves to not be true, then we probably have a bigger problem -- there's
a bug in the Linux guest causing it to submit bad requests, or there's a
bug on the Hyper-V side.

That said, I don't feel strongly about it either way.

Thoughts?
I haven't seen any cases of a large number of TX errors so far (our
existing code doesn't check for them).

But I'm worried that if a VM is sending at high speed and the host side is,
for some reason, unable to send the packets correctly, the log file will become
really big and difficult to download and read. With a rate limit, we still see
dozens of messages every 5 seconds or so, and it tells you how many
messages were skipped. And if the rate is lower, it won't skip anything.
Isn't this info sufficient to debug?
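
To make the trade-off concrete, here is a minimal userspace sketch of an
interval/burst limiter that also keeps a running error counter. It is only an
illustration of the idea discussed above, not the kernel's ratelimit code and
not part of the patch. The names log_limiter, limiter_init, limiter_allow and
report_errors are hypothetical, and the 5-second interval and burst of 10 used
in the note below are assumed values chosen to echo "every 5 seconds or so".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of an interval/burst log limiter. Time is passed in
 * explicitly (in seconds) so the behaviour is deterministic and easy to test.
 */
struct log_limiter {
	unsigned int interval;      /* window length in seconds */
	unsigned int burst;         /* messages allowed per window */
	unsigned int window_start;  /* time at which the current window began */
	unsigned int printed;       /* messages let through in this window */
	unsigned int missed;        /* messages suppressed in this window */
	unsigned long total_errors; /* counter: every error, printed or not */
};

static void limiter_init(struct log_limiter *rl, unsigned int interval,
			 unsigned int burst)
{
	assert(interval > 0 && burst > 0);
	rl->interval = interval;
	rl->burst = burst;
	rl->window_start = 0;
	rl->printed = 0;
	rl->missed = 0;
	rl->total_errors = 0;
}

/* Count the error, then decide whether the caller may print a message. */
static bool limiter_allow(struct log_limiter *rl, unsigned int now)
{
	rl->total_errors++;

	/* Start a new window once the interval has elapsed, and report how
	 * many messages were dropped in the window that just ended.
	 */
	if (now - rl->window_start >= rl->interval) {
		if (rl->missed)
			fprintf(stderr, "%u messages suppressed\n", rl->missed);
		rl->window_start = now;
		rl->printed = 0;
		rl->missed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	rl->missed++;
	return false;
}

/* Report `count` errors that all occur at time `now`.
 * Returns the number of messages actually printed.
 */
static unsigned int report_errors(struct log_limiter *rl, unsigned int now,
				  unsigned int count, unsigned int status)
{
	unsigned int printed = 0;

	for (unsigned int i = 0; i < count; i++) {
		if (limiter_allow(rl, now)) {
			fprintf(stderr, "completion error status: %x\n", status);
			printed++;
		}
	}
	return printed;
}
```

With limiter_init(&rl, 5, 10), a storm of 1000 errors in one window prints only
the first 10 and records the other 990 as suppressed, while total_errors still
reaches 1000. A trickle below the burst limit is printed in full, which matches
the point that nothing is skipped when the rate is low.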

By the way, guests cannot trust the host -- we probably shouldn't give the
host a way to jam the guest's log file.

Thanks,
- Haiyang