Re: [PATCH net-next 1/3] net-timestamp: add strict check when setting tx flags

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2024-09-30 17:14:19
Also in: linux-kselftest

Jason Xing wrote:

On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn
[off-list ref] wrote:

quoted

Jason Xing wrote:

quoted

On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn
[off-list ref] wrote:

quoted

Jason Xing wrote:

quoted

From: Jason Xing <kernelxing@tencent.com>

Even though this case is unlikely to happen, we have to avoid such
a case occurring at an earlier point: the sk_rmem_alloc could get
increased because of inserting more and more skbs into the errqueue
when calling __skb_complete_tx_timestamp(). This bad case would stop
the socket transmitting soon.

It is up to the application to read from the error queue frequently
enough and/or increase SO_RCVBUF.

Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on
the loopback, it will soon stop. That's the reason why I tried to add
the restriction just in case.

I don't follow at all.

That bit does not affect the core issue: that the application is not
clearing its error queue quickly enough.

quoted

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 net/core/sock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index fe87f9bd8f16..4bddd6f62e4f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c

@@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname,
      if (val & ~SOF_TIMESTAMPING_MASK)
              return -EINVAL;

+     if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
+         !(val & SOF_TIMESTAMPING_SOFTWARE))
+             return -EINVAL;
+

This breaks hardware timestamping

Yes, and sorry about that. I'll fix this.

As is I don't understand the purpose of this patch. Please do not
just resubmit with a change, but explain the problem and suggested
solution first.

I will drop this patch because I just tested with my program in the
local machine and found there is one mistake I made about calculating
the diff between those two . Sorry for the noise.

Well, I only need to send a V2 patch of patch [3/3] in the next few days.

BTW, please allow me to ask one question unrelated to this patch
again. I do wonder: if we batch the recv skbs from the errqueue when
calling tcp_recvmsg() -> inet_recv_error(), it could break users,
right?

Analogous to __msg_zerocopy_callback with __msg_zerocopy_callback.

Only here we cannot return range-based results and thus cannot just
expand the range of the one outstanding notification.

This would mean in ip(v6)_recv_error calling sock_dequeue_err_skb,
sock_recv_timestamp and put_cmsg IP_RECVERR multiple times. And
ip_cmsg_recv if needed.

Existing applications do not have to expect multiple results per
single recvmsg call. So enabling that unconditionally could break
them.

Adding this will require a new flag. An sk_tsflag is the obvious
approach.

Interpreting a MSG_* flag passed to recvmsg would be
another option. If there is a bit that can be set with MSG_ERRQUEUE
and cannot already be set currently. But I don't think that's the
case. We allow all bits and ignore any undefined ones.

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help