Re: [PATCH v7 4/5] nvme-tcp: Support KeyUpdate
From: Alistair Francis <hidden>
Date: 2026-03-04 11:37:50
Also in:
linux-doc, linux-nfs, linux-nvme, lkml
On Wed, Mar 4, 2026 at 5:40 PM Hannes Reinecke [off-list ref] wrote:
On 3/4/26 06:34, alistair23@gmail.com wrote:quoted
From: Alistair Francis <redacted> If the nvme_tcp_try_send() or nvme_tcp_try_recv() functions return EKEYEXPIRED then the underlying TLS keys need to be updated. This occurs on an KeyUpdate event as described in RFC8446 https://datatracker.ietf.org/doc/html/rfc8446#section-4.6.3. If the NVMe Target (TLS server) initiates a KeyUpdate this patch will allow the NVMe layer to process the KeyUpdate request and forward the request to userspace. Userspace must then update the key to keep the connection alive. This patch allows us to handle the NVMe target sending a KeyUpdate request without aborting the connection. At this time we don't support initiating a KeyUpdate. Signed-off-by: Alistair Francis <redacted> --- v7: - Use read_sock_cmsg instead of recvmsg() to handle KeyUpdate v6: - Don't use `struct nvme_tcp_hdr` to determine TLS_HANDSHAKE_KEYUPDATE, instead look at the cmsg fields. - Don't flush async_event_work v5: - Cleanup code flow - Check for MSG_CTRUNC in the msg_flags return from recvmsg and use that to determine if it's a control message v4: - Remove all support for initiating KeyUpdate - Don't call cancel_work() when updating keys v3: - Don't cancel existing handshake requests v2: - Don't change the state - Use a helper function for KeyUpdates - Continue sending in nvme_tcp_send_all() after a KeyUpdate - Remove command message using recvmsg drivers/nvme/host/tcp.c | 59 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 58 insertions(+), 1 deletion(-)diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 8b6172dd1c0f..ade11d2ac9ef 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c@@ -171,6 +171,7 @@ struct nvme_tcp_queue { bool tls_enabled; u32 rcv_crc; u32 snd_crc; + key_serial_t handshake_session_id; __le32 exp_ddgst; __le32 recv_ddgst; struct completion tls_complete;@@ -1361,6 +1362,59 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue) return ret; } +static void update_tls_keys(struct nvme_tcp_queue *queue) +{ + int qid = nvme_tcp_queue_id(queue); + int ret; + + dev_dbg(queue->ctrl->ctrl.device, + "updating key for queue %d\n", qid); + + ret = nvme_tcp_start_tls(&(queue->ctrl->ctrl), + queue, queue->ctrl->ctrl.tls_pskid, + HANDSHAKE_KEY_UPDATE_TYPE_RECEIVED); + + if (ret < 0) { + dev_err(queue->ctrl->ctrl.device, + "failed to update the keys %d\n", ret); + nvme_tcp_fail_request(queue->request); + } +} + +static int nvme_tcp_recv_cmsg(read_descriptor_t *desc, + struct sk_buff *skb, + unsigned int offset, size_t len, + u8 content_type) +{ + struct nvme_tcp_queue *queue = desc->arg.data; + struct socket *sock = queue->sock; + struct sock *sk = sock->sk; + + switch (content_type) { + case TLS_RECORD_TYPE_HANDSHAKE: + if (len == 5) { + u8 header[5]; + + if (!skb_copy_bits(skb, offset, header, + sizeof(header))) { + if (header[0] == TLS_HANDSHAKE_KEYUPDATE) { + dev_err(queue->ctrl->ctrl.device, "KeyUpdate message\n"); + release_sock(sk); + update_tls_keys(queue); + lock_sock(sk); + return 0; + } + } + } + + break; + default: + break; + }I think a simple 'if' condition would be sufficient here, or do you have handling of other TLS record types queued somewhere? And we should log unhandled TLS records.
I like this approach as it makes it really easy to handle more types in the future. I don't have any more record types queued anywhere so I can change it to an if statement. Good point about logging unhandled records Alistair
quoted
+ + return -EAGAIN; +} + static int nvme_tcp_try_recv(struct nvme_tcp_queue *queue) { struct socket *sock = queue->sock;@@ -1372,7 +1426,8 @@ static int nvme_tcp_try_recv(struct nvme_tcp_queue *queue) rd_desc.count = 1; lock_sock(sk); queue->nr_cqe = 0; - consumed = sock->ops->read_sock(sk, &rd_desc, nvme_tcp_recv_skb); + consumed = sock->ops->read_sock_cmsg(sk, &rd_desc, nvme_tcp_recv_skb, + nvme_tcp_recv_cmsg); release_sock(sk); return consumed == -EAGAIN ? 0 : consumed; }@@ -1708,6 +1763,7 @@ static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid, ctrl->ctrl.tls_pskid = key_serial(tls_key); key_put(tls_key); queue->tls_err = 0; + queue->handshake_session_id = handshake_session_id; } out_complete:@@ -1737,6 +1793,7 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl, keyring = key_serial(nctrl->opts->keyring); args.ta_keyring = keyring; args.ta_timeout_ms = tls_handshake_timeout * 1000; + args.ta_handshake_session_id = queue->handshake_session_id; queue->tls_err = -EOPNOTSUPP; init_completion(&queue->tls_complete); if (keyupdate == HANDSHAKE_KEY_UPDATE_TYPE_UNSPEC)Otherwise looks good. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich