Re: [PATCH v7 1/2] net/handshake: Create a NETLINK service for handling handshake requests
From: Chuck Lever III <hidden>
Date: 2023-03-28 18:19:42
On Mar 28, 2023, at 2:14 PM, Jeff Layton [off-list ref] wrote: On Sat, 2023-03-18 at 12:18 -0400, Chuck Lever wrote:quoted
From: Chuck Lever <redacted> When a kernel consumer needs a transport layer security session, it first needs a handshake to negotiate and establish a session. This negotiation can be done in user space via one of the several existing library implementations, or it can be done in the kernel. No in-kernel handshake implementations yet exist. In their absence, we add a netlink service that can: a. Notify a user space daemon that a handshake is needed. b. Once notified, the daemon calls the kernel back via this netlink service to get the handshake parameters, including an open socket on which to establish the session. c. Once the handshake is complete, the daemon reports the session status and other information via a second netlink operation. This operation marks that it is safe for the kernel to use the open socket and the security session established there. The notification service uses a multicast group. Each handshake mechanism (eg, tlshd) adopts its own group number so that the handshake services are completely independent of one another. The kernel can then tell via netlink_has_listeners() whether a handshake service is active and prepared to handle a handshake request. A new netlink operation, ACCEPT, acts like accept(2) in that it instantiates a file descriptor in the user space daemon's fd table. If this operation is successful, the reply carries the fd number, which can be treated as an open and ready file descriptor. While user space is performing the handshake, the kernel keeps its muddy paws off the open socket. A second new netlink operation, DONE, indicates that the user space daemon is finished with the socket and it is safe for the kernel to use again. The operation also indicates whether a session was established successfully. Signed-off-by: Chuck Lever <redacted> --- Documentation/netlink/specs/handshake.yaml | 122 +++++++++++ MAINTAINERS | 8 + include/trace/events/handshake.h | 159 ++++++++++++++ include/uapi/linux/handshake.h | 70 ++++++ net/Kconfig | 5 net/Makefile | 1 net/handshake/Makefile | 11 + net/handshake/genl.c | 57 +++++ net/handshake/genl.h | 23 ++ net/handshake/handshake.h | 82 +++++++ net/handshake/netlink.c | 316 ++++++++++++++++++++++++++++ net/handshake/request.c | 307 +++++++++++++++++++++++++++ net/handshake/trace.c | 20 ++ 13 files changed, 1181 insertions(+) create mode 100644 Documentation/netlink/specs/handshake.yaml create mode 100644 include/trace/events/handshake.h create mode 100644 include/uapi/linux/handshake.h create mode 100644 net/handshake/Makefile create mode 100644 net/handshake/genl.c create mode 100644 net/handshake/genl.h create mode 100644 net/handshake/handshake.h create mode 100644 net/handshake/netlink.c create mode 100644 net/handshake/request.c create mode 100644 net/handshake/trace.c[...]quoted
diff --git a/net/handshake/request.c b/net/handshake/request.c new file mode 100644 index 000000000000..3f8ae9e990d2 --- /dev/null +++ b/net/handshake/request.c@@ -0,0 +1,307 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Handshake request lifetime events + * + * Author: Chuck Lever <chuck.lever@oracle.com> + * + * Copyright (c) 2023, Oracle and/or its affiliates. + */ + +#include <linux/types.h> +#include <linux/socket.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/skbuff.h> +#include <linux/inet.h> +#include <linux/fdtable.h> +#include <linux/rhashtable.h> + +#include <net/sock.h> +#include <net/genetlink.h> +#include <net/netns/generic.h> + +#include <uapi/linux/handshake.h> +#include "handshake.h" + +#include <trace/events/handshake.h> + +/* + * We need both a handshake_req -> sock mapping, and a sock -> + * handshake_req mapping. Both are one-to-one. + * + * To avoid adding another pointer field to struct sock, net/handshake + * maintains a hash table, indexed by the memory address of @sock, to + * find the struct handshake_req outstanding for that socket. The + * reverse direction uses a simple pointer field in the handshake_req + * struct. + */ + +static struct rhashtable handshake_rhashtbl ____cacheline_aligned_in_smp; + +static const struct rhashtable_params handshake_rhash_params = { + .key_len = sizeof_field(struct handshake_req, hr_sk), + .key_offset = offsetof(struct handshake_req, hr_sk), + .head_offset = offsetof(struct handshake_req, hr_rhash), + .automatic_shrinking = true, +}; + +int handshake_req_hash_init(void) +{ + return rhashtable_init(&handshake_rhashtbl, &handshake_rhash_params); +} + +void handshake_req_hash_destroy(void) +{ + rhashtable_destroy(&handshake_rhashtbl); +} + +struct handshake_req *handshake_req_hash_lookup(struct sock *sk) +{ + return rhashtable_lookup_fast(&handshake_rhashtbl, &sk,Is this correct? It seems like we should be searching for the struct sock pointer value, not on the pointer to the pointer (which will be a stack var), right?
I copied this from the nfsd_file and nfs4_file code we added recently. rhashtable_lookup_fast takes a pointer to the key, so a pointer to a pointer should be correct in this case.
quoted
+ handshake_rhash_params); +} + +static noinline bool handshake_req_hash_add(struct handshake_req *req) +{ + int ret; + + ret = rhashtable_lookup_insert_fast(&handshake_rhashtbl, + &req->hr_rhash, + handshake_rhash_params); + return ret == 0; +} + +static noinline void handshake_req_destroy(struct handshake_req *req) +{ + if (req->hr_proto->hp_destroy) + req->hr_proto->hp_destroy(req); + rhashtable_remove_fast(&handshake_rhashtbl, &req->hr_rhash, + handshake_rhash_params); + kfree(req); +} + +static void handshake_sk_destruct(struct sock *sk) +{ + void (*sk_destruct)(struct sock *sk); + struct handshake_req *req; + + req = handshake_req_hash_lookup(sk); + if (!req) + return; + + trace_handshake_destruct(sock_net(sk), req, sk); + sk_destruct = req->hr_odestruct; + handshake_req_destroy(req); + if (sk_destruct) + sk_destruct(sk); +} + +/** + * handshake_req_alloc - consumer API to allocate a request + * @sock: open socket on which to perform a handshake + * @proto: security protocol + * @flags: memory allocation flags + * + * Returns an initialized handshake_req or NULL. + */ +struct handshake_req *handshake_req_alloc(struct socket *sock, + const struct handshake_proto *proto, + gfp_t flags) +{ + struct sock *sk = sock->sk; + struct net *net = sock_net(sk); + struct handshake_net *hn = handshake_pernet(net); + struct handshake_req *req; + + if (!hn) + return NULL; + + req = kzalloc(struct_size(req, hr_priv, proto->hp_privsize), flags); + if (!req) + return NULL; + + sock_hold(sk); + + INIT_LIST_HEAD(&req->hr_list); + req->hr_sk = sk; + req->hr_proto = proto; + return req; +} +EXPORT_SYMBOL(handshake_req_alloc); + +/** + * handshake_req_private - consumer API to return per-handshake private data + * @req: handshake arguments + * + */ +void *handshake_req_private(struct handshake_req *req) +{ + return (void *)&req->hr_priv; +} +EXPORT_SYMBOL(handshake_req_private); + +static bool __add_pending_locked(struct handshake_net *hn, + struct handshake_req *req) +{ + if (!list_empty(&req->hr_list)) + return false; + hn->hn_pending++; + list_add_tail(&req->hr_list, &hn->hn_requests); + return true; +} + +void __remove_pending_locked(struct handshake_net *hn, + struct handshake_req *req) +{ + hn->hn_pending--; + list_del_init(&req->hr_list); +} + +/* + * Returns %true if the request was found on @net's pending list, + * otherwise %false. + * + * If @req was on a pending list, it has not yet been accepted. + */ +static bool remove_pending(struct handshake_net *hn, struct handshake_req *req) +{ + bool ret; + + ret = false; + + spin_lock(&hn->hn_lock); + if (!list_empty(&req->hr_list)) { + __remove_pending_locked(hn, req); + ret = true; + } + spin_unlock(&hn->hn_lock); + + return ret; +} + +/** + * handshake_req_submit - consumer API to submit a handshake request + * @req: handshake arguments + * @flags: memory allocation flags + * + * Return values: + * %0: Request queued + * %-EBUSY: A handshake is already under way for this socket + * %-ESRCH: No handshake agent is available + * %-EAGAIN: Too many pending handshake requests + * %-ENOMEM: Failed to allocate memory + * %-EMSGSIZE: Failed to construct notification message + * %-EOPNOTSUPP: Handshake module not initialized + * + * A zero return value from handshake_request() means that + * exactly one subsequent completion callback is guaranteed. + * + * A negative return value from handshake_request() means that + * no completion callback will be done and that @req has been + * destroyed. + */ +int handshake_req_submit(struct handshake_req *req, gfp_t flags) +{ + struct sock *sk = req->hr_sk; + struct net *net = sock_net(sk); + struct handshake_net *hn = handshake_pernet(net); + int ret; + + if (!hn) + return -EOPNOTSUPP; + + ret = -EAGAIN; + if (READ_ONCE(hn->hn_pending) >= hn->hn_pending_max) + goto out_err; + + req->hr_odestruct = sk->sk_destruct; + sk->sk_destruct = handshake_sk_destruct; + spin_lock(&hn->hn_lock); + ret = -EOPNOTSUPP; + if (test_bit(HANDSHAKE_F_NET_DRAINING, &hn->hn_flags)) + goto out_unlock; + ret = -EBUSY; + if (!handshake_req_hash_add(req)) + goto out_unlock; + if (!__add_pending_locked(hn, req)) + goto out_unlock; + spin_unlock(&hn->hn_lock); + + ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class, + flags); + if (ret) { + trace_handshake_notify_err(net, req, sk, ret); + if (remove_pending(hn, req)) + goto out_err; + } + + trace_handshake_submit(net, req, sk); + return 0; + +out_unlock: + spin_unlock(&hn->hn_lock); +out_err: + trace_handshake_submit_err(net, req, sk, ret); + handshake_req_destroy(req); + return ret; +} +EXPORT_SYMBOL(handshake_req_submit); + +void handshake_complete(struct handshake_req *req, unsigned int status, + struct genl_info *info) +{ + struct sock *sk = req->hr_sk; + struct net *net = sock_net(sk); + + if (!test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { + trace_handshake_complete(net, req, sk, status); + req->hr_proto->hp_done(req, status, info); + __sock_put(sk); + } +} + +/** + * handshake_req_cancel - consumer API to cancel an in-progress handshake + * @sock: socket on which there is an ongoing handshake + * + * XXX: Perhaps killing the user space agent might also be necessary? + * + * Request cancellation races with request completion. To determine + * who won, callers examine the return value from this function. + * + * Return values: + * %true - Uncompleted handshake request was canceled or not found + * %false - Handshake request already completed + */ +bool handshake_req_cancel(struct socket *sock) +{ + struct handshake_req *req; + struct handshake_net *hn; + struct sock *sk; + struct net *net; + + sk = sock->sk; + net = sock_net(sk); + req = handshake_req_hash_lookup(sk); + if (!req) { + trace_handshake_cancel_none(net, req, sk); + return true; + } + + hn = handshake_pernet(net); + if (hn && remove_pending(hn, req)) { + /* Request hadn't been accepted */ + trace_handshake_cancel(net, req, sk); + return true; + } + if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { + /* Request already completed */ + trace_handshake_cancel_busy(net, req, sk); + return false; + } + + __sock_put(sk); + trace_handshake_cancel(net, req, sk); + return true; +} +EXPORT_SYMBOL(handshake_req_cancel);diff --git a/net/handshake/trace.c b/net/handshake/trace.c new file mode 100644 index 000000000000..1c4d8e27e17a --- /dev/null +++ b/net/handshake/trace.c@@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Trace points for transport security layer handshakes. + * + * Author: Chuck Lever <chuck.lever@oracle.com> + * + * Copyright (c) 2023, Oracle and/or its affiliates. + */ + +#include <linux/types.h> + +#include <net/sock.h> +#include <net/netlink.h> +#include <net/genetlink.h> + +#include "handshake.h" + +#define CREATE_TRACE_POINTS + +#include <trace/events/handshake.h>-- Jeff Layton [off-list ref]
-- Chuck Lever