Re: [RFC PATCH v2 40/48] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
From: David Howells <dhowells@redhat.com>
Date: 2023-03-30 14:27:55
Also in:
linux-fsdevel, linux-mm, linux-nfs, lkml
Subsystem:
kernel nfsd, sunrpc, and lockd servers, networking [general], nfs, sunrpc, and lockd clients, the rest · Maintainers:
Chuck Lever, Jeff Layton, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Trond Myklebust, Anna Schumaker, Linus Torvalds
Chuck Lever III [off-list ref] wrote:
Don't. Just change svc_tcp_send_kvec() to use sock_sendmsg, and leave the marker alone for now, please.
If you insist. See attached. David --- sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage When transmitting data, call down into TCP using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells <dhowells@redhat.com> cc: Trond Myklebust <redacted> cc: Anna Schumaker <anna@kernel.org> cc: Chuck Lever <chuck.lever@oracle.com> cc: Jeff Layton <jlayton@kernel.org> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-nfs@vger.kernel.org cc: netdev@vger.kernel.org --- include/linux/sunrpc/svc.h | 11 +++++------ net/sunrpc/svcsock.c | 40 +++++++++++++--------------------------- 2 files changed, 18 insertions(+), 33 deletions(-)
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 877891536c2f..456ae554aa11 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h@@ -161,16 +161,15 @@ static inline bool svc_put_not_last(struct svc_serv *serv) extern u32 svc_max_payload(const struct svc_rqst *rqstp); /* - * RPC Requsts and replies are stored in one or more pages. + * RPC Requests and replies are stored in one or more pages. * We maintain an array of pages for each server thread. * Requests are copied into these pages as they arrive. Remaining * pages are available to write the reply into. * - * Pages are sent using ->sendpage so each server thread needs to - * allocate more to replace those used in sending. To help keep track - * of these pages we have a receive list where all pages initialy live, - * and a send list where pages are moved to when there are to be part - * of a reply. + * Pages are sent using ->sendmsg with MSG_SPLICE_PAGES so each server thread + * needs to allocate more to replace those used in sending. To help keep track + * of these pages we have a receive list where all pages initialy live, and a + * send list where pages are moved to when there are to be part of a reply. * * We use xdr_buf for holding responses as it fits well with NFS * read responses (that have a header, and some data pages, and possibly
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 03a4f5615086..af146e053dfc 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c@@ -1059,17 +1059,18 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp) svc_xprt_received(rqstp->rq_xprt); return 0; /* record not complete */ } - + static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec, int flags) { - return kernel_sendpage(sock, virt_to_page(vec->iov_base), - offset_in_page(vec->iov_base), - vec->iov_len, flags); + struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES | flags, }; + + iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 1, vec->iov_len); + return sock_sendmsg(sock, &msg); } /* - * kernel_sendpage() is used exclusively to reduce the number of + * MSG_SPLICE_PAGES is used exclusively to reduce the number of * copy operations in this path. Therefore the caller must ensure * that the pages backing @xdr are unchanging. *
@@ -1109,28 +1110,13 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr, if (ret != head->iov_len) goto out; - if (xdr->page_len) { - unsigned int offset, len, remaining; - struct bio_vec *bvec; - - bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT); - offset = offset_in_page(xdr->page_base); - remaining = xdr->page_len; - while (remaining > 0) { - len = min(remaining, bvec->bv_len - offset); - ret = kernel_sendpage(sock, bvec->bv_page, - bvec->bv_offset + offset, - len, 0); - if (ret < 0) - return ret; - *sentp += ret; - if (ret != len) - goto out; - remaining -= len; - offset = 0; - bvec++; - } - } + msg.msg_flags = MSG_SPLICE_PAGES; + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec, + xdr_buf_pagecount(xdr), xdr->page_len); + ret = sock_sendmsg(sock, &msg); + if (ret < 0) + return ret; + *sentp += ret; if (tail->iov_len) { ret = svc_tcp_send_kvec(sock, tail, 0);