Thread: 25 messages, 9 authors, 2020-10-19

Re: [External] Re: [PATCH] mm: proc: add Sock to /proc/meminfo

From: Muchun Song <hidden>
Date: 2020-10-13 03:30:24
Also in: linux-fsdevel, linux-mm, lkml

On Tue, Oct 13, 2020 at 5:47 AM Cong Wang [off-list ref] wrote:
On Sun, Oct 11, 2020 at 9:22 PM Muchun Song [off-list ref] wrote:
On Mon, Oct 12, 2020 at 2:39 AM Cong Wang [off-list ref] wrote:
On Sat, Oct 10, 2020 at 3:39 AM Muchun Song [off-list ref] wrote:
The amount of memory allocated to socket buffers can become significant.
However, we do not display the amount of memory consumed by socket
buffers. In this case, knowing where the memory is consumed by the kernel
We do it via `ss -m`. Is it not sufficient? And if not, why not add it there
rather than to /proc/meminfo?
If the system has little free memory, we can find out where the memory went
via /proc/meminfo. If a lot of memory is consumed by socket buffers, we cannot
see that while Sock is not shown in /proc/meminfo. If an unaware user
does not think of socket buffers, naturally they will not run `ss -m`. The
end result
Interesting, we already have a few counters related to socket buffers,
are you saying these are not accounted in /proc/meminfo either?
Yeah, these are not accounted for in /proc/meminfo.
If yes, why are page frags so special here? If not, they are more
important than page frags, so you probably want to deal with them
first.

is that we still don't know where the memory is consumed. And we add
Sock to /proc/meminfo just like memcg does (the 'sock' item in the cgroup
v2 memory.stat). So I think that adding it to /proc/meminfo is sufficient.
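
For illustration only, here is a minimal userspace sketch of how a tool could pick one field out of /proc/meminfo-style text. The helper name meminfo_field_kb() is hypothetical, and a "Sock" line would only exist with the proposed patch applied; on an unpatched kernel the lookup simply reports that the field is absent.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value (in kB) of `key` in
 * meminfo-style text such as "Sock:   2048 kB\n", or -1 if the
 * key does not start any line.
 */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* A match is the key at the start of a line, followed by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);

        /* Move to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

A caller would read /proc/meminfo into a buffer and pass it in; a result of -1 for "Sock" means the kernel does not export that line.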
It actually looks like the socket page frag is already accounted for;
for example, in tcp_sendmsg_locked():

                        copy = min_t(int, copy, pfrag->size - pfrag->offset);

                        if (!sk_wmem_schedule(sk, copy))
                                goto wait_for_memory;
Yeah, it is already accounted for. But it does not represent real memory
usage. This is just the total amount of charged memory.

For example, if a task sends a 10-byte message, it only charges one
page to memcg. But the system may allocate 8 pages. Therefore, it
does not truly reflect the memory allocated by the page frag memory
allocation path.
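
To make the arithmetic concrete, here is a small userspace model of that gap. It is a sketch under stated assumptions, not kernel code: it assumes 4 KiB pages and that a page frag refill hands out an order-3 (8-page) block, which matches the "8 pages" figure above. All names are hypothetical.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE  4096UL  /* assumption: 4 KiB pages */
#define MODEL_FRAG_ORDER 3       /* assumption: refill allocates 2^3 pages */

/* Pages charged for sending `bytes`: the size rounded up to whole pages. */
unsigned long charged_pages(size_t bytes)
{
    return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Pages actually backing one freshly refilled page frag in this model. */
unsigned long allocated_pages(void)
{
    return 1UL << MODEL_FRAG_ORDER;
}

/* Pages that are allocated but not covered by the charge. */
unsigned long uncounted_pages(size_t bytes)
{
    unsigned long charged = charged_pages(bytes);
    unsigned long allocated = allocated_pages();

    return allocated > charged ? allocated - charged : 0;
}
```

With a 10-byte send, this model charges 1 page while 8 are allocated, leaving 7 pages that the charge counter does not show.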
 static inline void __skb_frag_unref(skb_frag_t *frag)
 {
-       put_page(skb_frag_page(frag));
+       struct page *page = skb_frag_page(frag);
+
+       if (put_page_testzero(page)) {
+               dec_sock_node_page_state(page);
+               __put_page(page);
+       }
 }
You mix socket page frag with skb frag at least, not sure this is exactly
what you want, because clearly skb page frags are frequently used
by network drivers rather than sockets.

Also, which one matches this dec_sock_node_page_state()? Clearly
not skb_fill_page_desc() or __skb_frag_ref().
Yeah, we call inc_sock_node_page_state() in the skb_page_frag_refill().
How is skb_page_frag_refill() possibly paired with __skb_frag_unref()?
So if someone gets the page returned by skb_page_frag_refill(), it must
put the page via __skb_frag_unref()/skb_frag_unref(). We use PG_private
to indicate that we need to dec the node page state when the refcount of
page reaches zero.
skb_page_frag_refill() is called on frags not within an skb, for instance,
sk_page_frag_refill() uses it for a per-socket or per-process page frag.
But, __skb_frag_unref() is specifically used for skb frags, which are
supposed to be filled by skb_fill_page_desc() (page is allocated by driver).

They are different things and you are mixing them up, which looks clearly
wrong or at least misleading.
Yeah, it looks a little strange. I just want to account for page frag
allocations. So I have to use PG_private in __skb_frag_unref() to
distinguish pages that came from the page frag allocator from all other
pages. If the page was allocated by skb_page_frag_refill(), we should
decrease the statistics.
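
A rough userspace model of that idea, for discussion only. The struct, flag, and function names are hypothetical stand-ins (private_flag for PG_private, sock_blocks for the node counter), and the model counts one unit per refill rather than per page.

```c
#include <stdbool.h>

/* Stand-in for struct page: a refcount plus one marker flag. */
struct model_page {
    int  refcount;
    bool private_flag;   /* models PG_private: "counted in Sock" */
};

static long sock_blocks; /* models the Sock node counter */

/* Models skb_page_frag_refill(): allocate, mark, and count the block. */
void model_frag_refill(struct model_page *p)
{
    p->refcount = 1;
    p->private_flag = true;
    sock_blocks++;
}

/* Models a driver-allocated page: never marked, never counted. */
void model_driver_alloc(struct model_page *p)
{
    p->refcount = 1;
    p->private_flag = false;
}

/* Take an extra reference, as an skb frag would. */
void model_frag_ref(struct model_page *p)
{
    p->refcount++;
}

/* Models the patched __skb_frag_unref(): uncount only marked pages,
 * and only when the last reference goes away. */
void model_frag_unref(struct model_page *p)
{
    if (--p->refcount == 0 && p->private_flag) {
        p->private_flag = false;
        sock_blocks--;
    }
}

long model_sock_blocks(void)
{
    return sock_blocks;
}
```

In this model the counter stays balanced only if every holder of a marked page drops its reference through model_frag_unref(); a release through any other path would leave the counter too high, which is the pairing concern raised above.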

Thanks.
Thanks.


-- 
Yours,
Muchun