Re: [PATCH v7 00/11] extend task comm from 16 to 24
From: Yafang Shao <hidden>
Date: 2021-11-02 01:10:31
Also in:
bpf, linux-fsdevel, linux-mm, linux-rdma, lkml, netdev
On Tue, Nov 2, 2021 at 12:02 AM Petr Mladek [off-list ref] wrote:
On Mon 2021-11-01 22:34:30, Yafang Shao wrote:quoted
On Mon, Nov 1, 2021 at 10:07 PM Petr Mladek [off-list ref] wrote:quoted
On Mon 2021-11-01 06:04:08, Yafang Shao wrote:quoted
4. Print a warning if the kthread comm is still truncated. 5. What will happen to the out-of-tree tools after this change? If the tool get task comm through kernel API, for example prctl(2), bpf_get_current_comm() and etc, then it doesn't matter how large the user buffer is, because it will always get a string with a nul terminator. While if it gets the task comm through direct string copy, the user tool must make sure the copied string has a nul terminator itself. As TASK_COMM_LEN is not exposed to userspace, there's no reason that it must require a fixed-size task comm.The amount of code that has to be updated is really high. I am pretty sure that there are more potential buffer overflows left. You did not commented on the concerns in the thread https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/ (local)I thought Steven[1] and Kees[2] have already clearly explained why we do it like that, so I didn't give any more words on it. [1]. https://lore.kernel.org/all/20211025170503.59830a43@gandalf.local.home/ (local)Steven was against switching task->comm[16] into a dynamically allocated pointer. But he was not against storing longer names separately.Honestly, I am a bit confused by Kees' answer. IMHO, he agreed that switching task->comm[16] into a pointer was not worth it. But I am not sure what he meant by "Agreed -- this is a small change for what is already an "uncommon" corner case."quoted
quoted
Several people suggested to use a more conservative approach.Yes, they are Al[3] and Alexei[4]. [3]. https://lore.kernel.org/lkml/YVkmaSUxbg%2FJtBHb@zeniv-ca.linux.org.uk/ (local)IMHO, Al suggested to store the long name separately and return it by proc_task_name() when available.quoted
[4]. https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/ (local)Alexei used dentry->d_iname as an exaxmple. struct dentry uses d_iname[DNAME_INLINE_LEN] for short names. And dynamically allocated d_name for long names, see *__d_alloc() implementation.
Thanks for the summary. So with Stenven's new reply[1], the opinion in common is storing long names into a separate place. And no one is against it now. [1]. https://lore.kernel.org/lkml/20211101120636.3cfc5afa@gandalf.local.home/ (local)
quoted
quoted
I mean to keep comm[16] as is and add a new pointer to the full name. The buffer for the long name might be dynamically allocated only when needed.That would add a new allocation in the fork() for the threads with a long name. I'm not sure if it is worth it.The allocation will be done only when needed. IMHO, the performance is important only for userspace processes. I am not aware of any kernel subsystem that would heavily create and destroy kthreads.
XFS may create many kthreads with longer names, especially if there're
many partitions in the disk.
For example,
xfs-reclaim/sd{a, b, c, ...}
xfs-blockgc/sd{a, b, c, ...}
xfs-inodegc/sd{a, b, c, ...}
They are supposed to be created at boot time, and shouldn't be heavily
created and destroyed.
quoted
quoted
The pointer might be either in task_struct or struct kthread. It might be used the same way as the full name stored by workqueue kthreads.If we decide to do it like that, I think we'd better add it in task_struct, then it will work for all tasks.Is it really needed for userspace processes? For example, ps shows the information from /proc/*/cmdline instead.
Right. The userspace processes can be obtained from /proc/*/cmdline.
quoted
quoted
The advantage of the separate pointer: + would work for names longer than 32 + will not open security holes in codeYes, those are the advantages. And the disadvantage of it is: - new allocation in fork()It should not be a problem if we do it only when necessary and only for kthreads.
So if no one against, I will do it in two steps,
1. Send the task comm cleanups in a separate patchset named "task comm cleanups"
This patchset includes patch #1, #2, #4, #5, #6, #7 and #9.
Cleaning them up can make it less error prone, and it will be
helpful if we want to extend task comm in the future :)
2. Keep the current comm[16] as-is and introduce a separate pointer
to store kthread's long name
Now we only care about kthread, so we can put the pointer into a
kthread specific struct.
For example in the struct kthread, or in kthread->data (which may
conflict with workqueue).
And then dynamically allocate a longer name if it is truncated,
for example,
__kthread_create_on_node
len = vsnprintf(name, sizeof(name), namefmt, args);
if (len >= TASK_COMM_LEN) {
/* create a longer name */
}
And then we modify proc_task_name(), so the user can get
kthread's longer name via /proc/[pid]/comm.
And then free the allocated memory when the kthread is destroyed.
--
Thanks
Yafang