Re: [PATCH v7 00/11] extend task comm from 16 to 24

From: Yafang Shao <hidden>
Date: 2021-11-01 14:35:17
Also in: bpf, linux-fsdevel, linux-mm, linux-rdma, lkml, netdev

On Mon, Nov 1, 2021 at 10:07 PM Petr Mladek [off-list ref] wrote:

On Mon 2021-11-01 06:04:08, Yafang Shao wrote:

quoted

There're many truncated kthreads in the kernel, which may make trouble
for the user, for example, the user can't get detailed device
information from the task comm.

This patchset tries to improve this problem fundamentally by extending
the task comm size from 16 to 24, which is a very simple way.

In order to do that, we have to do some cleanups first.

1. Make the copy of task comm always safe no matter what the task
   comm size is. For example,

      Unsafe                 Safe
      strlcpy                strscpy_pad
      strncpy                strscpy_pad
      bpf_probe_read_kernel  bpf_probe_read_kernel_str
                             bpf_core_read_str
                             bpf_get_current_comm
                             perf_event__prepare_comm
                             prctl(2)

   After this step, the comm size change won't make any trouble to the
   kernel or the in-tree tools for example perf, BPF programs.

2. Cleanup some old hard-coded 16
   Actually we don't need to convert all of them to TASK_COMM_LEN or
   TASK_COMM_LEN_16, what we really care about is if the convert can
   make the code more reasonable or easier to understand. For
   example, some in-tree tools read the comm from sched:sched_switch
   tracepoint, as it is derived from the kernel, we'd better make them
   consistent with the kernel.

The above changes make sense even if we do not extend comm[] array in
task_struct.

quoted

3. Extend the task comm size from 16 to 24
   task_struct is growing rather regularly by 8 bytes. This size change
   should be acceptable. We used to think about extending the size for
   CONFIG_BASE_FULL only, but that would be a burden for maintenance
   and introduce code complexity.

4. Print a warning if the kthread comm is still truncated.

5. What will happen to the out-of-tree tools after this change?
   If the tool get task comm through kernel API, for example prctl(2),
   bpf_get_current_comm() and etc, then it doesn't matter how large the
   user buffer is, because it will always get a string with a nul
   terminator. While if it gets the task comm through direct string copy,
   the user tool must make sure the copied string has a nul terminator
   itself. As TASK_COMM_LEN is not exposed to userspace, there's no
   reason that it must require a fixed-size task comm.

The amount of code that has to be updated is really high. I am pretty
sure that there are more potential buffer overflows left.

You did not commented on the concerns in the thread
https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/ (local)

I thought Steven[1] and  Kees[2] have already clearly explained why we
do it like that, so I didn't give any more words on it.

[1]. https://lore.kernel.org/all/20211025170503.59830a43@gandalf.local.home/ (local)
[2]. https://lore.kernel.org/all/202110251406.56F87A3522@keescook/ (local)

Several people suggested to use a more conservative approach.

Yes, they are Al[3] and Alexei[4].

[3]. https://lore.kernel.org/lkml/YVkmaSUxbg%2FJtBHb@zeniv-ca.linux.org.uk/ (local)
[4]. https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/ (local)

I mean
to keep comm[16] as is and add a new pointer to the full name. The buffer
for the long name might be dynamically allocated only when needed.

That would add a new allocation in the fork() for the threads with a long name.
I'm not sure if it is worth it.

The pointer might be either in task_struct or struct kthread. It might
be used the same way as the full name stored by workqueue kthreads.

If we decide to do it like that, I think we'd better add it in
task_struct, then it will work for all tasks.

The advantage of the separate pointer:

   + would work for names longer than 32
   + will not open security holes in code

Yes, those are the advantages.  And the disadvantage of it is:

 - new allocation in fork()


-- 
Thanks
Yafang

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help