Re: [PATCH 1/6] fs/exec: Drop task_lock() inside __get_task_comm()
From: Alexei Starovoitov <hidden>
Date: 2024-06-02 16:35:32
Also in:
bpf, linux-fsdevel, linux-mm, linux-trace-kernel, selinux
On Sat, Jun 1, 2024 at 11:57 PM Yafang Shao [off-list ref] wrote:
On Sun, Jun 2, 2024 at 11:52 AM Eric W. Biederman [off-list ref] wrote:quoted
Yafang Shao [off-list ref] writes:quoted
Quoted from Linus [0]: Since user space can randomly change their names anyway, using locking was always wrong for readers (for writers it probably does make sense to have some lock - although practically speaking nobody cares there either, but at least for a writer some kind of race could have long-term mixed resultsUgh. Ick. This code is buggy. I won't argue that Linus is wrong, about removing the task_lock. Unfortunately strscpy_pad does not work properly with the task_lock removed, and buf_size larger that TASK_COMM_LEN. There is a race that will allow reading past the end of tsk->comm, if we read while tsk->common is being updated.It appears so. Thanks for pointing it out. Additionally, other code, such as the BPF helper bpf_get_current_comm(), also uses strscpy_pad() directly without the task_lock. It seems we should change that as well.
Hmm. What race do you see? If lock is removed from __get_task_comm() it probably can be removed from __set_task_comm() as well. And both are calling strscpy_pad to write and read comm. So I don't see how it would read past sizeof(comm), because 'buf' passed into __set_task_comm is NUL-terminated. So the concurrent read will find it.
quoted
So __get_task_comm needs to look something like: char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk) { size_t len = buf_size; if (len > TASK_COMM_LEN) len = TASK_COMM_LEN; memcpy(buf, tsk->comm, len); buf[len -1] = '\0'; return buf; }Thanks for your suggestion.quoted
What shows up in buf past the '\0' is not guaranteed in the above version but I would be surprised if anyone cares.I believe we pad it to prevent the leakage of kernel data. In this case, since no kernel data will be leaked, the following change may be unnecessary.
It's not about leaking of kernel data, but more about not writing garbage past NUL. Because comm[] is a part of some record that is used as a key in a hash map.