Re: [PATCH v2] fs/select: add vmalloc fallback for select(2)
From: Vlastimil Babka <hidden>
Date: 2016-09-23 09:58:34
Also in:
linux-api, linux-fsdevel, linux-man, linux-mm, lkml
On 09/23/2016 11:42 AM, David Laight wrote:
From: Vlastimil Babka
Sent: 22 September 2016 18:55...
So in the case of select() it seems like we need 6 bits of memory per file descriptor, multiplied by the highest possible file descriptor (nfds) as passed to the syscall. According to the man page of select:

    EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit (see getrlimit(2)).

That second clause is relatively recent.
Interesting... so it was added without actually being true in the kernel code?
The code actually seems to silently cap the value instead of returning EINVAL though? (IIUC):

    /* max_fds can increase, so grab it once to avoid race */
    rcu_read_lock();
    fdt = files_fdtable(current->files);
    max_fds = fdt->max_fds;
    rcu_read_unlock();
    if (n > max_fds)
        n = max_fds;

The default for this cap seems to be 1024 where I checked (again, IIUC, it's what ulimit -n returns?). I wasn't able to change it to more than 2048, which makes the bitmaps still below PAGE_SIZE. So if I get that right, the system admin would have to allow really large RLIMIT_NOFILE to even make vmalloc() possible here. So I don't see it as a large concern?

4k open files isn't that many. Especially for programs that are using pipes to emulate windows events.
Sure, but IIUC we need 6 bits per file. That means that with up to almost 42k files we should still fit into an order-3 allocation, which effectively cannot fail right now.
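For anyone who wants to check that arithmetic, here is a rough userspace sketch of the size calculation. It is not the kernel code: it assumes a 4096-byte page, six bitmaps per call (in/out/ex plus the three result sets), and each bitmap rounded up to whole longs. The names fds_bytes(), select_alloc_bytes(), order_bytes() and max_nfds_for_order() are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumptions for illustration only; real page size is arch-dependent. */
#define ASSUMED_PAGE_SIZE 4096UL
#define BITMAPS_PER_CALL  6UL	/* in, out, ex + three result sets */

/* Bytes for one bitmap covering nfds descriptors, rounded up to whole longs. */
static size_t fds_bytes(size_t nfds)
{
	size_t bits_per_long = CHAR_BIT * sizeof(long);
	size_t longs = (nfds + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(long);
}

/* Total bytes needed for all six bitmaps of one select() call. */
static size_t select_alloc_bytes(size_t nfds)
{
	return BITMAPS_PER_CALL * fds_bytes(nfds);
}

/* Size in bytes of an order-N allocation (2^N contiguous pages). */
static size_t order_bytes(unsigned int order)
{
	assert(order < 16);	/* keep the shift well within range */
	return ASSUMED_PAGE_SIZE << order;
}

/* Largest nfds whose six bitmaps still fit into an order-N allocation. */
static size_t max_nfds_for_order(unsigned int order)
{
	size_t per_bitmap = order_bytes(order) / BITMAPS_PER_CALL;

	per_bitmap -= per_bitmap % sizeof(long);	/* whole longs only */
	return per_bitmap * CHAR_BIT;
}
```

Under those assumptions, select_alloc_bytes(1024) is 768 bytes and select_alloc_bytes(2048) is 1536 bytes, both below one page, while max_nfds_for_order(3) comes out at a bit over 43,000 descriptors, i.e. in the same ballpark as the estimate above.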
I suspect that fdt->max_fds is an upper bound for the highest fd the process has open - not the RLIMIT_NOFILE value.
I gathered that the highest fd effectively limits the number of files, so it's the same. I might be wrong.
select() shouldn't be silently ignoring large values of 'n' unless the fd_set bits are zero.
Yeah that doesn't seem to conform to the manpage.
Of course, select doesn't scale well for high numbered fds and neither poll nor select scale well for large numbers of fds.
True.
David