Thread: 18 messages, 6 authors, 2016-09-27

Re: [PATCH v2] fs/select: add vmalloc fallback for select(2)

From: Vlastimil Babka <hidden>
Date: 2016-09-23 09:58:34
Also in: linux-api, linux-fsdevel, linux-man, linux-mm, lkml

On 09/23/2016 11:42 AM, David Laight wrote:
From: Vlastimil Babka
Sent: 22 September 2016 18:55
...
So in the case of select() it seems like the memory we need is 6 bits per file
descriptor, multiplied by the highest possible file descriptor (nfds) as passed
to the syscall. According to the man page of select:

        EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit (see
               getrlimit(2)).
That second clause is relatively recent.
Interesting... so it was added without actually being true in the kernel
code?
The code actually seems to silently cap the value instead of returning EINVAL
though? (IIUC):

        /* max_fds can increase, so grab it once to avoid race */
        rcu_read_lock();
        fdt = files_fdtable(current->files);
        max_fds = fdt->max_fds;
        rcu_read_unlock();
        if (n > max_fds)
                n = max_fds;

The default for this cap seems to be 1024 where I checked (again, IIUC, it's
what ulimit -n returns?). I wasn't able to change it to more than 2048, which
makes the bitmaps still below PAGE_SIZE.

So if I get that right, the system admin would have to allow really large
RLIMIT_NOFILE to even make vmalloc() possible here. So I don't see it as a large
concern?
4k open files isn't that many.
Especially for programs that are using pipes to emulate windows events.
Sure, but IIUC we need 6 bits per file. That means that for up to almost 42k
files, we should fit into an order-3 allocation, which effectively cannot
fail right now.
I suspect that fdt->max_fds is an upper bound for the highest fd the
process has open - not the RLIMIT_NOFILE value.
I gathered that the highest fd effectively limits the number of files,
so it's the same. I might be wrong.
select() shouldn't be silently ignoring large values of 'n' unless
the fd_set bits are zero.
Yeah that doesn't seem to conform to the manpage.
Of course, select doesn't scale well for high numbered fds
and neither poll nor select scale well for large numbers of fds.
True.
	David