Re: [PATCH 3/3] netlink: Lock out table resizes while dumping Netlink sockets
From: Patrick McHardy <hidden>
Date: 2015-01-20 15:21:55
Also in:
netfilter-devel
On 20.01, Thomas Graf wrote:
On 01/20/15 at 02:31pm, Patrick McHardy wrote:quoted
On 20.01, Thomas Graf wrote:quoted
Lock out table resizes while dumping Netlink sockets to user space. This keeps disruptions to a minimum for readers which don't handle the NLM_F_DUMP_INTR flag.This doesn't lock them out for the duration of the entire dump of course, so the benefit seems rather small. Still with this patch, they will need to handle NLM_F_DUMP_INTR or will get unpredictable behaviour, in which case I'd think it makes more sense to not even try this, all it does is hide parts of the brokenness.If it would lock out the resize for the entire dump I would not have done patches 1 and 2 ;-) I does provide better behaviour if the whole dump fits into a single buffer or if it fits into 2 buffers and we are already dumping into the 2nd buffer when the resize occurs. Otherwise we will see resizes and thus tons of duplicates even in those scenarios even if no insert or removal occurs in parallel. In the case of Netlink diag that should be typical case. Most systems will not have 1000s of Netlink sockets in parallel.
I think its preferrable to make the need to handle NETLINK_F_DUMP_INTR as noticable as possible and not hide it. Silent failure is the worst kind of failure.
quoted
An alternative would be to set a flag in ht when a dump begins that indicates to skip resizing operations and on the end of the dump perform any resizing operations that might be necessary. Herbert disagrees though and he might be right.I don't like the flag as it prevents resizes (and possibly rehashes further down the road) for a long period of time. The hashtable becomes attackable.
Yeah. The point could be made that this is a regression though. We didn't require userspace to deal with interruptions before, and the behaviour was well defined and acceptable for most cases, its not anymore. So I think it should be handled by the kernel, without changes to userspace.