Re: Things I wish I'd known about Inotify

From: Jan Kara <hidden>
Date: 2014-07-14 11:28:38
Also in: linux-fsdevel, lkml


On Sat 12-07-14 21:06:45, Michael Kerrisk (man-pages) wrote:

Late follow up on this thread..., since another question occurred in
discussions with Jake.

On Fri, Apr 4, 2014 at 2:43 PM, Jan Kara [off-list ref] wrote:

On Fri 04-04-14 09:35:50, Michael Kerrisk (man-pages) wrote:

On 04/03/2014 10:52 PM, Jan Kara wrote:

On Thu 03-04-14 08:34:44, Michael Kerrisk (man-pages) wrote:

[...]

   Dealing with rename() events
       The IN_MOVED_FROM and IN_MOVED_TO events that are generated by
       rename(2) are usually available as consecutive events when
       reading from the inotify file descriptor.  However, this is not
       guaranteed.  If multiple processes are triggering events for
       monitored objects, then (on rare occasions) an arbitrary number
       of other events may appear between the IN_MOVED_FROM and
       IN_MOVED_TO events.

       Matching up the IN_MOVED_FROM and IN_MOVED_TO event pair
       generated by rename(2) is thus inherently racy.  (Don't forget
       that if an object is renamed outside of a monitored directory,
       there may not even be an IN_MOVED_TO event.)  Heuristic
       approaches (e.g., assume the events are always consecutive) can
       be used to ensure a match in most cases, but will inevitably
       miss some cases, causing the application to perceive the
       IN_MOVED_FROM and IN_MOVED_TO events as being unrelated.  If
       watch descriptors are destroyed and re-created as a result, then
       those watch descriptors will be inconsistent with the watch
       descriptors in any pending events.  (Re-creating the inotify
       file descriptor and rebuilding the cache may be useful to deal
       with this scenario.)

  Well, but there's 'cookie' value meant exactly for matching up
IN_MOVED_FROM and IN_MOVED_TO events. And 'cookie' is guaranteed to be
unique at least within the inotify instance (in fact currently it is unique
within the whole system but I don't think we want to give that promise).

Yes, that's already assumed by my discussion above (it's described elsewhere
in the page). But your comment makes me think I should add a few words to
remind the reader of that fact. I'll do that.

  Yes, that would be good.

But, the point is that even with the cookie, matching the events is
nontrivial, since:

* There may not even be an IN_MOVED_FROM event
* There may be an arbitrary number of other events in between the
  IN_MOVED_FROM and the IN_MOVED_TO.

Therefore, one has to use heuristic approaches such as "allow at least
N milliseconds" or "check the next N events" to see if there is an
IN_MOVED_FROM that matches the IN_MOVED_TO. I can't see any way around
that being inherently racy. (It's unfortunate that the kernel can't
provide a guarantee that the two events are always consecutive, since
that would simplify user space's life considerably.)
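
The cookie-plus-time-window heuristic described above could be sketched
roughly as follows. This is only an illustration in Python, not code from
any existing library: the Event class, its timestamp field (the time at
which user space read the event), the pair_renames() function, and the
5 ms default window are all assumptions made for the example. Only the
two mask values are taken from <sys/inotify.h>.

```python
from dataclasses import dataclass

# Mask values as defined in <sys/inotify.h>.
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080


@dataclass
class Event:
    """Hypothetical decoded inotify event plus the time it was read."""
    wd: int
    mask: int
    cookie: int
    name: str
    timestamp: float  # seconds; assumed to be recorded by the reader


def pair_renames(events, window=0.005):
    """Pair IN_MOVED_FROM/IN_MOVED_TO events by cookie.

    Returns (renames, moved_out, moved_in). An IN_MOVED_FROM whose
    partner does not show up within `window` seconds is reported as
    moved out, so a late IN_MOVED_TO is then seen as an unrelated
    move in. That is exactly the race discussed above.
    """
    pending = {}  # cookie -> IN_MOVED_FROM event still waiting for a partner
    renames, moved_out, moved_in = [], [], []
    for ev in events:
        # Give up on any IN_MOVED_FROM that has waited longer than the window.
        expired = [c for c, p in pending.items()
                   if ev.timestamp - p.timestamp > window]
        for cookie in expired:
            moved_out.append(pending.pop(cookie))
        if ev.mask & IN_MOVED_FROM:
            pending[ev.cookie] = ev
        elif ev.mask & IN_MOVED_TO:
            src = pending.pop(ev.cookie, None)
            if src is None:
                moved_in.append(ev)      # no matching IN_MOVED_FROM seen
            else:
                renames.append((src, ev))
        # Any other event type is simply skipped by this sketch.
    # The end of the list is treated as "no more events will arrive".
    moved_out.extend(pending.values())
    return renames, moved_out, moved_in
```

Unrelated events between the two halves do not disturb the match, because
the lookup is by cookie rather than by position; only the choice of window
remains a guess.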

  Yeah, it's unpleasant, but doing that would be quite costly/complex on the
kernel side. And in the worst case the race would lead the application to
think that one file was moved out of the watched area and another file was
moved in somewhere else inside the watched area. So the application may
have to inspect that file. That doesn't seem too bad.

One further question. The IN_MOVED_FROM+IN_MOVED_TO pair may not be
guaranteed to be contiguous in the read buffer, but is their insertion
in the event queue guaranteed to be atomic from a user-space point of
view? That is to say: having read an IN_MOVED_FROM event, does user
space have the guarantee that if there is an IN_MOVED_TO event, then
it will already be in the queue? The reason I ask is that this would
affect how user space might try to read the IN_MOVED_TO event. If
there is no such guarantee, then a read() (or select()/poll()) with
(small) timeout is needed. If such a guarantee is provided, then a
nonblocking read() would suffice.

  That's a good question... The events are not generated atomically, even
from the userspace POV; i.e., a userspace process may see a state where the
IN_MOVED_FROM event is already in the buffer but the IN_MOVED_TO event has
not been generated yet.
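
Without an atomicity guarantee, the reader has to fall back on the
"read with a small timeout" option from the question. A rough Python
sketch of that, under stated assumptions: fd is an already-open inotify
file descriptor obtained elsewhere (the Python standard library has no
inotify wrapper), and parse_events(), wait_for_moved_to(), and the 2 ms
default timeout are names and values invented for this example. The
header layout (int wd; uint32_t mask, cookie, len) follows struct
inotify_event.

```python
import os
import select
import struct
import time

# struct inotify_event header: int wd; uint32_t mask, cookie, len.
EVENT_HEADER = struct.Struct("iIII")
IN_MOVED_TO = 0x00000080  # from <sys/inotify.h>


def parse_events(buf):
    """Decode a read() buffer into (wd, mask, cookie, name) tuples."""
    events = []
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        # The name field is NUL-padded to name_len bytes.
        raw = buf[offset:offset + name_len].split(b"\0", 1)[0]
        offset += name_len
        events.append((wd, mask, cookie, raw.decode(errors="replace")))
    return events


def wait_for_moved_to(fd, cookie, timeout=0.002):
    """Wait up to `timeout` seconds for the IN_MOVED_TO with `cookie`.

    Returns (match, others): match is the event tuple or None, and
    others holds every unrelated event read while waiting, so the
    caller can still process them.
    """
    deadline = time.monotonic() + timeout
    others = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None, others
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            return None, others  # timed out: treat as "moved out"
        match = None
        for ev in parse_events(os.read(fd, 4096)):
            if match is None and ev[1] & IN_MOVED_TO and ev[2] == cookie:
                match = ev
            else:
                others.append(ev)
        if match is not None:
            return match, others
```

A return value of None only means that no partner arrived within the
timeout; the IN_MOVED_TO may still show up later, so the result stays a
heuristic.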

PS I just now found this code by John McCutchan
https://git.gnome.org/browse/gnome-vfs/tree/modules/inotify-kernel.c#n570
which suggests that the insertion of the event pair is not atomic
w.r.t. user space. Still, I wonder if there is any definitive
statement about this.

								Honza
-- 
Jan Kara [off-list ref]
SUSE Labs, CR