Thread (35 messages), 3 authors, 2012-10-11

Re: [PATCH 5/5] aio: Refactor aio_read_evt, use cmpxchg(), fix bug

From: Kent Overstreet <hidden>
Date: 2012-10-10 00:47:52
Also in: dm-devel, lkml

On Tue, Oct 09, 2012 at 05:26:34PM -0700, Zach Brown wrote:
quoted
The AIO ringbuffer stuff just annoys me more than most
Not more than everyone, though, I can personally promise you that :).
quoted
(it wasn't until
the other day that I realized it was actually exported to userspace...
what led to figuring that out was noticing aio_context_t was a ulong,
and got truncated to 32 bits with a 32 bit program running on a 64 bit
kernel. I'd been horribly misled by the code comments and the lack of
documentation.) 
Yeah.  It's the userspace address of the mmaped ring.  This has annoyed
the process migration people who can't recreate the context in a new
kernel because there's no userspace interface to specify creation of a
context at a specific address.
Yeah I did finally figure that out - and a file descriptor that
userspace then mmap()ed would solve that problem...
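
[Editor's sketch, not from the thread.] The width mismatch described above can be illustrated roughly as follows. The address is made up so that the difference is visible; it is not a claim about where a real ring gets mapped, and `ring_addr`, `ctx_64` and `ctx_32` are hypothetical names.

```python
import ctypes

# Hypothetical userspace address of an mmapped AIO ring (invented for illustration).
ring_addr = 0x00007F3A12345000

# On a 64-bit kernel, aio_context_t is an unsigned long, i.e. 64 bits wide.
ctx_64 = ctypes.c_uint64(ring_addr).value

# A 32-bit program stores the same handle in a 32-bit unsigned long.
# ctypes integer types do no overflow checking, so only the low 32 bits survive.
ctx_32 = ctypes.c_uint32(ring_addr).value

print(hex(ctx_64))
print(hex(ctx_32))
```

If the handle is really an address, as discussed here, a narrower copy only identifies the same ring when the upper bits happen to be zero.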
quoted
But if we do have an explicit handle, I don't see why it shouldn't be a
file descriptor.
Because they're expensive to create and destroy when compared to a
single system call.  Imagine that we're waiting for a single
completion to implement a cheap one-off sync call.  Imagine it's a
buffered op which happens to hit the cache and is really quick.
True. But that could be solved with a separate interface that either
doesn't use a context to submit a call synchronously, or uses an
implicit per thread context.
(And they're annoying to manage: libraries and O_CLOEXEC, running into
fd/file limit tunables, bleh.)
I don't have a _strong_ opinion there, but my intuition is that we
shouldn't be creating new types of handles without a good reason. I
don't think the annoyances are for the most part particular to file
descriptors, I think they tend to be applicable to handles in general and
at least with file descriptors they're known and solved.

Also, with a file descriptor it naturally works with an epoll event
loop. (eventfd for aio is a hack).
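
[Editor's sketch, not from the thread.] The point about event loops can be sketched as below, assuming a completion context were exposed as a file descriptor. No such interface is described in the thread; a plain pipe stands in for the hypothetical completion fd, and the "aio-completions" label is invented.

```python
import os
import selectors

# DefaultSelector picks the best mechanism available (epoll on Linux).
sel = selectors.DefaultSelector()

# Stand-in for a hypothetical AIO completion fd: the read end of a pipe.
r, w = os.pipe()
sel.register(r, selectors.EVENT_READ, data="aio-completions")

# Simulate "a completion arrived" by making the fd readable.
os.write(w, b"\x01")

# The event loop sees it like any other fd, alongside sockets, timers, etc.
ready = [(key.data, key.fd) for key, _events in sel.select(timeout=1)]
token = os.read(r, 1)

sel.unregister(r)
sel.close()
os.close(r)
os.close(w)
```

Nothing in the loop is specific to AIO, which is the attraction: the handle needs no separate wait path.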
If the 'completion context' is no more than a structure in userspace
memory then a lot of stuff just works.  Tasks can share it amongst
themselves as they see fit.  A trivial one-off sync call can just dump
it on the stack and point to it.  It doesn't have to be specifically
torn down on task exit.
That would be awesome, though for it to be worthwhile there couldn't be
any kernel notion of a context at all and I'm not sure if that's
practical. But the idea hadn't occurred to me before and I'm sure you've
thought about it more than I have... hrm.

Oh hey, that's what acall does :P

For completions though you really want the ringbuffer pinned... what do
you do about that?
quoted
quoted
And perhaps obviously, I'd start with the acall stuff :).  It was a lot
lighter.  We could talk about how to make it extensible without going
all the way to the generic packed variable size duplicating or not and
returning or not or.. attributes :).
Link? I haven't heard of acall before.
I linked to it after that giant silly comment earlier in the thread,
here it is again:

  http://lwn.net/Articles/316806/
Oh whoops, hadn't started reading yet - looking at it now :)
There's a mostly embarrassing video of a jetlagged me giving that talk at
LCA kicking around.. ah, here:

 http://mirror.linux.org.au/pub/linux.conf.au/2009/Thursday/131.ogg

- z