Re: [PATCH] fix unbalanced page refcounting in bio_map_user_iov
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: 2017-09-24 14:27:42
Also in:
lkml
On Sat, Sep 23, 2017 at 09:33:23PM +0100, Al Viro wrote:
On Sat, Sep 23, 2017 at 06:19:26PM +0100, Al Viro wrote:
On Sat, Sep 23, 2017 at 05:55:37PM +0100, Al Viro wrote:
IOW, the loop on failure exit should go through the bio, like
__bio_unmap_user() does. We *also* need to put everything left unused in
pages[], but only from the last iteration through iov_for_each(). Frankly,
I would prefer to reuse the pages[], rather than append to it on each
iteration. Used iov_iter_get_pages_alloc(), actually.
Something like completely untested diff below, perhaps...
+	unsigned n = PAGE_SIZE - offs;
+	unsigned prev_bi_vcnt = bio->bi_vcnt;
Sorry, that should've been followed by
	if (n > bytes)
		n = bytes;
Anyway, a carved-up variant is in vfs.git#work.iov_iter. It still needs
review and testing; the patch Vitaly has posted in this thread plus 6
followups, hopefully more readable than aggregate diff. Comments?

BTW, there's something fishy in bio_copy_user_iov(). If the area we'd asked for
had been too large for a single bio, we are going to create a bio and have
bio_add_pc_page() eventually fill it up to limit. Then we return into
__blk_rq_map_user_iov(), advance iter and call bio_copy_user_iov() again.
Fine, but... now we might have non-zero iter->iov_offset. And this
	bmd->is_our_pages = map_data ? 0 : 1;
	memcpy(bmd->iov, iter->iov, sizeof(struct iovec) * iter->nr_segs);
	iov_iter_init(&bmd->iter, iter->type, bmd->iov,
			iter->nr_segs, iter->count);
does not even look at iter->iov_offset. As a result, when it gets to
bio_uncopy_user(), we copy the data from each bio into the *beginning* of
the user area, overwriting the data from the other bio.
At the very least, we need
	bmd->iter = *iter;
	bmd->iter.iov = bmd->iov;
instead of that iov_iter_init() in there. I'm not sure how far back it
goes; looks like "block: support large requests in blk_rq_map_user_iov"
is the earliest possible point, but it might need more digging to make
sure. v4.5+, if that's when the problems began...
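
To illustrate the offset loss described above, here is a minimal userspace
sketch. It is not kernel code: struct model_iovec, struct model_iter and every
function in it are hypothetical stand-ins that only mimic the relevant fields
(iov, nr_segs, iov_offset, count). It assumes one 8-byte user area served by
two "bios" of 4 bytes each, and compares a snapshot that re-initialises the
iterator with one that copies it wholesale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins, NOT the kernel's iovec/iov_iter. */
struct model_iovec {
	char *base;
	size_t len;
};

struct model_iter {
	const struct model_iovec *iov;
	size_t nr_segs;
	size_t iov_offset;	/* bytes already consumed in iov[0] */
	size_t count;		/* bytes remaining overall */
};

/* Fresh initialisation: always starts at offset 0 of the first segment. */
static void model_iter_init(struct model_iter *i, const struct model_iovec *iov,
			    size_t nr_segs, size_t count)
{
	i->iov = iov;
	i->nr_segs = nr_segs;
	i->iov_offset = 0;
	i->count = count;
}

/* Copy up to len bytes from src to wherever the iterator points. */
static size_t model_copy_to_iter(const char *src, size_t len,
				 struct model_iter *i)
{
	size_t done = 0;

	if (len > i->count)
		len = i->count;
	while (done < len) {
		size_t room, n = len - done;

		assert(i->nr_segs > 0);
		room = i->iov->len - i->iov_offset;
		if (n > room)
			n = room;
		memcpy(i->iov->base + i->iov_offset, src + done, n);
		done += n;
		i->iov_offset += n;
		i->count -= n;
		if (i->iov_offset == i->iov->len && i->count) {
			i->iov++;
			i->nr_segs--;
			i->iov_offset = 0;
		}
	}
	return done;
}

/* Mirrors the quoted pattern: the saved iterator forgets iov_offset. */
static void snapshot_buggy(struct model_iter *saved, struct model_iovec *copy,
			   const struct model_iter *iter)
{
	memcpy(copy, iter->iov, sizeof(*copy) * iter->nr_segs);
	model_iter_init(saved, copy, iter->nr_segs, iter->count);
}

/* Mirrors the suggested fix: copy the iterator, then repoint ->iov. */
static void snapshot_fixed(struct model_iter *saved, struct model_iovec *copy,
			   const struct model_iter *iter)
{
	memcpy(copy, iter->iov, sizeof(*copy) * iter->nr_segs);
	*saved = *iter;
	saved->iov = copy;
}

/* Two 4-byte "bios" over one 8-byte user area; result lands in out. */
static void run_two_bios(int fixed, char out[9])
{
	char user[9] = "........";
	struct model_iovec iov = { user, 8 };
	struct model_iovec copy[2][1];
	struct model_iter iter, saved[2];
	int b;

	model_iter_init(&iter, &iov, 1, 8);
	for (b = 0; b < 2; b++) {
		if (fixed)
			snapshot_fixed(&saved[b], copy[b], &iter);
		else
			snapshot_buggy(&saved[b], copy[b], &iter);
		/* the caller advances past what this "bio" covers */
		iter.iov_offset += 4;
		iter.count -= 4;
	}
	/* "uncopy": each bio writes its data back through its saved iter */
	model_copy_to_iter("AAAA", 4, &saved[0]);
	model_copy_to_iter("BBBB", 4, &saved[1]);
	memcpy(out, user, sizeof(user));
}
```

Calling run_two_bios(0, out) leaves "BBBB...." in out, since the second
write-back lands on top of the first. run_two_bios(1, out) gives "AAAABBBB".
The model handles only the offset bookkeeping; it says nothing about bounce
pages, map_data or error paths.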
Anyway, I'd added the obvious fix to #work.iov_iter, reordered it and
force-pushed the result.