Thread: 24 messages, 4 authors, 2023-08-18

RE: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()

From: David Laight <hidden>
Date: 2023-08-17 15:17:29
Also in: linux-fsdevel, linux-mm, lkml

From: Linus Torvalds
Sent: Thursday, August 17, 2023 3:38 PM

On Thu, 17 Aug 2023 at 10:42, David Laight [off-list ref] wrote:
> Although I'm not sure the bit-fields really help.
> There are 8 bytes at the start of the structure, might as well
> use them :-)
Actually, I wrote the patch that way because it seems to improve code
generation.

The bitfields are generally all set together as just plain one-time
constants at initialization time, and gcc sees that it's a full byte
write.
I've just spent too long on godbolt (again) :-)
Fiddling with:

#define t1 unsigned char
struct b {
    t1 b1:7;
    t1 b2:1;
};

void ff(struct b *,int);

void ff1(void)
{
    struct b b = {.b1=3, .b2 = 1};
    ff(&b, sizeof b);
}

gcc for x86-64 makes a pig's breakfast when the bitfields are 'char'
and loads the constant from memory using pc-relative access.
Otherwise pretty much all variants (with or without the bitfield)
get initialised in a single write.
(Although gcc seems to insist on loading a 32bit constant into %eax.)

I can well imagine that keeping the constant below 32768 will help
on architectures that have to construct large constants.
And the reason 'data_source' is not a bitfield is that it's not
a constant at iov_iter init time (it's an argument to all the init
functions), so having that one as a separate byte at init time is good
for code generation when you don't need to mask bits or anything like
that.
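
A minimal sketch of the layout being described, using a made-up struct (the names demo_iter, demo_iter_init and DEMO_TYPE_KVEC are hypothetical, and this is not the kernel's struct iov_iter). The bitfields are all assigned constants, so a compiler has the chance to merge them into one store, while the runtime argument lands in its own byte with no masking:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for illustration; NOT the kernel's struct iov_iter. */
struct demo_iter {
    unsigned int iter_type : 4;  /* constant in each init helper */
    unsigned int nofault   : 1;  /* constant at init time */
    unsigned int copy_mc   : 1;  /* constant at init time */
    unsigned char data_source;   /* runtime argument, kept in its own byte */
};

enum { DEMO_TYPE_KVEC = 2 };     /* made-up type code */

/* Every bitfield is set from a constant; only data_source varies. */
static inline void demo_iter_init(struct demo_iter *it,
                                  unsigned char data_source)
{
    assert(it != NULL);
    *it = (struct demo_iter){
        .iter_type   = DEMO_TYPE_KVEC,
        .nofault     = 0,
        .copy_mc     = 0,
        .data_source = data_source,
    };
}
```

Whether the constant fields really end up as a single store depends on the compiler, the target and the bitfield base type, so it is worth checking the generated code rather than assuming it.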

And once initialized, having things be dense and doing all the
compares with a bitwise 'and' instead of doing them as some value
compare again tends to generate good code.

Then being able to test multiple bits at the same time is just gravy
on top of that (ie that whole "remove user_backed, because it's easier
to just test the bit combination").
Indeed, they used to be bits but never got tested together.
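
To illustrate testing a bit combination with one 'and', here is a sketch that assumes a one-bit-per-type encoding. The DEMO_* names and bit values are made up for the example and are not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-bit-per-type encoding, for illustration only. */
enum {
    DEMO_UBUF  = 1u << 0,
    DEMO_IOVEC = 1u << 1,
    DEMO_KVEC  = 1u << 2,
    DEMO_BVEC  = 1u << 3,
};

/* The "user backed" property is just a combination of type bits. */
#define DEMO_USER_BACKED (DEMO_UBUF | DEMO_IOVEC)

/* One bitwise 'and' answers "is it any of these types?". */
static inline bool demo_user_backed(unsigned int type_bits)
{
    return (type_bits & DEMO_USER_BACKED) != 0;
}

/* Small usage check. */
static inline void demo_user_backed_check(void)
{
    assert(demo_user_backed(DEMO_UBUF));
    assert(demo_user_backed(DEMO_IOVEC));
    assert(!demo_user_backed(DEMO_KVEC));
    assert(!demo_user_backed(DEMO_BVEC));
}
```

With this kind of encoding a separate 'user_backed' flag is redundant, since the same answer falls out of the type bits.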
> OTOH the 'nofault' and 'copy_mc' flags could be put into much
> higher bits of a 32bit value.
Once you start doing that, you often get bigger constants in the code stream.
I wasn't thinking of using 'really big' values :-)
Even 32768 can be an issue because some CPUs sign-extend all constants.
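
A sketch of why 32768 is the boundary, assuming an instruction format with a 16-bit immediate that is always sign-extended to 32 bits (the helper name demo_fits_simm16 is made up for the example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the 32-bit pattern 'value' can be produced by sign-extending a
 * 16-bit immediate: either 0..0x7fff, or 0xffff8000..0xffffffff
 * (that is, -32768..-1 viewed as unsigned).
 */
static inline bool demo_fits_simm16(uint32_t value)
{
    return value <= 0x7fffu || value >= 0xffff8000u;
}

/* Small usage check. */
static inline void demo_fits_simm16_check(void)
{
    assert(demo_fits_simm16(32767));        /* 0x7fff fits */
    assert(!demo_fits_simm16(32768));       /* 0x8000 would become 0xffff8000 */
    assert(demo_fits_simm16(0xffff8000u));  /* -32768 fits */
}
```

On such a target a flag mask of 0x8000 cannot be encoded directly and has to be built some other way, whereas 0x4000 and below can.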

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)