Thread (58 messages) 58 messages, 5 authors, 2024-07-17

Re: [PATCH net-next v9 06/13] mm: page_frag: reuse existing space for 'size' and 'pfmemalloc'

From: Alexander Duyck <hidden>
Date: 2024-07-11 16:49:47
Also in: linux-mm, lkml

On Thu, Jul 11, 2024 at 1:16 AM Yunsheng Lin [off-list ref] wrote:
On 2024/7/10 23:28, Alexander H Duyck wrote:
...
quoted
quoted
Yes, agreed. It would be good to be more specific about how to avoid
the above problem using a signed negative number for 'remaining' as
I am not sure how it can be done yet.
My advice would be to go back to patch 3 and figure out how to do this
re-ordering without changing the alignment behaviour. The old code
essentially aligned both the offset and fragsz by combining the two and
then doing the alignment. Since you are doing a count up setup you will
I am not sure I understand 'aligned both the offset and fragsz ' part, as
'fragsz' being aligned or not seems to depend on last caller' align_mask,
for the same page_frag_cache instance, suppose offset = 32768 initially for
the old code:
Step 1: __page_frag_alloc_align() is called with fragsz=7 and align_mask=~0u
       the offset after this is 32761, the true fragsz is 7 too.

Step 2: __page_frag_alloc_align() is called with fragsz=7 and align_mask=-16
        the offset after this is 32752, the true fragsz is 9, which does
        not seems to be aligned.
I was referring to my original code before this patchset. I was doing
the subtraction of the fragsz first, and then aligning which would end
up padding the end of the frame as it was adding to the total size by
pulling the offset down *after* I had already subtracted fragsz. The
result was that I was always adding additional room depending on the
setting of the fragsz and how it related to the alignment. After
changing the code to realign on the start of the next frag the fragsz
is at the mercy of the next caller's alignment. In the event that the
caller didn't bother to align the fragsz by the align mask before hand
they can end up with a scenario that might result in false sharing.
The above is why I added the below paragraph in the doc to make the semantic
more explicit:
"Depending on different aligning requirement, the page_frag API caller may call
page_frag_alloc*_align*() to ensure the returned virtual address or offset of
the page is aligned according to the 'align/alignment' parameter. Note the size
of the allocated fragment is not aligned, the caller needs to provide an aligned
fragsz if there is an alignment requirement for the size of the fragment."

And existing callers of page_frag aligned API does seems to follow the above
rule last time I checked.

Or did I miss something obvious here?
No you didn't miss anything. It is just that there is now more
potential for error than there was before.
quoted
need to align the remaining, then add fragsz, and then I guess you
could store remaining and then subtract fragsz from your final virtual
address to get back to where the starting offset is actually located.
remaining = __ALIGN_KERNEL_MASK(nc->remaining, ~align_mask);
remaining += fragsz;
nc->remaining = remaining;
return encoded_page_address(nc->encoded_va) + (size + remaining) - fragsz;

If yes, I am not sure what is the point of doing the above yet, it
just seem to make thing more complicated and harder to understand.
That isn't right. I am not sure why you are adding size + remaining or
what those are supposed to represent.

The issue was that the "remaining" ends up being an unaligned value as
you were starting by aligning it and then adding fragsz. So by
subtracting fragsz you can get back to the aliglined start. What this
patch was doing before was adding the raw unaligned nc->remaining at
the end of the function.
quoted
Basically your "remaining" value isn't a safe number to use for an
offset since it isn't aligned to your starting value at any point.
Does using 'aligned_remaining' local variable to make it more obvious
seem reasonable to you?
No, as the value you are storing above isn't guaranteed to be aligned.
If you stored it before adding fragsz then it would be aligned.
quoted
As far as the negative value, it is more about making it easier to keep
track of what is actually going on. Basically we can use regular
pointer math and as such I suspect the compiler is having to do extra
instructions to flip your value negative before it can combine the
values via something like the LEA (load effective address) assembler
call.
I am not an asm expert here, I am not sure I understand the optimization
trick here.
The LEA instruction takes a base address adds 1/2/4/8 times a multiple
and then a fixed offset all in one function and provides an address as
an output. The general idea is that you could look at converting
things such that you are putting together the page address +
remaining*1 + PAGE_SIZE. Basically what I was getting at is that
addition works, but it doesn't do negative values for the multiple.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help