Thread: 46 messages, 7 authors, 2017-06-01

Re: [v3 0/9] parallelized "struct page" zeroing

From: David Miller <davem@davemloft.net>
Date: 2017-05-12 17:37:46
Also in: linux-mm, linux-s390, lkml, sparclinux

From: Pasha Tatashin <redacted>
Date: Fri, 12 May 2017 13:24:52 -0400
Right now it is larger, but what I suggested is to add a new optimized
routine just for this case, which would do STBI for 64 bytes but
without membar (do the membar at the end of memmap_init_zone() and
deferred_init_memmap()):

/* Clear 64 bytes with two STBI stores; the membar is left to the caller. */
#define struct_page_clear(page)                                 \
        __asm__ __volatile__(                                   \
        "stxa   %%g0, [%0]%2\n"                                 \
        "stxa   %%g0, [%0 + %1]%2\n"                            \
        : /* No output */                                       \
        : "r" (page), "r" (0x20), "i"(ASI_BLK_INIT_QUAD_LDD_P)  \
        : "memory")

Then insert it into __init_single_page() in place of memset().

The final result is 4.01s/T, which is even faster than the current
4.97s/T.

Ok, indeed, that would work.
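
For readers following the thread, the shape of the proposal (clear each
struct page without a barrier, then issue one barrier after the whole
range is initialized) can be sketched in portable C roughly as below.
This is only an illustration under stated assumptions: page_sketch,
page_clear_sketch(), init_single_page_sketch(), init_range_sketch() and
fence_count are hypothetical names that do not exist in the kernel,
memset() stands in for the STBI stores, and a C11 fence stands in for
the SPARC membar.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte stand-in for the kernel's struct page. */
struct page_sketch {
	uint64_t words[8];
};

static_assert(sizeof(struct page_sketch) == 64, "sketch assumes 64 bytes");

/* Number of fences issued; kept only so the sketch can be checked. */
static unsigned long fence_count;

/*
 * Portable stand-in for struct_page_clear(): zero one entry and issue
 * no barrier. On sparc64 this is where the STBI stores would go.
 */
static inline void page_clear_sketch(struct page_sketch *page)
{
	memset(page, 0, sizeof(*page));
}

/* Stand-in for __init_single_page(): clear first, then set fields. */
static inline void init_single_page_sketch(struct page_sketch *page,
					   uint64_t flags)
{
	page_clear_sketch(page);
	page->words[0] = flags;
}

/*
 * Stand-in for the loop in memmap_init_zone(): initialize every entry,
 * then issue a single fence once the whole range is done.
 */
static void init_range_sketch(struct page_sketch *map, size_t n,
			      uint64_t flags)
{
	for (size_t i = 0; i < n; i++)
		init_single_page_sketch(&map[i], flags);

	/* Plays the role of the one membar at the end of the loop. */
	atomic_thread_fence(memory_order_seq_cst);
	fence_count++;
}
```

The point of the arrangement is that the cost of the barrier is paid
once per range rather than once per struct page; whether that is safe
depends on nothing else observing the pages before the final barrier.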