Thread: 48 messages, 8 authors, 2021-11-08

Re: [dpdk-dev] [PATCH v6 2/3] eal: add memory pre-allocation from existing files

From: Dmitry Kozlyuk <hidden>
Date: 2021-10-12 21:09:36

-----Original Message-----
From: David Marchand <redacted>
Sent: October 12, 2021, 20:33
To: Dmitry Kozlyuk <redacted>
Cc: dev <redacted>; Slava Ovsiienko <redacted>; Anatoly
Burakov [off-list ref]; NBU-Contact-Thomas Monjalon
[off-list ref]
Subject: Re: [dpdk-dev] [PATCH v6 2/3] eal: add memory pre-allocation from
existing files


On Tue, Oct 12, 2021 at 5:55 PM Dmitry Kozlyuk [off-list ref]
wrote:
quoted
quoted
I have some trouble figuring the need for the list of files.
Why not use a global knob --mem-clear-on-alloc for this behavior
change?
quoted
Moving memset() doesn't speed anything up, it's a forced step for the
reasons below.
quoted
Currently, memory is cleared by the kernel when a page is mapped during
an allocation.
quoted
This cannot be turned off in stock kernels. The issue is that initial
allocations are longer by the time needed to clear the pages, which is
90% of the allocation time. For the memory intended for DMA this time is just wasted. If
allocations are large, application startup and restart take long. The only
way to get hugepages mapped without the kernel clearing them is to map
existing files in hugetlbfs. However, rte_zmalloc() needs to return clean
memory, that's why we move memset() there. Memory intended for DMA is just
never cleared this way. But memory freed and allocated again will be
cleared again, unfortunately.
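
To illustrate the mechanism described above, here is a minimal sketch in plain POSIX C. It is not DPDK code: map_file() and zalloc_from() are hypothetical names, and an ordinary file stands in for a hugetlbfs file so the sketch runs anywhere. It only shows that a newly created file is handed out zero-filled by the kernel, that re-mapping an existing file keeps its old contents, and that a zeroing allocator therefore has to clear the memory itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes of the file at `path` (hypothetical helper).
 * create != 0: the file is created and sized, so its pages are
 *              zero-filled by the kernel when first touched.
 * create == 0: an existing file is mapped as-is and keeps whatever
 *              data it held; nothing is cleared.
 * Returns NULL on failure.
 */
static void *map_file(const char *path, size_t len, int create)
{
    assert(path != NULL && len > 0);

    int flags = create ? (O_RDWR | O_CREAT | O_TRUNC) : O_RDWR;
    int fd = open(path, flags, 0600);
    if (fd < 0)
        return NULL;
    if (create && ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }
    void *va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return va == MAP_FAILED ? NULL : va;
}

/*
 * Hypothetical stand-in for a zeroing allocator: since the mapping may
 * hold stale data, the clearing happens here, at allocation time.
 */
static void *zalloc_from(void *block, size_t len)
{
    return memset(block, 0, len);
}
```

A caller would use map_file(path, len, 1) on the first start and map_file(path, len, 0) on a restart. Only blocks requested through the zeroing path pay for memset(); a buffer that the device overwrites anyway can be handed out without clearing.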

Writing my limited understanding, please correct me.

The --mem-file that is proposed does:
- preallocate files which is something close to --socket-mem with the
following differences
  - --mem-file lets user decide on dpdk hugepage files names, which I
think conflicts with --huge-dir and --file-prefix,
  - --mem-file lets user decide on hugepage size which I think could be
achieved with some --huge-dir option,
The comparison to --socket-mem is valid, because preallocated files form the initial amount of memory allocated from the system. However, using --mem-file does not preclude DPDK from allocating more memory according to --huge-dir and --file-prefix when the application runs out of preallocated blocks.
- bypasses unlink() of existing hugepage files which I had overlooked but
is the main painpoint,
- enforces "clear on alloc" in rte_malloc/rte_free.


From this, I see two parts in this patch:
- faster restart, reusing hugepage files as is (combination of not calling
unlink() and doing "clear on alloc"),
  This part is interesting, and I think a single knob for this would be
enough.
In combination with rte_extmem* API this knob would indeed allow implementing the feature in the app. However, the drawback is that all the logic to select hugepage size, NUMA, and names would need to be done from the app, probably with its own options. OTOH, there is already hugetlbfs and numactl to avoid apps duplicating this logic. Also, it's not only the fast restart, but also the fast initial start on a prepared system.
- finegrained control of hugepage files, but it has the drawback of
imposing primary/secondary run with the same options.
  The second part seems complex to configure. I see conflicts with
existing options, so it seems a good way to get caught up in the carpet
(sorry if it translates badly from French :p).
I don't see why synchronizing memory options is a big issue.
Primary and secondary processes are inherently interdependent.