Re: [PATCH V3,net-next] net: mana: Add page pool for RX buffers
From: Jesper Dangaard Brouer <hidden>
Date: 2023-07-26 09:25:37
Also in: bpf, linux-hyperv, linux-rdma, lkml

On 25/07/2023 21.02, Haiyang Zhang wrote:
> -----Original Message-----
> From: Jesper Dangaard Brouer <redacted>
> Sent: Tuesday, July 25, 2023 2:01 PM
>
>>> Our driver is using NUMA 0 by default, so I implicitly assign NUMA
>>> node id to zero during pool init.
>>>
>>> And, if the IRQ/CPU affinity is changed, the page_pool_nid_changed()
>>> will update the nid for the pool. Does this sound good?
>>>
>>> Also, since our driver is getting the default node from here:
>>>
>>>     gc->numa_node = dev_to_node(&pdev->dev);
>>>
>>> I will update this patch to set the default node as above, instead
>>> of implicitly assigning it to 0.
>>
>> In that case, I agree that it make sense to use
>> dev_to_node(&pdev->dev), like:
>>
>>     pprm.nid = dev_to_node(&pdev->dev);
>>
>> Driver must have a reason for assigning gc->numa_node for this
>> hardware, which is okay. That is why page_pool API allows driver to
>> control this.
>>
>> But then I don't think you should call page_pool_nid_changed() like
>>
>>     page_pool_nid_changed(rxq->page_pool, numa_mem_id());
>>
>> Because then you will (at first packet processing event) revert the
>> dev_to_node() setting to use numa_mem_id() of processing/running CPU.
>> (In effect this will be the same as setting NUMA_NO_NODE).
>>
>> I know, mlx5 do call page_pool_nid_changed(), but they showed
>> benchmark numbers that this was preferred action, even-when sysadm
>> had "misconfigured" the default smp_affinity RX-processing to happen
>> on a remote NUMA node. AFAIK mlx5 keeps the descriptor rings on the
>> originally configured NUMA node that corresponds to the NIC PCIe
>> slot.
>
> In mana_gd_setup_irqs(), we set the default IRQ/CPU affinity to
> gc->numa_node too, so it won't revert the nid initial setting.
>
> Currently, the Azure hypervisor always indicates numa 0 as default.
> (In the future, it will start to provide the accurate default dev
> node.) When a user manually changes the IRQ/CPU affinity for perf
> tuning, we want to allow page_pool_nid_changed() to update the pool.
> Is this OK?

If I were you, I would hold off on the page_pool_nid_changed()
"optimization" and run a benchmark to see whether it actually has a
benefit. (You can do this in another patch.) (In an Azure hypervisor
environment it might not be the right choice.)

This reminds me, do you have any benchmark data on the improvement this
patch (using page_pool) gave?

--Jesper
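
To illustrate the behaviour discussed above, here is a minimal userspace
sketch in C. It is only a model under stated assumptions, not kernel
code: struct model_pool and the model_*() helpers are hypothetical names
invented for this example. They mimic setting the pool's node from the
device at init time and then, optionally, letting each RX poll retarget
the pool to the node of the CPU that is running, which is what passing
numa_mem_id() to page_pool_nid_changed() would amount to.

```c
/* Userspace model only: these are NOT the kernel's page_pool types. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of a pool that tracks its node. */
struct model_pool {
	int nid;         /* node that new pages would be allocated from */
	int nid_updates; /* number of times the node has been changed */
};

/* Models "pprm.nid = dev_to_node(&pdev->dev)" at pool creation. */
static void model_pool_init(struct model_pool *pool, int dev_node)
{
	assert(pool != NULL);
	pool->nid = dev_node;
	pool->nid_updates = 0;
}

/* Models a nid-changed notification: acts only if the node differs. */
static void model_pool_nid_changed(struct model_pool *pool, int new_nid)
{
	assert(pool != NULL);
	if (pool->nid != new_nid) {
		pool->nid = new_nid;
		pool->nid_updates++;
	}
}

/*
 * Models one RX poll running on a CPU that belongs to cpu_node.
 * track_cpu != 0 means the driver reports the running CPU's node
 * to the pool on every poll.
 */
static void model_rx_poll(struct model_pool *pool, int cpu_node,
			  int track_cpu)
{
	if (track_cpu)
		model_pool_nid_changed(pool, cpu_node);
}
```

In this model, a pool initialised on node 0 and then polled once from a
CPU on node 1 with track_cpu set ends up on node 1, so the initial
device-derived setting only survives while the IRQ affinity stays on the
device's node. With track_cpu cleared, the pool stays on node 0 no
matter where the poll runs. Which of the two performs better is exactly
what a benchmark would have to show.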