RE: [PATCH net-next] mana: Add support for EQ sharing
From: Haiyang Zhang <haiyangz@microsoft.com>
Date: 2021-08-21 21:18:01
Also in:
lkml, netdev
-----Original Message----- From: Dexuan Cui <decui@microsoft.com> Sent: Friday, August 20, 2021 8:33 PM To: Haiyang Zhang <haiyangz@microsoft.com>; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan [off-list ref]; Stephen Hemminger [off-list ref]; Paul Rosswurm [off-list ref]; Shachar Raindel [off-list ref]; olaf@aepfle.de; vkuznets [off-list ref]; davem@davemloft.net; linux-kernel@vger.kernel.org Subject: RE: [PATCH net-next] mana: Add support for EQ sharingquoted
Subject: [PATCH net-next] mana: Add support for EQ sharing"mana:" --> "net: mana:"
Will do.
quoted
The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed the device limit. Support EQ sharing, so that multiple vPorts can share the same set of MSIXs. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>The patch itself looks good to me, but IMO the changes are too big to be in one patch. :-) Can you please split it into some smaller ones and please document the important changes in the commit messages, e.g.
Will do.
1) move NAPI processing from EQ to CQ. 2) report the EQ-sharing capability bit to the host, which means the host can potentially offer more vPorts and queues to the VM. 3) support up to 256 virtual ports (it was 16). 4) support up to 64 queues per net interface (it was 16). It looks like the default number of queues is also 64 if the VM has >=64 CPUs? -- should we add a new field apc->default_queues and limit it to 16 or 32? We'd like to make sure typically the best performance can be achieved with the default number of queues.
I found on a 40 cpu VM, the mana_query_vport_cfg() returns max_txq:32, max_rxq:32, so I didn't further reduce the number (32) from PF. That's also the opinion from the host team -- if they upgrade the NIC HW in the future, they can adjust the setting from PF side without requiring VF driver change.
5) If the VM has >=64 CPUs, with the patch we create 1 HWC EQ and 64 NIC EQs, and IMO the creation of the last NIC EQ fails since now the host PF driver allows only 64 MSI-X interrupts? If this is the case, I think mana_probe() -> mana_create_eq() fails and no net interface will be created. It looks like we should create up to 63 NIC EQs in this case, and make sure we don't create too many SQs/RQs accordingly. At the end of mana_gd_query_max_resources(), should we add something like: if (gc->max_num_queues >= gc->num_msix_usable -1) gc->max_num_queues = gc->num_msix_usable -1;
As said, the PF allows 32 queues, and 64 MSI-X interrupts for now. The PF should increase the MSI-X limit if the #queues is increased to 64+. But for robustness, I like your idea that add a check in VF like above.
6) Since we support up to 256 ports, up to 64 NIC EQs and up to 64 SQ CQs and 64 RQ CQs per port, the size of one EQ should be at least 256*2*GDMA_EQE_SIZE = 256*2*16 = 8KB. Currently EQ_SIZE is hardcoded to 8 pages (i.e. 32 KB on x86-64), which should be big enough. Let's add the below just in case we support more ports in future: BUILD_BUG_ON(MAX_PORTS_IN_MANA_DEV*2* GDMA_EQE_SIZE > EQ_SIZE);
Will do.
7) In mana_gd_read_cqe(), can we add a WARN_ON_ONCE() in the case of overflow. Currently the error (which normally should not happen) is sliently ignored.
Will do. Thank you for the detailed reviews! - Haiyang