Thread (46 messages) 46 messages, 11 authors, 2016-12-22

Re: [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

From: Stephen Hemminger <stephen@networkplumber.org>
Date: 2016-09-01 17:41:11

On Thu, 1 Sep 2016 13:35:19 -0400
Neil Horman [off-list ref] wrote:
On Thu, Sep 01, 2016 at 12:55:27PM +0000, Trahe, Fiona wrote:
quoted
Hi Neil and Olivier,
  
quoted
-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
Sent: Wednesday, August 31, 2016 2:40 PM
To: Neil Horman <nhorman@tuxdriver.com>
Cc: dev@dpdk.org; thomas.monjalon@6wind.com
Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies
in pmdinfo

Hi Neil,

On 08/31/2016 03:27 PM, Neil Horman wrote:  
quoted
On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:  
quoted
Hi Neil,

On 08/30/2016 03:23 PM, Neil Horman wrote:  
quoted
On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:  
quoted
Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows a driver to
declare the list of kernel modules required to run properly.

Today, most PCI drivers require uio/vfio.

Signed-off-by: Olivier Matz <redacted>

---
In this RFC, I supposed that all PCI drivers require a the loading of a
uio/vfio module (except mlx*), this may be wrong.
Comments are welcome!


 buildtools/pmdinfogen/pmdinfogen.c      |  1 +
 buildtools/pmdinfogen/pmdinfogen.h      |  1 +
 drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
 drivers/net/bnx2x/bnx2x_ethdev.c        |  4 ++++
 drivers/net/bnxt/bnxt_ethdev.c          |  2 ++
 drivers/net/cxgbe/cxgbe_ethdev.c        |  2 ++
 drivers/net/e1000/em_ethdev.c           |  2 ++
 drivers/net/e1000/igb_ethdev.c          |  4 ++++
 drivers/net/ena/ena_ethdev.c            |  2 ++
 drivers/net/enic/enic_ethdev.c          |  2 ++
 drivers/net/fm10k/fm10k_ethdev.c        |  2 ++
 drivers/net/i40e/i40e_ethdev.c          |  2 ++
 drivers/net/i40e/i40e_ethdev_vf.c       |  2 ++
 drivers/net/ixgbe/ixgbe_ethdev.c        |  4 ++++
 drivers/net/mlx4/mlx4.c                 |  2 ++
 drivers/net/mlx5/mlx5.c                 |  3 +++
 drivers/net/nfp/nfp_net.c               |  2 ++
 drivers/net/qede/qede_ethdev.c          |  4 ++++
 drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
 drivers/net/thunderx/nicvf_ethdev.c     |  2 ++
 drivers/net/virtio/virtio_ethdev.c      |  2 ++
 drivers/net/vmxnet3/vmxnet3_ethdev.c    |  2 ++
 lib/librte_eal/common/include/rte_dev.h | 14 ++++++++++++++
 tools/dpdk-pmdinfo.py                   |  5 ++++-
 24 files changed, 69 insertions(+), 1 deletion(-)
 
Generally speaking, I like the idea, it makes sense to me in terms of using
pmdinfo to export this information

That said, This may need to be a set of macros.  By that I mean (and correct  
me  
quoted
quoted
quoted
if I'm wrong here), but the relationship between pmd's and kernel modules  
is in  
quoted
quoted
quoted
some cases, more complex than a 'requires' or 'depends' relationship.  That  
is  
quoted
quoted
quoted
to say, some pmd may need user space hardware access, but can use either  
uio OR  
quoted
quoted
quoted
vfio, but doesn't need both, and can continue to function if only one is
available.  Other PMD's may be able to use vfio or uio, but can still function
without either.  And some, as your patch implements, simply require one or  
the  
quoted
quoted
quoted
other to function.  As such it seems like you may want a few macros, in the  
form  
quoted
quoted
quoted
of:

DRIVER_REGISTER_KMOD_REQUEST - List of modules to attempt loading,  
ignore any  
quoted
quoted
quoted
failures
DRIVER_REGISTER_KMOD_REQUIRE - List of modules required to be  
loaded after  
quoted
quoted
quoted
request macro completes, fail if any are not loaded

Thats just spitballing, mind you, theres probably a better way to do it, but  
the  
quoted
quoted
quoted
idea is to list a set of modules you would like to have, and then create a
parsable syntax to describe the modules that need to be loaded after the  
request  
quoted
quoted
quoted
is complete so that you can accurately codify the situations I described  
above.  
quoted
quoted
Thank you for your feedback.
However, I'm not sure I'm perfectly getting what you suggest.

Do you think some PMDs could request a kernel module without really
requiring it? Do you have an example in mind?
 
Yes, thats precisely it.  The most clear example I could think of (though I'm
not sure if any pmd currently supports this), is a pmd that supports both UIO
and VFIO communication with the kernel.  Such a PMD requires that one of  
those  
quoted
two modules be loaded, but only one (i.e. both are not required), so if only  
the  
quoted
uio kernel module loads is a success case, likewise if only the vfio module
loads can be treated as success.  Both loading are clearly successful.  Only if
neither load do we have a failure case.  I'm suggesting that the grammer that
your exports define should take those cases into account.  Its not always as
simple as "I must have the following modules"
 
quoted
The syntax I've submitted lets you define several lists of modules, so
that the user or the script that starts the application can decide which
kmod list is better according to the environment.
 
If you have a human intervening in the module load process, sure, then its  
fine.  
quoted
But it seems that this particular feature that you're implemnting might have
automated uses.  That is to say the dpdk core library might be interested in
parsing this particular information to direct module autoloading, and if thats
desireable then you need to define these lists such that you can codify failure
and success conditions.
 
quoted
For example, most drivers will advertise
"uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci", and the user or script
will have to choose between loading:
- uio igb_uio
- uio uio_pci_generic
- vfio vfio-pci
 
Oh, I see, so your list is a colon delimited list of module load sets, where at
least one set must succeed by loading all modules in its set, but the failure of
any one set isn't fatal to the process?  e.g. a string like this:

uio,igb_uio:vfio,vfio-pci

could be interpreted to mean "I must load (uio AND igb_uio) OR (vfio AND
vfio-pci).  If the evaluation of that statement results in false, then the
operation fails, otherwise it succedes.

If thats the case, then, apologies, we're on the same page, and this will work
just fine.  
Yep, that's the idea.

Colon and commas are the best separators I've thought about, but any
idea to make the syntax clearer is welcome ;)

Maybe a syntax like is clearer:
  "(mod1 & mod2)|(mod3 & mod4)" ?
But it would let the user think that more complex expressions are valid,
like "(mod1 & (mod2 | mod3)) | mod4", which is probably overkill.

Regards,
Olivier  
This RFC seems like a good idea - and something the Intel QuickAssist PMD could benefit from.
However the (mod1 & mod2) can handle the QAT case better in my opinion.
i.e.
as well as needing one of 
* uio igb_uio
* uio uio_pci_generic
* vfio vfio-pci
QAT PMD also needs one of (depending on which physical device is plugged)
 * qat_dh895xcc
 * qat_c62x
 * qat_c3xxx

So the original syntax would result in a very long list of possible variations.
What really reflects the dependencies would be 
((uio & igb_uio) | (uio & uio_pci_generic) | (vfio & vfio_pci)) & (qat_dh895xcc | qat_c62x | qat_c3xxx)
  
Ah, I didn't consider that hardware specifics might create a use case where a
pmd must have one or more kernel modules available for hw support.  Perhaps it
is worthwhile to automate hardware support - that is to say, any module loading
script should automatically look at the pci table exported from a pmd, and, if
found, load any modules that claim support for that device:vendor tuple?  Though
that might break in the case of uio, if there are separate driver modules that
support native hardware and uio access.
I ended up writing a script that went the other way.
First look at the hardware and load VFIO if IOMMU is available.
Then look for special driver needed for Xen and HyperV
Lastly fallback to loading igb_uio if no VFIO and PCI device present.

In other words it is a system not driver issue.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help