Re: [PATCH wpan-next v3 2/4] net: ieee802154: Add support for inter PAN management

From: Alexander Aring <aahringo@redhat.com>
Date: 2022-07-01 12:23:49

Hi,

On Thu, Jun 30, 2022 at 8:50 PM Miquel Raynal [off-list ref] wrote:

Hi Alexander,

aahringo@redhat.com wrote on Thu, 30 Jun 2022 19:27:49 -0400:

quoted

Hi,

On Thu, Jun 30, 2022 at 4:14 AM Miquel Raynal [off-list ref] wrote:

quoted

Hi Alexander,

aahringo@redhat.com wrote on Wed, 29 Jun 2022 21:40:14 -0400:

quoted

Hi,

On Tue, Jun 28, 2022 at 3:58 AM Miquel Raynal [off-list ref] wrote:

quoted

Hi Alexander,

aahringo@redhat.com wrote on Mon, 27 Jun 2022 21:32:08 -0400:

quoted

Hi,

On Mon, Jun 27, 2022 at 4:43 AM Miquel Raynal [off-list ref] wrote:

quoted

Hi Alexander,

aahringo@redhat.com wrote on Sat, 25 Jun 2022 22:29:08 -0400:

quoted

Hi,

On Mon, Jun 20, 2022 at 10:26 AM Miquel Raynal
[off-list ref] wrote:

quoted

Let's introduce the basics for defining PANs:
- structures defining a PAN
- helpers for PAN registration
- helpers discarding old PANs

I think the whole pan management can/should be stored in user space by
a daemon running in background.

We need both, and currently:
- while the scan is happening, the kernel saves all the discovered PANs
- the kernel PAN list can be dumped (and also flushed) asynchronously by
  the userspace

IOW the userspace is responsible of keeping its own list of PANs in
sync with what the kernel discovers, so at any moment it can ask the
kernel what it has in memory, it can be done during a scan or after. It
can request a new scan to update the entries, or flush the kernel list.
The scan operation is always requested by the user anyway, it's not
something happening in the background.

I don't see what advantage it has to keep the discovered pan in the
kernel. You can do everything with a start/stop/pan discovered event.

I think the main reason is to be much more user friendly. Keeping track
of the known PANs in the kernel matters because when you start working
with 802.15.4 you won't blindly use a daemon (if there is any) and will
use test apps like iwpan which are stateless. Re-doing a scan on demand
just takes ages (from seconds to minutes, depending on the beacon
order).

I can see that things should work "out-of the box" and we are already
doing it by manual setting pan_id, etc. However, doing it in an
automatic way there exists a lot of "interpretation" about how you
want to handle it (doesn't matter if this is what the spec says or
not)... moving it to user space will offload it to the user.

quoted

Aside from this non technical reason, I also had in mind to retrieve
values gathered from the beacons (and stored in the PAN descriptors) to
know more about the devices when eg. listing associations, like
registering the short address of a coordinator. I don't yet know how
useful this is TBH.

quoted

It also has more advantages as you can look for a specific pan and
stop afterwards. At the end the daemon has everything that the kernel
also has, as you said it's in sync.

quoted

This can be a network manager as it
listens to netlink events as "detect PAN xy" and stores it and
offers it in their list to associate with it.

There are events produced, yes. But really, this is not something we
actually need. The user requests a scan over a given range, when the
scan is over it looks at the list and decides which PAN it
wants to associate with, and through which coordinator (95% of the
scenarii).

This isn't either a kernel job to decide which pan it will be
associated with.

Yes, "it looks at the list and decides" referred to "the user".

quoted

We need somewhere to draw a line and I guess the line is "Is this
information used e.g. as any lookup or something in the hot path", I
don't see this currently...

Each PAN descriptor is like 20 bytes, so that's why I don't feel back
keeping them, I think it's easier to be able to serve the list of PANs
upon request rather than only forwarding events and not being able to
retrieve the list a second time (at least during the development).

This has nothing to do with memory.

quoted

Overall I feel like this part is still a little bit blurry because it
has currently no user, perhaps I should send the next series which
actually makes the current series useful.

Will it get more used than caching entries in the kernel for user
space? Please also no in-kernel association feature.

I am aligned on this.

I am sorry I am not sure what that means.

I was referring to the "no in-kernel association feature".

There is however one situation which I _had_ to be handled in the
kernel: other devices asking for being associated or disassociated. In
the case of the disassociation, the receiving device is only notified
and cannot refuse the disassociation. For the association however,
the device receiving the association request has to make a decision.
There are three possible outcomes:
- accepting
- refusing because the PAN is at capacity
- refusing because the device is blacklisted

Why not move this decision to the user as well? The kernel will wait
for the reason? This isn't required to be fast and the decision may
depend on the current pan management...

I've opted out for the simplest option, which is allowing X devices
being associated, X being manageable by the user. For now I'll keep
this very simple approach, I propose we add this filtering feature
later?

What I suggest here is to move the filtering logic into the user
space. If the interface is a coordinator it will trigger an event for
the user and waits for an upper layer user space logic to get an
answer back what to do as answer.

However as I said I don't force you to program a user space software
which does that job but you code should be prepared to be get replaced
by such handling.

quoted

For now I've only implemented the first reason, because it's much
easier and only requires a maximum device number variable, set by the
user. For the second reason, it requires handling a
whitelist/blacklist, I don't plan to implement this for now, but that
should not impact the rest of the code. I'll let that to other
developers, or future-me, perhaps :-). Anyhow, you can kick-out devices
at any time anyway if needed with a disassociation notification
controlled by the user.

quoted

We can maybe agree to that point to put it under
IEEE802154_NL802154_EXPERIMENTAL config, as soon as we have some
_open_ user space program ready we will drop this feature again...
this program will show that there is no magic about it.

Yeah, do you want to move all the code scan/beacon/pan/association code
under EXPERIMENTAL sections? Or is it just the PAN management logic?

Yes, why not. But as I can see there exists two categories of
introducing your netlink api:

1. API candidates which are very likely to become stable
2. API candidates which we want to remove when we have a user
replacement for it (will probably never go stable)

The 2. should be defining _after_ the 1. In the "big" netlink API
enums of EXPERIMENTAL sections.

Yeah, got it.

quoted

Also you should provide for 2. some kind of ifdef/functions dummy/etc.
that it's easy to remove from the kernel when we have a user
replacement for it.
I hope that is fine for everybody.

I try to find solutions here, I don't see a reason for putting this
pan management into the kernel... whereas I appreciate the effort
which is done here and will not force you to write some user space
software that does this job. From my point of view I can't accept this
functionality in the kernel "yet".

I've already spent a couple of days reworking all that part, I've
dropped most of the in-kernel PAN management, which means:
- when a new coordinator gets discovered (beacon received), if the mac
  was scanning then it calls a generic function from the cfg layer to
  advertise this pan.
- the cfg layer will send a NL message to the user with all the
  important information
- BUT the cfg layer will also keep in memory the beacon information for
  the time of the scan (only), to avoid polluting the user with the same
  information over and over again, this seems a necessary step to me,
  because otherwise if you track on the same channel two coordinators
  not emitting at the same pace, you might end up with 100 user
  notifications, for just 2 devices. I think this is the kernel duty to
  filter out identical beacons.

Okay, I am sure if somebody complains about such kernel behaviour and
has a good argument to switch back... we still can do it.

Great!

I would say more here... there might be some API documentation where
you cannot expect anything from the kernel but it tries to avoid
stupid things (Whatever that means). As the API is experimental it can
be easily changed, otherwise some additional flag is required to
enable this feature or not. However I can say more about this when I
see code and we have some user experience about whatever the default
behaviour should be or such flag is really necessary.

We probably will find some solution...

- Alex

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help