Re: [PATCH] Improve behaviour of Netlink Sockets
From: jamal <hidden>
Date: 2004-09-28 12:32:17
On Tue, 2004-09-28 at 07:11, Herbert Xu wrote:
On Tue, Sep 28, 2004 at 06:36:27AM -0400, jamal wrote:
er, what about the host scope route msgs generated by same script? ;->
AFAIK, no async netlink event function uses NLMSG_GOODSIZE at all for obvious reasons.
Actually i just examined the events generated by the script; they are IFLA and ROUTE events and not IFA. So take a look at rtmsg_ifinfo().
The state is per socket. You may need an intermediate queue etc. which feeds each user socket registered for the event. The socket queue acts essentially as a retransmitQ for broadcast state. Just waving my hands throwing ideas here of course.
Aha, you've fallen into my trap :)
Oh, goody ;->
Now let me demonstrate why having an intermediate queue doesn't help at all. Holding the packet on the intermediate queue is exactly the same as holding it on the receive queue of the destination socket. The reason is that we're simply cloning the packets. So moving it from one queue to another does not reduce system resource usage by much.
Ah, but there's clearly a benefit in saving packets from crossing to user space, in particular in the case of overload. You do save on system resources for sure in that case. In normal operation, with no overload, you end up using a little more system resource - but that's a price tag that comes with the benefits (needs to be weighed out).
There is the cost in cloning the skbs. However, that's an orthogonal issue altogether. We can reduce the cost there by making the packets bigger. This can either be done at the sender end by coalescing successive messages. Or we can merge them in netlink_broadcast. Granted having an intermediate queue will avoid overruns if it is large enough. However, having all the receive queues to be as big as your intermediate queue will have exactly the same effect.
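One way to picture the "making the packets bigger" suggestion above: netlink messages carry their own length and are padded to 4 bytes, so several of them can share one buffer, and a broadcast then clones one buffer instead of many. What follows is only a user-space Python sketch of sender-side coalescing, not kernel code; `pack_msg`, `coalesce`, and the 4096-byte budget are names and values assumed for illustration.

```python
import struct

NLMSG_HDR = struct.Struct("=IHHII")  # nlmsghdr: len, type, flags, seq, pid
NLMSG_ALIGNTO = 4


def nlmsg_align(n):
    """Round n up to the netlink 4-byte alignment."""
    return (n + NLMSG_ALIGNTO - 1) & ~(NLMSG_ALIGNTO - 1)


def pack_msg(msg_type, seq, payload=b""):
    """Build one netlink message: header plus payload, padded to 4 bytes."""
    length = NLMSG_HDR.size + len(payload)
    raw = NLMSG_HDR.pack(length, msg_type, 0, seq, 0) + payload
    return raw.ljust(nlmsg_align(length), b"\0")


def coalesce(messages, budget=4096):
    """Pack successive messages into as few buffers as fit the budget."""
    buffers, current = [], b""
    for msg in messages:
        # Start a new buffer when this message would exceed the budget.
        if current and len(current) + len(msg) > budget:
            buffers.append(current)
            current = b""
        current += msg
    if current:
        buffers.append(current)
    return buffers


# 100 small events (type 16 is RTM_NEWLINK; the payload is a dummy).
events = [pack_msg(16, seq, b"x" * 30) for seq in range(100)]
batches = coalesce(events)
print(len(events), "messages ->", len(batches), "buffers")
```

Each listener then receives a reference to a handful of large buffers rather than one per event, which is where the saving in clone cost would come from.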
Agreed it will postpone the problem, and not cure it. Where i saw the benefit is that if this queue is full/overloaded then you don't bother transferring skbs to the sock receiveQ - instead you overrun the event listeners (on purpose) before giving them any data. This assumes you only start feeding the listeners when the event generation is complete (in a multi-message batch, when DONE is processed). In the case of overrunning the listeners you should also flush the intermediate queue. Again, I am handwaving - there may be a lot of practical issues which become obvious once you actually get your hands dirty. I actually tried to implement this a while back for socket packet tap listeners; can't find my patches. What i was trying to get was feedback that i could feed all the way down to NAPI - it proved to be futile because i would need to have hardware drop selectively and such NICs don't exist.
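The intermediate-queue idea described above can be modelled in a few lines. This is a toy Python simulation under stated assumptions, not a patch: events are staged until the batch is complete, and if the staging queue overflows the partial batch is flushed and every listener is flagged as overrun without receiving any data. `Listener`, `StagedBroadcast`, and the event-count `limit` are hypothetical names.

```python
from collections import deque


class Listener:
    """A user socket registered for the event, reduced to a queue and a flag."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.overrun = False


class StagedBroadcast:
    """Stage a batch of events; hand it to listeners only when complete."""

    def __init__(self, listeners, limit):
        self.listeners = listeners
        self.limit = limit          # capacity of the intermediate queue
        self.staged = []
        self.overflowed = False

    def emit(self, event):
        if self.overflowed:
            return                  # batch already lost; drop the rest
        if len(self.staged) >= self.limit:
            self.staged.clear()     # flush: nobody sees a partial batch
            self.overflowed = True
            return
        self.staged.append(event)

    def done(self):
        """End of batch: deliver everything, or flag an overrun everywhere."""
        for listener in self.listeners:
            if self.overflowed:
                listener.overrun = True
            else:
                listener.queue.extend(self.staged)
        self.staged = []
        self.overflowed = False


listeners = [Listener("a"), Listener("b")]
bcast = StagedBroadcast(listeners, limit=4)

for event in range(3):              # small batch: fits, delivered at DONE
    bcast.emit(event)
bcast.done()

for event in range(10):             # large batch: overflows the staging queue
    bcast.emit(event)
bcast.done()                        # listeners are flagged, no data crosses
```

In this model no event from the overflowing batch reaches a listener, which is the saving being argued for; the cost is that every listener is overrun together.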
In fact this has an advantage over the intermediate queue. With the latter, you need to hold the packet in place whether the applications need it or not. While currently, the application can choose whether it wants to receive a large batch of events and if so how large.
Right, but you only find out after reading a subset of messages which cross into user space. Which is wasted cycles really. Now if you could say from user space "please continue where you left off", the messages before the overrun won't be a waste. I do think that's not wise for events (you should be able to get a summary of the issue some other way, as with overruns at the moment) but it is definitely needed for large get results.
Remember just because one application overruns, it doesn't mean that the other recipients of the same skb will overrun. They can continue to receive messages as long as their receive queue allows it.
Agreed. Note that's a design choice, and the truth of which is the better scheme only comes out by testing both schemes. Easier to leave what's already in place - but what fun is that now? ;-> Also to note, this only applies to broadcasts.
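The per-socket behaviour under discussion (one recipient overrunning while the others keep receiving) can be shown with a toy Python model. It is a sketch under assumptions: `RecvSocket` and `broadcast` are hypothetical names, and the receive buffer is counted in events rather than bytes.

```python
from collections import deque


class RecvSocket:
    """A listener with its own bounded receive queue."""

    def __init__(self, name, rcvbuf):
        self.name = name
        self.rcvbuf = rcvbuf        # max queued events (stand-in for bytes)
        self.queue = deque()
        self.overrun = False

    def read(self, n):
        """The application drains up to n events."""
        return [self.queue.popleft() for _ in range(min(n, len(self.queue)))]


def broadcast(sockets, event):
    """Queue a reference to the event on every socket that has room."""
    for sk in sockets:
        if len(sk.queue) >= sk.rcvbuf:
            sk.overrun = True       # only this socket loses the event
        else:
            sk.queue.append(event)


fast = RecvSocket("fast", rcvbuf=8)
slow = RecvSocket("slow", rcvbuf=8)
for event in range(20):
    broadcast([fast, slow], event)
    fast.read(1)                    # fast keeps up; slow never reads

print("fast overrun:", fast.overrun, "slow overrun:", slow.overrun)
```

Only the socket that stops reading is marked as overrun; the other one sees every event.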
So applications that really want to see every event should have a very large receive queue. Those that can recover easily should use a much smaller queue.
Depending on how you look at it (since i am drinking the right variant of coffee right now, let's look at it from a philosophical view: is it the area of the light radiated or the circumference of darkness surrounding the light? ;-> Choose your metric ;-> ): A large queue may actually be a problem if it also gets overflown, since it takes relatively longer to find out. You still have to read the damn state to find out details. [..]
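From user space, the receive queue size is normally chosen with `SO_RCVBUF`, and on Linux an overrun on a netlink socket is reported to the reader as `ENOBUFS`. A minimal Python sketch of both halves follows, assuming a Linux host for the netlink part; `set_rcvbuf`, `is_overrun`, `open_route_listener`, `listen`, and the `handle`/`resync` callbacks are hypothetical names, and the group constants are copied from `<linux/rtnetlink.h>`.

```python
import errno
import socket

RTMGRP_LINK = 0x1            # link events, from <linux/rtnetlink.h>
RTMGRP_IPV4_ROUTE = 0x40     # IPv4 route events, from <linux/rtnetlink.h>


def set_rcvbuf(sock, size):
    """Request a receive buffer of `size` bytes; return the size the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def is_overrun(exc):
    """True if a recv() failure says messages were dropped on this socket."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOBUFS


def open_route_listener(rcvbuf):
    """Linux only: subscribe to link and IPv4 route events."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    set_rcvbuf(sock, rcvbuf)
    sock.bind((0, RTMGRP_LINK | RTMGRP_IPV4_ROUTE))
    return sock


def listen(sock, handle, resync):
    """Read events forever; on overrun, re-read state rather than trust the stream."""
    while True:
        try:
            handle(sock.recv(65536))
        except OSError as exc:
            if not is_overrun(exc):
                raise
            resync()                # e.g. dump the tables again
```

An application that must see every event would pass a large `rcvbuf` to `open_route_listener`; one that can cheaply re-read state could pass a small value and rely on `resync`. The trade-off raised in the reply still holds: with a bigger buffer, the overrun is noticed later.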
Jamal, maybe I've got the wrong impression but it almost seems that you think that if one application overruns, then everyone else on that multicast address will overrun as well. This is definitely not the case.
I think it's fair to assume that if the intermediate queue is overflown all listeners will be.
With an intermediate queue, you will in fact impose overruns on everyone when it overflows which seems to be a step backwards.
Refer to above assumption.
The moral of this is: you could do it if you wanted - ain't trivial.
Well this is not what I'd call congestion control :) Let's take a TCP analogy. This is like batching up TCP packets on a router in the middle rather than shutting down the sender. Congestion control is where you shut the sender up.
It's actually worse than that --> which is a shame since we have more control over what can be sent to user. Congestion could be driven by the receiver as well. Look at TCP zero windows for example. Or even ECN.
cheers,
jamal