Re: arping stuck with ENOBUFS in 4.19.150
From: Joakim Tjernlund <hidden>
Date: 2020-11-02 08:27:26
On Sat, 2020-10-31 at 09:48 +0800, Yunsheng Lin wrote:
> On 2020/10/30 19:50, Joakim Tjernlund wrote:
> > On Fri, 2020-10-30 at 09:36 +0800, Yunsheng Lin wrote:
> > > On 2020/10/29 23:18, David Ahern wrote:
> > > > On 10/29/20 8:10 AM, Joakim Tjernlund wrote:
> > > > > OK, bisecting (was a bit of a bother since we merge upstream releases into our tree, is there a way to just bisect that?)
> > > > >
> > > > > Result was commit "net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc" (749cc0b0c7f3dcdfe5842f998c0274e54987384f)
> > > > >
> > > > > Reverting that commit on top of our tree made it work again. How to fix?
> > > >
> > > > Adding the author of that patch (linyunsheng@huawei.com) to take a look.
> > > >
> > > > >  Jocke
> > > > >
> > > > > On Mon, 2020-10-26 at 12:31 -0600, David Ahern wrote:
> > > > > > On 10/26/20 6:58 AM, Joakim Tjernlund wrote:
> > > > > > > Ping (maybe it should read "arping" instead :)
> > > > > > >
> > > > > > >  Jocke
> > > > > > >
> > > > > > > On Thu, 2020-10-22 at 17:19 +0200, Joakim Tjernlund wrote:
> > > > > > > > strace arping -q -c 1 -b -U -I eth1 0.0.0.0
> > > > > > > > ...
> > > > > > > > sendto(3, "\0\1\10\0\6\4\0\1\0\6\234\v\6 \v\v\v\v\377\377\377\377\377\377\0\0\0\0", 28, 0, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(6)={1, ffffffffffff}, 20) = -1 ENOBUFS (No buffer space available)
> > > > > > > > ....
> > > > > > > > and then arping loops.
> > > > > > > >
> > > > > > > > in 4.19.127 it was:
> > > > > > > > sendto(3, "\0\1\10\0\6\4\0\1\0\6\234\5\271\362\n\322\212E\377\377\377\377\377\377\0\0\0\0", 28, 0, {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST, addr(6)={1, ffffffffffff}, 20) = 28
> > > > > > > >
> > > > > > > > Seems like something has changed the IP behaviour between now and then ?
> > > > > > > > eth1 is UP but not RUNNING and has an IP address.
> > >
> > > "eth1 is UP but not RUNNING" usually mean user has configure the netdev as up, but the hardware has not detected a linkup yet.
> > >
> > > Also What is the output of "ethtool eth1"?
> >
> > echo 1 > /sys/class/net/eth1/carrier
> > cu3-jocke ~ # arping -q -c 1 -b -U -I eth1 0.0.0.0
> > cu3-jocke ~ # echo 0 > /sys/class/net/eth1/carrier
> > cu3-jocke ~ # arping -q -c 1 -b -U -I eth1 0.0.0.0
> > ^Ccu3-jocke ~ # ethtool eth1
> > Settings for eth1:
> >         Supported ports: [ MII ]
> >         Supported link modes: 1000baseT/Full
> >         Supported pause frame use: Symmetric Receive-only
> >         Supports auto-negotiation: Yes
> >         Advertised link modes: 1000baseT/Full
> >         Advertised pause frame use: Symmetric Receive-only
> >         Advertised auto-negotiation: Yes
> >         Speed: 10Mb/s
> >         Duplex: Half
> >         Port: MII
> >         PHYAD: 1
> >         Transceiver: external
> >         Auto-negotiation: on
> >         Current message level: 0x00000037 (55)
> >                                drv probe link ifdown ifup
> >         Link detected: no
> >
> > We have a writeable carrier since eth device is PHY less. Maybe that path is different ?
> > Check drivers/net/ethernet/freescale/dpaa/dpa_eth.c
>
> The above difference does not seems to matter.
>
> > > It would be good to see the status of netdev before and after executing arping cmd too.
> >
> > hmm, how do you mean?
>
> I was trying to find out when the netdev' state became "eth1 is UP but not RUNNING".
>
> Anyway, when I looked at the backported patch, I did find new qdisc assignment is missing from the upstream patch.
>
> Please see if the below patch fix your problem, thanks:
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index bd96fd2..4e15913 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -1116,10 +1116,13 @@ static void dev_deactivate_queue(struct net_device *dev,
>  				 void *_qdisc_default)
>  {
>  	struct Qdisc *qdisc = rtnl_dereference(dev_queue->qdisc);
> +	struct Qdisc *qdisc_default = _qdisc_default;
>  
>  	if (qdisc) {
>  		if (!(qdisc->flags & TCQ_F_BUILTIN))
>  			set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
> +
> +		rcu_assign_pointer(dev_queue->qdisc, qdisc_default);
>  	}
>  }

This patch seems to have resolved the problem, thanks.
Please CC me on the formal patch for 4.19.x

 Jocke
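
For anyone reading this later, here is a rough user-space sketch of why the missing assignment could turn into ENOBUFS. It is a toy model under stated assumptions, not kernel code: every toy_* name is hypothetical, the return values only mirror NET_XMIT_SUCCESS/DROP/CN, and the idea that a deactivated qdisc left on the queue reports a drop (while the default noop qdisc reports NET_XMIT_CN, which the packet socket send path does not treat as an error) is a reading of the kernel sources that nobody confirmed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror the kernel's NET_XMIT_SUCCESS / NET_XMIT_DROP / NET_XMIT_CN. */
enum { TOY_XMIT_SUCCESS = 0, TOY_XMIT_DROP = 1, TOY_XMIT_CN = 2 };

/* Hypothetical stand-in for struct Qdisc. */
struct toy_qdisc {
	int (*enqueue)(struct toy_qdisc *q);
	bool builtin;      /* models TCQ_F_BUILTIN */
	bool deactivated;  /* models __QDISC_STATE_DEACTIVATED */
};

/* Hypothetical stand-in for struct netdev_queue. */
struct toy_txq {
	struct toy_qdisc *qdisc;
};

/* Assumed noop behaviour: discard the packet, report "congestion notification". */
static int toy_noop_enqueue(struct toy_qdisc *q) { (void)q; return TOY_XMIT_CN; }

/* A real qdisc that would normally accept the packet. */
static int toy_fifo_enqueue(struct toy_qdisc *q) { (void)q; return TOY_XMIT_SUCCESS; }

/* Models dev_deactivate_queue(); 'reassign' selects with/without the added line. */
static void toy_deactivate_queue(struct toy_txq *txq,
				 struct toy_qdisc *qdisc_default, bool reassign)
{
	struct toy_qdisc *q = txq->qdisc;

	assert(qdisc_default != NULL);
	if (q) {
		if (!q->builtin)
			q->deactivated = true;
		if (reassign)
			txq->qdisc = qdisc_default; /* the line the backport lacked */
	}
}

/* Assumed transmit path: a deactivated qdisc drops instead of enqueueing. */
static int toy_xmit(struct toy_txq *txq)
{
	struct toy_qdisc *q = txq->qdisc;

	if (q->deactivated)
		return TOY_XMIT_DROP;
	return q->enqueue(q);
}

/* Assumed sendto() mapping: positive codes other than CN become -ENOBUFS. */
static int toy_sendto(struct toy_txq *txq, int len)
{
	int err = toy_xmit(txq);

	if (err > 0 && err != TOY_XMIT_CN)
		return -ENOBUFS;
	return len;
}

/* Take the carrier down on a fresh queue, then send one 28-byte ARP frame. */
int toy_run(bool reassign)
{
	struct toy_qdisc noop = { toy_noop_enqueue, true, false };
	struct toy_qdisc fifo = { toy_fifo_enqueue, false, false };
	struct toy_txq txq = { &fifo };

	toy_deactivate_queue(&txq, &noop, reassign);
	return toy_sendto(&txq, 28);
}
```

Calling toy_run(false) models the 4.19.150 backport without the assignment and toy_run(true) models it with the patch above applied. If the assumptions hold, the first matches the failing sendto() in the strace output and the second matches the 4.19.127 behaviour.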