Re: [Bugme-new] [Bug 16626] New: Machine hangs with EIP at skb_copy_and_csum_dev
From: Jarek Poplawski <hidden>
Date: 2010-08-23 12:47:44
Subsystem:
networking [general], the rest · Maintainers:
"David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds
On Mon, Aug 23, 2010 at 02:47:23PM +0300, Plamen Petrov wrote:
???? 21.8.2010 ??. 11:07, Jarek Poplawski ????????????:quoted
On Sat, Aug 21, 2010 at 09:50:58AM +0200, Eric Dumazet wrote:quoted
Le samedi 21 ao??t 2010 ?? 09:47 +0200, Jarek Poplawski a écrit :quoted
On Fri, Aug 20, 2010 at 09:38:35PM +0200, Jarek Poplawski wrote:quoted
Plamen Petrov wrote, On 20.08.2010 12:53:quoted
So, I guess its David and Herbert's turn?...If you're bored in the meantime I'd suggest to do check the realtek driver eg: - for locking with the patch below, - to turn off with ethtool its tx-checksumming and/or scatter-gather,
...
Yeah, 3 days and counting, right until I decided to try the freshly announced 2.6.36-rc2. So I upgraded the kernel, but left the scripts that turn GRO off for the tg3 card still run at system startup. This way the system ran for 2 and a half hours, when I decided its time to try turning GRO on. I first tried to turn GRO on for the tg3 nic, and the system oopsed immediately (if the panic screen is necessary - please, ask for it). After the system came back, I tried turning GRO on for the 2 RealTek 8139 nics, too, but ethtool only accepted turning GRO off. And unfortunately, I can't test if other nics will fail the same way as the motherboard integrated tg3 I have does, so for now, this is only a tg3 + GRO on problem; I don't have any other hardware to test with available.
A little misunderstanding: I was intersted with turning off some features on realteks to change the packet path from tg3 with gro to realtek without gro and without tx-checksumming etc. But maybe you could try the patch below instead (so the patched kernel, tg3 with gro on, and realteks without any change). Thanks, Jarek P.
--- (for debugging only)
diff --git a/net/core/dev.c b/net/core/dev.c
index 3721fbb..51823cd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c@@ -1935,6 +1935,23 @@ static inline int skb_needs_linearize(struct sk_buff *skb, illegal_highdma(dev, skb)))); } +static int skb_csum_start_bug(struct sk_buff *skb) +{ + + if (skb->ip_summed == CHECKSUM_PARTIAL) { + long csstart; + + csstart = skb->csum_start - skb_headroom(skb); + if (WARN_ON(csstart > skb_headlen(skb))) { + pr_warning("csum_start %d, headroom %d, headlen %d\n", + skb->csum_start, skb_headroom(skb), + skb_headlen(skb)); + return 1; + } + } + return 0; +} + int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev, struct netdev_queue *txq) {
@@ -1955,11 +1972,13 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev, skb_orphan_try(skb); if (netif_needs_gso(dev, skb)) { + skb_csum_start_bug(skb); if (unlikely(dev_gso_segment(skb))) goto out_kfree_skb; if (skb->next) goto gso; } else { + skb_csum_start_bug(skb); if (skb_needs_linearize(skb, dev) && __skb_linearize(skb)) goto out_kfree_skb;
@@ -1997,7 +2016,12 @@ gso: if (dev->priv_flags & IFF_XMIT_DST_RELEASE) skb_dst_drop(nskb); - rc = ops->ndo_start_xmit(nskb, dev); + if (skb_csum_start_bug(skb)) { + kfree_skb(skb); + rc = NETDEV_TX_OK; + } else + rc = ops->ndo_start_xmit(nskb, dev); + if (unlikely(rc != NETDEV_TX_OK)) { if (rc & ~NETDEV_TX_MASK) goto out_kfree_gso_skb;