Re: kernel 2.6.23.8: KERNEL: assertion in net/ipv4/tcp_input.c
From: Wolfgang Walter <hidden>
Date: 2007-12-03 15:56:55
On Monday, 3 December 2007 14:34, Ilpo Järvinen wrote:
On Mon, 3 Dec 2007, Wolfgang Walter wrote:
with kernel 2.6.23.8 we saw a
KERNEL: assertion ((int)tcp_packets_in_flight(tp) >= 0) failed at net/ipv4/tcp_input.c (1292)

Is this the only message? Are there any Leak printouts?
No. 4 days earlier there were 3 messages:
TCP: Treason uncloaked! Peer a.b.c.d:80/56532 shrinks window 3535507131:3535513869. Repaired.
Any tweaking done to TCP related sysctls? And for completeness, is GSO enabled (ethtool -k)?
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
Most likely I broke the manual synchronization for left_out in sacktag by skipping over it when packets_out == 0 but so far I haven't been able to figure out how such state could develop in the first place... Ie., I couldn't find a case where tcp_fastretrans_alert wouldn't be called if left_out was non-zero (and it did the sync_left_out after modifying either sacked_out or lost_out, IIRC). ...If you can reproduce it, you could try if this patch below changes
I don't know how to reproduce it - we never saw the message before. I'll apply the patch. Let's see if the WARN_ON triggers before we update to a newer kernel :-).
anything (should silence the assert and trigger earlier a WARN_ON or two :-)).

...If this triggers, then I'm sure we can pollute TCP code by a larger number of more costly checks to catch it early. This might reveal a long-standing inconsistency of left_out in some case I just couldn't come up with by code review.

Left_out will be (is) anyway dropped as unnecessary in 2.6.24. In 2.6.23 sync for left_out occurs quite soon after that BUG_TRAP anyway so the effect won't be too dramatic: prior_in_flight would be stale once, which won't lead to big problems (either a missed cwnd or cwnd_cnt increment, or failure to do the application limited check at that particular ACK).

Thanks anyway for the report. ...If I figure something out here, I'll let you know.

--
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c9298a7..0c5194d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1012,8 +1012,12 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	if (before(TCP_SKB_CB(ack_skb)->ack_seq, prior_snd_una - tp->max_window))
 		return 0;
 
-	if (!tp->packets_out)
+	if (!tp->packets_out) {
+		WARN_ON(tp->sacked_out);
+		WARN_ON(tp->lost_out);
+		WARN_ON(tp->left_out);
 		goto out;
+	}
 
 	/* SACK fastpath:
 	 * if the only SACK change is the increase of the end_seq of
@@ -1277,14 +1281,14 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 		}
 	}
 
+out:
+	tp->left_out = tp->sacked_out + tp->lost_out;
 
 	if ((reord < tp->fackets_out) && icsk->icsk_ca_state != TCP_CA_Loss &&
 	    (!tp->frto_highmark || after(tp->snd_una, tp->frto_highmark)))
 		tcp_update_reordering(sk, ((tp->fackets_out + 1) - reord), 0);
 
-out:
-
 #if FASTRETRANS_DEBUG > 0
 	BUG_TRAP((int)tp->sacked_out >= 0);
 	BUG_TRAP((int)tp->lost_out >= 0);
Thanks and regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
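
For readers following the left_out discussion, here is a minimal userspace sketch of why a stale left_out can trip the tcp_packets_in_flight() assertion. It is not kernel code: struct toy_tp and every toy_* function are hypothetical names made up for this illustration, the field names are only borrowed from the thread, and the in-flight formula (packets_out - left_out + retrans_out) is how I understand 2.6.23 to compute it. The scenario itself (counters cleared while left_out is not resynchronized) is an assumed one, chosen to mirror the suspicion above rather than a confirmed reproduction.

```c
#include <stdio.h>

/* Toy stand-in for the handful of counters discussed in the thread.
 * Hypothetical type, NOT the kernel's struct tcp_sock. */
struct toy_tp {
	unsigned int packets_out;  /* segments sent, not yet cumulatively ACKed */
	unsigned int sacked_out;   /* segments SACKed by the receiver */
	unsigned int lost_out;     /* segments marked lost */
	unsigned int retrans_out;  /* retransmitted segments outstanding */
	unsigned int left_out;     /* cached copy of sacked_out + lost_out */
};

/* The manual synchronization step: refresh the cached sum. */
static void toy_sync_left_out(struct toy_tp *tp)
{
	tp->left_out = tp->sacked_out + tp->lost_out;
}

/* In-flight estimate, using the cached left_out (assumed 2.6.23 formula).
 * The unsigned arithmetic wraps; the cast mirrors the (int) in the assert. */
static int toy_packets_in_flight(const struct toy_tp *tp)
{
	return (int)(tp->packets_out - tp->left_out + tp->retrans_out);
}

/* Assumed scenario: one segment SACKed and one lost out of three, then
 * everything gets ACKed. If resync is 0, left_out keeps its old value. */
static int toy_scenario(int resync)
{
	struct toy_tp tp = { 3, 1, 1, 0, 0 };

	toy_sync_left_out(&tp);          /* left_out == 2, in flight == 1 */

	tp.packets_out = 0;              /* all outstanding data ACKed */
	tp.sacked_out = 0;
	tp.lost_out = 0;

	if (resync)
		toy_sync_left_out(&tp);  /* what the "out:" move in the patch adds */

	return toy_packets_in_flight(&tp);
}

/* Print both outcomes; call this from a small main() to try it out. */
static void toy_report(void)
{
	printf("without resync: %d\n", toy_scenario(0));
	printf("with resync:    %d\n", toy_scenario(1));
}
```

Without the resync the estimate goes negative, which is the condition the assertion checks for; with the resync on the early-exit path it comes out as zero. That would be consistent with the patch silencing the assert while the added WARN_ONs point at whichever counter was left non-zero.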