Re: [PATCH net-next] fq_codel: report congestion notification at enqueue time
From: Dave Taht <hidden>
Date: 2012-06-28 17:51:33
On Thu, Jun 28, 2012 at 10:07 AM, Eric Dumazet [off-list ref] wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> At enqueue time, check sojourn time of packet at head of the queue,
> and return NET_XMIT_CN instead of NET_XMIT_SUCCESS if this sejourn time
> is above codel @target.
>
> This permits local TCP stack to call tcp_enter_cwr() and reduce its cwnd
> without drops (for example if ECN is not enabled for the flow)
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Dave Taht <redacted>
> Cc: Tom Herbert <redacted>
> Cc: Matt Mathis <redacted>
> Cc: Yuchung Cheng <redacted>
> Cc: Nandita Dukkipati <redacted>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---
>  include/linux/pkt_sched.h |    1 +
>  include/net/codel.h       |    2 +-
>  net/sched/sch_fq_codel.c  |   19 +++++++++++++++----
>  3 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 32aef0a..4d409a5 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -714,6 +714,7 @@ struct tc_fq_codel_qd_stats {
>  				 */
>  	__u32	new_flows_len;	/* count of flows in new list */
>  	__u32	old_flows_len;	/* count of flows in old list */
> +	__u32	congestion_count;
>  };
>  
>  struct tc_fq_codel_cl_stats {
> diff --git a/include/net/codel.h b/include/net/codel.h
> index 550debf..8c7d6a7 100644
> --- a/include/net/codel.h
> +++ b/include/net/codel.h
> @@ -148,7 +148,7 @@ struct codel_vars {
>   * struct codel_stats - contains codel shared variables and stats
>   * @maxpacket:	largest packet we've seen so far
>   * @drop_count:	temp count of dropped packets in dequeue()
> - * ecn_mark:	number of packets we ECN marked instead of dropping
> + * @ecn_mark:	number of packets we ECN marked instead of dropping
>   */
>  struct codel_stats {
>  	u32		maxpacket;
> diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
> index 9fc1c62..c0485a0 100644
> --- a/net/sched/sch_fq_codel.c
> +++ b/net/sched/sch_fq_codel.c
> @@ -62,6 +62,7 @@ struct fq_codel_sched_data {
>  	struct codel_stats cstats;
>  	u32		drop_overlimit;
>  	u32		new_flow_count;
> +	u32		congestion_count;
>  
>  	struct list_head new_flows;	/* list of new flows */
>  	struct list_head old_flows;	/* list of old flows */
> @@ -196,16 +197,25 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  		flow->deficit = q->quantum;
>  		flow->dropped = 0;
>  	}
> -	if (++sch->q.qlen < sch->limit)
> +	if (++sch->q.qlen < sch->limit) {
> +		codel_time_t hdelay = codel_get_enqueue_time(skb) -
> +				      codel_get_enqueue_time(flow->head);
> +
> +		/* If this flow is congested, tell the caller ! */
> +		if (codel_time_after(hdelay, q->cparams.target)) {
> +			q->congestion_count++;
> +			return NET_XMIT_CN;
> +		}
>  		return NET_XMIT_SUCCESS;
> -
> +	}
>  	q->drop_overlimit++;
>  	/* Return Congestion Notification only if we dropped a packet
>  	 * from this flow.
>  	 */
> -	if (fq_codel_drop(sch) == idx)
> +	if (fq_codel_drop(sch) == idx) {
> +		q->congestion_count++;
>  		return NET_XMIT_CN;
> -
> +	}
>  	/* As we dropped a packet, better let upper stack know this */
>  	qdisc_tree_decrease_qlen(sch, 1);
>  	return NET_XMIT_SUCCESS;
> @@ -467,6 +477,7 @@ static int fq_codel_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
>  	st.qdisc_stats.maxpacket = q->cstats.maxpacket;
>  	st.qdisc_stats.drop_overlimit = q->drop_overlimit;
> +	st.qdisc_stats.congestion_count = q->congestion_count;
>  	st.qdisc_stats.ecn_mark = q->cstats.ecn_mark;
>  	st.qdisc_stats.new_flow_count = q->new_flow_count;

Clever idea. One problem is that there are other forms of network traffic on a link, and this punishes a single TCP stream that may not be the source of the problem in the first place, basically putting it in double jeopardy.

I am curious how often an enqueue actually drops in the codel/fq_codel case; the hope was that there would be plenty of headroom under far more circumstances on this qdisc.

On the dequeue side of codel (and in the network stack generally), I was thinking that supplying a netlink-level message on a packet drop/congestion indication, which userspace could register for and see, would be very useful, particularly for a routing daemon, but also for statistics collection and perhaps other levels of overall network control (DCTCP-like).

The existing NET_DROP functionality is hard to use. Your idea is "in-band"; the netlink message idea would be "out of band" and more general.

-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-6 is out with fq_codel!"
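
For readers following the thread, the enqueue-time decision in the quoted patch can be modelled outside the kernel. The following is only a rough userspace sketch, not the kernel code: the `model_*` names are hypothetical, the two return codes are redefined locally with the values they have in the kernel headers, and times are plain unsigned 32-bit ticks in whatever unit the caller picks (an assumption; the kernel uses its own codel_time_t units). It shows the comparison of the head-of-queue sojourn time against the target, including wraparound of the 32-bit clock.

```c
#include <stdint.h>

/* Local redefinitions for a userspace build (assumption: same values
 * as the kernel's NET_XMIT_SUCCESS and NET_XMIT_CN). */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_CN      0x02

/* Hypothetical stand-in for codel_time_t: a wrapping 32-bit tick count. */
typedef uint32_t model_time_t;

/* Wrap-safe "a is later than b", in the spirit of codel_time_after(). */
static int model_time_after(model_time_t a, model_time_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * Verdict for a packet enqueued at 'now' behind a head packet that was
 * enqueued at 'head_enqueued'. If the head has already waited longer
 * than 'target', report congestion to the caller instead of success.
 */
static int model_enqueue_verdict(model_time_t now,
				 model_time_t head_enqueued,
				 model_time_t target)
{
	model_time_t hdelay = now - head_enqueued; /* unsigned, wraps safely */

	if (model_time_after(hdelay, target))
		return NET_XMIT_CN;
	return NET_XMIT_SUCCESS;
}
```

Note that the verdict depends only on the flow whose head packet is inspected, so in this model a sender is told about congestion in its own flow's queue; whether that flow is the real cause of the standing queue is the concern raised in the reply above.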