Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

[PATCH RFC net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-10-23
[PATCH RFC net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2015-10-23
Re: [PATCH RFC net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Eric Dumazet <hidden> · 2015-10-23
Re: [PATCH RFC net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2015-10-25
[PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-10-23
Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Neal Cardwell <ncardwell@google.com> · 2015-10-26
Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Andreas Petlund <hidden> · 2015-10-26
Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2015-10-26
Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Jonas Markussen <hidden> · 2015-10-27
Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-10-29
RE: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · David Laight <hidden> · 2015-11-02
RE: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · David Laight <hidden> · 2015-11-02
Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-11-05
Re: [PATCH RFC net-next 0/2] tcp: Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2015-10-24
Re: [PATCH RFC net-next 0/2] tcp: Redundant Data Bundling (RDB) · Jonas Markussen <hidden> · 2015-10-24
Re: [PATCH RFC net-next 0/2] tcp: Redundant Data Bundling (RDB) · Eric Dumazet <hidden> · 2015-10-24
Re: [PATCH RFC net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-11-09
[PATCH RFC v2 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-11-23
[PATCH RFC v2 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2015-11-23
[PATCH RFC v2 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-11-23
Re: [PATCH RFC v2 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Eric Dumazet <hidden> · 2015-11-23
Re: [PATCH RFC v2 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2015-11-23
[PATCH v3 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-02
[PATCH v3 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2016-02-02
[PATCH v3 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-02
Re: [PATCH v3 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Eric Dumazet <hidden> · 2016-02-02
Re: [PATCH v3 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-03
Re: [PATCH v3 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Eric Dumazet <hidden> · 2016-02-03
Re: [PATCH v3 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-08
[PATCH v4 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-16
[PATCH v4 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2016-02-16
[PATCH v4 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-16
Re: [PATCH v4 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Eric Dumazet <hidden> · 2016-02-18
Re: [PATCH v4 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-19
[PATCH v5 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-24
[PATCH v5 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2016-02-24
[PATCH v5 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-02-24
Re: [PATCH v5 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · David Miller <davem@davemloft.net> · 2016-03-02
Re: [PATCH v5 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-02
[PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-03
[PATCH v6 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2016-03-03
[PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-03
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Eric Dumazet <hidden> · 2016-03-14
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Rick Jones <hidden> · 2016-03-15
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2016-03-15
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-18
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2016-03-14
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bill Fink <hidden> · 2016-03-15
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-17
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2016-03-21
Re: [PATCH v6 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-06-16
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · David Miller <davem@davemloft.net> · 2016-03-07
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2016-03-10
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Jonas Markussen <hidden> · 2016-03-10
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2016-03-10
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Jonas Markussen <hidden> · 2016-03-12
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-13
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Yuchung Cheng <hidden> · 2016-03-14
Re: [PATCH v6 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-03-18
[PATCH v7 net-next 0/2] tcp: Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-06-22
[PATCH v7 net-next 1/2] tcp: Add DPIFL thin stream detection mechanism · Bendik Rønning Opstad <hidden> · 2016-06-22
[PATCH v7 net-next 2/2] tcp: Add Redundant Data Bundling (RDB) · Bendik Rønning Opstad <hidden> · 2016-06-22

From: Bendik Rønning Opstad <hidden>
Date: 2016-06-16 17:15:18

On 21/03/16 19:54, Yuchung Cheng wrote:

On Thu, Mar 17, 2016 at 4:26 PM, Bendik Rønning Opstad
[off-list ref] wrote:

quoted

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 6a92b15..8f3f3bf 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt

@@ -716,6 +716,21 @@ tcp_thin_dpifl_itt_lower_bound - INTEGER
        calculated, which is used to classify whether a stream is thin.
        Default: 10000

+tcp_rdb - BOOLEAN
+       Enable RDB for all new TCP connections.

  Please describe RDB briefly, perhaps with a pointer to your paper.

Ah, yes, that description may have been a bit too brief...

What about pointing to tcp-thin.txt in the brief description, and
rewrite tcp-thin.txt with a more detailed description of RDB along
with a paper reference?

+1

quoted

   I suggest have three level of controls:
   0: disable RDB completely
   1: enable indiv. thin-stream conn. to use RDB via TCP_RDB socket
options
   2: enable RDB on all thin-stream conn. by default

   currently it only provides mode 1 and 2. but there may be cases where
   the administrator wants to disallow it (e.g., broken middle-boxes).

Good idea. Will change this.

I have implemented your suggestion in the next patch.

quoted

It also seems better to
allow individual socket to select the redundancy level (e.g.,
setsockopt TCP_RDB=3 means <=3 pkts per bundle) vs a global setting.
This requires more bits in tcp_sock but 2-3 more is suffice.

Most certainly. We decided not to implement this for the patch to keep
it as simple as possible, however, we surely prefer to have this
functionality included if possible.

Next patch version has a socket option to allow modifying the different
RDB settings.

quoted

+int tcp_transmit_rdb_skb(struct sock *sk, struct sk_buff *xmit_skb,
+                        unsigned int mss_now, gfp_t gfp_mask)
+{
+       struct sk_buff *rdb_skb = NULL;
+       struct sk_buff *first_to_bundle;
+       u32 bytes_in_rdb_skb = 0;
+
+       /* How we detect that RDB was used. When equal, no RDB data was sent */
+       TCP_SKB_CB(xmit_skb)->tx.rdb_start_seq = TCP_SKB_CB(xmit_skb)->seq;

quoted

+
+       if (!tcp_stream_is_thin_dpifl(tcp_sk(sk)))

During loss recovery tcp inflight fluctuates and would like to trigger
this check even for non-thin-stream connections.

Good point.

quoted

Since the loss
already occurs, RDB can only take advantage from limited-transmit,
which it likely does not have (b/c its a thin-stream). It might be
checking if the state is open.

You mean to test for open state to avoid calling rdb_can_bundle_test()
unnecessarily if we (presume to) know it cannot bundle anyway? That
makes sense, however, I would like to do some tests on whether "state
!= open" is a good indicator on when bundling is not possible.

When testing this I found that bundling can often be performed when
not in Open state. For the most part in CWR mode, but also the other
modes, so this does not seem like a good indicator.

The only problem with tcp_stream_is_thin_dpifl() triggering for
non-thin streams in loss recovery would be the performance penalty of
calling rdb_can_bundle_test(). It would not be able to bundle anyways
since the previous SKB would contain >= mss worth of data.

The most reliable test is to check available space in the previous
SKB, i.e. if (xmit_skb->prev->len == mss_now). Do you suggest, for
performance reasons, to do this before the call to
tcp_stream_is_thin_dpifl()?

quoted

since RDB will cause DSACKs, and we only blindly count DSACKs to
perform CWND undo. How does RDB handle that false positives?

That is a very good question. The simple answer is that the
implementation does not handle any such false positives, which I
expect can result in incorrectly undoing CWND reduction in some cases.
This gets a bit complicated, so I'll have to do some more testing on
this to verify with certainty when it happens.

When there is no loss, and each RDB packet arriving at the receiver
contains both already received and new data, the receiver will respond
with an ACK that acknowledges new data (moves snd_una), with the SACK
field populated with the already received sequence range (DSACK).

The DSACKs in these incoming ACKs are not counted (tp->undo_retrans--)
unless tp->undo_marker has been set by tcp_init_undo(), which is
called by either tcp_enter_loss() or tcp_enter_recovery(). However,
whenever a loss is detected by rdb_detect_loss(), tcp_enter_cwr() is
called, which disables CWND undo. Therefore, I believe the incorrect

thanks for the clarification. it might worth a short comment on why we
use tcp_enter_cwr() (to disable undo)

quoted

counting of DSACKs from ACKs on RDB packets will only be a problem
after the regular loss detection mechanisms (Fast Retransmit/RTO) have
been triggered (i.e. we are in either TCP_CA_Recovery or TCP_CA_Loss).

We have recorded the CWND values for both RDB and non-RDB streams in
our experiments, and have not found any obvious red flags when
analysing the results, so I presume (hope may be more precise) this is
not a major issue we have missed. Nevertheless, I will investigate
this in detail and get back to you.

I've looked into this and tried to figure out in which cases this is
actually a problem, but I have failed to find any.

One scenario I considered is when an RDB packet is sent right after a
retransmit, which would result in DSACK in the ACK in response to the
RDB packet.

With a bundling limit of 1 packet, two packets must be lost for RDB to
fail to repair the loss, causing dupACKs. So if three packets are sent,
where the first two are lost, the last packet will cause a dupACK,
resulting in a fast retransmit (and entering recovery which calls
tcp_init_undo()).

By writing new data to the socket right after the fast retransmit,
a new RDB packet is built with some old data that was just
retransmitted.

On the ACK on the fast retransmit the state is changed from Recovery
to Open. The next incoming ACK (on the RDB packet) will contain a DSACK
range, but it will not be considered dubious (tcp_ack_is_dubious())
since "!(flag & FLAG_NOT_DUP)" is false (new data was acked), state is
Open, and "flag & FLAG_CA_ALERT" evaluates to false.


Feel free to suggest scenarios (as detailed as possible) with the
potential to cause such false positives, and I'll test them with
packetdrill.

Bendik

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help