Thread (32 messages) 32 messages, 5 authors, 2017-11-15

Re: [PATCH V4 net 0/2] ipv6: fix flowlabel issue for reset packet

From: Martin KaFai Lau <hidden>
Date: 2017-11-15 18:27:44

On Tue, Nov 14, 2017 at 01:59:03PM -0800, Shaohua Li wrote:
On Tue, Nov 14, 2017 at 11:13:10AM -0800, Tom Herbert wrote:
quoted
On Tue, Nov 14, 2017 at 10:24 AM, Shaohua Li [off-list ref] wrote:
quoted
On Wed, Nov 08, 2017 at 09:44:51AM -0800, Tom Herbert wrote:
quoted
On Fri, Aug 18, 2017 at 3:27 PM, David Miller [off-list ref] wrote:
quoted
From: Martin KaFai Lau <redacted>
Date: Fri, 18 Aug 2017 13:51:36 -0700
quoted
It seems like that middle box specifically drops TCP_RST if it
does not know anything about this flow.  Since the flowlabel of the TCP_RST
(sent in TW state) is always different, it always lands to a different middle
box.  All of these TCP_RST cannot be delivered.
This really is illegal behavior.  The flow label is not a flow _KEY_
by any definition whatsoever.

Flow labels are an optimization, not a determinant for flow matching
particularly for proper TCP state processing.

I'd rather you invest all of this energy getting that vendor to fix
their kit.
We're now seeing several router vendors recommending people to not use
flow labels for ECMP hashing. This is precisely because when a flow
label changes, network devices that maintain state (firewalls, NAT,
load balancers) can't deal with packets being rerouted so connections
are dropped. Unfortunately, the need for packets of a flow to always
follow the same path has become an implicit requirement that I think
we need follow at least as the default behavior.

Martin: is there any change you could resurrect these patches? In
order to solve the general problem of making routing consistent, I
believe we want to keep sk_tx_hash consistent for the connection from
which a consistent flow label can be derived. To avoid the overhead of
a hash field in sk_common, maybe we could initially set a connection
hash to a five-tuple hash for a flow instead of a random value? So in
TW state the consistent hash can be computed on the fly.
Hi Tom,
Do we really need to use the five-tupe hash? There are several places using
current random hash, which looks more lightweight. To fix issue, we only need
to make sure reset packet include the correct flowlabel. Like what my previous
patch did, we can set tw->tw_flowlabel in tcp_time_wait based on txhash and use
it reset packet. In this way we can use the random hash and not add extra field
in sock.
Shaohua,

But that patch discards the full txhash in TW. So it's not just a
problem with the flow label. sk_tx_hash can also be used for route
selection in ECMP, port selection we're doing tunneling, etc. The
general solution should maintains tx_hash or be able to reconstruct it
in any state, flow label fix is a point solution.
Hi Tom,

do you want to keep sk_rethink_txhash() then? If we changed the hash to random
number, we can't reconstruct it for sure.
A followup question on rethink.  Does it mean we need
a new sysctl (persistent_txhash) to avoid sk_rethink_txhash() together
such that it keeps the routing decision consistent (e.g. flowlabel) ?
Thanks,
Shaohua
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help