Re: [PATCH net 1/2] tcp: Limit number of segments generated by GSO per skb
From: Ben Hutchings <hidden>
Date: 2012-07-30 19:41:15
On Mon, 2012-07-30 at 10:23 -0700, Ben Greear wrote:
> On 07/30/2012 10:16 AM, Ben Hutchings wrote:
>> A peer (or local user) may cause TCP to use a nominal MSS of as little as 88 (actual MSS of 76 with timestamps). Given that we have a sufficiently prodigious local sender and the peer ACKs quickly enough, it is nevertheless possible to grow the window for such a connection to the point that we will try to send just under 64K at once. This results in a single skb that expands to 861 segments.
>>
>> In some drivers with TSO support, such an skb will require hundreds of DMA descriptors; a substantial fraction of a TX ring or even more than a full ring. The TX queue selected for the skb may stall and trigger the TX watchdog repeatedly (since the problem skb will be retried after the TX reset). This particularly affects sfc, for which the issue is designated as CVE-2012-3412. However it may be that some hardware or firmware also fails to handle such an extreme TSO request correctly.
>>
>> Therefore, limit the number of segments per skb to 100. This should make no difference to behaviour unless the actual MSS is less than about 700.
>
> Please do not do this... or at least allow overrides. We love the trick of setting a very small MSS and making the NICs generate huge numbers of small TCP frames with efficient user-space logic. We use this for stateful TCP load testing when a high number of TCP packets per second is desired.
Please test whether this actually makes a difference - my suspicion is that 100 segments per skb is easily enough to prevent the host being a bottleneck.
> Intel NICs, including 10G, work just fine with minimal MSS in this scenario.
I'll leave this to the Intel maintainers to answer.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
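
As a rough check of the figures in the quoted patch description, the following Python sketch works through the arithmetic. It is only an illustration under stated assumptions: the function names are hypothetical, the 12-byte figure is the aligned size of the TCP timestamp option, and 65535 bytes is used as an assumed upper bound for "just under 64K". The real payload of such an skb is somewhat smaller, which is why this estimate lands near, rather than exactly on, the 861 segments quoted above.

```python
# Assumption: the TCP timestamp option occupies 12 bytes once padded
# to a 4-byte boundary (10 bytes of option plus 2 bytes of padding).
TIMESTAMP_OPTION_ALIGNED = 12

# Assumption: upper bound for "just under 64K" of payload in one skb.
MAX_PAYLOAD = 65535


def effective_mss(nominal_mss: int, timestamps: bool = True) -> int:
    """Payload bytes per segment after subtracting the timestamp option."""
    return nominal_mss - (TIMESTAMP_OPTION_ALIGNED if timestamps else 0)


def segments_per_skb(payload_bytes: int, mss: int) -> int:
    """Number of segments needed to carry payload_bytes (ceiling division)."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return -(-payload_bytes // mss)


def min_mss_within_cap(payload_bytes: int, max_segments: int) -> int:
    """Smallest MSS for which payload_bytes still fits in max_segments."""
    return -(-payload_bytes // max_segments)


if __name__ == "__main__":
    mss = effective_mss(88)
    print("effective MSS:", mss)
    print("segments at that MSS:", segments_per_skb(MAX_PAYLOAD, mss))
    print("cap of 100 binds below MSS:", min_mss_within_cap(MAX_PAYLOAD, 100))
```

Under these assumptions a 100-segment cap only changes behaviour once the actual MSS drops below the mid-600s, which is consistent with the "less than about 700" estimate in the patch description.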