Thread (28 messages), 4 authors, 2018-09-14

Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)

From: maowenan <hidden>
Date: 2018-08-16 15:04:15
Also in: stable


On 2018/8/16 19:39, Michal Kubecek wrote:
On Thu, Aug 16, 2018 at 03:55:16PM +0800, maowenan wrote:
On 2018/8/16 15:44, Michal Kubecek wrote:
On Thu, Aug 16, 2018 at 03:39:14PM +0800, maowenan wrote:

On 2018/8/16 15:23, Michal Kubecek wrote:
On Thu, Aug 16, 2018 at 03:19:12PM +0800, maowenan wrote:
On 2018/8/16 14:52, Michal Kubecek wrote:
My point is that backporting all this into stable 4.4 is quite intrusive,
so if we can achieve similar results with a simple fix of an
obvious omission, it would be preferable.
There are five patches in mainline to fix this CVE. Backporting only two
of them has no effect on stable 4.4, mainly because 4.4 uses a simple
queue while mainline uses an RB tree.

I tried my best to fix this the easy way, by dropping 12.5% (or some
other fraction) of packets on top of the simple queue, but the results
were not good. So the RB tree is needed, and with it the test results
are what I wanted.

If we backport only two patches and they don't fix the issue, I don't
think that makes any sense.
There is an obvious omission in one of the two patches and Takashi's
patch fixes it. If his follow-up fix (applied on top of what is in
stable 4.4 now) addresses the problem, I would certainly prefer using it
over backporting the whole series.
Do you mean the code below from Takashi can fix this CVE?
I already tested exactly this two days ago, and it did not help much.
IIRC what you proposed was different, you proposed to replace the "=" in
the other branch by "+=".
No, I think you misunderstood me. I tested stable 4.4 at commit dc6ae4d
with the code changed the way Takashi did, and without any of the code
I sent in this patch series.
I suspect you may be doing something wrong with your tests. I checked
the segmentsmack testcase and the CPU utilization on receiving side
(with sending 10 times as many packets as default) went down from ~100%
to ~3% even when comparing what is in stable 4.4 now against older 4.4
kernel.
There is no obvious problem when you send packets with the default parameters of the SegmentSmack PoC,
and the result also depends heavily on your server's hardware configuration. Please try the parameters below,
which generate OFO packets; you will then see high CPU usage, with perf top showing tcp_data_queue at
about 55.6% of CPU.
If the dst port is 22, you will see sshd at about 95%.

int main(int argc, char **argv)
{
  // Adjust dst_mac, src_mac and dst_ip to match source and target!
  // Adjust dst_port to match the target, needs to be an open port!
  char dst_mac[6] = {0xb8,0x27,0xeb,0x54,0x23,0x4a};
  char src_mac[6] = {0x08,0x00,0x27,0xbc,0x91,0x93};
  uint32_t dst_ip = (192<<24)|(168<<16)|(1<<8)|225;
  uint32_t src_ip = 0;
  uint16_t dst_port = 22;   // attack an existing ssh connection
  uint16_t src_port = 0;

  ......

    for (j = 0; j < 102400*10; j++)    //10240->102400
    {
      for (i = 0; i < 1024; i++)      // 128->1024
      {
        tcp_set_ack_on(only_tcp[i]);
        tcp_set_src_port(only_tcp[i], src_port);
        tcp_set_dst_port(only_tcp[i], dst_port);
        tcp_set_seq_number(only_tcp[i], isn+2+2*(rand()%16384));
        //tcp_set_seq_number(only_tcp[i], isn+2);
        tcp_set_ack_number(only_tcp[i], other_isn+1);
        tcp_set_data_offset(only_tcp[i], 20);
        tcp_set_window(only_tcp[i], 65535);
        tcp_set_cksum_calc(ip, 20, only_tcp[i], sizeof(only_tcp[i]));
      }
      ret = ldp_out_inject_chunk(outq, pkt_tbl_chunk, 1024);  //128->1024
      printf("sent %d packets\n", ret);
      ldp_out_txsync(outq);
      usleep(10*1000); // Adjust this and packet count to match the target!, sleep 100ms->10ms
    }
  ......
This is actually not surprising. The testcase only sends 1-byte segments
starting at even offsets, so the receiver never gets two adjacent
segments and the "range_truesize != head->truesize" condition would
never be satisfied, whether you fix the backport or not.

Where the missing "range_truesize += skb->truesize" makes a difference
is in the case when there are some adjacent out of order segments, e.g.
if you omitted 1:10 and then sent 10:11, 11:12, 12:13, 13:14, ...
Then the missing update of range_truesize would prevent collapsing the
queue until the total length of the range would exceed the value of
SKB_WITH_OVERHEAD(SK_MEM_QUANTUM) (i.e. a bit less than 4 KB).
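
The two cases above can be illustrated with a small userspace model of the
range check. This is only a sketch under stated assumptions, not the kernel
code: struct seg, count_collapse_attempts() and the two fill helpers are
hypothetical names, COLLAPSE_LEN_THRESHOLD is an assumed stand-in for
SKB_WITH_OVERHEAD(SK_MEM_QUANTUM) ("a bit less than 4 KB"), and the model
ignores the actual collapsing and any other early exits. The with_fix flag
toggles the "range_truesize += skb->truesize" update.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for SKB_WITH_OVERHEAD(SK_MEM_QUANTUM); the real value
 * depends on the kernel configuration. */
#define COLLAPSE_LEN_THRESHOLD 3776u

/* Hypothetical, simplified view of one out-of-order segment. */
struct seg {
    uint32_t seq;      /* first sequence number */
    uint32_t end_seq;  /* one past the last sequence number */
    uint32_t truesize; /* memory charged for the buffer */
};

/* Wraparound-safe sequence comparisons. */
static int seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static int seq_after(uint32_t a, uint32_t b) { return seq_before(b, a); }

/* Walk a queue sorted by seq and count the ranges for which a collapse
 * would be attempted. with_fix != 0 accumulates range_truesize. */
static unsigned count_collapse_attempts(const struct seg *q, size_t n,
                                        int with_fix)
{
    unsigned attempts = 0;
    size_t i, head = 0;
    uint32_t start, end, range_truesize;

    if (n == 0)
        return 0;
    start = q[0].seq;
    end = q[0].end_seq;
    range_truesize = q[0].truesize;

    for (i = 1; ; i++) {
        int at_end = (i >= n);

        /* A range ends at a gap or at the end of the queue. */
        if (at_end || seq_after(q[i].seq, end) ||
            seq_before(q[i].end_seq, start)) {
            if (range_truesize != q[head].truesize ||
                end - start >= COLLAPSE_LEN_THRESHOLD)
                attempts++;
            if (at_end)
                break;
            /* Start a new range at segment i. */
            head = i;
            start = q[i].seq;
            end = q[i].end_seq;
            range_truesize = q[i].truesize;
        } else {
            /* Adjacent or overlapping segment: extend the range. */
            if (seq_before(q[i].seq, start))
                start = q[i].seq;
            if (seq_after(q[i].end_seq, end))
                end = q[i].end_seq;
            if (with_fix)
                range_truesize += q[i].truesize;
        }
    }
    return attempts;
}

/* PoC-like pattern: 1-byte segments at even offsets, never adjacent. */
static void fill_even_offset_segments(struct seg *q, size_t n,
                                      uint32_t base, uint32_t truesize)
{
    for (size_t i = 0; i < n; i++) {
        q[i].seq = base + 2 * (uint32_t)i;
        q[i].end_seq = q[i].seq + 1;
        q[i].truesize = truesize;
    }
}

/* Adjacent pattern: 10:11, 11:12, 12:13, ... */
static void fill_adjacent_segments(struct seg *q, size_t n,
                                   uint32_t base, uint32_t truesize)
{
    for (size_t i = 0; i < n; i++) {
        q[i].seq = base + (uint32_t)i;
        q[i].end_seq = q[i].seq + 1;
        q[i].truesize = truesize;
    }
}
```

With 64 adjacent 1-byte segments, this model attempts no collapse without the
update and one with it. With the even-offset pattern it attempts none either
way, because every range consists of a single tiny segment.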

Michal Kubecek


.