Re: [PATCH bpf-next 0/5] fix test_sockmap

From: John Fastabend <john.fastabend@gmail.com>
Date: 2018-05-25 14:01:58

On 05/25/2018 01:28 AM, Prashant Bhole wrote:


On 5/24/2018 1:58 PM, John Fastabend wrote:

quoted

On 05/23/2018 09:47 PM, Prashant Bhole wrote:

quoted


On 5/23/2018 6:44 PM, Prashant Bhole wrote:

quoted


On 5/22/2018 2:08 AM, John Fastabend wrote:

quoted

On 05/20/2018 10:13 PM, Prashant Bhole wrote:

quoted


On 5/19/2018 1:42 AM, John Fastabend wrote:

quoted

On 05/18/2018 12:17 AM, Prashant Bhole wrote:

quoted

This series fixes bugs in test_sockmap code. They weren't caught
previously because failure in RX/TX thread was not notified to the
main thread.

Also fixed data verification logic and slightly improved test
output
such that parameters values (cork, apply, start, end) of failed
test
can be easily seen.

Great, this was on my list so thanks for taking care of it.

quoted

Note: Even after fixing above problems there are issues with tests
which set cork parameter. Tests fail (RX thread timeout) when cork
value is non-zero and overall data sent by TX thread isn't
multiples
of cork value.


This is expected. When 'cork' is set the sender should only xmit
the data when 'cork' bytes are available. If the user doesn't
provide the N bytes the data is cork'ed waiting for the bytes and
if the socket is closed the state is cleaned up. What these tests
are testing is the cleanup path when a user doesn't provide the
N bytes. In practice this is used to validate headers and prevent
users from sending partial headers. We want to keep these tests
because
they verify a tear-down path in the code.

Ok.

quoted

After your changes do these get reported as failures? If so we
need to account for the above in the calculations.

Yes, cork related test are reported as failures because of RX thread
timeout.

So with your above description, I think we need to differentiate cork
tests with partial data and full data. In partial data test we can
have
something like "timeout_expected" flag. Any other way to fix it?

Adding a flag seems reasonable to me. Lets do this for now. Also I
plan to add more negative tests so we can either use the same
flag or a new one for those cases as well.

John,
I worked on this for some time and noticed that the RX-timeout of
tests with cork parameter is dependent on various parameters. So we
can not set a flag like the way 'drop_expected' flag is set before
executing the test.

So I decided to write a function which judges all parameters before
each test and decides whether a test with cork parameter will
timeout or not. Then the conditions in the function became
complicated. For example some tests fail if opt->rate < 17 (with
some other conditions). Here is 17 is related to FRAGS_PER_SKB.
Consider following two examples.

I'm sorry. Correction: s/FRAGS_PER_SKB/MAX_SKB_FRAGS/

quoted

./test_sockmap --cgroup /mnt/cgroup2 -r 16 -i 1 -l 30 -t sendpage
--txmsg --txmsg_cork 1024   # RX timeout occurs

./test_sockmap --cgroup /mnt/cgroup2 -r 17 -i 1 -l 30 -t sendpage
--txmsg --txmsg_cork 1024   # Success!

Ah yes this hits the buffer limit and flushes the queue. The kernel
side doesn't know how to merge those specific sendpage requests so
it gives each request its own buffer and when the limit is reached
we flush it.

quoted

Do we need to keep such tests? if yes, then I will continue with
adding such conditions in the function.

Yes, these tests are needed because they are testing the edge cases.
These are probably the most important tests because my normal usage
will catch any issues in the "good" cases its these types of things
that can go unnoticed (at least for a short while) if we don't have
specific tests for them.

I tried but it is difficult to come up with a right set of conditions
which lead to test failure.

Agreed, it can be yes. How about adding your logic for all tests except
"cork" cases. If there is a flag to set if the timeout is expected we
can always manually set it in the test invocation. Might not be as
nice as automatically learning the expected results but possibly easier
than building some complicated logic to figure it out.

Would you mind submitting your series again without the "cork" tests
being tracked? And if you want add a bit to tell if the "cork" tests are
going to timeout or not setting it per test manually. But I think
your series can just omit the cork test for now and still be useful.

-Prashant

quoted

Thanks for doing this.
John

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help