Thread (30 messages) 30 messages, 5 authors, 2018-01-09

Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework

From: Masami Hiramatsu <mhiramat@kernel.org>
Date: 2017-12-28 07:51:46
Also in: lkml

On Wed, 27 Dec 2017 19:49:28 -0800
Alexei Starovoitov [off-list ref] wrote:
On 12/27/17 5:38 PM, Masami Hiramatsu wrote:
quoted
On Wed, 27 Dec 2017 14:49:46 -0800
Alexei Starovoitov [off-list ref] wrote:
quoted
On 12/27/17 12:09 AM, Masami Hiramatsu wrote:
quoted
On Tue, 26 Dec 2017 18:12:56 -0800
Alexei Starovoitov [off-list ref] wrote:
quoted
On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
quoted
Support in-kernel fault-injection framework via debugfs.
This allows you to inject a conditional error to specified
function using debugfs interfaces.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 Documentation/fault-injection/fault-injection.txt |    5 +
 kernel/Makefile                                   |    1
 kernel/fail_function.c                            |  169 +++++++++++++++++++++
 lib/Kconfig.debug                                 |   10 +
 4 files changed, 185 insertions(+)
 create mode 100644 kernel/fail_function.c
diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index 918972babcd8..6243a588dd71 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -30,6 +30,11 @@ o fail_mmc_request
   injects MMC data errors on devices permitted by setting
   debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

+o fail_function
+
+  injects error return on specific functions by setting debugfs entries
+  under /sys/kernel/debug/fail_function. No boot option supported.
I like it.
Could you document it a bit better?
Yes, I will do in next series.
quoted
In particular retval is configurable, but without an example no one
will be able to figure out how to use it.
Ah, right. BTW, as I pointed in the covermail, should we store the
expected error value range into the injectable list? e.g.

ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)

And provide APIs to check/get it.
I'm afraid such check would be too costly.
Right now we have only two functions marked but I expect hundreds more
will be added in the near future as soon as developers realize the
potential of such error injection.
All of ALLOW_ERROR_INJECTION marks add 8 byte overhead each to .data.
Multiple by 1k and we have 8k of data spent on marks.
If we add max/min range marks that doubles it for very little use.
I think marking function only is enough.
Sorry, I don't think so.
Even if it takes 16 bytes more for each points, I don't think it is
any overhead for machines in these days. Even if so, we can provide
a kconfig to reduce it.
I mean, we are living in GB-order memory are, and it will be bigger
in the future. Why we have to worry about hundreds of 16bytes memory
pieces? It will take a few KB, and even if we mark thousands of
functions, it never reaches 1MB, in GB memory pool. :)

Of course, for many small-footprint embedded devices (like having
less than 128MB memory), this feature can be a overhead. But they
can cut off the table by kconfig.
I still disagree on wasting 16-byte * num_of_funcs of .data here.
The trade-off of usability vs memory just not worth it. Sorry.
Please focus on testing your changes instead.
Then what happen if the user set invalid retval to those functions?
even if we limit the injectable functions, it can cause a problem,

for example, 

 obj = func_return_object();
 if (!obj) {
    handling_error...;
 }
 obj->field = x;

In this case, obviously func_return_object() must return NULL if there is
an error, not -ENOMEM. But without the correct retval information, how would
you check the BPF code doesn't cause a trouble?
Currently it seems you are expecting only the functions which return error code.

 ret = func_return_state();
 if (ret < 0) {
    handling_error...;
 }

But how we can distinguish those?

If we have the error range for each function, we can ensure what is
*correct* error code, NULL or errno, or any other error numbers. :)

At least fail_function needs this feature because it can check
return value when setting it up.

Thank you,

-- 
Masami Hiramatsu [off-list ref]
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help