Thread: 13 messages, 4 authors, 2017-05-22

[PATCH v3 4/5] drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension

From: Kim Phillips <hidden>
Date: 2017-05-22 23:24:41
Also in: lkml

On Mon, 22 May 2017 17:22:12 +0100
Mark Rutland [off-list ref] wrote:
On Mon, May 22, 2017 at 10:45:21AM -0500, Kim Phillips wrote:
On Mon, 22 May 2017 13:44:46 +0100
Mark Rutland [off-list ref] wrote:
On Mon, May 22, 2017 at 07:32:49AM -0500, Kim Phillips wrote:
On Thu, 18 May 2017 18:24:32 +0100
Will Deacon [off-list ref] wrote:
+/* Perf callbacks */
+static int arm_spe_pmu_event_init(struct perf_event *event)
+{
+	u64 reg;
+	struct perf_event_attr *attr = &event->attr;
+	struct arm_spe_pmu *spe_pmu = to_spe_pmu(event->pmu);
+
+	/* This is, of course, deeply driver-specific */
+	if (attr->type != event->pmu->type)
+		return -ENOENT;
+
[trimming other return sites]
Thanks, but other conditions, such as the user-specified sample period
check, would have been more appropriate to leave in for this discussion.
Sure, I was just trimming to a single example for brevity.  I appreciate
there are cases where it may not be as simple to determine the cause
from userspace today.
That helps, thanks.
I've consistently brought up lack of proper user error messaging in all
previous submissions of this driver:
... and we've consistently explained why logging such things to dmesg by
default will not fly. As before, while we call these return codes error
values, they are *not* errors in the same sense as pr_err().
I've expressed my disagreement on that matter here:
 
https://lkml.org/lkml/2017/4/7/223
 
yet it got no response.
That's not strictly true.

I replied to the mail you cited, attempting to clarify as best I could.
You replied again, and it's true I didn't respond there, but there was
no new substantive argument. To summarize that thread, to the best of
my understanding:

* We disagree on the semantics of "an error" in this context. Clearly we
  aren't going to agree.
That's bad.  We ought to agree on what an error is, in this and any
other context.  I'm willing to listen if you have a convincing
argument, but none was given after my last reply:

"The driver is trying to report an error:  in the above example, it's
reporting that it cannot support an operation by returning
-*E*OPNOTSUPP: an ERROR because it was unable to complete the request:
the request failed.  Unlike e.g., a warning where something may not
have been quite right, but went along with executing the operation
anyway."

To put it another way, perf_event_open returning errno EINVAL is no
different from open() returning the same with the meaning 'Invalid
value in flags.'  In fact, the perf_event_open manpage says errors in
setting the sample frequency make the syscall return -1 with errno set
to EINVAL.
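
A minimal userspace sketch of that point, assuming a Linux system with
<linux/perf_event.h> available: perf_event_open has no libc wrapper, so
it is called through syscall(2), and a rejected attribute comes back
only as -1 plus an errno value.  The names perf_errno_hint(),
sys_perf_event_open() and open_cycles_sampler(), and the hint strings,
are illustrative assumptions and not part of perf or the SPE driver.

```c
#include <errno.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the most a caller can infer from errno alone. */
const char *perf_errno_hint(int err)
{
	switch (err) {
	case EINVAL:
		return "invalid attribute (which one? the errno does not say)";
	case EOPNOTSUPP:
		return "operation not supported by this PMU";
	case ENOENT:
		return "event type not handled by this PMU";
	default:
		return strerror(err);
	}
}

/* perf_event_open(2) has no libc wrapper; go through syscall(2). */
long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
			 int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Returns an fd, or -1 after printing what errno allows us to say. */
int open_cycles_sampler(unsigned long long sample_period)
{
	struct perf_event_attr attr;
	long fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = sample_period;
	attr.exclude_kernel = 1;

	/* pid 0 = calling thread, cpu -1 = any CPU, no group, no flags */
	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
	if (fd < 0)
		fprintf(stderr, "perf_event_open: %s\n",
			perf_errno_hint(errno));
	return (int)fd;
}
```

The EINVAL case is the one at issue: several unrelated checks in a
driver's event_init path can map to the same value, so the caller
cannot tell them apart without some additional channel.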

Prior to that I see what might possibly be the underlying cause for the
discrepancy:  you said:
The above cases are not (system) errors, and using dev_err (even
ratelimited) is certainly not appropriate. These are pr_debug() at best.
So is it that you are resisting calling it an error, technically,
because that would imply we use pr_err() instead of pr_debug()?  If so,
is that because of fuzzing?

quoting you again:

"There are some cases where they're actively harmful (e.g. when fuzzing)."

to which my response remains:

"I'd expect fuzzer users to be more amenable to manually modifying the
driver rather than regular users of the driver."

to which your response at the time was seemingly irrelevant, and
against the interests of normal users of the driver:

"When fuzzing, I take a mainline, defconfig kernel, and run it through
its paces. I don't touch each and every driver."

If this is the case, can we find another solution to make both regular
fuzzer runners and regular users happy?
* We agree that error reporting and handling is painful in this area.

* We disagree w.r.t. using printk() and friends. My position has not
  been swayed. 

[...]
I beg you to please reconsider, given we agree that this particular
syscall is bad, and the alternative (no messaging) will truly be worse
for our users.
AFAICT, my comments hold, yet the driver still gets resubmitted without
them being addressed.  How do we get out of this loop?
We've repeatedly explained why the approach you suggest is not feasible.
Perhaps you could try to explain why our approach doesn't seem feasible
to you.
I don't want SPE users to have to manually instrument the driver
in order to find out what it didn't like about the parameters they
specified.  This problem has already been reported by other early
adopters.  perf itself says "dmesg may provide additional information",
so let's please use it.
Sorry, but regardless of any argument there is to be had on how best to
handle errors, I'm not going to be swayed to the position that the
solution is printk() or its ilk, for the reasons that I have outlined
several times previously.

As one of the maintainers of PMU code, I must NAK such code in any PMU
driver.
We disagree here:  I am of the belief that users should be made aware
of what they're doing wrong, and right now, dmesg is the vehicle to do
so.
FWIW, I'm more than happy to:

* Add pr_debug() statements so that developers directly using the perf
  interfaces can debug their userspace code without first having to
  develop a full knowledge of what is and isn't permitted.
Perhaps this is a terminology context problem again, but to be
abundantly clear: This isn't for developers per se; this is for normal,
regular perf users trying to use perf to debug the performance of their
applications.  I don't expect these users to have to know how to turn
on pr_debug messaging, esp. because it might turn on other noisy
drivers in use at the same time.
* Add documentation such that userspace developers can figure out what
  is and is not supported.

* Add interfaces as appropriate such that userspace can more reliably
  determine the reason(s) an error code has been returned. For example,
  we might expose sample period information under sysfs.

* Help with any userspace error handling code. I am more than happy to
  review such code and to provide improvements myself.
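
The sysfs suggestion in the list above could let userspace check a
parameter before calling perf_event_open at all.  A minimal sketch,
assuming the driver exported its minimum sampling interval as a decimal
integer in a sysfs file; the path and all helper names here are
assumptions for illustration, and the sketch does not assume how the
driver itself treats an out-of-range period.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed path, for illustration only. */
#define SPE_MIN_INTERVAL_PATH \
	"/sys/bus/event_source/devices/arm_spe_0/caps/min_interval"

/* Parse a decimal value as sysfs would present it, e.g. "256\n". */
int parse_min_interval(const char *text, unsigned long long *out)
{
	char *end;
	unsigned long long v;

	/* strtoull() accepts a leading sign; require a digit instead. */
	if (*text < '0' || *text > '9')
		return -1;

	errno = 0;
	v = strtoull(text, &end, 10);
	if (errno || (*end != '\n' && *end != '\0'))
		return -1;

	*out = v;
	return 0;
}

/* Read the (assumed) sysfs attribute. Returns 0 on success, -1 on error. */
int read_min_interval(const char *path, unsigned long long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_min_interval(buf, out);
}

/* Returns NULL if the period looks acceptable, else a reason string. */
const char *check_sample_period(unsigned long long period,
				unsigned long long min_interval)
{
	if (period == 0)
		return "sample period must be non-zero";
	if (period < min_interval)
		return "sample period is below the PMU's advertised minimum";
	return NULL;
}
```

A tool would call read_min_interval(SPE_MIN_INTERVAL_PATH, &min) and
then check_sample_period(period, min), printing the returned reason
itself rather than relying on dmesg.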

... so if you want to make any progress on this front, please either
look at one of those, or make a *new* suggestion that does not involve
printk.
Not that I was looking, but I did just happen to notice this posting
today:

https://lkml.org/lkml/2017/5/22/578

but I have no clue if or when it will be accepted, let alone whether
it's applicable to perf, so *for the time being*, dmesg is what we
have.

Kim