Thread (15 messages) 15 messages, 3 authors, 2017-06-29

Re: [PATCH 0/8] Support for 24x7 hcall interface version 2

From: Thiago Jung Bauermann <hidden>
Date: 2017-06-28 22:02:51

Michael Ellerman [off-list ref] writes:
Thiago Jung Bauermann [off-list ref] writes:
quoted
The hypervisor interface to access 24x7 performance counters (which collect
performance information from system power on to system power off) has been
extended in POWER9 adding new fields to the request and result element
structures.

Also, results for some domains now return more than one result element and
those need to be added to get a total count.

The first two patches fix bugs in the existing code. The following 4
patches are code improvements and the last two finally implement support
for the changes in POWER9 described above.

POWER8 systems only support version 1 of the interface, while POWER9
systems only support version 2. I tested these patches on POWER8 to verify
that there are no regressions, and also on POWER9 DD1.
Where is version 2 documented?
Only in an internal specification.
And what happens when we boot on a POWER9 in POWER8 compatibility mode?
It will still use version 2 of the API, and still require aggregation of
result elements. Does this mean that hv_24x7_init should check
cur_cpu_spec->oprofile_cpu_type for "ppc64/power9" instead of
cpu_has_feature(CPU_FTR_ARCH_300)?

There were a couple of comments from Suka which still needed resolving:

Sukadev Bhattiprolu [off-list ref] writes:
Thiago Jung Bauermann [bauerman@linux.vnet.ibm.com] wrote:
quoted
+/**
+ * get_count_from_result - get event count from the given result
+ *
+ * @event:	Event associated with @res.
+ * @resb:	Result buffer containing @res.
+ * @res:	Result to work on.
+ * @countp:	Output variable containing the event count.
+ * @next:	Optional output variable pointing to the next result in @resb.
+ */
+static int get_count_from_result(struct perf_event *event,
+				 struct hv_24x7_data_result_buffer *resb,
+				 struct hv_24x7_result *res, u64 *countp,
+				 struct hv_24x7_result **next)
+{
+	u16 num_elements = be16_to_cpu(res->num_elements_returned);
+	u16 data_size = be16_to_cpu(res->result_element_data_size);
+	unsigned int data_offset;
+	void *element_data;
+	int ret = 0;
+
+	/*
+	 * We can bail out early if the result is empty.
+	 */
+	if (!num_elements) {
+		pr_debug("Result of request %hhu is empty, nothing to do\n",
+			 res->result_ix);
+
+		if (next)
+			*next = (struct hv_24x7_result *) res->elements;
+
+		return -ENODATA;
+	}
+
+	/*
+	 * This code assumes that a result has only one element.
+	 */
+	if (num_elements != 1) {
+		pr_debug("Error: result of request %hhu has %hu elements\n",
+			 res->result_ix, num_elements);
Could this happen due to an user request or would this indicate a bug
in the way we submitted the request (perf should submit separate request
for each lpar/index - we set ->max_ix and ->max_num_lpars to cpu_be16(1).
By specifying 1 as the maximum for the smallest resource we're
requesting, we guarantee one element per result. Since that's what we
do this would indeed be a bug. For v2 (which I'll send shortly) I
changed the message above to a pr_err, and changed the returned error to
-EIO instead of -ENOTSUPP.
Minor inconsistency with proceeding, is that if the next element passes,
this return code is lost/over written. IOW, h_24x7_event_commit_txn()'s
return value depends on the last element we process, even if intermediate
ones encounter an error.
For v2, I changed h_24x7_event_commit_txn to break out of the loop if
there's an error in any processed result. Also, since in that case @next
isn't used anymore, get_count_from_result will exit early instead of
going through the motions.
quoted
+
+		if (!next)
+			return -ENOTSUPP;
+
+		/*
+		 * We still need to go through the motions so that we can return
+		 * a pointer to the next result.
+		 */
+		ret = -ENOTSUPP;
+	}
+
+	if (data_size != sizeof(u64)) {
+		pr_debug("Error: result of request %hhu has data of %hu bytes\n",
+			 res->result_ix, data_size);
+
+		if (!next)
+			return -ENOTSUPP;
+
+		ret = -ENOTSUPP;
+	}
+
+	if (resb->interface_version == 1)
+		data_offset = offsetof(struct hv_24x7_result_element_v1,
+				       element_data);
+	else
+		data_offset = offsetof(struct hv_24x7_result_element_v2,
+				       element_data);
+
+	element_data = res->elements + data_offset;
+
+	if (!ret)
+		*countp = be64_to_cpu(*((u64 *) element_data));
+
+	/* The next result is after the result element. */
+	if (next)
+		*next = element_data + data_size;
+
+	return ret;
+}
+
 static int single_24x7_request(struct perf_event *event, u64 *count)
 {
 	int ret;
-	u16 num_elements;
-	struct hv_24x7_result *result;
 	struct hv_24x7_request_buffer *request_buffer;
 	struct hv_24x7_data_result_buffer *result_buffer;
@@ -1158,14 +1259,9 @@ static int single_24x7_request(struct perf_event *event, u64 *count)
 	if (ret)
 		goto out;

-	result = result_buffer->results;
-
-	/* This code assumes that a result has only one element. */
-	num_elements = be16_to_cpu(result->num_elements_returned);
-	WARN_ON_ONCE(num_elements != 1);
-
 	/* process result from hcall */
-	*count = be64_to_cpu(result->elements[0].element_data[0]);
+	ret = get_count_from_result(event, result_buffer,
+				    result_buffer->results, count, NULL);

 out:
 	put_cpu_var(hv_24x7_reqb);
@@ -1425,16 +1521,13 @@ static int h_24x7_event_commit_txn(struct pmu *pmu)
 	for (i = 0, res = result_buffer->results;
 	     i < result_buffer->num_results; i++, res = next_res) {
 		struct perf_event *event = h24x7hw->events[res->result_ix];
-		u16 num_elements = be16_to_cpu(res->num_elements_returned);
-		u16 data_size = be16_to_cpu(res->result_element_data_size);

-		/* This code assumes that a result has only one element. */
-		WARN_ON_ONCE(num_elements != 1);
+		ret = get_count_from_result(event, result_buffer, res, &count,
+					    &next_res);
+		if (ret)
+			continue;

-		count = be64_to_cpu(res->elements[0].element_data[0]);
 		update_event_count(event, count);
-
-		next_res = (void *) res->elements[0].element_data + data_size;
 	}

 	put_cpu_var(hv_24x7_hw);
-- 
Thiago Jung Bauermann
IBM Linux Technology Center
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help