Thread: 37 messages, 6 authors, 2015-10-27

[PATCH v3 00/20] KVM: ARM64: Add guest PMU support

From: Shannon Zhao <hidden>
Date: 2015-10-21 07:26:46
Also in: kvm, kvmarm


On 2015/10/17 1:01, Christopher Covington wrote:
On 10/16/2015 12:55 AM, Wei Huang wrote:

On 09/24/2015 05:31 PM, Shannon Zhao wrote:
This patchset adds guest PMU support for KVM on ARM64. It takes a
trap-and-emulate approach: when the guest wants to monitor an event, the
access is trapped by KVM, which calls the perf_event API to create a perf
event and then calls the relevant perf_event APIs to read the event's count.
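
As a rough illustration of the trap-and-emulate flow described above, here is a
toy Python model. It is not kernel code and does not reflect the structure of
the patches: the class and method names (HostPerfEvent, ToyVcpuPmu,
trap_write_event_type, trap_read_counter, tick) are all hypothetical, and the
"host event" is a plain counter standing in for whatever the perf_event API
would create.

```python
"""Toy model of trap-and-emulate PMU handling. Illustrative only."""


class HostPerfEvent:
    """Hypothetical stand-in for a host perf event backing a guest counter."""

    def __init__(self, event_type):
        self.event_type = event_type
        self.count = 0

    def tick(self, n):
        # Simulate the monitored event occurring n times while the guest runs.
        self.count += n


class ToyVcpuPmu:
    """Hypothetical per-vCPU state: guest counter index -> backing event."""

    def __init__(self):
        self.events = {}

    def trap_write_event_type(self, idx, event_type):
        # The guest programs a counter. The access traps, and the hypervisor
        # creates a host-side event to back that guest counter.
        self.events[idx] = HostPerfEvent(event_type)

    def trap_read_counter(self, idx):
        # The guest reads a counter. The access traps, and the hypervisor
        # returns the backing event's count (0 if never programmed).
        event = self.events.get(idx)
        return event.count if event else 0


pmu = ToyVcpuPmu()
pmu.trap_write_event_type(0, "cycles")
pmu.events[0].tick(1000)
print(pmu.trap_read_counter(0))  # 1000
```

The point of the sketch is only that the guest never touches a real counter
directly; every access goes through a handler that consults host-side state.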

Use perf in the guest to test this patchset. "perf list" shows the
hardware events and hardware cache events that perf supports. Then use
"perf stat -e EVENT" to monitor a given event. For example,
"perf stat -e cycles" counts CPU cycles and "perf stat -e cache-misses"
counts cache misses.

Below are the outputs of "perf stat -r 5 sleep 5" when run on the host
and in the guest.

Host:
 Performance counter stats for 'sleep 5' (5 runs):

          0.551428      task-clock (msec)         #    0.000 CPUs utilized            ( +-  0.91% )
                 1      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                48      page-faults               #    0.088 M/sec                    ( +-  1.05% )
           1150265      cycles                    #    2.086 GHz                      ( +-  0.92% )
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
            526398      instructions              #    0.46  insns per cycle          ( +-  0.89% )
   <not supported>      branches
              9485      branch-misses             #   17.201 M/sec                    ( +-  2.35% )

       5.000831616 seconds time elapsed                                          ( +-  0.00% )

Guest:
 Performance counter stats for 'sleep 5' (5 runs):

          0.730868      task-clock (msec)         #    0.000 CPUs utilized            ( +-  1.13% )
                 1      context-switches          #    0.001 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                48      page-faults               #    0.065 M/sec                    ( +-  0.42% )
           1642982      cycles                    #    2.248 GHz                      ( +-  1.04% )
   <not supported>      stalled-cycles-frontend
   <not supported>      stalled-cycles-backend
            637964      instructions              #    0.39  insns per cycle          ( +-  0.65% )
   <not supported>      branches
             10377      branch-misses             #   14.198 M/sec                    ( +-  1.09% )

       5.001289068 seconds time elapsed                                          ( +-  0.00% )
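
To make the host/guest comparison easier to read, the following Python sketch
parses a few counter lines copied from the two outputs above and derives the
instructions-per-cycle figure and the guest-to-host ratios. The helper names
(parse_counters, ipc) and the regular expression are assumptions made for this
example, not part of perf, and only the lines needed for the arithmetic are
included.

```python
import re

# Counter lines copied from the "perf stat -r 5 sleep 5" outputs above.
HOST = """\
           1150265      cycles                    #    2.086 GHz                      ( +-  0.92% )
   <not supported>      stalled-cycles-frontend
            526398      instructions              #    0.46  insns per cycle          ( +-  0.89% )
              9485      branch-misses             #   17.201 M/sec                    ( +-  2.35% )
"""

GUEST = """\
           1642982      cycles                    #    2.248 GHz                      ( +-  1.04% )
   <not supported>      stalled-cycles-frontend
            637964      instructions              #    0.39  insns per cycle          ( +-  0.65% )
             10377      branch-misses             #   14.198 M/sec                    ( +-  1.09% )
"""

# A numeric value followed by an event name; "<not supported>" lines do not match.
COUNTER_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+([\w-]+)")


def parse_counters(text):
    """Return {event name: value} for every line that carries a number."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(2)] = float(match.group(1))
    return counters


def ipc(counters):
    """Instructions per cycle, as in perf's 'insns per cycle' column."""
    return counters["instructions"] / counters["cycles"]


host = parse_counters(HOST)
guest = parse_counters(GUEST)

print(f"host IPC:  {ipc(host):.2f}")    # 0.46
print(f"guest IPC: {ipc(guest):.2f}")   # 0.39
for name in ("cycles", "instructions", "branch-misses"):
    print(f"{name}: guest/host = {guest[name] / host[name]:.2f}")
```

For this one short workload, the guest counts come out about 1.43x (cycles),
1.21x (instructions), and 1.09x (branch-misses) of the host counts.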

Thanks for V3. One suggestion is to run more perf stress tests, such as
"perf test", so that the corner cases are covered as much as possible.

I'd also recommend Vince Weaver's perf_event_tests. It tests things like
signal-on-counter-overflow that I've never seen anywhere else (other than some
of my own code).

https://github.com/deater/perf_event_tests

Ok. Thanks for your suggestion.

-- 
Shannon