Thread (33 messages), 4 authors, 2014-02-20

Re: [PATCH 0/4] x86: Add Cache QoS Monitoring (CQM) support

From: Waskiewicz Jr, Peter P <hidden>
Date: 2014-02-18 19:54:39
Also in: lkml

On Tue, 2014-02-18 at 20:35 +0100, Peter Zijlstra wrote:
On Tue, Feb 18, 2014 at 05:29:42PM +0000, Waskiewicz Jr, Peter P wrote:
quoted
quoted
It's not a problem that changing the task:RMID map is expensive; the
problem is that there's no deterministic way of doing it.
We are going to add to the SDM that changing RMIDs frequently is not
the intended use case for this feature, and can produce bogus data.
The real intent is to land threads into an RMID, and run that until the
threads are effectively done.

That being said, reassigning a thread to a new RMID is certainly
supported, but "frequent" updates are not encouraged at all.
You don't even need really high frequency, just unsynchronized wrt
reading the counter. Suppose A flips the RMIDs about and just when it's
done programming B reads them.

At that point you've got 0 guarantee the data makes any kind of sense.
Agreed, the hardware as designed gives no such guarantee.  We
don't have an instruction that can nuke RMID-tagged cachelines from the
cache, and the CPU guys (along with hpa) have been very explicit that
wbinvd is not an option.
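
To make the scenario above concrete, here is a toy model (not kernel
code; ToyCache and every other name in it are hypothetical). It assumes
only what is stated above: lines stay tagged with the RMID that was
active when they were filled, and nothing flushes them on a remap.

```python
# Toy model only: a "cache" is a list of lines, each tagged with the RMID
# that was active when the line was filled. All names are hypothetical.

class ToyCache:
    def __init__(self):
        self.line_tags = []  # one RMID tag per resident cache line

    def fill(self, rmid, n_lines):
        """Record n_lines cache fills performed while `rmid` was active."""
        self.line_tags.extend([rmid] * n_lines)

    def occupancy(self, rmid):
        """Count resident lines tagged with `rmid` (what a counter read sees)."""
        return self.line_tags.count(rmid)


cache = ToyCache()
task_rmid = {"task_a": 1}

cache.fill(task_rmid["task_a"], 100)  # task_a runs under RMID 1

# A: move task_a to RMID 2 and hand RMID 1 to task_b. Nothing flushes the
# lines that are already tagged with RMID 1.
task_rmid["task_a"] = 2
task_rmid["task_b"] = 1

# B: read the counters right after the remap, before either task has run.
task_b_reading = cache.occupancy(task_rmid["task_b"])
task_a_reading = cache.occupancy(task_rmid["task_a"])
```

In this model B attributes all of task_a's old lines to task_b, and
task_a appears to occupy nothing, which is the kind of reading that
makes no sense.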
quoted
I do see that, however the userspace interface for this isn't ideal for
how the feature is intended to be used.  I'm still planning to have this
be managed per process in /proc/<pid>, I just had other priorities push
this back a bit on my stovetop.
So I really don't like anything /proc/$pid/ nor do I really see a point in
doing that. What are you going to do in the /proc/$pid/ thing anyway?
Exposing raw RMIDs is an absolute no-no, and anything else is going to
end up being yet-another-grouping thing and thus not much different from
cgroups.
Exactly.  The cgroup grouping mechanisms fit really well with this
feature.  I was exploring another way to do it given the pushback on
using cgroups initially.  The RMIDs won't be exposed; userspace sees a
group identifier instead (in cgroups it's the new subdirectory in the
subsystem), and RMIDs are assigned by the kernel, completely hidden
from userspace.
quoted
Also, now that the new SDM is available
Can you guys please set up a mailing list already so we know when
there's new versions out? Ideally mailing out the actual PDF too so I
get the automagic download and archive for all versions.
I assume this has been requested before.  As I'm typing this, I just
received the notification internally that the new SDM is now published.
I'll forward your request along and see what I hear back.
quoted
, there is a new feature added to
the same family as CQM, called Memory Bandwidth Monitoring (MBM).  The
original cgroup approach would have allowed another subsystem to be added
next to cacheqos; the perf-cgroup here is not easily expandable.
The /proc/<pid> approach can add MBM pretty easily alongside CQM.
I'll have to go read up what you've done now, but if it's also RMID-based
I don't see why the proposed scheme won't work.
Yes please do look at the cgroup patches.  For the RMID allocation, we
could use your proposal to manage allocation/reclamation, and the
management interface to userspace will match the use cases I'm trying to
enable.
quoted
quoted
The below is a rough draft, most if not all XXXs should be
fixed/finished. But given I don't actually have hardware that supports
this stuff (afaik) I couldn't be arsed.
The hardware is not publicly available yet, but I know that Red Hat and
others have some of these platforms for testing.
Yeah, not in my house therefore it doesn't exist :-)
quoted
I really appreciate the patch.  A good amount of thought was put into
it, and it gave a good set of different viewpoints.  I'll keep the
comments all here in one place; it'll be easier to discuss than
disjointed in the code.

The rotation idea to reclaim RMIDs no longer in use is interesting.
This differs from the original patch, which would reclaim the RMID
when monitoring was disabled for that group of processes.

I can see a merged sort of approach, where if monitoring for a group of
processes is disabled, we can place that RMID onto a reclaim list.  The
next time an RMID is requested (monitoring is enabled for a
process/group of processes), the reclaim list is searched for an RMID
that has 0 occupancy (i.e. not in use), or worst-case, find and assign
one with the lowest occupancy.  I did discuss this with hpa offline and
this seemed reasonable.
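
As a rough sketch of the merged approach described above (illustrative
only, not the patch's code): release_rmid, request_rmid and the
read_occupancy callback are hypothetical names, and the occupancy
numbers are made up for the example.

```python
def release_rmid(reclaim_list, rmid):
    """Monitoring was disabled for a group: park its RMID for later reuse."""
    reclaim_list.append(rmid)


def request_rmid(reclaim_list, read_occupancy):
    """Pick an RMID for a newly monitored group.

    Taking the minimum occupancy returns a zero-occupancy RMID when one
    exists and otherwise falls back to the least-occupied one (worst case).
    Returns None if nothing is on the reclaim list.
    """
    if not reclaim_list:
        return None
    best = min(reclaim_list, key=read_occupancy)
    reclaim_list.remove(best)
    return best


# Made-up occupancy readings, keyed by RMID.
occupancy = {3: 4096, 5: 0, 7: 128}

reclaim_list = []
for rmid in (3, 5, 7):
    release_rmid(reclaim_list, rmid)

first = request_rmid(reclaim_list, occupancy.__getitem__)   # empty RMID
second = request_rmid(reclaim_list, occupancy.__getitem__)  # lowest left
```

Here the first request gets RMID 5 (zero occupancy) and the second gets
RMID 7, the least-occupied of what remains. The worst-case branch still
hands out an RMID with stale lines attached.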

Thoughts?
So you have to wait for one 'freed' RMID to become empty before
'allowing' reads of the other RMIDs, otherwise the visible value can be
complete rubbish. Even for low frequency rotation, see the above
scenario about asynchronous operations.

This means you have to always have at least one free RMID.
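
One possible reading of that constraint, as a toy state machine (an
assumption-laden sketch, not the draft patch; Rotator, limbo and
reads_allowed are hypothetical names): a group can only rotate into a
known-empty RMID, the RMID it leaves sits in limbo until its occupancy
drains to zero, and reads are held off while anything is in limbo.

```python
class Rotator:
    """Toy model of RMID rotation. All names here are hypothetical."""

    def __init__(self, rmids):
        self.free = list(rmids)  # RMIDs known to be empty
        self.limbo = []          # released, may still own cache lines
        self.active = {}         # group name -> RMID

    def assign(self, group):
        if not self.free:
            raise RuntimeError("no free RMID available")
        self.active[group] = self.free.pop()

    def rotate(self, group):
        """Move `group` onto an empty RMID; its old RMID goes to limbo."""
        if not self.free:
            raise RuntimeError("rotation needs at least one free RMID")
        old = self.active[group]
        self.active[group] = self.free.pop()
        self.limbo.append(old)

    def drain(self, read_occupancy):
        """Return limbo RMIDs whose occupancy has reached zero to `free`."""
        still_dirty = []
        for rmid in self.limbo:
            if read_occupancy(rmid) == 0:
                self.free.append(rmid)
            else:
                still_dirty.append(rmid)
        self.limbo = still_dirty

    def reads_allowed(self):
        """Assumed policy: no reads while a released RMID is still draining."""
        return not self.limbo
```

With three RMIDs and two groups, one RMID is always spare; rotating a
group consumes the spare, and no further rotation is possible until
drain() sees the old RMID reach zero occupancy.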
Understood now, I was missing the asynchronous point you were trying to
make.  I thought you wanted the free RMID so that there is always a
known-"empty" one to assign, not to get around the twiddling that can
occur.

Let me know what you think about the cacheqos cgroup implementation I
sent, and if things don't look horrible, I can respin with your RMID
management scheme.

Thanks,
-PJ

-- 
PJ Waskiewicz				Open Source Technology Center
peter.p.waskiewicz.jr-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org		Intel Corp.