[PATCH 0/4] x86: Add Cache QoS Monitoring (CQM) support

Waskiewicz Jr, Peter P peter.p.waskiewicz.jr at intel.com
Mon Jan 6 20:10:45 UTC 2014


On Mon, 2014-01-06 at 19:06 +0100, Peter Zijlstra wrote:
> On Mon, Jan 06, 2014 at 04:47:57PM +0000, Waskiewicz Jr, Peter P wrote:
> > > As is I don't really see a good use for RMIDs and I would simply not use
> > > them.
> > 
> > If you want to use CQM in the hardware, then the RMID is how you get the
> > cache usage data from the CPU.  If you don't want to use CQM, then you
> > can ignore RMIDs.
> 
> I think you can make do with a single RMID (per cpu). When you program
> the counter (be it for a task, cpu or cgroup context) you set the 1 RMID
> and EVSEL and read the CTR.
> 
> What I'm not entirely clear on is if the EVSEL and CTR MSR are per
> logical CPU or per L3 (package); /me prays they're per logical CPU.

There is one set per logical CPU.  However, in the current generation,
they all report on the usage of the same L3 cache.  The CPU resolves
which logical CPU each MSR write and read comes from, so software
doesn't need to lock access to them across different CPUs.
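
A minimal sketch of the EVSEL/CTR handshake, covering only how the
values might be composed and decoded.  The MSR addresses, the 10-bit
RMID field width, and the error/unavailable bits are assumptions taken
from my reading of the SDM, and the helper names are made up for
illustration.  The actual wrmsr/rdmsr on the local CPU is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSR addresses and field layout; check against the SDM. */
#define MSR_IA32_QM_EVTSEL  0x0C8D
#define MSR_IA32_QM_CTR     0x0C8E

#define QM_EVT_L3_OCCUPANCY 0x01u         /* event ID for L3 occupancy */
#define QM_RMID_SHIFT       32            /* RMID field starts at bit 32 */
#define QM_RMID_MAX         0x3FFu        /* assumed 10-bit RMID field */
#define QM_CTR_ERROR        (1ULL << 63)  /* bad RMID or event ID */
#define QM_CTR_UNAVAIL      (1ULL << 62)  /* no data for this RMID yet */
#define QM_CTR_DATA_MASK    ((1ULL << 62) - 1)

/* Value to write to EVSEL: event ID in the low byte, RMID from bit 32. */
uint64_t qm_evtsel_encode(uint32_t rmid, uint8_t event)
{
	assert(rmid <= QM_RMID_MAX);
	return ((uint64_t)rmid << QM_RMID_SHIFT) | event;
}

/*
 * Decode a raw CTR value.  Returns false if the hardware flagged the
 * sample as an error or as unavailable; otherwise stores the occupancy
 * in bytes, scaled by the factor the CPU reports (passed in here).
 */
bool qm_ctr_decode(uint64_t raw, uint32_t upscale, uint64_t *bytes)
{
	if (raw & (QM_CTR_ERROR | QM_CTR_UNAVAIL))
		return false;
	*bytes = (raw & QM_CTR_DATA_MASK) * upscale;
	return true;
}
```

Since the pair is per logical CPU, the write to EVSEL and the read of
CTR would have to happen on the same CPU without migrating in between,
but no cross-CPU lock should be needed.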

> > One of the best use cases for using RMIDs is in virtualization.
> 
> *groan*.. /me plugs wax in ears and goes la-la-la-la
> 
> > A VM
> > may be a heavy cache user, or a light cache user.  Tracing different VMs
> > on different RMIDs can allow an admin to identify which VM may be
> > causing high levels of eviction, and either migrate it to another host,
> > or move other tasks/VMs to other hosts.  Without CQM, it's much harder
> > to find which process is eating the cache up.
> 
> Not necessarily VMs, there's plenty large processes that exhibit similar
> problems.. why must people always do VMs :-(

Completely agreed.  It's just that the loudest people asking for this
capability right now are mostly using VMs.

> That said, even with a single RMID you can get that information by
> simply running it against all competing processes one at a time. Since
> there's limited RMID space you need to rotate at some point anyway.
> 
> The cgroup interface you propose wouldn't allow for rotation; other than
> manual by creating different cgroups one after another.

I see your points, and I also think that the cgroup approach as it
stands isn't the best way to make this completely flexible.  What about
this:

Add a new read/write entry to the /proc/<pid> attributes that holds the
RMID the process is assigned to.  Then expose all the available RMIDs
in /sys/devices/system/cpu, say in a new directory platformqos (or
whatever), with each RMID carrying its statistics plus a knob to turn
monitoring on or off.  Then all the kernel exposes is a way to assign a
PID to an RMID, a way to turn monitoring on or off, and a way to get the
data.  I can then put a simple userspace tool together to make the
management suck less.
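
To make that concrete, here is a rough sketch of the helpers such a
tool might start from.  Everything in it is hypothetical: the
/proc/<pid>/rmid file name, the rmid<N> directory naming under
platformqos, and attribute names like "enable" are placeholders, and
none of this exists in any kernel today.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical sysfs location from the proposal; the name is a placeholder. */
#define QOS_SYSFS_DIR "/sys/devices/system/cpu/platformqos"

/* Build the path of the per-process attribute holding the assigned RMID. */
int qos_pid_rmid_path(char *buf, size_t len, pid_t pid)
{
	return snprintf(buf, len, "/proc/%ld/rmid", (long)pid);
}

/* Build the path of one attribute (e.g. "enable") under an RMID directory. */
int qos_rmid_attr_path(char *buf, size_t len, unsigned int rmid,
		       const char *attr)
{
	assert(attr != NULL);
	return snprintf(buf, len, QOS_SYSFS_DIR "/rmid%u/%s", rmid, attr);
}

/* Write a decimal value to a file; returns 0 on success, -1 on failure. */
int qos_write_uint(const char *path, unsigned int val)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", val) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Assigning a PID to RMID 3 would then be a qos_write_uint() of 3 to the
path from qos_pid_rmid_path(), and turning monitoring on would be a
write of 1 to that RMID's "enable" attribute.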

Thoughts?

Cheers,
-PJ

-- 
PJ Waskiewicz				Open Source Technology Center
peter.p.waskiewicz.jr at intel.com		Intel Corp.

