RFC: Attaching threads to cgroups is OK?
Fernando Luis Vázquez Cao
fernando at oss.ntt.co.jp
Wed Aug 20 20:08:01 PDT 2008
On Wed, 2008-08-20 at 20:48 +0900, Hirokazu Takahashi wrote:
> > > Tsuruta-san, how about your bio-cgroup's tracking concerning this?
> > > If we want to use your tracking functions for each thread separately,
> > > there seems to be a problem.
> > > ===cf. mm_get_bio_cgroup()===================
> > >           owner
> > > mm_struct ----> task_struct ----> bio_cgroup
> > > =============================================
> > > In my understanding, the mm_struct of a thread is the same as its parent's.
> > > So, even if we attach the TIDs of some threads to different cgroups, the
> > > tracking always returns the same bio_cgroup -- its parent's group.
> > > Do you have a policy on the cases in which your tracking can be used?
> > >
> > It will be a restriction when the io-controller reuses the information
> > about the owner of memory. But if it's very clear who issues the I/O (by
> > tracking read/write syscalls), we may have a chance to record the issuer
> > of the I/O in the page_cgroup struct.
> This might be a slightly different topic, though.
> I've been thinking about where we should add hooks to track I/O requests.
> I think the following set of hooks is enough whether we are going to
> support thread-based cgroups or not.
> Hook-1: called when allocating a page, where the memory controller
> already has a hook.
> Hook-2: called when making a page in page-cache dirty.
> For anonymous pages, Hook-1 is enough to track any type of I/O request.
> For pages in page-cache, Hook-1 is also enough for read I/O because
> the I/O is issued just once, right after allocating the page.
> For write I/O requests to pages in page-cache, Hook-1 will be okay
> in most cases, but sometimes a process in another cgroup may write to
> the pages. In this case, Hook-2 is needed to track I/O requests
> accurately.
This relative simplicity is what prompted me to say that we probably
should try to disentangle the io tracking functionality from the memory
controller a bit more (of course we still should reuse as much as we can
from it). The rationale for this is that the existing I/O scheduler
would benefit from proper io tracking capabilities too, so it'd be nice
if we could have them even in non-cgroup-capable kernels.
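To make the two quoted hooks concrete, here is a rough userspace sketch of
tracking that does not depend on the memory controller. Everything in it
(struct toy_page, hook_page_alloc(), hook_page_dirty(), io_cgroup_of()) is a
hypothetical name made up for illustration, not an existing kernel interface,
and re-attributing a page-cache page to the last writer is only my assumption
about what Hook-2 would do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_page {
	int owner_cgroup;   /* cgroup that I/O on this page is attributed to */
	int in_page_cache;  /* non-zero for page-cache pages, 0 for anonymous */
	int dirty;
};

/* Hook-1: called at allocation time; remember who allocated the page. */
static void hook_page_alloc(struct toy_page *pg, int cgroup_id,
			    int in_page_cache)
{
	assert(pg != NULL);
	pg->owner_cgroup = cgroup_id;
	pg->in_page_cache = in_page_cache;
	pg->dirty = 0;
}

/*
 * Hook-2: called when a page is made dirty. For page-cache pages the
 * writer may live in another cgroup, so (by assumption) the page is
 * re-attributed to the writer. Anonymous pages keep the Hook-1 owner.
 */
static void hook_page_dirty(struct toy_page *pg, int writer_cgroup_id)
{
	assert(pg != NULL);
	if (pg->in_page_cache)
		pg->owner_cgroup = writer_cgroup_id;
	pg->dirty = 1;
}

/* The cgroup a later write-back of this page would be accounted to. */
static int io_cgroup_of(const struct toy_page *pg)
{
	return pg->owner_cgroup;
}
```

With this model, a page-cache page allocated by cgroup 1 and later dirtied by
cgroup 2 is accounted to cgroup 2, while an anonymous page stays with its
allocator no matter who dirties it.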
As an aside, when the IO context of a certain IO operation is known
(synchronous IO comes to mind) I think it should be cached in the
resulting bio so that we can do without the expensive accesses to
bio_cgroup once it enters the block layer.
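Roughly what I have in mind, as a userspace toy rather than a patch: the
structures and helpers below (struct toy_bio, its cached_ioc field,
slow_lookup_cgroup(), bio_resolve_cgroup()) are hypothetical names for
illustration only, and the counter merely stands in for the cost of the
bio_cgroup lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these are not the kernel's io_context or bio. */
struct toy_io_context {
	int cgroup_id;
};

struct toy_bio {
	struct toy_io_context *cached_ioc; /* set when the issuer is known */
	int page_owner_cgroup;             /* what page tracking would say */
};

/* Counts how often the expensive path is taken. */
static int slow_lookups;

/* Stand-in for the expensive page -> bio_cgroup lookup. */
static int slow_lookup_cgroup(const struct toy_bio *bio)
{
	slow_lookups++;
	return bio->page_owner_cgroup;
}

/*
 * Called once the bio is in the block layer: use the cached context if
 * the submitter filled it in, otherwise fall back to the slow lookup.
 */
static int bio_resolve_cgroup(const struct toy_bio *bio)
{
	assert(bio != NULL);
	if (bio->cached_ioc != NULL)
		return bio->cached_ioc->cgroup_id;
	return slow_lookup_cgroup(bio);
}
```

A synchronous submitter would fill in cached_ioc before handing the bio down,
so only bios whose issuer is unknown at submission time (write-back, for
instance) would pay for the lookup.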