[RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)

Hirokazu Takahashi taka at valinux.co.jp
Thu Sep 18 04:24:16 PDT 2008


> > Hi,
> > 
> >> TODO:
> >>
> >> * Try to push down the throttling and implement it directly in the I/O
> >>   schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
> >>   to keep track of the right cgroup context. This approach could lead to more
> >>   memory consumption and increase the number of dirty pages (hard/slow to
> >>   reclaim pages) in the system, since dirty-page ratio in memory is not
> >>   limited. This could even lead to potential OOM conditions, but these problems
> >>   can be resolved directly in the memory cgroup subsystem
> >>
> >> * Handle I/O generated by kswapd: at the moment there's no control on the I/O
> >>   generated by kswapd; try to use the page_cgroup functionality of the memory
> >>   cgroup controller to track this kind of I/O and charge the right cgroup when
> >>   pages are swapped in/out
> > 
> > FYI, this can also be done with bio-cgroup, which determines the owner cgroup
> > of a given anonymous page.
> > 
> > Thanks,
> > Hirokazu Takahashi
> That would be great! FYI here is how I would like to proceed:
> - today I'll post a new version of my cgroup-io-throttle patch rebased
>   to 2.6.27-rc5-mm1 (it's well tested and seems to be stable enough).
>   To keep things light and simple I've implemented custom
>   get_cgroup_from_page() / put_cgroup_from_page() in the memory
>   controller to retrieve the owner of a page, holding a reference to the
>   corresponding memcg, during async writes in submit_bio(); this is
>   probably not the best way to proceed, and a more generic framework like
>   bio-cgroup sounds better, but it seems to work quite well. The only
>   problem I've found is that during swap_writepage() the page is not
>   assigned to any page_cgroup (page_get_page_cgroup() returns NULL), and

This behavior depends on the version of memory-cgroup.
In the previous version, pages in the swap cache were owned by one of
the cgroups.

Kamezawa-san, one of the implementers, told me he has turned this feature
off temporarily and is going to turn it on again. I think this
workaround was chosen because the current implementation of the memory
cgroup has a weak point under memory pressure.

>   so I'm not able to charge the cost of this I/O operation to the right
>   cgroup. Does bio-cgroup address or even resolve this issue?

For the time being, bio-cgroup can't support pages in the swap cache with
the current Linux kernel either, since it shares the same infrastructure
with memory-cgroup.
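
To make the accounting gap concrete, here is a minimal user-space sketch in
C, not kernel code. The function names get_cgroup_from_page() and
put_cgroup_from_page() come from Andrea's description above, but their
signatures, the cgroup_model and page_model structures, and charge_write()
are hypothetical stand-ins invented for illustration. Charging untracked
pages to a fallback group is also only an assumption of this sketch; nobody
in this thread has proposed it as the actual behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup; not the kernel's definition. */
struct cgroup_model {
    const char *name;
    int refcount;            /* references held by in-flight lookups */
    unsigned long io_bytes;  /* bytes of I/O charged to this group */
};

/* Hypothetical stand-in for a page. owner == NULL models a swap cache
 * page that is not assigned to any page_cgroup. */
struct page_model {
    struct cgroup_model *owner;
};

/* Look up the owner of a page and take a reference on it.
 * Returns NULL when the page is not tracked. */
static struct cgroup_model *get_cgroup_from_page(struct page_model *page)
{
    struct cgroup_model *cg = page->owner;

    if (cg)
        cg->refcount++;
    return cg;
}

/* Drop the reference taken by get_cgroup_from_page(). */
static void put_cgroup_from_page(struct cgroup_model *cg)
{
    if (!cg)
        return;
    assert(cg->refcount > 0);
    cg->refcount--;
}

/* Charge one write to the page's owner. Returns 1 when the owner was
 * found, 0 when the page was untracked and the cost went to the
 * fallback group (an assumption made only for this sketch). */
static int charge_write(struct page_model *page,
                        struct cgroup_model *fallback,
                        unsigned long bytes)
{
    struct cgroup_model *cg = get_cgroup_from_page(page);

    if (!cg) {
        fallback->io_bytes += bytes;
        return 0;
    }
    cg->io_bytes += bytes;
    put_cgroup_from_page(cg);
    return 1;
}
```

In this model a write of a tracked page is charged to its owner and the
reference is dropped again, while a write of an untracked page, as in the
swap_writepage() case, can only be attributed to the fallback group.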

They have just started to rewrite the infrastructure that tracks pages
with page_cgroup, which should give us better performance than ever.
After that I'm going to enhance bio-cgroup further, for example with dirty
page tracking. To tell the truth, I already have a dirty-page tracking
patch for the current Linux kernel in hand, which I haven't posted yet.
I'm going to port it to the new infrastructure.

If the memory cgroup team changes its mind, I will implement swap-page
tracking in bio-cgroup.

> - begin to implement a new branch of cgroup-io-throttle on top of
>   bio-cgroup
> - also start to implement an additional request queue to provide first a
>   control at the cgroup level and a dispatcher to pass the request to
>   the elevator (as suggested by Vivek)
> Thanks,
> -Andrea

Hirokazu Takahashi.
