[RFC] writeback and cgroup

Vivek Goyal vgoyal at redhat.com
Tue Apr 10 21:20:41 UTC 2012


On Tue, Apr 10, 2012 at 11:05:05PM +0200, Jan Kara wrote:

[..]
> > Ok. So what is the meaning of "make process wait" here? What will it
> > depend on? I am thinking of a case where a process has 100MB of dirty
> > data, has a 10MB/s write limit and issues fsync. So before that process
> > is able to open a transaction, one needs to wait at least 10 seconds
> > (assuming other processes are not doing IO in the same cgroup).
>   The original idea was that we'd have a "bdi-congested-for-cgroup" flag
> and the process starting a transaction will wait for this flag to get
> cleared before starting a new transaction. This will be easy to implement
> in filesystems and won't have serialization issues. But my knowledge of
> blk-throttle is lacking so there might be some problems with this approach.

I have implemented and posted patches for a per-bdi, per-cgroup congestion
flag. The only problem I see is that a group might stay congested for a
long time because of lots of other IO happening (say direct IO). If you
keep backing off and never submit the metadata IO (transaction), you get
starved. And if you go ahead and submit IO in a congested group, we are
back to the serialization issue.
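
To make the starvation concern concrete, here is a rough back-of-the-envelope
model in Python. It is only a sketch, not kernel code: the function name and
parameters are made up for illustration, and it assumes the group counts as
congested for as long as it has any backlog and that competing IO arrives at
a steady rate.

```python
def seconds_until_uncongested(backlog_mb, limit_mb_s, other_io_mb_s=0.0):
    """Time until a throttled group drains its backlog, or None if it never does.

    backlog_mb:    data already queued in the group (hypothetical model input)
    limit_mb_s:    the group's write limit
    other_io_mb_s: rate at which other IO (e.g. direct IO) keeps arriving
    """
    drain_rate = limit_mb_s - other_io_mb_s
    if drain_rate <= 0:
        # The backlog never shrinks, so a waiter that keeps backing off starves.
        return None
    return backlog_mb / drain_rate


# 100MB of dirty data at a 10MB/s limit, no other IO in the group.
print(seconds_until_uncongested(100, 10))      # 10.0
# Same backlog, but direct IO keeps arriving at the full limit.
print(seconds_until_uncongested(100, 10, 10))  # None
```

In this model the wait is bounded only when the group's limit exceeds the
rate of competing IO; otherwise a transaction that waits for the flag to
clear never gets submitted.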

[..]
> > One more factor makes absolute throttling interesting and that is global
> > throttling and not per device throttling. For example in case of btrfs,
> > there is no single stacked device on which to put total throttling
> > limits.
>   Yes. My intended interface for the throttling is bdi. But you are right
> it does not exactly match the fact that the throttling happens per device
> so it might get tricky. Which brings up a question - shouldn't the
> throttling blk-throttle does rather happen at bdi layer? Because the
> uses of the functionality I have in mind would match that better.

I guess throttling at the bdi layer will take care of the network filesystem
case too? But isn't the notion of "bdi" internal to the kernel? Users do
not really program things in terms of bdi.

Also, a per-bdi limit mechanism will not solve the issue of global throttling:
in the case of btrfs, IO might go to multiple bdis, so the throttling limits
are per bdi, not total.
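
A small Python sketch of why per-bdi limits do not add up to a global limit.
The device names, the function and the numbers are all hypothetical; it
simply assumes each bdi caps its own traffic independently of the others.

```python
def aggregate_rate(per_bdi_limits, per_bdi_demand):
    """Total throughput a cgroup achieves when each bdi is throttled on its own.

    per_bdi_limits: mapping of bdi name -> limit in MB/s (illustrative values)
    per_bdi_demand: mapping of bdi name -> rate the cgroup tries to write, MB/s
    """
    return sum(
        min(per_bdi_demand.get(bdi, 0.0), limit)
        for bdi, limit in per_bdi_limits.items()
    )


# A filesystem spread over three devices, each with a 10MB/s limit.
limits = {"sda": 10.0, "sdb": 10.0, "sdc": 10.0}
demand = {"sda": 50.0, "sdb": 50.0, "sdc": 50.0}
print(aggregate_rate(limits, demand))  # 30.0
```

With a 10MB/s limit on each of three bdis, the group can still write 30MB/s
in total, so there is no single place to express "10MB/s overall".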

Thanks
Vivek

