[RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)

Takuya Yoshikawa yoshikawa.takuya at oss.ntt.co.jp
Wed Sep 17 02:04:00 PDT 2008


Andrea Righi wrote:
> * Try to push down the throttling and implement it directly in the I/O
>   schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
>   to keep track of the right cgroup context. This approach could lead to more
>   memory consumption and increases the number of dirty pages (hard/slow to
>   reclaim pages) in the system, since dirty-page ratio in memory is not
>   limited. This could even lead to potential OOM conditions, but these problems
>   can be resolved directly into the memory cgroup subsystem
> * Handle I/O generated by kswapd: at the moment there's no control on the I/O
>   generated by kswapd; try to use the page_cgroup functionality of the memory
>   cgroup controller to track this kind of I/O and charge the right cgroup when
>   pages are swapped in/out

Could you explain which cgroup we should charge when swap in or out occurs?
Are there any difference between the following cases?

Target page is
1. used as page cache and not mapped to any space
2. used as page cache and mapped to some space
3. not used as page cache and mapped to some space

I do not think it is fair to charge the process for this kind of I/O, am I wrong?

> * Improve fair throttling: distribute the time to sleep among all the tasks of
>   a cgroup that exceeded the I/O limits, depending of the amount of IO activity
>   generated in the past by each task (see task_io_accounting)
> * Try to reduce the cost of calling cgroup_io_throttle() on every submit_bio();
>   this is not too much expensive, but the call of task_subsys_state() has
>   surely a cost. A possible solution could be to temporarily account I/O in the
>   current task_struct and call cgroup_io_throttle() only on each X MB of I/O.
>   Or on each Y number of I/O requests as well. Better if both X and/or Y can be
>   tuned at runtime by a userspace tool
> * Think an alternative design for general purpose usage; special purpose usage
>   right now is restricted to improve I/O performance predictability and
>   evaluate more precise response timings for applications doing I/O. To a large
>   degree the block I/O bandwidth controller should implement a more complex
>   logic to better evaluate real I/O operations cost, depending also on the
>   particular block device profile (i.e. USB stick, optical drive, hard disk,
>   etc.). This would also allow to appropriately account I/O cost for seeky
>   workloads, respect to large stream workloads. Instead of looking at the
>   request stream and try to predict how expensive the I/O cost will be, a
>   totally different approach could be to collect request timings (start time /
>   elapsed time) and based on collected informations, try to estimate the I/O
>   cost and usage
> -Andrea

Takuya Yoshikawa

More information about the Containers mailing list