I/O bandwidth controller (was Re: Too many I/O controller patches)
Caitlin.Bestler at neterion.com
Wed Aug 6 11:01:13 PDT 2008
Fernando Luis Vázquez Cao wrote:
> *** Goals
> 1. Cgroups-aware I/O scheduling (being able to define arbitrary
> groupings of processes and treat each group as a single scheduling
> entity).
> 2. Being able to perform I/O bandwidth control independently on each
> device.
> 3. I/O bandwidth shaping.
> 4. Scheduler-independent I/O bandwidth control.
> 5. Usable with stacking devices (md, dm and other devices of that
> ilk).
> 6. I/O tracking (handle buffered and asynchronous I/O properly).
> The list of goals above is not exhaustive and it is also likely to
> contain some not-so-nice-to-have features so your feedback would be
> appreciated.
I'm following this discussion on the virtualization mailing list,
and I'm limiting my reply to virtualization-related mailing lists.
What strikes me about Fernando's RFC is that it is quite readable.
So much so that I understood most of it, and I really shouldn't have.
That is because it is mostly at the scope of a kernel scheduler,
something I have no reason to be involved with.
What is missing from the discussion is anything on how I/O bandwidth
could be regulated in a virtualized environment, beyond the objective
that cgroups should be configurable at the scope of a container.
For a virtualized environment, it strikes me that there are a number
of additional objectives that need to be considered:
- A Backend (Dom0/DomD/whatever) that is providing block storage
services to Guests needs to be able to provide QoS controls with
or without the co-operation of the Guest kernels. However GOS
involvement is valuable, because the backend on its own would treat
all IO from a given guest as an undifferentiated glop. Particularly,
to the extent that the improved algorithms under discussion in the
RFC are implemented in the Linux kernel, how do they become hints
when passed to the backend?
- In virtualized environments it is *more* likely that the underlying
storage is networked. This means that the backend scheduler probably
only has crude control over when I/O is actually done. It can only
control when requests are issued. The target has more control over
when the I/O actually occurs, especially for writes.
- When the storage is networked, the scheduling will also be QoS
regulated at the network traffic layer. The two QoS scheduling
algorithms should not work at cross purposes, nor should system
administrators be forced to configure two sets of policies to
achieve a single result.
- Does it make sense to extend storage-related network bandwidth controls
to deal with sub-division by device or Guest? Network bandwidth control
is likely to give a rate/quota for networked storage traffic. Would it
make sense to sub-divide that control based on cgroups/whatever?
- Should the Guest I/O scheduler be given a hint that although it *looks*
like a SCSI device, it is really a virtualized storage device, and
that therefore the Guest probably should not spend so many CPU cycles
trying to optimize the movements of a head that does not really exist?
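Absent a protocol-level hint, administrators can already approximate this manually: on mainline kernels of this era the per-device elevator is switchable through sysfs, and the "noop" elevator hands requests straight through without seek optimization. The device name below is just an example; this is a configuration fragment, not a proposed mechanism.

```shell
# Switch one (virtual) disk to the pass-through elevator at runtime:
echo noop > /sys/block/sda/queue/scheduler

# Or select it for every disk at boot via the kernel command line:
#   elevator=noop
```

A real hint from the backend would let the guest make this choice automatically instead of relying on per-guest hand tuning.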