[PATCH v3 0/7] cpuset: implement sane hierarchy behaviors

Li Zefan lizefan at huawei.com
Thu Jun 13 07:04:36 UTC 2013


On 2013/6/10 0:03, Tejun Heo wrote:
> Hello, Li.
> 
> On Sun, Jun 09, 2013 at 05:14:02PM +0800, Li Zefan wrote:
>> v2 -> v3:
>> Currently some cpuset behaviors are not friendly when cpuset is co-mounted
>> with other cgroup controllers.
>>
>> Now with this patchset if cpuset is mounted with sane_behavior option, it
>> behaves differently:
>>
>> - Tasks will be kept in empty cpusets when hotplug happens and take masks
>> of ancestors with non-empty cpus/mems, instead of being moved to an ancestor.
>>
>> - A task can be moved into an empty cpuset, and again it takes masks of
>> ancestors, so the user can drop a task into a newly created cgroup without
>> having to do anything for it.
> 
> I applied 1-2 and the rest of the series also look correct to me and
> seem like a step in the right direction; however, I'm not quite sure
> this is the final interface we want.
> 
> * cpus/mems_allowed changing as CPUs go up and down is nasty.  There
>   should be separation between the configured CPUs and currently
>   available CPUs.  The current behavior makes sense when coupled with
>   the irreversible task migration and all.  If we're allowing tasks to
>   remain in empty cpusets, it only makes sense to retain and re-apply
>   configuration as CPUs come back online.
> 
>   I find the original behavior of changing configurations as system
>   state changes pretty weird especially because it's happening without
>   any notification making it pretty difficult to use in any sort of
>   automated way - anything which wants to wrap cpuset would have to
>   track the configuration and CPU/nodes up/down states separately on
>   its own, which is a very easy way to introduce incoherencies.
> 
> * validate_change() rejecting updates to config if any of its
>   descendants are using some is weird.  The config change should be
>   enforced in hierarchical manner too.  If the parent drops some CPUs,
>   it should simply drop those CPUs from the children.  The same in the
>   other direction, children having configs which aren't fully
>   contained inside their parents is fine as long as the effective
>   masks are correct.
> 

I've just checked other cgroup controllers, and they do behave the
way you described. So yeah, it makes sense for cpuset to behave
consistently with them.

>   IOW, validate_change() doesn't really make sense if we're keeping
>   tasks in empty cgroups.  As CPUs go down and up, we'd keep the
>   organization but lose the configuration, which is just weird.
> 
> I think what we want is expanding on this patchset so that we have
> separate "configured" and "effective" masks, which are preferably
> exposed to userland and just let the config propagation deal with
> computing the effective masks as CPUs/nodes go down/up and config
> changes.  The code actually could be simpler that way although
> there'll be complications due to the old behaviors.
> 
> What do you think?  If you agree, how should we proceed?  We can apply
> these patches and build on top if you prefer.
> 

I would prefer that those patches be applied first. The new changes can
be based on this patchset and should be quite straightforward, and I
won't have to rebase those patches again.
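
For illustration only, the "configured" vs "effective" split discussed
above could be modelled roughly as below. This is a userspace toy, not
kernel code: struct toy_cpuset, mask_t, toy_update_effective() and
toy_propagate() are hypothetical names, a plain 64-bit integer stands in
for a cpumask, and falling back to the parent's effective mask when the
intersection is empty is an assumption taken from the "takes masks of
ancestors" behavior in the cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t mask_t;

struct toy_cpuset {
	struct toy_cpuset *parent;	/* NULL for the root */
	mask_t configured;		/* what the user wrote; hotplug never edits it */
	mask_t effective;		/* what tasks actually get; always recomputed */
};

/*
 * Recompute one cpuset's effective mask.  The parent's effective mask
 * must already be up to date.  The root is limited by the online mask.
 */
static void toy_update_effective(struct toy_cpuset *cs, mask_t online)
{
	mask_t limit;

	assert(cs != NULL);
	limit = cs->parent ? cs->parent->effective : online;

	cs->effective = cs->configured & limit;
	if (!cs->effective)
		cs->effective = limit;	/* nothing usable: inherit from above */
}

/*
 * Propagate a hotplug event or a config change through the hierarchy.
 * 'topdown' must list every parent before any of its children.
 */
static void toy_propagate(struct toy_cpuset **topdown, size_t n, mask_t online)
{
	for (size_t i = 0; i < n; i++)
		toy_update_effective(topdown[i], online);
}
```

In this model a child configured with CPUs 2-3 under a root configured
with CPUs 0-3 falls back to the root's mask while CPUs 2-3 are offline,
and gets CPUs 2-3 back once they return, because "configured" is never
modified. A child whose configured mask is not contained in its parent's
is also harmless here, since only the intersection reaches "effective".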


