[patch 2/2] sched: fix nr_uninterruptible accounting of frozen tasks really

Peter Zijlstra peterz at infradead.org
Fri Jul 17 05:31:50 PDT 2009

On Fri, 2009-07-17 at 12:25 +0000, Thomas Gleixner wrote:
> plain text document attachment (freezer-fix-accounting-for-real.patch)
> commit e3c8ca8336 (sched: do not count frozen tasks toward load) broke
> the nr_uninterruptible accounting on freeze/thaw. On freeze the task
> is excluded from accounting with a check for (task->flags &
> PF_FROZEN), but that flag is cleared before the task is thawed. So
> while we prevent that the freezing task with state
> TASK_UNINTERRUPTIBLE is accounted to nr_uninterruptible we decrement
> nr_uninterruptible on thaw.
> Use a separate flag which is handled by the freezing task itself. Set
> it before calling the scheduler with TASK_UNINTERRUPTIBLE state and
> clear it after we return from frozen state.
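
To make the imbalance concrete, here is a toy userspace model of the
accounting, not the kernel code. The flag values, the helper names, and
PF_FREEZING as the name of the separate flag are assumptions made for
illustration. It only mirrors the sequence described above: the increment
is skipped on freeze, but the decrement still happens on thaw because
PF_FROZEN is already cleared.

```c
#include <stdbool.h>

/* Toy model only; flag values are illustrative, not the kernel's. */
#define TASK_RUNNING         0
#define TASK_UNINTERRUPTIBLE 2
#define PF_FROZEN            0x1u
#define PF_FREEZING          0x2u  /* hypothetical name for the separate flag */

struct task {
    long state;
    unsigned int flags;
};

static long nr_uninterruptible;

/* Check as described for e3c8ca8336: frozen tasks do not count. */
static bool contributes_buggy(const struct task *t)
{
    return t->state == TASK_UNINTERRUPTIBLE && !(t->flags & PF_FROZEN);
}

/* Variant keyed on a flag that only the freezing task itself changes. */
static bool contributes_fixed(const struct task *t)
{
    return t->state == TASK_UNINTERRUPTIBLE && !(t->flags & PF_FREEZING);
}

typedef bool (*contrib_fn)(const struct task *);

/* Sleep path: account the task if it contributes to load. */
static void sleep_task(struct task *t, contrib_fn contributes)
{
    t->state = TASK_UNINTERRUPTIBLE;
    if (contributes(t))
        nr_uninterruptible++;
}

/* Wakeup path: the state is still TASK_UNINTERRUPTIBLE when checked. */
static void wake_task(struct task *t, contrib_fn contributes)
{
    if (contributes(t))
        nr_uninterruptible--;
    t->state = TASK_RUNNING;
}

long freeze_thaw_buggy(void)
{
    struct task t = { TASK_RUNNING, 0 };

    nr_uninterruptible = 0;
    t.flags |= PF_FROZEN;
    sleep_task(&t, contributes_buggy);  /* increment skipped */
    t.flags &= ~PF_FROZEN;              /* thaw clears the flag first */
    wake_task(&t, contributes_buggy);   /* decrement still happens */
    return nr_uninterruptible;
}

long freeze_thaw_fixed(void)
{
    struct task t = { TASK_RUNNING, 0 };

    nr_uninterruptible = 0;
    t.flags |= PF_FROZEN | PF_FREEZING; /* task sets its own flag */
    sleep_task(&t, contributes_fixed);  /* increment skipped */
    t.flags &= ~PF_FROZEN;              /* thaw */
    wake_task(&t, contributes_fixed);   /* decrement skipped too */
    t.flags &= ~PF_FREEZING;            /* cleared by the task after return */
    return nr_uninterruptible;
}
```

In this model freeze_thaw_buggy() returns -1 after one freeze/thaw cycle,
while freeze_thaw_fixed() returns 0.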

Right, so I'm wondering why we don't fully revert e3c8ca8336 to begin
with.

The changelog reads:

commit e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd
Author: Nathan Lynch <ntl at pobox.com>
Date:   Wed Apr 8 19:45:12 2009 -0500

    sched: do not count frozen tasks toward load

    Freezing tasks via the cgroup freezer causes the load average to climb
    because the freezer's current implementation puts frozen tasks in
    uninterruptible sleep (D state).

    Some applications which perform job-scheduling functions consult the
    load average when making decisions.  If a cgroup is frozen, the load
    average does not provide a useful measure of the system's utilization
    to such applications.  This is especially inconvenient if the job
    scheduler employs the cgroup freezer as a mechanism for preempting low
    priority jobs.  Contrast this with using SIGSTOP for the same purpose:
    the stopped tasks do not count toward system load.

    Change task_contributes_to_load() to return false if the task is
    frozen.  This results in /proc/loadavg behavior that better meets
    users' expectations.

It appears to me that a frozen cgroup is a transient state. You would
typically do something like:

  freeze -> {snapshot, migrate} -> {thaw, destroy}

Therefore a short increase in load doesn't seem like too big a problem;
it's going to be gone soon anyway.

