[Bugme-new] [Bug 16991] New: divide by zero bug in find_busiest_group (actually inlined update_sg_lb_stats )

bugzilla-daemon at bugzilla.kernel.org
Tue Aug 24 15:48:07 PDT 2010


           Summary: divide by zero bug in find_busiest_group  (actually
                    inlined update_sg_lb_stats )
           Product: Process Management
           Version: 2.5
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: Scheduler
        AssignedTo: mingo at elte.hu
        ReportedBy: chetan.ahuja at gmail.com
        Regression: No


  We upgraded to the "official" 2.6.32 kernel on our small cluster of Linux
boxes a few months ago. Within the last week or two, we've started seeing this
divide-by-zero crash on several of these servers at seemingly random times. We
haven't been able to catch one while attached to our console servers, but
luckily we can capture an image of the last screen of the stack trace, one of
which I'm adding as an attachment.

  Using the RIP in that trace, I was able to pin down the line of code where
this crash is happening to the following (in the 2.6.32 tree):

The function
    *balance = 0;

    /* Adjust by relative CPU power of the group */
    sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) /
    aff5: 48 c1 e0 0a shl $0xa,%rax
    aff9: 48 f7 f6 div %rsi

  So it looks like group->cpu_power happens to be zero in these crashes. All
the crashes we've examined have been at the exact same address. The same code
structure is present in the 2.6.35 series kernels, except that it seems to have
moved to the sched_fair.c file.
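
To illustrate the arithmetic, here is a minimal user-space sketch, not the
kernel's code. The shl $0xa in the disassembly is a multiply by 1024, which is
what SCHED_LOAD_SCALE is assumed to be here, and the div %rsi is the division
that traps when the divisor register holds zero. The helper name
avg_load_guarded and its return-0 fallback are hypothetical, chosen only for
illustration; they are not a proposed fix.

```c
/* Assumed value: 1 << 10, consistent with the "shl $0xa" in the objdump. */
#define SCHED_LOAD_SCALE 1024UL

/*
 * Hypothetical helper (not a kernel function): computes the scaled average
 * load, but returns 0 instead of dividing when cpu_power is 0. An unguarded
 * integer division by zero on x86 raises a divide error, which is the crash
 * described in this report.
 */
unsigned long avg_load_guarded(unsigned long group_load,
                               unsigned long cpu_power)
{
    if (cpu_power == 0)
        return 0;   /* illustrative fallback only */
    return (group_load * SCHED_LOAD_SCALE) / cpu_power;
}
```

With a non-zero cpu_power the helper behaves like the original expression;
with cpu_power == 0 it sidesteps the trap, which would hide the underlying
question of why the field is still zero.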

  The above is a snippet from an objdump of sched.o built with the -g flag. The
instruction addresses in this dump don't match the RIP from the screenshot,
since these are unrelocated addresses. I did match the assembly here to that in
the full vmlinux objdump, and I'm quite sure of the location. (The -g build of
the entire kernel is too huge for me to work with.)

BTW, there's a report of another divide-by-zero in find_busiest_group in a
2.6.35 kernel here:

  Looking at the code, what strikes me is that the group->cpu_power fields are
initialized to 0 in build_xxx_sched_groups, but I don't see any synchronization
between those places (where group->cpu_power starts off as zero) and the use of
that variable as a denominator in the find_busiest_group and find_idlest_group
functions. It's possible that there's some logical "can't happen" type of
synchronization inherent in the design that's not visible in the code. If so,
I'd love to understand the argument that makes it OK.
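
The window being described can be sketched in user space as follows. This is
only a model under stated assumptions: struct group_sketch and the three
functions are hypothetical stand-ins, not the kernel's sched_group or its
build and balance routines, and the sketch is single-threaded, so it shows the
ordering problem rather than a real race.

```c
#include <stdbool.h>
#include <string.h>

#define SCHED_LOAD_SCALE 1024UL  /* assumed value, 1 << 10 */

/* Hypothetical stand-in for a scheduler group; only the field at issue. */
struct group_sketch {
    unsigned long cpu_power;
};

/* Models the build step: the group starts out with cpu_power == 0. */
void build_group_sketch(struct group_sketch *g)
{
    memset(g, 0, sizeof(*g));
}

/* Models a later step that assigns the real power value. */
void set_power_sketch(struct group_sketch *g, unsigned long power)
{
    g->cpu_power = power;
}

/*
 * Models a reader that uses cpu_power as a denominator. It returns false
 * if it observes the group before set_power_sketch() has run, which is the
 * point where an unguarded division would fault.
 */
bool avg_load_sketch(const struct group_sketch *g, unsigned long load,
                     unsigned long *out)
{
    if (g->cpu_power == 0)
        return false;
    *out = (load * SCHED_LOAD_SCALE) / g->cpu_power;
    return true;
}
```

A reader called between build_group_sketch() and set_power_sketch() sees the
zero; whether the real code can ever reach that state is exactly the question
raised above.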

