[Bugme-new] [Bug 11000] New: kacpi_notify still takes 90% of cpu

bugme-daemon at bugzilla.kernel.org
Sat Jun 28 13:53:48 PDT 2008


http://bugzilla.kernel.org/show_bug.cgi?id=11000

           Summary: kacpi_notify still takes 90% of cpu
           Product: ACPI
           Version: 2.5
     KernelVersion: 2.6.25.6
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Other
        AssignedTo: acpi_other at kernel-bugs.osdl.org
        ReportedBy: joeyadams3.14159 at gmail.com


Distribution:  Fedora 9 (problem has been seen in Ubuntu Hardy and earlier
versions of both Fedora and Ubuntu).
Hardware Environment:  HP Pavilion 503n, Intel Celeron 1.7GHz and Pentium 4
Northwood (single core).
Problem Description:

The kacpi_notify workqueue thread consumes 90% of the CPU after events such as
a suspend or the CPU getting hot.  This resembles the bugs reported at
http://ubuntuforums.org/showthread.php?t=399619
and
http://bugzilla.kernel.org/show_bug.cgi?id=10224

After a lot of printk tracing, I found that acpi_thermal_notify (in thermal.c)
with event type ACPI_THERMAL_NOTIFY_THRESHOLDS is triggered, leading to
acpi_power_off_device (in power.c).  When it gets to:

status = acpi_evaluate_object(resource->device->handle, "_OFF", NULL, NULL);

This does turn down the fan as it should, but it also causes the AML
interpreter to execute acpi_ex_opcode_2A_0T_0R (the handler for the Notify
opcode), which raises another notify event, and the cycle begins again.
Moreover, the ACPI driver can't determine the state of the fan (I've only
heard it at high power and low power, never off).  It also thinks it should
turn down the fan when the CPU gets hot.  However, these are separate issues
that I'm not too concerned about.

The kacpid and kacpi_notify problems appear in many different manifestations,
and all of them are ostensibly due to buggy ACPI BIOS implementations.  The
call chain I posted above is likely to differ from someone else's.
Nevertheless, these problems indicate that the ACPI driver should be able to
detect when a workqueue is being flooded and defer further events while it is.

Steps to reproduce:

On my system, letting the CPU stay above 51C or so for a while, or suspending
and resuming, will turn down (that's right, turn down) the fan and trigger the
endless CPU hogging.  Running echo 3 > /proc/acpi/fan/FAN1/state will also turn
down the fan, but will not trigger the spinning.

Proposed solution:

Create a flood protection mechanism for events queued to kacpid and
kacpi_notify.  For instance, if more than 10 events are sent to one of these
queues within a tenth of a second, further events should be held back until a
10-second timer elapses.  Normal operation should then not be interfered with
to any severe extent, and flooding of the kacpid and kacpi_notify queues
should have negligible impact on the rest of the system.

However, if a burst of ACPI activity happens before the timer is available
(e.g. because the timer source itself depends on ACPI), this general solution
could stall legitimate ACPI operation and prevent the timer from ever coming
up.  Thus, flood protection should stay disabled until, say, ten seconds of
ACPI driver uptime have passed (which also verifies that the clock works).
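
The proposal above can be sketched in plain userspace C as follows.  This is
only an illustration under stated assumptions, not kernel code and not a
patch: the names flood_guard, flood_guard_init and flood_guard_delay are
hypothetical, the thresholds are simply the example numbers from this report,
and the caller passes in the current time in milliseconds.  A kernel version
would presumably work in jiffies and defer events with delayed work items
instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thresholds, taken from the example numbers in the report. */
#define FLOOD_MAX_EVENTS  10      /* events allowed per window */
#define FLOOD_WINDOW_MS   100     /* a tenth of a second */
#define FLOOD_HOLDOFF_MS  10000   /* how long events are held back */
#define FLOOD_GRACE_MS    10000   /* no protection during early uptime */

/* Hypothetical per-queue state (one for kacpid, one for kacpi_notify). */
struct flood_guard {
    uint64_t start_ms;          /* time the guard was initialised */
    uint64_t window_start_ms;   /* start of the current counting window */
    uint64_t holdoff_until_ms;  /* events are deferred until this time */
    unsigned int count;         /* events seen in the current window */
};

static void flood_guard_init(struct flood_guard *fg, uint64_t now_ms)
{
    assert(fg != NULL);
    fg->start_ms = now_ms;
    fg->window_start_ms = now_ms;
    fg->holdoff_until_ms = 0;
    fg->count = 0;
}

/*
 * Called once per event about to be queued.  Returns the delay in
 * milliseconds the caller should apply to this event; 0 means "run now".
 */
static uint64_t flood_guard_delay(struct flood_guard *fg, uint64_t now_ms)
{
    assert(fg != NULL);

    /* Early uptime: never delay, the clock may not be trustworthy yet. */
    if (now_ms - fg->start_ms < FLOOD_GRACE_MS)
        return 0;

    /* A hold-off is active: defer the event until it ends. */
    if (now_ms < fg->holdoff_until_ms)
        return fg->holdoff_until_ms - now_ms;

    /* Start a fresh counting window once the old one has expired. */
    if (now_ms - fg->window_start_ms >= FLOOD_WINDOW_MS) {
        fg->window_start_ms = now_ms;
        fg->count = 0;
    }

    /* Too many events in this window: begin a hold-off period. */
    if (++fg->count > FLOOD_MAX_EVENTS) {
        fg->holdoff_until_ms = now_ms + FLOOD_HOLDOFF_MS;
        fg->count = 0;
        return FLOOD_HOLDOFF_MS;
    }

    return 0;
}
```

With this sketch, a notify handler that re-triggers itself would get its
first ten events in a window through immediately, and every later event
would be pushed back to the end of the hold-off period, so the queue
would no longer spin continuously.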

