[linux-pm] [RFC][PATCH] PM: Force GFP_NOIO during suspend/resume (was: Re: Memory allocations in .suspend became very unreliable)

Benjamin Herrenschmidt benh at kernel.crashing.org
Wed Jan 20 13:11:04 PST 2010


On Wed, 2010-01-20 at 12:31 +0100, Oliver Neukum wrote:
> 
> But we have the freezer. So generally we don't require that knowledge.
> We can expect no normal IO to happen.

That came up before and it's just not a safe assumption :-) The freezer
to some extent protects drivers against ioctls and that sort of stuff,
but really that's about it. There are plenty of things in the kernel that
can kick off IOs on their own for one reason or another, or do memory
allocations which in turn will try to push something out and do IOs
etc... even when "frozen".

> The question is in the suspend paths. We never may use anything
> but GFP_NOIO (and GFP_ATOMIC) in the suspend() path. We can
> take care of that requirement in the allocator only if the whole
> system
> is suspended. As soon as a driver does runtime power management,
> it is on its own. 

I'm not sure I understand what you are trying to say here :-)

First of all, the problem goes beyond what a driver does in its own
suspend() path. If it were just that, we might to some extent be able to
push enough stuff up for the driver to specify the right GFP flags
(though even that could be nasty).

The problem with system suspend also happens when your driver has not
been suspended yet, but another one, which happens to be a block device
with dirty pages for example, has.

Your not-yet-suspended driver might well be blocked somewhere in an
allocation, or about to make one, with some kind of internal mutex held,
that sort of thing, as part of its normal operations, and -that- can
hang, causing problems when that same driver's suspend() subsequently
gets called and tries to synchronize with the driver operations, for
example by trying to acquire that same mutex.
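
To make that scenario concrete, here is a rough userspace model in C. It
is not kernel code: every name in it (model_alloc, the MODEL_GFP_* bits,
allowed_mask, and so on) and every flag value is made up for
illustration, and the allowed_mask field is only an assumption about what
forcing GFP_NOIO inside the allocator could look like. It is
single-threaded on purpose, so a "hang" is recorded as state instead of
actually blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; NOT the kernel's real values. */
#define MODEL_GFP_WAIT   0x1u
#define MODEL_GFP_IO     0x2u
#define MODEL_GFP_FS     0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)

struct model_system {
	bool block_dev_suspended; /* block device with dirty pages is already suspended */
	unsigned allowed_mask;    /* flag bits the allocator will honour */
};

struct model_driver {
	bool lock_held;           /* internal mutex held across the allocation */
	bool stuck_in_alloc;      /* blocked forever inside the allocator */
};

enum alloc_result { ALLOC_RETURNS, ALLOC_HANGS };

/*
 * Allocation under memory pressure: with IO allowed, reclaim tries to
 * write dirty pages to a device that will not complete the request.
 */
static enum alloc_result model_alloc(const struct model_system *sys, unsigned gfp)
{
	gfp &= sys->allowed_mask;
	if ((gfp & MODEL_GFP_IO) && sys->block_dev_suspended)
		return ALLOC_HANGS;
	return ALLOC_RETURNS;     /* may succeed or fail, but it comes back */
}

/* Normal operation of a driver that has not been suspended yet. */
static void model_driver_work(struct model_driver *drv, const struct model_system *sys)
{
	assert(!drv->lock_held);
	drv->lock_held = true;
	if (model_alloc(sys, MODEL_GFP_KERNEL) == ALLOC_HANGS) {
		drv->stuck_in_alloc = true;   /* never reaches the unlock below */
		return;
	}
	drv->lock_held = false;
}

/* suspend() synchronizes with normal operation through the same lock. */
static bool model_driver_suspend(const struct model_driver *drv)
{
	return !drv->lock_held;   /* false: suspend would wait forever */
}

/* Run the whole scenario once, with or without IO masked out. */
static bool model_suspend_completes(bool force_noio)
{
	struct model_system sys = {
		.block_dev_suspended = true,
		.allowed_mask = force_noio ? ~(MODEL_GFP_IO | MODEL_GFP_FS) : ~0u,
	};
	struct model_driver drv = { false, false };

	model_driver_work(&drv, &sys);
	return model_driver_suspend(&drv);
}
```

In this model, model_suspend_completes(false) reports that suspend cannot
proceed, because the driver asked for MODEL_GFP_KERNEL and is stuck with
its lock held. With the mask applied, the same call path returns and
suspend gets the lock, even though the driver itself never changed the
flags it passes.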

There are more similarly nasty scenarios. The fact is that it's going
to hit rarely, probably without a backtrace or a crash, and so will
basically cause one of those rare "my laptop didn't suspend" cases that
may not even be reported, and just contribute to the general
unreliability of suspend/resume.

Cheers,
Ben.



