[Ksummit-discuss] [TECH(CORE?) TOPIC] Energy conservation bias interfaces

Mon May 12 16:06:15 UTC 2014

On 05/08/2014 07:53 PM, Iyer, Sundar wrote:
>> -----Original Message-----
>> From: Preeti U Murthy [mailto:preeti at linux.vnet.ibm.com]
>> Sent: Thursday, May 8, 2014 2:30 PM
>> To: Iyer, Sundar; Peter Zijlstra; Rafael J. Wysocki
>> Cc: Brown, Len; Daniel Lezcano; Ingo Molnar; ksummit-
> 
>> True that 'race to halt' also ends up saving energy. But when the kernel goes
>> conservative on energy, the scheduler would look at racing to idle *within a
>> power domain* as much as possible. Only if the load crosses a certain
>> threshold would it spread across to other power domains.
> 
> I think Rafael mentioned shared supplies and resources in another thread.
> In such a case, the race-to-idle within a power domain may actually negate the overall
> platform savings.

Perhaps. However, that does not mean that we will not save any power.

To answer your question below, I was referring to the CPU power domains,
since we were talking about 'race to halt'. When the scheduler spreads
tasks across CPUs, the tasks finish sooner, so the CPUs race to halt and
possibly save power.

That said, with regard to power savings when there are shared resources:
if the scheduler consolidates tasks onto one socket out of two because
the arch exposed the sockets as separate power domains, we will save
power at the processor level. However, if the sockets share a memory
controller, that controller would still be powered on. That is fine; we
cannot do much about it as long as there are some tasks on the system.
But we *can* save power somewhere, which is better than not being aware
of the power domains and spreading tasks randomly.
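
As a rough illustration of the consolidation heuristic above (a sketch
only, not the code from any patch): pack tasks into a power domain that
is already busy, spread them across the CPUs inside that domain, and
spill over to another domain only once a load threshold is crossed. The
names PowerDomain and pick_cpu, the 0.8 threshold and the load values
are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PowerDomain:
    """Hypothetical model of a CPU power domain (e.g. one socket)."""
    name: str
    cpu_loads: List[float]  # per-CPU load, 0.0 (idle) to 1.0 (fully busy)


def domain_load(domain: PowerDomain) -> float:
    """Average load over the CPUs of a domain."""
    return sum(domain.cpu_loads) / len(domain.cpu_loads)


def least_loaded_cpu(domain: PowerDomain) -> int:
    """Index of the least loaded CPU, i.e. spread *within* the domain."""
    return min(range(len(domain.cpu_loads)), key=lambda i: domain.cpu_loads[i])


def pick_cpu(domains: List[PowerDomain], task_load: float,
             threshold: float = 0.8) -> Tuple[str, int]:
    """Pick (domain name, CPU index) for a new task.

    Busy domains are tried first, most loaded first, so that idle
    domains can stay powered down. A domain is skipped when adding the
    task would push its average load above the (assumed) threshold.
    """
    busy = [d for d in domains if any(load > 0 for load in d.cpu_loads)]
    idle = [d for d in domains if not any(load > 0 for load in d.cpu_loads)]
    candidates = sorted(busy, key=domain_load, reverse=True) + idle

    for domain in candidates:
        projected = (sum(domain.cpu_loads) + task_load) / len(domain.cpu_loads)
        if projected <= threshold:
            return domain.name, least_loaded_cpu(domain)

    # Every domain is above the threshold: fall back to the least loaded one.
    fallback = min(domains, key=domain_load)
    return fallback.name, least_loaded_cpu(fallback)
```

With two sockets, a lightly loaded socket0 keeps receiving tasks while
socket1 stays idle; only when socket0 would exceed the threshold does
the task land on socket1. Shared resources such as a memory controller
are deliberately not modelled here.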

This patch, lkml.org/lkml/2014/4/11/142, introduces the concept of CPU
power domains precisely to help the scheduler place tasks better for
power savings.

> 
> And to confirm, you are referring to generic power domains beyond the CPU, right?
> 
>> These are general heuristics. These simple heuristics should work for most
>> platforms but may not work for all. If they work for the majority of cases,
>> then I believe we can safely call it a success.
> 
> And which is why I mentioned that this is heavily platform dependent. This is 
> completely dependent on how the rest of the system power management works.

I don't think you can say it is *completely* dependent on the platform.
If every aspect of power management were dependent on the platform,
there would be very little we could do in the kernel.

The cpuidle subsystem behaves fairly well on some platforms. A lot of
CPU power management is platform specific. But because arch-specific
details, such as the idle states that are present, are exposed through
the cpuidle drivers, the kernel is able to make reasonably good
predictions about how long a CPU will stay idle and choose the idle
state that it should enter.

The point is that we have succeeded in the past in getting high-level
power management reasonably right in the kernel, even though it was
platform dependent.
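
To make the cpuidle point concrete, here is a minimal sketch of idle
state selection, assuming each state is described by an exit latency and
a target residency (the minimum idle time for which entering the state
pays off). The state table, its numbers and select_idle_state are
illustrative assumptions, not data from any real platform or the actual
governor code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class IdleState:
    """Platform-specific description of one idle state (values assumed)."""
    name: str
    exit_latency_us: int      # time needed to wake up from the state
    target_residency_us: int  # minimum idle time for the state to pay off


# Hypothetical table, ordered from shallowest to deepest state.
STATES = (
    IdleState("C1", exit_latency_us=2, target_residency_us=2),
    IdleState("C3", exit_latency_us=50, target_residency_us=150),
    IdleState("C6", exit_latency_us=200, target_residency_us=800),
)


def select_idle_state(states: Sequence[IdleState],
                      predicted_idle_us: int,
                      latency_limit_us: int) -> IdleState:
    """Return the deepest state that fits the prediction and latency limit.

    The generic logic only compares numbers; everything platform
    specific lives in the state table supplied by the driver.
    """
    chosen = states[0]  # the shallowest state is always allowed
    for state in states[1:]:
        if state.target_residency_us > predicted_idle_us:
            continue  # the CPU would wake up before the state pays off
        if state.exit_latency_us > latency_limit_us:
            continue  # waking up would take longer than is tolerable
        chosen = state
    return chosen
```

A long predicted idle period with a relaxed latency limit picks the
deepest state; a short prediction or a tight limit keeps the CPU in a
shallower one.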

Regards
Preeti U Murthy

> 
> Cheers!
>