[PATCH v2 02/28] vmscan: take at least one pass with shrinkers

Thu Apr 11 09:25:24 UTC 2013

On Thu, Apr 11, 2013 at 03:27:30PM +0800, Wanpeng Li wrote:
> On Thu, Apr 11, 2013 at 10:41:14AM +1000, Dave Chinner wrote:
> >On Wed, Apr 10, 2013 at 11:03:39PM +0900, JoonSoo Kim wrote:
> >> Another thing I found is that they don't account "nr_reclaimed" precisely.
> >> There is no code which checks whether "current->reclaim_state" exists or not,
> >> except prune_inode().
> >
> >That's because prune_inode() can free page cache pages when the
> >inode mapping is invalidated. Hence it accounts this in addition
> >to the slab objects being freed.
> >
> >IOWs, if you have a shrinker that frees pages from the page cache,
> >you need to do this. Last time I checked, only inode cache reclaim
> >caused extra page cache reclaim to occur, so most (all?) other
> >shrinkers do not need to do this.
> >
> 
> Should we account "nr_reclaimed" against the huge zero page? A large
> number (512) of pages are reclaimed, which could throttle direct or
> kswapd reclaim and avoid reclaiming excess pages. I can do this work if
> you think the idea is worthwhile.

I'm not sure. The zero hugepage is allocated through:

	zero_page = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
				HPAGE_PMD_ORDER);

which means the pages reclaimed by the shrinker aren't file/anon LRU
pages.  Hence I'm not sure what extra accounting might be useful
here, but accounting them as LRU pages being reclaimed seems wrong.

FWIW, the reclaim of a single global object by a shrinker is not
really a use case the shrinkers were designed for, so I suspect that
anything we try to do right now within the current framework will
just be a hack.

I suspect that what we need to do is add the current zone reclaim
priority to the shrinker control structure (as has been done with
the nodemask) so that objects like this can be considered for
removal at a specific reclaim priority level rather than trying to
use scan/count trickery to get where we want to be.
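
As a rough, untested illustration of carrying the priority in the control
structure, here is a user-space model rather than kernel code. Every name
ending in _sketch, the "priority" field and "drop_priority" are hypothetical
and do not exist in the tree. It assumes the usual vmscan convention that
priority starts at DEF_PRIORITY and counts down towards 0 as reclaim
struggles, so a numerically lower value means more pressure.

```c
#include <stdbool.h>
#include <stddef.h>

/* vmscan starts at DEF_PRIORITY and counts down towards 0 under pressure. */
#define DEF_PRIORITY 12

/* Hypothetical cut-down shrink_control with the reclaim priority added. */
struct shrink_control_sketch {
	unsigned long nr_to_scan;
	int priority;		/* assumed new field, not in the real struct */
};

/* Hypothetical single global object, standing in for the huge zero page. */
struct global_object_sketch {
	bool allocated;
	int drop_priority;	/* free only at this priority or more urgent */
};

/* Returns the number of objects freed: 0 or 1. */
static unsigned long global_object_shrink(struct global_object_sketch *obj,
					  const struct shrink_control_sketch *sc)
{
	if (!obj->allocated)
		return 0;	/* nothing to free */
	if (sc->priority > obj->drop_priority)
		return 0;	/* reclaim is not struggling enough yet */
	obj->allocated = false;	/* models freeing the object */
	return 1;
}
```

With drop_priority set to 2, a call at DEF_PRIORITY leaves the object alone
and a call at priority 2 frees it, with no scan/count arithmetic involved.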

Perhaps we need a shrinker->shrink_priority method that is called just
once when the reclaim priority is high enough to trigger it. i.e.
all these "do something special when memory reclaim is struggling to
make progress" operations set the priority at which they get called
and every time shrink_slab() is then called with that priority (or
higher) the shrinker->shrink_priority method is called just once?
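
To make the question concrete, a user-space model of that dispatch might
look like the following. This is only a sketch under assumptions: the
struct, the trigger_priority field, shrink_slab_sketch() and the example
callback are all hypothetical, and "that priority or higher" is read as
"numerically less than or equal to the trigger", since vmscan priority
counts down as pressure rises.

```c
#include <stddef.h>

/* Hypothetical shrinker carrying a priority-triggered method. */
struct shrinker_sketch {
	int trigger_priority;	/* call the method at this priority or more urgent */
	unsigned long (*shrink_priority)(struct shrinker_sketch *s);
	unsigned long calls;	/* bookkeeping for this example only */
};

/* Example callback: records the call and pretends one object was freed. */
static unsigned long example_shrink_priority(struct shrinker_sketch *s)
{
	s->calls++;
	return 1;
}

/*
 * Model of one shrink_slab() invocation at a given reclaim priority.
 * Each eligible shrinker has its method called exactly once per
 * invocation. Returns the total number of objects reported freed.
 */
static unsigned long shrink_slab_sketch(struct shrinker_sketch *shrinkers,
					size_t nr, int priority)
{
	unsigned long freed = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		struct shrinker_sketch *s = &shrinkers[i];

		if (!s->shrink_priority)
			continue;	/* shrinker did not opt in */
		if (priority > s->trigger_priority)
			continue;	/* not enough pressure yet */
		freed += s->shrink_priority(s);
	}
	return freed;
}
```

Here the core decides when the method runs, so the shrinker itself needs no
scan/count logic at all.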

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com