[Bugme-janitors] [Bug 9182] Critical memory leak (dirty pages)

bugme-daemon at bugzilla.kernel.org bugme-daemon at bugzilla.kernel.org
Thu Dec 13 23:56:01 PST 2007


http://bugzilla.kernel.org/show_bug.cgi?id=9182





------- Comment #28 from anonymous at kernel-bugs.osdl.org  2007-12-13 23:56 -------
Reply-To: osterried at jesse.de

Hello,

> > BTW: Could someone please look at this problem? I feel a little ignored
> > and in my situation this is a critical regression.
> 
> I was hoping to get around to it today, but I guess tomorrow will have
> to do :-/
> 
> So, it's ext3, dirty some pages, sync, and dirty doesn't fall to 0,
> right?
> 
> Does it happen with other filesystems as well?
> 
> What are your ext3 mount options?

I described my problem in detail in the thread "Strange system hangs":
  http://marc.info/?l=linux-kernel&m=119400497503209&w=2

In the meantime I found out that kernel 2.6.18.5 is the latest kernel that
works for me. Some machines never or only seldom became unusable; other
machines showed the symptom within 10 days.

rsync seems to be a catalyst for the problem, but not the (only) cause.
If you have a machine where the error occurs often and one where it never
happened (with the same kernel), and you completely exchange the hardware,
then the error still occurs on the system where it occurred often.

The problem of non-terminating processes was caused by the program "atop"
(which I installed in order to debug that problem). atop enabled process
accounting, and the accounting data should have been written to a file during
sys_exit(), but the write of the accounting data stalled in
balance_dirty_pages_ratelimited_nr().
My extension in Appendix II (sys_acct() diagnostic) helped me to find this out.
I'd be glad if this patch could go into the kernel, because it helps to
diagnose this kind of side effect.
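
For anyone who wants to watch the counters involved, here is a minimal
sketch that calls sync and then reads Dirty and Writeback from /proc/meminfo.
It assumes a Linux system and Python 3; the function names parse_meminfo and
dirty_after_sync and the settle time are only illustrative and are not part
of atop or of my patch.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_after_sync(path="/proc/meminfo", settle_seconds=5):
    """Call sync(), wait a moment, and return (Dirty, Writeback) in kB."""
    os.sync()                   # ask the kernel to write back dirty data
    time.sleep(settle_seconds)  # assumption: a few seconds is enough to settle
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Dirty"), info.get("Writeback")
```

On a healthy, otherwise idle machine I would expect dirty_after_sync() to
return a small Dirty value; a value that stays large after repeated calls
would match the symptom discussed in this bug.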

On the question Krzysztof raised, whether 2.6.18.5 and not 2.6.19 is really
the latest kernel not showing this bug: I thought I was sure. But yesterday I
dug through the logs ($MAIL from when the first hang was reported, and a
script that logged the kernel version on that day), and evidently it was
2.6.20 on the day of the first report. I may have guessed wrong, because the
machine was in test for quite a long time before it went into production, and
maybe during that time we upgraded the kernel from 2.6.19 (the kernel we
shipped it with).

Regards,
        - Thomas Osterried


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

