[Linux-kernel-mentees] [STABLE TEST] 5.3.13

Amol Grover frextrite at gmail.com
Tue Dec 3 17:18:17 UTC 2019


On Tue, Dec 03, 2019 at 01:03:04PM +0100, Greg KH wrote:
> On Tue, Dec 03, 2019 at 12:44:15PM +0530, Amol Grover wrote:
> > On Tue, Dec 03, 2019 at 07:40:52AM +0100, Greg KH wrote:
> > > On Tue, Dec 03, 2019 at 11:55:03AM +0530, Amol Grover wrote:
> > > > Compiled, Booted, however I'm getting the following errors when running
> > > > "make kselftest"
> > > > 
> > > > sudo dmesg -l alert
> > > > 
> > > > [34381.903893] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > > > [34381.903904] #PF: supervisor read access in kernel mode
> > > > [34381.903908] #PF: error_code(0x0000) - not-present page
> > > 
> > > Which test causes this problem?
> > 
> > IIRC I didn't run make kselftest with summary=1 option. Is there any
> > other way to get that information? The logs that kselftest generated
> > also don't seem to help in this.
> 
> Watch the output when you run this?  I don't know, try re-running it
> with that option.
> 

I did. The tls test under tools/testing/selftests/net seems to be the
culprit. More information below.

> > > And is it new in 5.3.13?
> > > 
> > 
> > I previously ran kselftest on 5.4-rc7 and 5.3.9 (default kernel shipped
> > by openSUSE), both were fine. However, a bit of backstory:
> > 
> > A day ago I used kselftest from the linux/next branch and ran it (w/o
> > sudo).  It showed me the exact same error. However, I was running a
> > modified version of 5.3.13, but those modifications were actually
> > trivial (5 lines changed) and shouldn't have resulted in this kernel
> > error. So, I switched to the vanilla 5.3.13 and ran kselftest (w/o sudo)
> > again. I ran it 3 times (w/o any errors), switched back to the modified
> > kernel and ran kselftest (w/o root) 2 more times and everything was
> > fine. Then I decided to test the vanilla one again for the 4th time, but
> > this time I ran kselftest as root where this BUG popped again.
> 
> Try a kernel.org 5.3.9 and if that works, then try 5.3.13 and if that
> fails, run 'git bisect' and try to find the offending kernel commit.
> 

After finding the test that triggers the BUG, I re-ran it against 5.3.13
(= BUG). I had recently compiled 5.4.1 as well, so I ran the test against
that too (= BUG). Then I realized I still had a 5.3.9 kernel around, so I
ran the test against that as well (= BUG). The failure on 5.3.9 didn't
feel right. Since I was using kselftest from linux/next, I became
suspicious of the test itself. I ran kselftest from 5.4.1 on kernels
5.4.1 and 5.3.13, and neither run resulted in the BUG. After a bit of
digging I found that the next branch has 2 additional test cases for the
tls test, and one of them (sendmsg_fragmented) is the actual culprit
behind all this.
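
For anyone repeating the comparison above (same kernel, selftests taken
from different trees), a minimal sketch follows. The function name and
the tree paths are hypothetical, and it assumes each path is a full
kernel source checkout where the documented TARGETS= and summary=1
kselftest options are available.

```shell
# Hypothetical helper: run only the "net" selftests (which include tls)
# from the kernel source tree given as $1, against the running kernel.
run_net_selftests() {
    tree="$1"
    # TARGETS limits kselftest to the listed subdirectories;
    # summary=1 enables kselftest's summary output.
    make -C "$tree" TARGETS=net summary=1 kselftest
}

# Example usage (paths are assumptions, not from the original report):
#   run_net_selftests ~/src/linux-5.4.1    # selftests from 5.4.1
#   run_net_selftests ~/src/linux-next     # selftests from linux/next
#   sudo dmesg -l alert                    # then check for the BUG
```

Running both suites on the same booted kernel separates "kernel
regression" from "test suite difference".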

TL;DR: tools/testing/selftests/net/tls.c:sendmsg_fragmented from the
next tree appears to be broken(?)

Thanks
Amol

> thanks,
> 
> greg k-h

