[Ksummit-discuss] [MAINTAINERS SUMMIT] Bug-introducing patches

Daniel Vetter daniel.vetter at ffwll.ch
Fri Sep 7 09:13:20 UTC 2018


On Fri, Sep 7, 2018 at 6:27 AM, Theodore Y. Ts'o <tytso at mit.edu> wrote:
> On Fri, Sep 07, 2018 at 01:49:31AM +0000, Sasha Levin via Ksummit-discuss wrote:
>>
>> How can you justify sneaking a patch that spent 0 days in linux-next,
>> never ran through any of our automated test frameworks and was never
>> tested by a single real user into a Stable kernel release?
>
> At least for file system patches, my file system regression testing
> (gce-xfstests) beats *all* of the Linux-next bots.  And in fact, the
> regression tests actually catch more problems than users, because most
> users' file system workloads are incredibly boring.  :-)
>
> It might be different for fixes in hardware drivers, where a fix for
> Model 785 might end up breaking Model 770.  But short of the driver
> developer having an awesomely huge set of hardware in their testing
> lab, what are they going to do?  And is holding off until the merge
> window really going to help find the regression?  The linux-next bots
> aren't likely to find such problems!
>
> As far as users testing Linux-next --- I'm willing to try running
> anything past, say, -rc3 on my laptop.  But running linux-next?  Heck,
> no!  That's way too scary for me.
>
> Side bar comment:
>
> There actually is a perverse incentive to having all of the test
> 'bots, which is that I suspect some people have come to rely on it to
> catch problems.  I generally run a full set of regression tests before
> I push an update to git.kernel.org (it only takes about 2 hours, and
> 12 VMs :-); and by the time we get to the late -rc's I *always* will
> do a full regression test.

This is what, imo, a well-run subsystem should sound like from a
testing pov. All the subsystem-specific testing should be done before
merging. Post-merge is only for integration testing and for catching
the long-tail issues that need months or years of machine time to
surface.

Of course this is much harder for anything that needs physical
hardware, but even for driver subsystems there's a lot you can do with
test drivers, selftests and a pile of emulation to at least catch bugs
in generic code. And for reasonably sized teams like drm/i915,
building a proper CI is a very obvious investment that will pay off.
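
To illustrate what I mean by selftests catching generic-code bugs
pre-merge, here's a rough sketch of a test written against the
kselftest harness in tools/testing/selftests/kselftest_harness.h. The
test name and the invariant it checks are made up for illustration; a
real test would exercise actual subsystem code:

/* Minimal selftest sketch using the kernel's kselftest harness.
 * Built from within tools/testing/selftests/<subsystem>/, so the
 * include path would normally be "../kselftest_harness.h".
 */
#include "kselftest_harness.h"

/* Hypothetical test: verify some generic-code invariant. */
TEST(generic_code_invariant)
{
	int value = 42;

	/* ASSERT_* aborts the test on failure, EXPECT_* records the
	 * failure and keeps going.
	 */
	ASSERT_EQ(42, value);
	EXPECT_NE(0, value);
}

TEST_HARNESS_MAIN

Tests like this plug into the normal kselftest machinery, so they can
be run before anything hits a shared tree, roughly via
make -C tools/testing/selftests run_tests (or the per-target variant)
rather than waiting for a bot to trip over the bug in linux-next.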

> In the early-to-mid- rc's, sometimes if I'm in a real rush, I'll just
> run the 15 minute smoke test; but I'll do at least *some* testing.
>
> But other trees seem to be much more loosey-goosey about what they
> will push to linux-next, since they want to let the 'bots catch
> problems.  With the net result that they scare users away from wanting
> to use linux-next.

Yeah, if maintainers see linux-next as their personal testing ground
then it becomes useless for actual integration testing. But it's not
quite as bleak as it was 2 years ago, I think, at least from what I'm
seeing in our linux-next runs for drm/i915. It still needs to get a
lot better, though.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

