From hare at suse.com  Sat Jul  1 17:24:39 2017
From: hare at suse.com (Hannes Reinecke)
Date: Sat, 1 Jul 2017 19:24:39 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <20170630175201.GC26257@fury>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<CA+55aFxBUe4QAa11i-ySr2i959EpvcEZYf=9NumWyTDH+6BEQw@mail.gmail.com>
	<20170630175201.GC26257@fury>
Message-ID: <92305974-328a-d1b2-7301-4321f374ab8f@suse.com>

On 06/30/2017 07:52 PM, Darren Hart wrote:
> On Tue, Jun 27, 2017 at 10:18:04AM -0700, Linus Torvalds wrote:
>> On Tue, Jun 27, 2017 at 6:58 AM, Sergey Senozhatsky
>> <sergey.senozhatsky.work at gmail.com> wrote:
>>>
>>> am I the only one who struggle with the Kconfig sometimes?
>>
>> I hate our Kconfig. It's my least favorite part of the kernel. It asks
>> questions about insane things that nobody can know the answer to.
>>
>> Taking a distro default config and doing"make localmodconfig" is what
>> I end up doing on new machines, and it has all kinds of suckage too.
>>
>> I don't have a solution to it. But I think part of the solution would
>> be for us to have various "sane minimal requirement" Kconfig
>> fragments, and trhe ability to feed them incrementally, so that people
>> can build up a sane Kconfig from "I want this".
> 
> This was, in part, the intent behind the configuration fragments and the
> merge_config.sh script. I use this with the x86 platform drivers:
> 
> $ make defconfig pdx86.config
> 
> But I have to generate, also scripted, the pdx86.config by scraping the
> Kconfig file. The kvm_guest.config. There are other things I would like
> to see subconfigs for, like "efi.config" - but I wasn't sure what the
> current view on such things were. I'm glad to know I'm not along in my
> frustration with the overly granular nature of Kconfig.
> 
> The problem with this model of course is keeping the config fragments
> current with Kconfig changes. The mergeconfig script does call out
> problems with specified config options. We can address this with
> a configcheck target or similar which would audit the config fragments
> to ensure they are kept in sync with the Kconfig files.
> 
> ...
> 
>>
>> And note that none of this is about technoliogy, and SAT solvers and
>> resolving the KConfig depdendencies that some techie people love
>> talking about. It's all about "what if we just had some kconfig
>> fragments to enable some commonly used stuff" (where "commonly used"
>> is obviously architecture dependent, but also target-dependent - a
>> "simpleconfig" for a PC workstation kind of config is very different
>> from a "simpleconfig" for a server or some ARM embedded thing).
>>
> 
> It sounds like the existing config fragment mechanism is sufficient for
> what you describe and what we need to do is create these fragments.
> 
> One thing that would be nice is if we could have fragment nesting so you
> could create your "simpleconfig" which in turn includes a few of the
> more specific config fragments.
> 
And what would be totally cool if we could have fragments _per default_.
EG by not having a massive .config, but rather keeping it per directory,
or maybe corresponding in the directory where each Kconfig lives.
That way it would be easier to figure out where this blasted option cam
from, plus one could easily provide (and check!) configurations for
several systems, keeping the common parts intact and modify only the
machine specific ones.

And it would solve the 'keeping the config current' problem, as one
could quite simply identify which configuration will need to be changed
for a Kconfig change, seeing that both will be kept in the same directory.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare at suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: F. Imend?rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N?rnberg)

From sre at kernel.org  Sun Jul  2 11:28:26 2017
From: sre at kernel.org (Sebastian Reichel)
Date: Sun, 2 Jul 2017 13:28:26 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] mobile phones
In-Reply-To: <20170628211008.GA19571@amd>
References: <20170625104850.GA24717@amd>
	<87shinzkp9.fsf@notabene.neil.brown.name>
	<20170626083407.GA9621@amd> <20170627123947.krne6a2saolcndih@earth>
	<20170627215755.GC5250@amd> <20170628164502.itggbf4xuhsv3oyf@earth>
	<20170628211008.GA19571@amd>
Message-ID: <20170702112826.zirafz7lbo5lnabd@earth>

Hi,

On Wed, Jun 28, 2017 at 11:10:08PM +0200, Pavel Machek wrote:
> > > So to be exact... u-boot does not know about battery charging. And
> > > NoLo can only do very, very slow charging.
> > 
> > Yes. The idea is, that normally NoLo only charges far enough, that
> > Linux can be booted.
> > 
> > > Yes, unfortunately that does not work quite well here. Voltage goes
> > > too low before Linux can boot, so it resets, but it is still high
> > > enough for the bootloader, so it attempts to boot Linux one more time,
> > > but battery is empty and voltage goes too low before Linux can boot, ...
> > 
> > I guess your battery is not the fittest anymore?
> 
> I guess that's one issue. (One of my batteries is actually so bad that
> GSM modem fails with it.)
> 
> Well, I guess Debian boots a little longer than Maemo. Plus, I believe
> we should charge the battery from kernel by default; it will enable
> running fsck etc, and it will mean slow userspace boot will not
> break...

That sounds like really bad battery :)

> > > > On N950 there is an unsupported gps connected via i2c iirc (with
> > > > unknown protocol that needs to be RE'd) and TI's WiLink provides
> > > > GPS on a shared UART link with bluetooth-style header using yet
> > > > another protocol. I agree, that we should have a GPS subsystem.
> > > 
> > > Two GPSes in one box, interesting design. Are both of them connected
> > > to useful antenna?
> > 
> > Actually there are probably 3 GPS implementations in Droid 4:
> > 
> >  * WL1285
> >  * MDM6600 modem
> >  * LTE modem
> > 
> > As far as I understand it modems are required to have GPS access in
> > US. I'm not yet sure which of the implementations is used by Droid 4's
> > stock system, but Motorola explicitly added a driver for the WL1285 GPS
> > making it a likely candidate (The userspace part is a closed source
> > shared object used by Android).
> 
> Interesting :-). I guess you could do really fair comparison of the
> chipsets.
> 
> (Binary driver -- bad Motorola :-( )

Yeah :( Note, that Nokia also has binary driver for the N900 (which
had been reverse engineered) and a different on on N950 (which has
not been reverse engineered so far).

> > > +static int generic_protect(struct power_supply *psy)
> > > +{
> > > +	union power_supply_propval val;
> > > +	int res;
> > > +	int mV, mA, mOhm = 430, mVadj = 0;
> > 
> > 430 mOhm?
> 
> Yes, 0.43 Ohm.

What's the source of this value?

-- Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170702/8ce51758/attachment.sig>

From sre at kernel.org  Sun Jul  2 12:03:21 2017
From: sre at kernel.org (Sebastian Reichel)
Date: Sun, 2 Jul 2017 14:03:21 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] mobile phones
In-Reply-To: <20170628202722.GC18101@amd>
References: <20170626111207.GA11688@amd> <20170626114931.GG23064@atomide.com>
	<20170626131401.GA11980@amd> <20170626134904.GH23064@atomide.com>
	<20170626204932.GA19396@amd> <20170627071835.GJ23064@atomide.com>
	<20170627121455.tljtekx6bmzlezxa@earth> <20170627215727.GA5250@amd>
	<20170628160112.ip2ambkzlkkoz2ww@earth>
	<20170628202722.GC18101@amd>
Message-ID: <20170702120321.tskgcrbfehg4fccx@earth>

Hi,

On Wed, Jun 28, 2017 at 10:27:22PM +0200, Pavel Machek wrote:
> > > Oh, another major piece is DSP coprocessor that is there. Unlike
> > > graphics, we don't even know how support for it should like.
> > 
> > https://www.kernel.org/doc/Documentation/remoteproc.txt
> > 
> > config OMAP_REMOTEPROC
> > 	tristate "OMAP remoteproc support"
> > 	[...]
> > 	help
> > 	  Say y here to support OMAP's remote processors (dual M3
> > 	  and DSP on OMAP4) via the remote processor framework.
> > 
> > 	  Currently only supported on OMAP4.
> > 
> > 	  Usually you want to say Y here, in order to enable multimedia
> > 	  use-cases to run on your platform (multimedia codecs are
> > 	  offloaded to remote DSP processors using this framework).
> > 
> > 	  It's safe to say N here if you're not interested in multimedia
> > 	  offloading or just want a bare minimum kernel.
> > 
> > I have been told by some Nokia people (I do not remember who it
> > was, possibly Sakari), that the DSP is not that powerful and any
> > calculation should also be possible on CPU (wasting a bit of
> > energy).
> 
> Ok, we probably don't care about DSP, but lets say we had really
> fast DSP or really cared about power.
> 
> We'd need remoteproc. Sure. But that has no interface for userland,
> right?

remoteproc is only for start/stop + fw loading. The actual
communication is done using rpmsg, which does not yet have a
userspace API afaik.

> So we'd need to introduce interface for userland... fine.
> 
> And then we'd need cross-compilers for the DSP used. Ok.

Yes. As far as I know there is currently no open source toolchain
for the TI DSP.

> And then we'd need to split mpg123 to CPU and DSP parts, and modify it
> so that it can run the DSP parts using remoteproc, when available?
> 
> That starts to be ... complex, with changes all over the system :-(.

With little gain expected on N900...

-- Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170702/a8ff23e3/attachment-0001.sig>

From sre at kernel.org  Sun Jul  2 12:11:04 2017
From: sre at kernel.org (Sebastian Reichel)
Date: Sun, 2 Jul 2017 14:11:04 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] mobile phones
In-Reply-To: <20170628183756.GA30277@amd>
References: <20170625104850.GA24717@amd>
	<87shinzkp9.fsf@notabene.neil.brown.name>
	<20170626083407.GA9621@amd>
	<20170626112052.oze7qxmxiyu67wzh@sirena.org.uk>
	<20170626122224.GA11441@amd>
	<20170627114026.iwsqbbwytleyurmi@sirena.org.uk>
	<20170628183756.GA30277@amd>
Message-ID: <20170702121104.azcq3jhx3hmjphq5@earth>

Hi,

On Wed, Jun 28, 2017 at 08:37:56PM +0200, Pavel Machek wrote:
> Right now, first priority is to get useful quality of the voice
> calls. If Nokia is the only one, then this is perhaps not a big
> problem.
> 
> I'd like to know how Pyra handles this (
> https://pyra-handheld.com/boards/pages/pyra/ ).

It provides a normal PCM interface, check
20161116-DragonFly-MAIN-4G-sch.pdf, page 11 from here:

https://pyra-handheld.com/boards/threads/power-memory-and-schematics.78631/

The Nokia phones are really special in this regard.

-- Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170702/f77709da/attachment.sig>

From linux at leemhuis.info  Sun Jul  2 17:51:43 2017
From: linux at leemhuis.info (Thorsten Leemhuis)
Date: Sun, 2 Jul 2017 19:51:43 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
	regression tracking
Message-ID: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>

Hi! Sorry, I know I'm late -- real life (travel, day job, ...) kept me
away from spending time on Linux kernel regression work :-/

Maybe I'm taking it a bit to far for the new kid in town, but I think I
want to propose two sessions. One for the maintainer summit, that deals
with a the most critical issues relevant to regression tracking. And one
technical session to deal with all the other stuff. Obviously we can
move below mentioned topics from one to the other or talk about them at
both if we want.

= [MAINTAINERS SUMMIT] Improve regression tracking =

 * Follow up from last year: What to do about bugzilla.kernel.org?
Reporters still get stranded there.
 * How to get subsystems maintainer involved more in regression tracking
to better make sure that reported regressions are tracked and not
forgotten accidentally.
 * Frustrations with regression tracking aka. how to establish
regression tracking properly to make sure it will never go away again.

= [TECH TOPIC] Improve the kernels quality by getting more people
involved in regression testing and reporting =

 * A short report from the outcome of the maintainer summit discussion;
also pick up and topics here that where not properly discussed on the
maintainer summit or were postponed to this session.
 * How to get distros more involved in regression tracking; especially
those that have a technical aware user base or normally ship up2date
kernel images (and thus have an greater interest in avoiding
regressions). I'm mainly thinking about Arch Linux, Debian, Fedora, and
openSUSE Tumbleweed here; having Ubuntu in the boat would be good, too!
(might be wise to talk about this on the maintainers summit as well, if
the right people are there)
 * How to make it more easy to (ideally automatically!) track the
current status and the progress of each regression? Are there any tools
that could make regression tracking easier for all of us while not
introducing much overhead for maintainers?

= Details =

Below you'll find few more words about some points mentioned above;
there are a few other topics as well we could discuss if we want. But
first, a few general words on regression tracking from my point of view:

 * There are a lot of areas in regression tracking where things are far
from good (read: in a bad state). That makes it easy to discuss current
problems and their solutions for hours -- and at the same time forget
that discussing itself doesn't get us much forward (the old bugzilla
issue mentioned in this mail is a good example). We thus IMHO should
focus on the most important issues and lay the groundwork to establish
regression tracking properly again, then we move on to solve things that
are harder to solve.

 * Regression tracking currently is quite boring and exhausting (read:
high burn-out risk), as it involves quite a lot of manual work finding
regressions and keeping track of their progress (and at the end of the
day it does not feel like you achieved much). Some of that work can not
be automated. But quite a bit can and that would help a great deal to
establish regression tracking properly (currently I'm the only one doing
it and some development cycles I simply don't find spare time for it).

   I currently don't see any existing solutions that fit well with our
mail focused workflow and at the same time do not introduce much
overhead for subsystem maintainers (which I assume is what everyone
wants, as I fear solutions with much overhead won't fly at all). Ideas
how to solve this tricky problem area are highly welcomed. It's
something that can be discussed when the aforementioned points
"establish regression tracking properly" and "make it more easy to
manually or automatically track the current status of a regression" come up.

== What to do about bugzilla.kernel.org =

Discussed last year already; see https://lwn.net/Articles/705245/ for
details. Situation didn't change much since then: the bugzilla instance
was updated, but people still get stranded there as most subsystems
ignore it. That afaics frustrates people and makes them stop testing or
reporting bugs.

Discuss how to improve things. [my2cent] Maybe a short term solution
like this could work: Serve a static page on bugzilla.kernel.org that
tells people where regressions/bugs for certain subsystems can be
reported, as it most of the time is some mailing list anyway. Such a
page could get compiled from MAINTAINERS (there is the "B:" field now
that points to bugzilla; if its not there point to a mailing lists; also
explain get_maintainers.pl).

  Leave our bugzilla reachable via bugzilla.kernel.org/frontpage (or
something like that) for those few subsystems that use it; that's afaics
ACPI and PM (including Cpufreq, Cpuidle, Hibernation, Suspend, ...) and
maybe PCI (not sure) -- or should we tell them to move to
bugzilla.freedesktop.org (or somewhere else) to get rid of our bugzilla
in the long etrm and make Konstantins life easier? Anyway: Make sure
bugs for other subsystems can't get filed in bugzilla.kernel.org anymore
to make sure they get lost there. [/my2cent]

== How to get subsystems maintainer more involved in regression tracking
to [?] ==

One reasons why I put this up is: It would help me a lot if people let
regressions at leemhuis.info (side note: might be wise to make a
mailing-list that replaces this address) get told about regressions --
simply CCing it on reports or answers to regressions reports is enough;
forwarding/bouncing mails there (even without additional text) is fine,
too.

The other reason I included it: This came up in last years discussion on
this list and it seemed some people thought we can get the subsystems
maintainers more involved; so I thought it might be wise to discuss it.
Might also be a good idea to discuss here how to get distro kernel
maintainer more involved if enough are around.

== How to establish regression tracking properly [?] ==

This is a pretty vague topic on purpose. People seem to agree that
regression tracking is important, but for years nobody did it (it
stopped a little while after Rafael had to move on) and the little bit
that I can do in my rare spare time won't help much (and I have no idea
how long I can continue to find time for it).

== Make it easier to track the progress of regression ==

One of the main reasons that makes regression tracking hard currently:
getting aware or regressions and tracking their progress is a lot of
manual work. I plan one step that hopefully makes the job a little
easier and at the same time might allow some automation in the long
term: ask people to include a certain keyword in their regressions
reports. Maybe something like "Linux-Regression" that doesn't get too
much false positives when searching for it on lists and via Google
(suggestions for a better tag welcome).

In addition, I plan to hand out some form of ID for each regressions I
track and ask people to include it -- especially when they post patches
that fix said regression or move the discussion to a new place (like
"Corrects: Linux-Regression-d2afd"; again: suggestions welcome! Maybe I
should just use a URL where people find details?).

That way I can notice more easy when a fix for a regression hits
linux-next or master; I also get aware if a discussion moves from
bugzilla to LKML or from one thread to another (fingers crossed).
Obviously it depends on cooperation of those involved.

If this works out we could write a script or something that watches
mailing lists, bug trackers and git trees for the tag in question. That
script could file a database and automatically do some of the tracking job.

== get distros more involved ==

I assume at least Ben (Debian), Laura (Fedora), and Takashi (openSUSE)
are around, so it might be a good idea to sit together and talk
regression tracking in general and how we could get the distros kernel
maintainers more involved. Even better would be to sit down before to
maybe come up with some ideas/plans we could talk during this session.

One topic could be: How to make it easier for users of popular distros
to get involved in testing. The "Kernel of the day" (KOTD) from
SUSE/openSUSE was mentioned recently on this list already, but I got the
impression that the existence of this repo is not well known; guess it's
the same for my own Kernel Vanilla Repositories for Fedora (those
contain packages with a quite recent mainline version; see
https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories ) or the fact
that Fedora rawhide ships a recent mainline snapshot all the time. But
should distros also offer Linux-next somewhere? Or anything else? And
should the distros send experienced users upstream when they found a
regression? Or will subsystem maintainers send those users away because
they assume those kernels are not vanilla?


== Topics or vague ideas I left out on purpose ==

Here is a list of other things we could talk about, but I think better
left for a later time:

 * Kerneloops (http://oops.kernel.org/): It was discussed last year on
this list. I have no idea what the current status is. Is someone
watching & analysing it? And poking the right people when needed? (I
doubt it)

 * Regression tracking for stable kernels (many bugs only get noticed
once a new mainline version got released; at that time it might still be
easy to revert a certain patch in mainline and stable)

 * statistics: I didn't spend time to create statistics, like Rafael did
in the past. They'd be nice to have, but for now I think my time is
better spend elsewhere.

 * work towards growing the number of tester by making it easier for
them (better documentation, easier configuration, bisection scripts, ...)

 * maybe document a few some procedures for those that are not regular
kernel developers (like the "When users report bugs on the Fedora
tracker that look like actual upstream bugs, what's the best way to have
those reported?" thing that Laura mentioned earlier this month in the
mail "Bug reporting feedback loop"

 * provide better services than only a plain text list of regression on
a mailing list?

 * better documentation? for example explain the difference between bugs
and regressions somewhere to make people understand why their bugs might
get ignored, but as the same time know that we handle regressions more
seriously.

 * Should the regression tracker nag subsystem maintainers (and
reporters) more often if they are inactive? How do people for example
feel about (Semi-)Automatic nagging mails for regressions where there is
no progress?

 * Is the data and the format of the current reports show useful at all?
If not: How to improve it?

 * regression tracking is a fair amount of work, and it's frustrating,
and people burn out. How to avoid that? Can we maybe get regression
tracking on solid ground by somehow building a healthy community around
it (containing kernel developers, Distro maintainers and people that are
willing to help in their spare time) that work on regressions
testing/tracking and other QA stuff?

 * how to make the Linux kernel development so good that the mainstream
distros stop their kernel forks and do what they do with Firefox: Ship
the latest stable version (users get a new version with new features
every few weeks) or a longterm branch (makes a big version jump about
once a year; see Firefox ESR).

Ugh, pretty long mail. Sorry about that. Maybe I shouldn't have looked
so closely into LWN.net articles about regression tracking and older
discussions about it.

Ciao, Thorsten

From pavel at ucw.cz  Sun Jul  2 18:14:45 2017
From: pavel at ucw.cz (Pavel Machek)
Date: Sun, 2 Jul 2017 20:14:45 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] mobile phones
In-Reply-To: <20170702112826.zirafz7lbo5lnabd@earth>
References: <20170625104850.GA24717@amd>
	<87shinzkp9.fsf@notabene.neil.brown.name>
	<20170626083407.GA9621@amd> <20170627123947.krne6a2saolcndih@earth>
	<20170627215755.GC5250@amd> <20170628164502.itggbf4xuhsv3oyf@earth>
	<20170628211008.GA19571@amd>
	<20170702112826.zirafz7lbo5lnabd@earth>
Message-ID: <20170702181445.GB1894@xo-6d-61-c0.localdomain>

Hi!

> > > I guess your battery is not the fittest anymore?
> > 
> > I guess that's one issue. (One of my batteries is actually so bad that
> > GSM modem fails with it.)
> > 
> > Well, I guess Debian boots a little longer than Maemo. Plus, I believe
> > we should charge the battery from kernel by default; it will enable
> > running fsck etc, and it will mean slow userspace boot will not
> > break...
> 
> That sounds like really bad battery :)

Yes, fortunately I have two others :-).

> > Interesting :-). I guess you could do really fair comparison of the
> > chipsets.
> > 
> > (Binary driver -- bad Motorola :-( )
> 
> Yeah :( Note, that Nokia also has binary driver for the N900 (which
> had been reverse engineered) and a different on on N950 (which has
> not been reverse engineered so far).

Yes... and another reason to keep N900.

> > > > +	int mV, mA, mOhm = 430, mVadj = 0;
> > > 
> > > 430 mOhm?
> > 
> > Yes, 0.43 Ohm.
> 
> What's the source of this value?

Estimate on one of my batteries. Real value differs with temperature and battery
age (among other things) but this is already significantly better than assuming
internal resistance is zero.

Best regards,
									Pavel


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

From rostedt at goodmis.org  Mon Jul  3 16:30:25 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Mon, 3 Jul 2017 12:30:25 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
Message-ID: <20170703123025.7479702e@gandalf.local.home>

On Sun, 2 Jul 2017 19:51:43 +0200
Thorsten Leemhuis <linux at leemhuis.info> wrote:

> Hi! Sorry, I know I'm late -- real life (travel, day job, ...) kept me
> away from spending time on Linux kernel regression work :-/
> 
> Maybe I'm taking it a bit to far for the new kid in town, but I think I
> want to propose two sessions. One for the maintainer summit, that deals
> with a the most critical issues relevant to regression tracking. And one
> technical session to deal with all the other stuff. Obviously we can
> move below mentioned topics from one to the other or talk about them at
> both if we want.
> 
> = [MAINTAINERS SUMMIT] Improve regression tracking =
> 
>  * Follow up from last year: What to do about bugzilla.kernel.org?
> Reporters still get stranded there.
>  * How to get subsystems maintainer involved more in regression tracking
> to better make sure that reported regressions are tracked and not
> forgotten accidentally.

We should push harder for all reproducer tests to be put into
selftests. I try to do that myself (although I admit, I forget to do it
myself here and there. But I'm pushing myself to be better)

>  * Frustrations with regression tracking aka. how to establish
> regression tracking properly to make sure it will never go away again.

By adding reproducing tests to selftests, we can easily see what
regressions are still there.

> 
> = [TECH TOPIC] Improve the kernels quality by getting more people
> involved in regression testing and reporting =

Again, this can be answered by placing more reproducers into selftests.

> 
>  * A short report from the outcome of the maintainer summit discussion;
> also pick up and topics here that where not properly discussed on the
> maintainer summit or were postponed to this session.
>  * How to get distros more involved in regression tracking; especially
> those that have a technical aware user base or normally ship up2date
> kernel images (and thus have an greater interest in avoiding
> regressions). I'm mainly thinking about Arch Linux, Debian, Fedora, and
> openSUSE Tumbleweed here; having Ubuntu in the boat would be good, too!
> (might be wise to talk about this on the maintainers summit as well, if
> the right people are there)
>  * How to make it more easy to (ideally automatically!) track the
> current status and the progress of each regression? Are there any tools
> that could make regression tracking easier for all of us while not
> introducing much overhead for maintainers?

What is selftests?  (Jeopardy answer for all of the above ;-)

> 
> = Details =
> 
> Below you'll find few more words about some points mentioned above;
> there are a few other topics as well we could discuss if we want. But
> first, a few general words on regression tracking from my point of view:
> 
>  * There are a lot of areas in regression tracking where things are far
> from good (read: in a bad state). That makes it easy to discuss current
> problems and their solutions for hours -- and at the same time forget
> that discussing itself doesn't get us much forward (the old bugzilla
> issue mentioned in this mail is a good example). We thus IMHO should
> focus on the most important issues and lay the groundwork to establish
> regression tracking properly again, then we move on to solve things that
> are harder to solve.
> 
>  * Regression tracking currently is quite boring and exhausting (read:
> high burn-out risk), as it involves quite a lot of manual work finding
> regressions and keeping track of their progress (and at the end of the
> day it does not feel like you achieved much). Some of that work can not
> be automated. But quite a bit can and that would help a great deal to
> establish regression tracking properly (currently I'm the only one doing
> it and some development cycles I simply don't find spare time for it).
> 
>    I currently don't see any existing solutions that fit well with our
> mail focused workflow and at the same time do not introduce much
> overhead for subsystem maintainers (which I assume is what everyone
> wants, as I fear solutions with much overhead won't fly at all). Ideas
> how to solve this tricky problem area are highly welcomed. It's
> something that can be discussed when the aforementioned points
> "establish regression tracking properly" and "make it more easy to
> manually or automatically track the current status of a regression" come up.
> 
> == What to do about bugzilla.kernel.org =
> 
> Discussed last year already; see https://lwn.net/Articles/705245/ for
> details. Situation didn't change much since then: the bugzilla instance
> was updated, but people still get stranded there as most subsystems
> ignore it. That afaics frustrates people and makes them stop testing or
> reporting bugs.
> 
> Discuss how to improve things. [my2cent] Maybe a short term solution
> like this could work: Serve a static page on bugzilla.kernel.org that
> tells people where regressions/bugs for certain subsystems can be
> reported, as it most of the time is some mailing list anyway. Such a
> page could get compiled from MAINTAINERS (there is the "B:" field now
> that points to bugzilla; if its not there point to a mailing lists; also
> explain get_maintainers.pl).
> 
>   Leave our bugzilla reachable via bugzilla.kernel.org/frontpage (or
> something like that) for those few subsystems that use it; that's afaics
> ACPI and PM (including Cpufreq, Cpuidle, Hibernation, Suspend, ...) and
> maybe PCI (not sure) -- or should we tell them to move to
> bugzilla.freedesktop.org (or somewhere else) to get rid of our bugzilla
> in the long etrm and make Konstantins life easier? Anyway: Make sure
> bugs for other subsystems can't get filed in bugzilla.kernel.org anymore
> to make sure they get lost there. [/my2cent]
> 
> == How to get subsystems maintainer more involved in regression tracking
> to [?] ==
> 
> One reasons why I put this up is: It would help me a lot if people let
> regressions at leemhuis.info (side note: might be wise to make a
> mailing-list that replaces this address) get told about regressions --
> simply CCing it on reports or answers to regressions reports is enough;
> forwarding/bouncing mails there (even without additional text) is fine,
> too.
> 
> The other reason I included it: This came up in last years discussion on
> this list and it seemed some people thought we can get the subsystems
> maintainers more involved; so I thought it might be wise to discuss it.
> Might also be a good idea to discuss here how to get distro kernel
> maintainer more involved if enough are around.
> 
> == How to establish regression tracking properly [?] ==
> 
> This is a pretty vague topic on purpose. People seem to agree that
> regression tracking is important, but for years nobody did it (it
> stopped a little while after Rafael had to move on) and the little bit
> that I can do in my rare spare time won't help much (and I have no idea
> how long I can continue to find time for it).
> 
> == Make it easier to track the progress of regression ==
> 
> One of the main reasons that makes regression tracking hard currently:
> getting aware or regressions and tracking their progress is a lot of
> manual work. I plan one step that hopefully makes the job a little
> easier and at the same time might allow some automation in the long
> term: ask people to include a certain keyword in their regressions
> reports. Maybe something like "Linux-Regression" that doesn't get too
> much false positives when searching for it on lists and via Google
> (suggestions for a better tag welcome).
> 
> In addition, I plan to hand out some form of ID for each regressions I
> track and ask people to include it -- especially when they post patches
> that fix said regression or move the discussion to a new place (like
> "Corrects: Linux-Regression-d2afd"; again: suggestions welcome! Maybe I
> should just use a URL where people find details?).
> 
> That way I can notice more easy when a fix for a regression hits
> linux-next or master; I also get aware if a discussion moves from
> bugzilla to LKML or from one thread to another (fingers crossed).
> Obviously it depends on cooperation of those involved.
> 
> If this works out we could write a script or something that watches
> mailing lists, bug trackers and git trees for the tag in question. That
> script could file a database and automatically do some of the tracking job.
> 
> == get distros more involved ==
> 
> I assume at least Ben (Debian), Laura (Fedora), and Takashi (openSUSE)
> are around, so it might be a good idea to sit together and talk
> regression tracking in general and how we could get the distros kernel
> maintainers more involved. Even better would be to sit down before to
> maybe come up with some ideas/plans we could talk during this session.
> 
> One topic could be: How to make it easier for users of popular distros
> to get involved in testing. The "Kernel of the day" (KOTD) from
> SUSE/openSUSE was mentioned recently on this list already, but I got the
> impression that the existence of this repo is not well known; guess it's
> the same for my own Kernel Vanilla Repositories for Fedora (those
> contain packages with a quite recent mainline version; see
> https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories ) or the fact
> that Fedora rawhide ships a recent mainline snapshot all the time. But
> should distros also offer Linux-next somewhere? Or anything else? And
> should the distros send experienced users upstream when they found a
> regression? Or will subsystem maintainers send those users away because
> they assume those kernels are not vanilla?
> 
> 
> == Topics or vague ideas I left out on purpose ==
> 
> Here is a list of other things we could talk about, but I think better
> left for a later time:
> 
>  * Kerneloops (http://oops.kernel.org/): It was discussed last year on
> this list. I have no idea what the current status is. Is someone
> watching & analysing it? And poking the right people when needed? (I
> doubt it)
> 
>  * Regression tracking for stable kernels (many bugs only get noticed
> once a new mainline version got released; at that time it might still be
> easy to revert a certain patch in mainline and stable)
> 
>  * statistics: I didn't spend time to create statistics, like Rafael did
> in the past. They'd be nice to have, but for now I think my time is
> better spend elsewhere.
> 
>  * work towards growing the number of tester by making it easier for
> them (better documentation, easier configuration, bisection scripts, ...)
> 
>  * maybe document a few some procedures for those that are not regular
> kernel developers (like the "When users report bugs on the Fedora
> tracker that look like actual upstream bugs, what's the best way to have
> those reported?" thing that Laura mentioned earlier this month in the
> mail "Bug reporting feedback loop"
> 
>  * provide better services than only a plain text list of regression on
> a mailing list?
> 
>  * better documentation? for example explain the difference between bugs
> and regressions somewhere to make people understand why their bugs might
> get ignored, but as the same time know that we handle regressions more
> seriously.
> 
>  * Should the regression tracker nag subsystem maintainers (and
> reporters) more often if they are inactive? How do people for example
> feel about (Semi-)Automatic nagging mails for regressions where there is
> no progress?
> 
>  * Is the data and the format of the current reports show useful at all?
> If not: How to improve it?
> 
>  * regression tracking is a fair amount of work, and it's frustrating,
> and people burn out. How to avoid that? Can we maybe get regression
> tracking on solid ground by somehow building a healthy community around
> it (containing kernel developers, Distro maintainers and people that are
> willing to help in their spare time) that work on regressions
> testing/tracking and other QA stuff?
> 
>  * how to make the Linux kernel development so good that the mainstream
> distros stop their kernel forks and do what they do with Firefox: Ship
> the latest stable version (users get a new version with new features
> every few weeks) or a longterm branch (makes a big version jump about
> once a year; see Firefox ESR).

This wont ever happen (famous last words). Distros want "stable
kernels" with new features. That's not what stable is about.

> 
> Ugh, pretty long mail. Sorry about that. Maybe I shouldn't have looked
> so closely into LWN.net articles about regression tracking and older
> discussions about it.

Anyway, I know that selftests are not the answer for everything, but
anything that has a way to reproduce a bug should be added to it. Sure,
it may depend on various hardware and/or file systems and different
configs, but if we have a central location to place all bug reproducing
tests (which we do have), then we should utilize it.

When it's in the kernel tree, it will be used much more often.

-- Steve

From dan.j.williams at intel.com  Mon Jul  3 18:50:42 2017
From: dan.j.williams at intel.com (Dan Williams)
Date: Mon, 3 Jul 2017 11:50:42 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170703123025.7479702e@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
Message-ID: <CAPcyv4gUOs4+cNNGL_VeZms2P5Ypwgu-WknE4kJ0-sRij5y1QQ@mail.gmail.com>

On Mon, Jul 3, 2017 at 9:30 AM, Steven Rostedt <rostedt at goodmis.org> wrote:
> On Sun, 2 Jul 2017 19:51:43 +0200
[..]
>>
>> Ugh, pretty long mail. Sorry about that. Maybe I shouldn't have looked
>> so closely into LWN.net articles about regression tracking and older
>> discussions about it.
>
> Anyway, I know that selftests are not the answer for everything, but
> anything that has a way to reproduce a bug should be added to it. Sure,
> it may depend on various hardware and/or file systems and different
> configs, but if we have a central location to place all bug reproducing
> tests (which we do have), then we should utilize it.

I agree with Steven, and I would add that you don't necessarily need
specific hardware to write a test for a driver regression, see
examples in tools/testing/nvdimm. I also tend to think that
back-stopping regressions with new tests helps with the burn-out
problem of tracking regressions. Where building tools and tests is
potentially more fulfilling than just bug tracking.

From jkosina at suse.cz  Mon Jul  3 20:41:16 2017
From: jkosina at suse.cz (Jiri Kosina)
Date: Mon, 3 Jul 2017 22:41:16 +0200 (CEST)
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Driver and/or module
 versions
In-Reply-To: <20170630162155.GB26257@fury>
References: <20170625072423.GR1248@mtr-leonro.local>
	<CA+55aFx9A=5cc0QZ7CySC4F2K7eYaEfzkdYEc9JaNgCcV25=rg@mail.gmail.com>
	<20170630162155.GB26257@fury>
Message-ID: <alpine.LSU.2.20.1707032238300.15036@cbobk.fhfr.pm>

On Fri, 30 Jun 2017, Darren Hart wrote:

> New features also fall into the independent tracking bucket, although 
> your point about feature masks could reduce that need. Is there a 
> definitive mechanism for the feature mask approach? I see a lot of 
> sysfs_filename:value key:value pairs for this kind of thing.

Adding those sysfs attributes seems like exactly the thing that people 
will keep forgetting to do, as there is no (real) functionality depending 
on them.

I doubt there is any better 'description' of the 'state' of the driver 
than SHA of the topmost commit + the tree it's related to.

-- 
Jiri Kosina
SUSE Labs

From dvhart at infradead.org  Mon Jul  3 21:25:35 2017
From: dvhart at infradead.org (Darren Hart)
Date: Mon, 3 Jul 2017 14:25:35 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Driver and/or module
 versions
In-Reply-To: <alpine.LSU.2.20.1707032238300.15036@cbobk.fhfr.pm>
References: <20170625072423.GR1248@mtr-leonro.local>
	<CA+55aFx9A=5cc0QZ7CySC4F2K7eYaEfzkdYEc9JaNgCcV25=rg@mail.gmail.com>
	<20170630162155.GB26257@fury>
	<alpine.LSU.2.20.1707032238300.15036@cbobk.fhfr.pm>
Message-ID: <20170703212535.GA6739@fury>

On Mon, Jul 03, 2017 at 10:41:16PM +0200, Jiri Kosina wrote:
> On Fri, 30 Jun 2017, Darren Hart wrote:
> 
> > New features also fall into the independent tracking bucket, although 
> > your point about feature masks could reduce that need. Is there a 
> > definitive mechanism for the feature mask approach? I see a lot of 
> > sysfs_filename:value key:value pairs for this kind of thing.
> 
> Adding those sysfs attributes seems like exactly the thing that people 
> will keep forgetting to do, as there is no (real) functionality depending 
> on them.
> 
> I doubt there is any better 'description' of the 'state' of the driver 
> than SHA of the topmost commit + the tree it's related to.

This is exactly what I have been saying in my inward facing roles. Here
I'm trying to make sure I'm not missing something that makes this not
100% accurate.

For specific things, I could see those sysfs attributes being useful,
but as you say, they are informational only.

-- 
Darren Hart
VMware Open Source Technology Center

From peterz at infradead.org  Tue Jul  4 14:51:10 2017
From: peterz at infradead.org (Peter Zijlstra)
Date: Tue, 4 Jul 2017 16:51:10 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <20170629221245.489760b1@gandalf.local.home>
References: <152520246.5707.1498771254819.JavaMail.zimbra@efficios.com>
	<20170629195537.534445e7@gandalf.local.home>
	<CA+55aFxW_vhYWRoWFwy4zQgG7iPJg3V4u0-XkjZiGJJfZtZ=ng@mail.gmail.com>
	<20170629203224.6bf7f29a@gandalf.local.home>
	<20170629205218.5b9a7923@gandalf.local.home>
	<CA+55aFyQE_T7Rp7ay_EbAZNDqLE6ffJ-6xkL6B_961oZ0+aSpA@mail.gmail.com>
	<20170629211641.5aeb3af7@gandalf.local.home>
	<20170629212750.5c3542ee@gandalf.local.home>
	<CA+55aFzzCPMUDt72hckauYu+fj=Q2MWjx+XiR06KpMLAr1EBAA@mail.gmail.com>
	<20170629221245.489760b1@gandalf.local.home>
Message-ID: <20170704145110.GD7287@worktop>


Yay, tracing fight!! :/

On Thu, Jun 29, 2017 at 10:12:45PM -0400, Steven Rostedt wrote:
> On Thu, 29 Jun 2017 18:51:14 -0700
> Linus Torvalds <torvalds at linux-foundation.org> wrote:

> > But yes, I was talking about something very similar to what I think
> > Peter is talking about - the ability to attach a ebpf script to
> > kprobes and extract data dynamically. We've supported ebpf tracepoints
> > for years afaik, what is actually missing from using that for whatever
> > particular extension people want to use?
> 
> Well, I don't want to put words in his mouth, but as he's probably
> currently putting mush in a baby's mouth, so I'll do it anyway. ;-) We
> were talking about making the static tracepoints more "dynamic". I'm not
> sure he's ever used eBPF with tracing.

So my concerns/objections are two-fold:

 - I want only a single static tracepoint in the code.

 - I want only a single 'event' associated with this in userspace.

   (in particular I only see confusion happening when we have:
    sched_switch_fair, sched_switch_rt, sched_switch_deadline events for
    the exact same event; people will forget to enable one or more and
    wonder WTF they have holes in their traces)

These are not strange constraints / demands in my book. Just turns out
its 'difficult' to pull off or something.  I'm in fact fine with simply
adding bits to the one tracepoint we have; although others (that'd be
you Steve) are not because expensive.

Further complications seem to stem from the fact that I use the tracefs
interface exclusively. I don't know how to use perf or trace-cmd or any
of that new fangled stuff to do tracing -- nor do I really care, it
works for me (same why I'm happy with sysvinit, I don't _want_ to have
to relearn my 20+ year old sysadmin skillz, there's better things in live
to spend time on, that baby you mentioned for example).

So on that same vein, I'd be entirely helpless using eBPF to do tracing,
that's even more complicated. That said, I don't typically need this
crud anyway, I just change my kernel and rebuild, reboot and am happy,
that's far easier than trying to figure out how eBPF works.


In any case, baby vomit is more fun that this subject :-)

From linux at leemhuis.info  Tue Jul  4 19:03:22 2017
From: linux at leemhuis.info (Thorsten Leemhuis)
Date: Tue, 4 Jul 2017 21:03:22 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170703123025.7479702e@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
Message-ID: <ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>

On 03.07.2017 18:30, Steven Rostedt wrote:
> On Sun, 2 Jul 2017 19:51:43 +0200
> Thorsten Leemhuis <linux at leemhuis.info> wrote:
>>  * How to get subsystems maintainer involved more in regression tracking
>> to better make sure that reported regressions are tracked and not
>> forgotten accidentally.
> We should push harder for all reproducer tests to be put into
> selftests. I try to do that myself [...]
> [...]
> By adding reproducing tests to selftests, we can easily see what
> regressions are still there.
> [...]
> What is selftests?  (Jeopardy answer for all of the above ;-)

Sure, writing and running selftests is a good idea. But as you said
yourself in the later part of your mail: it won't help much in
situations where the kernel (or a selftest) needs to run on a certain
hardware or a specific (and maybe rare or complex) configuration. Sadly
a lot of the regressions in my recent reports were of this kind afaics :-/

In fact I got the impression that most of the regressions that might get
caught by selftests were directly handled by the subsystem maintainer
and never made it to me or my reports -- and thus I can't ask
maintainers to write selftests. *If* I got better aware of those
problems I (a) could make sure they are not forgotten and (b) sooner or
later could publicly state something like "hey, you had ten regressions
recently in your subsystem where writing a selftest might have been a
good idea, but you didn't even write one -- why?" (if we want something
like that).

> [?]
>>  * how to make the Linux kernel development so good that the mainstream
>> distros stop their kernel forks and do what they do with Firefox: Ship
>> the latest stable version (users get a new version with new features
>> every few weeks) or a longterm branch (makes a big version jump about
>> once a year; see Firefox ESR).

Hehe, I maybe left the field "regression tracking" to much here and
wandered too far into QA territory.

> This wont ever happen (famous last words). Distros want "stable
> kernels" with new features. 

Ha, yes, it's a long shot (and maybe more a vague idea to work towards
to). And maybe Debian stable and RHEL will always use the model they use
today. But Fedora, rolling release distros (Tumbleweed, Arch, ...), and
some others are updating to the latest Linux kernel release every few
weeks already and it works fine for them. Maybe we can get Ubuntu and
others to follow sooner or later.

Sure, for some people a version jump to a major new kernel release will
sound crazy, but when Linus introduced the current development scheme  a
lot of people also said "that will never fly" -- that was 13 years ago
now and it works quite well. The situation was similar with Firefox as well.

> That's not what stable is about.

That afaics (disclaimer: English is not my mother tongue) depends on the
interpretation of the word, as it can mean "nothing changes" or "rock
solid/reliable" (even when two people have a "stable relationship" it
does not mean that nothing changes between them...).

>> Ugh, pretty long mail. Sorry about that. Maybe I shouldn't have looked
>> so closely into LWN.net articles about regression tracking and older
>> discussions about it.
> Anyway, I know that selftests are not the answer for everything, but
> anything that has a way to reproduce a bug should be added to it. Sure,
> it may depend on various hardware and/or file systems and different
> configs, but if we have a central location to place all bug reproducing
> tests (which we do have), then we should utilize it.
> When it's in the kernel tree, it will be used much more often.

+1

Ciao, Thorsten

From rostedt at goodmis.org  Wed Jul  5 12:45:28 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 08:45:28 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
Message-ID: <20170705084528.67499f8c@gandalf.local.home>

On Tue, 4 Jul 2017 21:03:22 +0200
Thorsten Leemhuis <linux at leemhuis.info> wrote:

> On 03.07.2017 18:30, Steven Rostedt wrote:
> > On Sun, 2 Jul 2017 19:51:43 +0200
> > Thorsten Leemhuis <linux at leemhuis.info> wrote:  
> >>  * How to get subsystems maintainer involved more in regression tracking
> >> to better make sure that reported regressions are tracked and not
> >> forgotten accidentally.  
> > We should push harder for all reproducer tests to be put into
> > selftests. I try to do that myself [...]
> > [...]
> > By adding reproducing tests to selftests, we can easily see what
> > regressions are still there.
> > [...]
> > What is selftests?  (Jeopardy answer for all of the above ;-)  
> 
> Sure, writing and running selftests is a good idea. But as you said
> yourself in the later part of your mail: it won't help much in
> situations where the kernel (or a selftest) needs to run on a certain
> hardware or a specific (and maybe rare or complex) configuration. Sadly
> a lot of the regressions in my recent reports were of this kind afaics :-/
> 
> In fact I got the impression that most of the regressions that might get
> caught by selftests were directly handled by the subsystem maintainer
> and never made it to me or my reports -- and thus I can't ask
> maintainers to write selftests. *If* I got better aware of those
> problems I (a) could make sure they are not forgotten and (b) sooner or
> later could publicly state something like "hey, you had ten regressions
> recently in your subsystem where writing a selftest might have been a
> good idea, but you didn't even write one -- why?" (if we want something
> like that).

I'm betting there's a lot of reproducer code that never makes it into a
test. How do we solve that? Perhaps we need people looking at LKML for
any signs "I did this, and it caused a bug" or "Here's a test case
which can trigger the bug". Each of these instances should end up in
selftests, and I'm sure they are not.

We can't do much for special hardware, even though those tests should
still be in the selftests for those that have the hardware, but we can
do something about special configs. Perhaps selfttests should have a
"config test" section. I have that in my own tests, but I use ktest to
build them.


> 
> > [?]  
> >>  * how to make the Linux kernel development so good that the mainstream
> >> distros stop their kernel forks and do what they do with Firefox: Ship
> >> the latest stable version (users get a new version with new features
> >> every few weeks) or a longterm branch (makes a big version jump about
> >> once a year; see Firefox ESR).  
> 
> Hehe, I maybe left the field "regression tracking" to much here and
> wandered too far into QA territory.
> 
> > This wont ever happen (famous last words). Distros want "stable
> > kernels" with new features.   
> 
> Ha, yes, it's a long shot (and maybe more a vague idea to work towards
> to). And maybe Debian stable and RHEL will always use the model they use
> today. But Fedora, rolling release distros (Tumbleweed, Arch, ...), and
> some others are updating to the latest Linux kernel release every few
> weeks already and it works fine for them. Maybe we can get Ubuntu and
> others to follow sooner or later.
> 
> Sure, for some people a version jump to a major new kernel release will
> sound crazy, but when Linus introduced the current development scheme  a
> lot of people also said "that will never fly" -- that was 13 years ago
> now and it works quite well. The situation was similar with Firefox as well.
> 
> > That's not what stable is about.  
> 
> That afaics (disclaimer: English is not my mother tongue) depends on the
> interpretation of the word, as it can mean "nothing changes" or "rock
> solid/reliable" (even when two people have a "stable relationship" it
> does not mean that nothing changes between them...).

Nothing to do with what language your mother tongue is ;-)

When the stable releases were created, there was some pretty strict
requirements for what should go into stable. Of course the requirements
have changed throughout the years. But there are big differences in
what Red Hat considers something "stable" and what the Linux stable
releases consider to be stable. That is where I meant that things wont
change.

-- Steve


From rostedt at goodmis.org  Wed Jul  5 13:27:57 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 09:27:57 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
Message-ID: <20170705092757.63dc2328@gandalf.local.home>

On Wed, 5 Jul 2017 09:09:51 -0400
Carlos O'Donell <carlos at redhat.com> wrote:

> This problem is a reflection of our own explicit or implicit priorities.
> The priorities of developers and reviewers needs to change to make an
> impact on the problem. This is a hard problem.

I 100% agree.

> 
> As a concrete action item, glibc core developers took a harder stance on
> (a) all user-visible bugs need a bug # (forces people to think about the

Unfortunately, we don't have a good system for a "bug #". Most kernel
developers hate bugzilla, and I think that includes Linus ;-) Which
means, unless Linus builds us a new bug tracking system, there wont be
any mandate for it.


> problem and file a coherent public bug about it) (b) all bugs needs a
> regression test if possible, (c) and if not possible we need to extend

I would love all bug fixes to come with a test (when possible).

> the testing framework to make it possible (we've started using kernel
> namespaces to create isolated test configurations).

Well, we have a selftest directory that should include all of these.
And most people run them on either a test box or a VM.

> 
> This change in reviewer priorities has had a noticeable impact on developer
> priorities over the last 5 years. Timelines for this problem will be
> measured in years.
> 

Your "b" above is what I would like to push. But who's going to enforce
this? With 10,000 changes per release, and a lot of them are fixes, the
best we can do is the honor system. Start shaming people that don't
have a regression test along with a Fixes tag (but we don't want people
to fix bugs without adding that tag either). There is a fine line one
must walk between getting people to change their approaches to bugs and
regression tests, and pissing them off where they start doing the
opposite of what would be best for the community.

-- Steve

From greg at kroah.com  Wed Jul  5 14:06:07 2017
From: greg at kroah.com (Greg KH)
Date: Wed, 5 Jul 2017 16:06:07 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705092757.63dc2328@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
Message-ID: <20170705140607.GA30187@kroah.com>

On Wed, Jul 05, 2017 at 09:27:57AM -0400, Steven Rostedt wrote:
> Your "b" above is what I would like to push. But who's going to enforce
> this? With 10,000 changes per release, and a lot of them are fixes, the
> best we can do is the honor system. Start shaming people that don't
> have a regression test along with a Fixes tag (but we don't want people
> to fix bugs without adding that tag either). There is a fine line one
> must walk between getting people to change their approaches to bugs and
> regression tests, and pissing them off where they start doing the
> opposite of what would be best for the community.

I would bet, for the huge majority of our fixes, they are fixes for
specific hardware, or workarounds for specific hardware issues.  Now
writing tests for those is not an impossible task (look at what the i915
developers have), but it is very very hard overall, especially if the
base infrastructure isn't there to do it.

For specific examples, here's the shortlog for fixes that went into
drivers/usb/host/ for 4.12 after 4.12-rc1 came out.  Do you know of a
way to write a test for these types of things?
	usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
	usb: xhci: Fix USB 3.1 supported protocol parsing
	usb: host: xhci-plat: propagate return value of platform_get_irq()
	xhci: Fix command ring stop regression in 4.11
	xhci: remove GFP_DMA flag from allocation
	USB: xhci: fix lock-inversion problem
	usb: host: xhci-ring: don't need to clear interrupt pending for MSI enabled hcd
	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
	usb: xhci: trace URB before giving it back instead of after
	USB: host: xhci: use max-port define
	USB: ehci-platform: fix companion-device leak
	usb: r8a66597-hcd: select a different endpoint on timeout
	usb: r8a66597-hcd: decrease timeout

And look at the commits with the "Fixes:" tag in it, I do, I read every
one of them.  See if writing a test for the majority of them would even
be possible...

I don't mean to poo-poo the idea, but please realize that around 75% of
the kernel is hardware/arch support, so that means that 75% of the
changes/fixes deal with hardware things (yes, change is in direct
correlation to size of the codebase in the tree, strange but true).

If only I had a subsystem that didn't have to deal with hardware, that
must be so easy to work with :)

thanks,

greg k-h

From rostedt at goodmis.org  Wed Jul  5 14:33:35 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 10:33:35 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705140607.GA30187@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
Message-ID: <20170705103335.0cbd9984@gandalf.local.home>

On Wed, 5 Jul 2017 16:06:07 +0200
Greg KH <greg at kroah.com> wrote:

> On Wed, Jul 05, 2017 at 09:27:57AM -0400, Steven Rostedt wrote:
> > Your "b" above is what I would like to push. But who's going to enforce
> > this? With 10,000 changes per release, and a lot of them are fixes, the
> > best we can do is the honor system. Start shaming people that don't
> > have a regression test along with a Fixes tag (but we don't want people
> > to fix bugs without adding that tag either). There is a fine line one
> > must walk between getting people to change their approaches to bugs and
> > regression tests, and pissing them off where they start doing the
> > opposite of what would be best for the community.  
> 
> I would bet, for the huge majority of our fixes, they are fixes for
> specific hardware, or workarounds for specific hardware issues.  Now
> writing tests for those is not an impossible task (look at what the i915
> developers have), but it is very very hard overall, especially if the
> base infrastructure isn't there to do it.
> 
> For specific examples, here's the shortlog for fixes that went into
> drivers/usb/host/ for 4.12 after 4.12-rc1 came out.  Do you know of a
> way to write a test for these types of things?
> 	usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
> 	usb: xhci: Fix USB 3.1 supported protocol parsing
> 	usb: host: xhci-plat: propagate return value of platform_get_irq()
> 	xhci: Fix command ring stop regression in 4.11
> 	xhci: remove GFP_DMA flag from allocation
> 	USB: xhci: fix lock-inversion problem
> 	usb: host: xhci-ring: don't need to clear interrupt pending for MSI enabled hcd
> 	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
> 	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
> 	usb: xhci: trace URB before giving it back instead of after
> 	USB: host: xhci: use max-port define
> 	USB: ehci-platform: fix companion-device leak
> 	usb: r8a66597-hcd: select a different endpoint on timeout
> 	usb: r8a66597-hcd: decrease timeout
> 
> And look at the commits with the "Fixes:" tag in it, I do, I read every
> one of them.  See if writing a test for the majority of them would even
> be possible...
> 
> I don't mean to poo-poo the idea, but please realize that around 75% of
> the kernel is hardware/arch support, so that means that 75% of the
> changes/fixes deal with hardware things (yes, change is in direct
> correlation to size of the codebase in the tree, strange but true).

I would say that if it's for a specific hardware, then it's really up
to the maintainer if there should be a test or not. As a lot of these
is just to deal with some quirk or non standard that the hardware does.
But are these regressions, or just some feature that's been broken on
that hardware since its conception?

That is, Thorsten this is more for you, how much real regressions are in
hardware? A bug that's been there forever is not a regression. It's a
feature ;-)  A regression is something that use to work and now does
not. Is that number still as high with hardware? Those probably could
be where tests can be focused on.

I'm worried more about infrastructure too. I would look at general
functionality of say USB, to see if something can be written to test a
device. Using the one change above that actually mentions "regression"
would it be possible to test completion codes? (I have no idea, I only
read the change log and I'm speaking out of my derri?re)

If we have a bunch of generic tests that can test hardware (general
video tests, USB tests, network cards, etc) and people ran these on
their own hardware, and it were to trigger a failure, then it would be
easier for users to report these issues to the maintainers. And these
would be easier to find and fix.

No test should be written for a single specific hardware. It should be a
general functionality that different hardware can execute.

> 
> If only I had a subsystem that didn't have to deal with hardware, that
> must be so easy to work with :)

*smack*!  ;-)

-- Steve

From broonie at kernel.org  Wed Jul  5 14:33:41 2017
From: broonie at kernel.org (Mark Brown)
Date: Wed, 5 Jul 2017 15:33:41 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705140607.GA30187@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
Message-ID: <20170705143341.oees22k2snhtmkxo@sirena.org.uk>

On Wed, Jul 05, 2017 at 04:06:07PM +0200, Greg KH wrote:

> I don't mean to poo-poo the idea, but please realize that around 75% of
> the kernel is hardware/arch support, so that means that 75% of the
> changes/fixes deal with hardware things (yes, change is in direct
> correlation to size of the codebase in the tree, strange but true).

Then add in all the fixes for concurrency/locking issues and so on
that're hard to reliably reproduce as well...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170705/7626d634/attachment.sig>

From rostedt at goodmis.org  Wed Jul  5 14:36:58 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 10:36:58 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705143341.oees22k2snhtmkxo@sirena.org.uk>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
Message-ID: <20170705103658.226099c6@gandalf.local.home>

On Wed, 5 Jul 2017 15:33:41 +0100
Mark Brown <broonie at kernel.org> wrote:

> On Wed, Jul 05, 2017 at 04:06:07PM +0200, Greg KH wrote:
> 
> > I don't mean to poo-poo the idea, but please realize that around 75% of
> > the kernel is hardware/arch support, so that means that 75% of the
> > changes/fixes deal with hardware things (yes, change is in direct
> > correlation to size of the codebase in the tree, strange but true).  
> 
> Then add in all the fixes for concurrency/locking issues and so on
> that're hard to reliably reproduce as well...

All tests should be run with lockdep enabled ;-)  Which a surprising
few developers appear to do :-p

-- Steve

From James.Bottomley at HansenPartnership.com  Wed Jul  5 14:50:28 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Wed, 05 Jul 2017 07:50:28 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705103658.226099c6@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
Message-ID: <1499266228.3668.10.camel@HansenPartnership.com>

On Wed, 2017-07-05 at 10:36 -0400, Steven Rostedt wrote:
> On Wed, 5 Jul 2017 15:33:41 +0100
> Mark Brown <broonie at kernel.org> wrote:
> 
> > 
> > On Wed, Jul 05, 2017 at 04:06:07PM +0200, Greg KH wrote:
> > 
> > > 
> > > I don't mean to poo-poo the idea, but please realize that around
> > > 75% of the kernel is hardware/arch support, so that means that
> > > 75% of the changes/fixes deal with hardware things (yes, change
> > > is in direct correlation to size of the codebase in the tree,
> > > strange but true). ?
> > 
> > Then add in all the fixes for concurrency/locking issues and so on
> > that're hard to reliably reproduce as well...
> 
> All tests should be run with lockdep enabled ;-)??Which a surprising
> few developers appear to do :-p

Lockdep checks the locking hierarchies and makes assumptions about them
which it then validates ... it doesn't tell you if the data you think
you're protecting was accessed outside the lock, which is the usual
source of concurrency problems. ?In other words lockdep is useful but
it's not a panacea.

James


From broonie at kernel.org  Wed Jul  5 14:52:20 2017
From: broonie at kernel.org (Mark Brown)
Date: Wed, 5 Jul 2017 15:52:20 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705103335.0cbd9984@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705103335.0cbd9984@gandalf.local.home>
Message-ID: <20170705145220.5u3qpxs45sbbpzpx@sirena.org.uk>

On Wed, Jul 05, 2017 at 10:33:35AM -0400, Steven Rostedt wrote:

> That is, Thorsten this is more for you, how much real regressions are in
> hardware? A bug that's been there forever is not a regression. It's a
> feature ;-)  A regression is something that use to work and now does
> not. Is that number still as high with hardware? Those probably could
> be where tests can be focused on.

A relatively common case IME is things that were always bugs but depend
on some external thing to become visible, like someone trying to use a
device in a slightly different way, doing more detailed testing of some
kind or some subsystem change.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170705/3f114319/attachment.sig>

From rostedt at goodmis.org  Wed Jul  5 14:56:51 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 10:56:51 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499266228.3668.10.camel@HansenPartnership.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
Message-ID: <20170705105651.5da9c969@gandalf.local.home>

On Wed, 05 Jul 2017 07:50:28 -0700
James Bottomley <James.Bottomley at HansenPartnership.com> wrote:

> On Wed, 2017-07-05 at 10:36 -0400, Steven Rostedt wrote:
> > On Wed, 5 Jul 2017 15:33:41 +0100
> > Mark Brown <broonie at kernel.org> wrote:
> >   
> > > 
> > > On Wed, Jul 05, 2017 at 04:06:07PM +0200, Greg KH wrote:
> > >   
> > > > 
> > > > I don't mean to poo-poo the idea, but please realize that around
> > > > 75% of the kernel is hardware/arch support, so that means that
> > > > 75% of the changes/fixes deal with hardware things (yes, change
> > > > is in direct correlation to size of the codebase in the tree,
> > > > strange but true). ?  
> > > 
> > > Then add in all the fixes for concurrency/locking issues and so on
> > > that're hard to reliably reproduce as well...  
> > 
> > All tests should be run with lockdep enabled ;-)??Which a surprising
> > few developers appear to do :-p  
> 
> Lockdep checks the locking hierarchies and makes assumptions about them
> which it then validates ... it doesn't tell you if the data you think

We should probably look at adding infrastructure that helps in that.
RCU already has a lot of there to help know if data is being protected
by RCU or not.

Hmm, maybe we could add a __rcu like type that we can associate
protected data with, where a config can associate access to a variable
with a lock being held?

> you're protecting was accessed outside the lock, which is the usual
> source of concurrency problems. ?In other words lockdep is useful but
> it's not a panacea.

Still not an excuse to not have lockdep enabled during tests.

-- Steve

From James.Bottomley at HansenPartnership.com  Wed Jul  5 15:09:49 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Wed, 05 Jul 2017 08:09:49 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705105651.5da9c969@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
Message-ID: <1499267389.3668.16.camel@HansenPartnership.com>

On Wed, 2017-07-05 at 10:56 -0400, Steven Rostedt wrote:
> On Wed, 05 Jul 2017 07:50:28 -0700
> James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
> 
> > 
> > On Wed, 2017-07-05 at 10:36 -0400, Steven Rostedt wrote:
> > > 
> > > On Wed, 5 Jul 2017 15:33:41 +0100
> > > Mark Brown <broonie at kernel.org> wrote:
> > > ??
> > > > 
> > > > 
> > > > On Wed, Jul 05, 2017 at 04:06:07PM +0200, Greg KH wrote:
> > > > ??
> > > > > 
> > > > > 
> > > > > I don't mean to poo-poo the idea, but please realize that
> > > > > around 75% of the kernel is hardware/arch support, so that
> > > > > means that 75% of the changes/fixes deal with hardware things
> > > > > (yes, change is in direct correlation to size of the codebase
> > > > > in the tree, strange but true). ? ?
> > > > 
> > > > Then add in all the fixes for concurrency/locking issues and so
> > > > on that're hard to reliably reproduce as well... ?
> > > 
> > > All tests should be run with lockdep enabled ;-)??Which a
> > > surprising few developers appear to do :-p ?
> > 
> > Lockdep checks the locking hierarchies and makes assumptions about
> > them which it then validates ... it doesn't tell you if the data
> > you think
> 
> We should probably look at adding infrastructure that helps in that.
> RCU already has a lot of there to help know if data is being
> protected by RCU or not.
> 
> Hmm, maybe we could add a __rcu like type that we can associate
> protected data with, where a config can associate access to a
> variable with a lock being held?

That's about 10x more complex than the releases/acquires/must_hold
annotation, which we have fairly dismal coverage on.

If you remember the hotplug annotations, which were a shining example:
there's a limit of complexity before any annotation system simply
becomes a make work tyranny.?

> > you're protecting was accessed outside the lock, which is the usual
> > source of concurrency problems. ?In other words lockdep is useful
> > but it's not a panacea.
> 
> Still not an excuse to not have lockdep enabled during tests.

OK, what makes you think lockdep isn't enabled? ?Since Kconfig is so
complex, I usually use a distro config ... they have it enabled (or at
least openSUSE does), so it's enabled for everything I do.

James


From linux at roeck-us.net  Wed Jul  5 15:16:33 2017
From: linux at roeck-us.net (Guenter Roeck)
Date: Wed, 5 Jul 2017 08:16:33 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705140607.GA30187@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
Message-ID: <a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>

On 07/05/2017 07:06 AM, Greg KH wrote:
> On Wed, Jul 05, 2017 at 09:27:57AM -0400, Steven Rostedt wrote:
>> Your "b" above is what I would like to push. But who's going to enforce
>> this? With 10,000 changes per release, and a lot of them are fixes, the
>> best we can do is the honor system. Start shaming people that don't
>> have a regression test along with a Fixes tag (but we don't want people
>> to fix bugs without adding that tag either). There is a fine line one
>> must walk between getting people to change their approaches to bugs and
>> regression tests, and pissing them off where they start doing the
>> opposite of what would be best for the community.
> 
> I would bet, for the huge majority of our fixes, they are fixes for
> specific hardware, or workarounds for specific hardware issues.  Now
> writing tests for those is not an impossible task (look at what the i915
> developers have), but it is very very hard overall, especially if the
> base infrastructure isn't there to do it.
> 
> For specific examples, here's the shortlog for fixes that went into
> drivers/usb/host/ for 4.12 after 4.12-rc1 came out.  Do you know of a
> way to write a test for these types of things?
> 	usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
> 	usb: xhci: Fix USB 3.1 supported protocol parsing
> 	usb: host: xhci-plat: propagate return value of platform_get_irq()
> 	xhci: Fix command ring stop regression in 4.11
> 	xhci: remove GFP_DMA flag from allocation
> 	USB: xhci: fix lock-inversion problem
> 	usb: host: xhci-ring: don't need to clear interrupt pending for MSI enabled hcd
> 	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
> 	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
> 	usb: xhci: trace URB before giving it back instead of after
> 	USB: host: xhci: use max-port define
> 	USB: ehci-platform: fix companion-device leak
> 	usb: r8a66597-hcd: select a different endpoint on timeout
> 	usb: r8a66597-hcd: decrease timeout
> 
> And look at the commits with the "Fixes:" tag in it, I do, I read every
> one of them.  See if writing a test for the majority of them would even
> be possible...
> 
> I don't mean to poo-poo the idea, but please realize that around 75% of
> the kernel is hardware/arch support, so that means that 75% of the
> changes/fixes deal with hardware things (yes, change is in direct
> correlation to size of the codebase in the tree, strange but true).
> 

The reproducers for several of the usb fixes I submitted recently took hours of
stress test to reproduce the underlying problems. I have one more to fix which
takes days to reproduce, if at all (I have seen that problem only two or three
times during weeks of stress test). Due to the nature of the problems, reproducing
them heavily depended on the underlying hardware. None of the reproducers can
guarantee that the problem is fixed; they are intended to show the problem,
not that it is fixed. This happens a lot with race conditions - in many cases
it is impossible to prove that the problem is fixed; one can only prove that
it still exists.

Echoing what you said, I have no idea how it would even be possible to write
unit tests to verify if the problems I fixed are really fixed.

Several of the fixes I have submitted are based on single-instance error logs with
no reproducer. Many others are compile time fixes or fix problems found with code
inspection (manual or automatic).

If we start shaming people for not providing unit tests, all we'll accomplish is
that people will stop providing bug fixes.

Guenter

From broonie at kernel.org  Wed Jul  5 15:20:26 2017
From: broonie at kernel.org (Mark Brown)
Date: Wed, 5 Jul 2017 16:20:26 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499267389.3668.16.camel@HansenPartnership.com>
References: <ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
	<1499267389.3668.16.camel@HansenPartnership.com>
Message-ID: <20170705152026.rkw73q2f6xmiju37@sirena.org.uk>

On Wed, Jul 05, 2017 at 08:09:49AM -0700, James Bottomley wrote:
> On Wed, 2017-07-05 at 10:56 -0400, Steven Rostedt wrote:
> > James Bottomley <James.Bottomley at HansenPartnership.com> wrote:

> > > you're protecting was accessed outside the lock, which is the usual
> > > source of concurrency problems. ?In other words lockdep is useful
> > > but it's not a panacea.

> > Still not an excuse to not have lockdep enabled during tests.

> OK, what makes you think lockdep isn't enabled? ?Since Kconfig is so
> complex, I usually use a distro config ... they have it enabled (or at
> least openSUSE does), so it's enabled for everything I do.

Yeah, I see enough reports with it in embedded contexts to make me think
people use it there.  I know I tend to have it turned on most of the
time.  The concurrency stuff I'm thinking of here is more the things
you're mentioning with just not taking locks at all when they are needed
or concurrency with hardware.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170705/fbc441c3/attachment.sig>

From rostedt at goodmis.org  Wed Jul  5 15:20:47 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 11:20:47 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499267389.3668.16.camel@HansenPartnership.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
	<1499267389.3668.16.camel@HansenPartnership.com>
Message-ID: <20170705112047.23ee09f6@gandalf.local.home>

On Wed, 05 Jul 2017 08:09:49 -0700
James Bottomley <James.Bottomley at HansenPartnership.com> wrote:

 
> > > you're protecting was accessed outside the lock, which is the usual
> > > source of concurrency problems. ?In other words lockdep is useful
> > > but it's not a panacea.  
> > 
> > Still not an excuse to not have lockdep enabled during tests.  
> 
> OK, what makes you think lockdep isn't enabled? ?Since Kconfig is so
> complex, I usually use a distro config ... they have it enabled (or at
> least openSUSE does), so it's enabled for everything I do.

openSuSE has it enabled? I hope not for its production config, as
lockdep has a huge performance penalty.

I'm thinking you don't have it enabled. What config are you looking at?
The actual config that does the testing of locks is
CONFIG_PROVE_LOCKING, which selects LOCKDEP to be compiled in.

-- Steve

From rostedt at goodmis.org  Wed Jul  5 15:27:07 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 11:27:07 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
Message-ID: <20170705112707.54d7f345@gandalf.local.home>

On Wed, 5 Jul 2017 08:16:33 -0700
Guenter Roeck <linux at roeck-us.net> wrote:

> The reproducers for several of the usb fixes I submitted recently took hours of
> stress test to reproduce the underlying problems. I have one more to fix which
> takes days to reproduce, if at all (I have seen that problem only two or three
> times during weeks of stress test). Due to the nature of the problems, reproducing
> them heavily depended on the underlying hardware. None of the reproducers can
> guarantee that the problem is fixed; they are intended to show the problem,
> not that it is fixed. This happens a lot with race conditions - in many cases
> it is impossible to prove that the problem is fixed; one can only prove that
> it still exists.
> 
> Echoing what you said, I have no idea how it would even be possible to write
> unit tests to verify if the problems I fixed are really fixed.
> 
> Several of the fixes I have submitted are based on single-instance error logs with
> no reproducer. Many others are compile time fixes or fix problems found with code
> inspection (manual or automatic).
> 
> If we start shaming people for not providing unit tests, all we'll accomplish is
> that people will stop providing bug fixes.

I need to be clearer on this. What I meant was, if there's a bug
where someone has a test that easily reproduces the bug, then if
there's not a test added to selftests for said bug, then we should
shame those into doing so.

A bug that is found by inspection or hard to reproduce test cases are
not applicable, as they don't have tests that can show a regression.

And I'm betting that those bugs are NOT REGRESSIONS! Most likely are
bugs that always existed, but because of the unpredictable hitting of
the bug (as you said, it required hours of stress tests to reproduce),
the bug was not previously hit during development. That's not a
regression, that's a feature.

Are we tracking regressions or just simply bugs?

-- Steve

From carlos at redhat.com  Wed Jul  5 13:09:51 2017
From: carlos at redhat.com (Carlos O'Donell)
Date: Wed, 5 Jul 2017 09:09:51 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705084528.67499f8c@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
Message-ID: <4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>

On 07/05/2017 08:45 AM, Steven Rostedt wrote:
> I'm betting there's a lot of reproducer code that never makes it into a
> test. How do we solve that? Perhaps we need people looking at LKML for
> any signs "I did this, and it caused a bug" or "Here's a test case
> which can trigger the bug". Each of these instances should end up in
> selftests, and I'm sure they are not.
> 
> We can't do much for special hardware, even though those tests should
> still be in the selftests for those that have the hardware, but we can
> do something about special configs. Perhaps selfttests should have a
> "config test" section. I have that in my own tests, but I use ktest to
> build them.

This problem is a reflection of our own explicit or implicit priorities.
The priorities of developers and reviewers needs to change to make an
impact on the problem. This is a hard problem.

As a concrete action item, glibc core developers took a harder stance on
(a) all user-visible bugs need a bug # (forces people to think about the
problem and file a coherent public bug about it) (b) all bugs needs a
regression test if possible, (c) and if not possible we need to extend
the testing framework to make it possible (we've started using kernel
namespaces to create isolated test configurations).

This change in reviewer priorities has had a noticeable impact on developer
priorities over the last 5 years. Timelines for this problem will be
measured in years.

-- 
Cheers,
Carlos.

From carlos at redhat.com  Wed Jul  5 14:06:24 2017
From: carlos at redhat.com (Carlos O'Donell)
Date: Wed, 5 Jul 2017 10:06:24 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705092757.63dc2328@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
Message-ID: <9b377a08-bf38-b41e-040c-41cb078bcfc3@redhat.com>

On 07/05/2017 09:27 AM, Steven Rostedt wrote:
>> As a concrete action item, glibc core developers took a harder stance on
>> (a) all user-visible bugs need a bug # (forces people to think about the
> 
> Unfortunately, we don't have a good system for a "bug #". Most kernel
> developers hate bugzilla, and I think that includes Linus ;-) Which
> means, unless Linus builds us a new bug tracking system, there wont be
> any mandate for it.

Use the XMLRPC API to build a better interface for kernel developers?

Our "fixed bugs" list is automatically culled via XMLRPC to generate our
release announcement with "fixed bugs."

The bug # mandate has had a few key effects. It allows non-developers to
search for old similar regressions in an easier fashion than having to
trawl the mailing list for incomprehensible (to them) discussions about
semantics. The bugs are described and talked about in terms of user
facing aspects, not internal implementation details. Regressed bugs can
be reopened and discussed on the mailing list with links to the discussions
and summaries of conclusions.

All of this means we have a cleaner, clearer, description of the problem
from the user side. This again needs priority from a group of people for
whom time is precious, so you have to get buy in from them.

I don't think (a) is needed, but the glibc community found it helpful.
 
>> problem and file a coherent public bug about it) (b) all bugs needs a
>> regression test if possible, (c) and if not possible we need to extend
> 
> I would love all bug fixes to come with a test (when possible).

We have lots of hardware-specific tests that are marked UNSUPPORTED if
say you're not running on AVX512 enabled hardware.

>> the testing framework to make it possible (we've started using kernel
>> namespaces to create isolated test configurations).
> 
> Well, we have a selftest directory that should include all of these.
> And most people run them on either a test box or a VM.
Improving the test infrastructure must also be a priority, otherwise you
will grow to the limit of that infrastructure.

>> This change in reviewer priorities has had a noticeable impact on developer
>> priorities over the last 5 years. Timelines for this problem will be
>> measured in years.
> 
> Your "b" above is what I would like to push. But who's going to enforce
> this? With 10,000 changes per release, and a lot of them are fixes, the
> best we can do is the honor system. Start shaming people that don't
> have a regression test along with a Fixes tag (but we don't want people
> to fix bugs without adding that tag either). There is a fine line one
> must walk between getting people to change their approaches to bugs and
> regression tests, and pissing them off where they start doing the
> opposite of what would be best for the community.

I did say "hard problem" earlier didn't I?

* Start with yourself.

* For everyone you know well, and have met in person, be brutal and
  require them to submit regression tests with their bug fixes. These
  people are already committed to getting their fixes in and they will
  understand you are making an example of them.

* For everyone you don't know well, be gentle, and begin reminding
  them you need a regression test, and if you feel generous try to write
  one yourself for them. Often the act of writing such a test will show
  you how hard it is, and what is missing from your infrastructure to make
  this easy, because if it was easy everyone would do it.

YMMV.

-- 
Cheers,
Carlos.

From carlos at redhat.com  Wed Jul  5 14:28:30 2017
From: carlos at redhat.com (Carlos O'Donell)
Date: Wed, 5 Jul 2017 10:28:30 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705140607.GA30187@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
Message-ID: <6401b327-cc2c-5e0a-716b-0b9ea70adcb0@redhat.com>

On 07/05/2017 10:06 AM, Greg KH wrote:
> I don't mean to poo-poo the idea, but please realize that around 75% of
> the kernel is hardware/arch support, so that means that 75% of the
> changes/fixes deal with hardware things (yes, change is in direct
> correlation to size of the codebase in the tree, strange but true).

We should distinguish between the reviewer reviewing the regression test
and running the regression test. As long as the submitter ran the 
regression test on their hardware, and it passed, the reviewer need only
review the test for logical consistency and correctness?

Lack of test infrastructure was a serious problem for us in glibc. We are
relying on namespaces for more complex network and filesystem testing.
Without namespaces we would have needed a much more complex setup that
might never have seen developer adoption. When I attended LPC 2016 I
prioritized listening in on namespaces discussions to make sure nothing
was changing that might break our testing framework.

This conversation is going to lead down the path of driver HAL or
emulation in order to provide regression testing for code above the actual
hardware, and that's another hard problem, but one need not go there.

Starting with real hardware tests can have benefit.

In glibc we test SSE, AVX, AVX512, TSX etc. but if you don't have the
extensions you get a bunch of UNSUPPORTED tests. While upstream kernel 
may have a more limited set of available hardware per-person, the
collective set of developers has hardware to cover all configurations,
and they should run the regression tests for hardware they care about
... and *must* do so if they submit a patch to fix a bug! :-)

-- 
Cheers,
Carlos.

From carlos at redhat.com  Wed Jul  5 15:08:48 2017
From: carlos at redhat.com (Carlos O'Donell)
Date: Wed, 5 Jul 2017 11:08:48 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705103335.0cbd9984@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705103335.0cbd9984@gandalf.local.home>
Message-ID: <8c6843e8-73d9-a898-0366-0b72dfeb79a2@redhat.com>

On 07/05/2017 10:33 AM, Steven Rostedt wrote:
> No test should be written for a single specific hardware. It should be a
> general functionality that different hardware can execute.

Why? We test all sorts of hardware in userspace and we see value in that
testing.

-- 
Cheers,
Carlos.

From James.Bottomley at HansenPartnership.com  Wed Jul  5 15:32:28 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Wed, 05 Jul 2017 08:32:28 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705112047.23ee09f6@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
	<1499267389.3668.16.camel@HansenPartnership.com>
	<20170705112047.23ee09f6@gandalf.local.home>
Message-ID: <1499268748.3668.20.camel@HansenPartnership.com>

On Wed, 2017-07-05 at 11:20 -0400, Steven Rostedt wrote:
> On Wed, 05 Jul 2017 08:09:49 -0700
> James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
> 
> ?
> > 
> > > 
> > > > 
> > > > you're protecting was accessed outside the lock, which is the
> > > > usual source of concurrency problems. ?In other words lockdep
> > > > is useful but it's not a panacea. ?
> > > 
> > > Still not an excuse to not have lockdep enabled during tests.??
> > 
> > OK, what makes you think lockdep isn't enabled? ?Since Kconfig is
> > so complex, I usually use a distro config ... they have it enabled
> > (or at least openSUSE does), so it's enabled for everything I do.
> 
> openSuSE has it enabled? I hope not for its production config, as
> lockdep has a huge performance penalty.

Then, surely, it's the last thing we want when tracking down race
conditgions since it will alter timings dramatically.

> I'm thinking you don't have it enabled. What config are you looking
> at? The actual config that does the testing of locks is
> CONFIG_PROVE_LOCKING, which selects LOCKDEP to be compiled in.

This is what it has:

jejb at jarvis:~/git/linux-build> grep LOCKDEP /boot/config-4.4.73-18.17-default?
CONFIG_LOCKDEP_SUPPORT=y

James


From greg at kroah.com  Wed Jul  5 15:32:59 2017
From: greg at kroah.com (Greg KH)
Date: Wed, 5 Jul 2017 17:32:59 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
Message-ID: <20170705153259.GA7265@kroah.com>

On Wed, Jul 05, 2017 at 08:16:33AM -0700, Guenter Roeck wrote:
> If we start shaming people for not providing unit tests, all we'll accomplish is
> that people will stop providing bug fixes.

Yes, this is the key!

Steven, just look at everything marked with a "Fixes:" or "stable@" tag
from 4.12-rc1..4.12 and try to determine how you would write a test for
the majority of them.

Yes, for some subsystems this can work (look at xfstests as one great
example for filesystems, same for the i915 tests), but for the majority
of the kernel, at this point in time, it doesn't make sense.

So take Carlos's advice, start small, do it for your subsystem if you
don't touch hardware (easy peasy, right?), and let's see how it goes,
and see if we have the infrastructure to do it even today.  Right now,
kselftests is finally getting a unified output format, which is great,
it shows that people are starting to use and rely on it.  What else will
we need to make this more widely used, we don't know yet...

thanks,

greg k-h

From carlos at redhat.com  Wed Jul  5 15:36:23 2017
From: carlos at redhat.com (Carlos O'Donell)
Date: Wed, 5 Jul 2017 11:36:23 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705153259.GA7265@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
Message-ID: <a0b4b680-b07b-9060-defb-a74db18c26fb@redhat.com>

On 07/05/2017 11:32 AM, Greg KH wrote:
> So take Carlos's advice, start small, do it for your subsystem if you
> don't touch hardware (easy peasy, right?), and let's see how it goes,
> and see if we have the infrastructure to do it even today.  Right now,
> kselftests is finally getting a unified output format, which is great,
> it shows that people are starting to use and rely on it.  What else will
> we need to make this more widely used, we don't know yet...

+1 ;-)

-- 
Cheers,
Carlos.

From James.Bottomley at HansenPartnership.com  Wed Jul  5 15:36:55 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Wed, 05 Jul 2017 08:36:55 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705112707.54d7f345@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
Message-ID: <1499269015.3668.25.camel@HansenPartnership.com>

On Wed, 2017-07-05 at 11:27 -0400, Steven Rostedt wrote:
> On Wed, 5 Jul 2017 08:16:33 -0700
> Guenter Roeck <linux at roeck-us.net> wrote:
> 
> > 
> > The reproducers for several of the usb fixes I submitted recently
> > took hours of stress test to reproduce the underlying problems. I
> > have one more to fix which takes days to reproduce, if at all (I
> > have seen that problem only two or three times during weeks of
> > stress test). Due to the nature of the problems, reproducing
> > them heavily depended on the underlying hardware. None of the
> > reproducers can guarantee that the problem is fixed; they are
> > intended to show the problem, not that it is fixed. This happens a
> > lot with race conditions - in many cases it is impossible to prove
> > that the problem is fixed; one can only prove that it still exists.
> > 
> > Echoing what you said, I have no idea how it would even be possible
> > to write unit tests to verify if the problems I fixed are really
> > fixed.
> > 
> > Several of the fixes I have submitted are based on single-instance
> > error logs with no reproducer. Many others are compile time fixes
> > or fix problems found with code inspection (manual or automatic).
> > 
> > If we start shaming people for not providing unit tests, all we'll
> > accomplish is that people will stop providing bug fixes.
> 
> I need to be clearer on this. What I meant was, if there's a bug
> where someone has a test that easily reproduces the bug, then if
> there's not a test added to selftests for said bug, then we should
> shame those into doing so.
> 
> A bug that is found by inspection or hard to reproduce test cases are
> not applicable, as they don't have tests that can show a regression.
> 
> And I'm betting that those bugs are NOT REGRESSIONS! Most likely are
> bugs that always existed, but because of the unpredictable hitting of
> the bug (as you said, it required hours of stress tests to
> reproduce), the bug was not previously hit during development. That's
> not a regression, that's a feature.
> 
> Are we tracking regressions or just simply bugs?

A lot of device driver regressions are bugs that previously existed in
the code but which didn't manifest until something else happened. ?A
huge number of locking and timing issues are like this. ?The irony is
that a lot of them go from race always being won (so bug never noticed)
to race being lost often enough to make something unusable. ?To a user
that ends up being a kernel regression because it's a bug in the
current kernel which they didn't see previously which makes it unusable
for them.

I've got to vote with my users here: that's a regression not a
"feature".

James


From geert at linux-m68k.org  Wed Jul  5 15:40:44 2017
From: geert at linux-m68k.org (Geert Uytterhoeven)
Date: Wed, 5 Jul 2017 17:40:44 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705152026.rkw73q2f6xmiju37@sirena.org.uk>
References: <ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
	<1499267389.3668.16.camel@HansenPartnership.com>
	<20170705152026.rkw73q2f6xmiju37@sirena.org.uk>
Message-ID: <CAMuHMdU401+nZxJJU2O=x33LGU4iEAGNyYmhWvc2jCcFGdsLRw@mail.gmail.com>

On Wed, Jul 5, 2017 at 5:20 PM, Mark Brown <broonie at kernel.org> wrote:
> On Wed, Jul 05, 2017 at 08:09:49AM -0700, James Bottomley wrote:
>> On Wed, 2017-07-05 at 10:56 -0400, Steven Rostedt wrote:
>> > James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
>
>> > > you're protecting was accessed outside the lock, which is the usual
>> > > source of concurrency problems.  In other words lockdep is useful
>> > > but it's not a panacea.
>
>> > Still not an excuse to not have lockdep enabled during tests.
>
>> OK, what makes you think lockdep isn't enabled?  Since Kconfig is so
>> complex, I usually use a distro config ... they have it enabled (or at
>> least openSUSE does), so it's enabled for everything I do.
>
> Yeah, I see enough reports with it in embedded contexts to make me think
> people use it there.  I know I tend to have it turned on most of the
> time.  The concurrency stuff I'm thinking of here is more the things
> you're mentioning with just not taking locks at all when they are needed
> or concurrency with hardware.

I try to have it enabled as much as possible.
However, as it increases kernel size (huge static tables), hitting boot
loader limitations on several boards, I cannot enable all debugging
I would like to on all boards.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

From rostedt at goodmis.org  Wed Jul  5 15:43:16 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 11:43:16 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499268748.3668.20.camel@HansenPartnership.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
	<1499267389.3668.16.camel@HansenPartnership.com>
	<20170705112047.23ee09f6@gandalf.local.home>
	<1499268748.3668.20.camel@HansenPartnership.com>
Message-ID: <20170705114316.424a9e28@gandalf.local.home>

On Wed, 05 Jul 2017 08:32:28 -0700
James Bottomley <James.Bottomley at HansenPartnership.com> wrote:


> > openSuSE has it enabled? I hope not for its production config, as
> > lockdep has a huge performance penalty.  
> 
> Then, surely, it's the last thing we want when tracking down race
> conditgions since it will alter timings dramatically.

It's to be run when you want to make sure locking order is at least not
an issue. And it's not about running when tracking down race
conditions, its to be run when developing new code.


> 
> > I'm thinking you don't have it enabled. What config are you looking
> > at? The actual config that does the testing of locks is
> > CONFIG_PROVE_LOCKING, which selects LOCKDEP to be compiled in.  
> 
> This is what it has:
> 
> jejb at jarvis:~/git/linux-build> grep LOCKDEP /boot/config-4.4.73-18.17-default?
> CONFIG_LOCKDEP_SUPPORT=y

That means your architecture supports it, it's not enabled.

-- Steve

From broonie at kernel.org  Wed Jul  5 15:47:47 2017
From: broonie at kernel.org (Mark Brown)
Date: Wed, 5 Jul 2017 16:47:47 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
Message-ID: <20170705154747.gtu6v5rrol2xrgbx@sirena.org.uk>

On Wed, Jul 05, 2017 at 09:09:51AM -0400, Carlos O'Donell wrote:

> This problem is a reflection of our own explicit or implicit priorities.
> The priorities of developers and reviewers needs to change to make an
> impact on the problem. This is a hard problem.

Take a look at the trajectory for the build and boot testing for a
concrete example of this - the failure rates go down over time but it's
not a quick process.

> As a concrete action item, glibc core developers took a harder stance on
> (a) all user-visible bugs need a bug # (forces people to think about the
> problem and file a coherent public bug about it) (b) all bugs needs a
> regression test if possible, (c) and if not possible we need to extend
> the testing framework to make it possible (we've started using kernel
> namespaces to create isolated test configurations).

One thing I'd really like to see here is an equivalent of the build and
boot testing we currently have that exercises some of the testsuites on
a regular basis so we can push on keeping them running cleanly.  As well
as just the intrisic value of the tests themselves I'd hope that a
visible practical interest would help push more activity in this area.

There's a couple of efforts I'm aware of in this area, hopefully one or
both of them will start delivering.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170705/76afe7f9/attachment-0001.sig>

From rostedt at goodmis.org  Wed Jul  5 15:52:19 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 11:52:19 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705153259.GA7265@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
Message-ID: <20170705115219.02370220@gandalf.local.home>

On Wed, 5 Jul 2017 17:32:59 +0200
Greg KH <greg at kroah.com> wrote:

> On Wed, Jul 05, 2017 at 08:16:33AM -0700, Guenter Roeck wrote:
> > If we start shaming people for not providing unit tests, all we'll accomplish is
> > that people will stop providing bug fixes.  
> 
> Yes, this is the key!

And I mentioned this in my initial email.

> 
> Steven, just look at everything marked with a "Fixes:" or "stable@" tag
> from 4.12-rc1..4.12 and try to determine how you would write a test for
> the majority of them.

It only makes sense if there's a reproducible case. For cases where
stress testing is required and you hope to hit the bug, well, that's
never an easy answer, and this is not something that will fix it.

> 
> Yes, for some subsystems this can work (look at xfstests as one great
> example for filesystems, same for the i915 tests), but for the majority
> of the kernel, at this point in time, it doesn't make sense.

I already do. Actually, I have just fixed a bug that I need to add a
selftest for. Yes, it is easier for non hardware, but for cases which
has specs on hardware behavior, why can't we have tests to test if the
hardware matches the spec?

Everyone is focusing on that "shaming" comment and not looking at the
rest of what I wrote. My main point is, there's a lot of reproducers in
change logs or emails that are not in selftests. There's no excuse for
that. Lets fix that issue, and not go into a bike shedding fight about
the entire approach.

> 
> So take Carlos's advice, start small, do it for your subsystem if you

Yes, lets start small. What do you think about all reproducers getting
into selftests? If it's not 100% reproducing, then it's up to the
individual, but any test that can trigger a bug 100% should be added.

I'd like to expand selftests to include configs too. If there's a
config that triggers a bug, that should be added to a list of "configs"
to be tested as well.

> don't touch hardware (easy peasy, right?), and let's see how it goes,
> and see if we have the infrastructure to do it even today.  Right now,
> kselftests is finally getting a unified output format, which is great,
> it shows that people are starting to use and rely on it.  What else will
> we need to make this more widely used, we don't know yet...

I've been using selftests for ftrace for some time. I have my own tests
that I run (which do test any config that has failed me in the past),
and I'm slowing getting those into the selftests directory as well.

-- Steve


From rostedt at goodmis.org  Wed Jul  5 16:04:59 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 12:04:59 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499269015.3668.25.camel@HansenPartnership.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<1499269015.3668.25.camel@HansenPartnership.com>
Message-ID: <20170705120459.41e81f7b@gandalf.local.home>

On Wed, 05 Jul 2017 08:36:55 -0700
James Bottomley <James.Bottomley at HansenPartnership.com> wrote:

> > Are we tracking regressions or just simply bugs?  
> 
> A lot of device driver regressions are bugs that previously existed in
> the code but which didn't manifest until something else happened. ?A
> huge number of locking and timing issues are like this. ?The irony is
> that a lot of them go from race always being won (so bug never noticed)
> to race being lost often enough to make something unusable. ?To a user
> that ends up being a kernel regression because it's a bug in the
> current kernel which they didn't see previously which makes it unusable
> for them.
> 
> I've got to vote with my users here: that's a regression not a
> "feature".

Let's take a step back. What exactly is the problem?

The regressions that we want to track? Why are they not fixed? Is it
because they are hard to reproduce? If so, how do we know they are a
regression or just some hard to hit bug? If it's hard to hit, how do we
know we fixed it?

What exactly are the questions we want solved.

Granted, I used this thread to push more use of kselftests, and I don't
see any SCSI tests there at all!

-- Steve

From rostedt at goodmis.org  Wed Jul  5 16:10:00 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 12:10:00 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <8c6843e8-73d9-a898-0366-0b72dfeb79a2@redhat.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705103335.0cbd9984@gandalf.local.home>
	<8c6843e8-73d9-a898-0366-0b72dfeb79a2@redhat.com>
Message-ID: <20170705121000.5430d7d0@gandalf.local.home>

On Wed, 5 Jul 2017 11:08:48 -0400
Carlos O'Donell <carlos at redhat.com> wrote:

> On 07/05/2017 10:33 AM, Steven Rostedt wrote:
> > No test should be written for a single specific hardware. It should be a
> > general functionality that different hardware can execute.  
> 
> Why? We test all sorts of hardware in userspace and we see value in that
> testing.
> 

One reason is for bit rot. I'm not totally against it. But I envision
that if we have hundreds of tests for very specific pieces of hardware,
it's value will diminish over time. Unless we can get a good
infrastructure written where the hardware info is more of a data sheet
then a single test itself.

-- Steve

From linux at roeck-us.net  Wed Jul  5 16:48:31 2017
From: linux at roeck-us.net (Guenter Roeck)
Date: Wed, 5 Jul 2017 09:48:31 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705112707.54d7f345@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
Message-ID: <c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>

On 07/05/2017 08:27 AM, Steven Rostedt wrote:
> On Wed, 5 Jul 2017 08:16:33 -0700
> Guenter Roeck <linux at roeck-us.net> wrote:
[ ... ]
>>
>> If we start shaming people for not providing unit tests, all we'll accomplish is
>> that people will stop providing bug fixes.
> 
> I need to be clearer on this. What I meant was, if there's a bug
> where someone has a test that easily reproduces the bug, then if
> there's not a test added to selftests for said bug, then we should
> shame those into doing so.
> 

I don't think that public shaming of kernel developers is going to work
any better than public shaming of children or teenagers.

Maybe a friendlier approach would be more useful ?

If a test to reproduce a problem exists, it might be more beneficial to suggest
to the patch submitter that it would be great if that test would be submitted
as unit test instead of shaming that person for not doing so. Acknowledging and
praising kselftest submissions might help more than shaming for non-submissions.

> A bug that is found by inspection or hard to reproduce test cases are
> not applicable, as they don't have tests that can show a regression.
> 

My concern would be that once the shaming starts, it won't stop.

Guenter

From dan.j.williams at intel.com  Wed Jul  5 16:54:29 2017
From: dan.j.williams at intel.com (Dan Williams)
Date: Wed, 5 Jul 2017 09:54:29 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705140607.GA30187@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
Message-ID: <CAPcyv4iOV2-hndx1rQmpPQF+myp=P8rmpf5JhXQXZxPhR6qoQw@mail.gmail.com>

On Wed, Jul 5, 2017 at 7:06 AM, Greg KH <greg at kroah.com> wrote:
> On Wed, Jul 05, 2017 at 09:27:57AM -0400, Steven Rostedt wrote:
>> Your "b" above is what I would like to push. But who's going to enforce
>> this? With 10,000 changes per release, and a lot of them are fixes, the
>> best we can do is the honor system. Start shaming people that don't
>> have a regression test along with a Fixes tag (but we don't want people
>> to fix bugs without adding that tag either). There is a fine line one
>> must walk between getting people to change their approaches to bugs and
>> regression tests, and pissing them off where they start doing the
>> opposite of what would be best for the community.
>
> I would bet, for the huge majority of our fixes, they are fixes for
> specific hardware, or workarounds for specific hardware issues.  Now
> writing tests for those is not an impossible task (look at what the i915
> developers have), but it is very very hard overall, especially if the
> base infrastructure isn't there to do it.
>
> For specific examples, here's the shortlog for fixes that went into
> drivers/usb/host/ for 4.12 after 4.12-rc1 came out.  Do you know of a
> way to write a test for these types of things?
>         usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
>         usb: xhci: Fix USB 3.1 supported protocol parsing
>         usb: host: xhci-plat: propagate return value of platform_get_irq()
>         xhci: Fix command ring stop regression in 4.11
>         xhci: remove GFP_DMA flag from allocation
>         USB: xhci: fix lock-inversion problem
>         usb: host: xhci-ring: don't need to clear interrupt pending for MSI enabled hcd
>         usb: host: xhci-mem: allocate zeroed Scratchpad Buffer
>         xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton
>         usb: xhci: trace URB before giving it back instead of after
>         USB: host: xhci: use max-port define
>         USB: ehci-platform: fix companion-device leak
>         usb: r8a66597-hcd: select a different endpoint on timeout
>         usb: r8a66597-hcd: decrease timeout

I wrote some test infrastructure to go after xhci TRB boundary
conditions [1]. So, yes, some of these are possible to unit test, but
of course not all.

[1]: http://marc.info/?l=linux-usb&m=140872785411304&w=2

From dan.j.williams at intel.com  Wed Jul  5 16:58:06 2017
From: dan.j.williams at intel.com (Dan Williams)
Date: Wed, 5 Jul 2017 09:58:06 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
Message-ID: <CAPcyv4ips5B1DjEPjpO2JDdt+g-5hEZzvV1b1SbVJZxVpweHpQ@mail.gmail.com>

On Wed, Jul 5, 2017 at 9:48 AM, Guenter Roeck <linux at roeck-us.net> wrote:
> On 07/05/2017 08:27 AM, Steven Rostedt wrote:
>>
>> On Wed, 5 Jul 2017 08:16:33 -0700
>> Guenter Roeck <linux at roeck-us.net> wrote:
>
> [ ... ]
>>>
>>>
>>> If we start shaming people for not providing unit tests, all we'll
>>> accomplish is
>>> that people will stop providing bug fixes.
>>
>>
>> I need to be clearer on this. What I meant was, if there's a bug
>> where someone has a test that easily reproduces the bug, then if
>> there's not a test added to selftests for said bug, then we should
>> shame those into doing so.
>>
>
> I don't think that public shaming of kernel developers is going to work
> any better than public shaming of children or teenagers.
>
> Maybe a friendlier approach would be more useful ?
>
> If a test to reproduce a problem exists, it might be more beneficial to
> suggest
> to the patch submitter that it would be great if that test would be
> submitted
> as unit test instead of shaming that person for not doing so. Acknowledging
> and
> praising kselftest submissions might help more than shaming for
> non-submissions.
>
>> A bug that is found by inspection or hard to reproduce test cases are
>> not applicable, as they don't have tests that can show a regression.
>>
>
> My concern would be that once the shaming starts, it won't stop.

Agreed, this shouldn't be a new burden for maintainers, this should be
a contribution path for new kernel developers. Go beyond our standard
"fix a bug" advice, which is a great advice, and also recommend
"backstop a regression with a unit test".

From James.Bottomley at HansenPartnership.com  Wed Jul  5 16:58:28 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Wed, 05 Jul 2017 09:58:28 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705120459.41e81f7b@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<1499269015.3668.25.camel@HansenPartnership.com>
	<20170705120459.41e81f7b@gandalf.local.home>
Message-ID: <1499273908.3668.30.camel@HansenPartnership.com>

On Wed, 2017-07-05 at 12:04 -0400, Steven Rostedt wrote:
> On Wed, 05 Jul 2017 08:36:55 -0700
> James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
> 
> > 
> > > 
> > > Are we tracking regressions or just simply bugs???
> > 
> > A lot of device driver regressions are bugs that previously existed
> > in the code but which didn't manifest until something else
> > happened. ?A huge number of locking and timing issues are like
> > this. ?The irony is that a lot of them go from race always being
> > won (so bug never noticed) to race being lost often enough to make
> > something unusable. ?To a user that ends up being a kernel
> > regression because it's a bug in the current kernel which they
> > didn't see previously which makes it unusable for them.
> > 
> > I've got to vote with my users here: that's a regression not a
> > "feature".
> 
> Let's take a step back. What exactly is the problem?

You mean what question was I answering? ?It was your "is your problem a
regression?" one.

> The regressions that we want to track? Why are they not fixed? Is it
> because they are hard to reproduce? If so, how do we know they are a
> regression or just some hard to hit bug? If it's hard to hit, how do
> we know we fixed it?

Usually for the exposed races we develop a theoretical model which
tells us what the problem is and also the solution.

> What exactly are the questions we want solved.

In the context of this subthread? ?Tracking and fixing of regressions
meaning behaviour that damages or destroys usability of version k+1
that wasn't present in version k.

> Granted, I used this thread to push more use of kselftests, and I
> don't see any SCSI tests there at all!

It would be an interesting question for another thread to consider
whether that's a problem or not.

James


From rostedt at goodmis.org  Wed Jul  5 17:02:00 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 13:02:00 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
Message-ID: <20170705130200.7c653f61@gandalf.local.home>

On Wed, 5 Jul 2017 09:48:31 -0700
Guenter Roeck <linux at roeck-us.net> wrote:

> On 07/05/2017 08:27 AM, Steven Rostedt wrote:
> > On Wed, 5 Jul 2017 08:16:33 -0700
> > Guenter Roeck <linux at roeck-us.net> wrote:  
> [ ... ]
> >>
> >> If we start shaming people for not providing unit tests, all we'll accomplish is
> >> that people will stop providing bug fixes.  
> > 
> > I need to be clearer on this. What I meant was, if there's a bug
> > where someone has a test that easily reproduces the bug, then if
> > there's not a test added to selftests for said bug, then we should
> > shame those into doing so.
> >   
> 
> I don't think that public shaming of kernel developers is going to work
> any better than public shaming of children or teenagers.
> 
> Maybe a friendlier approach would be more useful ?

I'm a friendly shamer ;-)

> 
> If a test to reproduce a problem exists, it might be more beneficial to suggest
> to the patch submitter that it would be great if that test would be submitted
> as unit test instead of shaming that person for not doing so. Acknowledging and
> praising kselftest submissions might help more than shaming for non-submissions.
> 
> > A bug that is found by inspection or hard to reproduce test cases are
> > not applicable, as they don't have tests that can show a regression.
> >   
> 
> My concern would be that once the shaming starts, it won't stop.

I think this is a communication issue. My word for "shaming" was to
call out a developer for not submitting a test. It wasn't about making
fun of them, or anything like that. I was only making a point
about how to teach people that they need to be more aware of the
testing infrastructure. Not about actually demeaning people.

Lets take a hypothetical sample. Say someone posted a bug report with
an associated reproducer for it. The developer then runs the reproducer
sees the bug, makes a fix and sends it to Linus and stable. Now the
developer forgets this and continues on their merry way. Along comes
someone like myself and sees a reproducing test case for a bug, but
sees no test added to kselftests. I would send an email along the lines
of "Hi, I noticed that there was a reproducer for this bug you fixed.
How come there was no test added to the kselftests to make sure it
doesn't appear again?" There, I "shamed" them ;-)

-- Steve

From rostedt at goodmis.org  Wed Jul  5 17:07:24 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 5 Jul 2017 13:07:24 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499273908.3668.30.camel@HansenPartnership.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<1499269015.3668.25.camel@HansenPartnership.com>
	<20170705120459.41e81f7b@gandalf.local.home>
	<1499273908.3668.30.camel@HansenPartnership.com>
Message-ID: <20170705130724.66637518@gandalf.local.home>

On Wed, 05 Jul 2017 09:58:28 -0700
James Bottomley <James.Bottomley at HansenPartnership.com> wrote:

> On Wed, 2017-07-05 at 12:04 -0400, Steven Rostedt wrote:
> > On Wed, 05 Jul 2017 08:36:55 -0700
> > James Bottomley <James.Bottomley at HansenPartnership.com> wrote:
> >   
> > >   
> > > > 
> > > > Are we tracking regressions or just simply bugs???  
> > > 
> > > A lot of device driver regressions are bugs that previously existed
> > > in the code but which didn't manifest until something else
> > > happened. ?A huge number of locking and timing issues are like
> > > this. ?The irony is that a lot of them go from race always being
> > > won (so bug never noticed) to race being lost often enough to make
> > > something unusable. ?To a user that ends up being a kernel
> > > regression because it's a bug in the current kernel which they
> > > didn't see previously which makes it unusable for them.
> > > 
> > > I've got to vote with my users here: that's a regression not a
> > > "feature".  
> > 
> > Let's take a step back. What exactly is the problem?  
> 
> You mean what question was I answering? ?It was your "is your problem a
> regression?" one.

No that's not what I meant. I mean that we are going off tangent to the
original topic.

> 
> > The regressions that we want to track? Why are they not fixed? Is it
> > because they are hard to reproduce? If so, how do we know they are a
> > regression or just some hard to hit bug? If it's hard to hit, how do
> > we know we fixed it?  
> 
> Usually for the exposed races we develop a theoretical model which
> tells us what the problem is and also the solution.

I think the problem is that the regressions that are not being fixed
happen to be where we have no model to create, as the problem may be
too hard to hit, and it could just be a "works for me" issue.

> 
> > What exactly are the questions we want solved.  
> 
> In the context of this subthread? ?Tracking and fixing of regressions
> meaning behaviour that damages or destroys usability of version k+1
> that wasn't present in version k.

Agreed with this part. And I believe this is also in the context of the
entire thread.

> 
> > Granted, I used this thread to push more use of kselftests, and I
> > don't see any SCSI tests there at all!  
> 
> It would be an interesting question for another thread to consider
> whether that's a problem or not.

It's not a problem for me, but it begs the question of whether it would
be useful or not. But I agree, that's for another thread.

-- Steve

From daniel.vetter at ffwll.ch  Wed Jul  5 18:17:05 2017
From: daniel.vetter at ffwll.ch (Daniel Vetter)
Date: Wed, 5 Jul 2017 20:17:05 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705103658.226099c6@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
Message-ID: <CAKMK7uF=mMHYfDSv2PfPT1xx6nrL8DEAk3XSyZZ6qfxRZXRVKg@mail.gmail.com>

On Wed, Jul 5, 2017 at 4:36 PM, Steven Rostedt <rostedt at goodmis.org> wrote:
> On Wed, 5 Jul 2017 15:33:41 +0100
> Mark Brown <broonie at kernel.org> wrote:
>> On Wed, Jul 05, 2017 at 04:06:07PM +0200, Greg KH wrote:
>> > I don't mean to poo-poo the idea, but please realize that around 75% of
>> > the kernel is hardware/arch support, so that means that 75% of the
>> > changes/fixes deal with hardware things (yes, change is in direct
>> > correlation to size of the codebase in the tree, strange but true).
>>
>> Then add in all the fixes for concurrency/locking issues and so on
>> that're hard to reliably reproduce as well...
>
> All tests should be run with lockdep enabled ;-)  Which a surprising
> few developers appear to do :-p

We're slowly working towards running the i915 testsuite with kasan
enabled as the next level of evil. It's ... interesting, to say the
least.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

From daniel.vetter at ffwll.ch  Wed Jul  5 18:24:26 2017
From: daniel.vetter at ffwll.ch (Daniel Vetter)
Date: Wed, 5 Jul 2017 20:24:26 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499267389.3668.16.camel@HansenPartnership.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705143341.oees22k2snhtmkxo@sirena.org.uk>
	<20170705103658.226099c6@gandalf.local.home>
	<1499266228.3668.10.camel@HansenPartnership.com>
	<20170705105651.5da9c969@gandalf.local.home>
	<1499267389.3668.16.camel@HansenPartnership.com>
Message-ID: <CAKMK7uEGrNH-VGh=cX5y77F-=b+VZfnBzGPh3a1MQYNuohO2pg@mail.gmail.com>

On Wed, Jul 5, 2017 at 5:09 PM, James Bottomley
<James.Bottomley at hansenpartnership.com> wrote:
>> > > All tests should be run with lockdep enabled ;-)  Which a
>> > > surprising few developers appear to do :-p
>> >
>> > Lockdep checks the locking hierarchies and makes assumptions about
>> > them which it then validates ... it doesn't tell you if the data
>> > you think
>>
>> We should probably look at adding infrastructure that helps in that.
>> RCU already has a lot of there to help know if data is being
>> protected by RCU or not.
>>
>> Hmm, maybe we could add a __rcu like type that we can associate
>> protected data with, where a config can associate access to a
>> variable with a lock being held?
>
> That's about 10x more complex than the releases/acquires/must_hold
> annotation, which we have fairly dismal coverage on.

Yeah, I've never found those useful at all. What we're trying to do in
drm code is liberally sprinkle lockdep_assert_held into accessor and
helper functions (there's lots of nontrivial stuff where you need a
little bit of computation around a pure access, so doesn't result in
ugly code). That catches a lot of these, but of course not all.

The problem with static annotations is that often the lock you need to
hold isn't statically known, and annotating the entire callchain is a
no-go as James points out. But maybe we could use such annotations
plus a gcc plugin to auto-insert the right lockdep_assert_held every
time you read/write into a given field?

That's not going to cover locking rules where the locking rules change
during the lifetime of an object, but I think even without that it
would cover a _lot_ of cases. And if your static annotation would be
allowed to chase pointers (well, just any C expression that takes the
struct pointer as parameter would be sweet) you could even annotate
fields where the protecting lock is in some parent struct.

Another thing I'm really looking forward to (but it's somehow not
moving fast) is the cross-release stuff. Too many times I've screamed
at kernel backtraces stuck in wait_event, and lockdep could have
directly told me what's wrong long before a stress test successfully
hit that race.

There's definitely a lot of room to prove more stuff in locking using tools.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

From daniel.vetter at ffwll.ch  Wed Jul  5 18:29:51 2017
From: daniel.vetter at ffwll.ch (Daniel Vetter)
Date: Wed, 5 Jul 2017 20:29:51 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705153259.GA7265@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
Message-ID: <CAKMK7uEBB0jLyr0vfXcJ5U1peSfyPYOkMJ9+2FG7Evtznz+7bg@mail.gmail.com>

On Wed, Jul 5, 2017 at 5:32 PM, Greg KH <greg at kroah.com> wrote:
> On Wed, Jul 05, 2017 at 08:16:33AM -0700, Guenter Roeck wrote:
>> If we start shaming people for not providing unit tests, all we'll accomplish is
>> that people will stop providing bug fixes.
>
> Yes, this is the key!
>
> Steven, just look at everything marked with a "Fixes:" or "stable@" tag
> from 4.12-rc1..4.12 and try to determine how you would write a test for
> the majority of them.
>
> Yes, for some subsystems this can work (look at xfstests as one great
> example for filesystems, same for the i915 tests), but for the majority
> of the kernel, at this point in time, it doesn't make sense.
>
> So take Carlos's advice, start small, do it for your subsystem if you
> don't touch hardware (easy peasy, right?), and let's see how it goes,
> and see if we have the infrastructure to do it even today.  Right now,
> kselftests is finally getting a unified output format, which is great,
> it shows that people are starting to use and rely on it.  What else will
> we need to make this more widely used, we don't know yet...

This is very hard work and takes a long time. Since 3 years I'm trying
to establish the i915 test suite as an overall drm validation set. At
least the generic parts like for the cross-driver kernel modeset
interfaces, but also allowing other drivers to test their hw specific
command submission. It's very slow going ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

From greg at kroah.com  Wed Jul  5 18:42:45 2017
From: greg at kroah.com (Greg KH)
Date: Wed, 5 Jul 2017 20:42:45 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705115219.02370220@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
	<20170705115219.02370220@gandalf.local.home>
Message-ID: <20170705184245.GA22044@kroah.com>

On Wed, Jul 05, 2017 at 11:52:19AM -0400, Steven Rostedt wrote:
> > So take Carlos's advice, start small, do it for your subsystem if you
> 
> Yes, lets start small. What do you think about all reproducers getting
> into selftests? If it's not 100% reproducing, then it's up to the
> individual, but any test that can trigger a bug 100% should be added.

That would be great.  One could argue that we should be adding the
"stack guard" testing apps to the selftest tree now, as a number of us
have them floating around in their test directories at the moment.

> I'd like to expand selftests to include configs too. If there's a
> config that triggers a bug, that should be added to a list of "configs"
> to be tested as well.

So a test needs a specific configuration?  We need a way to specify that
in a generic fashion so that all tests don't have to duplicate that
logic.  Time to write a helper function to parse /proc/config.gz :)

thanks,

greg k-h

From greg at kroah.com  Wed Jul  5 18:45:44 2017
From: greg at kroah.com (Greg KH)
Date: Wed, 5 Jul 2017 20:45:44 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <CAPcyv4iOV2-hndx1rQmpPQF+myp=P8rmpf5JhXQXZxPhR6qoQw@mail.gmail.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<CAPcyv4iOV2-hndx1rQmpPQF+myp=P8rmpf5JhXQXZxPhR6qoQw@mail.gmail.com>
Message-ID: <20170705184544.GB22044@kroah.com>

On Wed, Jul 05, 2017 at 09:54:29AM -0700, Dan Williams wrote:
> 
> I wrote some test infrastructure to go after xhci TRB boundary
> conditions [1]. So, yes, some of these are possible to unit test, but
> of course not all.
> 
> [1]: http://marc.info/?l=linux-usb&m=140872785411304&w=2

I forgot about that, what ever happened to it, any reason it never got
merged?

thanks,

greg k-h

From dan.j.williams at intel.com  Wed Jul  5 19:47:25 2017
From: dan.j.williams at intel.com (Dan Williams)
Date: Wed, 5 Jul 2017 12:47:25 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705184544.GB22044@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<CAPcyv4iOV2-hndx1rQmpPQF+myp=P8rmpf5JhXQXZxPhR6qoQw@mail.gmail.com>
	<20170705184544.GB22044@kroah.com>
Message-ID: <CAPcyv4in4xtyL0qk86QQJpofwn-+HZSsRqTAHLQ6fTmJNkOjWw@mail.gmail.com>

On Wed, Jul 5, 2017 at 11:45 AM, Greg KH <greg at kroah.com> wrote:
> On Wed, Jul 05, 2017 at 09:54:29AM -0700, Dan Williams wrote:
>>
>> I wrote some test infrastructure to go after xhci TRB boundary
>> conditions [1]. So, yes, some of these are possible to unit test, but
>> of course not all.
>>
>> [1]: http://marc.info/?l=linux-usb&m=140872785411304&w=2
>
> I forgot about that, what ever happened to it, any reason it never got
> merged?

Ran out of time before being consumed by NVDIMM stuff, but I did take
some of the lessons learned over into tools/testing/nvdimm/. I haven't
done the work to integrate that into kselftest, so far it's only
exercised by the tests in the 'ndctl' [1] utility.

[1]: https://github.com/pmem/ndctl

From broonie at kernel.org  Thu Jul  6 09:28:36 2017
From: broonie at kernel.org (Mark Brown)
Date: Thu, 6 Jul 2017 10:28:36 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705130200.7c653f61@gandalf.local.home>
References: <20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
Message-ID: <20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>

On Wed, Jul 05, 2017 at 01:02:00PM -0400, Steven Rostedt wrote:
> Guenter Roeck <linux at roeck-us.net> wrote:

> > If a test to reproduce a problem exists, it might be more beneficial to suggest
> > to the patch submitter that it would be great if that test would be submitted
> > as unit test instead of shaming that person for not doing so. Acknowledging and
> > praising kselftest submissions might help more than shaming for non-submissions.

> > My concern would be that once the shaming starts, it won't stop.

> I think this is a communication issue. My word for "shaming" was to
> call out a developer for not submitting a test. It wasn't about making
> fun of them, or anything like that. I was only making a point
> about how to teach people that they need to be more aware of the
> testing infrastructure. Not about actually demeaning people.

I think before anything like that is viable we need to show a concerted
and visible interest in actually running the tests we already have and
paying attention to the results - if people can see that they're just
checking a checkbox that will often result in low quality tests which
can do more harm than good.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170706/af1e936c/attachment.sig>

From daniel.vetter at ffwll.ch  Thu Jul  6 09:41:39 2017
From: daniel.vetter at ffwll.ch (Daniel Vetter)
Date: Thu, 6 Jul 2017 11:41:39 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
References: <20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
Message-ID: <CAKMK7uFH+Kz8Mdph=J_FCZ4LC3tzoOmwNJPpSO+snTz6p0Xz+w@mail.gmail.com>

On Thu, Jul 6, 2017 at 11:28 AM, Mark Brown <broonie at kernel.org> wrote:
> On Wed, Jul 05, 2017 at 01:02:00PM -0400, Steven Rostedt wrote:
>> Guenter Roeck <linux at roeck-us.net> wrote:
>
>> > If a test to reproduce a problem exists, it might be more beneficial to suggest
>> > to the patch submitter that it would be great if that test would be submitted
>> > as unit test instead of shaming that person for not doing so. Acknowledging and
>> > praising kselftest submissions might help more than shaming for non-submissions.
>
>> > My concern would be that once the shaming starts, it won't stop.
>
>> I think this is a communication issue. My word for "shaming" was to
>> call out a developer for not submitting a test. It wasn't about making
>> fun of them, or anything like that. I was only making a point
>> about how to teach people that they need to be more aware of the
>> testing infrastructure. Not about actually demeaning people.
>
> I think before anything like that is viable we need to show a concerted
> and visible interest in actually running the tests we already have and
> paying attention to the results - if people can see that they're just
> checking a checkbox that will often result in low quality tests which
> can do more harm than good.

+1. That pretty much means large-scale CI. The i915 test suite has
suffered quite a bit over the past years because the CI infrastructure
didn't keep up. Result is that running full CI kills pretty much every
platform there is eventually, and it's really hard to get back to a
state where the testsuite can be used to catch regressions again.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

From laurent.pinchart at ideasonboard.com  Thu Jul  6 11:34:38 2017
From: laurent.pinchart at ideasonboard.com (Laurent Pinchart)
Date: Thu, 06 Jul 2017 14:34:38 +0300
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
	regression tracking
In-Reply-To: <20170705121000.5430d7d0@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<8c6843e8-73d9-a898-0366-0b72dfeb79a2@redhat.com>
	<20170705121000.5430d7d0@gandalf.local.home>
Message-ID: <7042009.5tkGy6PEBL@avalon>

On Wednesday 05 Jul 2017 12:10:00 Steven Rostedt wrote:
> On Wed, 5 Jul 2017 11:08:48 -0400 Carlos O'Donell wrote:
> > On 07/05/2017 10:33 AM, Steven Rostedt wrote:
> > > No test should be written for a single specific hardware. It should be a
> > > general functionality that different hardware can execute.
> > 
> > Why? We test all sorts of hardware in userspace and we see value in that
> > testing.
> 
> One reason is for bit rot. I'm not totally against it. But I envision
> that if we have hundreds of tests for very specific pieces of hardware,
> it's value will diminish over time. Unless we can get a good
> infrastructure written where the hardware info is more of a data sheet
> then a single test itself.

That's all nice, but when the hardware is complex and not fully abstracted 
behind a kernel API, tests are bound to be hardware-specific. Of course, a bug 
or regression observed only with a specific device, but triggered through the 
usage of abstract APIs only, can lead to a test case written for that device 
but runnable with any device in the same category. In that case the test case 
should certainly be added to a test suite for the corresponding API/subsystem, 
not to an hidden test suite for a particular device.

-- 
Regards,

Laurent Pinchart


From dan.carpenter at oracle.com  Thu Jul  6 14:40:29 2017
From: dan.carpenter at oracle.com (Dan Carpenter)
Date: Thu, 6 Jul 2017 17:40:29 +0300
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <20170627135839.GB1886@jagdpanzerIV.localdomain>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
Message-ID: <20170706144028.46a2mt2mdzpt6ip7@mwanda>

People have mentioned "make oldconfig" but I've never had a lot of luck
with that.  It always just prints "* Restart config..." and deletes my
config.

Also I hate menus.  It's such a pain if you want to enable a feature and
you have to do a dungeon crawl through our menu system to try find it.

I wrote a script a couple years ago to create kernel configs.  I do a
make defconfig, then I take a distro config and I do:

    for i in $(grep =m old_config) ; do
	./scripts/kconfig/kconfig set $i
    done

This prints a lot of errors and the code is only half implemented but
it's honestly the easiest way for me to get a bootable kernel these
days.  If someone wanted to the could add a "./scripts/kconfig/kconfig
file <name>" command that would read a line at a time and call
`./scripts/kconfig/kconfig set $line` over and over.

regards,
dan carpenter


From dan.carpenter at oracle.com  Thu Jul  6 14:41:16 2017
From: dan.carpenter at oracle.com (Dan Carpenter)
Date: Thu, 6 Jul 2017 17:41:16 +0300
Subject: [Ksummit-discuss] [PATCH 1/2] kconfig: add a silent option to
	conf_write()
In-Reply-To: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
Message-ID: <20170706144116.kcvhyxezcpinhwq7@mwanda>

The conf_write() function prints output "configuration written to .config" but
I don't want it to print anything so I have added an option for that.

Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
---
 scripts/kconfig/conf.c      | 4 ++--
 scripts/kconfig/confdata.c  | 5 +++--
 scripts/kconfig/gconf.c     | 4 ++--
 scripts/kconfig/lkc_proto.h | 2 +-
 scripts/kconfig/mconf.c     | 4 ++--
 scripts/kconfig/nconf.c     | 4 ++--
 6 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
index 866369f10ff8..c73b5ab859a2 100644
--- a/scripts/kconfig/conf.c
+++ b/scripts/kconfig/conf.c
@@ -690,7 +690,7 @@ int main(int ac, char **av)
 		/* silentoldconfig is used during the build so we shall update autoconf.
 		 * All other commands are only used to generate a config.
 		 */
-		if (conf_get_changed() && conf_write(NULL)) {
+		if (conf_get_changed() && conf_write(NULL, 0)) {
 			fprintf(stderr, _("\n*** Error during writing of the configuration.\n\n"));
 			exit(1);
 		}
@@ -705,7 +705,7 @@ int main(int ac, char **av)
 			return 1;
 		}
 	} else if (input_mode != listnewconfig) {
-		if (conf_write(NULL)) {
+		if (conf_write(NULL, 0)) {
 			fprintf(stderr, _("\n*** Error during writing of the configuration.\n\n"));
 			exit(1);
 		}
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index 297b079ae4d9..7e8dbae6af30 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -738,7 +738,7 @@ int conf_write_defconfig(const char *filename)
 	return 0;
 }
 
-int conf_write(const char *name)
+int conf_write(const char *name, bool silent)
 {
 	FILE *out;
 	struct symbol *sym;
@@ -831,7 +831,8 @@ int conf_write(const char *name)
 			return 1;
 	}
 
-	conf_message(_("configuration written to %s"), newname);
+	if (!silent)
+		conf_message(_("configuration written to %s"), newname);
 
 	sym_set_change_count(0);
 
diff --git a/scripts/kconfig/gconf.c b/scripts/kconfig/gconf.c
index cfddddb9c9d7..115b5602d05e 100644
--- a/scripts/kconfig/gconf.c
+++ b/scripts/kconfig/gconf.c
@@ -523,7 +523,7 @@ void on_load1_activate(GtkMenuItem * menuitem, gpointer user_data)
 
 void on_save_activate(GtkMenuItem * menuitem, gpointer user_data)
 {
-	if (conf_write(NULL))
+	if (conf_write(NULL), 0)
 		text_insert_msg(_("Error"), _("Unable to save configuration !"));
 }
 
@@ -536,7 +536,7 @@ store_filename(GtkFileSelection * file_selector, gpointer user_data)
 	fn = gtk_file_selection_get_filename(GTK_FILE_SELECTION
 					     (user_data));
 
-	if (conf_write(fn))
+	if (conf_write(fn), 0)
 		text_insert_msg(_("Error"), _("Unable to save configuration !"));
 
 	gtk_widget_destroy(GTK_WIDGET(user_data));
diff --git a/scripts/kconfig/lkc_proto.h b/scripts/kconfig/lkc_proto.h
index d5398718ec2a..1690888bdbc4 100644
--- a/scripts/kconfig/lkc_proto.h
+++ b/scripts/kconfig/lkc_proto.h
@@ -5,7 +5,7 @@ void conf_parse(const char *name);
 int conf_read(const char *name);
 int conf_read_simple(const char *name, int);
 int conf_write_defconfig(const char *name);
-int conf_write(const char *name);
+int conf_write(const char *name, bool silent);
 int conf_write_autoconf(void);
 bool conf_get_changed(void);
 void conf_set_changed_callback(void (*fn)(void));
diff --git a/scripts/kconfig/mconf.c b/scripts/kconfig/mconf.c
index 315ce2c7cb9d..c029b5417fa9 100644
--- a/scripts/kconfig/mconf.c
+++ b/scripts/kconfig/mconf.c
@@ -937,7 +937,7 @@ static void conf_save(void)
 		case 0:
 			if (!dialog_input_result[0])
 				return;
-			if (!conf_write(dialog_input_result)) {
+			if (!conf_write(dialog_input_result, 0)) {
 				set_config_filename(dialog_input_result);
 				return;
 			}
@@ -971,7 +971,7 @@ static int handle_exit(void)
 
 	switch (res) {
 	case 0:
-		if (conf_write(filename)) {
+		if (conf_write(filename, 0)) {
 			fprintf(stderr, _("\n\n"
 					  "Error while writing of the configuration.\n"
 					  "Your configuration changes were NOT saved."
diff --git a/scripts/kconfig/nconf.c b/scripts/kconfig/nconf.c
index 003114779815..b4b0666bdc4c 100644
--- a/scripts/kconfig/nconf.c
+++ b/scripts/kconfig/nconf.c
@@ -666,7 +666,7 @@ static int do_exit(void)
 	/* if we got here, the user really wants to exit */
 	switch (res) {
 	case 0:
-		res = conf_write(filename);
+		res = conf_write(filename, 0);
 		if (res)
 			btn_dialog(
 				main_window,
@@ -1436,7 +1436,7 @@ static void conf_save(void)
 		case 0:
 			if (!dialog_input_result[0])
 				return;
-			res = conf_write(dialog_input_result);
+			res = conf_write(dialog_input_result, 0);
 			if (!res) {
 				set_config_filename(dialog_input_result);
 				return;
-- 
2.11.0

From dan.carpenter at oracle.com  Thu Jul  6 14:42:08 2017
From: dan.carpenter at oracle.com (Dan Carpenter)
Date: Thu, 6 Jul 2017 17:42:08 +0300
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
	configuration tool
In-Reply-To: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
Message-ID: <20170706144208.6hlgxwo37gntk6qm@mwanda>

This tool barely works, it's just a rough draft.

Sometimes I want to search for a config so I have to load menuconfig,
then search for the config entry, then exit.  With this script I
simply run:

    ./scripts/kconfig/kconfig search COMEDI

Quite often I find myself trying to enable a feature by doing this:

    echo CONFIG_FEATURE=y >> .config

But when I try to boot the new kernel, I find that the feature isn't
there because the kernel runs `make oldconfig` and I didn't have all
the depends selected so it silently removed it.  With this feature
what you can do is:

    ./scripts/kconfig/kconfig set FEATURE=y

It helps you enable the dependencies or it at least prints an error
if it can't enable the feature.

But this code isn't all implemented.  1) It doesn't calculate the
dependencies well.  See expr_parse() for more details.  2)  It
doesn't work well for things like:

	./scripts/kconfig/kconfig set BT_INTEL=m

because those aren't visible, they can only be using depend
statements.  Or say you try to set FEATURE=m when something else
depends on it be set =y then the error message is wrong.  The
other problem is that I don't know how to print the help text.
Again, this is just a rough draft.

Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
---
 scripts/kconfig/Makefile |   6 +-
 scripts/kconfig/kconfig  |  33 +++++
 scripts/kconfig/lconf.c  | 332 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 370 insertions(+), 1 deletion(-)
 create mode 100755 scripts/kconfig/kconfig
 create mode 100644 scripts/kconfig/lconf.c

diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
index eb8144643b78..a2a90be2e149 100644
--- a/scripts/kconfig/Makefile
+++ b/scripts/kconfig/Makefile
@@ -33,6 +33,9 @@ config: $(obj)/conf
 nconfig: $(obj)/nconf
 	$< $(silent) $(Kconfig)
 
+lconfig: $(obj)/lconf
+	@ $< $(silent) $(Kconfig)
+
 silentoldconfig: $(obj)/conf
 	$(Q)mkdir -p include/config include/generated
 	$(Q)test -e include/generated/autoksyms.h || \
@@ -183,12 +186,13 @@ lxdialog += lxdialog/textbox.o lxdialog/yesno.o lxdialog/menubox.o
 conf-objs	:= conf.o  zconf.tab.o
 mconf-objs     := mconf.o zconf.tab.o $(lxdialog)
 nconf-objs     := nconf.o zconf.tab.o nconf.gui.o
+lconf-objs     := lconf.o zconf.tab.o
 kxgettext-objs	:= kxgettext.o zconf.tab.o
 qconf-cxxobjs	:= qconf.o
 qconf-objs	:= zconf.tab.o
 gconf-objs	:= gconf.o zconf.tab.o
 
-hostprogs-y := conf nconf mconf kxgettext qconf gconf
+hostprogs-y := conf nconf mconf kxgettext qconf gconf lconf
 
 clean-files	:= qconf.moc .tmp_qtcheck .tmp_gtkcheck
 clean-files	+= zconf.tab.c zconf.lex.c zconf.hash.c gconf.glade.h
diff --git a/scripts/kconfig/kconfig b/scripts/kconfig/kconfig
new file mode 100755
index 000000000000..beab8fc829c9
--- /dev/null
+++ b/scripts/kconfig/kconfig
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+usage() {
+	echo "kconfig [search|set] string"
+	exit 1;
+}
+
+if [ "$1" = "" ] ; then
+	usage
+fi
+
+if [ "$1" = "search" ] ; then
+
+	search=$2
+	NCONFIG_MODE=kconfig_search SEARCH=${search} make lconfig
+
+elif [ "$1" = "set" ] ; then
+
+	config=$2
+	setting=$3
+
+	if [ $config = "" ] ; then
+		echo "nothing to set"
+		exit 1
+	fi
+
+	NCONFIG_MODE=kconfig_set CONFIG=${config} SETTING=${setting} make lconfig
+
+else
+	usage
+fi
+
+
diff --git a/scripts/kconfig/lconf.c b/scripts/kconfig/lconf.c
new file mode 100644
index 000000000000..ebc3cbd4ef83
--- /dev/null
+++ b/scripts/kconfig/lconf.c
@@ -0,0 +1,332 @@
+/*
+ * Copyright (C) 2015 Oracle
+ * Released under the terms of the GNU GPL v2.0.
+ *
+ */
+#define _GNU_SOURCE
+#include <string.h>
+#include <stdlib.h>
+
+#include "lkc.h"
+#include "nconf.h"
+#include <ctype.h>
+
+static int indent;
+static char line[128];
+
+static int get_depends(struct symbol *sym);
+
+static void strip(char *str)
+{
+	char *p = str;
+	int l;
+
+	while ((isspace(*p)))
+		p++;
+	l = strlen(p);
+	if (p != str)
+		memmove(str, p, l + 1);
+	if (!l)
+		return;
+	p = str + l - 1;
+	while ((isspace(*p)))
+		*p-- = 0;
+}
+
+static void xfgets(char *str, int size, FILE *in)
+{
+	if (fgets(str, size, in) == NULL)
+		fprintf(stderr, "\nError in reading or end of file.\n");
+}
+
+static tristate str_to_tristate(const char *str)
+{
+	switch (str[0]) {
+	case 'y': case 'Y':
+		return yes;
+	case 'm': case 'M':
+		return mod;
+	case 'n': case 'N':
+	default:
+		return no;
+	}
+}
+
+static int conf_askvalue(struct symbol *sym, const char *def)
+{
+	enum symbol_type type = sym_get_type(sym);
+
+	if (!sym_has_value(sym))
+		printf(_("(NEW) "));
+
+	line[0] = '\n';
+	line[1] = 0;
+
+	if (!sym_is_changable(sym)) {
+		printf("%s\n", def);
+		line[0] = '\n';
+		line[1] = 0;
+		return 0;
+	}
+
+	fflush(stdout);
+	xfgets(line, 128, stdin);
+
+	switch (type) {
+	case S_INT:
+	case S_HEX:
+	case S_STRING:
+		printf("%s\n", def);
+		return 1;
+	default:
+		;
+	}
+	printf("%s", line);
+	return 1;
+}
+
+static struct property *get_symbol_prop(struct symbol *sym)
+{
+	struct property *prop = NULL;
+
+	for_all_properties(sym, prop, P_SYMBOL)
+		break;
+	return prop;
+}
+
+static int conf_sym(struct symbol *sym)
+{
+	tristate oldval, newval;
+	struct property *prop;
+
+	while (1) {
+		if (sym->name)
+			printf("%s: ", sym->name);
+		for_all_prompts(sym, prop)
+			printf("%*s%s ", indent - 1, "",  _(prop->text));
+		putchar('[');
+		oldval = sym_get_tristate_value(sym);
+		switch (oldval) {
+		case no:
+			putchar('N');
+			break;
+		case mod:
+			putchar('M');
+			break;
+		case yes:
+			putchar('Y');
+			break;
+		}
+		if (oldval != no && sym_tristate_within_range(sym, no))
+			printf("/n");
+		if (oldval != mod && sym_tristate_within_range(sym, mod))
+			printf("/m");
+		if (oldval != yes && sym_tristate_within_range(sym, yes))
+			printf("/y");
+		/* FIXME: I don't know how to get the help text from the sym */
+		printf("] ");
+		if (!conf_askvalue(sym, sym_get_string_value(sym)))
+			return 0;
+		strip(line);
+
+		switch (line[0]) {
+		case 'n':
+		case 'N':
+			newval = no;
+			if (!line[1] || !strcmp(&line[1], "o"))
+				break;
+			continue;
+		case 'm':
+		case 'M':
+			newval = mod;
+			if (!line[1])
+				break;
+			continue;
+		case 'y':
+		case 'Y':
+			newval = yes;
+			if (!line[1] || !strcmp(&line[1], "es"))
+				break;
+			continue;
+		case 0:
+			newval = oldval;
+			break;
+		default:
+			continue;
+		}
+		if (sym_set_tristate_value(sym, newval)) {
+			/* FIXME: if I don't write it doesn't save */
+			conf_write(NULL, 1);
+			return 1;
+		}
+	}
+}
+
+static int enable_sym(struct symbol *sym)
+{
+	if (sym_get_tristate_value(sym) != no)
+		return 0;
+
+	if (!sym->visible) {
+		if (!get_depends(sym))
+			return 0;
+		printf("%s: has missing dependencies\n", sym->name);
+	}
+
+	return conf_sym(sym);
+}
+
+static void expr_parse(struct expr *e)
+{
+	if (!e)
+		return;
+
+	switch (e->type) {
+	case E_EQUAL:
+		printf("set '%s' to '%s'\n", e->left.sym->name, e->right.sym->name);
+		break;
+
+	case E_AND:
+		expr_parse(e->left.expr);
+		expr_parse(e->right.expr);
+		break;
+
+	case E_SYMBOL:
+		enable_sym(e->left.sym);
+		break;
+
+	case E_NOT:
+	case E_UNEQUAL:
+	case E_OR:
+	case E_LIST:
+	case E_RANGE:
+	default:
+		printf("HELP.  Lot of unimplemented code.  %d\n", e->type);
+		break;
+	}
+}
+
+static int get_depends(struct symbol *sym)
+{
+	struct property *prop;
+	struct gstr res = str_new();
+
+	prop = get_symbol_prop(sym);
+	if (!prop)
+		return 0;
+
+	expr_gstr_print(prop->visible.expr, &res);
+	printf("%s\n\n", str_get(&res));
+
+	expr_parse(prop->visible.expr);
+
+	return 1;
+}
+
+static void kconfig_search(void)
+{
+	char *search_str;
+	struct symbol **sym_arr;
+	struct gstr res;
+
+	search_str = getenv("SEARCH");
+	if (!search_str)
+		return;
+
+	sym_arr = sym_re_search(search_str);
+	res = get_relations_str(sym_arr, NULL);
+	printf("%s", str_get(&res));
+}
+
+static void kconfig_set(void)
+{
+	struct symbol *sym;
+	char *config;
+	char *setting;
+	int res;
+
+	config = getenv("CONFIG");
+	if (!config)
+		return;
+	if (strncmp(config, "CONFIG_", 7) == 0)
+		config += 7;
+
+	setting = strchr(config, '=');
+	if (setting) {
+		*setting = '\0';
+		setting++;
+	} else {
+		setting = getenv("SETTING");
+		if (setting && *setting == '\0')
+			setting = NULL;
+	}
+
+	sym = sym_find(config);
+	if (!sym) {
+		printf("Error: '%s' not found.\n", config);
+		return;
+	}
+
+	if (sym->curr.tri == str_to_tristate(setting)) {
+		printf("Already set:  %s=%s\n", sym->name, setting);
+		return;
+	}
+
+	if (!sym->visible) {
+		printf("\n%s: has missing dependencies\n", sym->name);
+		if (!get_depends(sym))
+			return;
+	}
+	if (!sym->visible) {
+		printf("Error: unmet dependencies\n");
+		return;
+	}
+
+	if (!setting) {
+		conf_sym(sym);
+	} else if (!sym_set_string_value(sym, setting)) {
+		printf("Error: setting '%s=%s' failed.\n", sym->name, setting);
+		return;
+	}
+
+	res = conf_write(NULL, 1);
+	if (res) {
+		printf("Error during writing of configuration.\n"
+			"Your configuration changes were NOT saved.\n");
+		return;
+	}
+
+	printf("set: %s=%s\n", config, sym_get_string_value(sym));
+}
+
+int main(int ac, char **av)
+{
+	char *mode;
+
+	setlocale(LC_ALL, "");
+	bindtextdomain(PACKAGE, LOCALEDIR);
+	textdomain(PACKAGE);
+
+	if (ac > 1 && strcmp(av[1], "-s") == 0) {
+		/* Silence conf_read() until the real callback is set up */
+		conf_set_message_callback(NULL);
+		av++;
+	}
+	conf_parse(av[1]);
+	conf_read(NULL);
+
+	mode = getenv("NCONFIG_MODE");
+	if (!mode)
+		return 1;
+
+	if (strcmp(mode, "kconfig_search") == 0) {
+		kconfig_search();
+		return 0;
+	}
+	if (strcmp(mode, "kconfig_set") == 0) {
+		kconfig_set();
+		return 0;
+	}
+
+	return 1;
+}
-- 
2.11.0


From James.Bottomley at HansenPartnership.com  Thu Jul  6 14:48:05 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Thu, 06 Jul 2017 07:48:05 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
References: <20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
Message-ID: <1499352485.2765.14.camel@HansenPartnership.com>

On Thu, 2017-07-06 at 10:28 +0100, Mark Brown wrote:
> On Wed, Jul 05, 2017 at 01:02:00PM -0400, Steven Rostedt wrote:
> > 
> > Guenter Roeck <linux at roeck-us.net> wrote:
> 
> > 
> > > 
> > > If a test to reproduce a problem exists, it might be more
> > > beneficial to suggest to the patch submitter that it would be
> > > great if that test would be submitted as unit test instead of
> > > shaming that person for not doing so. Acknowledging and
> > > praising kselftest submissions might help more than shaming for
> > > non-submissions.
> 
> > 
> > > 
> > > My concern would be that once the shaming starts, it won't stop.
> 
> > 
> > I think this is a communication issue. My word for "shaming" was to
> > call out a developer for not submitting a test. It wasn't about
> > making fun of them, or anything like that. I was only making a
> > point about how to teach people that they need to be more aware of
> > the testing infrastructure. Not about actually demeaning people.
> 
> I think before anything like that is viable we need to show a
> concerted and visible interest in actually running the tests we
> already have and paying attention to the results - if people can see
> that they're just checking a checkbox that will often result in low
> quality tests which can do more harm than good.

it depends what you mean by "we". ?I used to run a battery of tests
over every SCSI commit. ?It was time consuming and slowed down the
process, plus it was me who always got to diagnose failures. ?Nowadays
I don't bother: I rely on 0day to run its usual tests plus a couple of
extras I asked for it's a much more streamlined process (meaning less
work for me) and everyone is happy.

The corollary I take away from this is that the less intrusive the test
infrastructure is (at least to my process) the happier I am. ?The 0day
quantum leap for me was going from testing my tree and telling me of
problems after I've added the patch to testing patches posted to the
mailing list, which tells me of problems *before* the commit gets added
to the tree.

James
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170706/d585e99d/attachment.sig>

From tytso at mit.edu  Thu Jul  6 14:53:46 2017
From: tytso at mit.edu (Theodore Ts'o)
Date: Thu, 6 Jul 2017 10:53:46 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <CAKMK7uFH+Kz8Mdph=J_FCZ4LC3tzoOmwNJPpSO+snTz6p0Xz+w@mail.gmail.com>
References: <20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
	<CAKMK7uFH+Kz8Mdph=J_FCZ4LC3tzoOmwNJPpSO+snTz6p0Xz+w@mail.gmail.com>
Message-ID: <20170706145346.6w2uzcf7xacbr3or@thunk.org>

On Thu, Jul 06, 2017 at 11:41:39AM +0200, Daniel Vetter wrote:
> 
> +1. That pretty much means large-scale CI. The i915 test suite has
> suffered quite a bit over the past years because the CI infrastructure
> didn't keep up. Result is that running full CI kills pretty much every
> platform there is eventually, and it's really hard to get back to a
> state where the testsuite can be used to catch regressions again.

I assume the i915 test suite requires real hardware and can't be run
on VM's; is that correct?

						- Ted

From rostedt at goodmis.org  Thu Jul  6 15:08:04 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Thu, 6 Jul 2017 11:08:04 -0400
Subject: [Ksummit-discuss] [PATCH 1/2] kconfig: add a silent option to
 conf_write()
In-Reply-To: <20170706144116.kcvhyxezcpinhwq7@mwanda>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144116.kcvhyxezcpinhwq7@mwanda>
Message-ID: <20170706110804.44ca24b9@gandalf.local.home>

On Thu, 6 Jul 2017 17:41:16 +0300
Dan Carpenter <dan.carpenter at oracle.com> wrote:

> The conf_write() function prints output "configuration written to .config" but
> I don't want it to print anything so I have added an option for that.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
> ---

I know you replied to the TECH TOPIC about kconfig, but did you really
mean to send patches to the ksummit-discuss mailing list and not to any
other mailing list (like LKML or linux-kbuild at vger.kernel.org)?

-- Steve

From torvalds at linux-foundation.org  Thu Jul  6 16:41:36 2017
From: torvalds at linux-foundation.org (Linus Torvalds)
Date: Thu, 6 Jul 2017 09:41:36 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
Message-ID: <CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>

On Thu, Jul 6, 2017 at 7:40 AM, Dan Carpenter <dan.carpenter at oracle.com> wrote:
> People have mentioned "make oldconfig" but I've never had a lot of luck
> with that.  It always just prints "* Restart config..." and deletes my
> config.

Really?

For me, "make oldconfig" is pretty much the only thing I ever use
(apart from build testing).

It's very convenient once you have a baseline, and want to just get
the new questions for when the Kconfig files change. It's also how I
notice when somebody adds a new config entry that doesn't default to
'n'.

It's also very convenient when you end up changing your config: just
edit the damn .config file directly, and then re-run "make oldconfig"
just to make sure everything gets updated (and then you'll notice that
you tried to disable some config entry, but it got re-enabled again
because there was something else that depended on it and selected it
;)

So I wonder why it wouldn't work for you.

Now, admittedly, I literally only ever use two source files: the
previous ".config" file, and if that is missing (after a "git clean
-dqfx" or similar), just /etc/kernel-config.

The oldconfig logic has fallbacks to other cases, but they are all useless imho.

Also, I build in the source tree. Maybe you use a separate object tree
and it gets that case wrong.

                   Linus

From rdunlap at infradead.org  Thu Jul  6 17:11:26 2017
From: rdunlap at infradead.org (Randy Dunlap)
Date: Thu, 6 Jul 2017 10:11:26 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
Message-ID: <966c1fce-6f2f-d158-d086-cf8e2eac97a9@infradead.org>

On 07/06/2017 09:41 AM, Linus Torvalds wrote:
> On Thu, Jul 6, 2017 at 7:40 AM, Dan Carpenter <dan.carpenter at oracle.com> wrote:
>> People have mentioned "make oldconfig" but I've never had a lot of luck
>> with that.  It always just prints "* Restart config..." and deletes my
>> config.
> 
> Really?
> 
> For me, "make oldconfig" is pretty much the only thing I ever use
> (apart from build testing).
> 
> It's very convenient once you have a baseline, and want to just get
> the new questions for when the Kconfig files change. It's also how I
> notice when somebody adds a new config entry that doesn't default to
> 'n'.
> 
> It's also very convenient when you end up changing your config: just
> edit the damn .config file directly, and then re-run "make oldconfig"
> just to make sure everything gets updated (and then you'll notice that
> you tried to disable some config entry, but it got re-enabled again
> because there was something else that depended on it and selected it
> ;)
> 
> So I wonder why it wouldn't work for you.
> 
> Now, admittedly, I literally only ever use two source files: the
> previous ".config" file, and if that is missing (after a "git clean
> -dqfx" or similar), just /etc/kernel-config.
> 
> The oldconfig logic has fallbacks to other cases, but they are all useless imho.
> 
> Also, I build in the source tree. Maybe you use a separate object tree
> and it gets that case wrong.

Nah, I use O=objdir all the time and oldconfig works for me.

-- 
~Randy

From rostedt at goodmis.org  Thu Jul  6 19:10:08 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Thu, 6 Jul 2017 15:10:08 -0400
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
References: <20170629195537.534445e7@gandalf.local.home>
	<CA+55aFxW_vhYWRoWFwy4zQgG7iPJg3V4u0-XkjZiGJJfZtZ=ng@mail.gmail.com>
	<20170629203224.6bf7f29a@gandalf.local.home>
	<20170629205218.5b9a7923@gandalf.local.home>
	<CA+55aFyQE_T7Rp7ay_EbAZNDqLE6ffJ-6xkL6B_961oZ0+aSpA@mail.gmail.com>
	<20170629211641.5aeb3af7@gandalf.local.home>
	<20170629212750.5c3542ee@gandalf.local.home>
	<CA+55aFzzCPMUDt72hckauYu+fj=Q2MWjx+XiR06KpMLAr1EBAA@mail.gmail.com>
	<20170629221245.489760b1@gandalf.local.home>
	<CA+55aFxFLvX62SyOC9qyVwEQXH8J224Fe03tvy624AUx0U2fRQ@mail.gmail.com>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<6AE378F0-42F7-45DE-9F3C-050A5019A1E8@fb.com>
	<20170630142956.7e0cb2d6@gandalf.local.home>
	<20170630143030.305b68a0@gandalf.local.home>
	<658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
Message-ID: <20170706151008.24addd2b@gandalf.local.home>

On Fri, 30 Jun 2017 18:37:59 +0000
Josef Bacik <jbacik at fb.com> wrote:

> [ I forgot to add Tom to the Cc list. Sending again. ]
> 
> On Fri, 30 Jun 2017 14:29:56 -0400
> Steven Rostedt <rostedt at goodmis.org> wrote:
> 
> > On Fri, 30 Jun 2017 18:24:12 +0000
> > Josef Bacik <jbacik at fb.com> wrote:
> >   
> > > Yup I?ll start bugging people to submit talk proposals, starting with you!  I?ll put up my proposal in the next day or two, I think Brendan has something he?s going to talk about.  Thanks,    
> > 
> > I shouldn't have used the term "talk", as it really is all about
> > discussions. In fact, if you need more than one slide, you have too
> > many.
> > 
> > That said, I could probably come up with a few things, starting with
> > this trace event issue. But it will be pointless if Peter Zijlstra and
> > Mathieu are not there.
> > 
> > But having ideas about dynamic fields in tracepoints is always
> > interesting. Not to mention talking about Tom Zanussi's latest
> > histogram work. It may be pretty much completed, but I would like to
> > discuss where we go from there.
> > 
> > One last thing. I don't want to have too many responsibilities, as I'm
> > on the LPC program committee and I need to make sure I have time to
> > fulfill any action items I'm responsible for during the conference.
> >   
> 
> Yeah plumbers is a weird venue for tracing, I always hope that we are
> going to have people like Brendan or other sysadmin-y people show up
> and say ?this is what sucks about tracing, please fix it?, and then
> we can go fix it.  It doesn?t really seem to happen that way tho, and
> for things like tracing ABI there just aren?t the right people in the
> room to have that kind of discussion.  My proposal was just going to
> be a laundry list of things that would make my life easier, but it
> doesn?t really warrant a full micro-conference to listen to me bitch
> for an hour.  If it turns out nobody else has much to talk about then
> we can just declare tracing is feature complete and we can talk about
> something else ;).  Thanks,
> 

At this rate, I'm guessing that Tracing is not going to be on the
Plumbers' agenda.

-- Steve


From laurent.pinchart at ideasonboard.com  Thu Jul  6 21:19:46 2017
From: laurent.pinchart at ideasonboard.com (Laurent Pinchart)
Date: Fri, 07 Jul 2017 00:19:46 +0300
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
Message-ID: <1601331.LiGNiPeBdk@avalon>

On Thursday 06 Jul 2017 17:40:29 Dan Carpenter wrote:
> People have mentioned "make oldconfig" but I've never had a lot of luck
> with that.  It always just prints "* Restart config..." and deletes my
> config.

I like oldconfig as it makes it easy to find about new options when upgrading 
the kernel. However, there's one thing that bothers me. When jumping by more 
than one kernel version, the number of options can be quite high, in which 
case I sometimes make mistakes answering questions. I'd love it if Kconfig 
allowed me to go back and correct mistakes, instead of having to note the 
option down and modify it manually afterwards.

> Also I hate menus.  It's such a pain if you want to enable a feature and
> you have to do a dungeon crawl through our menu system to try find it.
> 
> I wrote a script a couple years ago to create kernel configs.  I do a
> make defconfig, then I take a distro config and I do:
> 
>     for i in $(grep =m old_config) ; do
> 	./scripts/kconfig/kconfig set $i
>     done
> 
> This prints a lot of errors and the code is only half implemented but
> it's honestly the easiest way for me to get a bootable kernel these
> days.  If someone wanted to the could add a "./scripts/kconfig/kconfig
> file <name>" command that would read a line at a time and call
> `./scripts/kconfig/kconfig set $line` over and over.

-- 
Regards,

Laurent Pinchart


From daniel.vetter at ffwll.ch  Thu Jul  6 21:28:28 2017
From: daniel.vetter at ffwll.ch (Daniel Vetter)
Date: Thu, 6 Jul 2017 23:28:28 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170706145346.6w2uzcf7xacbr3or@thunk.org>
References: <20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
	<CAKMK7uFH+Kz8Mdph=J_FCZ4LC3tzoOmwNJPpSO+snTz6p0Xz+w@mail.gmail.com>
	<20170706145346.6w2uzcf7xacbr3or@thunk.org>
Message-ID: <CAKMK7uGQz8soDJ-+eLnDArx382jiMAWeFwrJK_LxJkWYJ6DKmQ@mail.gmail.com>

On Thu, Jul 6, 2017 at 4:53 PM, Theodore Ts'o <tytso at mit.edu> wrote:
> On Thu, Jul 06, 2017 at 11:41:39AM +0200, Daniel Vetter wrote:
>> +1. That pretty much means large-scale CI. The i915 test suite has
>> suffered quite a bit over the past years because the CI infrastructure
>> didn't keep up. Result is that running full CI kills pretty much every
>> platform there is eventually, and it's really hard to get back to a
>> state where the testsuite can be used to catch regressions again.
>
> I assume the i915 test suite requires real hardware and can't be run
> on VM's; is that correct?

Yes, that's another problem. If all bigger teams/subsystems would do
what we'd do, but extended to all mailing lists, your patch series
would get replies from a few hundred CI farms. Not sure that would
scale ... And there's no way ever that one single entity will have
hardware for everything.

And if you only CI the mailing list for your own subsystem then every
time a merge window happens your CI will be out of service (4.13 seems
extremely bad, atm nothing survives an extended run on linux-next for
us).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

From shuahkh at osg.samsung.com  Thu Jul  6 22:24:01 2017
From: shuahkh at osg.samsung.com (Shuah Khan)
Date: Thu, 6 Jul 2017 16:24:01 -0600
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705153259.GA7265@kroah.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
Message-ID: <a2fada39-76d4-e136-f2db-d8306d929902@osg.samsung.com>

On 07/05/2017 09:32 AM, Greg KH wrote:
> On Wed, Jul 05, 2017 at 08:16:33AM -0700, Guenter Roeck wrote:
>> If we start shaming people for not providing unit tests, all we'll accomplish is
>> that people will stop providing bug fixes.
> 
> Yes, this is the key!
> 
> Steven, just look at everything marked with a "Fixes:" or "stable@" tag
> from 4.12-rc1..4.12 and try to determine how you would write a test for
> the majority of them.
> 
> Yes, for some subsystems this can work (look at xfstests as one great
> example for filesystems, same for the i915 tests), but for the majority
> of the kernel, at this point in time, it doesn't make sense.
> 
> So take Carlos's advice, start small, do it for your subsystem if you
> don't touch hardware (easy peasy, right?), and let's see how it goes,
> and see if we have the infrastructure to do it even today.  Right now,
> kselftests is finally getting a unified output format, which is great,
> it shows that people are starting to use and rely on it.  What else will
> we need to make this more widely used, we don't know yet...
> 

Over the past couple of years, kselftests have seen improvements to run
on ARM in kernel ci rings. TAP13 will definitely make it easier to find
run to run differences. There is the effort to use ksefltests to test
stable releases (4.4 LTS for example), which will help make the tests
fail/skip gracefully when a feature isn't enabled/supported.

The work so far is two fold:

- enable them to run in test rings.
- making them easy to use

As per test development, we are constantly adding tests and I see new tests
getting added for sub-systems that aren't hardware dependent. You will see
lots of activity in mm, timers, seccomp, net, sys-calls to name a few.

I am going to be looking for TAP13 format compliance for new tests starting
4.13.

I am not sure how popular they are among developers and sub-system maintainers
though. Maybe this is one area we can try to improve usage.

thanks,
-- Shuah


From rostedt at goodmis.org  Thu Jul  6 22:32:49 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Thu, 6 Jul 2017 18:32:49 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <a2fada39-76d4-e136-f2db-d8306d929902@osg.samsung.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
	<a2fada39-76d4-e136-f2db-d8306d929902@osg.samsung.com>
Message-ID: <20170706183249.60b2aef9@gandalf.local.home>

On Thu, 6 Jul 2017 16:24:01 -0600
Shuah Khan <shuahkh at osg.samsung.com> wrote:


> Over the past couple of years, kselftests have seen improvements to run
> on ARM in kernel ci rings. TAP13 will definitely make it easier to find
> run to run differences. There is the effort to use ksefltests to test
> stable releases (4.4 LTS for example), which will help make the tests
> fail/skip gracefully when a feature isn't enabled/supported.
> 
> The work so far is two fold:
> 
> - enable them to run in test rings.
> - making them easy to use
> 
> As per test development, we are constantly adding tests and I see new tests
> getting added for sub-systems that aren't hardware dependent. You will see
> lots of activity in mm, timers, seccomp, net, sys-calls to name a few.
> 
> I am going to be looking for TAP13 format compliance for new tests starting
> 4.13.
> 
> I am not sure how popular they are among developers and sub-system maintainers
> though. Maybe this is one area we can try to improve usage.

Maybe this should be included in the MAINTAINERS SUMMIT as well. To
consolidate the format of all the kselftests and have something that
everyone (or most) developers agree on.

-- Steve


From shuahkh at osg.samsung.com  Thu Jul  6 22:40:45 2017
From: shuahkh at osg.samsung.com (Shuah Khan)
Date: Thu, 6 Jul 2017 16:40:45 -0600
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170706183249.60b2aef9@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705153259.GA7265@kroah.com>
	<a2fada39-76d4-e136-f2db-d8306d929902@osg.samsung.com>
	<20170706183249.60b2aef9@gandalf.local.home>
Message-ID: <803733a4-491b-3303-5e22-a057d4eadd3d@osg.samsung.com>

On 07/06/2017 04:32 PM, Steven Rostedt wrote:
> On Thu, 6 Jul 2017 16:24:01 -0600
> Shuah Khan <shuahkh at osg.samsung.com> wrote:
> 
> 
>> Over the past couple of years, kselftests have seen improvements to run
>> on ARM in kernel ci rings. TAP13 will definitely make it easier to find
>> run to run differences. There is the effort to use ksefltests to test
>> stable releases (4.4 LTS for example), which will help make the tests
>> fail/skip gracefully when a feature isn't enabled/supported.
>>
>> The work so far is two fold:
>>
>> - enable them to run in test rings.
>> - making them easy to use
>>
>> As per test development, we are constantly adding tests and I see new tests
>> getting added for sub-systems that aren't hardware dependent. You will see
>> lots of activity in mm, timers, seccomp, net, sys-calls to name a few.
>>
>> I am going to be looking for TAP13 format compliance for new tests starting
>> 4.13.
>>
>> I am not sure how popular they are among developers and sub-system maintainers
>> though. Maybe this is one area we can try to improve usage.

As a clarification, what I meant by "how popular they are among developers and
sub-system maintainers" is that how often developers and sub-system maintainers
run kselftests and are there any obstacles for running them.

It would be good to get feedback on usage by us as in developers.

> 
> Maybe this should be included in the MAINTAINERS SUMMIT as well. To
> consolidate the format of all the kselftests and have something that
> everyone (or most) developers agree on.


thanks,
-- Shuah

From fengguang.wu at intel.com  Fri Jul  7 03:33:02 2017
From: fengguang.wu at intel.com (Fengguang Wu)
Date: Fri, 7 Jul 2017 11:33:02 +0800
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705112707.54d7f345@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
Message-ID: <20170707033302.rgpq5knzx3qvvr2p@wfg-t540p.sh.intel.com>

On Wed, Jul 05, 2017 at 11:27:07AM -0400, Steven Rostedt wrote:
[snip]
>I need to be clearer on this. What I meant was, if there's a bug
>where someone has a test that easily reproduces the bug, then if
>there's not a test added to selftests for said bug, then we should
>shame those into doing so.

Besides shaming, there's one more option -- acknowledgement.

When it's a test case or test tool that discovered the bug, we could
acknowledge it by adding one line in the bug fixing patch. The exact
forms can be discussed, but here are some examples to show the basic
idea:

Tool: lockdep
Tool: ktest
Tool: smatch
Tool: trinity
Tool: syzkaller
Tool: xfstests/tests/ext4/025
Tool: scripts/coccinelle/locks/call_kern.cocci
Tool: tools/testing/selftests/bpf/test_align.c

Reports from test infrastructures like 0day could go further to help
acknowledge the tool author or maintainer by showing such lines in its
bug report email:

You may consider adding these lines in the bug fixing patch:

-----------------------[ cut here ]----------------------------------
Fixes: XXXXXXXXXX ("title of the buggy commit")
Tool: tools/testing/selftests/bpf/test_align.c <davem at davemloft.net>
Reported-by: 0day test robot <xiaolong.ye at intel.com>
-----------------------[ cut here ]----------------------------------

Regards,
Fengguang

From frowand.list at gmail.com  Fri Jul  7 04:52:25 2017
From: frowand.list at gmail.com (Frank Rowand)
Date: Thu, 6 Jul 2017 21:52:25 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170707033302.rgpq5knzx3qvvr2p@wfg-t540p.sh.intel.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<20170707033302.rgpq5knzx3qvvr2p@wfg-t540p.sh.intel.com>
Message-ID: <595F1389.60308@gmail.com>

On 07/06/17 20:33, Fengguang Wu wrote:
> On Wed, Jul 05, 2017 at 11:27:07AM -0400, Steven Rostedt wrote:
> [snip]
>> I need to be clearer on this. What I meant was, if there's a bug
>> where someone has a test that easily reproduces the bug, then if
>> there's not a test added to selftests for said bug, then we should
>> shame those into doing so.
> 
> Besides shaming, there's one more option -- acknowledgement.
> 
> When it's a test case or test tool that discovered the bug, we could
> acknowledge it by adding one line in the bug fixing patch. The exact
> forms can be discussed, but here are some examples to show the basic
> idea:
> 
> Tool: lockdep
> Tool: ktest
> Tool: smatch
> Tool: trinity
> Tool: syzkaller
> Tool: xfstests/tests/ext4/025
> Tool: scripts/coccinelle/locks/call_kern.cocci
> Tool: tools/testing/selftests/bpf/test_align.c
> 
> Reports from test infrastructures like 0day could go further to help
> acknowledge the tool author or maintainer by showing such lines in its
> bug report email:
> 
> You may consider adding these lines in the bug fixing patch:
> 
> -----------------------[ cut here ]----------------------------------
> Fixes: XXXXXXXXXX ("title of the buggy commit")
> Tool: tools/testing/selftests/bpf/test_align.c <davem at davemloft.net>
> Reported-by: 0day test robot <xiaolong.ye at intel.com>
> -----------------------[ cut here ]----------------------------------
> 
> Regards,
> Fengguang
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss
> 

That is a great idea!  If a tool is shown to be catching a large
number of bugs then I am more likely to add it to my test process.

-Frank

From avagin at gmail.com  Fri Jul  7 06:15:20 2017
From: avagin at gmail.com (Andrei Vagin)
Date: Thu, 6 Jul 2017 23:15:20 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
Message-ID: <20170707061519.GA25786@gmail.com>

Here I want to share our experience of testing linux-next and other
trees. In CRIU we have a lot of tests for all sort of user-visible
primitives. Our goal is to catch changes which breaks CRIU before they
will be pushed to the Linus tree.

https://criu.org/linux-next

We run our test suite once a day for linux-next and a dozen of other
trees. About a year ago we used DO to get a virtual machine to run
tests, but now we use travis-ci.

Here is an example of a daily report:
https://travis-ci.org/avagin/criu/builds/250632728

What are benefits of this approach?

* It is free.
* Everyone can run these tests for any kernel and he/she doesn't
  need to spend hours to understand how to do that.
* You don't need to have a hardware to run tests
* You can do this periodically or for each patch or patchset

For example, If we want to run CRIU tests for a kernel,
we need to apply this patch to it:
https://github.com/avagin/linux/commit/2f34796b04cead83fa85cf92cf694ac4369ca970

and push its code to github, then travis-ci will run test for this
kernel:

https://travis-ci.org/avagin/linux/builds/250895561

Here is a detailed article which describes how we start a new kernel in
travis-ci:
https://avagin.github.io/travis-kexec-criu

The main idea, what I want to say, is that developers will use tests,
only if they will be able to execute them with minimal forces. In ideal
case, someone else has to run tests for them.

In CRIU, we run our tests for each patchset and a patchset can be
accepted only if it passed all test:

https://patchwork.criu.org/project/criu/series/?ordering=-last_updated

I know that the first problem is to write tests, but the next step is to
setup CI to run these tests for all changes and I think we can start
thinking about this problem too.

On Sun, Jul 02, 2017 at 07:51:43PM +0200, Thorsten Leemhuis wrote:
> Hi! Sorry, I know I'm late -- real life (travel, day job, ...) kept me
> away from spending time on Linux kernel regression work :-/
> 
> Maybe I'm taking it a bit to far for the new kid in town, but I think I
> want to propose two sessions. One for the maintainer summit, that deals
> with a the most critical issues relevant to regression tracking. And one
> technical session to deal with all the other stuff. Obviously we can
> move below mentioned topics from one to the other or talk about them at
> both if we want.
> 
> = [MAINTAINERS SUMMIT] Improve regression tracking =
> 
>  * Follow up from last year: What to do about bugzilla.kernel.org?
> Reporters still get stranded there.
>  * How to get subsystems maintainer involved more in regression tracking
> to better make sure that reported regressions are tracked and not
> forgotten accidentally.
>  * Frustrations with regression tracking aka. how to establish
> regression tracking properly to make sure it will never go away again.
> 
> = [TECH TOPIC] Improve the kernels quality by getting more people
> involved in regression testing and reporting =
> 
>  * A short report from the outcome of the maintainer summit discussion;
> also pick up and topics here that where not properly discussed on the
> maintainer summit or were postponed to this session.
>  * How to get distros more involved in regression tracking; especially
> those that have a technical aware user base or normally ship up2date
> kernel images (and thus have an greater interest in avoiding
> regressions). I'm mainly thinking about Arch Linux, Debian, Fedora, and
> openSUSE Tumbleweed here; having Ubuntu in the boat would be good, too!
> (might be wise to talk about this on the maintainers summit as well, if
> the right people are there)
>  * How to make it more easy to (ideally automatically!) track the
> current status and the progress of each regression? Are there any tools
> that could make regression tracking easier for all of us while not
> introducing much overhead for maintainers?
> 
> = Details =
> 
> Below you'll find few more words about some points mentioned above;
> there are a few other topics as well we could discuss if we want. But
> first, a few general words on regression tracking from my point of view:
> 
>  * There are a lot of areas in regression tracking where things are far
> from good (read: in a bad state). That makes it easy to discuss current
> problems and their solutions for hours -- and at the same time forget
> that discussing itself doesn't get us much forward (the old bugzilla
> issue mentioned in this mail is a good example). We thus IMHO should
> focus on the most important issues and lay the groundwork to establish
> regression tracking properly again, then we move on to solve things that
> are harder to solve.
> 
>  * Regression tracking currently is quite boring and exhausting (read:
> high burn-out risk), as it involves quite a lot of manual work finding
> regressions and keeping track of their progress (and at the end of the
> day it does not feel like you achieved much). Some of that work can not
> be automated. But quite a bit can and that would help a great deal to
> establish regression tracking properly (currently I'm the only one doing
> it and some development cycles I simply don't find spare time for it).
> 
>    I currently don't see any existing solutions that fit well with our
> mail focused workflow and at the same time do not introduce much
> overhead for subsystem maintainers (which I assume is what everyone
> wants, as I fear solutions with much overhead won't fly at all). Ideas
> how to solve this tricky problem area are highly welcomed. It's
> something that can be discussed when the aforementioned points
> "establish regression tracking properly" and "make it more easy to
> manually or automatically track the current status of a regression" come up.
> 
> == What to do about bugzilla.kernel.org =
> 
> Discussed last year already; see https://lwn.net/Articles/705245/ for
> details. Situation didn't change much since then: the bugzilla instance
> was updated, but people still get stranded there as most subsystems
> ignore it. That afaics frustrates people and makes them stop testing or
> reporting bugs.
> 
> Discuss how to improve things. [my2cent] Maybe a short term solution
> like this could work: Serve a static page on bugzilla.kernel.org that
> tells people where regressions/bugs for certain subsystems can be
> reported, as it most of the time is some mailing list anyway. Such a
> page could get compiled from MAINTAINERS (there is the "B:" field now
> that points to bugzilla; if its not there point to a mailing lists; also
> explain get_maintainers.pl).
> 
>   Leave our bugzilla reachable via bugzilla.kernel.org/frontpage (or
> something like that) for those few subsystems that use it; that's afaics
> ACPI and PM (including Cpufreq, Cpuidle, Hibernation, Suspend, ...) and
> maybe PCI (not sure) -- or should we tell them to move to
> bugzilla.freedesktop.org (or somewhere else) to get rid of our bugzilla
> in the long etrm and make Konstantins life easier? Anyway: Make sure
> bugs for other subsystems can't get filed in bugzilla.kernel.org anymore
> to make sure they get lost there. [/my2cent]
> 
> == How to get subsystems maintainer more involved in regression tracking
> to [?] ==
> 
> One reasons why I put this up is: It would help me a lot if people let
> regressions at leemhuis.info (side note: might be wise to make a
> mailing-list that replaces this address) get told about regressions --
> simply CCing it on reports or answers to regressions reports is enough;
> forwarding/bouncing mails there (even without additional text) is fine,
> too.
> 
> The other reason I included it: This came up in last years discussion on
> this list and it seemed some people thought we can get the subsystems
> maintainers more involved; so I thought it might be wise to discuss it.
> Might also be a good idea to discuss here how to get distro kernel
> maintainer more involved if enough are around.
> 
> == How to establish regression tracking properly [?] ==
> 
> This is a pretty vague topic on purpose. People seem to agree that
> regression tracking is important, but for years nobody did it (it
> stopped a little while after Rafael had to move on) and the little bit
> that I can do in my rare spare time won't help much (and I have no idea
> how long I can continue to find time for it).
> 
> == Make it easier to track the progress of regression ==
> 
> One of the main reasons that makes regression tracking hard currently:
> getting aware or regressions and tracking their progress is a lot of
> manual work. I plan one step that hopefully makes the job a little
> easier and at the same time might allow some automation in the long
> term: ask people to include a certain keyword in their regressions
> reports. Maybe something like "Linux-Regression" that doesn't get too
> much false positives when searching for it on lists and via Google
> (suggestions for a better tag welcome).
> 
> In addition, I plan to hand out some form of ID for each regressions I
> track and ask people to include it -- especially when they post patches
> that fix said regression or move the discussion to a new place (like
> "Corrects: Linux-Regression-d2afd"; again: suggestions welcome! Maybe I
> should just use a URL where people find details?).
> 
> That way I can notice more easy when a fix for a regression hits
> linux-next or master; I also get aware if a discussion moves from
> bugzilla to LKML or from one thread to another (fingers crossed).
> Obviously it depends on cooperation of those involved.
> 
> If this works out we could write a script or something that watches
> mailing lists, bug trackers and git trees for the tag in question. That
> script could file a database and automatically do some of the tracking job.
> 
> == get distros more involved ==
> 
> I assume at least Ben (Debian), Laura (Fedora), and Takashi (openSUSE)
> are around, so it might be a good idea to sit together and talk
> regression tracking in general and how we could get the distros kernel
> maintainers more involved. Even better would be to sit down before to
> maybe come up with some ideas/plans we could talk during this session.
> 
> One topic could be: How to make it easier for users of popular distros
> to get involved in testing. The "Kernel of the day" (KOTD) from
> SUSE/openSUSE was mentioned recently on this list already, but I got the
> impression that the existence of this repo is not well known; guess it's
> the same for my own Kernel Vanilla Repositories for Fedora (those
> contain packages with a quite recent mainline version; see
> https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories ) or the fact
> that Fedora rawhide ships a recent mainline snapshot all the time. But
> should distros also offer Linux-next somewhere? Or anything else? And
> should the distros send experienced users upstream when they found a
> regression? Or will subsystem maintainers send those users away because
> they assume those kernels are not vanilla?
> 
> 
> == Topics or vague ideas I left out on purpose ==
> 
> Here is a list of other things we could talk about, but I think better
> left for a later time:
> 
>  * Kerneloops (http://oops.kernel.org/): It was discussed last year on
> this list. I have no idea what the current status is. Is someone
> watching & analysing it? And poking the right people when needed? (I
> doubt it)
> 
>  * Regression tracking for stable kernels (many bugs only get noticed
> once a new mainline version got released; at that time it might still be
> easy to revert a certain patch in mainline and stable)
> 
>  * statistics: I didn't spend time to create statistics, like Rafael did
> in the past. They'd be nice to have, but for now I think my time is
> better spend elsewhere.
> 
>  * work towards growing the number of tester by making it easier for
> them (better documentation, easier configuration, bisection scripts, ...)
> 
>  * maybe document a few some procedures for those that are not regular
> kernel developers (like the "When users report bugs on the Fedora
> tracker that look like actual upstream bugs, what's the best way to have
> those reported?" thing that Laura mentioned earlier this month in the
> mail "Bug reporting feedback loop"
> 
>  * provide better services than only a plain text list of regression on
> a mailing list?
> 
>  * better documentation? for example explain the difference between bugs
> and regressions somewhere to make people understand why their bugs might
> get ignored, but as the same time know that we handle regressions more
> seriously.
> 
>  * Should the regression tracker nag subsystem maintainers (and
> reporters) more often if they are inactive? How do people for example
> feel about (Semi-)Automatic nagging mails for regressions where there is
> no progress?
> 
>  * Is the data and the format of the current reports show useful at all?
> If not: How to improve it?
> 
>  * regression tracking is a fair amount of work, and it's frustrating,
> and people burn out. How to avoid that? Can we maybe get regression
> tracking on solid ground by somehow building a healthy community around
> it (containing kernel developers, Distro maintainers and people that are
> willing to help in their spare time) that work on regressions
> testing/tracking and other QA stuff?
> 
>  * how to make the Linux kernel development so good that the mainstream
> distros stop their kernel forks and do what they do with Firefox: Ship
> the latest stable version (users get a new version with new features
> every few weeks) or a longterm branch (makes a big version jump about
> once a year; see Firefox ESR).
> 
> Ugh, pretty long mail. Sorry about that. Maybe I shouldn't have looked
> so closely into LWN.net articles about regression tracking and older
> discussions about it.
> 
> Ciao, Thorsten
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss

From dan.carpenter at oracle.com  Fri Jul  7 09:02:06 2017
From: dan.carpenter at oracle.com (Dan Carpenter)
Date: Fri, 7 Jul 2017 12:02:06 +0300
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
Message-ID: <20170707090206.uiry6j7yizpl7yw4@mwanda>

On Fri, Jul 07, 2017 at 07:55:27AM +0200, Krzysztof Kozlowski wrote:
> On Thu, Jul 6, 2017 at 4:42 PM, Dan Carpenter <dan.carpenter at oracle.com> wrote:
> > This tool barely works, it's just a rough draft.
> >
> > Sometimes I want to search for a config so I have to load menuconfig,
> > then search for the config entry, then exit.  With this script I
> > simply run:
> >
> >     ./scripts/kconfig/kconfig search COMEDI
> >
> > Quite often I find myself trying to enable a feature by doing this:
> >
> >     echo CONFIG_FEATURE=y >> .config
> >
> > But when I try to boot the new kernel, I find that the feature isn't
> > there because the kernel runs `make oldconfig` and I didn't have all
> > the depends selected so it silently removed it.  With this feature
> > what you can do is:
> >
> >     ./scripts/kconfig/kconfig set FEATURE=y
> 
> Sounds useful. I need to enable few options from scripts and if
> dependencies change they could be silently skipped.
> 
> Probably it would be nice to print what was effectively enabled to get
> your feature in. However why not extending existing scripts/config? It
> already has the feature for setting kconfig options (without looking
> at dependencies - so like >> of yours).
> 

I didn't know about scripts/config when I wrote it.  scripts/config is
essentially a UI around "echo CONFIG_FOO=m >> .config".  It's totally
useless.

regards,
dan carpenter


From broonie at kernel.org  Fri Jul  7 10:03:40 2017
From: broonie at kernel.org (Mark Brown)
Date: Fri, 7 Jul 2017 11:03:40 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <1499352485.2765.14.camel@HansenPartnership.com>
References: <20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<20170706092836.ifcnc2qqwufndhdl@sirena.org.uk>
	<1499352485.2765.14.camel@HansenPartnership.com>
Message-ID: <20170707100340.kgks5aykbnwtc6om@sirena.org.uk>

On Thu, Jul 06, 2017 at 07:48:05AM -0700, James Bottomley wrote:
> On Thu, 2017-07-06 at 10:28 +0100, Mark Brown wrote:

> > I think before anything like that is viable we need to show a
> > concerted and visible interest in actually running the tests we
> > already have and paying attention to the results - if people can see
> > that they're just checking a checkbox that will often result in low
> > quality tests which can do more harm than good.

> it depends what you mean by "we". ?I used to run a battery of tests
> over every SCSI commit. ?It was time consuming and slowed down the

We as a community, I think something viable needs to be central services
like kernelci that's automated and allows multiple people to be involved
with the analysis.  Hand running tests at scale just doesn't.

> The corollary I take away from this is that the less intrusive the test
> infrastructure is (at least to my process) the happier I am. ?The 0day
> quantum leap for me was going from testing my tree and telling me of
> problems after I've added the patch to testing patches posted to the
> mailing list, which tells me of problems *before* the commit gets added
> to the tree.

I think we'd get a long way just by looking at what's ending up in -next
- it's not as good as detecting things before they go in but it's
workable if people keep on top of things.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170707/55deaa10/attachment.sig>

From dan.carpenter at oracle.com  Fri Jul  7 11:36:51 2017
From: dan.carpenter at oracle.com (Dan Carpenter)
Date: Fri, 7 Jul 2017 14:36:51 +0300
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
Message-ID: <20170707113650.ee6oys5u4vq5hgdi@mwanda>

On Thu, Jul 06, 2017 at 09:41:36AM -0700, Linus Torvalds wrote:
> On Thu, Jul 6, 2017 at 7:40 AM, Dan Carpenter <dan.carpenter at oracle.com> wrote:
> > People have mentioned "make oldconfig" but I've never had a lot of luck
> > with that.  It always just prints "* Restart config..." and deletes my
> > config.
> 
> Really?
> 

Argh.  You're right.  I'm an idiot.  It's actually working fine, but it
asked so many questions I thought it was broken.

regards,
dan carpenter

From krzk at kernel.org  Fri Jul  7 05:55:27 2017
From: krzk at kernel.org (Krzysztof Kozlowski)
Date: Fri, 7 Jul 2017 07:55:27 +0200
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <20170706144208.6hlgxwo37gntk6qm@mwanda>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
Message-ID: <CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>

On Thu, Jul 6, 2017 at 4:42 PM, Dan Carpenter <dan.carpenter at oracle.com> wrote:
> This tool barely works, it's just a rough draft.
>
> Sometimes I want to search for a config so I have to load menuconfig,
> then search for the config entry, then exit.  With this script I
> simply run:
>
>     ./scripts/kconfig/kconfig search COMEDI
>
> Quite often I find myself trying to enable a feature by doing this:
>
>     echo CONFIG_FEATURE=y >> .config
>
> But when I try to boot the new kernel, I find that the feature isn't
> there because the kernel runs `make oldconfig` and I didn't have all
> the depends selected so it silently removed it.  With this feature
> what you can do is:
>
>     ./scripts/kconfig/kconfig set FEATURE=y

Sounds useful. I need to enable few options from scripts and if
dependencies change they could be silently skipped.

Probably it would be nice to print what was effectively enabled to get
your feature in. However why not extending existing scripts/config? It
already has the feature for setting kconfig options (without looking
at dependencies - so like >> of yours).

Best regards,
Krzysztof

>
> It helps you enable the dependencies or it at least prints an error
> if it can't enable the feature.
>
> But this code isn't all implemented.  1) It doesn't calculate the
> dependencies well.  See expr_parse() for more details.  2)  It
> doesn't work well for things like:
>
>         ./scripts/kconfig/kconfig set BT_INTEL=m
>
> because those aren't visible, they can only be using depend
> statements.  Or say you try to set FEATURE=m when something else
> depends on it be set =y then the error message is wrong.  The
> other problem is that I don't know how to print the help text.
> Again, this is just a rough draft.
>
> Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
> ---
>  scripts/kconfig/Makefile |   6 +-
>  scripts/kconfig/kconfig  |  33 +++++
>  scripts/kconfig/lconf.c  | 332 +++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 370 insertions(+), 1 deletion(-)
>  create mode 100755 scripts/kconfig/kconfig
>  create mode 100644 scripts/kconfig/lconf.c
>

From linus.walleij at linaro.org  Sun Jul  9 03:56:47 2017
From: linus.walleij at linaro.org (Linus Walleij)
Date: Sun, 9 Jul 2017 05:56:47 +0200
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <20170707090206.uiry6j7yizpl7yw4@mwanda>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
	<20170707090206.uiry6j7yizpl7yw4@mwanda>
Message-ID: <CACRpkdZB9g3rCgRDnZFDT2g9qQTtUZhKG2Pxp8qKrGP3u1p5+Q@mail.gmail.com>

On Fri, Jul 7, 2017 at 11:02 AM, Dan Carpenter <dan.carpenter at oracle.com> wrote:

> Krzysztof Kozlowski wrote:
>> However why not extending existing scripts/config? It
>> already has the feature for setting kconfig options (without looking
>> at dependencies - so like >> of yours).
>
> I didn't know about scripts/config when I wrote it.  scripts/config is
> essentially a UI around "echo CONFIG_FOO=m >> .config".  It's totally
> useless.

Maybe useless for you but i use it every day in my work. To compile a kernel
for a purpose I have a custom makefile.mak in my top kernel dir
that calls scripts/config to set stuff up on-the-fly with multiple rules like
this:

config-base: FORCE
        @mkdir -p $(build_dir)
        @cp $(rootfs) $(build_dir)/$(rootfsbase)
        $(MAKE) $(make_options) u8500_defconfig

config-initramfs: have-rootfs config-base
        # Configure in the initramfs
        $(CURDIR)/scripts/config --file $(config_file) \
        --enable BLK_DEV_INITRD \
        --set-str INITRAMFS_SOURCE $(rootfsbase) \
        --enable RD_GZIP \
        --enable INITRAMFS_COMPRESSION_GZIP

(....)

config: config-common config-distro config-initramfs
        $(CURDIR)/scripts/config --file $(config_file) \
        --enable USE_OF \
        --enable ARM_APPENDED_DTB \
        --enable ARM_ATAG_DTB_COMPAT \
        --enable PROC_DEVICETREE
        yes "" | make $(make_options) oldconfig

For the full Makefile see:

https://dflund.se/~triad/krad/makefiles/ux500.mak

There are several of these, like some that create a minimal
i586 system with busybox on an initramfs:
https://dflund.se/~triad/krad/makefiles/i586.mak

I don't know if I am stupid in using this rather than config
fragments, but it works for me.

That said, what you have brewing looks better :)

Yours,
Linus Walleij

From geert at linux-m68k.org  Sun Jul  9 08:31:59 2017
From: geert at linux-m68k.org (Geert Uytterhoeven)
Date: Sun, 9 Jul 2017 10:31:59 +0200
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <CACRpkdZB9g3rCgRDnZFDT2g9qQTtUZhKG2Pxp8qKrGP3u1p5+Q@mail.gmail.com>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
	<20170707090206.uiry6j7yizpl7yw4@mwanda>
	<CACRpkdZB9g3rCgRDnZFDT2g9qQTtUZhKG2Pxp8qKrGP3u1p5+Q@mail.gmail.com>
Message-ID: <CAMuHMdUBt4g=VHNBeN9OCq76hWMtR=_Giu55MkeqQZT9+TsX4w@mail.gmail.com>

On Sun, Jul 9, 2017 at 5:56 AM, Linus Walleij <linus.walleij at linaro.org> wrote:
>> Krzysztof Kozlowski wrote:
>>> However why not extending existing scripts/config? It
>>> already has the feature for setting kconfig options (without looking
>>> at dependencies - so like >> of yours).
>>
>> I didn't know about scripts/config when I wrote it.  scripts/config is
>> essentially a UI around "echo CONFIG_FOO=m >> .config".  It's totally
>> useless.
>
> Maybe useless for you but i use it every day in my work. To compile a kernel

I assume the script has it uses.
But to scratch Dan's itch (and mine, for generating .config from DTS), which
is the non-trivial case, it may not work.
So I'll definitely give Dan's script a try, thanks!

>         yes "" | make $(make_options) oldconfig

That will become an infinite loop if "y" is not a valid answer for the newly
introduced option (e.g. if it needs a number)?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

From linux at leemhuis.info  Sun Jul  9 13:46:50 2017
From: linux at leemhuis.info (Thorsten Leemhuis)
Date: Sun, 9 Jul 2017 15:46:50 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <20170705103335.0cbd9984@gandalf.local.home>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<20170705103335.0cbd9984@gandalf.local.home>
Message-ID: <8e49d1f3-2216-ca77-ac06-d62c08c18aea@leemhuis.info>

On 05.07.2017 16:33, Steven Rostedt wrote:
> On Wed, 5 Jul 2017 16:06:07 +0200
> Greg KH <greg at kroah.com> wrote:
> [...]
>> I don't mean to poo-poo the idea, but please realize that around 75% of
>> the kernel is hardware/arch support, so that means that 75% of the
>> changes/fixes deal with hardware things (yes, change is in direct
>> correlation to size of the codebase in the tree, strange but true).
> I would say that if it's for a specific hardware, then it's really up
> to the maintainer if there should be a test or not. As a lot of these
> is just to deal with some quirk or non standard that the hardware does.
> But are these regressions, or just some feature that's been broken on
> that hardware since its conception?
> 
> That is, Thorsten this is more for you, how much real regressions are in
> hardware? [...]
>From this and other mails in this thread I got the impression some more
data would be helpful -- for example a few percentage numbers on
 * how many of the regressions are in hardware-specific/driver code
 * how many regressions suddenly pop up due to a unrelated (and maybe
even correct) change
 * for how many regressions does it make sense to write a selftest to
catch similar issues beforehand in the future.

I'll try to gather some of those numbers when doing regression tracking
for 4.13 (sorry again that I had to skip 4.12), so be prepare yourself
for a mail when you include a "Fixes:" tag in a commit ;-) Then there is
some data to talk about on the summit or continue the discussion on this
mailing list or LKML.

BTW, Steven, you in this thread wrote "discuss if we want to consolidate
the format of all the kselftests and have something that everyone (or
most) developers agree on". I put that in my notes and try to make sure
we do not forget about this. Or is this something you'll drive forward
yourself?

Ciao, Thorsten

P.S.: Sorry, I'm a bit late with my reply here. My real job (which is
not really about kernel work) and some other things required my
attention in the past few days...

From rdunlap at infradead.org  Sun Jul  9 17:03:03 2017
From: rdunlap at infradead.org (Randy Dunlap)
Date: Sun, 9 Jul 2017 10:03:03 -0700
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <CAMuHMdUBt4g=VHNBeN9OCq76hWMtR=_Giu55MkeqQZT9+TsX4w@mail.gmail.com>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
	<20170707090206.uiry6j7yizpl7yw4@mwanda>
	<CACRpkdZB9g3rCgRDnZFDT2g9qQTtUZhKG2Pxp8qKrGP3u1p5+Q@mail.gmail.com>
	<CAMuHMdUBt4g=VHNBeN9OCq76hWMtR=_Giu55MkeqQZT9+TsX4w@mail.gmail.com>
Message-ID: <404833db-da51-e348-060e-c3b4f6a27e0d@infradead.org>

On 07/09/2017 01:31 AM, Geert Uytterhoeven wrote:
> On Sun, Jul 9, 2017 at 5:56 AM, Linus Walleij <linus.walleij at linaro.org> wrote:
>>> Krzysztof Kozlowski wrote:
>>>> However why not extending existing scripts/config? It
>>>> already has the feature for setting kconfig options (without looking
>>>> at dependencies - so like >> of yours).
>>>
>>> I didn't know about scripts/config when I wrote it.  scripts/config is
>>> essentially a UI around "echo CONFIG_FOO=m >> .config".  It's totally
>>> useless.
>>
>> Maybe useless for you but i use it every day in my work. To compile a kernel
> 
> I assume the script has it uses.
> But to scratch Dan's itch (and mine, for generating .config from DTS), which
> is the non-trivial case, it may not work.
> So I'll definitely give Dan's script a try, thanks!
> 
>>         yes "" | make $(make_options) oldconfig
> 
> That will become an infinite loop if "y" is not a valid answer for the newly
> introduced option (e.g. if it needs a number)?

yes ""
just answers with a null string, not 'y'.


-- 
~Randy

From frowand.list at gmail.com  Sun Jul  9 17:32:22 2017
From: frowand.list at gmail.com (Frank Rowand)
Date: Sun, 9 Jul 2017 10:32:22 -0700
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <CAMuHMdUBt4g=VHNBeN9OCq76hWMtR=_Giu55MkeqQZT9+TsX4w@mail.gmail.com>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
	<20170707090206.uiry6j7yizpl7yw4@mwanda>
	<CACRpkdZB9g3rCgRDnZFDT2g9qQTtUZhKG2Pxp8qKrGP3u1p5+Q@mail.gmail.com>
	<CAMuHMdUBt4g=VHNBeN9OCq76hWMtR=_Giu55MkeqQZT9+TsX4w@mail.gmail.com>
Message-ID: <596268A6.3080007@gmail.com>

On 07/09/17 01:31, Geert Uytterhoeven wrote:
> On Sun, Jul 9, 2017 at 5:56 AM, Linus Walleij <linus.walleij at linaro.org> wrote:
>>> Krzysztof Kozlowski wrote:
>>>> However why not extending existing scripts/config? It
>>>> already has the feature for setting kconfig options (without looking
>>>> at dependencies - so like >> of yours).
>>>
>>> I didn't know about scripts/config when I wrote it.  scripts/config is
>>> essentially a UI around "echo CONFIG_FOO=m >> .config".  It's totally
>>> useless.
>>
>> Maybe useless for you but i use it every day in my work. To compile a kernel
> 
> I assume the script has it uses.
> But to scratch Dan's itch (and mine, for generating .config from DTS), which
> is the non-trivial case, it may not work.

Hi Geert,

An aid, though not a full solution, is scripts/dtc/dt_to_config.  Though
if I remember correctly, you are already familiar with that.  For anyone
who wants more information on the complexities of using dt_to_config, how
to use it, and why it has difficulties providing a precise configuration
automatically, see: http://elinux.org/Device_Tree_presentations_papers_articles#linux_kernel_configuration
slides 33 - 80.

-Frank


> So I'll definitely give Dan's script a try, thanks!
> 
>>         yes "" | make $(make_options) oldconfig
> 
> That will become an infinite loop if "y" is not a valid answer for the newly
> introduced option (e.g. if it needs a number)?
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss
> 


From geert at linux-m68k.org  Sun Jul  9 19:43:20 2017
From: geert at linux-m68k.org (Geert Uytterhoeven)
Date: Sun, 9 Jul 2017 21:43:20 +0200
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <404833db-da51-e348-060e-c3b4f6a27e0d@infradead.org>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAJKOXPfE+ffRWq_1JB_1LSpg4gDne5pWNN2fKd-3t1hwX7amFA@mail.gmail.com>
	<20170707090206.uiry6j7yizpl7yw4@mwanda>
	<CACRpkdZB9g3rCgRDnZFDT2g9qQTtUZhKG2Pxp8qKrGP3u1p5+Q@mail.gmail.com>
	<CAMuHMdUBt4g=VHNBeN9OCq76hWMtR=_Giu55MkeqQZT9+TsX4w@mail.gmail.com>
	<404833db-da51-e348-060e-c3b4f6a27e0d@infradead.org>
Message-ID: <CAMuHMdUe1oRr0SNGz=0Gq4A2k=BO+PA4Lr810YsPELPpqrEAbg@mail.gmail.com>

On Sun, Jul 9, 2017 at 7:03 PM, Randy Dunlap <rdunlap at infradead.org> wrote:
> On 07/09/2017 01:31 AM, Geert Uytterhoeven wrote:
>> On Sun, Jul 9, 2017 at 5:56 AM, Linus Walleij <linus.walleij at linaro.org> wrote:
>>>> Krzysztof Kozlowski wrote:
>>>>> However why not extending existing scripts/config? It
>>>>> already has the feature for setting kconfig options (without looking
>>>>> at dependencies - so like >> of yours).
>>>>
>>>> I didn't know about scripts/config when I wrote it.  scripts/config is
>>>> essentially a UI around "echo CONFIG_FOO=m >> .config".  It's totally
>>>> useless.
>>>
>>> Maybe useless for you but i use it every day in my work. To compile a kernel
>>
>> I assume the script has it uses.
>> But to scratch Dan's itch (and mine, for generating .config from DTS), which
>> is the non-trivial case, it may not work.
>> So I'll definitely give Dan's script a try, thanks!
>>
>>>         yes "" | make $(make_options) oldconfig
>>
>> That will become an infinite loop if "y" is not a valid answer for the newly
>> introduced option (e.g. if it needs a number)?
>
> yes ""
> just answers with a null string, not 'y'.

Oops, that's correct. /me on a lazy Sunday afternoon...
So the difficult part is "yes y" or "yes n".

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

From geert at linux-m68k.org  Mon Jul 10 09:44:22 2017
From: geert at linux-m68k.org (Geert Uytterhoeven)
Date: Mon, 10 Jul 2017 11:44:22 +0200
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <20170706144208.6hlgxwo37gntk6qm@mwanda>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
Message-ID: <CAMuHMdW4FFJkhA7oj0uA1an_WNt1wS-tM1KvnjYWwdfjRLjQ7A@mail.gmail.com>

Hi Dan,

On Thu, Jul 6, 2017 at 4:42 PM, Dan Carpenter <dan.carpenter at oracle.com> wrote:
> This tool barely works, it's just a rough draft.
>
> Sometimes I want to search for a config so I have to load menuconfig,
> then search for the config entry, then exit.  With this script I
> simply run:
>
>     ./scripts/kconfig/kconfig search COMEDI
>
> Quite often I find myself trying to enable a feature by doing this:
>
>     echo CONFIG_FEATURE=y >> .config
>
> But when I try to boot the new kernel, I find that the feature isn't
> there because the kernel runs `make oldconfig` and I didn't have all
> the depends selected so it silently removed it.  With this feature
> what you can do is:
>
>     ./scripts/kconfig/kconfig set FEATURE=y
>
> It helps you enable the dependencies or it at least prints an error
> if it can't enable the feature.
>
> But this code isn't all implemented.  1) It doesn't calculate the
> dependencies well.  See expr_parse() for more details.  2)  It
> doesn't work well for things like:
>
>         ./scripts/kconfig/kconfig set BT_INTEL=m
>
> because those aren't visible, they can only be using depend
> statements.  Or say you try to set FEATURE=m when something else
> depends on it be set =y then the error message is wrong.  The
> other problem is that I don't know how to print the help text.
> Again, this is just a rough draft.
>
> Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>

Thanks! With the small fixes below, it worked fine for all cases I tried it
with.

> --- /dev/null
> +++ b/scripts/kconfig/lconf.c
> @@ -0,0 +1,332 @@
> +/*
> + * Copyright (C) 2015 Oracle
> + * Released under the terms of the GNU GPL v2.0.
> + *
> + */
> +#define _GNU_SOURCE

    scripts/kconfig/lconf.c:6:0: warning: "_GNU_SOURCE" redefined
     #define _GNU_SOURCE
     ^
    <command-line>:0:0: note: this is the location of the previous definition

You can do:

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif

like scripts/kconfig/nconf.c does.


> +static int conf_sym(struct symbol *sym)
> +{

> +               if (sym_set_tristate_value(sym, newval)) {
> +                       /* FIXME: if I don't write it doesn't save */
> +                       conf_write(NULL, 1);

    scripts/kconfig/lconf.c: In function ?conf_sym?:
    scripts/kconfig/lconf.c:159:4: error: too many arguments to
function ?conf_write?
        conf_write(NULL, 1);
        ^
    In file included from scripts/kconfig/lkc.h:24:0,
                     from scripts/kconfig/lconf.c:10:
    scripts/kconfig/lkc_proto.h:8:5: note: declared here

It seems it never took 2 parameters in upstream?
Dropping the "1" works.


> +static void kconfig_set(void)
> +{

> +       res = conf_write(NULL, 1);

Likewise


For search, it doesn't work with the CONFIG_ prefix:

$ path-to-source-tree/scripts/kconfig/kconfig search CONFIG_IPMMU_VMSA
  GEN     ./Makefile
No matches found.
$ path-to-source-tree/scripts/kconfig/kconfig search IPMMU_VMSA
  GEN     ./Makefile
Symbol: IPMMU_VMSA [=n]
Type  : boolean
Prompt: Renesas VMSA-compatible IPMMU
  Location:
    -> Device Drivers
      -> IOMMU Hardware Support (IOMMU_SUPPORT [=n])
  Defined at drivers/iommu/Kconfig:275
  Depends on: IOMMU_SUPPORT [=n] && (ARM [=y] || IOMMU_DMA [=n]) &&
(ARCH_RENESAS [=y] || COMPILE_TEST [=n])
  Selects: IOMMU_API [=n] && IOMMU_IO_PGTABLE_LPAE [=n] &&
ARM_DMA_USE_IOMMU [=n]


For set, it works with or without the CONFIG_ prefix:

$ path-to-source-tree/scripts/kconfig/kconfig set CONFIG_IPMMU_VMSA=y
  GEN     ./Makefile

IPMMU_VMSA: has missing dependencies
IOMMU_SUPPORT [=n] && (ARM [=y] || IOMMU_DMA [=n]) && (ARCH_RENESAS
[=y] || COMPILE_TEST [=n])

IOMMU_SUPPORT:  IOMMU Hardware Support [N/y] y
y
#
# configuration written to .config
#
HELP.  Lot of unimplemented code.  1
HELP.  Lot of unimplemented code.  1
#
# configuration written to .config
#
set: IPMMU_VMSA=y
$ diff .config{.orig,}
--- .config.orig        2017-07-10 11:34:13.181395059 +0200
+++ .config     2017-07-10 11:34:23.297370970 +0200
@@ -4,6 +4,9 @@
 #
 CONFIG_ARM=y
 CONFIG_ARM_HAS_SG_CHAIN=y
+CONFIG_NEED_SG_DMA_LENGTH=y
+CONFIG_ARM_DMA_USE_IOMMU=y
+CONFIG_ARM_DMA_IOMMU_ALIGNMENT=8
 CONFIG_MIGHT_HAVE_PCI=y
 CONFIG_SYS_SUPPORTS_APM_EMULATION=y
 CONFIG_HAVE_PROC_CPU=y
@@ -3452,6 +3455,7 @@ CONFIG_SYNC_FILE=y
 # CONFIG_SW_SYNC is not set
 # CONFIG_AUXDISPLAY is not set
 # CONFIG_UIO is not set
+# CONFIG_VFIO is not set
 # CONFIG_VIRT_DRIVERS is not set

 #
@@ -3634,7 +3638,19 @@ CONFIG_RENESAS_OSTM=y
 CONFIG_SH_TIMER_TMU=y
 CONFIG_EM_TIMER_STI=y
 # CONFIG_MAILBOX is not set
-# CONFIG_IOMMU_SUPPORT is not set
+CONFIG_IOMMU_API=y
+CONFIG_IOMMU_SUPPORT=y
+
+#
+# Generic IOMMU Pagetable Support
+#
+CONFIG_IOMMU_IO_PGTABLE=y
+CONFIG_IOMMU_IO_PGTABLE_LPAE=y
+# CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST is not set
+# CONFIG_IOMMU_IO_PGTABLE_ARMV7S is not set
+CONFIG_OF_IOMMU=y
+CONFIG_IPMMU_VMSA=y
+# CONFIG_ARM_SMMU is not set

 #
 # Remoteproc drivers

Nice!

BTW, forgetting the =y causes a crash:

$ path-to-source-tree/scripts/kconfig/kconfig set IPMMU_VMSA
  GEN     ./Makefile
path-to-source-tree/scripts/kconfig/Makefile:37: recipe for target
'lconfig' failed
make[4]: *** [lconfig] Segmentation fault
path-to-source-tree/Makefile:548: recipe for target 'lconfig' failed
make[3]: *** [lconfig] Error 2
Makefile:152: recipe for target 'sub-make' failed
make[2]: *** [sub-make] Error 2
Makefile:24: recipe for target '__sub-make' failed
make[1]: *** [__sub-make] Error 2
GNUmakefile:10: recipe for target 'all' failed
make: *** [all] Error 2

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

From dan.carpenter at oracle.com  Mon Jul 10 11:15:56 2017
From: dan.carpenter at oracle.com (Dan Carpenter)
Date: Mon, 10 Jul 2017 14:15:56 +0300
Subject: [Ksummit-discuss] [PATCH 2/2] kconfig: new command line kernel
 configuration tool
In-Reply-To: <CAMuHMdW4FFJkhA7oj0uA1an_WNt1wS-tM1KvnjYWwdfjRLjQ7A@mail.gmail.com>
References: <20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<20170706144208.6hlgxwo37gntk6qm@mwanda>
	<CAMuHMdW4FFJkhA7oj0uA1an_WNt1wS-tM1KvnjYWwdfjRLjQ7A@mail.gmail.com>
Message-ID: <20170710111555.b66w4vuc6irur5n4@mwanda>

On Mon, Jul 10, 2017 at 11:44:22AM +0200, Geert Uytterhoeven wrote:
> > --- /dev/null
> > +++ b/scripts/kconfig/lconf.c
> > @@ -0,0 +1,332 @@
> > +/*
> > + * Copyright (C) 2015 Oracle
> > + * Released under the terms of the GNU GPL v2.0.
> > + *
> > + */
> > +#define _GNU_SOURCE
> 
>     scripts/kconfig/lconf.c:6:0: warning: "_GNU_SOURCE" redefined
>      #define _GNU_SOURCE
>      ^
>     <command-line>:0:0: note: this is the location of the previous definition
> 
> You can do:
> 
>     #ifndef _GNU_SOURCE
>     #define _GNU_SOURCE
>     #endif
> 
> like scripts/kconfig/nconf.c does.

Will do.

> 
> 
> > +static int conf_sym(struct symbol *sym)
> > +{
> 
> > +               if (sym_set_tristate_value(sym, newval)) {
> > +                       /* FIXME: if I don't write it doesn't save */
> > +                       conf_write(NULL, 1);
> 
>     scripts/kconfig/lconf.c: In function ?conf_sym?:
>     scripts/kconfig/lconf.c:159:4: error: too many arguments to
> function ?conf_write?
>         conf_write(NULL, 1);


I added that in [PATCH 1/2], otherwise there is a lot of unwanted
output.

> Likewise
> 
> 
> For search, it doesn't work with the CONFIG_ prefix:


Will fix.

> 
> Nice!
> 
> BTW, forgetting the =y causes a crash:
> 

Oops.  Sorry.  Will fix.

regards,
dan carpenter


From tony.luck at intel.com  Mon Jul 10 17:15:33 2017
From: tony.luck at intel.com (Luck, Tony)
Date: Mon, 10 Jul 2017 17:15:33 +0000
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <20170707113650.ee6oys5u4vq5hgdi@mwanda>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
	<20170707113650.ee6oys5u4vq5hgdi@mwanda>
Message-ID: <3908561D78D1C84285E8C5FCA982C28F613009A7@ORSMSX114.amr.corp.intel.com>

> Argh.  You're right.  I'm an idiot.  It's actually working fine, but it
> asked so many questions I thought it was broken.

I run:

$ yes "" | make oldconfig

to just pick the default answer for all the questions.  It works
almost all of the time.  Only recent break was when using
a RHEL config as the start point some change in the MPT2SAS
and MPT3SAS bits left me with a kernel with no driver to get to
my root file system.

-Tony

From alexandre.belloni at free-electrons.com  Mon Jul 10 17:33:35 2017
From: alexandre.belloni at free-electrons.com (Alexandre Belloni)
Date: Mon, 10 Jul 2017 19:33:35 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F613009A7@ORSMSX114.amr.corp.intel.com>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
	<20170707113650.ee6oys5u4vq5hgdi@mwanda>
	<3908561D78D1C84285E8C5FCA982C28F613009A7@ORSMSX114.amr.corp.intel.com>
Message-ID: <20170710173335.4ksnso6dzaekoxz4@piout.net>

On 10/07/2017 at 17:15:33 +0000, Luck, Tony wrote:
> > Argh.  You're right.  I'm an idiot.  It's actually working fine, but it
> > asked so many questions I thought it was broken.
> 
> I run:
> 
> $ yes "" | make oldconfig
> 

I know the yes trick works for kernels older than 3.7 but maybe people
should start using make olddefconfig ;)


-- 
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

From torvalds at linux-foundation.org  Mon Jul 10 18:28:58 2017
From: torvalds at linux-foundation.org (Linus Torvalds)
Date: Mon, 10 Jul 2017 11:28:58 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <20170710173335.4ksnso6dzaekoxz4@piout.net>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
	<20170707113650.ee6oys5u4vq5hgdi@mwanda>
	<3908561D78D1C84285E8C5FCA982C28F613009A7@ORSMSX114.amr.corp.intel.com>
	<20170710173335.4ksnso6dzaekoxz4@piout.net>
Message-ID: <CA+55aFzmjA1gkaVwr5J7UoW2Ovgsqr3aJJC_18Wv59QWp94+wA@mail.gmail.com>

On Mon, Jul 10, 2017 at 10:33 AM, Alexandre Belloni
<alexandre.belloni at free-electrons.com> wrote:
>
> I know the yes trick works for kernels older than 3.7 but maybe people
> should start using make olddefconfig ;)

Honestly, I wish more people just ran "oldconfig" and then started
complaining about people adding insane Kconfig options.

I seem to be the only one ever pushing back against some of the people
out there that add Kconfig options that really make zero sense (or
that add the oddest drivers or features with a crazy "default this
thing to on").

                       Linus

From rdunlap at infradead.org  Mon Jul 10 19:44:55 2017
From: rdunlap at infradead.org (Randy Dunlap)
Date: Mon, 10 Jul 2017 12:44:55 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <CA+55aFzmjA1gkaVwr5J7UoW2Ovgsqr3aJJC_18Wv59QWp94+wA@mail.gmail.com>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
	<20170707113650.ee6oys5u4vq5hgdi@mwanda>
	<3908561D78D1C84285E8C5FCA982C28F613009A7@ORSMSX114.amr.corp.intel.com>
	<20170710173335.4ksnso6dzaekoxz4@piout.net>
	<CA+55aFzmjA1gkaVwr5J7UoW2Ovgsqr3aJJC_18Wv59QWp94+wA@mail.gmail.com>
Message-ID: <e210384a-4873-d439-7031-43abfc74d134@infradead.org>

On 07/10/2017 11:28 AM, Linus Torvalds wrote:
> On Mon, Jul 10, 2017 at 10:33 AM, Alexandre Belloni
> <alexandre.belloni at free-electrons.com> wrote:
>>
>> I know the yes trick works for kernels older than 3.7 but maybe people
>> should start using make olddefconfig ;)
> 
> Honestly, I wish more people just ran "oldconfig" and then started
> complaining about people adding insane Kconfig options.
> 
> I seem to be the only one ever pushing back against some of the people
> out there that add Kconfig options that really make zero sense (or
> that add the oddest drivers or features with a crazy "default this
> thing to on").

I could -- and I have a few times. But usually it needs a $maintainer
to make others listen.  I'm just a lurker.


-- 
~Randy

From vrothberg at suse.com  Tue Jul 11 06:21:32 2017
From: vrothberg at suse.com (Valentin Rothberg)
Date: Tue, 11 Jul 2017 08:21:32 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] is Kconfig a bit hard sometimes?
In-Reply-To: <CA+55aFzmjA1gkaVwr5J7UoW2Ovgsqr3aJJC_18Wv59QWp94+wA@mail.gmail.com>
References: <20170627135839.GB1886@jagdpanzerIV.localdomain>
	<20170706144028.46a2mt2mdzpt6ip7@mwanda>
	<CA+55aFyPQeMYUafR32parh=jGU0O8prn5vhOhhW+WBU-LbR4HQ@mail.gmail.com>
	<20170707113650.ee6oys5u4vq5hgdi@mwanda>
	<3908561D78D1C84285E8C5FCA982C28F613009A7@ORSMSX114.amr.corp.intel.com>
	<20170710173335.4ksnso6dzaekoxz4@piout.net>
	<CA+55aFzmjA1gkaVwr5J7UoW2Ovgsqr3aJJC_18Wv59QWp94+wA@mail.gmail.com>
Message-ID: <20170711062132.GA13470@nebuchadnezzar.suse.de>

On Jul 10 '17 11:28, Linus Torvalds wrote:
> On Mon, Jul 10, 2017 at 10:33 AM, Alexandre Belloni
> <alexandre.belloni at free-electrons.com> wrote:
> >
> > I know the yes trick works for kernels older than 3.7 but maybe people
> > should start using make olddefconfig ;)
> 
> Honestly, I wish more people just ran "oldconfig" and then started
> complaining about people adding insane Kconfig options.

If you want, we could add a "--diff-options" to checkkconfigsymbols.py.
It runs reasonably fast and would also report options outside the
current architecture.

Kind regards,
 Valentin

From dhowells at redhat.com  Wed Jul 12 12:43:30 2017
From: dhowells at redhat.com (David Howells)
Date: Wed, 12 Jul 2017 13:43:30 +0100
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary error
	info back to userspace
Message-ID: <10144.1499863410@warthog.procyon.org.uk>

Whilst undertaking a foray into container space and, related to that, looking
at overhauling the mounting API, it occurred to me that I could make use of
the mount context (now fs_context) that I was creating to allow the filesystem
driver to pass supplementary error information back to the userspace program
that was driving it in the form of textual messages:

	int fd = fsopen("ext4");
	write(fd, "d /dev/sda2");
	write(fd, "o user_xattr");
	if (fsmount(fd, "/mnt") == -1) {
		/* Something went wrong, read back any error info */
		size = read(fd, buffer, sizeof(buffer));
		/* Now print the supplementary error message */
		fprintf(stderr, "%*.*s\n", size, size, buffer);
	}

This would be particularly useful in the case of mounting a filesystem where
so many things can go wrong that a small number is insufficient to represent
them all.  Yes, you have the dmesg log, but that's not necessarily available
to you and is potentially intermixed with other things.  Further, it's more
user-friendly if the mount command or your GUI gives you the errors directly.

However, it occurred to me that this feature might be useful in other cases,
not just mounting, and there are cases where it's not easy or not possible to
get the message back to userspace because there's no user-accessible context
(eg. automounting), or because the context is buried several levels down the
stack (eg. NFS mount doing a pathwalk).

In which case, would it make sense to attach such a facility to the
task_struct instead?  I implemented a test of this using prctl, but a new
syscall might be a better idea, at least for reading.

 (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);

     Enable (setting == 1) or disable (setting == 0) the facility.
     Disabling the facility clears the error buffer.

 (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);

     Read back a message and discard it.  


Anyway, some questions:

 (1) Is this something that would be of interest on a more global scale?

     Or should I just stick with stashing it in the fs_context structure and
     find someway to route around the pathwalk in nfs mount?

     Or is this totally a bad idea and only dmesg should ever be used?

If it is of interest globally:

 (2) How big should I make each task's message buffer?  My current
     implementation allows each task to hold a single message if enabled.

 (3) Should I allow warnings in addition to errors?

 (4) Should I allow wait() and co. to try and retrieve errors from zombies?

 (5) Should execve() disable the facility?

 (6) Could all the messages be static (not kmalloc'd) and cleared/redacted by
     rmmod?  This would potentially prevent the use of formatted messages.

David

From acme at kernel.org  Wed Jul 12 14:33:21 2017
From: acme at kernel.org (Arnaldo Carvalho de Melo)
Date: Wed, 12 Jul 2017 11:33:21 -0300
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <10144.1499863410@warthog.procyon.org.uk>
References: <10144.1499863410@warthog.procyon.org.uk>
Message-ID: <20170712143321.GL27350@kernel.org>

Em Wed, Jul 12, 2017 at 01:43:30PM +0100, David Howells escreveu:
> Whilst undertaking a foray into container space and, related to that, looking
> at overhauling the mounting API, it occurred to me that I could make use of
> the mount context (now fs_context) that I was creating to allow the filesystem
> driver to pass supplementary error information back to the userspace program
> that was driving it in the form of textual messages:
> 
> 	int fd = fsopen("ext4");
> 	write(fd, "d /dev/sda2");
> 	write(fd, "o user_xattr");
> 	if (fsmount(fd, "/mnt") == -1) {
> 		/* Something went wrong, read back any error info */
> 		size = read(fd, buffer, sizeof(buffer));
> 		/* Now print the supplementary error message */
> 		fprintf(stderr, "%*.*s\n", size, size, buffer);
> 	}
> 
> This would be particularly useful in the case of mounting a filesystem where
> so many things can go wrong that a small number is insufficient to represent
> them all.  Yes, you have the dmesg log, but that's not necessarily available
> to you and is potentially intermixed with other things.  Further, it's more
> user-friendly if the mount command or your GUI gives you the errors directly.
> 
> However, it occurred to me that this feature might be useful in other cases,
> not just mounting, and there are cases where it's not easy or not possible to
> get the message back to userspace because there's no user-accessible context
> (eg. automounting), or because the context is buried several levels down the
> stack (eg. NFS mount doing a pathwalk).
> 
> In which case, would it make sense to attach such a facility to the
> task_struct instead?  I implemented a test of this using prctl, but a new
> syscall might be a better idea, at least for reading.
> 
>  (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);
> 
>      Enable (setting == 1) or disable (setting == 0) the facility.
>      Disabling the facility clears the error buffer.
> 
>  (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);
> 
>      Read back a message and discard it.  

There were discussions about this in the not so distant past, perf being
one of the areas where something like this would help a lot, lemme dig
it, yeah, there is even a short LWN article describing it and with links
to the lkml posts:

  https://lwn.net/Articles/657341/

Involces prctl as yours, etc, etc.

What we do now in tools/perf/ with what we do have now is to have
strerrno like messages for each class and method (well, we have for some
of them), like:

  int perf_evsel__open_strerror(struct perf_evsel *evsel,
                                struct target *target,
                                int err, char *msg, size_t size);

where we have a switch to see, from syscall errno return and intended
target (CPU, system wide, a specific thread, cgroups, etc), who is
asking this (user, root, etc) and lots of other tunables, how to best
translate this to the user, formatting it in a string allows us to show
it in whatever GUI is in use.

- Arnaldo
 
> 
> Anyway, some questions:
> 
>  (1) Is this something that would be of interest on a more global scale?
> 
>      Or should I just stick with stashing it in the fs_context structure and
>      find someway to route around the pathwalk in nfs mount?
> 
>      Or is this totally a bad idea and only dmesg should ever be used?
> 
> If it is of interest globally:
> 
>  (2) How big should I make each task's message buffer?  My current
>      implementation allows each task to hold a single message if enabled.
> 
>  (3) Should I allow warnings in addition to errors?
> 
>  (4) Should I allow wait() and co. to try and retrieve errors from zombies?
> 
>  (5) Should execve() disable the facility?
> 
>  (6) Could all the messages be static (not kmalloc'd) and cleared/redacted by
>      rmmod?  This would potentially prevent the use of formatted messages.
> 
> David
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss

From acme at kernel.org  Wed Jul 12 14:44:28 2017
From: acme at kernel.org (Arnaldo Carvalho de Melo)
Date: Wed, 12 Jul 2017 11:44:28 -0300
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <20170712143321.GL27350@kernel.org>
References: <10144.1499863410@warthog.procyon.org.uk>
	<20170712143321.GL27350@kernel.org>
Message-ID: <20170712144428.GM27350@kernel.org>

Em Wed, Jul 12, 2017 at 11:33:21AM -0300, Arnaldo Carvalho de Melo escreveu:
> What we do now in tools/perf/ with what we do have now is to have
> strerrno like messages for each class and method (well, we have for some
> of them), like:
> 
>   int perf_evsel__open_strerror(struct perf_evsel *evsel,
>                                 struct target *target,
>                                 int err, char *msg, size_t size);
> 
> where we have a switch to see, from syscall errno return and intended
> target (CPU, system wide, a specific thread, cgroups, etc), who is
> asking this (user, root, etc) and lots of other tunables, how to best
> translate this to the user, formatting it in a string allows us to show
> it in whatever GUI is in use.

To get this clearer in terms of actual usage, here is a (simplified) snippet
for 'perf top':

  try_again:

  	if (perf_evsel__open(event, cpus, threads) < 0) {
		if (perf_evsel__fallback(event, errno, msg, sizeof(msg))) {
			if (verbose > 0)
				ui__warning("%s\n", msg);
				goto try_again;
                        }

		perf_evsel__open_strerror(event, target, errno, msg, sizeof(msg));
		ui__error("%s\n", msg);
		goto out_err;
	}

- Arnaldo

From dhowells at redhat.com  Wed Jul 12 14:57:56 2017
From: dhowells at redhat.com (David Howells)
Date: Wed, 12 Jul 2017 15:57:56 +0100
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
	error info back to userspace
In-Reply-To: <10144.1499863410@warthog.procyon.org.uk>
References: <10144.1499863410@warthog.procyon.org.uk>
Message-ID: <12463.1499871476@warthog.procyon.org.uk>

David Howells <dhowells at redhat.com> wrote:

> In which case, would it make sense to attach such a facility to the
> task_struct instead?  I implemented a test of this using prctl, but a new
> syscall might be a better idea, at least for reading.
> 
>  (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);
> 
>      Enable (setting == 1) or disable (setting == 0) the facility.
>      Disabling the facility clears the error buffer.
> 
>  (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);
> 
>      Read back a message and discard it.  

I forgot to add that I've kept the in-kernel interface I have for this very
simple for the moment:

	void errorf(const char *fmt, ...);
	int invalf(const char *fmt, ...);

where these functions take printf-style arguments and where invalf() is the
same as errorf(), but returns -EINVAL for convenience.  To take an example
from NFS:

-	if (auth_info->flavor_len + 1 >= max_flavor_len) {
-		dfprintk(MOUNT, "NFS: too many sec= flavors\n");
-		return -EINVAL;
-	}
+	if (auth_info->flavor_len + 1 >= max_flavor_len)
+		return invalf("NFS: too many sec= flavors");

David

From stephen at networkplumber.org  Wed Jul 12 15:21:39 2017
From: stephen at networkplumber.org (Stephen Hemminger)
Date: Wed, 12 Jul 2017 08:21:39 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <12463.1499871476@warthog.procyon.org.uk>
References: <10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
Message-ID: <20170712082139.17cfd33a@xeon-e3>

On Wed, 12 Jul 2017 15:57:56 +0100
David Howells <dhowells at redhat.com> wrote:

> David Howells <dhowells at redhat.com> wrote:
> 
> > In which case, would it make sense to attach such a facility to the
> > task_struct instead?  I implemented a test of this using prctl, but a new
> > syscall might be a better idea, at least for reading.
> > 
> >  (*) int old_setting = prctl(PR_ERRMSG_ENABLE, int setting);
> > 
> >      Enable (setting == 1) or disable (setting == 0) the facility.
> >      Disabling the facility clears the error buffer.
> > 
> >  (*) int size = prctl(PR_ERRMSG_READ, char *buffer, int buf_size);
> > 
> >      Read back a message and discard it.    
> 
> I forgot to add that I've kept the in-kernel interface I have for this very
> simple for the moment:
> 
> 	void errorf(const char *fmt, ...);
> 	int invalf(const char *fmt, ...);
> 
> where these functions take printf-style arguments and where invalf() is the
> same as errorf(), but returns -EINVAL for convenience.  To take an example
> from NFS:
> 
> -	if (auth_info->flavor_len + 1 >= max_flavor_len) {
> -		dfprintk(MOUNT, "NFS: too many sec= flavors\n");
> -		return -EINVAL;
> -	}
> +	if (auth_info->flavor_len + 1 >= max_flavor_len)
> +		return invalf("NFS: too many sec= flavors");

Netlink has recently got extended error reporting, still not used widely
and library support is lacking in most places.

From torvalds at linux-foundation.org  Wed Jul 12 16:19:55 2017
From: torvalds at linux-foundation.org (Linus Torvalds)
Date: Wed, 12 Jul 2017 09:19:55 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <20170712082139.17cfd33a@xeon-e3>
References: <10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
	<20170712082139.17cfd33a@xeon-e3>
Message-ID: <CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>

On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
<stephen at networkplumber.org> wrote:
>
> Netlink has recently got extended error reporting, still not used widely
> and library support is lacking in most places.

Yeah, and that "not widely supported and library support is lacking"
is always going to be an issue with anything like that.

Along with internationalization, which is a whole nasty set of issues
in itself with error messages.

It's not going to happen, in other words. The problems are basically
insurmountable, and the thing it fixes will always be some special
case that doesn't much matter.

Every time it comes up it is because some developer found one case
that they were hunting down and it annoyed them, and the developer
went "if only it had included more information and it would have been
obvious".

But every time it comes up people ignore this basic issue:

     [torvalds at i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
     182523


Give it up. It's really is a horrible idea for so many reasons.

                     Linus

From stephen at networkplumber.org  Wed Jul 12 16:35:07 2017
From: stephen at networkplumber.org (Stephen Hemminger)
Date: Wed, 12 Jul 2017 09:35:07 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
References: <10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
	<20170712082139.17cfd33a@xeon-e3>
	<CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
Message-ID: <20170712093507.4482f3fc@xeon-e3>

On Wed, 12 Jul 2017 09:19:55 -0700
Linus Torvalds <torvalds at linux-foundation.org> wrote:

> On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
> <stephen at networkplumber.org> wrote:
> >
> > Netlink has recently got extended error reporting, still not used widely
> > and library support is lacking in most places.  
> 
> Yeah, and that "not widely supported and library support is lacking"
> is always going to be an issue with anything like that.
> 
> Along with internationalization, which is a whole nasty set of issues
> in itself with error messages.
> 
> It's not going to happen, in other words. The problems are basically
> insurmountable, and the thing it fixes will always be some special
> case that doesn't much matter.
> 
> Every time it comes up it is because some developer found one case
> that they were hunting down and it annoyed them, and the developer
> went "if only it had included more information and it would have been
> obvious".
> 
> But every time it comes up people ignore this basic issue:
> 
>      [torvalds at i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>      182523
> 
> 
> Give it up. It's really is a horrible idea for so many reasons.

For netlink, it isn't so bad. 80% of the usage is in iproute2 and
therefore getting tool support for the usual cases isn't too hard.

I fear kernel developers think at too low a level. They think if glibc
and/or 1st level command can handle an extension, their work is done.
But in the modern world, there are many scripts and layers above that.
For the networking case, the worst case examples are things where configuration
is done in stuff like some layer on top of Openstack, in python code
which is scripting ip commands, which is talking to the kernel. Good luck
on trying to get any meaningful error handling out of that dog pile.

From leon at kernel.org  Fri Jul 14 04:04:47 2017
From: leon at kernel.org (Leon Romanovsky)
Date: Fri, 14 Jul 2017 07:04:47 +0300
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170630062717.534b06e9@canb.auug.org.au>
References: <CAGXu5j+bpi6-krTYwN_BdhFnHZRYpQwhtc9Z-kcRerm+t-Xyfw@mail.gmail.com>
	<1498754169.2834.61.camel@HansenPartnership.com>
	<CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
Message-ID: <20170714040447.GT1528@mtr-leonro.local>

On Fri, Jun 30, 2017 at 06:27:17AM +1000, Stephen Rothwell wrote:
> Hi Kees,
>
> On Thu, 29 Jun 2017 13:16:40 -0700 Kees Cook <keescook at chromium.org> wrote:
> >
> > [1] If the solution for this is to merge other -next trees into mine,
> > I guess I can do that, though it can be very messy if any of them are
> > forced to make their commits unstable. It also creates headaches,
> > AIUI, for sfr if my tree suddenly gains a bunch of other trees so it's
> > not clear where something came from.
>
> I don't have a problem with trees in linux-next sharing *commits* - I
> have problems when they share *patches* that are different commits
> (that affect files that get changed in other commits).

Do we have any sane way to overcome this limitation?

I tried to add my tree [1] to participate in linux-next. My tree
includes my submission queue and important patches posted to the mailing list
to the RDMA subsystem.

The absence of ability to add parallel tree with same commits doesn't allow us
effectively test the RDMA patches.

The reasons to it are combination of mostly two factors: my tree is not
official one [2] (all patches in my tree are not officially final) and very
sporadic update very close and/or during merge window [3].

In this cycle, we missed merge window [4] because lack of ready for pull tree [5].

Thanks

[1] https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/
[2] https://git.kernel.org/cgit/linux/kernel/git/dledford/rdma.git/
[3] http://marc.info/?l=linux-next&m=149999488214297&w=2
[4] http://marc.info/?l=linux-rdma&m=149980130008834&w=2
[5] http://marc.info/?l=linux-rdma&m=149987945120683&w=2
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/fa23dcd3/attachment.sig>

From greg at kroah.com  Fri Jul 14 09:54:09 2017
From: greg at kroah.com (Greg KH)
Date: Fri, 14 Jul 2017 11:54:09 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714040447.GT1528@mtr-leonro.local>
References: <CAGXu5j+bpi6-krTYwN_BdhFnHZRYpQwhtc9Z-kcRerm+t-Xyfw@mail.gmail.com>
	<1498754169.2834.61.camel@HansenPartnership.com>
	<CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
Message-ID: <20170714095409.GF2269@kroah.com>

On Fri, Jul 14, 2017 at 07:04:47AM +0300, Leon Romanovsky wrote:
> On Fri, Jun 30, 2017 at 06:27:17AM +1000, Stephen Rothwell wrote:
> > Hi Kees,
> >
> > On Thu, 29 Jun 2017 13:16:40 -0700 Kees Cook <keescook at chromium.org> wrote:
> > >
> > > [1] If the solution for this is to merge other -next trees into mine,
> > > I guess I can do that, though it can be very messy if any of them are
> > > forced to make their commits unstable. It also creates headaches,
> > > AIUI, for sfr if my tree suddenly gains a bunch of other trees so it's
> > > not clear where something came from.
> >
> > I don't have a problem with trees in linux-next sharing *commits* - I
> > have problems when they share *patches* that are different commits
> > (that affect files that get changed in other commits).
> 
> Do we have any sane way to overcome this limitation?
> 
> I tried to add my tree [1] to participate in linux-next. My tree
> includes my submission queue and important patches posted to the mailing list
> to the RDMA subsystem.
> 
> The absence of ability to add parallel tree with same commits doesn't allow us
> effectively test the RDMA patches.

Why do you need "parallel" trees in linux-next?  What is that going to
help with?

> The reasons to it are combination of mostly two factors: my tree is not
> official one [2] (all patches in my tree are not officially final) and very
> sporadic update very close and/or during merge window [3].

If it's not "official", why should it be in linux-next?

thanks,

greg k-h

From leon at kernel.org  Fri Jul 14 10:29:20 2017
From: leon at kernel.org (Leon Romanovsky)
Date: Fri, 14 Jul 2017 13:29:20 +0300
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714095409.GF2269@kroah.com>
References: <CAGXu5j+bpi6-krTYwN_BdhFnHZRYpQwhtc9Z-kcRerm+t-Xyfw@mail.gmail.com>
	<1498754169.2834.61.camel@HansenPartnership.com>
	<CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
Message-ID: <20170714102920.GY1528@mtr-leonro.local>

On Fri, Jul 14, 2017 at 11:54:09AM +0200, Greg KH wrote:
> On Fri, Jul 14, 2017 at 07:04:47AM +0300, Leon Romanovsky wrote:
> > On Fri, Jun 30, 2017 at 06:27:17AM +1000, Stephen Rothwell wrote:
> > > Hi Kees,
> > >
> > > On Thu, 29 Jun 2017 13:16:40 -0700 Kees Cook <keescook at chromium.org> wrote:
> > > >
> > > > [1] If the solution for this is to merge other -next trees into mine,
> > > > I guess I can do that, though it can be very messy if any of them are
> > > > forced to make their commits unstable. It also creates headaches,
> > > > AIUI, for sfr if my tree suddenly gains a bunch of other trees so it's
> > > > not clear where something came from.
> > >
> > > I don't have a problem with trees in linux-next sharing *commits* - I
> > > have problems when they share *patches* that are different commits
> > > (that affect files that get changed in other commits).
> >
> > Do we have any sane way to overcome this limitation?
> >
> > I tried to add my tree [1] to participate in linux-next. My tree
> > includes my submission queue and important patches posted to the mailing list
> > to the RDMA subsystem.
> >
> > The absence of ability to add parallel tree with same commits doesn't allow us
> > effectively test the RDMA patches.
>
> Why do you need "parallel" trees in linux-next?  What is that going to
> help with?

We are developing against two subsystems at the same time (netdev vs. RDMA) and need
to ensure that combination of them is working. Currently me (RDMA) and Saeed (netdev)
are merging out trees by ourselves [1] and instructs our verification (end-to-end and
QA) to run from that tree.

It means that we are missing a lot of stuff related to PCI, nvme, vitalization and storage
where our technology is used.

The difference in maintainers style between netdev and RDMA causes to have long queue
(100+) of patches posted to the ML [2], which are not cross-checked in various CIs.

And the situation is worse when someone posts patches which has potential to break
other vendors.

Ability to have "parallel" trees will allow us to run our (other vendors expressed the same
desire) verification on top of linux-next with all goodies of automatic regression systems
which we have as a hardware vendor.

So I would like to have "parallel" tree where I can put all my RDMA patches + important patches
from other parties and run from linux-next.

>
> > The reasons to it are combination of mostly two factors: my tree is not
> > official one [2] (all patches in my tree are not officially final) and very
> > sporadic update very close and/or during merge window [3].
>
> If it's not "official", why should it be in linux-next?

Because, official updates occur mostly twice in the cycle on -rc3 (for fixes)
and before merge window, while it is too late for us because we are preparing
our submission queues for next cycle (Linus's requirement for Mellanox's
submissions) and verification is busy with that.

Thanks

[1] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/ branches:queue-next and queue-rc
[2] https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=testing/queue-next
>
> thanks,
>
> greg k-h
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/c5e1b99b/attachment.sig>

From andrew at lunn.ch  Fri Jul 14 14:10:57 2017
From: andrew at lunn.ch (Andrew Lunn)
Date: Fri, 14 Jul 2017 16:10:57 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714102920.GY1528@mtr-leonro.local>
References: <1498754169.2834.61.camel@HansenPartnership.com>
	<CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
Message-ID: <20170714141057.GC21743@lunn.ch>

> The difference in maintainers style between netdev and RDMA causes to have long queue
> (100+) of patches posted to the ML [2], which are not cross-checked in various CIs.

It is possible to get 0-day to run against any arbitrary git tree, if
you ask nicely. If same is true for the kernel-ci project. So if you
are willing to do the merge work, you can get it tested.

    Andrew

From broonie at kernel.org  Fri Jul 14 15:05:50 2017
From: broonie at kernel.org (Mark Brown)
Date: Fri, 14 Jul 2017 16:05:50 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714141057.GC21743@lunn.ch>
References: <CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
Message-ID: <20170714150550.ubtkwmd3wcx554m6@sirena.org.uk>

On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > The difference in maintainers style between netdev and RDMA causes to have long queue
> > (100+) of patches posted to the ML [2], which are not cross-checked in various CIs.

> It is possible to get 0-day to run against any arbitrary git tree, if
> you ask nicely. If same is true for the kernel-ci project. So if you
> are willing to do the merge work, you can get it tested.

Trees can be added to kernelci, yes.  Another approach would be to work
out a workflow with the upstreams that makes this better, if they'd take
pull requests for example.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/0e5560de/attachment.sig>

From leon at kernel.org  Fri Jul 14 15:35:44 2017
From: leon at kernel.org (Leon Romanovsky)
Date: Fri, 14 Jul 2017 18:35:44 +0300
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714141057.GC21743@lunn.ch>
References: <CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
Message-ID: <20170714153544.GE1528@mtr-leonro.local>

On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > The difference in maintainers style between netdev and RDMA causes to have long queue
> > (100+) of patches posted to the ML [2], which are not cross-checked in various CIs.
>
> It is possible to get 0-day to run against any arbitrary git tree, if
> you ask nicely. If same is true for the kernel-ci project. So if you
> are willing to do the merge work, you can get it tested.

0-day is checking my tree, so it is not the problem.

I don't see how kernel-ci can help me, because RDMA requires special
hardware to run it and it usually requires more than two endpoints (servers)
connected together.

My problem is related to changes in other trees for example netdev, which
can break RDMA functionality.

Technology wise, there are:
1. RoCE - RDMA over Converged Ethernet - netdev is below RDMA
2. IPoIB - IP over Infiniband - netdev is above RDMA
3. HFI-VNIC - Ethernet over OmniPath - netdev is above RDMA
4. iWARP - RDMA over IP networks
e.t.c.

>
>     Andrew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/3492c8fb/attachment-0001.sig>

From James.Bottomley at HansenPartnership.com  Fri Jul 14 15:43:58 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Fri, 14 Jul 2017 08:43:58 -0700
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714153544.GE1528@mtr-leonro.local>
References: <CAGXu5jJ011-7JH7w4n0No=gon-jp4VTeN=0JPWaQsGh5zz79eA@mail.gmail.com>
	<1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
	<20170714153544.GE1528@mtr-leonro.local>
Message-ID: <1500047038.2853.16.camel@HansenPartnership.com>

On Fri, 2017-07-14 at 18:35 +0300, Leon Romanovsky wrote:
> On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > 
> > > 
> > > The difference in maintainers style between netdev and RDMA
> > > causes to have long queue
> > > (100+) of patches posted to the ML [2], which are not cross-
> > > checked in various CIs.
> > 
> > It is possible to get 0-day to run against any arbitrary git tree,
> > if you ask nicely. If same is true for the kernel-ci project. So if
> > you are willing to do the merge work, you can get it tested.
> 
> 0-day is checking my tree, so it is not the problem.
> 
> I don't see how kernel-ci can help me, because RDMA requires special
> hardware to run it and it usually requires more than two endpoints
> (servers) connected together.
> 
> My problem is related to changes in other trees for example netdev,
> which can break RDMA functionality.
> 
> Technology wise, there are:
> 1. RoCE - RDMA over Converged Ethernet - netdev is below RDMA
> 2. IPoIB - IP over Infiniband - netdev is above RDMA
> 3. HFI-VNIC - Ethernet over OmniPath - netdev is above RDMA
> 4. iWARP - RDMA over IP networks
> e.t.c.

So I think your goal is to get your tree and the one above you (Doug's
tree) into linux-next without causing a mismerge nightmare?

I still didn't get why you can't change workflow to share commits? If
you can do that, linux-next can be based on both your tree and the one
above it. You can do this either by you sending pull requests or by you
basing on the upstream tree and rebasing when the patches are accepted
(rebase is very good at recognizing and discarding the same patch with
a different commit id).

James
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/64178e92/attachment.sig>

From leon at kernel.org  Fri Jul 14 15:51:42 2017
From: leon at kernel.org (Leon Romanovsky)
Date: Fri, 14 Jul 2017 18:51:42 +0300
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714150550.ubtkwmd3wcx554m6@sirena.org.uk>
References: <1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
	<20170714150550.ubtkwmd3wcx554m6@sirena.org.uk>
Message-ID: <20170714155142.GF1528@mtr-leonro.local>

On Fri, Jul 14, 2017 at 04:05:50PM +0100, Mark Brown wrote:
> On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > > The difference in maintainers style between netdev and RDMA causes to have long queue
> > > (100+) of patches posted to the ML [2], which are not cross-checked in various CIs.
>
> > It is possible to get 0-day to run against any arbitrary git tree, if
> > you ask nicely. If same is true for the kernel-ci project. So if you
> > are willing to do the merge work, you can get it tested.
>
> Trees can be added to kernelci, yes.  Another approach would be to work
> out a workflow with the upstreams that makes this better, if they'd take
> pull requests for example.

Isn't the goal of this topic in maintainers summit? Improving workflows :)

So, my way to overcome my issues was to add "parallel" tree and to stop
crying about "long queues".

Thanks
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/3ce476f5/attachment.sig>

From leon at kernel.org  Fri Jul 14 16:08:41 2017
From: leon at kernel.org (Leon Romanovsky)
Date: Fri, 14 Jul 2017 19:08:41 +0300
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <1500047038.2853.16.camel@HansenPartnership.com>
References: <CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
	<20170714153544.GE1528@mtr-leonro.local>
	<1500047038.2853.16.camel@HansenPartnership.com>
Message-ID: <20170714160841.GG1528@mtr-leonro.local>

On Fri, Jul 14, 2017 at 08:43:58AM -0700, James Bottomley wrote:
> On Fri, 2017-07-14 at 18:35 +0300, Leon Romanovsky wrote:
> > On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > >
> > > >
> > > > The difference in maintainers style between netdev and RDMA
> > > > causes to have long queue
> > > > (100+) of patches posted to the ML [2], which are not cross-
> > > > checked in various CIs.
> > >
> > > It is possible to get 0-day to run against any arbitrary git tree,
> > > if you ask nicely. If same is true for the kernel-ci project. So if
> > > you are willing to do the merge work, you can get it tested.
> >
> > 0-day is checking my tree, so it is not the problem.
> >
> > I don't see how kernel-ci can help me, because RDMA requires special
> > hardware to run it and it usually requires more than two endpoints
> > (servers) connected together.
> >
> > My problem is related to changes in other trees for example netdev,
> > which can break RDMA functionality.
> >
> > Technology wise, there are:
> > 1. RoCE - RDMA over Converged Ethernet - netdev is below RDMA
> > 2. IPoIB - IP over Infiniband - netdev is above RDMA
> > 3. HFI-VNIC - Ethernet over OmniPath - netdev is above RDMA
> > 4. iWARP - RDMA over IP networks
> > e.t.c.
>
> So I think your goal is to get your tree and the one above you (Doug's
> tree) into linux-next without causing a mismerge nightmare?

Yeah, exactly, I acknowledge Doug's work and just want to be sure that
all other tress are not breaking our technology and want to see it as soon as
possible.

In regards, of my submissions, I'm pretty confident with it. The patches are
backed by verification teams and don't got public without approval.

>
> I still didn't get why you can't change workflow to share commits? If
> you can do that, linux-next can be based on both your tree and the one
> above it. You can do this either by you sending pull requests or by you
> basing on the upstream tree and rebasing when the patches are accepted
> (rebase is very good at recognizing and discarding the same patch with
> a different commit id).

1. I would like to send pull requests, but It doesn't depend on me to honor
or not pull request.
2. In my early days, I tried to base on upstream and rebase, but it caused
to emails from Stephen [2], maybe I need to try it again.

[1] http://www.mail-archive.com/linux-kernel at vger.kernel.org/msg1302627.html

>
> James


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/31b07516/attachment.sig>

From andrew at lunn.ch  Fri Jul 14 16:18:24 2017
From: andrew at lunn.ch (Andrew Lunn)
Date: Fri, 14 Jul 2017 18:18:24 +0200
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714153544.GE1528@mtr-leonro.local>
References: <1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
	<20170714153544.GE1528@mtr-leonro.local>
Message-ID: <20170714161824.GJ21743@lunn.ch>

On Fri, Jul 14, 2017 at 06:35:44PM +0300, Leon Romanovsky wrote:
> On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > > The difference in maintainers style between netdev and RDMA causes to have long queue
> > > (100+) of patches posted to the ML [2], which are not cross-checked in various CIs.
> >
> > It is possible to get 0-day to run against any arbitrary git tree, if
> > you ask nicely. If same is true for the kernel-ci project. So if you
> > are willing to do the merge work, you can get it tested.
> 
> 0-day is checking my tree, so it is not the problem.
> 
> I don't see how kernel-ci can help me, because RDMA requires special
> hardware to run it and it usually requires more than two endpoints (servers)
> connected together.

kernel-ci are happy to receive hardware. I've sent them boards in the
past which have been added to their test farm. Kernel-ci is mostly
about boot testing, but they do do some tests post boot. So if you can
supply tests as well, they may run them for you.

> My problem is related to changes in other trees for example netdev, which
> can break RDMA functionality.
> 
> Technology wise, there are:
> 1. RoCE - RDMA over Converged Ethernet - netdev is below RDMA
> 2. IPoIB - IP over Infiniband - netdev is above RDMA
> 3. HFI-VNIC - Ethernet over OmniPath - netdev is above RDMA
> 4. iWARP - RDMA over IP networks

How much of this do you already have automated test for?  You can also
setup your own test farm, using the kernels kernel-ci builds.

      Andrew

From broonie at sirena.org.uk  Fri Jul 14 16:20:25 2017
From: broonie at sirena.org.uk (Mark Brown)
Date: Fri, 14 Jul 2017 17:20:25 +0100
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714155142.GF1528@mtr-leonro.local>
References: <CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
	<20170714150550.ubtkwmd3wcx554m6@sirena.org.uk>
	<20170714155142.GF1528@mtr-leonro.local>
Message-ID: <20170714162025.liz3hmpedz2rfquq@sirena.org.uk>

On Fri, Jul 14, 2017 at 06:51:42PM +0300, Leon Romanovsky wrote:
> On Fri, Jul 14, 2017 at 04:05:50PM +0100, Mark Brown wrote:

> > Trees can be added to kernelci, yes.  Another approach would be to work
> > out a workflow with the upstreams that makes this better, if they'd take
> > pull requests for example.

> Isn't the goal of this topic in maintainers summit? Improving workflows :)

It's also the goal here!

> So, my way to overcome my issues was to add "parallel" tree and to stop
> crying about "long queues".

So I guess what everyone is suggesting here is changing this from being
a parallel tree to a tree that's part of the normal workflow for this
code.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170714/e998e755/attachment.sig>

From Bart.VanAssche at wdc.com  Fri Jul 14 16:28:04 2017
From: Bart.VanAssche at wdc.com (Bart Van Assche)
Date: Fri, 14 Jul 2017 16:28:04 +0000
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] Developing across
 multiple areas of the kernel
In-Reply-To: <20170714161824.GJ21743@lunn.ch>
References: <1498758126.2834.70.camel@HansenPartnership.com>
	<CAGXu5jLL8_h3AYrpCm4LKV8-rTPf793gAJckS0rQag6iGqk9xw@mail.gmail.com>
	<20170629182044.GP21846@wotan.suse.de>
	<CAGXu5jL21gKAO1DxNHH-4x6H9jw1t84Z=Z8iCSDRmgcxh4TFrQ@mail.gmail.com>
	<20170630062717.534b06e9@canb.auug.org.au>
	<20170714040447.GT1528@mtr-leonro.local>
	<20170714095409.GF2269@kroah.com>
	<20170714102920.GY1528@mtr-leonro.local>
	<20170714141057.GC21743@lunn.ch>
	<20170714153544.GE1528@mtr-leonro.local>
	<20170714161824.GJ21743@lunn.ch>
Message-ID: <1500049683.2662.6.camel@wdc.com>

On Fri, 2017-07-14 at 18:18 +0200, Andrew Lunn wrote:
> On Fri, Jul 14, 2017 at 06:35:44PM +0300, Leon Romanovsky wrote:
> > On Fri, Jul 14, 2017 at 04:10:57PM +0200, Andrew Lunn wrote:
> > > > The difference in maintainers style between netdev and RDMA causes to have long queue
> > > > (100+) of patches posted to the ML [2], which are not cross-checked in various CIs.
> > > 
> > > It is possible to get 0-day to run against any arbitrary git tree, if
> > > you ask nicely. If same is true for the kernel-ci project. So if you
> > > are willing to do the merge work, you can get it tested.
> > 
> > 0-day is checking my tree, so it is not the problem.
> > 
> > I don't see how kernel-ci can help me, because RDMA requires special
> > hardware to run it and it usually requires more than two endpoints (servers)
> > connected together.
> 
> kernel-ci are happy to receive hardware. I've sent them boards in the
> past which have been added to their test farm. Kernel-ci is mostly
> about boot testing, but they do do some tests post boot. So if you can
> supply tests as well, they may run them for you.
> 
> > My problem is related to changes in other trees for example netdev, which
> > can break RDMA functionality.
> > 
> > Technology wise, there are:
> > 1. RoCE - RDMA over Converged Ethernet - netdev is below RDMA
> > 2. IPoIB - IP over Infiniband - netdev is above RDMA
> > 3. HFI-VNIC - Ethernet over OmniPath - netdev is above RDMA
> > 4. iWARP - RDMA over IP networks
> 
> How much of this do you already have automated test for?  You can also
> setup your own test farm, using the kernels kernel-ci builds.

Hello Andrew,

The srp-test software is fully automated. It requires IB hardware today but does
not require a second server because it uses IB loopback. As soon as I have the time
I will add RoCE support to the upstream SRP initiator and target drivers such that
these tests can be run on top of Ethernet hardware. Please let me know if you would
like to start using this software and if you need help. See also
https://github.com/bvanassche/srp-test.

Bart.

From sergey.senozhatsky.work at gmail.com  Wed Jul 19 06:24:01 2017
From: sergey.senozhatsky.work at gmail.com (Sergey Senozhatsky)
Date: Wed, 19 Jul 2017 15:24:01 +0900
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <20170619052146.GA2889@jagdpanzerIV.localdomain>
References: <20170619052146.GA2889@jagdpanzerIV.localdomain>
Message-ID: <20170719062401.GA12064@jagdpanzerIV.localdomain>

On (06/19/17 14:21), Sergey Senozhatsky wrote:
> Hello,
> 
> 	I, Petr Mladek and Steven Rostedt would like to propose a printk
> tech topic (as suggested by Steven). We are currently exploring the idea
> of complete redesign and rework of printk and it would be extremely helpful
> to hear from the community. printk serves different purposes, and some of
> requirements of printk tend to contradict each other; printk is monolithic
> and quite heavy, no wonder, it causes problems sometimes.

I made a trivial printk TODO list. The list is incomplete and mostly
was created for personal use: thus it's probably a bit hard to read,
but at the same time it contains some quotes/opinions/ideas copy-pastes
and web-links. May be can be of some use. This also looks like our
possible (some approximation) agenda [if the topic will be accepted].

	-ss

From sergey.senozhatsky.work at gmail.com  Wed Jul 19 06:25:38 2017
From: sergey.senozhatsky.work at gmail.com (Sergey Senozhatsky)
Date: Wed, 19 Jul 2017 15:25:38 +0900
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <20170719062401.GA12064@jagdpanzerIV.localdomain>
References: <20170619052146.GA2889@jagdpanzerIV.localdomain>
	<20170719062401.GA12064@jagdpanzerIV.localdomain>
Message-ID: <20170719062538.GB12064@jagdpanzerIV.localdomain>

On (07/19/17 15:24), Sergey Senozhatsky wrote:
> On (06/19/17 14:21), Sergey Senozhatsky wrote:
> > Hello,
> > 
> > 	I, Petr Mladek and Steven Rostedt would like to propose a printk
> > tech topic (as suggested by Steven). We are currently exploring the idea
> > of complete redesign and rework of printk and it would be extremely helpful
> > to hear from the community. printk serves different purposes, and some of
> > requirements of printk tend to contradict each other; printk is monolithic
> > and quite heavy, no wonder, it causes problems sometimes.
> 
> I made a trivial printk TODO list. The list is incomplete and mostly
> was created for personal use: thus it's probably a bit hard to read,
> but at the same time it contains some quotes/opinions/ideas copy-pastes
> and web-links. May be can be of some use. This also looks like our
> possible (some approximation) agenda [if the topic will be accepted].
> 

d'oh...  the link...

https://github.com/sergey-senozhatsky/printk-todo

	-ss

From daniel.vetter at ffwll.ch  Wed Jul 19 07:26:23 2017
From: daniel.vetter at ffwll.ch (Daniel Vetter)
Date: Wed, 19 Jul 2017 09:26:23 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <20170719062538.GB12064@jagdpanzerIV.localdomain>
References: <20170619052146.GA2889@jagdpanzerIV.localdomain>
	<20170719062401.GA12064@jagdpanzerIV.localdomain>
	<20170719062538.GB12064@jagdpanzerIV.localdomain>
Message-ID: <CAKMK7uGWZHAWF_kGDvsWghE-1Fwo1PDFaNiP6nnhz3M2g17x8Q@mail.gmail.com>

On Wed, Jul 19, 2017 at 8:25 AM, Sergey Senozhatsky
<sergey.senozhatsky.work at gmail.com> wrote:
> On (07/19/17 15:24), Sergey Senozhatsky wrote:
>> On (06/19/17 14:21), Sergey Senozhatsky wrote:
>> > Hello,
>> >
>> >     I, Petr Mladek and Steven Rostedt would like to propose a printk
>> > tech topic (as suggested by Steven). We are currently exploring the idea
>> > of complete redesign and rework of printk and it would be extremely helpful
>> > to hear from the community. printk serves different purposes, and some of
>> > requirements of printk tend to contradict each other; printk is monolithic
>> > and quite heavy, no wonder, it causes problems sometimes.
>>
>> I made a trivial printk TODO list. The list is incomplete and mostly
>> was created for personal use: thus it's probably a bit hard to read,
>> but at the same time it contains some quotes/opinions/ideas copy-pastes
>> and web-links. May be can be of some use. This also looks like our
>> possible (some approximation) agenda [if the topic will be accepted].
>>
>
> d'oh...  the link...
>
> https://github.com/sergey-senozhatsky/printk-todo

lgtm, two quick notes:
- my mail with the fbdev discussion seems to be in the wrong chapter.
Move it from "console_sem" to "fbdev, tty, drm, etc .."?

- feature request for per-console output: Per-console flag to always
use a kthread/offloading, even when oops/panic is happening. kms
definitely wants that. Please note that in that section. I can help
with implementing, once we get there.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

From dwmw2 at infradead.org  Wed Jul 19 07:35:20 2017
From: dwmw2 at infradead.org (David Woodhouse)
Date: Wed, 19 Jul 2017 09:35:20 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <20170620172738.zh4maxtfmlwhyrnt@sirena.org.uk>
References: <20170619052146.GA2889@jagdpanzerIV.localdomain>
	<ef18231f-c69b-5d88-0410-485cfcf4143b@suse.com>
	<20170619103912.2edbf88a@gandalf.local.home>
	<20170619152055.GM3786@lunn.ch>
	<01a7d603-c0a2-7aae-8c8d-587063da5e61@suse.com>
	<20170619162317.4nxx6jsvuzvdtasz@sirena.org.uk>
	<20170620155825.GC409@tigerII.localdomain>
	<3908561D78D1C84285E8C5FCA982C28F612DAC67@ORSMSX114.amr.corp.intel.com>
	<20170620171134.GA444@tigerII.localdomain>
	<20170620172738.zh4maxtfmlwhyrnt@sirena.org.uk>
Message-ID: <1500449720.19151.7.camel@infradead.org>

On Tue, 2017-06-20 at 18:27 +0100, Mark Brown wrote:
> On Wed, Jun 21, 2017 at 02:11:34AM +0900, Sergey Senozhatsky wrote:
> 
> > 
> > another thing that I found useful is a CPU number of the processor
> > that stored a particular line to the logbuf.
>
> At some point we start reinventing ftrace...??there's issues with
> joining the two up but there should at least be lessons we can learn.
> 

The other way of looking at this is "why are you abusing printk for
stuff that should have been done via ftrace or other means instead".

I confess I haven't got my curmudgeonly brain out of that mode at all,
ever since realising that printk had been made asynchronous and
unreliable (how long ago was that?) and that you could no longer see
the dying gasp of a crashing kernel on its serial console.

Rather than morphing printk into something more capable of bulk
transport, I'd rather see it go back to its roots of
debugging/diagnostics.

The original complaint of "all this printk output makes things too
slow" was better addressed by printing less or at lower severity (or
adjusting the console loglevel), IMO.

As things stand, the requirements for the various printk (ab)use cases
seem to be contradictory ? if we're going to have a redesign then I
think it would be good to take a holistic view and decide what it's
actually *supposed* to be used for. And, perhaps more to the point,
what it isn't supposed to be used for.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4938 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170719/fdfbb667/attachment.bin>

From dwmw2 at infradead.org  Wed Jul 19 07:59:31 2017
From: dwmw2 at infradead.org (David Woodhouse)
Date: Wed, 19 Jul 2017 09:59:31 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <alpine.LSU.2.20.1706261040560.30709@cbobk.fhfr.pm>
References: <ef18231f-c69b-5d88-0410-485cfcf4143b@suse.com>
	<20170619103912.2edbf88a@gandalf.local.home>
	<20170619152055.GM3786@lunn.ch>
	<20170619122651.57ba27c4@gandalf.local.home>
	<20170624081411.58b4fb6a@vento.lan> <20170624140659.GM4875@lunn.ch>
	<20170624184216.2ffd4a96@gandalf.local.home>
	<20170624232140.GA27473@lunn.ch>
	<CA+55aFzQyfVnj2mhsZGsOVP7FEc4g7vguGCtkyGy1cMONPeitw@mail.gmail.com>
	<20170624234805.GT10672@ZenIV.linux.org.uk>
	<20170625012913.GC27473@lunn.ch>
	<CA+55aFzisqGwwG7nNabu_X7=CR45HBrrV4XhgSEw9RkED-A9yg@mail.gmail.com>
	<alpine.LSU.2.20.1706261040560.30709@cbobk.fhfr.pm>
Message-ID: <1500451171.19151.13.camel@infradead.org>

On Mon, 2017-06-26 at 10:46 +0200, Jiri Kosina wrote:
> On Sat, 24 Jun 2017, Linus Torvalds wrote:
> 
> > 
> > > 
> > > It is how the embedded world operates, RS232, or now more often, RS232?
> > > with a built in USB-RS232 converter, so you use USB on the host.
> > I'm not saying that serial lines shouldn't be an option.
> > 
> > But for a *large* user base, they simply aren't.
> > 
> > On regular PC's, it's often not an option any more. Even in the data
> > center, it's often not an option any more.
> I don't really agree here. Yes, the mid-to-hig-end servers don't probably?
> contain the actual UART chip any more, but the vast majority of those have?
> somehting that's emulated in firmware, and actually do have a serial?
> console line connector (not the 9-pin one, but rather RJ-45 with either?
> Cisco or Yost pinout), which is then connected into serial-over-TCP?
> concentrator box, exposing the serial console over telnet (or some?
> proprietary client application). This is seen in DCs quite frequently.
> 
> Even machines that have very good IPMI support still ship with this.

Yeah, we definitely still have a "serial console" in the data centre,
even if it's not actually RS232 any more. Or indeed "serial".

You want to catch those failures where even kdump doesn't manage to
give you a viable report of the original crash? You'd better be
watching...

Even on regular PCs we have the USB debug ports which can serve the
same purpose.

But still, we're talking about printk being used for its original
$DEITY-intended purpose for debugging and diagnostic data. Not for the
random "hey, here's a channel I can abuse to send data up to userspace"
stuff. I heartily agree with Steven when he says that "printk is used
too freely".
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4938 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/ksummit-discuss/attachments/20170719/ba524e58/attachment-0001.bin>

From rostedt at goodmis.org  Wed Jul 19 13:02:39 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Wed, 19 Jul 2017 09:02:39 -0400
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
References: <10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
	<20170712082139.17cfd33a@xeon-e3>
	<CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
Message-ID: <20170719090239.39f031c5@gandalf.local.home>

On Wed, 12 Jul 2017 09:19:55 -0700
Linus Torvalds <torvalds at linux-foundation.org> wrote:

> On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
> <stephen at networkplumber.org> wrote:
> >
> > Netlink has recently got extended error reporting, still not used widely
> > and library support is lacking in most places.  
> 
> Yeah, and that "not widely supported and library support is lacking"
> is always going to be an issue with anything like that.
> 
> Along with internationalization, which is a whole nasty set of issues
> in itself with error messages.
> 
> It's not going to happen, in other words. The problems are basically
> insurmountable, and the thing it fixes will always be some special
> case that doesn't much matter.
> 
> Every time it comes up it is because some developer found one case
> that they were hunting down and it annoyed them, and the developer
> went "if only it had included more information and it would have been
> obvious".
> 
> But every time it comes up people ignore this basic issue:
> 
>      [torvalds at i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>      182523
> 

Note a lot of those -E* are not going to user space. Some are in
comments, and some are used internally. I use them to pass back
information to other kernel only routines, as some errors are more
critical than others.

> 
> Give it up. It's really is a horrible idea for so many reasons.
> 

One reason that this has never taken off is that there is no good
infrastructure in doing it. I wouldn't tell people to give it up, but
I don't see a one size fits all. In tracing, we have ways to pass
detailed errors back to user space. But that's probably one of the
easier cases as we have defined methods to do so.

A more generic approach would require a lot more planning, and making
it simple to use both in user space and in the kernel. If it is too
complex in either place, it will be ignored.

-- Steve

From sergey.senozhatsky.work at gmail.com  Thu Jul 20 05:19:08 2017
From: sergey.senozhatsky.work at gmail.com (Sergey Senozhatsky)
Date: Thu, 20 Jul 2017 14:19:08 +0900
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <CAKMK7uGWZHAWF_kGDvsWghE-1Fwo1PDFaNiP6nnhz3M2g17x8Q@mail.gmail.com>
References: <20170619052146.GA2889@jagdpanzerIV.localdomain>
	<20170719062401.GA12064@jagdpanzerIV.localdomain>
	<20170719062538.GB12064@jagdpanzerIV.localdomain>
	<CAKMK7uGWZHAWF_kGDvsWghE-1Fwo1PDFaNiP6nnhz3M2g17x8Q@mail.gmail.com>
Message-ID: <20170720051908.GB7483@jagdpanzerIV.localdomain>

On (07/19/17 09:26), Daniel Vetter wrote:
[..]
> > d'oh...  the link...
> >
> > https://github.com/sergey-senozhatsky/printk-todo
> 
> lgtm, two quick notes:
> - my mail with the fbdev discussion seems to be in the wrong chapter.
> Move it from "console_sem" to "fbdev, tty, drm, etc .."?

thanks for taking a look!

and sorry for not being very responsive these weeks, still struggling
to recover from my sickness.

the list is incomplete and very spontaneous, I'll improve it.

> - feature request for per-console output: Per-console flag to always
> use a kthread/offloading, even when oops/panic is happening. kms
> definitely wants that. Please note that in that section. I can help
> with implementing, once we get there.

thanks. will add.

> Per-console flag to always use a kthread/offloading, even when oops/panic
> is happening. kms definitely wants that.
>

hmm... kthread offloading during panic() is really risky. nothing
guarantees that we will be able to call into the scheduler and wake
up that console printing-kthread, or that we will be able to schedule
at all. we may be in panic() from NMI handler, with the rest of CPUs
stopped. it's quite a risky thing to do. that's why we disable printk
offloading when in panic() - we don't want to make the things any
worse.

before doing this I think I want to make call_console_drivers() to be
more reliable. right now we pick the first unseen messages from the
logbuf and iterate over registered consoles calling ->write() on every
driver from the console drivers list. if one of consoles is misbehaving,
then the entire console output mechanism stops: we don't print anything
on other consoles until current con->write() returns. so probably I want
to make it more independent.

	-ss

From sergey.senozhatsky.work at gmail.com  Thu Jul 20 07:53:47 2017
From: sergey.senozhatsky.work at gmail.com (Sergey Senozhatsky)
Date: Thu, 20 Jul 2017 16:53:47 +0900
Subject: [Ksummit-discuss] [TECH TOPIC] printk redesign
In-Reply-To: <1500449720.19151.7.camel@infradead.org>
References: <ef18231f-c69b-5d88-0410-485cfcf4143b@suse.com>
	<20170619103912.2edbf88a@gandalf.local.home>
	<20170619152055.GM3786@lunn.ch>
	<01a7d603-c0a2-7aae-8c8d-587063da5e61@suse.com>
	<20170619162317.4nxx6jsvuzvdtasz@sirena.org.uk>
	<20170620155825.GC409@tigerII.localdomain>
	<3908561D78D1C84285E8C5FCA982C28F612DAC67@ORSMSX114.amr.corp.intel.com>
	<20170620171134.GA444@tigerII.localdomain>
	<20170620172738.zh4maxtfmlwhyrnt@sirena.org.uk>
	<1500449720.19151.7.camel@infradead.org>
Message-ID: <20170720075347.GA356@jagdpanzerIV.localdomain>

Hello,

On (07/19/17 09:35), David Woodhouse wrote:
[..]
> > At some point we start reinventing ftrace...??there's issues with
> > joining the two up but there should at least be lessons we can learn.
> > 
> 
> The other way of looking at this is "why are you abusing printk for
> stuff that should have been done via ftrace or other means instead".
>
> I confess I haven't got my curmudgeonly brain out of that mode at all,
> ever since realising that printk had been made asynchronous and
> unreliable (how long ago was that?) and that you could no longer see
> the dying gasp of a crashing kernel on its serial console.
> 
> Rather than morphing printk into something more capable of bulk
> transport, I'd rather see it go back to its roots of
> debugging/diagnostics.
> 
> The original complaint of "all this printk output makes things too
> slow" was better addressed by printing less or at lower severity (or
> adjusting the console loglevel), IMO.
> 
> As things stand, the requirements for the various printk (ab)use cases
> seem to be contradictory ? if we're going to have a redesign then I
> think it would be good to take a holistic view and decide what it's
> actually *supposed* to be used for. And, perhaps more to the point,
> what it isn't supposed to be used for.

just some thoughts,

at glance printk has 3 major issues

- it has to do offloading, no doubt.

- printk() can deadlock, easily. (that's the whole reason there is
  printk_deferred())

- printk from NMI is not completely reliable. this area has been
  improved recently; but there are still cases when we can lose
  NMI-printk messages


... but there are more problems. and those issues are not completely
printk fault.

what I mean (and I'm not criticizing anyone),

so we can split printk: defer printing of debug messages and have
direct printing of important messages. and that's where the redesign
hits the first obstacle: direct printing is unreliable. when we do
call_console_drivers() we pass control to the outside world, and we
never know where we will end up at. consoles can invoke timekeeping,
networking, MM, and so on. so I think printk redesign better start
from this part - make call to console drivers more reliable.
if possible.


what I'm talking about, by just one example:

bug report
https://marc.info/?l=dri-devel&m=149938825811219

root cause
https://marc.info/?l=linux-mm&m=149939515214223&w=2


so printk live-locked, and there was no way to see any kernel logs
until Tetsuo sysrq-c'ed the system. and the root cause was all those
complex and difficult dependencies between completely different
subsystems that printk depend on and that, in turn, depend on printk.


> hm, this allocation, per se, looks ok to me. can't really blame it.
> what you had is a combination of factors
>
>        CPU0                    CPU1                            CPU2
>                                                                console_callback()
>                                                                 console_lock()
>                                                                 ^^^^^^^^^^^^^
>        vprintk_emit()          mutex_lock(&par->bo_mutex)
>                                 kzalloc(GFP_KERNEL)
>         console_trylock()        kmem_cache_alloc()              mutex_lock(&par->bo_mutex)
>         ^^^^^^^^^^^^^^^^          io_schedule_timeout


there are more examples.

more closer to the point,
to the best of my knowledge, we don't have that much problems with the
printk logbuf now. we made some progress there over the last year. yes,
NMI printk is not completely awesome.
where we do have problems, I think:

a) we probably need to make more progress towards "and now we print it to the console"
b) print out offloading
c) printk deadlock and the need of printk_deferred()


and it's not always crazy printk abuse to justify the existence of
printk offloading.

example: https://marc.info/?l=linux-mm&m=149977866327662

> you will find that calling cond_resched() (from console_unlock() from printk())
> can cause a delay of nearly one minute, and it can cause a delay of nearly 5 minutes
> to complete one out_of_memory() call. 

example: https://marc.info/?l=linux-kernel&m=149509270422321


printk, to me, is a debugging/diagnostics tool. and we can't fully rely on
it, even we do reasonable things, like OOM print out. moreover, I think, to
some extent, due to printk imperfections, the more debugging options we
enable (CONFIG_DEBUG_PREEMPT, CONFIG_DEBUG_SPINLOCK, etc.) the less stable
the kernel, potentially, gets. because those options use printk() to report
the problems. so might_sleep() or spin_dump() called from "a wrong place"
can eventually deadlock printk() and the system.

example: https://marc.info/?l=linux-kernel&m=149007148320611


well, just my thoughts.

	-ss

From dhowells at redhat.com  Fri Jul 21 13:41:39 2017
From: dhowells at redhat.com (David Howells)
Date: Fri, 21 Jul 2017 14:41:39 +0100
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
	error info back to userspace
In-Reply-To: <CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
References: <CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
	<10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
	<20170712082139.17cfd33a@xeon-e3>
Message-ID: <7884.1500644499@warthog.procyon.org.uk>

Linus Torvalds <torvalds at linux-foundation.org> wrote:

> But every time it comes up people ignore this basic issue:
> 
>      [torvalds at i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>      182523
> 
> 
> Give it up. It's really is a horrible idea for so many reasons.

Are you okay with me making it possible to retrieve mount errors, warnings and
informational messages through fd-arbitrated-mount I'm working on?  For
example (and skipping some of the parameters for brevity):

	int fs_fd;

	static inline void e(int x)
	{
		char buf[1024];
		int i;
		if (x == -1)
			fprintf(stderr, "Mount error: %m\n");
		/* Read back any messages */
		while (i = read(fs_fd, buf), i != -1) {
			buf[i] = 0;
			fprintf(stderr, "%s\n", buf);
		}
		if (x == -1)
			exit(1);
	}

	fs_fd = fsopen("ext4");
	e(write(fs_fd, "d /dev/sda3"));
	e(write(fs_fd, "o user_xattr"));
	e(write(fs_fd, "o acl"));
	e(write(fs_fd, "o data=ordered"));
	e(write(fs_fd, "x create"));
	e(fsmount(fs_fd, AT_FDCWD, "/mnt", MS_NODEV));
	close(fs_fd);

David

From mathieu.desnoyers at efficios.com  Fri Jul 21 21:45:57 2017
From: mathieu.desnoyers at efficios.com (Mathieu Desnoyers)
Date: Fri, 21 Jul 2017 21:45:57 +0000 (UTC)
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <20170706151008.24addd2b@gandalf.local.home>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<6AE378F0-42F7-45DE-9F3C-050A5019A1E8@fb.com>
	<20170630142956.7e0cb2d6@gandalf.local.home>
	<20170630143030.305b68a0@gandalf.local.home>
	<658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
	<20170706151008.24addd2b@gandalf.local.home>
Message-ID: <1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>


----- On Jul 6, 2017, at 3:10 PM, rostedt rostedt at goodmis.org wrote:

> On Fri, 30 Jun 2017 18:37:59 +0000
> Josef Bacik <jbacik at fb.com> wrote:
> 
>> [ I forgot to add Tom to the Cc list. Sending again. ]
>> 
>> On Fri, 30 Jun 2017 14:29:56 -0400
>> Steven Rostedt <rostedt at goodmis.org> wrote:
>> 
>> > On Fri, 30 Jun 2017 18:24:12 +0000
>> > Josef Bacik <jbacik at fb.com> wrote:
>> >   
>> > > Yup I?ll start bugging people to submit talk proposals, starting with you!  I?ll
>> > > put up my proposal in the next day or two, I think Brendan has something he?s
>> > > going to talk about.  Thanks,
>> > 
>> > I shouldn't have used the term "talk", as it really is all about
>> > discussions. In fact, if you need more than one slide, you have too
>> > many.
>> > 
>> > That said, I could probably come up with a few things, starting with
>> > this trace event issue. But it will be pointless if Peter Zijlstra and
>> > Mathieu are not there.
>> > 
>> > But having ideas about dynamic fields in tracepoints is always
>> > interesting. Not to mention talking about Tom Zanussi's latest
>> > histogram work. It may be pretty much completed, but I would like to
>> > discuss where we go from there.
>> > 
>> > One last thing. I don't want to have too many responsibilities, as I'm
>> > on the LPC program committee and I need to make sure I have time to
>> > fulfill any action items I'm responsible for during the conference.
>> >   
>> 
>> Yeah plumbers is a weird venue for tracing, I always hope that we are
>> going to have people like Brendan or other sysadmin-y people show up
>> and say ?this is what sucks about tracing, please fix it?, and then
>> we can go fix it.  It doesn?t really seem to happen that way tho, and
>> for things like tracing ABI there just aren?t the right people in the
>> room to have that kind of discussion.  My proposal was just going to
>> be a laundry list of things that would make my life easier, but it
>> doesn?t really warrant a full micro-conference to listen to me bitch
>> for an hour.  If it turns out nobody else has much to talk about then
>> we can just declare tracing is feature complete and we can talk about
>> something else ;).  Thanks,
>> 
> 
> At this rate, I'm guessing that Tracing is not going to be on the
> Plumbers' agenda.

Since the Kernel Summit and Plumbers do not seem like a good fit to have
discussions involving both tracing end users and developers, we have
adapted the Tracing Summit schedule to have half day of the usual
presentations, and half day dedicated to such discussions. Steven has
volunteered to run the discussion part.

The Tracing Summit will take place on October 27th in Prague, on the
Friday right after Kernel Summit.

So if you have tracing topics that you would like to discuss at this
event, the CFP/CFD and all the information are available here:

http://tracingsummit.org/wiki/TracingSummit2017

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

From James.Bottomley at HansenPartnership.com  Fri Jul 21 23:15:14 2017
From: James.Bottomley at HansenPartnership.com (James Bottomley)
Date: Fri, 21 Jul 2017 16:15:14 -0700
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<6AE378F0-42F7-45DE-9F3C-050A5019A1E8@fb.com>
	<20170630142956.7e0cb2d6@gandalf.local.home>
	<20170630143030.305b68a0@gandalf.local.home>
	<658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
	<20170706151008.24addd2b@gandalf.local.home>
	<1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>
Message-ID: <1500678914.2900.77.camel@HansenPartnership.com>

On Fri, 2017-07-21 at 21:45 +0000, Mathieu Desnoyers wrote:
> 
> ----- On Jul 6, 2017, at 3:10 PM, rostedt rostedt at goodmis.org wrote:
> 
> > 
> > On Fri, 30 Jun 2017 18:37:59 +0000
> > Josef Bacik <jbacik at fb.com> wrote:
> > 
> > > 
> > > [ I forgot to add Tom to the Cc list. Sending again. ]
> > > 
> > > On Fri, 30 Jun 2017 14:29:56 -0400
> > > Steven Rostedt <rostedt at goodmis.org> wrote:
> > > 
> > > > 
> > > > On Fri, 30 Jun 2017 18:24:12 +0000
> > > > Josef Bacik <jbacik at fb.com> wrote:
> > > > ??
> > > > > 
> > > > > Yup I?ll start bugging people to submit talk proposals,
> > > > > starting with you!??I?ll put up my proposal in the next day
> > > > > or two, I think Brendan has something he?s going to talk
> > > > > about.??Thanks,
> > > > 
> > > > I shouldn't have used the term "talk", as it really is all
> > > > about discussions. In fact, if you need more than one slide,
> > > > you have too many.
> > > > 
> > > > That said, I could probably come up with a few things, starting
> > > > with this trace event issue. But it will be pointless if Peter
> > > > Zijlstra and Mathieu are not there.
> > > > 
> > > > But having ideas about dynamic fields in tracepoints is always
> > > > interesting. Not to mention talking about Tom Zanussi's latest
> > > > histogram work. It may be pretty much completed, but I would
> > > > like to discuss where we go from there.
> > > > 
> > > > One last thing. I don't want to have too many responsibilities,
> > > > as I'm on the LPC program committee and I need to make sure I
> > > > have time to fulfill any action items I'm responsible for
> > > > during the conference.
> > > > ??
> > > 
> > > Yeah plumbers is a weird venue for tracing, I always hope that we
> > > are going to have people like Brendan or other sysadmin-y people
> > > show up and say ?this is what sucks about tracing, please fix
> > > it?, and then we can go fix it.??It doesn?t really seem to happen
> > > that way tho, and for things like tracing ABI there just aren?t
> > > the right people in the room to have that kind of discussion.??My
> > > proposal was just going to be a laundry list of things that would
> > > make my life easier, but it doesn?t really warrant a full micro-
> > > conference to listen to me bitch for an hour.??If it turns out
> > > nobody else has much to talk about then we can just declare
> > > tracing is feature complete and we can talk about something else
> > > ;).??Thanks,
> > > 
> > 
> > At this rate, I'm guessing that Tracing is not going to be on the
> > Plumbers' agenda.
> 
> Since the Kernel Summit and Plumbers do not seem like a good fit to
> have discussions involving both tracing end users and developers

First the disclaimer: being on the Plumbers Programme Committee, I'm
biased. ?However, I have to say that the design of Plumbers is to bring
together everyone interested in the plumbing of Linux. ?That means end
users as well, so it's not correct to say it's not a good fit.

It also looks like there's been some renewed interest in having a
Tracing MC at Plumbers, so my best guess now is that it will happen.
?That's not to say the two events can't easily co-exist: being on
different continents means better opportunities for attendees with
international travel restrictions.

James


From rostedt at goodmis.org  Sun Jul 23 21:25:14 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Sun, 23 Jul 2017 17:25:14 -0400
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <40F38E70-C173-463F-99AF-099927AC63E4@fb.com>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<6AE378F0-42F7-45DE-9F3C-050A5019A1E8@fb.com>
	<20170630142956.7e0cb2d6@gandalf.local.home>
	<20170630143030.305b68a0@gandalf.local.home>
	<658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
	<20170706151008.24addd2b@gandalf.local.home>
	<1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>
	<37EF4BA7-FC04-4580-8AD8-28E4C384DA88@goodmis.org>
	<40F38E70-C173-463F-99AF-099927AC63E4@fb.com>
Message-ID: <20170723172514.564ed7c5@gandalf.local.home>

On Sun, 23 Jul 2017 16:24:09 +0000
Josef Bacik <jbacik at fb.com> wrote:

> Do we want to talk about ABI at the micro conference?  Facebook uses
> tracing everywhere in production so I can talk about it from both a
> user and maintainer standpoint.  Thanks,

Yes, please add that to the wiki.

-- Steve


From rostedt at goodmis.org  Sat Jul 22 02:18:25 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Fri, 21 Jul 2017 22:18:25 -0400
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
	ABI quicksands
In-Reply-To: <1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<6AE378F0-42F7-45DE-9F3C-050A5019A1E8@fb.com>
	<20170630142956.7e0cb2d6@gandalf.local.home>
	<20170630143030.305b68a0@gandalf.local.home>
	<658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
	<20170706151008.24addd2b@gandalf.local.home>
	<1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>
Message-ID: <37EF4BA7-FC04-4580-8AD8-28E4C384DA88@goodmis.org>

Actually, Brendan Gregg got enough proposals together and there will be a tracing MC at Plumbers this year.

-- Steve


On July 21, 2017 5:45:57 PM EDT, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>
>
>----- On Jul 6, 2017, at 3:10 PM, rostedt rostedt at goodmis.org wrote:
>
>> On Fri, 30 Jun 2017 18:37:59 +0000
>> Josef Bacik <jbacik at fb.com> wrote:
>> 
>>> [ I forgot to add Tom to the Cc list. Sending again. ]
>>> 
>>> On Fri, 30 Jun 2017 14:29:56 -0400
>>> Steven Rostedt <rostedt at goodmis.org> wrote:
>>> 
>>> > On Fri, 30 Jun 2017 18:24:12 +0000
>>> > Josef Bacik <jbacik at fb.com> wrote:
>>> >   
>>> > > Yup I?ll start bugging people to submit talk proposals, starting
>with you!  I?ll
>>> > > put up my proposal in the next day or two, I think Brendan has
>something he?s
>>> > > going to talk about.  Thanks,
>>> > 
>>> > I shouldn't have used the term "talk", as it really is all about
>>> > discussions. In fact, if you need more than one slide, you have
>too
>>> > many.
>>> > 
>>> > That said, I could probably come up with a few things, starting
>with
>>> > this trace event issue. But it will be pointless if Peter Zijlstra
>and
>>> > Mathieu are not there.
>>> > 
>>> > But having ideas about dynamic fields in tracepoints is always
>>> > interesting. Not to mention talking about Tom Zanussi's latest
>>> > histogram work. It may be pretty much completed, but I would like
>to
>>> > discuss where we go from there.
>>> > 
>>> > One last thing. I don't want to have too many responsibilities, as
>I'm
>>> > on the LPC program committee and I need to make sure I have time
>to
>>> > fulfill any action items I'm responsible for during the
>conference.
>>> >   
>>> 
>>> Yeah plumbers is a weird venue for tracing, I always hope that we
>are
>>> going to have people like Brendan or other sysadmin-y people show up
>>> and say ?this is what sucks about tracing, please fix it?, and then
>>> we can go fix it.  It doesn?t really seem to happen that way tho,
>and
>>> for things like tracing ABI there just aren?t the right people in
>the
>>> room to have that kind of discussion.  My proposal was just going to
>>> be a laundry list of things that would make my life easier, but it
>>> doesn?t really warrant a full micro-conference to listen to me bitch
>>> for an hour.  If it turns out nobody else has much to talk about
>then
>>> we can just declare tracing is feature complete and we can talk
>about
>>> something else ;).  Thanks,
>>> 
>> 
>> At this rate, I'm guessing that Tracing is not going to be on the
>> Plumbers' agenda.
>
>Since the Kernel Summit and Plumbers do not seem like a good fit to
>have
>discussions involving both tracing end users and developers, we have
>adapted the Tracing Summit schedule to have half day of the usual
>presentations, and half day dedicated to such discussions. Steven has
>volunteered to run the discussion part.
>
>The Tracing Summit will take place on October 27th in Prague, on the
>Friday right after Kernel Summit.
>
>So if you have tracing topics that you would like to discuss at this
>event, the CFP/CFD and all the information are available here:
>
>http://tracingsummit.org/wiki/TracingSummit2017
>
>Thanks,
>
>Mathieu

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

From jbacik at fb.com  Sun Jul 23 16:24:09 2017
From: jbacik at fb.com (Josef Bacik)
Date: Sun, 23 Jul 2017 16:24:09 +0000
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <37EF4BA7-FC04-4580-8AD8-28E4C384DA88@goodmis.org>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<6AE378F0-42F7-45DE-9F3C-050A5019A1E8@fb.com>
	<20170630142956.7e0cb2d6@gandalf.local.home>
	<20170630143030.305b68a0@gandalf.local.home>
	<658A3F80-5E48-4EC4-A591-E3783AD3DADC@fb.com>
	<20170706151008.24addd2b@gandalf.local.home>
	<1188050494.22035.1500673557010.JavaMail.zimbra@efficios.com>,
	<37EF4BA7-FC04-4580-8AD8-28E4C384DA88@goodmis.org>
Message-ID: <40F38E70-C173-463F-99AF-099927AC63E4@fb.com>

Do we want to talk about ABI at the micro conference?  Facebook uses tracing everywhere in production so I can talk about it from both a user and maintainer standpoint.  Thanks,

Josef

Sent from my iPhone

> On Jul 23, 2017, at 11:49 AM, Steven Rostedt <rostedt at goodmis.org> wrote:
> 
> Actually, Brendan Gregg got enough proposals together and there will be a tracing MC at Plumbers this year.
> 
> -- Steve
> 
> 
>> On July 21, 2017 5:45:57 PM EDT, Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:
>> 
>> 
>> ----- On Jul 6, 2017, at 3:10 PM, rostedt rostedt at goodmis.org wrote:
>> 
>>> On Fri, 30 Jun 2017 18:37:59 +0000
>>> Josef Bacik <jbacik at fb.com> wrote:
>>> 
>>>> [ I forgot to add Tom to the Cc list. Sending again. ]
>>>> 
>>>> On Fri, 30 Jun 2017 14:29:56 -0400
>>>> Steven Rostedt <rostedt at goodmis.org> wrote:
>>>> 
>>>>> On Fri, 30 Jun 2017 18:24:12 +0000
>>>>> Josef Bacik <jbacik at fb.com> wrote:
>>>>> 
>>>>>> Yup I?ll start bugging people to submit talk proposals, starting
>> with you!  I?ll
>>>>>> put up my proposal in the next day or two, I think Brendan has
>> something he?s
>>>>>> going to talk about.  Thanks,
>>>>> 
>>>>> I shouldn't have used the term "talk", as it really is all about
>>>>> discussions. In fact, if you need more than one slide, you have
>> too
>>>>> many.
>>>>> 
>>>>> That said, I could probably come up with a few things, starting
>> with
>>>>> this trace event issue. But it will be pointless if Peter Zijlstra
>> and
>>>>> Mathieu are not there.
>>>>> 
>>>>> But having ideas about dynamic fields in tracepoints is always
>>>>> interesting. Not to mention talking about Tom Zanussi's latest
>>>>> histogram work. It may be pretty much completed, but I would like
>> to
>>>>> discuss where we go from there.
>>>>> 
>>>>> One last thing. I don't want to have too many responsibilities, as
>> I'm
>>>>> on the LPC program committee and I need to make sure I have time
>> to
>>>>> fulfill any action items I'm responsible for during the
>> conference.
>>>>> 
>>>> 
>>>> Yeah plumbers is a weird venue for tracing, I always hope that we
>> are
>>>> going to have people like Brendan or other sysadmin-y people show up
>>>> and say ?this is what sucks about tracing, please fix it?, and then
>>>> we can go fix it.  It doesn?t really seem to happen that way tho,
>> and
>>>> for things like tracing ABI there just aren?t the right people in
>> the
>>>> room to have that kind of discussion.  My proposal was just going to
>>>> be a laundry list of things that would make my life easier, but it
>>>> doesn?t really warrant a full micro-conference to listen to me bitch
>>>> for an hour.  If it turns out nobody else has much to talk about
>> then
>>>> we can just declare tracing is feature complete and we can talk
>> about
>>>> something else ;).  Thanks,
>>>> 
>>> 
>>> At this rate, I'm guessing that Tracing is not going to be on the
>>> Plumbers' agenda.
>> 
>> Since the Kernel Summit and Plumbers do not seem like a good fit to
>> have
>> discussions involving both tracing end users and developers, we have
>> adapted the Tracing Summit schedule to have half day of the usual
>> presentations, and half day dedicated to such discussions. Steven has
>> volunteered to run the discussion part.
>> 
>> The Tracing Summit will take place on October 27th in Prague, on the
>> Friday right after Kernel Summit.
>> 
>> So if you have tracing topics that you would like to discuss at this
>> event, the CFP/CFD and all the information are available here:
>> 
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__tracingsummit.org_wiki_TracingSummit2017&d=DwIFaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=sDzg6MvHymKOUgI8SFIm4Q&m=i0WUwxGtMNqWf0sUfcJGu8mjy4EmALhzGj4FSSAj_10&s=uSqVNCjkvfgDy8m4bV0fRhLxXZzd2b5MOnzIs5uAugM&e= 
>> 
>> Thanks,
>> 
>> Mathieu
> 
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

From miklos at szeredi.hu  Mon Jul 24 07:55:19 2017
From: miklos at szeredi.hu (Miklos Szeredi)
Date: Mon, 24 Jul 2017 09:55:19 +0200
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
 error info back to userspace
In-Reply-To: <20170719090239.39f031c5@gandalf.local.home>
References: <10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
	<20170712082139.17cfd33a@xeon-e3>
	<CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
	<20170719090239.39f031c5@gandalf.local.home>
Message-ID: <CAJfpegsK_SvLOg9rTMSwVSZpDG4HFkPrkfaPtGrniLhtr1vUhw@mail.gmail.com>

On Wed, Jul 19, 2017 at 3:02 PM, Steven Rostedt <rostedt at goodmis.org> wrote:
> On Wed, 12 Jul 2017 09:19:55 -0700
> Linus Torvalds <torvalds at linux-foundation.org> wrote:
>
>> On Wed, Jul 12, 2017 at 8:21 AM, Stephen Hemminger
>> <stephen at networkplumber.org> wrote:
>> >
>> > Netlink has recently got extended error reporting, still not used widely
>> > and library support is lacking in most places.
>>
>> Yeah, and that "not widely supported and library support is lacking"
>> is always going to be an issue with anything like that.
>>
>> Along with internationalization, which is a whole nasty set of issues
>> in itself with error messages.
>>
>> It's not going to happen, in other words. The problems are basically
>> insurmountable, and the thing it fixes will always be some special
>> case that doesn't much matter.
>>
>> Every time it comes up it is because some developer found one case
>> that they were hunting down and it annoyed them, and the developer
>> went "if only it had included more information and it would have been
>> obvious".
>>
>> But every time it comes up people ignore this basic issue:
>>
>>      [torvalds at i7 linux]$ git grep -e '-E[A-Z]\{4\}' | wc -l
>>      182523
>>
>
> Note a lot of those -E* are not going to user space. Some are in
> comments, and some are used internally. I use them to pass back
> information to other kernel only routines, as some errors are more
> critical than others.

a) it wouldn't have to be for every error

b) kernel prints detailed error in dmesg anyway, why not allow that
info to be bound to the syscall that triggered the error?

c) internationalization can be solved at the level where it matters
(NOT in the kernel)

My suggestion was to keep the kernel interface really simple, e.g.:

   return detailed_error(-EINVAL, "failure to do foo because of bar");

What are the insurmountable issues you are talking about?

Thanks,
Miklos

From dhowells at redhat.com  Mon Jul 24 08:25:17 2017
From: dhowells at redhat.com (David Howells)
Date: Mon, 24 Jul 2017 09:25:17 +0100
Subject: [Ksummit-discuss] [TECH TOPIC] Getting better/supplementary
	error info back to userspace
In-Reply-To: <CAJfpegsK_SvLOg9rTMSwVSZpDG4HFkPrkfaPtGrniLhtr1vUhw@mail.gmail.com>
References: <CAJfpegsK_SvLOg9rTMSwVSZpDG4HFkPrkfaPtGrniLhtr1vUhw@mail.gmail.com>
	<10144.1499863410@warthog.procyon.org.uk>
	<12463.1499871476@warthog.procyon.org.uk>
	<20170712082139.17cfd33a@xeon-e3>
	<CA+55aFySG7NAvsphb76J-M2YuM8_4wQ8Cvufu24Gb=EhpaoKTg@mail.gmail.com>
	<20170719090239.39f031c5@gandalf.local.home>
Message-ID: <25485.1500884717@warthog.procyon.org.uk>

Miklos Szeredi <miklos at szeredi.hu> wrote:

> My suggestion was to keep the kernel interface really simple, e.g.:
> 
>    return detailed_error(-EINVAL, "failure to do foo because of bar");

That's what I was thinking of, though I'd prefix the string with a source tag,
such as "nfs", "vfs" or "dvb-core".

David

From mathieu.desnoyers at efficios.com  Thu Jul 27 14:35:45 2017
From: mathieu.desnoyers at efficios.com (Mathieu Desnoyers)
Date: Thu, 27 Jul 2017 14:35:45 +0000 (UTC)
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <20170629232016.4cde203e@gandalf.local.home>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170629212750.5c3542ee@gandalf.local.home>
	<CA+55aFzzCPMUDt72hckauYu+fj=Q2MWjx+XiR06KpMLAr1EBAA@mail.gmail.com>
	<20170629221245.489760b1@gandalf.local.home>
	<CA+55aFxFLvX62SyOC9qyVwEQXH8J224Fe03tvy624AUx0U2fRQ@mail.gmail.com>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<20170629232016.4cde203e@gandalf.local.home>
Message-ID: <292206664.28374.1501166145927.JavaMail.zimbra@efficios.com>

----- On Jun 29, 2017, at 11:20 PM, rostedt rostedt at goodmis.org wrote:

> On Thu, 29 Jun 2017 23:02:51 -0400
> Steven Rostedt <rostedt at goodmis.org> wrote:
> 
>> On Thu, 29 Jun 2017 19:58:54 -0700
>> Alexei Starovoitov <alexei.starovoitov at gmail.com> wrote:
>> 
>> 
>> > Also I'm not planning to fly to Prague just for tracing discussion.
>> > There is netdev2.2 right after in Seoul.
>> > And tracing microconf at plumbers in September which is imo better
>> > suited to discuss tracing related topics.
>> 
>> Which reminds me. The LPC Tracing Microconf WIKI has been stale, and not
>> moving at all. If it is to be accepted, it needs some talk proposals,
>> and fast!
> 
> Also note, Mathieu has stated he wont be attending Plumbers, and I'm
> not sure Peter will be either as he has smaller things to attend to.

I take it back. Work permit delays postpone my conflicting house
renovation work, so I will likely be able to make it to LPC finally. :)

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

From rostedt at goodmis.org  Thu Jul 27 15:57:20 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Thu, 27 Jul 2017 11:57:20 -0400
Subject: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing
 ABI quicksands
In-Reply-To: <292206664.28374.1501166145927.JavaMail.zimbra@efficios.com>
References: <20170629195537.534445e7@gandalf.local.home>
	<20170629212750.5c3542ee@gandalf.local.home>
	<CA+55aFzzCPMUDt72hckauYu+fj=Q2MWjx+XiR06KpMLAr1EBAA@mail.gmail.com>
	<20170629221245.489760b1@gandalf.local.home>
	<CA+55aFxFLvX62SyOC9qyVwEQXH8J224Fe03tvy624AUx0U2fRQ@mail.gmail.com>
	<20170630025852.xjoif3aai6rny5a2@ast-mbp>
	<20170629230251.02f380cb@gandalf.local.home>
	<20170629232016.4cde203e@gandalf.local.home>
	<292206664.28374.1501166145927.JavaMail.zimbra@efficios.com>
Message-ID: <20170727115720.521f06aa@vmware.local.home>

On Thu, 27 Jul 2017 14:35:45 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers at efficios.com> wrote:


> > Also note, Mathieu has stated he wont be attending Plumbers, and I'm
> > not sure Peter will be either as he has smaller things to attend to.  
> 
> I take it back. Work permit delays postpone my conflicting house
> renovation work, so I will likely be able to make it to LPC finally. :)

Great! I see you updated the Wiki that you are attending as well.

 http://wiki.linuxplumbersconf.org/2017:tracing

Thanks, looking forward in seeing you there.

-- Steve


From ebiederm at xmission.com  Mon Jul 31 16:54:45 2017
From: ebiederm at xmission.com (Eric W. Biederman)
Date: Mon, 31 Jul 2017 11:54:45 -0500
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
	regression tracking
In-Reply-To: <20170705130200.7c653f61@gandalf.local.home> (Steven Rostedt's
	message of "Wed, 5 Jul 2017 13:02:00 -0400")
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
Message-ID: <87zibkzgve.fsf@xmission.com>

Steven Rostedt <rostedt at goodmis.org> writes:

> On Wed, 5 Jul 2017 09:48:31 -0700
> Guenter Roeck <linux at roeck-us.net> wrote:
>
>> On 07/05/2017 08:27 AM, Steven Rostedt wrote:
>> > On Wed, 5 Jul 2017 08:16:33 -0700
>> > Guenter Roeck <linux at roeck-us.net> wrote:  
>> [ ... ]
>> >>
>> >> If we start shaming people for not providing unit tests, all we'll accomplish is
>> >> that people will stop providing bug fixes.  
>> > 
>> > I need to be clearer on this. What I meant was, if there's a bug
>> > where someone has a test that easily reproduces the bug, then if
>> > there's not a test added to selftests for said bug, then we should
>> > shame those into doing so.
>> >   
>> 
>> I don't think that public shaming of kernel developers is going to work
>> any better than public shaming of children or teenagers.
>> 
>> Maybe a friendlier approach would be more useful ?
>
> I'm a friendly shamer ;-)
>
>> 
>> If a test to reproduce a problem exists, it might be more beneficial to suggest
>> to the patch submitter that it would be great if that test would be submitted
>> as unit test instead of shaming that person for not doing so. Acknowledging and
>> praising kselftest submissions might help more than shaming for non-submissions.
>> 
>> > A bug that is found by inspection or hard to reproduce test cases are
>> > not applicable, as they don't have tests that can show a regression.
>> >   
>> 
>> My concern would be that once the shaming starts, it won't stop.
>
> I think this is a communication issue. My word for "shaming" was to
> call out a developer for not submitting a test. It wasn't about making
> fun of them, or anything like that. I was only making a point
> about how to teach people that they need to be more aware of the
> testing infrastructure. Not about actually demeaning people.
>
> Lets take a hypothetical sample. Say someone posted a bug report with
> an associated reproducer for it. The developer then runs the reproducer
> sees the bug, makes a fix and sends it to Linus and stable. Now the
> developer forgets this and continues on their merry way. Along comes
> someone like myself and sees a reproducing test case for a bug, but
> sees no test added to kselftests. I would send an email along the lines
> of "Hi, I noticed that there was a reproducer for this bug you fixed.
> How come there was no test added to the kselftests to make sure it
> doesn't appear again?" There, I "shamed" them ;-)

I just want to point out that kselftests are hard to build and run.

As I was looking at another issue I found a bug in one of the tests.  It
had defined a constant wrong.  I have a patch.  It took me a week of
poking at the kselftest code and trying one thing or another (between
working on other things) before I could figure out which combination of
things would let the test build and run.

Until kselftests get easier to run I don't think they are something we
want to push to hard.

Eric

From rostedt at goodmis.org  Mon Jul 31 20:11:23 2017
From: rostedt at goodmis.org (Steven Rostedt)
Date: Mon, 31 Jul 2017 16:11:23 -0400
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
 regression tracking
In-Reply-To: <87zibkzgve.fsf@xmission.com>
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<87zibkzgve.fsf@xmission.com>
Message-ID: <20170731161123.4d1e80ac@gandalf.local.home>

On Mon, 31 Jul 2017 11:54:45 -0500
ebiederm at xmission.com (Eric W. Biederman) wrote:

> Until kselftests get easier to run I don't think they are something we
> want to push to hard.

Then perhaps we should push making them easier to run.

-- Steve

From ebiederm at xmission.com  Mon Jul 31 20:12:46 2017
From: ebiederm at xmission.com (Eric W. Biederman)
Date: Mon, 31 Jul 2017 15:12:46 -0500
Subject: [Ksummit-discuss] [MAINTAINERS SUMMIT] & [TECH TOPIC] Improve
	regression tracking
In-Reply-To: <20170731161123.4d1e80ac@gandalf.local.home> (Steven Rostedt's
	message of "Mon, 31 Jul 2017 16:11:23 -0400")
References: <576cea07-770a-4864-c3f5-0832ff211e94@leemhuis.info>
	<20170703123025.7479702e@gandalf.local.home>
	<ad94dc65-cc9c-f4f1-27c1-5a48603c7f59@leemhuis.info>
	<20170705084528.67499f8c@gandalf.local.home>
	<4080ecc7-1aa8-2940-f230-1b79d656cdb4@redhat.com>
	<20170705092757.63dc2328@gandalf.local.home>
	<20170705140607.GA30187@kroah.com>
	<a462fb3b-a6d4-e969-b301-b404981de224@roeck-us.net>
	<20170705112707.54d7f345@gandalf.local.home>
	<c782a15a-4e73-7373-ca66-5b55e9406059@roeck-us.net>
	<20170705130200.7c653f61@gandalf.local.home>
	<87zibkzgve.fsf@xmission.com>
	<20170731161123.4d1e80ac@gandalf.local.home>
Message-ID: <87o9s0z7pd.fsf@xmission.com>

Steven Rostedt <rostedt at goodmis.org> writes:

> On Mon, 31 Jul 2017 11:54:45 -0500
> ebiederm at xmission.com (Eric W. Biederman) wrote:
>
>> Until kselftests get easier to run I don't think they are something we
>> want to push to hard.
>
> Then perhaps we should push making them easier to run.

Please.

Eric