[Fuego] LPC Increase Test Coverage in a Linux-based OS

Bird, Timothy Tim.Bird at am.sony.com
Tue Nov 8 00:26:56 UTC 2016


Victor,

Thanks for raising this topic.  I think it's an important one.  I have some comments below, inline.

> -----Original Message-----
> From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
>
> This week I presented a case study of the problem of the lack of test
> log output standardization in the majority of packages that are used
> to build the current Linux distributions. This was presented as a BOF
> ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555 ) during
> the Linux Plumbers Conference.
> 
> It was a productive discussion that let us share the problem that we
> have in the current projects that we use every day to build a
> distribution (whether an embedded or a cloud-based distribution).
> The open source projects don't follow a standard output log format to
> print the passing and failing tests that they run during packaging
> time ("make test" or "make check").
> 
> The Clear Linux project is using a simple Perl script that helps them
> to count the number of passing and failing tests (which would be
> trivial if we had a single standard output format among all the
> projects, but we don't):
> 
> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> 
> # perl count.pl <build.log>

A few remarks about this.  This will be something of a stream of ideas, not
very well organized.  I'd like to avoid requiring too many different
language skills in Fuego.  In order to write a test for Fuego, we already require
knowledge of shell script, python (for the benchmark parsers) and json formats
(for the test specs and plans).  I'd be hesitant to adopt something in perl, but maybe
there's a way to leverage the expertise embedded in your script.

I'm not that fond of the idea of integrating all the parsers into a single program.
I think it's conceptually simpler to have a parser per log file format.  However,
I haven't looked in detail at your parser, so I can't really comment on its
complexity.  I note that 0day has a parser per test (but I haven't checked to
see if they re-use common parsers between tests).  Possibly some combination
of code-driven and data-driven parsers is best, but I don't have the experience
you guys do with your parser.
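
To make the per-format idea concrete, here's a rough sketch of what I have
in mind (hypothetical code, not taken from your script or from Fuego): a
small code-driven parser per log format, selected by a data-driven table of
detection regexes.

  import re

  # Hypothetical per-format parsers; each returns (passed, failed) counts.
  def parse_automake(log_text):
      passed = len(re.findall(r"^PASS:", log_text, re.MULTILINE))
      failed = len(re.findall(r"^FAIL:", log_text, re.MULTILINE))
      return passed, failed

  def parse_tap(log_text):
      passed = len(re.findall(r"^ok \d+", log_text, re.MULTILINE))
      failed = len(re.findall(r"^not ok \d+", log_text, re.MULTILINE))
      return passed, failed

  # Data-driven dispatch: a detection regex picks the parser for a log.
  FORMATS = [
      (re.compile(r"^(PASS|FAIL):", re.MULTILINE), parse_automake),
      (re.compile(r"^\d+\.\.\d+$", re.MULTILINE), parse_tap),
  ]

  def count_results(log_text):
      for detector, parser in FORMATS:
          if detector.search(log_text):
              return parser(log_text)
      return None  # unknown format

The appeal of this arrangement is that supporting the 27th format is mostly
just one more entry in the table.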

If I understood your presentation, you are currently parsing
logs for thousands of packages. I thought you said that about half of the
20,000 packages in a distro have unit tests, and I thought you said that
your parser was covering about half of those (so, about 5000 packages currently).
And this is with 26 log formats parsed so far.

I'm guessing that packages have a "long tail" of formats, with the formats
getting weirder and weirder the farther out on the tail you go.

Please correct my numbers if I'm mistaken.

> Examples of real packages build logs:
> 
> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8
> 6_64/build.log
> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x
> 86_64/build.log
> 
> So far that simple (and not well engineered) parser has found 26
> “standard” outputs (and counting).

This is actually remarkable, as Fuego is only handling the formats for the
standalone tests we ship with Fuego.  As I stated in the BOF, we have two
mechanisms, one for functional tests that uses shell, grep and diff, and
one for benchmark tests that uses a very small python program that uses
regexes.   So, currently we only have 50 tests covered, but many of these
parsers use very simple one-line grep regexes.
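
To give a flavor of the benchmark-side mechanism, here is a sketch in that
spirit (the log file name and metric are made up; this is not the actual
Fuego parser code):

  import re

  # Hypothetical: pull one metric out of a benchmark log with one regex.
  PATTERN = re.compile(r"Throughput:\s*([\d.]+)\s*MB/s")

  with open("testlog.txt") as f:      # made-up log file name
      m = PATTERN.search(f.read())
  if m:
      print("throughput = " + m.group(1))
  else:
      print("parse error: metric not found")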

Neither of these Fuego log results parser methods supports tracking individual
subtest results.

> The script has the flaw that it
> does not recognize the names of the tests, so it cannot detect
> regressions. Maybe one test was passing in the previous release and is
> failing in the new one while another test changed the other way, and
> then the number of failing tests remains the same.

This is a concern with the Fuego log parsing as well.

I would like to modify Fuego's parser to not just parse out counts, but to
also convert the results to something where individual sub-tests can be
tracked over time.  Daniel Sangorrin's recent work converting the output
of LTP into Excel format might be one way to do this (although I'm not
that comfortable with using a proprietary format - I would prefer CSV
or json, but I think Daniel is going for ease of use first).
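
What I'm imagining is that a parser would emit one record per sub-test
rather than bare counts.  A hypothetical sketch (the field names are just
an example, not a proposal):

  import csv
  import json

  # Hypothetical per-subtest records produced by a parser.
  results = [
      {"suite": "LTP", "test": "syscalls", "subtest": "open01", "result": "PASS"},
      {"suite": "LTP", "test": "syscalls", "subtest": "open02", "result": "FAIL"},
  ]

  # CSV: easy to diff run-to-run to spot regressions by name.
  with open("results.csv", "w") as f:
      writer = csv.DictWriter(f, fieldnames=["suite", "test", "subtest", "result"])
      writer.writeheader()
      writer.writerows(results)

  # JSON: the same data, friendlier for other tools.
  with open("results.json", "w") as f:
      json.dump(results, f, indent=2)

With records like that, comparing two runs by (test, subtest) name catches
a regression even when the totals stay the same.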

I need to do some more research, but I'm hoping that there are Jenkins
plugins (maybe xUnit) that will provide tools to automatically handle 
visualization of test and sub-test results over time.  If so, I might
try converting the Fuego parsers to produce that format.
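
For example, if it turns out the plugin consumes JUnit-style XML (which the
Jenkins xUnit plugin can read), the conversion from per-subtest records
could be fairly small.  A rough sketch, reusing the hypothetical record
layout from above:

  import xml.etree.ElementTree as ET

  def to_junit_xml(results, suite_name="fuego-results"):
      # Minimal JUnit-style testsuite/testcase XML.
      failures = sum(1 for r in results if r["result"] != "PASS")
      suite = ET.Element("testsuite", name=suite_name,
                         tests=str(len(results)), failures=str(failures))
      for r in results:
          case = ET.SubElement(suite, "testcase",
                               classname=r["test"], name=r["subtest"])
          if r["result"] != "PASS":
              ET.SubElement(case, "failure", message="subtest failed")
      return ET.tostring(suite)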

> To be honest, before presenting at LPC I was very confident that this
> script (or another version of it, much smarter) could be the beginning
> of the solution to the problem we have. However, during the discussion
> at LPC I understood that solving the nightmare we already have might
> be a huge effort (perhaps even bigger than I thought).

So far, I think you're solving a somewhat different problem than Fuego is, and in
one sense you are much farther along than Fuego.  I'm hoping we can learn from your
experience with this.

I do think we share the goal of producing a standard, or at least a recommendation,
for a common test log output format.  This would help the industry going forward.
Even if individual tests don't produce the standard format, it will help 3rd parties
write parsers that conform the test output to the format, as well as encourage the
development of tools that utilize the format for visualization or regression checking.

Do you feel confident enough to propose a format?  I don't at the moment.
I'd like to survey the industry for 1) existing formats produced by tests (which you have good experience
with, and which may already be captured well by your perl script), and 2) existing tools
that use common formats as input (e.g. the Jenkins xUnit plugin).  From this I'd like
to develop some ideas about the fields that are most commonly used, and a good language to
express those fields.  My preference would be JSON - I'm something of an XML naysayer, but
I could be talked into YAML.  Under no circumstances do I want to invent a new language for
this.
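
Just to make the discussion concrete, a result record in such a format might
look something like this (purely illustrative; these field names are mine,
not a proposal):

  {
    "test": "acl",
    "version": "2.2.52",
    "start_time": "2016-11-05T10:15:00Z",
    "duration_sec": 12.3,
    "counts": { "pass": 1, "fail": 1, "skip": 0 },
    "results": [
      { "name": "setfacl/001", "result": "PASS" },
      { "name": "setfacl/002", "result": "FAIL", "message": "unexpected exit code" }
    ]
  }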
 
> Tim Bird participated in the BOF and recommended that I send a mail to
> the Fuego project team in order to look for more input and ideas about
> this topic.
> 
> I really believe in the importance of attacking this problem before we
> have a bigger problem.
> 
> All feedback is more than welcome

Here is how I propose moving forward on this.  I'd like to get a group together to study this
issue.  I wrote down a list of people at LPC who seem to be working on test issues.  I'd like to
do the following:
 1) perform a survey of the areas I mentioned above
 2) write up a draft spec
 3) send it around for comments (to which individuals and lists is an open issue)
 4) discuss it at a future face-to-face meeting (probably at ELC or maybe next year's plumbers)
 5) publish it as a standard endorsed by the Linux Foundation

Let me know what you think, and if you'd like to be involved.

Thanks and regards,
 -- Tim


