[Fuego] LPC Increase Test Coverage in a Linux-based OS
daniel.sangorrin at toshiba.co.jp
Thu Nov 10 04:07:10 UTC 2016
> -----Original Message-----
> From: fuego-bounces at lists.linuxfoundation.org [mailto:fuego-bounces at lists.linuxfoundation.org] On Behalf Of Bird, Timothy
> Sent: Wednesday, November 09, 2016 9:21 AM
> To: Guillermo Adrian Ponce Castañeda
> Cc: fuego at lists.linuxfoundation.org
> Subject: Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> I'll go first - Fuego is currently just using the standard Jenkins "weather" report
> and 'list of recent overall pass/failure' for each test. So we don't have anything
> visualizing the results of sub-tests, or even displaying the counts for each test run, at the moment.
> Daniel Sangorrin has just recently proposed a facility to put LTP results into spreadsheet format,
> to allow visualizing test results over time via spreadsheet tools. I'd like to add better
> sub-test visualization in the future, but that's lower on our priority list at the moment.
Actually, the spreadsheet format I'm using is basically CSV plus some colors to easily
distinguish failed from passed tests. It can be opened with LibreOffice or exported to
CSV format. I can also add direct CSV output to my script.
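For illustration, here is a minimal sketch of the kind of CSV output I have in
mind (the test names and results below are made up for the example):

    import csv

    # hypothetical parsed results: test name -> PASS/FAIL
    results = {
        'syscall01': 'PASS',
        'syscall02': 'FAIL',
        'mmap01': 'PASS',
    }

    with open('results.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerow(['test', 'result'])
        for name, result in sorted(results.items()):
            writer.writerow([name, result])

A file like that can be diffed between runs or opened directly in LibreOffice.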
> Also in the future, we'd like to do test results aggregation, to allow for data mining
> of results from tests on different hardware platforms and embedded distributions.
> This will require that the parsed log output be machine-readable, and consistent.
> -- Tim
> > On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <Tim.Bird at am.sony.com> wrote:
> > Victor,
> > Thanks for raising this topic. I think it's an important one. I have
> > some comments below, inline.
> > > -----Original Message-----
> > > From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM
> > >
> > > This week I presented a case study on the problem of the lack of
> > > test log output standardization in the majority of packages that are
> > > used to build current Linux distributions. This was presented as a BOF
> > > ( https://www.linuxplumbersconf.org/2016/ocw/proposals/3555 )
> > > during the Linux Plumbers Conference.
> > >
> > > It was a productive discussion that let us share the problem we have
> > > in the projects that we use every day to build a distribution
> > > (whether an embedded or a cloud-based distribution). The open source
> > > projects don't follow a standard output log format for printing the
> > > passing and failing tests that they run during packaging time
> > > ("make test" or "make check").
> > >
> > > The Clear Linux project is using a simple Perl script that helps them
> > > count the number of passing and failing tests (which would be trivial
> > > if we had a single standard output format among all the projects, but
> > > we don't):
> > >
> > > https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> > >
> > > # perl count.pl <build.log>
> > A few remarks about this. This will be something of a stream of ideas,
> > not very well organized. I'd like to avoid requiring too many different
> > language skills in Fuego. In order to write a test for Fuego, we
> > already require knowledge of shell script, python (for the benchmark
> > parsers) and json formats (for the test specs and plans). I'd be
> > hesitant to adopt something in perl, but maybe there's a way to
> > leverage the expertise embedded in your script.
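Just as a thought experiment, the core counting logic could probably be
ported to a small python script along these lines (the patterns below are
only examples, not the actual regexes from count.pl):

    import re
    import sys

    # example patterns for two common "make check" summary styles;
    # count.pl recognizes many more formats than this
    PASS_RE = re.compile(r'^(PASS|ok)\b', re.MULTILINE)
    FAIL_RE = re.compile(r'^(FAIL|not ok)\b', re.MULTILINE)

    log = open(sys.argv[1]).read()
    print('passed:', len(PASS_RE.findall(log)))
    print('failed:', len(FAIL_RE.findall(log)))

That said, the hard part is the long tail of log formats, not the counting.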
> > I'm not that fond of the idea of integrating all the parsers into a
> > single program. I think it's conceptually simpler to have a parser per
> > log file format. However, I haven't looked in detail at your parser, so
> > I can't really comment on its complexity. I note that 0day has a parser
> > per test (but I haven't checked to see if they re-use common parsers
> > between tests.) Possibly some combination of code-driven and
> > data-driven parsers is best, but I don't have the experience you guys
> > do with your parser.
> > If I understood your presentation, you are currently parsing logs for
> > thousands of packages. I thought you said that about half of the
> > 20,000 packages in a distro have unit tests, and that your parser was
> > covering about half of those (so, about 5000 packages currently).
> > And this is with 26 log formats parsed so far.
> > I'm guessing that packages have a "long tail" of formats, with the
> > formats getting weirder and weirder the farther out on the tail you go.
> > Please correct my numbers if I'm mistaken.
> > > Examples of real packages build logs:
> > >
> > > https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
> > > https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log
> > >
> > > So far that simple (and not well engineered) parser has found 26
> > > "standard" outputs (and counting).
> > This is actually remarkable, as Fuego is only handling the formats for
> > the standalone tests we ship with Fuego. As I stated in the BOF, we
> > have two mechanisms: one for functional tests that uses shell, grep
> > and diff, and one for benchmark tests that uses a very small python
> > program that uses regexes. So, currently we only have 50 tests
> > covered, but many of these parsers use very simple one-line grep
> > regexes.
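(For readers not familiar with Fuego: a benchmark parser really is tiny.
A hypothetical one, with a made-up metric name and log pattern, is roughly
this shape:)

    # parser sketch for a hypothetical benchmark test
    import re

    # one regex that extracts the score from the test log
    SCORE_RE = re.compile(r'^Score:\s+([0-9.]+)', re.MULTILINE)

    def parse(log_text):
        # return the benchmark metric found in the log, if any
        m = SCORE_RE.search(log_text)
        return {'score': float(m.group(1))} if m else {}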
> > Neither of these Fuego log results parser methods supports tracking
> > individual subtest results.
> > > The script has the flaw that it does not recognize the names of the
> > > tests, so it cannot detect regressions. Maybe one test that was
> > > passing in the previous release fails in the new one while another
> > > starts passing, and then the number of failing tests remains the
> > > same.
> > This is a concern with the Fuego log parsing as well.
> > I would like to modify Fuego's parser to not just parse out counts,
> > but also to convert the results to something where individual
> > sub-tests can be tracked over time. Daniel Sangorrin's recent work
> > converting the output of LTP into Excel format might be one way to do
> > this (although I'm not that comfortable with using a proprietary
> > format - I would prefer CSV or JSON, but I think Daniel is going for
> > ease of use first.)
> > I need to do some more research, but I'm hoping that there are Jenkins
> > plugins (maybe xUnit) that will provide tools to automatically handle
> > visualization of test and sub-test results over time. If so, I might
> > try converting the Fuego parsers to produce that format.
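The Jenkins xUnit plugin consumes JUnit-style XML reports, so if the
parsers emitted something like the minimal (hand-written, purely
illustrative) report below, Jenkins could chart each sub-test over time:

    <testsuite name="LTP" tests="3" failures="1">
      <testcase name="syscall01"/>
      <testcase name="syscall02">
        <failure message="unexpected return code"/>
      </testcase>
      <testcase name="mmap01"/>
    </testsuite>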
> > > To be honest, before presenting at LPC I was very confident that
> > > this script (or a much smarter version of it) could be the beginning
> > > of the solution to the problem we have. However, during the
> > > discussion at LPC I understood that this might be a huge effort
> > > (perhaps even a bigger one) to solve the nightmare we already have.
> > So far, I think you're solving a somewhat different problem than Fuego
> > is, and in one sense are much farther along than Fuego. I'm hoping we
> > can learn from your experience with this.
> > I do think we share the goal of producing a standard, or at least a
> > recommendation, for a common test log output format. This would help
> > the industry going forward. Even if individual tests don't produce the
> > standard format, it will help 3rd parties write parsers that conform
> > the test output to the format, as well as encourage the development of
> > tools that utilize the format for visualization or regression
> > checking.
> > Do you feel confident enough to propose a format? I don't at the
> > moment. I'd like to survey the industry for 1) existing formats
> > produced by tests (which you have good experience with, and which is
> > maybe already captured well by your perl script), and 2) existing
> > tools that use common formats as input (e.g. the Jenkins xunit
> > plugin). From this I'd like to develop some ideas about the fields
> > that are most commonly used, and a good language to express those
> > fields. My preference would be JSON - I'm something of an XML
> > naysayer, but I could be talked into YAML. Under no circumstances do I
> > want to invent a new language for this.
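To make the discussion concrete, a single record in such a JSON format
could look something like this (the field names are entirely made up,
just to show the shape a proposal might take):

    {
      "test": "LTP",
      "subtest": "syscall02",
      "result": "FAIL",
      "duration_sec": 1.42,
      "board": "beaglebone-black",
      "timestamp": "2016-11-10T04:07:10Z"
    }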
> > > Tim Bird participated in the BOF and recommended that I send a mail
> > > to the Fuego project team in order to look for more input and ideas
> > > about this topic.
> > >
> > > I really believe in the importance of attacking this problem before
> > > we have a bigger one.
> > >
> > > All feedback is more than welcome.
> > Here is how I propose moving forward on this. I'd like to get a group
> > together to study this issue. I wrote down a list of people at LPC who
> > seem to be working on test issues. I'd like to do the following:
> > 1) perform a survey of the areas I mentioned above
> > 2) write up a draft spec
> > 3) send it around for comments (to which individuals and lists is an
> > open issue)
> > 4) discuss it at a future face-to-face meeting (probably at ELC or
> > maybe next year's plumbers)
> > 5) publish it as a standard endorsed by the Linux Foundation
> > Let me know what you think, and if you'd like to be involved.
> > Thanks and regards,
> > -- Tim
> > --
> > - Guillermo Ponce