[Fuego] LPC Increase Test Coverage in a Linux-based OS

Daniel Sangorrin daniel.sangorrin at toshiba.co.jp
Mon Nov 14 01:44:05 UTC 2016


> -----Original Message-----
> From: Victor Rodriguez [mailto:vm.rod25 at gmail.com]
> Sent: Thursday, November 10, 2016 10:30 PM
> To: Daniel Sangorrin
> Cc: fuego at lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
> Subject: Re: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> 
> On Wed, Nov 9, 2016 at 10:09 PM, Daniel Sangorrin
> <daniel.sangorrin at toshiba.co.jp> wrote:
> > Hi Victor,
> >
> >> -----Original Message-----
> >> From: fuego-bounces at lists.linuxfoundation.org [mailto:fuego-bounces at lists.linuxfoundation.org] On Behalf Of Victor Rodriguez
> >> Sent: Sunday, November 06, 2016 2:15 AM
> >> To: fuego at lists.linuxfoundation.org; Guillermo Adrian Ponce Castañeda
> >> Subject: [Fuego] LPC Increase Test Coverage in a Linux-based OS
> >>
> >> Hi Fuego team.
> >>
> >> This week I presented a case study on the lack of test log output
> >> standardization in the majority of packages that are used to build
> >> current Linux distributions. It was presented as a BoF
> >> (https://www.linuxplumbersconf.org/2016/ocw/proposals/3555) during
> >> the Linux Plumbers Conference.
> >>
> >> It was a productive discussion that let us share the problem we have
> >> with the projects we use every day to build a distribution (whether
> >> an embedded or a cloud-based one). The open source projects don't
> >> follow a standard output log format for reporting the passing and
> >> failing tests that they run at packaging time ("make test" or
> >> "make check").
> >
> > Sorry I couldn't download your slides because of proxy issues but
> > I think you are talking about the tests that are inside packages (e.g. .deb .rpm files).
> > For example, autopkgtest for debian. Is that correct?
> >
> 
> Yes
> 
> > I'm not an expert about them, but I believe these tests can also be executed
> > decoupled  from the build process in a flexible way (e.g.: locally, on qemu,
> > remotely through ssh, or on an lxc/schroot environment for example).
> >
> 
> Yes, with a little extra work in the tool path; for example, some of
> the tests point to the binary they just built instead of the one in
> /usr/bin. But yes, with a little extra work all these tests can be
> decoupled.
> 
> > Being able to leverage all these tests in Fuego for testing package-based
> > embedded systems would be great.
> >
> 
> Yes !!!
> 
> > For non-package-based embedded systems, I think those tests [2]
> > could be ported and made cross-compilable. In particular, Yocto/OpenEmbedded's ptest
> > framework decouples the compiling phase from the testing phase and
> > produces "a consistent output format".
> >
> > [1] https://packages.debian.org/sid/autopkgtest
> > [2] https://wiki.yoctoproject.org/wiki/Ptest
> >
> 
> I knew I was not wrong when I mentioned ptest during the conference.
> 
> Let me take a look and see how they work

Ptest normally uses the test suite that comes with the original source code. For
example, for the openssh recipe it uses some of the tests inside openssh's "regress" folder.

However, there are many recipes without their corresponding ptest definitions. I'm not sure
if that is just because nobody added them yet, or because there was no test suite
in the original source code.
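
Just to illustrate what a consumer of that "consistent output format" might look like,
here is a rough sketch (untested, assuming the usual ptest convention of one
"PASS: <name>" / "FAIL: <name>" / "SKIP: <name>" line per test; file names here are
only examples, not anything in Fuego):

import re
import sys
from collections import Counter

result_re = re.compile(r'^(PASS|FAIL|SKIP):\s+(.*)$')

def count_ptest(path):
    """Count ptest-style result lines and collect the names of failed tests."""
    counts = Counter()
    failed = []
    with open(path) as f:
        for line in f:
            m = result_re.match(line.strip())
            if m:
                status, name = m.groups()
                counts[status] += 1
                if status == 'FAIL':
                    failed.append(name)
    return counts, failed

if __name__ == '__main__':
    counts, failed = count_ptest(sys.argv[1])  # path to a ptest log
    print(counts)
    print('failed tests:', failed)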

There is another thing called testimage [1] that seems closer to the build tests you
are talking about, but I have never used it. Might be worth asking about it.

[1] https://wiki.yoctoproject.org/wiki/Image_tests

> >> The Clear Linux project is using a simple Perl script that helps them
> >> to count the number of passing and failing tests (which would be
> >> trivial if we had a single standard output format across all the
> >> projects, but we don't):
> >
> > I think that counting is good but we also need to know specifically which test/subtest
> > in particular failed and what the error log was like.
> >
> 
> Great, how do you push this to Jenkins?

At the moment, I just put a link from the jenkins LTP webpage to the spreadsheet file.
When the LTP tests finish, you click on that link and get the updated spreadsheet.
In the future I want to split the parsing functionality from the spreadsheet (see below).

> What do you think about TAP?

I didn't know about TAP before. From my understanding, the parts of TAP that would be
useful for us are the "specification of the test output format" and the available
"consumers/parsers" (including one for Jenkins [2] and a Python library [3]).

Although subtests (grouping/test suites) do not seem to be officially part of the TAP 13
specification (according to [2]), the format looks quite flexible:

1..2
ok 1
not ok 2 - br.eti.kinoshita.selenium.TestListVeterinarians#testGoogle
  ---
  extensions:
      Files:
          my_message.txt:
            File-Title: my_message.txt
            File-Description: Sample message
            File-Size: 31
            File-Name: message.txt
            File-Content: TuNvIGNvbnRhdmFtIGNvbSBtaW5oYSBhc3T6Y2lhIQ==
            File-Type: image/png
  ...

Compared with the output of ctest (the test framework provided by CMake), I only miss
two things:
  - The name of the test/subtest (TAP uses numbers, and I don't like that)
  - Timing information (ctest tells you how long each test took to finish)

Still, I think both could be worked out easily.

[2] https://wiki.jenkins-ci.org/display/JENKINS/TAP+Plugin
[3] https://pypi.python.org/pypi/tap.py
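
For example, extracting the pass/fail status, number and test name from TAP output only
takes a few lines (a quick sketch that does not use the libraries above; the regular
expression only covers the basic "ok"/"not ok" lines, not plans or YAML diagnostics):

import re

# matches "ok"/"not ok", an optional test number and an optional description
tap_line = re.compile(r'^(not )?ok\b\s*(\d+)?\s*-?\s*(.*)$')

def parse_tap(lines):
    results = []
    for line in lines:
        m = tap_line.match(line.strip())
        if m:
            results.append({
                'passed': m.group(1) is None,
                'number': int(m.group(2)) if m.group(2) else None,
                'name': m.group(3) or None,
            })
    return results

example = """1..2
ok 1
not ok 2 - br.eti.kinoshita.selenium.TestListVeterinarians#testGoogle"""
for result in parse_tap(example.splitlines()):
    print(result)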

> If you could share a csv example, that would be great

I'm using a normal spreadsheet, not CSV (although it's just a table, like CSV). The advantages
over CSV are that you don't have to worry about commas in the error logs; that you can separate
test cases into sheets; and that you can apply colors for easier visualization when you have
lots of tests passing and only a few of them failing. You can see the example attached or
on the slides at [4].

Another advantage is that you can later analyze it (e.g. calculate the five-number summary,
create some figures, etc.), write comments about why a test is failing (e.g. only because that
functionality is not present), and hand it to your customer as a report.
# This is an important point that some people miss. The test results are not just for the
# developers. They are used for compliance, certification, or customer deliverables as well.
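
Generating that kind of spreadsheet from parsed results is straightforward. A minimal
sketch (assuming the openpyxl Python library is available; the data, sheet and file
names are made up for illustration):

from openpyxl import Workbook
from openpyxl.styles import PatternFill

red = PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')

# hypothetical parsed results: suite name -> list of (test, result, log)
results = {
    'LTP': [('abort01', 'PASS', ''), ('chdir04', 'FAIL', 'errno=EACCES')],
}

wb = Workbook()
wb.remove(wb.active)  # drop the default empty sheet
for suite, tests in results.items():
    ws = wb.create_sheet(title=suite)   # one sheet per test suite
    ws.append(['test', 'result', 'log', 'comment'])
    for name, result, log in tests:
        ws.append([name, result, log, ''])
        if result == 'FAIL':
            ws.cell(row=ws.max_row, column=2).fill = red  # highlight failures
wb.save('results.xlsx')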

However, I admit that my implementation mixes a "parser" (which extracts information
from the LTP logs) and a "visualizer" (the spreadsheet). Once we decide on the standard
output format, the spreadsheet can be used purely as a visualizer.

I think we should go for an architecture that looks a bit like the cloud log collectors 
fluentd [5] or logstash.

LTP log format   -----adapter---+
PTS log format   -----adapter---+---> TAP/Ctest format --> Visualizers (jenkins, spreadsheet, gnuplot..)
NNN log format   -----adapter---+

Actually, I think we could even use them directly by writing input adapter plugins and a TAP
output plugin. There is probably some value in doing it that way, because there are many
powerful tools around for visualization (Kibana) and searching (Elasticsearch).

[4] http://elinux.org/images/7/77/Fuego-jamboree-oct-2016.pdf
[5] https://camo.githubusercontent.com/c4abfe337c0b54b36f81bce78481f8965acbc7a9/687474703a2f2f646f63732e666c75656e74642e6f72672f696d616765732f666c75656e74642d6172636869746563747572652e706e67
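
As a rough illustration of the adapter idea (just a sketch, not Fuego code: it assumes
LTP result lines of the form "abort01  PASS  0", as in the logs produced with "runltp -l",
and emits plain TAP on stdout so that any TAP consumer, e.g. the Jenkins TAP plugin,
could pick it up):

import re
import sys

# "testname  PASS|FAIL|CONF|BROK  exit_value"
ltp_line = re.compile(r'^(\S+)\s+(PASS|FAIL|CONF|BROK)\s+(\d+)\s*$')

def ltp_to_tap(lines):
    results = [m.groups() for m in map(ltp_line.match, lines) if m]
    out = ['1..%d' % len(results)]            # TAP plan line
    for i, (name, status, exit_value) in enumerate(results, 1):
        if status == 'PASS':
            out.append('ok %d - %s' % (i, name))
        elif status == 'CONF':
            out.append('ok %d - %s # SKIP not applicable' % (i, name))
        else:
            out.append('not ok %d - %s (exit value %s)' % (i, name, exit_value))
    return '\n'.join(out)

if __name__ == '__main__':
    print(ltp_to_tap(sys.stdin.read().splitlines()))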

Cheers

--
IoT Technology center
Toshiba Corp. Industrial ICT solutions, 
Daniel SANGORRIN

> >> https://github.com/clearlinux/autospec/blob/master/autospec/count.pl
> >>
> >> # perl count.pl <build.log>
> >>
> >> Examples of real packages build logs:
> >>
> >> https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x86_64/build.log
> >> https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x86_64/build.log
> >>
> >> So far that simple (and not well engineered) parser has found 26
> >> “standard” outputs (and counting). The script has the flaw that it
> >> does not recognize the names of the tests, so it cannot detect
> >> regressions. Maybe a test that was passing in the previous release is
> >> failing in the new one, while the number of failing tests stays
> >> the same.
> >>
> >> To be honest, before presenting at LPC I was very confident that this
> >> script (or a smarter version of it) could be the beginning of the
> >> solution to the problem we have. However, during the discussion at
> >> LPC I understood that solving the nightmare we already have might be
> >> a huge effort, if not a bigger one.
> >>
> >> Tim Bird participated in the BoF and recommended that I send a mail to
> >> the Fuego project team in order to look for more input and ideas about
> >> this topic.
> >>
> >> I really believe in the importance of attacking this problem before it
> >> becomes a bigger one.
> >>
> >> All feedback is more than welcome
> >>
> >> Regards
> >>
> >> Victor Rodriguez
> >>
> >> [presentation slides] :
> >> https://drive.google.com/open?id=0B7iKrGdVkDhIcVpncUdGTGhEQTQ
> >> [BOF notes] : https://drive.google.com/open?id=1lOPXQcrhL4AoOBSDnwUlJAKIXsReU8OqP82usZn-DCo
> >> _______________________________________________
> >> Fuego mailing list
> >> Fuego at lists.linuxfoundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/fuego
> >
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: results-jamboree.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 54705 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/fuego/attachments/20161114/45b6a133/attachment-0001.xlsx>

