<div dir="ltr"><div><div><div><div><div>Hello Tim and Victor,<br><br></div>I am a co-author of this code and I must confess that it was more or less my fault that it was written in Perl.<br><br></div>Regarding how many logs the program analyzes, I think it is nowhere near 5000; it is much less. But taking into account that some logs are similar, I think it is possible that some logs that haven't been tested are going to work, but who knows :).<br><br></div>And about the output file: right now it delivers a comma-separated list of numbers, without headers, because this code is part of a bigger tool. I don't think that code is open source yet, but that doesn't matter much here; the output could be changed into JSON like you suggested, and I can try to translate the code from Perl to Python. I'm still not sure how long it's going to take, but I can certainly try.<br><br></div>Thanks.<br></div>- Guillermo Ponce<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <span dir="ltr"><<a href="mailto:Tim.Bird@am.sony.com" target="_blank">Tim.Bird@am.sony.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Victor,<br>
<br>
Thanks for raising this topic. I think it's an important one. I have some comments below, inline.<br>
<br>
> -----Original Message-----<br>
> From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM<br>
><br>
> This week I presented a case study of the problem of the lack of test<br>
> log output standardization in the majority of packages that are used<br>
> to build the current Linux distributions. This was presented as a BOF<br>
> ( <a href="https://www.linuxplumbersconf.org/2016/ocw/proposals/3555" rel="noreferrer" target="_blank">https://www.linuxplumbersconf.<wbr>org/2016/ocw/proposals/3555</a>) during<br>
> the Linux Plumbers Conference.<br>
><br>
> It was a productive discussion that let us share the problem that we<br>
> have in the current projects that we use every day to build a<br>
> distribution (whether an embedded or a cloud-based distribution).<br>
> The open source projects don't follow a standard output log format to<br>
> print the passing and failing tests that they run during packaging<br>
> time ("make test" or "make check").<br>
><br>
> The Clear Linux project is using a simple Perl script that helps them<br>
> to count the number of passing and failing tests (which would be<br>
> trivial if we had a single standard output format among all the projects,<br>
> but we don't):<br>
><br>
> <a href="https://github.com/clearlinux/autospec/blob/master/autospec/count.pl" rel="noreferrer" target="_blank">https://github.com/clearlinux/<wbr>autospec/blob/master/autospec/<wbr>count.pl</a><br>
><br>
> # perl <a href="http://count.pl" rel="noreferrer" target="_blank">count.pl</a> <build.log><br>
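To make the approach concrete, here is a minimal sketch in Python of the kind of counting such a script performs. The log-line patterns below are invented for illustration; the real count.pl recognizes many more formats than this:<br>

```python
import re
import sys

# Hypothetical sketch of a pass/fail counter. It assumes one common
# log style ("PASS: name" / "FAIL: name"); count.pl handles ~26
# distinct formats, so this is illustrative only.
PASS_RE = re.compile(r'^PASS[:\s]', re.MULTILINE)
FAIL_RE = re.compile(r'^FAIL[:\s]', re.MULTILINE)

def count_results(log_text):
    """Return a (passed, failed) tuple of counts found in a build log."""
    return (len(PASS_RE.findall(log_text)),
            len(FAIL_RE.findall(log_text)))

if __name__ == '__main__' and len(sys.argv) > 1:
    # Usage: python count.py build.log
    with open(sys.argv[1]) as f:
        passed, failed = count_results(f.read())
    print("%d,%d" % (passed, failed))
```

As the thread notes, a counter like this loses the test names, which is exactly why it cannot detect regressions when one test starts failing while another starts passing.<br>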
<br>
A few remarks about this. This will be something of a stream of ideas, not<br>
very well organized. I'd like to avoid requiring too many different<br>
language skills in Fuego. In order to write a test for Fuego, we already require<br>
knowledge of shell script, python (for the benchmark parsers) and json formats<br>
(for the test specs and plans). I'd be hesitant to adopt something in perl, but maybe<br>
there's a way to leverage the expertise embedded in your script.<br>
<br>
I'm not that fond of the idea of integrating all the parsers into a single program.<br>
I think it's conceptually simpler to have a parser per log file format. However,<br>
I haven't looked in detail at your parser, so I can't really comment on it's<br>
complexity. I note that 0day has a parser per test (but I haven't checked to<br>
see if they re-use common parsers between tests.) Possibly some combination<br>
of code-driven and data-driven parsers is best, but I don't have the experience<br>
you guys do with your parser.<br>
<br>
If I understood your presentation, you are currently parsing<br>
logs for thousands of packages. I thought you said that about half of the<br>
20,000 packages in a distro have unit tests, and I thought you said that<br>
your parser was covering about half of those (so, about 5000 packages currently).<br>
And this is with 26 log formats parsed so far.<br>
<br>
I'm guessing that packages have a "long tail" of formats, with them getting<br>
weirder and weirder the farther out on the tail of formats you get.<br>
<br>
Please correct my numbers if I'm mistaken.<br>
<br>
> Examples of real packages build logs:<br>
><br>
> <a href="https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>gcc/6.2.1/2.fc25/data/logs/x8</a><br>
> 6_64/build.log<br>
> <a href="https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>acl/2.2.52/11.fc24/data/logs/x</a><br>
> 86_64/build.log<br>
><br>
> So far that simple (and not well engineered) parser has found 26<br>
> “standard” outputs (and counting).<br>
<br>
This is actually remarkable, as Fuego is only handling the formats for the<br>
standalone tests we ship with Fuego. As I stated in the BOF, we have two<br>
mechanisms, one for functional tests that uses shell, grep and diff, and<br>
one for benchmark tests that uses a very small python program that uses<br>
regexes. So, currently we only have 50 tests covered, but many of these<br>
parsers use very simple one-line grep regexes.<br>
<br>
Neither of these Fuego log results parser methods supports tracking individual<br>
subtest results.<br>
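For illustration, a one-line-regex benchmark parser of the kind described might look like the following sketch. The metric name and log-line format here are invented, not Fuego's actual API:<br>

```python
import re

# Hypothetical sketch of a Fuego-style benchmark parser: a single
# regex pulls one numeric metric out of the test log. The
# "Throughput: N MB/s" line format is made up for this example.
def parse_benchmark(log_text):
    """Return the throughput figure from a log, or None if absent."""
    m = re.search(r'Throughput:\s+([\d.]+)\s+MB/s', log_text)
    return float(m.group(1)) if m else None
```

A parser this simple is easy to write per test, but, as noted above, it reports only an aggregate number and tracks no individual sub-test results.<br>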
<br>
> The script has the flaw that it<br>
> does not recognize the names of the tests, so it cannot detect<br>
> regressions: maybe one test was passing in the previous release and<br>
> is failing in the new one, and then the number of failing tests remains<br>
> the same.<br>
<br>
This is a concern with the Fuego log parsing as well.<br>
<br>
I would like to modify Fuego's parser to not just parse out counts, but to<br>
also convert the results to something where individual sub-tests can be<br>
tracked over time. Daniel Sangorrin's recent work converting the output<br>
of LTP into Excel format might be one way to do this (although I'm not<br>
that comfortable with using a proprietary format; I would prefer CSV<br>
or JSON, but I think Daniel is going for ease of use first).<br>
<br>
I need to do some more research, but I'm hoping that there are Jenkins<br>
plugins (maybe xUnit) that will provide tools to automatically handle<br>
visualization of test and sub-test results over time. If so, I might<br>
try converting the Fuego parsers to produce that format.<br>
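As a sketch of what such a conversion could look like, the following hypothetical Python snippet emits JUnit-style XML, which Jenkins plugins such as xUnit can consume. The function name and the result-tuple layout are my own invention, not anything Fuego currently provides:<br>

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: convert per-subtest results into JUnit-style
# XML so that Jenkins plugins (e.g. xUnit) can chart pass/fail
# history for each sub-test over time.
def to_junit_xml(suite_name, results):
    """results: list of (test_name, passed, message) tuples."""
    suite = ET.Element('testsuite', name=suite_name,
                       tests=str(len(results)),
                       failures=str(sum(1 for _, ok, _ in results if not ok)))
    for name, ok, message in results:
        case = ET.SubElement(suite, 'testcase', name=name)
        if not ok:
            # A <failure> child element marks the case as failed.
            ET.SubElement(case, 'failure', message=message)
    return ET.tostring(suite, encoding='unicode')
```

Because each sub-test becomes its own named testcase element, a result that flips from pass to fail is visible even when the aggregate counts stay the same.<br>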
<br>
> To be honest, before presenting at LPC I was very confident that this<br>
> script (or another, much smarter version of it) could be the beginning<br>
> of the solution to the problem we have. However, during the discussion<br>
> at LPC I understood that solving the nightmare we already have might<br>
> take a huge effort (possibly an even bigger one).<br>
<br>
So far, I think you're solving a somewhat different problem than Fuego is, and in one sense you are<br>
much farther along than Fuego. I'm hoping we can learn from your<br>
experience with this.<br>
<br>
I do think we share the goal of producing a standard, or at least a recommendation,<br>
for a common test log output format. This would help the industry going forward.<br>
Even if individual tests don't produce the standard format, it will help 3rd parties<br>
write parsers that conform the test output to the format, as well as encourage the<br>
development of tools that utilize the format for visualization or regression checking.<br>
<br>
Do you feel confident enough to propose a format? I don't at the moment.<br>
I'd like to survey the industry for 1) existing formats produced by tests (which you have good experience<br>
with, and which may already be captured well by your Perl script), and 2) existing tools<br>
that use common formats as input (e.g. the Jenkins xunit plugin). From this I'd like<br>
to develop some ideas about the fields that are most commonly used, and a good language to<br>
express those fields. My preference would be JSON - I'm something of an XML naysayer, but<br>
I could be talked into YAML. Under no circumstances do I want to invent a new language for<br>
this.<br>
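Purely as a strawman, not a proposal, a JSON record for a single sub-test result might carry fields like these (every field name below is an illustrative guess, not something agreed on in this thread):<br>

```python
import json

# Strawman sketch only: illustrative guesses at the kinds of fields a
# shared test-result format might carry. Nothing here is a proposal
# endorsed by anyone on this thread.
record = {
    "test_name": "LTP.syscalls",      # top-level test identifier
    "subtest": "open01",              # individual sub-test name
    "result": "PASS",                 # PASS / FAIL / SKIP
    "duration_sec": 0.42,             # how long the sub-test ran
    "board": "beaglebone-black",      # device under test
    "timestamp": "2016-11-07T18:26:00Z",
}
print(json.dumps(record, indent=2))
```

Keeping the sub-test name as a first-class field is what makes regression tracking possible, which is exactly what the count-only approach loses.<br>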
<br>
> Tim Bird participated in the BOF and recommended that I send a mail to<br>
> the Fuego project team in order to look for more input and ideas about<br>
> this topic.<br>
><br>
> I really believe in the importance of attacking this problem before we<br>
> have a bigger one.<br>
><br>
> All feedback is more than welcome<br>
<br>
Here is how I propose moving forward on this. I'd like to get a group together to study this<br>
issue. I wrote down a list of people at LPC who seem to be working on test issues. I'd like to<br>
do the following:<br>
1) perform a survey of the areas I mentioned above<br>
2) write up a draft spec<br>
3) send it around for comments (to which individuals and lists is an open issue)<br>
4) discuss it at a future face-to-face meeting (probably at ELC or maybe next year's plumbers)<br>
5) publish it as a standard endorsed by the Linux Foundation<br>
<br>
Let me know what you think, and if you'd like to be involved.<br>
<br>
Thanks and regards,<br>
-- Tim<br>
<br>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">- Guillermo Ponce</div></div>
</div>