<div dir="ltr"><div><div><div><div><div><div><div>Hi Tim and Victor,<br><br></div>OK, since we'd like to talk about the tools that are used to visualize the results, I will try to describe how it works without revealing proprietary information, since that code is not open source yet.<br><br></div>There is an initial script that gets the names of the active packages for the Linux distro and fetches the build log for each one.<br></div>That script calls the <a href="http://count.pl">count.pl</a> script and prepends the package name to the results of the <a href="http://count.pl">count.pl</a> script. The resulting string looks like '<package>,100,80,20,0,0', if I remember correctly, and each package's output is appended to a big CSV file with headers.<br></div>Once we have that CSV file, we pass it to another script that creates some graphs.<br><br></div>So basically it is all CSV and home-made tools to analyze it. I think it can be automated, but Victor can give us more details on the current process, if any.<br><br></div>Thanks and regards,<br></div>- Guillermo Ponce<br><div><div><div><div><div><div><div><br></div></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 8, 2016 at 6:21 PM, Bird, Timothy <span dir="ltr"><<a href="mailto:Tim.Bird@am.sony.com" target="_blank">Tim.Bird@am.sony.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
<br>
> -----Original Message-----<br>
> From: Guillermo Adrian Ponce Castañeda on Tuesday, November 08, 2016 11:38 AM<br>
><br>
> I am a co-author of this code and I must confess that it was more or less my<br>
> fault that it was made in Perl.<br>
<br>
</span>No blame intended. :-)<br>
<span class=""><br>
><br>
> Regarding how many logs the program analyzes, I think it is nowhere near<br>
> 5000, it is much less, but taking into account that some logs are similar I think it is<br>
> possible that some logs that haven't been tested are going to work, but who<br>
> knows :).<br>
><br>
><br>
> And about the output file: right now it delivers a comma-separated list of<br>
> numbers, without headers. This is because this code is part of a bigger tool; I<br>
> think that code is not open source yet, but that doesn't matter, I guess. The<br>
> thing here is that I think the output could be changed into JSON like you<br>
> suggested, and I can try to translate the code from Perl to Python. I'm still not<br>
> sure how long it's going to take, but I can sure try.<br>
<br>
</span>Well, don't do any rewriting just yet. I think we need to consider the<br>
output format some more, and first decide whether it makes sense to have a<br>
single parser vs. multiple parsers.<br>
<br>
An important issue here is scalability of the project, and making it easy<br>
to allow (and incentivize) other developers to create and maintain<br>
parsers for the log files. Or, to help encourage people to use a common<br>
format either initially, or by conversion from their current log format.<br>
The only way to scale this is by having 3rd parties adopt the format, and<br>
be willing to maintain compatibility with it over time.<br>
<br>
I think it's important to consider what will motivate people to adopt a common<br>
log format. They either need to 1) write a parser for their current format, or<br>
2) identify an existing parser which is close enough and modify it to<br>
support their format, or 3) convert their test output directly to the desired<br>
format. This will be some amount of work whichever route people take.<br>
<br>
I think what will be of value is having tools that read and process the format,<br>
and provide utility to those who use the format for output. So I want to do a bit<br>
of a survey on what tools (visualizers, aggregators, automated processors,<br>
notifiers, etc.) might be useful to different developer groups, and make sure<br>
the format is something that can be used by existing tools or by envisioned<br>
future tools, that would be valuable to community members.<br>
<br>
In more high-level terms, we should try to create a two-sided network effect,<br>
where use (output) of the format drives tool creation, and tool usage<br>
of the format (input) drives format popularity.<br>
<br>
Can you describe a bit more what tools, if any, you use to view the results,<br>
or any other processing systems that the results are used with? If you are reviewing<br>
results manually, are there steps you are doing now by hand that you'd like to<br>
do automatically in the future, that a common format would help you with?<br>
<br>
I'll go first - Fuego is currently just using the standard Jenkins "weather" report<br>
and 'list of recent overall pass/failure' for each test. So we don't have anything<br>
visualizing the results of sub-tests, or even displaying the counts for each test run, at the moment.<br>
Daniel Sangorrin has just recently proposed a facility to put LTP results into spreadsheet format,<br>
to allow visualizing test results over time via spreadsheet tools. I'd like to add better<br>
sub-test visualization in the future, but that's lower on our priority list at the moment.<br>
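For what it's worth, the spreadsheet route could be as simple as appending one row per sub-test per run. Here is a minimal Python sketch; the dict of sub-test outcomes and the column layout are assumptions for illustration, not Daniel's actual tool:<br>

```python
import csv

def append_run(csv_path, run_id, results):
    """Append one run's sub-test outcomes as rows of (run_id, sub-test, result).

    'results' is a hypothetical mapping such as {"fcntl01": "PASS"};
    a real LTP log would need its own parser to produce it.
    """
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        for subtest in sorted(results):
            writer.writerow([run_id, subtest, results[subtest]])
```

With one row per (run, sub-test) pair, a spreadsheet pivot table or a small script can chart any sub-test's pass/fail history over time.<br>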
<br>
Also in the future, we'd like to do test results aggregation, to allow for data mining<br>
of results from tests on different hardware platforms and embedded distributions.<br>
This will require that the parsed log output be machine-readable, and consistent.<br>
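One machine-readable possibility, shown purely as an illustration (every field name below is an assumption, not a proposed standard), is one JSON object per sub-test result:<br>

```python
import json

# Illustrative record only; these field names are assumptions,
# not part of any agreed-upon format.
record = {
    "test": "LTP",
    "subtest": "fcntl01",
    "result": "PASS",
    "board": "beaglebone-black",
    "timestamp": "2016-11-08T18:21:00Z",
}

# Emitting one object per line ("JSON lines") keeps large result
# sets easy to stream, diff, and aggregate.
line = json.dumps(record, sort_keys=True)
```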
-- Tim<br>
<span class=""><br>
> On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy <<a href="mailto:Tim.Bird@am.sony.com">Tim.Bird@am.sony.com</a><br>
</span><span class="">> <mailto:<a href="mailto:Tim.Bird@am.sony.com">Tim.Bird@am.sony.com</a>> > wrote:<br>
><br>
><br>
> Victor,<br>
><br>
> Thanks for raising this topic. I think it's an important one. I have<br>
> some comments below, inline.<br>
><br>
> > -----Original Message-----<br>
> > From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM<br>
> ><br>
> > This week I presented a case study of the problem of the lack of test<br>
> > log output standardization in the majority of packages that are used<br>
> > to build the current Linux distributions. This was presented as a BOF<br>
> > ( <a href="https://www.linuxplumbersconf.org/2016/ocw/proposals/3555" rel="noreferrer" target="_blank">https://www.linuxplumbersconf.<wbr>org/2016/ocw/proposals/3555</a><br>
</span>> <<a href="https://www.linuxplumbersconf.org/2016/ocw/proposals/3555" rel="noreferrer" target="_blank">https://www.<wbr>linuxplumbersconf.org/2016/<wbr>ocw/proposals/3555</a>> ) during<br>
<span class="">> > the Linux Plumbers Conference.<br>
> ><br>
> > It was a productive discussion that let us share the problem that<br>
> we<br>
> > have in the current projects that we use every day to build a<br>
> > distribution (either an embedded or a cloud-based distribution).<br>
> > The open source projects don't follow a standard output log format<br>
> to<br>
> > print the passing and failing tests that they run during packaging<br>
> > time ( "make test" or "make check" )<br>
> ><br>
> > The Clear Linux project is using a simple Perl script that helps them<br>
> > to count the number of passing and failing tests (which should be<br>
> > trivial if we could have a single standard output among all the projects,<br>
> > but we don’t):<br>
> ><br>
> ><br>
> <a href="https://github.com/clearlinux/autospec/blob/master/autospec/count.pl" rel="noreferrer" target="_blank">https://github.com/clearlinux/<wbr>autospec/blob/master/autospec/<wbr>count.pl</a><br>
> <<a href="https://github.com/clearlinux/autospec/blob/master/autospec/count.pl" rel="noreferrer" target="_blank">https://github.com/<wbr>clearlinux/autospec/blob/<wbr>master/autospec/count.pl</a>><br>
> ><br>
</span>> > # perl <a href="http://count.pl" rel="noreferrer" target="_blank">count.pl</a> <<a href="http://count.pl" rel="noreferrer" target="_blank">http://count.pl</a>> <build.log><br>
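In rough terms, that invocation turns a build log into pass/fail counts. A naive Python sketch of the same idea (assuming simple PASS/FAIL/SKIP markers; the real count.pl handles many more formats) might be:<br>

```python
import re

def count_results(log_text):
    """Naive counter for pass/fail/skip markers in a build log.

    This is a sketch, not a port of count.pl; the real script knows
    dozens of output formats that a single regex cannot cover.
    """
    counts = {"pass": 0, "fail": 0, "skip": 0}
    for line in log_text.splitlines():
        if re.search(r"\bPASS(ED)?\b", line):
            counts["pass"] += 1
        elif re.search(r"\bFAIL(ED)?\b", line):
            counts["fail"] += 1
        elif re.search(r"\bSKIP(PED)?\b", line):
            counts["skip"] += 1
    return counts
```

For example, count_results(open("build.log").read()) would return a dict of counts for one log.<br>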
<div><div class="h5">><br>
> A few remarks about this. This will be something of a stream of<br>
> ideas, not<br>
> very well organized. I'd like to prevent requiring too many different<br>
> language skills in Fuego. In order to write a test for Fuego, we<br>
> already require<br>
> knowledge of shell script, python (for the benchmark parsers) and<br>
> json formats<br>
> (for the test specs and plans). I'd be hesitant to adopt something in<br>
> perl, but maybe<br>
> there's a way to leverage the expertise embedded in your script.<br>
><br>
> I'm not that fond of the idea of integrating all the parsers into a single<br>
> program.<br>
> I think it's conceptually simpler to have a parser per log file format.<br>
> However,<br>
> I haven't looked in detail at your parser, so I can't really comment on<br>
> its<br>
> complexity. I note that 0day has a parser per test (but I haven't<br>
> checked to<br>
> see if they re-use common parsers between tests.) Possibly some<br>
> combination<br>
> of code-driven and data-driven parsers is best, but I don't have the<br>
> experience<br>
> you guys do with your parser.<br>
><br>
> If I understood your presentation, you are currently parsing<br>
> logs for thousands of packages. I thought you said that about half of<br>
> the<br>
> 20,000 packages in a distro have unit tests, and I thought you said<br>
> that<br>
> your parser was covering about half of those (so, about 5000<br>
> packages currently).<br>
> And this is with 26 log formats parsed so far.<br>
><br>
> I'm guessing that packages have a "long tail" of formats, with them<br>
> getting<br>
> weirder and weirder the farther out on the tail you get.<br>
><br>
> Please correct my numbers if I'm mistaken.<br>
><br>
> > Examples of real packages build logs:<br>
> ><br>
> ><br>
> <a href="https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>gcc/6.2.1/2.fc25/data/logs/x8</a><br>
</div></div>> <<a href="https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>gcc/6.2.1/2.fc25/data/logs/x</a><br>
> 8><br>
> > 6_64/build.log<br>
> ><br>
> <a href="https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>acl/2.2.52/11.fc24/data/logs/x</a><br>
> <<a href="https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>acl/2.2.52/11.fc24/data/logs/</a><br>
> x><br>
<div class="HOEnZb"><div class="h5">> > 86_64/build.log<br>
> ><br>
> > So far that simple (and not well engineered) parser has found 26<br>
> > “standard” outputs ( and counting ) .<br>
><br>
> This is actually remarkable, as Fuego is only handling the formats for<br>
> the<br>
> standalone tests we ship with Fuego. As I stated in the BOF, we have<br>
> two<br>
> mechanisms, one for functional tests that uses shell, grep and diff,<br>
> and<br>
> one for benchmark tests that uses a very small python program that<br>
> uses<br>
> regexes. So, currently we only have 50 tests covered, but many of<br>
> these<br>
> parsers use very simple one-line grep regexes.<br>
><br>
> Neither of these Fuego log results parser methods supports tracking<br>
> individual<br>
> subtest results.<br>
><br>
> > The script has the flaw that it<br>
> > does not recognize the name of the tests in order to detect<br>
> > regressions. Maybe one test was passing in the previous release<br>
> and in<br>
> > the new one is failing, and then the number of failing tests remains<br>
> > the same.<br>
><br>
> This is a concern with the Fuego log parsing as well.<br>
><br>
> I would like to modify Fuego's parser to not just parse out counts, but<br>
> to<br>
> also convert the results to something where individual sub-tests can<br>
> be<br>
> tracked over time. Daniel Sangorrin's recent work converting the<br>
> output<br>
> of LTP into Excel format might be one way to do this (although I'm<br>
> not<br>
> that comfortable with using a proprietary format - I would prefer CSV<br>
> or json, but I think Daniel is going for ease of use first.)<br>
><br>
> I need to do some more research, but I'm hoping that there are<br>
> Jenkins<br>
> plugins (maybe xUnit) that will provide tools to automatically handle<br>
> visualization of test and sub-test results over time. If so, I might<br>
> try converting the Fuego parsers to produce that format.<br>
><br>
> > To be honest, before presenting at LPC I was very confident that<br>
> this<br>
> > script (or another version of it, much smarter) could be the beginning<br>
> > of the solution to the problem we have. However, during the<br>
> discussion<br>
> > at LPC I understood that this might be a huge effort (not sure if<br>
> > bigger) in order to solve the nightmare we already have.<br>
><br>
> So far, I think you're solving a bit different problem than Fuego is,<br>
> and in one sense are<br>
> much farther along than Fuego. I'm hoping we can learn from your<br>
> experience with this.<br>
><br>
> I do think we share the goal of producing a standard, or at least a<br>
> recommendation,<br>
> for a common test log output format. This would help the industry<br>
> going forward.<br>
> Even if individual tests don't produce the standard format, it will help<br>
> 3rd parties<br>
> write parsers that conform the test output to the format, as well as<br>
> encourage the<br>
> development of tools that utilize the format for visualization or<br>
> regression checking.<br>
><br>
> Do you feel confident enough to propose a format? I don't at the<br>
> moment.<br>
> I'd like to survey the industry for 1) existing formats produced by<br>
> tests (which you have good experience<br>
> with, and which is maybe already captured well by your Perl script), and 2)<br>
> existing tools<br>
> that use common formats as input (e.g. the Jenkins xunit plugin).<br>
> From this I'd like<br>
> to develop some ideas about the fields that are most commonly<br>
> used, and a good language to<br>
> express those fields. My preference would be JSON - I'm something<br>
> of an XML naysayer, but<br>
> I could be talked into YAML. Under no circumstances do I want to<br>
> invent a new language for<br>
> this.<br>
><br>
> > Tim Bird participated in the BOF and recommended that I send a mail to<br>
> > the Fuego project team in order to look for more inputs and ideas about<br>
> > this topic.<br>
> ><br>
> > I really believe in the importance of attacking this problem before we<br>
> > have a bigger problem<br>
> ><br>
> > All feedback is more than welcome<br>
><br>
> Here is how I propose moving forward on this. I'd like to get a group<br>
> together to study this<br>
> issue. I wrote down a list of people at LPC who seem to be working<br>
> on test issues. I'd like to<br>
> do the following:<br>
> 1) perform a survey of the areas I mentioned above<br>
> 2) write up a draft spec<br>
> 3) send it around for comments (to what individual and lists? is an<br>
> open issue)<br>
> 4) discuss it at a future face-to-face meeting (probably at ELC or<br>
> maybe next year's plumbers)<br>
> 5) publish it as a standard endorsed by the Linux Foundation<br>
><br>
> Let me know what you think, and if you'd like to be involved.<br>
><br>
> Thanks and regards,<br>
> -- Tim<br>
><br>
><br>
><br>
><br>
><br>
><br>
> --<br>
><br>
> - Guillermo Ponce<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">- Guillermo Ponce</div></div>
</div>