<div dir="ltr"><div><div><div><div><div><div><div>Hi Tim and Victor,<br><br></div>OK, since we are talking about the tools used to visualize the results, I will try to describe how it works without revealing proprietary information, since that code is not open source yet.<br><br></div>There is an initial script that gets the active package names for the Linux distro and gets the build logs for each one.<br></div>That script calls the <a href="http://count.pl">count.pl</a> script and attaches the package name to its results. The resulting string looks like &#39;&lt;package&gt;,100,80,20,0,0&#39;, if I remember correctly, and each package&#39;s output is appended to a big CSV file with headers.<br></div>Once we have that CSV file, we pass it to another script that creates some graphs.<br><br></div>So basically it is all CSV and homemade tools to analyze it. I think it can be automated, but I guess Victor can give us more details on the current process, if there is one.<br><br></div>Thanks and Regards.<br></div>- Guillermo Ponce<br><div><div><div><div><div><div><div><br></div></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 8, 2016 at 6:21 PM, Bird, Timothy <span dir="ltr">&lt;<a href="mailto:Tim.Bird@am.sony.com" target="_blank">Tim.Bird@am.sony.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>

<br>

&gt; -----Original Message-----<br>

&gt; From: Guillermo Adrian Ponce Castañeda on Tuesday, November 08, 2016 11:38 AM<br>

&gt;<br>

&gt; I am a co-author of this code and I must confess that it was more or less my<br>

&gt; fault that it was made on Perl.<br>

<br>

</span>No blame intended. :-)<br>

<span class=""><br>

&gt;<br>

&gt; Regarding how many logs the program analyzes, I think it is nowhere near<br>

&gt; 5000; it is much less. But taking into account that some logs are similar, I think it is<br>

&gt; possible that some logs that haven&#39;t been tested are going to work, but who<br>

&gt; knows :).<br>

&gt;<br>

&gt;<br>

&gt; And about the output file, right now it delivers a comma-separated list of<br>

&gt; numbers, without headers. This is because this code is part of a bigger tool. I<br>

&gt; think that code is not open source yet, but that doesn&#39;t matter, I guess. The<br>

&gt; thing here is that I think the output could be changed into JSON like you<br>

&gt; suggested, and I can try to translate the code from Perl to Python. I am still not<br>

&gt; sure how long it&#39;s going to take, but I can sure try.<br>

<br>

</span>Well, don&#39;t do any re-writing just yet.  I think we need to consider the<br>

output format some more, and decide whether it makes sense to have a<br>

single vs. multiple parsers first.<br>

<br>

An important issue here is scalability of the project, and making it easy<br>

to allow (and incentivize) other developers to create and maintain<br>

parsers for the log files.  Or, to help encourage people to use a common<br>

format either initially, or by conversion from their current log format.<br>

The only way to scale this is by having 3rd parties adopt the format, and<br>

be willing to maintain compatibility with it over time.<br>

<br>

I think it&#39;s important to consider what will motivate people to adopt a common<br>

log format.  They either need to 1) write a parser for their current format, or<br>

2) identify an existing parser which is close enough and modify it to<br>

support their format, or 3) convert their test output directly to the desired<br>

format.  This will be some amount of work whichever route people take.<br>

<br>

I think what will be of value is having tools that read and process the format,<br>

and provide utility to those who use the format for output.  So I want to do a bit<br>

of a survey on what tools (visualizers, aggregators, automated processors,<br>

notifiers, etc.) might be useful to different developer groups, and make sure<br>

the format is something that can be used by existing tools or by envisioned<br>

future tools, that would be valuable to community members.<br>

<br>

In more high-level terms, we should try to create a double-sided network effect,<br>

where use (output) of the format drives tool creation, and tool usage<br>

of the format (input) drives format popularity.<br>

<br>

Can you describe a bit more what tools, if any, you use to view the results,<br>

or any other processing systems that the results are used with?  If you are reviewing<br>

results manually, are there steps you are doing now by hand that you&#39;d like to<br>

do automatically in the future, that a common format would help you with?<br>

<br>

I&#39;ll go first - Fuego is currently just using the standard Jenkins &quot;weather&quot; report<br>

and &#39;list of recent overall pass/failure&#39; for each test. So we don&#39;t have anything<br>

visualizing the results of sub-tests, or even displaying the counts for each test run, at the moment.<br>

Daniel Sangorrin has just recently proposed a facility to put LTP results into spreadsheet format,<br>

to allow visualizing test results over time via spreadsheet tools.  I&#39;d like to add better<br>

sub-test visualization in the future, but that&#39;s lower on our priority list at the moment.<br>

<br>

Also in the future, we&#39;d like to do test results aggregation, to allow for data mining<br>

of results from tests on different hardware platforms and embedded distributions.<br>

This will require that the parsed log output be machine-readable, and consistent.<br>

 -- Tim<br>

<span class=""><br>

&gt; On Mon, Nov 7, 2016 at 6:26 PM, Bird, Timothy &lt;<a href="mailto:Tim.Bird@am.sony.com">Tim.Bird@am.sony.com</a>&gt; wrote:<br>

</span><span class="">

&gt;<br>

&gt;<br>

&gt;       Victor,<br>

&gt;<br>

&gt;       Thanks for raising this topic.  I think it&#39;s an important one.  I have<br>

&gt; some comments below, inline.<br>

&gt;<br>

&gt;       &gt; -----Original Message-----<br>

&gt;       &gt; From: Victor Rodriguez on Saturday, November 05, 2016 10:15 AM<br>

&gt;       &gt;<br>

&gt;       &gt; This week I presented a case study on the lack of test<br>

&gt;       &gt; log output standardization in the majority of packages that are used<br>

&gt;       &gt; to build the current Linux distributions. This was presented as a BOF<br>

&gt;       &gt; ( <a href="https://www.linuxplumbersconf.org/2016/ocw/proposals/3555" rel="noreferrer" target="_blank">https://www.linuxplumbersconf.<wbr>org/2016/ocw/proposals/3555</a><br>

</span>&gt; )  during<br>

<span class="">&gt;       &gt; the Linux Plumbers Conference.<br>

&gt;       &gt;<br>

&gt;       &gt; It was a productive discussion that let us share the problem that we<br>

&gt;       &gt; have in the current projects that we use every day to build a<br>

&gt;       &gt; distribution (either an embedded or a cloud-based distribution).<br>

&gt;       &gt; The open source projects don&#39;t follow a standard output log format to<br>

&gt;       &gt; print the passing and failing tests that they run during packaging<br>

&gt;       &gt; time (&quot;make test&quot; or &quot;make check&quot;).<br>

&gt;       &gt;<br>

&gt;       &gt; The Clear Linux project is using a simple Perl script that helps them<br>

&gt;       &gt; to count the number of passing and failing tests (which should be<br>

&gt;       &gt; trivial if we could have a single standard output among all the projects,<br>

&gt;       &gt; but we don’t):<br>

&gt;       &gt;<br>

&gt;       &gt;<br>

&gt; <a href="https://github.com/clearlinux/autospec/blob/master/autospec/count.pl" rel="noreferrer" target="_blank">https://github.com/clearlinux/<wbr>autospec/blob/master/autospec/<wbr>count.pl</a><br>

&gt; &lt;<a href="https://github.com/clearlinux/autospec/blob/master/autospec/count.pl" rel="noreferrer" target="_blank">https://github.com/<wbr>clearlinux/autospec/blob/<wbr>master/autospec/count.pl</a>&gt;<br>

&gt;       &gt;<br>

</span>&gt;       &gt; # perl <a href="http://count.pl" rel="noreferrer" target="_blank">count.pl</a> &lt;build.log&gt;<br>

<div><div class="h5">&gt;<br>

&gt;       A few remarks about this.  This will be something of a stream of<br>

&gt; ideas, not<br>

&gt;       very well organized.  I&#39;d like to prevent requiring too many different<br>

&gt;       language skills in Fuego.  In order to write a test for Fuego, we<br>

&gt; already require<br>

&gt;       knowledge of shell script, python (for the benchmark parsers) and<br>

&gt; json formats<br>

&gt;       (for the test specs and plans).  I&#39;d be hesitant to adopt something in<br>

&gt; perl, but maybe<br>

&gt;       there&#39;s a way to leverage the expertise embedded in your script.<br>

&gt;<br>

&gt;       I&#39;m not that fond of the idea of integrating all the parsers into a single<br>

&gt; program.<br>

&gt;       I think it&#39;s conceptually simpler to have a parser per log file format.<br>

&gt; However,<br>

&gt;       I haven&#39;t looked in detail at your parser, so I can&#39;t really comment on its<br>

&gt;       complexity.  I note that 0day has a parser per test (but I haven&#39;t<br>

&gt; checked to<br>

&gt;       see if they re-use common parsers between tests.)  Possibly some<br>

&gt; combination<br>

&gt;       of code-driven and data-driven parsers is best, but I don&#39;t have the<br>

&gt; experience<br>

&gt;       you guys do with your parser.<br>

&gt;<br>

&gt;       If I understood your presentation, you are currently parsing<br>

&gt;       logs for thousands of packages. I thought you said that about half of<br>

&gt; the<br>

&gt;       20,000 packages in a distro have unit tests, and I thought you said<br>

&gt; that<br>

&gt;       your parser was covering about half of those (so, about 5000<br>

&gt; packages currently).<br>

&gt;       And this is with 26 log formats parsed so far.<br>

&gt;<br>

&gt;       I&#39;m guessing that packages have a &quot;long tail&quot; of formats, with them<br>

&gt; getting<br>

&gt;       weirder and weirder the farther out on the tail of formats you get.<br>

&gt;<br>

&gt;       Please correct my numbers if I&#39;m mistaken.<br>

&gt;<br>

&gt;       &gt; Examples of real packages build logs:<br>

&gt;       &gt;<br>

&gt;       &gt;<br>

&gt; <a href="https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x8" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>gcc/6.2.1/2.fc25/data/logs/x8</a><br>

</div></div>&gt; &lt;<a href="https://kojipkgs.fedoraproject.org//packages/gcc/6.2.1/2.fc25/data/logs/x" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>gcc/6.2.1/2.fc25/data/logs/x</a><br>

&gt; 8&gt;<br>

&gt;       &gt; 6_64/build.log<br>

&gt;       &gt;<br>

&gt; <a href="https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/x" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>acl/2.2.52/11.fc24/data/logs/x</a><br>

&gt; &lt;<a href="https://kojipkgs.fedoraproject.org//packages/acl/2.2.52/11.fc24/data/logs/" rel="noreferrer" target="_blank">https://kojipkgs.<wbr>fedoraproject.org//packages/<wbr>acl/2.2.52/11.fc24/data/logs/</a><br>

&gt; x&gt;<br>

<div class="HOEnZb"><div class="h5">&gt;       &gt; 86_64/build.log<br>

&gt;       &gt;<br>

&gt;       &gt; So far that simple (and not well engineered) parser has found 26<br>

&gt;       &gt; “standard” outputs ( and counting ) .<br>

&gt;<br>

&gt;       This is actually remarkable, as Fuego is only handling the formats for the<br>

&gt;       standalone tests we ship with Fuego.  As I stated in the BOF, we have<br>

&gt; two<br>

&gt;       mechanisms, one for functional tests that uses shell, grep and diff,<br>

&gt; and<br>

&gt;       one for benchmark tests that uses a very small python program that<br>

&gt; uses<br>

&gt;       regexes.   So, currently we only have 50 tests covered, but many of<br>

&gt; these<br>

&gt;       parsers use very simple one-line grep regexes.<br>

&gt;<br>

&gt;       Neither of these Fuego log results parser methods supports tracking<br>

&gt; individual<br>

&gt;       subtest results.<br>

&gt;<br>

&gt;       &gt; The script has the flaw that it<br>

&gt;       &gt; does not recognize the names of the tests in order to detect<br>

&gt;       &gt; regressions. Maybe one test was passing in the previous release and<br>

&gt;       &gt; is failing in the new one, and yet the number of failing tests remains<br>

&gt;       &gt; the same.<br>

&gt;<br>

&gt;       This is a concern with the Fuego log parsing as well.<br>

&gt;<br>

&gt;       I would like to modify Fuego&#39;s parser to not just parse out counts, but<br>

&gt; to<br>

&gt;       also convert the results to something where individual sub-tests can<br>

&gt; be<br>

&gt;       tracked over time.  Daniel Sangorrin&#39;s recent work converting the<br>

&gt; output<br>

&gt;       of LTP into excel format might be one way to do this (although I&#39;m<br>

&gt; not<br>

&gt;       that comfortable with using a proprietary format - I would prefer CSV<br>

&gt;       or json, but I think Daniel is going for ease of use first.)<br>

&gt;<br>

&gt;       I need to do some more research, but I&#39;m hoping that there are<br>

&gt; Jenkins<br>

&gt;       plugins (maybe xUnit) that will provide tools to automatically handle<br>

&gt;       visualization of test and sub-test results over time.  If so, I might<br>

&gt;       try converting the Fuego parsers to produce that format.<br>

&gt;<br>

&gt;       &gt; To be honest, before presenting at LPC I was very confident that this<br>

&gt;       &gt; script (or another, much smarter version of it) could be the beginning<br>

&gt;       &gt; of the solution to the problem we have. However, during the discussion<br>

&gt;       &gt; at LPC I understood that this might be a huge effort (not sure if<br>

&gt;       &gt; bigger) in order to solve the nightmare we already have.<br>

&gt;<br>

&gt;       So far, I think you&#39;re solving a slightly different problem than Fuego is, and in one sense are<br>

&gt;       much farther along than Fuego.  I&#39;m hoping we can learn from your<br>

&gt;       experience with this.<br>

&gt;<br>

&gt;       I do think we share the goal of producing a standard, or at least a<br>

&gt; recommendation,<br>

&gt;       for a common test log output format.  This would help the industry<br>

&gt; going forward.<br>

&gt;       Even if individual tests don&#39;t produce the standard format, it will help<br>

&gt; 3rd parties<br>

&gt;       write parsers that conform the test output to the format, as well as<br>

&gt; encourage the<br>

&gt;       development of tools that utilize the format for visualization or<br>

&gt; regression checking.<br>

&gt;<br>

&gt;       Do you feel confident enough to propose a format?  I don&#39;t at the<br>

&gt; moment.<br>

&gt;       I&#39;d like to survey the industry for 1) existing formats produced by tests (which you have good experience<br>

&gt;       with, and which may already be captured well by your Perl script), and 2) existing tools<br>

&gt;       that use common formats as input (e.g. the Jenkins xunit plugin).<br>

&gt; From this I&#39;d like<br>

&gt;       to develop some ideas about the fields that are most commonly<br>

&gt; used, and a good language to<br>

&gt;       express those fields. My preference would be JSON - I&#39;m something<br>

&gt; of an XML naysayer, but<br>

&gt;       I could be talked into YAML.  Under no circumstances do I want to<br>

&gt; invent a new language for<br>

&gt;       this.<br>

&gt;<br>

&gt;       &gt; Tim Bird participated in the BOF and recommended that I send a mail to<br>

&gt;       &gt; the Fuego project team in order to look for more input and ideas about<br>

&gt;       &gt; this topic.<br>

&gt;       &gt;<br>

&gt;       &gt; I really believe in the importance of attacking this problem before we<br>

&gt;       &gt; have a bigger problem.<br>

&gt;       &gt;<br>

&gt;       &gt; All feedback is more than welcome<br>

&gt;<br>

&gt;       Here is how I propose moving forward on this.  I&#39;d like to get a group<br>

&gt; together to study this<br>

&gt;       issue.  I wrote down a list of people at LPC who seem to be working<br>

&gt; on test issues.  I&#39;d like to<br>

&gt;       do the following:<br>

&gt;        1) perform a survey of the areas I mentioned above<br>

&gt;        2) write up a draft spec<br>

&gt;        3) send it around for comments (to which individuals and lists is an open issue)<br>

&gt;        4) discuss it at a future face-to-face meeting (probably at ELC or<br>

&gt; maybe next year&#39;s plumbers)<br>

&gt;        5) publish it as a standard endorsed by the Linux Foundation<br>

&gt;<br>

&gt;       Let me know what you think, and if you&#39;d like to be involved.<br>

&gt;<br>

&gt;       Thanks and regards,<br>

&gt;        -- Tim<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; --<br>

&gt;<br>

&gt; - Guillermo Ponce<br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">- Guillermo Ponce</div></div>

</div>
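<div>For illustration only, here is a minimal Python sketch of the CSV aggregation step described at the top of this message. It is not the actual (unpublished) tooling. The column names (total, passed, failed, skipped, xfailed) and all function names are assumptions, since the thread only shows the shape of one row: a package name followed by five numbers.</div>
<pre>
```python
import csv
import io

# Assumed column names: the thread only shows a package name followed by
# five counts, so the labels below are hypothetical.
HEADER = ["package", "total", "passed", "failed", "skipped", "xfailed"]


def make_row(package, counts_line):
    """Prefix one line of count.pl-style output with the package name."""
    counts = [int(field) for field in counts_line.strip().split(",")]
    if len(counts) != len(HEADER) - 1:
        raise ValueError(f"expected {len(HEADER) - 1} counts, got {len(counts)}")
    return [package] + counts


def write_summary(results, out):
    """Write the header plus one row per (package, counts_line) pair."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    for package, counts_line in results:
        writer.writerow(make_row(package, counts_line))


def pass_rates(csv_text):
    """Read the summary back and compute the pass percentage per package."""
    rates = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        total = int(row["total"])
        rates[row["package"]] = 100.0 * int(row["passed"]) / total if total else 0.0
    return rates


if __name__ == "__main__":
    buffer = io.StringIO()
    # Sample counts are made up for the demonstration.
    write_summary([("acl", "100,80,20,0,0"), ("gcc", "10,10,0,0,0")], buffer)
    print(buffer.getvalue(), end="")
    print(pass_rates(buffer.getvalue()))
```
</pre>
<div>A graphing script could consume the dictionary returned by pass_rates, or read the CSV directly. Keeping the header row in the file is what lets a reader such as csv.DictReader address columns by name rather than by position.</div>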