[Fuego] AGL-JTA XML output investigation

Daniel Sangorrin daniel.sangorrin at toshiba.co.jp
Tue Nov 29 08:31:21 UTC 2016


> -----Original Message-----
> From: Bird, Timothy [mailto:Tim.Bird at am.sony.com]
> Sent: Tuesday, November 29, 2016 3:28 PM
> To: Daniel Sangorrin; fuego at lists.linuxfoundation.org
> Cc: dmitry.cherkasov at cogentembedded.com
> Subject: RE: AGL-JTA XML output investigation
> 
> 
> 
> > -----Original Message-----
> > From: Daniel on Thursday, November 24, 2016 4:30 PM
> ...
> > After installing AGL-JTA, I have been investigating how the XML output
> > actually works and its relation to the old parser and flot plotter plugin.
> > I will try to illustrate the steps using dbench as an example.
> >
> > It all starts when Jenkins launches dbench.sh, which in turn sources
> > benchmark.sh. At some point, once the test execution on the target ends,
> > benchmark.sh calls "bench_processing", which does the following:
> >     - Outputs a "RESULT ANALYSIS" message
> >     - Fetches the test execution log:
> >         $ cat /home/jenkins/logs/Benchmark.dbench/testlogs/<board.job>.log
> >             ...
> >             Throughput 138.451 MB/sec 2 procs
> >     - It calls dbench's parser.py which uses a regex to match the lines in
> >       the log that contain the benchmark results; and then calls into the
> >       common parser code (engine/scripts/parser/common.py) which appends
> >       the result to a plot.data file; generates a plot.png file; and decides
> >       if the test passes based on a threshold comparison (using reference.log).
> >         -> /home/jenkins/logs/Benchmark.dbench/plot.data
> >         -> /home/jenkins/tests/common/Benchmark.dbench/reference.log
> >         -> /home/jenkins/logs/Benchmark.dbench/plot.png
> >     - It then calls engine/scripts/parser/dataload.py which transforms the
> >       information from plot.data into two json files:
> >         -> /home/jenkins/logs/Benchmark.dbench/Benchmark.dbench.Throughput.json
> >         -> /home/jenkins/logs/Benchmark.dbench/Benchmark.dbench.info.json
> >       [Note] These json files are supposed to be used by the flot-plotter plugin
> >              but afaik that plugin is not available in AGL-JTA.
> 
> Yes.  When the AGL folks changed their version of Jenkins, this is one
> of the things that broke.  The flot plugin, as far as I can tell, is not very
> complicated, basically consisting of just glue code to wedge it into
> Jenkins.  The real work is all in the flot plotting library itself.
> I made a change to the flot plugin javascript for my latest refactoring
> of tests.info, but I didn't actually spin up a full Jenkins plugin development
> environment - which required Maven, of all things.  Instead I just
> faked it by modifying the javascript and building the 'war' file by hand.

This is one of the reasons I'm trying to upgrade Jenkins with as few plugins
as possible.
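
FWIW, the flot library itself just consumes series of [x, y] points, so I
guess the dataload.py conversion boils down to something like the sketch
below (just a rough illustration, not the actual dataload.py code; I am
assuming here that plot.data contains one "build_number value" pair per
line):

    # Rough illustration only, not the real dataload.py: convert plot.data
    # (assumed to contain one "build_number value" pair per line) into a
    # flot-style series of [x, y] points.
    import json

    series = {"label": "Throughput", "data": []}
    with open("plot.data") as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 2:
                series["data"].append([int(fields[0]), float(fields[1])])

    # flot expects a list of series objects
    with open("Benchmark.dbench.Throughput.json", "w") as f:
        json.dump([series], f, indent=2)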

> >       [Note] If you want the output format to be in json, as opposed to XML,
> >              this could be a good time to do that.
> 
> I very much prefer json to XML.  In my experience (which is admittedly limited)
> with XML, I find that very few people actually do validation on the schema.
> So while it's a neat feature in principle, I haven't seen it used to its potential.
> I think JSON is more human-readable. Maybe that's just because I'm a Python
> junkie, and it reads to me just like python data structures.  Also, I think
> it's simpler.

Sure. 
Another +1 for choosing JSON is that the kernel-ci API uses the following JSON 
schema: https://api.kernelci.org/schema.html
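
To make the comparison concrete, the dbench test_result.xml shown further
down could be expressed in JSON roughly like this (field names are simply
carried over from the XML for illustration; this is not the kernel-ci
schema):

    {
        "name": "Benchmark.dbench",
        "starttime": "2016-11-24 03:00:23",
        "endtime": "2016-11-24 03:00:51",
        "result": "SUCCESS",
        "items": [
            {
                "name": "Throughput",
                "average": 99.48,
                "unit": "MB/s",
                "criterion": "0.00 ~ 100.00",
                "output": 138.451,
                "rate": 1.39,
                "result": "PASS"
            }
        ],
        "test_dir": "/tests/jta.Benchmark.dbench",
        "command_line": "./dbench -t 10 -D /a/jta.Benchmark.dbench -c /a/jta.Benchmark.dbench/client.txt 2"
    }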

> > After that, the "POST BUILD TASK" checks the syslogs and cleans directories
> > on the target as necessary.
> >
> > So far this is very similar to Fuego. The real difference comes with a
> > new "Post-build Action" called "Execute set of scripts", which has been
> > added and contains the following calls to python code:
> >     RET=0
> >     python $JTA_TESTS_PATH/$TEST_CATEGORY/$JOB_NAME/create_xml_dbench.py || RET=$?
> >     python $JTA_ENGINE_PATH/scripts/detailed_results/get_detailed_results.py $JTA_LOGS_PATH/$JOB_NAME/test_result.${BUILD_ID}.${BUILD_NUMBER}.html $JENKINS_HOME/jobs/$JOB_NAME/builds/$BUILD_NUMBER/test_result.xml
> >     exit $RET
> >
> > Each test in AGL-JTA has its own create_xml_<test>.py (btw duplicating lots
> > of code). This script uses the information stored in "build.xml" (start time,
> > end time, result..), "config_default" (avg, min, max values), and the
> > _complete_ execution log; and produces a "test_result.xml" file as a result.
> >     - jta/job_conf/common/Benchmark.dbench/builds/5/build.xml
> >     - jta/engine/tests/common/Benchmark.dbench/config_default
> >     - jta/jobs/Benchmark.dbench/builds/5/log
> >       [IMPORTANT] it parses the messages printed out by parser.py, so there
> >       is a hidden dependency between this and the old parser.
> >     - jta/job_conf/common/Benchmark.dbench/builds/5/test_result.xml
> >
> > This is a test_result.xml example (failure):
> >     <?xml version="1.0" encoding="utf-8"?>
> >     <report>
> >         <name>Benchmark.dbench</name>
> >         <starttime>2016-11-24 02:15:52</starttime>
> >         <endtime>2016-11-24 02:18:22</endtime>
> >         <result>FAILURE</result>
> >         <items>
> >         </items>
> >     </report>
> >
> > And this is another one (success):
> >     <?xml version="1.0" encoding="utf-8"?>
> >     <report>
> >         <name>Benchmark.dbench</name>
> >         <starttime>2016-11-24 03:00:23</starttime>
> >         <endtime>2016-11-24 03:00:51</endtime>
> >         <result>SUCCESS</result>
> >         <items>
> >             <item>
> >                 <name>Throughput</name>
> >                 <average>99.48</average>
> >                 <unit>MB/s</unit>
> >                 <criterion>0.00 ~ 100.00</criterion>
> >                 <output>138.451</output>
> >                 <rate>1.39</rate>
> >                 <result>PASS</result>
> >             </item>
> >         </items>
> >         <test_dir>/tests/jta.Benchmark.dbench</test_dir>
> >         <command_line>./dbench -t 10 -D /a/jta.Benchmark.dbench -c /a/jta.Benchmark.dbench/client.txt 2</command_line>
> >     </report>
> >
> > The script "get_detailed_results.py" takes the "test_result.xml" and a
> > template html ("detailed_results_tpl.html") as inputs, and renders the
> > results in html format.
> 
> Well, that's the kind of thing you're supposed to be able to do easily with XML.
> Can you tell if they are using XML features to do this, or just reading
> the XML in Python and spitting out the values found through some python
> templating mechanism?

They are using a template mechanism (jinja2: http://jinja.pocoo.org/docs/dev/).
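
For reference, the rendering step is essentially the following (a minimal
sketch, not the real get_detailed_results.py; the template variable names
and the simplified output file name are my assumptions):

    # Minimal sketch of the jinja2-based rendering, not the actual
    # get_detailed_results.py code. Template variable names are guesses.
    import xml.etree.ElementTree as ET
    from jinja2 import Template

    report = ET.parse("test_result.xml").getroot()
    items = [{child.tag: child.text for child in item}
             for item in report.findall("./items/item")]

    with open("detailed_results_tpl.html") as f:
        template = Template(f.read())

    html = template.render(name=report.findtext("name"),
                           result=report.findtext("result"),
                           items=items)

    # simplified name; the real file is test_result.${BUILD_ID}.${BUILD_NUMBER}.html
    with open("test_result.html", "w") as f:
        f.write(html)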

> > In the process it also summarizes the pass/fail results.
> >     - engine/scripts/detailed_results/get_detailed_results.py
> >     - jta/engine/scripts/detailed_results/detailed_results_tpl.html
> >     - /userdata/logs/Benchmark.dbench/test_result.5.5.html
> >
> > Finally, the resulting html, as well as the plot.png file, are linked
> > in the "Set build description" section using these lines:
> >     Dbench benchmark<p><b><a href="/userContent/jta.logs/Benchmark.dbench/plot.png">Graph</a></b>
> >     <b><a href="/userContent/jta.logs/${JOB_NAME}/test_result.${BUILD_ID}.${BUILD_NUMBER}.html">Test Result</a></b></p>
> >
> > Other test_result.xml examples:
> >     <?xml version="1.0" encoding="utf-8"?>
> >     <report>
> >         <name>Benchmark.Dhrystone</name>
> >         <starttime>2016-11-24 03:05:30</starttime>
> >         <endtime>2016-11-24 03:05:45</endtime>
> >         <result>SUCCESS</result>
> >         <items>
> >             <item>
> >                 <name>Dhrystone</name>
> >                 <average>5555555.50</average>
> >                 <unit>Dhrystones/s</unit>
> >                 <criterion>0.00 ~ 100.00</criterion>
> >                 <output>909090.9</output>
> >                 <rate>0.16</rate>
> >                 <result>PASS</result>
> >             </item>
> >         </items>
> >         <test_dir>/tests/jta.Benchmark.Dhrystone</test_dir>
> >         <command_line>./dhrystone 10000000</command_line>
> >     </report>
> >
> >     <?xml version="1.0" encoding="utf-8"?>
> >     <report>
> >         <name>Functional.LTP.Syscalls</name>
> >         <starttime>2016-11-24 04:17:40</starttime>
> >         <endtime>2016-11-24 04:51:36</endtime>
> >         <result>FAILURE</result>
> >         <items>
> >             <item>
> >                 <name>abort01</name>
> >                 <result>PASS</result>
> >             </item>
> >             <item>
> >                 <name>accept01</name>
> >                 <result>PASS</result>
> >             </item>
> >             ...OMITTED...
> >             <item>
> >                 <name>futex_wait_bitset01</name>
> >                 <result>PASS</result>
> >             </item>
> >             <item>
> >                 <name>futex_wait_bitset02</name>
> >                 <result>PASS</result>
> >             </item>
> >         </items>
> >         <test_dir>/tmp/jta.LTP/target_bin</test_dir>
> >         <command_line>./runltp -f syscalls -g /tmp/jta.LTP/syscalls.html</command_line>
> >     </report>
> >
> > I hope that was clear and easy to understand. If you have any questions let
> > me know.
> 
> This is very helpful to see what they're doing.  It's useful to see what types of output they
> consider important for their reports.
> 
> I think we could probably find a way to use what they're doing in Fuego.
> I'd really like to add enough features for them to be willing to unfork
> their system from ours.  If that involves moving Fuego their direction, that's
> fine with me.

Totally agree.

Regards,
Daniel




