[Fuego] [PATCH] dbench: adjust test materials for dbench version

Tim.Bird at sony.com Tim.Bird at sony.com
Mon Apr 9 17:25:17 UTC 2018


> -----Original Message-----
> From: Daniel Sangorrin 
> Hi Tim,
> 
> > -----Original Message-----
> > From: Tim.Bird at sony.com [mailto:Tim.Bird at sony.com]
> > Sent: Thursday, April 5, 2018 7:43 AM
> > To: daniel.sangorrin at toshiba.co.jp; fuego at lists.linuxfoundation.org
> > Subject: [PATCH] dbench: adjust test materials for dbench version
> >
> > Daniel,
> >
> > I had a host of issues with the dbench upgrade.  I used the patch below
> > to adjust some of the test materials for the different dbench tests.
> > Let me know if you have feedback on this.
> 
> Thanks for the review and the fixes as well!
> 
> > In hindsight, maybe splitting the tests wasn't such a great idea.
> > It may have been better to write a parser that could recognize
> > and handle both versions (3 and 4) of the test output format.
> > Although it appears that most systems will be using dbench
> > version 4.0 or above, it's not impossible that Benchmark.dbench4
> > could discover dbench version 3.x on a system, and try to use
> > it, which would result in errors.
> 
> Yes, that's true.
> Another option would be to check the version and ask the user to
> run Benchmark.dbench3 instead of Benchmark.dbench4. Which one
> do you prefer?
I prefer checking the version and telling the user to run Benchmark.dbench3.

If we get serious about running tests that already exist on the target board,
an example of version-checking code could come in handy, so this would be
nice to have.
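
Roughly, something like this in fuego_test.sh (an untested sketch; it assumes
the usual cmd/abort_job helpers, and that dbench reports its version number
in its banner or --version output, which would need confirming for each
release):

function test_pre_check {
    # try to extract a version number from whatever dbench prints
    local ver=$(cmd "dbench --version 2>&1 || dbench 2>&1" | \
        grep -o '[0-9]\+\.[0-9]\+' | head -n 1)
    case "$ver" in
        3.*) abort_job "dbench $ver found on target; please run Benchmark.dbench3 instead" ;;
        *)   ;;  # 4.x (or undetected) - proceed with Benchmark.dbench4
    esac
}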

> 
> > Also, I find that using the test name to prefix the test variables
> > (from the spec file) is problematic in this case.  Renaming a test
> > requires changing all the variable names.  Maybe we should use
> > a generic prefix, like TVAR_ or something.
> 
> Good idea. The variables would be shorter as well.

It's too big a change for the 1.3 release, but I'll make a note and
we can consider it for 1.4.
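
Just to capture the idea for the notes (the variable names below are made up
for illustration; the exact prefix Fuego derives from the test name may
differ):

# current scheme: spec variables carry the test name, so renaming
# Benchmark.dbench4 means touching every reference in fuego_test.sh:
NPROCS=${BENCHMARK_DBENCH4_NPROCS:-4}
# generic-prefix scheme: the same reference survives a test rename:
NPROCS=${TVAR_NPROCS:-4}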

> 
> > Also, I wasn't sure whether I should change the testcase name
> > (in the parser and resulting json file) to match the test name (dbench4).
> > I ended up doing this for dbench3, but left dbench4 alone, as I expect
> > it will become our de facto 'dbench' test in the future.
> 
> Hmm, I am worried a new version will come up and we will have to change
> many things.
For now, I think we should just keep an eye on it.  Tests don't change
very often.  We'll see how often this situation comes up and determine
how we ought to deal with test version changes in general.  So far, I believe
this is the first parser that has broken for us in Fuego, on a test version change.
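
For what it's worth, if we ever decide one parser should tolerate both
formats, the throughput line itself looks similar enough in 3.x and 4.x that
a fairly loose match might cover both.  An untested sketch (the real change
would go in the test's parser.py, and the exact log lines should be checked
against real output):

# both versions print a line like "Throughput <number> MB/sec ..."
# (what follows the number differs); pull out just the number:
grep -o 'Throughput[[:space:]]*[0-9.]*' dbench.log | awk '{print $2}'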

> 
> >
> > And I found that the name in the spec file is only used for generating
> > the test variable names.  I tried changing it in dbench4, but then changed
> > my mind, and left it alone.
> >
> > Something is way different between versions 3 and 4 in reporting
> > throughput on my systems.  I'm getting pretty significant differences
> > in results on most of my boards (the only one with 4.x results
> > consistent with 3.x is on my Renesas arm64 board).
> >
> > Here is a table of results:
> > test       spec      board     tguid                result
> > ------------------------------------------------------------
> > dbench3    default   bbb        <<< wouldn't compile >>>
> > dbench3    default   bbb       dbench3.Throughput   11.5173
> > dbench4    default   bbb       dbench.Throughput    0.0282559
> > dbench4    roota     bbb       dbench.Throughput    0.116864
> >
> > dbench     testdir   docker    dbench.Throughput    1741.26
> > dbench     testdir   docker    dbench.Throughput    1831.78
> > dbench3    default   docker    dbench3.Throughput   1169.77
> > dbench4    default   docker    dbench.Throughput    10.394
> > dbench4    roota     docker    dbench.Throughput    10.959
> >
> > dbench     default   min1      dbench.Throughput    254.03
> > dbench3    default   min1      dbench3.Throughput   253.83
> > dbench3    default   min1      dbench3.Throughput   251.732
> > dbench4    default   min1      dbench.Throughput    32.1764
> > dbench4    default   min1      dbench.Throughput    33.0374
> > dbench4    roota     min1      dbench.Throughput    34.0026
> > dbench4    roota     min1      dbench.Throughput    21.3904
> >
> > dbench     default   ren1      dbench.Throughput    8.41553
> > dbench     default   ren1      dbench.Throughput    8.62088
> > dbench3    default   ren1      dbench3.Throughput   8.43887
> > dbench3    default   ren1      dbench3.Throughput   8.45205
> > dbench4    default   ren1      dbench.Throughput    7.53374
> > dbench4    roota     ren1      dbench.Throughput    8.03552
> > dbench4    roota     ren1      dbench.Throughput    7.72424
> >
> > dbench     default   rpi3-1    dbench.Throughput    136.099
> > dbench     default   rpi3-1    dbench.Throughput    132.869
> > dbench3    default   rpi3-1    dbench3.Throughput   159.953
> > dbench4    default   rpi3-1    dbench.Throughput    39.9088
> > dbench4    roota     rpi3-1    dbench.Throughput    50.6633
> >
> > Note that the 'roota' spec tries to mimic the settings used by
> > the old Benchmark.dbench test.
> >
> > I'm not sure which version of the test to believe for my
> > throughput.  But something is different enough to skew the
> > results by a large amount - in the case of docker by 100x.
> >
> > Have you seen differences in the reported throughput?
> 
> Yes, they are completely different on my boards too. I am not sure why;
> probably the dbench developers know better.

Do you plan to ask them, or should I?

I'd like to know which is the "real" number.
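
One way to narrow it down might be to run both versions by hand on the same
board with matched settings and compare the reported lines directly.  A rough
sketch (the binary names, option letters, and loadfile path are assumptions
that would need checking against each version):

# run 3.x and 4.x back to back with the same directory, duration and
# process count, then compare the "Throughput ... MB/sec" lines
DIR=/home/fuego/dbench-cmp ; NPROCS=4
mkdir -p $DIR
./dbench-3.04 -D $DIR -t 60 $NPROCS | tee dbench3.log
./dbench-4.00 -D $DIR -t 60 -c ./client.txt $NPROCS | tee dbench4.log
grep Throughput dbench3.log dbench4.log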

Thanks,
 -- Tim


