[Fuego] UOF - more thoughts after the call

Daniel Sangorrin daniel.sangorrin at toshiba.co.jp
Thu Jul 20 00:56:23 UTC 2017


> -----Original Message-----
> From: Bird, Timothy [mailto:Tim.Bird at sony.com]
> Sent: Wednesday, July 19, 2017 1:02 AM
> To: fuego at lists.linuxfoundation.org
> Cc: Daniel Sangorrin
> Subject: UOF - more thoughts after the call
> 
> Daniel,
> 
> Thanks for sending the information about the UOF work so far for the AGL-CIAT/Fuego
> conference call.

BTW: today I managed to get Functional tests working (except LTP).

> I wanted to clarify a few thoughts I had in the call, and continue some discussion points on
> the mailing list.
> 
> Right now we have:
>  reference.json -
>    - this describes the test measurements (name, units, and pass criteria for each measure)
>    - it can also have group pass criteria (fail_count, etc.) at  different levels (test_suite, test_set, test_case?)
>  run.json  - the results from a single run
>   -  this includes test data (time of test, device under test, kernel, host, etc.)
>   - it also has the results of the test
>      - including individual sub-test results
>  results.json - an aggregate of results from multiple runs
>   - currently has test data, as well as results for multiple runs (organized by measure)
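
For concreteness, here is a minimal sketch of the kind of data being discussed
for reference.json. The field names are illustrative only, not the settled schema:

    {
        "test_suite": "Benchmark.bonnie",
        "measures": [
            { "name": "rootfs.Read", "units": "KB/s",
              "criteria": { "operator": "ge", "value": 1000 } }
        ],
        "criteria": { "fail_count": 0 }
    }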
> 
> I think what's throwing me off about reference.json is that it has some invariant data
> (name and units) mixed in with data that may need to be customized (pass criteria).
> Pass criteria consist of an operation and a value for each measure, with '=' and 'PASS'
> as the inferred pass criterion for individual functional test measures.
> 
> A tester may want to alter the pass criteria, for the following types of scenarios:
>  * some tests should be ignored on a 64-bit system, but used on a 32-bit system
>  * some test results may differ based on whether someone uses a Debian or a Yocto-based
>  distribution on their board
>  * benchmark thresholds may need to change based on the location of the file system
>  (rotating media vs. flash-based)
>  * benchmark thresholds may need to change based on the board (high-end board vs. low-end board)
> 
> It seems redundant to have the units repeated in lots of different files, when they are
> invariant for the test.
> 
> For functional tests, we never had pass-criteria explicitly listed on a per-measure basis.
> All we had were: 1) pass/fail counts, and 2) p/n log comparisons.  Our new system will
> actually be much more explicit and flexible, so I very much like the direction we
> are heading.
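
Spelled out, the per-measure criterion that used to be only inferred for a functional
test would become explicit, along these lines (illustrative field names, not the final
schema):

    { "tguid": "Functional.LTP.syscall.creat05", "operator": "eq", "value": "PASS" }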
> 
> A few questions came up in the call:
>  - the format for alternate pass-criteria?
>    - you suggested json diffs, and I suggested an alternate full json file

Full json is ok.

By the way, do we want to be able to set the criteria/specs through a parameterized build (e.g. using ftc)?
[Note] I think it would be useful, but I'm worried about adding lots of "useful" things that will make
Fuego more complicated and unstable for a while. We should probably focus on a rock-solid, stable
Fuego instead of adding too many fancy features for now.

>  - where to store the alternate pass-criteria?
>    - they definitely need to be recorded for a run, so either the diff or the json file
>    should be copied to the log directory, or the criteria should be put into the run.json file.

We can just save the criteria.json file. The run.json doesn't need it.

>    - they also need to be stored per host, so they can be applied with each test
>       - they should be outside the test directory - their lifecycle is not conducive to being
>       in the fuego-core repository
>   - I think we should put them in /fuego-rw/test-conf/<test-name>/<criteria-name>.[json|jdiff]
>    where criteria-name is something like: beaglebone-debian
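
As a concrete instance of that naming pattern, a custom bonnie criteria file for a
Debian-based beaglebone would then live at something like (hypothetical example):

    /fuego-rw/test-conf/Benchmark.bonnie/beaglebone-debian.json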
>
> I have other data I'd like to associate with each measure in the test framework (keyed
> off the test guid).  This includes things like test descriptions, and information about how
> to interpret results.  This data should be invariant across runs (similar to the units); I guess
> in a way units are one element of interpreting the results data.  So it seems like the units
> would be better co-located with this extra data than with the pass-criteria.
> 
> A made-up example of this extra data would be:
> 'Cyclictest.Thread0.Max': 'description': 'This represents the maximum latency observed
> for an interrupt during the test.  If hard real-time is required, the threshold for this
> should be set to 50% less than the longest acceptable latency required by your real-time
> application.'
> 
> Arguably, someone may want to make their own
> custom notes about a test (their own interpretation), and share those with the world.
> 
> For example, this tidbit of knowledge would be good to share:
> 'LTP.syscall.creat05': 'note': 'creat05 takes a very long time (40 minutes) in docker because
> it creates and removes 1 million files, which is slow on docker's stacked filesystem.'
> Or someone may want to write:
> 'LTP.syscall.kill10': 'note': 'kill10 is hanging on various machines, for reasons I haven't been
> able to determine yet.'
> 
> We can't create an economy of sharing pass-criteria, descriptions, interpretation and notes
> until we formalize how they are formatted and shared.  We don't need to formalize
> this other material this release, but it may influence how we want to organize the material
> we *are* working on in this release - namely the schemas for the run, results, reference
> (and pass-criteria - in there somewhere).

Agreed. We need a schema for the criteria.json files, but it should just be a subschema
of fuego-schema.json, since I already put the criteria schema in it.
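
For illustration, a criteria.json conforming to that subschema might carry both
per-measure and group criteria, something like this (the field names here are a sketch,
not quoted from fuego-schema.json):

    {
        "tguid": "Benchmark.bonnie",
        "criteria": [
            { "tguid": "bonnie.rootfs.Read",
              "reference": { "operator": "ge", "value": 500 } },
            { "tguid": "bonnie.rootfs",
              "fail_count": 0 }
        ]
    }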
 
> What do you think of /fuego-rw/test-conf ?  I'm open to other names for this. I think
> it does need to be in the rw directory, because users (testers) are expected to customize it.

Why not in /fuego-ro?
Testers can still customize it from the host, and Jenkins cannot overwrite or delete the files by accident.

> Anything that needs to be written to from inside the docker container needs to
> be in the rw directory.

But I was assuming that the user-provided data would only be read, not written, by
Jenkins. Could you elaborate a bit on that?

> I can imagine creating new pass-criteria with something like this:
> (in the future, not for this release)
> 
>  * ftc set-thresholds --tguid bonnie.*.Read --run beagle-bone.run.5  --sigma 5% -o my-fs-criteria
> This would take the values for bonnie measurements ending in 'Read', from the data in run 5 for
> the beaglebone board, adjust them by 5% (either up or down, depending on the operation), and
> store them as new pass-criteria in a file called 'my-fs-criteria'.

This looks a lot like the parameterized builds used for modifying the parameters in a spec.
But I was thinking of something like ftc run --thresholds ... --criteria ... Then, the "my-fs-criteria" file
would be stored directly in the log directory for that run.
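
Either way, to make the 5% adjustment in the set-thresholds example concrete with
made-up numbers: if run 5 recorded bonnie.rootfs.Read = 2000 KB/s, and the operation
for Read is 'ge' (higher is better), then adjusting down by 5% (2000 * 0.95 = 1900)
would store something like (hypothetical tguid and field names):

    { "tguid": "bonnie.rootfs.Read", "operator": "ge", "value": 1900 }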

>  * ftc set-ignore --tguid LTP.syscall.kill10 --description "I don't care about this right now because, blah, blah" \
> -o my-ltp-ignore-list
> 
> Then using them something like this:
>  * ftc run-test -b beagle-bone -t Benchmark.bonnie --pass-criteria my-fs-criteria
>  * ftc run-test -b beagle-bone -t Functional.LTP --pass-criteria my-ltp-ignore-list
> 
> Or, in a spec:
>  {
>     "specs": {
>            "default": {
>                    "VAR1": "-f foo"
>            },
>            "alternate": {
>                    "VAR1": "-f bar",
>                    "pass-criteria": "my-criteria"
>            }
>     }
>  }
>    * ftc run-test -b beagle-bone -t Functional.some_test --spec "alternate"
> 
> I have more thoughts on other topics, that I'll put in other e-mails.  Otherwise the discussion
> threading gets too hard to follow.

Thanks,
Daniel



