[Fuego] work-in-progress on UOF stuff
Tim.Bird at sony.com
Tue Aug 1 00:45:05 UTC 2017
OK - I've done a bit of work on the UOF stuff. I tried not to wreck stuff too badly,
but there were a few things I wanted to support. I've checked stuff into my
'next' branch, but I think most of it is not actually working at the moment.
Here are the major things I did:
- removed nbench-bytes (a duplicate of nbench_bytes)
- added supporting code for Benchmark.fs_mark (this test was just a stub before)
- added engine/scripts/jdiff command (as a helper tool for developers to examine json files)
- changed 'threshold' to 'value' in the criteria.json files (and common.py)
- added support for backwards compatibility with old tests. This involved:
  - adding a conversion routine that takes a reference.log and builds a criteria data structure from it
  - changing the create_default_ref routine (this is not quite done yet), for when a reference.json file
    is not found
  - renaming common.py:parse to common.py:parse_log, and implementing a parse() that takes
    the same parameters as the previous parse() function
  - renaming common.py:process_data to common.py:process, and implementing a process_data() that
    takes the same parameters as the previous process_data() function
This means that an old (fuego-1.1) test that has a parser.py that calls parse() and process_data(), and
a reference.log file, will run with the system, and the system will produce a correctly formatted run.json
file (with results data embedded).
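To make the conversion step concrete, here is a minimal sketch of a routine that turns reference.log text into a criteria-style structure. The reference.log layout (a "[name|operator]" line followed by a value line), the shape of the resulting dictionary, the sample names, and the function name criteria_from_reference_log are all assumptions for illustration; this is not the actual code in common.py.

```python
import re

# Assumed reference.log layout (not confirmed by this message):
#   [Dhrystone.Score|ge]
#   100
HEADER_RE = re.compile(r"^\[(?P<name>[^|\]]+)\|(?P<op>[a-z]+)\]$")


def criteria_from_reference_log(text):
    """Build a criteria-style dict from reference.log text (hypothetical helper)."""
    criteria = []
    pending = None  # (name, operator) still waiting for its value line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER_RE.match(line)
        if match:
            pending = (match.group("name"), match.group("op"))
        elif pending is not None:
            name, op = pending
            criteria.append({
                "tguid": name,
                # 'value' is the key that replaced 'threshold';
                # the number is kept as a string in this sketch
                "reference": {"value": line, "operator": op},
            })
            pending = None
    return {"criteria": criteria}


# Made-up sample input
example = "[Dhrystone.Score|ge]\n100\n[Dhrystone.Time|le]\n5.5\n"
result = criteria_from_reference_log(example)
```

Note that the names are copied through unchanged here; mapping them to fully-qualified tguids is a separate step.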
New tests (for fuego 1.2 and above) should have
a parser.py that calls parse_log() and process() instead, and provide a criteria.json file and a reference.json file.
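The renaming scheme can be pictured as thin wrappers: the legacy names keep their old parameters and forward to the new entry points. The sketch below is only an illustration of that pattern. The parameter lists, the return values, and SAMPLE_LOG are assumptions, not the real signatures in common.py.

```python
import re

SAMPLE_LOG = "Dhrystone.Score 123\nDhrystone.Time 4.5\n"  # made-up log text


def parse_log(regex_string, log_text=SAMPLE_LOG):
    """Stand-in for the new-style entry point (signature assumed)."""
    return re.findall(regex_string, log_text, re.MULTILINE)


def process(measurements):
    """Stand-in for the new-style entry point (signature assumed).

    Returns 0 when there is at least one measurement, 1 otherwise.
    """
    return 0 if measurements else 1


def parse(regex_string):
    """Legacy name: same parameter an old parser.py would pass (assumed)."""
    return parse_log(regex_string)


def process_data(ref_section_pat, cur_dict, plot_type, label):
    """Legacy name: accepts the old-style arguments (assumed) and forwards.

    The reference pattern and plot arguments are accepted but ignored
    in this sketch.
    """
    return process(dict(cur_dict))


# An old-style parser.py would keep calling the legacy names:
pairs = parse(r"^(\S+) (\S+)$")
status = process_data(r"\[(\w+)\|", dict(pairs), "m", "Score")
```

With this arrangement an old parser.py needs no edits, while a new one calls parse_log() and process() directly.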
Khiem Nguyen has over 6000 tests, and I think we should support those so he can migrate
to Fuego 1.2 with little disruption. I'm sure we've broken something that he relies on, but I'd
like to do our best to minimize the obstacles to Khiem upgrading.
I got things mostly working, and then ran into trouble figuring out how to get the generic parser.py
to fit into this scheme, so I can support (legacy) functional tests as well. I got tripped up by the fact
that generic_parser.py doesn't pass a measure, but a test_case id. The routine common.py:add_result()
distinguishes these different cases, but I got tangled up trying to figure out different use cases in
the code, and had to take a step back to evaluate our tguid system.
I put together a list of all of the tguids in the system (except for LTP, which has its own issues).
I documented them at: http://bird.org/fuego-1.2/Fuego_tguid_list
For backwards compatibility, I had to figure out how the simple names in the reference.log
should be mapped to fully-qualified names. I believe your create_default_ref() function
has some bugs. At least, when I tried to use it with existing functional tests, I got some
python exceptions. I wrote my own replacement for it, but pretty much gummed things
up because I didn't fully understand the mapping from tguid to structured elements.
I may be trying to change too many things at once. I had hoped that when the dust settled,
I would be able to support new Functional tests that provide their own parser.py and
can then record the results for individual sub-tests. This will allow us to use the criteria.json
file to have more flexibility in managing results analysis.
The more I think about it, the more I am dissatisfied with the current approach of
putting pass/fail counts in the spec file (the old JTA way of handling partial success).
This is extremely crude, and doesn't allow for specifying which specific tests should
be ignored. So I'm anxious to support the more flexible pass criteria, and have that
available to Functional tests as well.
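To show the difference between the two approaches, here is a small sketch contrasting a count-based check with a per-test one. The field names max_fail and fail_ok_list, the result format, and the sub-test names are hypothetical and chosen only for illustration; they are not taken from the current criteria.json schema.

```python
def passes_by_count(results, max_fail):
    """Count-based check: only the number of failures matters."""
    failures = [name for name, status in results.items() if status != "PASS"]
    return len(failures) <= max_fail


def passes_by_criteria(results, fail_ok_list):
    """Per-test check: failures are tolerated only for the named sub-tests."""
    failures = {name for name, status in results.items() if status != "PASS"}
    return failures <= set(fail_ok_list)


# Made-up sub-test results for two runs
run_a = {"open01": "PASS", "mmap03": "FAIL", "fork07": "PASS"}
run_b = {"open01": "PASS", "mmap03": "PASS", "fork07": "FAIL"}

# The count-based check cannot tell the two runs apart...
count_a = passes_by_count(run_a, max_fail=1)
count_b = passes_by_count(run_b, max_fail=1)

# ...while the per-test check accepts only the expected failure.
crit_a = passes_by_criteria(run_a, fail_ok_list=["mmap03"])
crit_b = passes_by_criteria(run_b, fail_ok_list=["mmap03"])
```

The count-based check accepts both runs, but the per-test check rejects run_b, because fork07 is not on the list of failures that may be ignored.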
In any event, that's a status report. I think we need to have some hard rules
on how tguids map onto the structures in all the different json files. Specifically,
I want to document how shortened names map to (<test_name>,<test_set>,<test_case>,<measure>).
I've started putting some rules on the tguid page I referenced above. Let me know what you think.
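As one possible shape for such a rule, the sketch below right-aligns the dot-separated parts of a shortened name onto (test_set, test_case, measure) and fills any missing leading fields with a default. This rule, the "default" filler, the function name expand_tguid, and the sample names are assumptions made purely for illustration; the actual rules are the ones to be settled on the tguid page and may differ.

```python
from collections import namedtuple

Tguid = namedtuple("Tguid", ["test_name", "test_set", "test_case", "measure"])


def expand_tguid(test_name, short_name, default="default"):
    """Expand a shortened name into a 4-part tuple (hypothetical rule)."""
    parts = short_name.split(".")
    if len(parts) > 3 or not all(parts):
        raise ValueError("cannot expand %r" % short_name)
    # Right-align the parts; pad missing leading fields with the default
    padded = [default] * (3 - len(parts)) + parts
    return Tguid(test_name, *padded)


# Made-up example names
full = expand_tguid("Benchmark.Dhrystone", "Dhrystone.Score")
short = expand_tguid("Benchmark.Dhrystone", "Score")
```

Whatever rule is chosen, having a single function like this would keep every json file using the same mapping.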
I hope to have the work on backward compatibility for old benchmark and functional tests
done by tomorrow. But I've got to sign off for tonight.
Let me know your feedback on the above.