[Fuego] Some stray thoughts on the parser generalization

Thu Jun 28 08:08:59 UTC 2018

I've been thinking about the parser generalization that was discussed at the
Fuego Jamboree.  I have a few ideas to toss out, in no particular order:

1) Every parser.py starts with some boilerplate code that IMHO it would be
good to eliminate.  Specifically, the lines with sys.path.insert... and
import common as plib

Rather than running the parser.py as a standalone program, why don't we structure
it as a plugin module instead?

Currently we invoke the parser with:
run_python $PYTHON_ARGS $FUEGO_CORE/engine/tests/${TESTDIR}/parser.py
(with a whole lot of Fuego-specific environment variables).

I think it would be good to refactor this, so that the Fuego core (functions.sh)
calls a single program, indicating the test, the log file, and a parser name.
Many Fuego parsers could then be replaced by one generic parser, with each test
declaring a single regex pattern in its fuego_test.sh (similar to what is done
with log_compare).

So, something like the following instead:
run_python $PYTHON_ARGS --log=$FUEGO_RW/logs/.../testlog --test=$TESTDIR --parser=TAP13
or
run_python $PYTHON_ARGS --log=$FUEGO_RW/logs/.../testlog --test=$TESTDIR --parser=2part-regex --parser-arg="regex_string= ^TEST-(\d+) (.*)$"
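
To make the idea concrete, here is a rough sketch of what such a single
entry point might look like.  Everything in it is an assumption for
illustration only: the PARSERS registry, the register() decorator, the
function names, and the convention that regex group 1 is the testcase name
and group 2 is the result.  None of this is existing Fuego code, and no
TAP13 parser is shown.

```python
import argparse
import re

# Hypothetical registry: parser name -> function(log_text, parser_args) -> dict
PARSERS = {}


def register(name):
    """Decorator that adds a parser function to the registry under `name`."""
    def wrap(func):
        PARSERS[name] = func
        return func
    return wrap


@register("2part-regex")
def parse_two_part_regex(log_text, parser_args):
    # Assumed convention: group 1 = testcase name, group 2 = result string.
    pattern = re.compile(parser_args["regex_string"], re.MULTILINE)
    return {m.group(1): m.group(2) for m in pattern.finditer(log_text)}


def parse_parser_args(items):
    """Turn ["key=value", ...] into a dict, splitting on the first '=' only."""
    result = {}
    for item in items:
        key, _, value = item.partition("=")
        result[key.strip()] = value.strip()
    return result


def build_arg_parser():
    ap = argparse.ArgumentParser()
    ap.add_argument("--log", required=True)
    ap.add_argument("--test", required=True)
    ap.add_argument("--parser", required=True)
    ap.add_argument("--parser-arg", action="append", default=[])
    return ap


def run(argv, log_text=None):
    """Look up the named parser and apply it to the log.

    `log_text` can be passed directly (handy for testing); otherwise the
    file named by --log is read.
    """
    opts = build_arg_parser().parse_args(argv)
    if log_text is None:
        with open(opts.log) as f:
            log_text = f.read()
    parse = PARSERS[opts.parser]
    return parse(log_text, parse_parser_args(opts.parser_arg))
```

With this shape, a test that only needs a regex would not ship a parser.py
at all, and a test with unusual output could still register its own
function under a new name.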

2) I'm starting to come to the conclusion that the testcase name needs to
be very free-form.  That is, it should be allowed to have spaces and punctuation.
Many tests use a description of the test as the only unique identifier for
the test.  That is, they don't use numbered testcases.  I strongly prefer moving
away from numbers as testcase names, as a number provides very little
human-usable information about the testcase.

I think the run.json can handle arbitrary strings for testcase names, but
I fear that a lot of our parser and ftc code cannot.
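
A small illustration of the concern, as a sketch only.  The dictionary
layout, the "default" prefix, and the testcase names below are made up and
are not the real run.json schema.  The point is just that JSON itself
accepts any string as a key, while code that splits a name on a separator
can break on free-form names.

```python
import json

# JSON places no restriction on key strings, so a descriptive name survives
# a dump/load round trip.  (Hypothetical layout, not the real run.json.)
name = "read: 4k block, O_DIRECT (cold cache)"
doc = {"test_cases": {name: {"status": "PASS"}}}
round_tripped = json.loads(json.dumps(doc))
name_survives = name in round_tripped["test_cases"]

# Code that builds "prefix.testcase" strings and splits them again is where
# trouble starts: the testcase name may itself contain the separator.
tricky = "mount /dev/sda1.ext4"
flat = "default." + tricky
naive_parts = flat.split(".")      # yields 3 pieces instead of 2
safe_parts = flat.split(".", 1)    # split only at the first separator
```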

3) I'm also starting to think that the structured data is a pain to manage,
and it might be better to do most of the work in a flat format.  The charting
code uses a mixture of both structured (nested objects) and flat testcase
names, and I think there's a lot of duplicate code lying around that handles
the conversion back and forth.  That code could probably be coalesced into a
single set of library routines.
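
As a minimal sketch of what such library routines could look like, assuming
nested dicts on the structured side and separator-joined strings on the flat
side.  The function names and the "." separator are hypothetical and are not
taken from the existing charting code.

```python
def flatten(nested, sep=".", prefix=""):
    """Convert nested dicts into a single dict keyed by joined names."""
    flat = {}
    for key, value in nested.items():
        name = prefix + sep + key if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-objects, carrying the joined name along.
            flat.update(flatten(value, sep, name))
        else:
            flat[name] = value
    return flat


def unflatten(flat, sep="."):
    """Rebuild nested dicts from a dict keyed by joined names."""
    nested = {}
    for name, value in flat.items():
        parts = name.split(sep)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested
```

The round trip only holds when no key contains the separator and no
sub-object is empty, so free-form testcase names (point 2) would need either
a separator that cannot appear in a name or some escaping scheme.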

That's it for now.  I'm just dumping my brain - not requesting anyone to
work on anything.

Feedback is welcome.

Regards,
 -- Tim