[Fuego] testplans and job inter-dependency

Wed Nov 14 03:18:18 UTC 2018

> -----Original Message-----
> From: Tim.Bird at sony.com <Tim.Bird at sony.com>
[...]
> > Having said that, it wouldn't be as powerful as what you describe below.
> My way is powerful, but it does have tradeoffs.

One of them is that you are not using Jenkins jobs, so our current Jenkins-based visualization tools will not work.

> > > Also, this approach of using Jenkins is declarative in nature, which means
> > > it's difficult to do the dependencies conditionally, or to do complex
> > > sequences where more than one job is launched (maybe in parallel)
> > > when another job completes.
> > >
> > > I haven't looked in detail at Jenkins pipelines, but my limited understanding
> > > is that they essentially allow you to specify the sequence of operations
> > > (ie tests to run) procedurally in a high-level language of your choice.
> >
> > The procedural pipelines that you mention are probably the ones called
> > "scripted style", which can be
> > written in Groovy (maybe in other languages too, I don't know).
> > https://medium.com/@Lenkovits/jenkins-pipelines-and-their-dirty-secrets-1-9e535cd603f4
> >
> > I think that Jenkins has now declarative pipelines where you can also do
> > conditional execution (when, post, etc..):
> > https://jenkins.io/blog/2018/04/09/whats-in-declarative/
> > https://jenkins.io/doc/pipeline/tour/running-multiple-steps/#finishing-up
> 
> Ugh.  My head is spinning with all their declarative syntax.  I think these
> types of things are easy to read but nearly impossible to write without
> constantly referring to a manual.  And sometimes the manual is very
> hard to find or non-existent.  Well, maybe the same would be true
> of a scripted, or procedural style. Either one requires knowledge that
> is very specific to the test framework.
> 
> In terms of automated operations on the testplans themselves (aside from
> running them), declarative style has the benefit that it's regular enough
> for certain operations like automatic scanning.  This can be useful
> for things like customizing the testplans.
> 
> I stumbled across this today:
> https://git.linaro.org/qa/test-definitions.git/tree/plans/test-plan-overlay-example.yaml
> 
> I have always thought the best way to customize a testplan is just to copy it and
> edit the parts you need to change.  But apparently LAVA allows you to
> separate your modifications into a separate file.  I guess which approach would be
> better would depend on how often the 'base' testplans were modified.  I don't
> think they are modified that often, so I'm not sure it's worth the sacrifice
> in writability and expressiveness to go with declarative style.
> 
> Note that LAVA's declarative style is much easier to read than Jenkins', IMHO.
> (Maybe because they have less features.)
> See https://git.linaro.org/qa/test-definitions.git/tree/plans/qcomlt/smoke.yaml
> for a relatively simple example.

LAVA uses an elegant YAML abstraction.
However, I think Fuego wants to focus on usability. Even Fuego beginners should be able to modify an existing testplan easily, without having to read through a lot of documentation.

> > Unfortunately, we can't rely on them if we want to stay independent of
> > Jenkins in the future.
> 
> Indeed, my preference would be to do something outside of Jenkins.
> 
> >
> > > I think I would prefer to see testplans in Fuego converted to this type of
> > > system.
> > >
> > > What if we composed test sequences using a Fuego test, with arbitrary
> > > code in the test_run() function of fuego_test.sh?
> > >
> > > Here's an example:
> > > make a directory:
> > > fuego-core/engine/tests/Testplan.smoketest
> > > with file: fuego_test.sh
> > > that contained:
> > > function test_run {
> > >    ftc run-test -b ${NODE_NAME} -s default -t Benchmark.Dhrystone || true
> > >    ftc run-test -b ${NODE_NAME} -s default -t Benchmark.dbench4 || true
> > >    ftc run-test -b ${NODE_NAME} -s default -t Benchmark.hackbench || true
> > >    ...
> > >    ftc run-test --timeout 15m -b ${NODE_NAME} -s default -t Benchmark.OpenSSL || true
> > >    ...
> > >    ftc run-test -b ${NODE_NAME} -s default -t Functional.hello_world || true
> > > }
> > >
> > > Then we modify ftc and the core scripts to support Testplan as a legal kind
> > > of test (along with "Benchmark" and "Functional") - which is just for
> > > composing other tests.
> > >
> > > I have been thinking of adding an "Action" kind of test, that would be used
> > > for other "lab" operations like deployment of the software under test, or
> > > setup of lab equipment testing, or generating reports, or generating the
> > > binary cache of test program packages.
> > >
> > > There are issues with using "ftc build-job" instead of "ftc run-test", since I
> > > don't think Jenkins will execute nested tests (due to Jenkins starting only
> > > 1 test per executor, and Fuego defining only one executor per node (or board)).
> > >
> > > This strategy allows us to run plans like we run tests, with something like:
> > > ftc run-test -b minnowboard -t Testplan.smoketest
> > >
> > > (Actually, calling them "Testplan" is not really required, but I don't want to
> > > call them either "Functional" or "Benchmark", and I might use the "Testplan"
> > > prefix to avoid invoking other test phases.)
> > >
> > > Note that since the test invocations are procedural, you can do them
> > > conditionally on the outcome of a previous test, or start them in parallel,
> > > or decide whether to stop when a particular one fails.
> >
> > It sounds like a really powerful and flexible method. I like it because it is
> > easier to understand than a declarative abstraction layer.
> > Should we still keep the current testplan files as a way to add jobs to Jenkins
> > in a declarative style?
> Yes.  I try not to get rid of anything that people might be using.
> We might deprecate it, but I don't see a need to remove it in the short term.
> 
> > By the way, are we going to keep relying on Jenkins for board access
> > serialization? I thought you were going to create a board/resource
> > reservation system.
> Well, I have a way to reserve a board and release a reservation.
> 
> See 'ftc reserve-resource' and 'ftc release-resource'

Cool. Shall we use that in the testplans? For example, when two testplans are launched on the same board at the same time, each one would first have to reserve the board.
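
A rough sketch of the idea, not a proposal for the final code. The helpers reserve_board, release_board and run_plan_on_board are hypothetical names; in Fuego the first two would presumably wrap 'ftc reserve-resource' and 'ftc release-resource', but since their exact arguments are not shown in this thread, a lock directory per board stands in for them here. Whether the trap also covers aborts initiated by Jenkins is exactly what would need testing.

```shell
# Hypothetical stand-ins for 'ftc reserve-resource' / 'ftc release-resource'.
# Their real arguments are not shown in this thread, so a lock directory
# per board is used here instead.
LOCK_ROOT="${LOCK_ROOT:-/tmp/fuego-board-locks}"

reserve_board() {
    # mkdir without -p fails if the directory exists, i.e. the board is taken
    mkdir -p "$LOCK_ROOT" && mkdir "$LOCK_ROOT/$1" 2>/dev/null
}

release_board() {
    rmdir "$LOCK_ROOT/$1" 2>/dev/null || true
}

# Usage: run_plan_on_board <board> <command> [args...]
# Returns 2 if the board is busy, otherwise the command's exit status.
run_plan_on_board() {
    plan_board="$1"
    shift
    if ! reserve_board "$plan_board"; then
        echo "board $plan_board is busy" >&2
        return 2
    fi
    # Subshell with an EXIT trap: the board is released whether the
    # plan succeeds or fails.
    (
        trap 'release_board "$plan_board"' EXIT
        "$@"
    )
}
```

With this, a plan would be launched as something like:
run_plan_on_board minnowboard ftc run-test -b minnowboard -t Testplan.smoketest
A second plan started on the same board in the meantime fails fast instead of waiting, so a wait queue would still have to be added on top.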

> 
> A bit more is needed before it's fully functional for board serialization.
> I need to add functionality to wait for a board to be released (essentially
> a wait queue for a board).  We're creeping up on a "test scheduler"
> feature, but I definitely don't want it to get more complicated before
> the 1.4 release.  I don't want to introduce full board serialization
> at the ftc layer before 1.4, because I suspect
> there will be issues with not releasing the lock on test aborts initiated
> by Jenkins.  Basically I want to introduce it when I have time to do some
> testing.

OK, then I guess the testplans functionality should also come after 1.4.

> 
> >
> > > > I put a mention on the commit logs but I wanted to discuss a couple
> > > > of questions:
> > > > - currently in my patches, the next job is triggered no matter
> > > >   if the current job fails or succeeds. That behavior can be
> > > >   easily changed. I want to know if you prefer to have a
> > > >   new interface to switch the behavior, or just hardwire it to
> > > >   a predefined one (e.g. "always continue" or "stop running jobs
> > > >   if one fails").
> > > I would want the behavior to match exactly what we have now,
> > > for the first iteration of the feature.
> > >
> > > >   I can see cases where you want to stop (eg: if the kernel
> > > >   build fails, then the next tests are worthless). On the other
> > > >   hand, I can also see cases where you want to go on (e.g.
> > > >   when the jobs in the testplan do not really depend on each
> > > >   other).
> > > >   If you like the "behavior switch" interface better, then
> > > >   I could add a parameter (parallel=True?) to each test in a testplan
> > > >   to define that behavior and have a default behavior when
> > > >   omitted. Any preferences for the default behavior?
> > > > - the second question is related to the interface used when
> > > >   adding jobs without a testplan. The current interface
> > > >   (--trigger myjob) only allows you to define one job to
> > > >   trigger, but I can add more in the future. The question
> > > >   is: if I do --trigger job1,job2,job3, should all those
> > > >   jobs be triggered in order (one after the other) or not?
> > > >   If we want a "behavior switch" here, the interface could
> > > >   become a bit convoluted. For example:
> > > >    --trigger job1->job2->job3,job4,job5->job6
> > > >   This could be used to define jobs that can run in parallel,
> > > >   and jobs that must run in a specific order.
> > > >
> > > > Your feedback is very welcome.
> > >
> > > Let me know what you think of my counter-idea to convert testplans
> > > into procedurally defined lists of tests to execute, inside a test structure
> > > like the other tests (away from the declarative syntax we have now).
> >
> > The idea is very powerful, I really like it.
> > My only questions are:
> > - Should we completely remove the testplan files, or keep them around as
> > an easy way of adding a set of jobs to Jenkins?
> Keep them.
> 
> > - If we do keep them, how about adding the testplan's name to the job name
> > to avoid collisions between jobs from different testplans?
> I'd rather not.
> I don't want to put off this feature (serializing the testplan jobs using a different
> mechanism than we've got today).  But I'm not sure how to proceed with the least
> impact.  Your changes were pretty small, since they leveraged a Jenkins feature.
> To build up an alternative in Fuego core will take more time and thought.
> 
> Alternative ideas?  Either fuego core or Jenkins has to launch subsequent jobs.
> If it's Jenkins, then we'll (obviously) have to use a Jenkins feature. If it's fuego core,
> we'll likely have to implement some new code, while trying to leverage as much
> existing functionality as we can.

The thing is that we still do not have a better visualization tool than Jenkins, so my approach has the benefit of reusing our current visualization infrastructure.
# When I have time I will try to port Squad support back into Fuego to solve that.

> On a related topic, I want to add a feature to track the batch-id for a run.  That is,
> when tests are executed as part of the same batch, I want them all to have
> the same batch-id in their run.json file.  This is so that we can query the results and
> generate reports with this batch-id.  With the "test plan is a fuego test" design
> I think this is easy:
> function test_run {
>     export FUEGO_BATCH_ID=$(get_next_batch_id)
>     ...
> }
> and a couple of changes in ftc to save this in the run.json file.

I will explore the possibility of exporting FUEGO_BATCH_ID between the trigger job and the triggered job.
If that were possible, we could solve the "collisions" problem with something like this:

if $FUEGO_BATCH_ID == "testplan_lts"
   start the next job in the testplan lts
if ...
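
In shell, the lookup could be sketched roughly as below. This assumes FUEGO_BATCH_ID really does reach the triggered job's environment, which is the part still to be explored. The functions jobs_for_batch and next_job, the job names, and the "testplan_lts" id are all placeholders for illustration.

```shell
# Hypothetical mapping from a batch id to the ordered jobs of its testplan.
# The job names and the "testplan_lts" id are placeholders.
jobs_for_batch() {
    case "$1" in
        testplan_lts) echo "job1 job2 job3" ;;
        *) echo "" ;;
    esac
}

# Print the job that follows $2 within the testplan named by batch id $1.
# Prints nothing when $2 is the last job or is not part of that testplan.
next_job() {
    found=0
    for job in $(jobs_for_batch "$1"); do
        if [ "$found" -eq 1 ]; then
            echo "$job"
            return 0
        fi
        if [ "$job" = "$2" ]; then
            found=1
        fi
    done
    return 0
}
```

A finished job would then call next_job "$FUEGO_BATCH_ID" with its own job name and start whatever comes back, if anything. Because the lookup is keyed on the batch id, the same job can appear in two testplans without the triggers colliding.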

Regards,
Daniel