[lsb-discuss] ISP RAS framework issues/feedback/questions
Banginwar, Rajesh
rajesh.banginwar at intel.com
Mon Jul 31 22:17:41 PDT 2006
Hi,
Thanks for the detailed response. Things are getting clearer
now, and I think you have answered some of the basic questions we have
been asking till now. Here I will attempt to put together some of our
observations and comments on what we have understood so far (instead
of following up on your responses, so nothing inline in this email).
Please pardon me if I ask a question that is already answered in one
of your documents; there was a lot to go through in such a short
timeframe.
1. Are pre/post blocks used for validating the parameters of the
function under test, or of the test case itself? Many times a function
under test can only be called after creating the required scenario for
it. If the pre block is for validating parameters of the test case (not
the function under test), what exactly is its use? Shouldn't validating
parameters for the function under test really be done on the agent
side? In general, the usage of pre/post blocks was not clear from the
docs... Instead of taking a simple example like abs, it may be useful
to illustrate with a slightly more complex set of parameters.
2. I would like to look at this framework from two points of view:
a. Porting existing tests (e.g. ones taken from upstream, and not
necessarily the ones we have in LSB): In this case, my general
observation is that the framework will force a lot of rewriting. If we
try to take tests from upstream, this framework will require a lot of
changes to them. E.g. for the libxml2 tests found in LSB CVS, it took
us about one week to "port" them to TET-Lite. What is the estimate for
converting them to the RAS framework?
b. Developing new tests: For LSB-Core-like libraries, where many
of the interfaces are primitives with very little complexity in their
parameters, this framework seems to be a very good choice. As soon as
the parameters get complex (pointers, structures, and combinations of
those), I think the test developer will spend a lot of time marshalling
and unmarshalling the parameters when sending a test from server to
agent. I am making this comment based on the assumption that the
agent-side test case will always represent one particular "assert",
and that tests require complex data as input (or globals), which I
think is the case for many of the higher-level libraries. There are
many examples in the GTK tests, like the ones in the GtkMenu.inp file.
If this marshalling and unmarshalling really works the way I
understand it, it will add a lot of overhead for libraries similar to
the ones in the current desktop test suite.
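For concreteness, here is a rough sketch of the kind of hand-written
marshalling I expect a test developer would end up writing for even a
small struct parameter. This is entirely my own illustration - the
struct, its field layout, and the function names are hypothetical and
not part of the RAS framework or the OLVER wire protocol:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical struct parameter; layout chosen only for illustration. */
struct menu_item {
    int32_t id;
    uint8_t flags;
};

/* Pack into a byte buffer in a fixed (little-endian) wire order so the
 * server and agent agree regardless of host byte order.
 * Returns the number of bytes written. */
static size_t marshal_menu_item(const struct menu_item *m, uint8_t *buf)
{
    uint32_t id = (uint32_t)m->id;
    buf[0] = (uint8_t)(id & 0xff);
    buf[1] = (uint8_t)((id >> 8) & 0xff);
    buf[2] = (uint8_t)((id >> 16) & 0xff);
    buf[3] = (uint8_t)((id >> 24) & 0xff);
    buf[4] = m->flags;
    return 5;
}

/* Inverse operation on the agent side. */
static size_t unmarshal_menu_item(struct menu_item *m, const uint8_t *buf)
{
    uint32_t id = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8)
                | ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    m->id = (int32_t)id;
    m->flags = buf[4];
    return 5;
}
```

Multiply this by every struct, pointer and nested combination a GTK
interface takes, and that is the per-interface cost I am worried about
- unless the framework generates these mediators automatically?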
3. Do the existing RAS tests cover all 7 LSB-supported architectures?
Are there any limitations to supporting all these archs? I don't see
any, but thought I should ask anyway...
That's all for now,
Thanks,
-Rajesh
>-----Original Message-----
>From: Denis Markovtsev [mailto:markovtsev at linuxtesting.org]
>Sent: Friday, July 28, 2006 8:36 AM
>To: Banginwar, Rajesh; lsb-discuss at lists.freestandards.org
>Subject: RE: [lsb-discuss] ISP RAS framework issues/feedback/questions
>
>Hi,
>
>Thanks for the very concrete questions. They really help us focus our
>documentation on the most important topics. We put information about
>pre/post/coverage and protocol semantics, with examples, into a single
>document which we hope makes the whole picture clear.
>
>We used some pictures and syntax highlighting, and finally decided to
>make it a .pdf for easier reading. The document itself (Comments.pdf)
>finally became a sort of solid narration which should help
>understanding, so I'll refer to specific sections of it in the
>embedded answers below.
>
>We published all training materials here:
>http://www.unitesk.com/download/YET7W6NYFGXE4RW5HAJB7YAD9C/. It
>contains lecture plans, .ppt presentations and a tutorial. The
>tutorial is a set of examples we use in our training to help learn SeC
>step by step. Each example contains a task description, hints and the
>correct answer. So upon completion of all 36 tasks, in addition to
>lectures led by an experienced trainer, we get a well-trained
>specialist in CTesK/UniTESK and markup.
>
>Regards,
>Denis Markovtsev
>linuxtesting.org
>e-mail: markovtsev at linuxtesting.org
>phone: +7(495)912-07-54 ext 4435
>
>PS: I don't know if the mailing list allows attachments. The
>Comments.pdf is also included in the package with the other training
>materials.
>
>>Banginwar, Rajesh wrote:
>
>>Hi,
>> Thanks for the detailed response. Some questions/comments inline
>>below. Again I am combining the response from myself and Brian just
>>to avoid forking this thread...
>>
>>General comment: Waiting till autumn for some of the documents is
>>going to affect any good review. For us to provide good feedback on
>>the framework, and hence make an educated decision on the final RAS
>>proposal, certain documentation is essential. Please see below for
>>specifics. I understand that not everything can be documented right
>>away, but let's try to complete the parts that are essential for any
>>review. Again, Ian tells us that the final proposal needs to be
>>finished well before LinuxWorld, so waiting till then may not be a
>>good/viable option.
>
>>Thanks again,
>
>>-Rajesh
>
>
>>-----Original Message-----
>>From: Denis Markovtsev [mailto:markovtsev at linuxtesting.org]
>>Hi,
>
>
>>>CTesK UserGuide
>>>www.unitesk.com/download/papers/ctesk/CTesK2.1UserGuide.eng.pdf
>>>
>>>CTesK Whitepaper
>>>http://www.unitesk.com/download/papers/ctesk/ctesk_wp.pdf
>>>
>>>Also a good "Getting Started" example is present on how to interpret
>>>one of OLVER tests:
>>>http://linuxtesting.org/downloads/getting-started-math-integer.pdf
>
>>Suggestion for that doc: while it identifies pre/post/coverage
>>blocks, it doesn't really clarify what those blocks actually do -
>>more of a "if you look in figure a, you'll see an example of a pre
>>block" without explaining what that block actually does. Expanding
>>on that would definitely be useful for people trying to jump in and
>>get a high-level view of how a functional specification is actually
>>represented in code. And this is essential for any review before the
>>final proposal can be discussed.
>
>Please see the chapter "What pre/post/coverage means and how it
>translates to C" in the "Comments.pdf" doc attached. We took the "abs"
>function as an example and tried to show different aspects of its
>behavior.
>
>----------------------------------------------------------------------
>
>>* What is inviolable in your standard specification call? All
>>examples use 'CallContext context' (exact var name), related to the
>>pre/post/coverage blocks via macro expansion. It's not documented
>>what must not be changed; the sharp edges need to be labeled.
>>* new objects/typedefs in use need to be documented
>
>>>2. We plan to release the documentation for the common part of the
>>>OLVER test suite this autumn. This documentation will contain:
>>> a) Description of the CallContext and other objects widely used
>>> throughout all OLVER tests.
>>> b) Description of our typedef schema.
>>> c) Description of the communication protocol used between chost
>>> and CTARGET machines.
>
>>I think some of these need to be documented for effective review.
>>E.g.
>>I would really like to see some details around CallContext and the
>>communication protocol.
>
>Please see the chapter "What CallContext means" in the "Comments.pdf"
>doc attached.
>
>----------------------------------------------------------------------
>
>>I would like to see good documentation for "how to develop tests"
>>and not just how to run tests.
>
>>>3. General information on how to write tests can be obtained from
>>>CTesK UserGuide
>>>(www.unitesk.com/download/papers/ctesk/CTesK2.1UserGuide.eng.pdf).
>>> Of course OLVER adds new rules to the pure CTesK process of writing
>>>tests.
>>>We plan to release a separate document with a detailed step-by-step
>>>description on how to write tests for OLVER this autumn.
>
>>Without reviewing this part (instructions to write new tests), it
>>would be extremely difficult or time-consuming to do a good review.
>>So at least some level of documentation will be very useful.
>
>>>At the moment the quickest possible way to understand testing in the
>>>OLVER framework is a 3-4 day training course, which we could
>>>provide at any convenient time at any suitable site. We provide
>>>training courses for the UniTesK testing technology on a regular
>>>basis, and our experience shows that 4 days is enough to start
>>>developing tests with the model-based approach. Materials for
>>>remote education can also be included in our plans if there is
>>>demand. Your team is welcome.
>
>>Does this material exist now? Can you please forward it to us?
>
>I attached all the materials I was able to collect (some people are on
>vacation right now, but I tried to do my best to collect as much as
>possible). This is mostly the .ppt slides of the lectures which are
>part of our training course.
>
>>>a) Please note that all calls to functions under test are placed
>>>inside the agent, which has no additional types/structures defined
>>>(i.e. there is no naming conflict with IntT, String, etc. - all that
>>>stuff is used in the "OLVER test suite" part).
>
>>>b) The agent is really lightweight. The only "external" thing it
>>>depends on is a TCP/IP socket. Its other dependency is the actual
>>>library being tested. It is possible to test embedded/restricted
>>>distributions having only a limited subset of functionality
>>>available.
>
>>A bit curious how exactly the agent is invoked remotely (say,
>>literally a separate machine) - could you elaborate?
>
>Please see the chapter "Details on Agent protocol" in the
>"Comments.pdf" doc attached.
>
>>>c) One test suite may communicate with a number of Agents at the
>>>same time. Each agent is an active Agent thread. "CallContext" is
>>>a structure representing the active process+thread to which a call
>>>is directed. Having multiple processes/threads is very useful for
>>>testing various things such as "pthread_cancel", blocking operations
>>>such as sem_wait(), and so on.
>
>>>d) The agent itself does not introduce any specific signal handlers
>>>- there is no need for them - so it is safe to work with
>>>signal-related functions.
>
>>>e) The test suite can be executed on a machine different from the
>>>tested one. Even the architecture can be different. We introduce IntT
>>>to represent the "int" type of the CTARGET machine, which could
>>>differ in some sense from "int" on the CHOST. The naming convention
>>>is straightforward - each type/structure defined in LSB has its
>>>reflection on CHOST with a simple naming convention. All generic
>>>types are defined and commented in one place, see:
>>>olver_0.5/src/model/data/embed_model.seh
>
>>Specifically, what I was wondering about in that case was complex
>>types: struct definitions from elsewhere that use int (for example),
>>and how olver would handle that. By the sound of it, the actual
>>testing code is essentially split out and cross-compiled (if needed),
>>which would resolve the issue. Assuming that's roughly akin to what
>>occurs? If so... non-issue then.
>
>Ok.
>
>----------------------------------------------------------------------
>
>>* test execution speed is fairly slow, with a large amount of time
>>idling- reason?
>
>>>7. Yes, it is one of the side effects of the distributed model
>>>(OLVER Test Suite - Agent) and extensive tracing. We are going to
>>>provide configuration parameters to regulate the depth of testing
>>>and level of tracing. If it is critical, we'll implement an
>>>alternative way for local communication between the test system and
>>>the test agent.
>
>>Could you clarify this a bit more? The delays being seen via the
>>run_tests script are between each individual scenario, specifically
>>the forced sleeps after spawning/backgrounding each scenario's
>>automated_test; the sleep there looks like it's intended to avoid a
>>race (?), although it doesn't seem to be the main slowdown - locally
>>at least, I've seen 10-20s delays waitpid'ing with nothing actually
>>occurring.
>
>You are right, we had to put a "sleep" into run_tests.sh. I tried to
>mention in "Comments.pdf", in the example scenario for strtok_r_spec,
>that we have a deterministic execution model, so we don't deal with
>race conditions. That specific delay is technical: we had an annoying
>"cannot bind socket" bug just before the previous OLVER release, and
>"sleep" was the easiest workaround we were able to find at that
>moment. The coming release does not have this problem.
>
>----------------------------------------------------------------------
>
>>* schema/dtd for the .utt (xml) generated files? Web pages for
>>viewing results are fine for build bots, but for local
>>certification, the ability to check an error code is a bit more
>>useful (same for submitting test results for bugs).
>
>>>8. There are two descriptions of the trace format: formal - BNF
>>>notation - and informal - natural language. Today the informal
>>>description exists only in Russian. We are going to translate it as
>>>part of our documentation activities and make it available for
>>>download this autumn. We also have a Perl script which converts our
>>>.utt trace to a TET report format, which could be used for
>>>certification purposes.
>>>Does that answer your question? What did you mean by the ability to
>>>check an error code?
>
>>The actual exit code from the test 'runner', which in this case
>>seems to be a custom shell script specifying all tests to run and
>>triggering the actual report generation. When I say 'check an error
>>code', I mean some easy method to know if there were any failures -
>>this would likely include being able to filter out TODO_REQ
>>warnings, since some assertions are borderline impossible to actually
>>trigger/test (ENOMEM from open, for example).
>
>>The reason this is useful is that while the generated reports are
>>useful, the first thing people are going to look for is simply
>>whether or not there were failures; while the test cases *are* mainly
>>for certification, we should also be looking to push the tests to the
>>actual distros themselves for buildbot testing (if the tests exceed
>>what upstream provides from a coverage/depth standpoint, getting
>>upstream to actually fold the tests in is preferable).
>
>>I realize the automated "did it pass or not" angle for buildbots is
>>a bit outside the intention of certification testing, but it is
>>definitely useful for chaining together the execution of multiple
>>separate sets of tests (see the 3.1 desktop tests for an example -
>>the actual tests per component are totally separate, combined at a
>>high level).
>
>Thank you for such a great explanation. Such use cases are really
>useful. We made some corrections which will be included in the coming
>release:
>1. Each separate scenario, and the whole run_tests.sh, will return a
>definite status code.
>2. Test execution creates summary log in short form, like:
>Build passed
>Run passed
>Total : 105
>Passed : 82
>Failed : 23
>3. The reporting tools will use "tjreport" to provide a TET-like form
>of report:
>/olver/ERRORS/ERRORS/tests/ERRORS 1 PASS
>/olver/ERRORS/ERRORS/tests/ERRORS 2 FAIL
>
>
>Test was run: 20060411 time1
>Test Suite Version: unset
>Test Suite Architecture: unset
>Total Tests Passed: 0
>Total Tests Failed (including waived): 2
>Total Tests Failed (excluding waived): 2
>
>Test Result Breakdown:
>PASS: 1
>WARNING: 0
>NOTIMP: 0
>UNAPPROVE: 0
>UNSUPPORTED: 0
>TEST_ERROR: 0
>FIP: 0
>NOTINUSE: 0
>UNRESOLVED: 0
>UNINITIATED: 0
>UNTESTED: 0
>FAIL: 1
>UNKNOWN: 0
>UNREPORTED: 0
>
>----------------------------------------------------------------------
>
>>* What is the jdk used for (building, by the looks of it), and will
>>the java components work with gcj/kaffe?
>
>>>9. All our Java tools are written in pure Java, so theoretically
>>>they should work on any JVM. In practice they do not work with the
>>>current version of the gij virtual machine because of several bugs
>>>in it.
>
>>Severity of bugs?
>
>>What I'm specifically wondering about is whether it's just shallow
>>bugs in their implementation, or deep bugs that will take a good
>>chunk of time to sort out. With the 1.5 license changes, distribution
>>*is* easier (namely no annoying fetch restrictions), but the
>>remaining restrictions may still conflict with the distros' intent of
>>being 'pure foss' - in this case, I am specifically thinking of
>>debian, who do aim for lsb compliance; while I'm not up to date with
>>their view of the 1.5 license revisions, I'd expect they will still
>>only be willing to provide gcj/kaffe in the main repos (someone who
>>follows debian closely, kindly comment on this also).
>
>We see problems running under gij, and it is hard to provide a
>precise list of the problems we have with it. SeC and the trace tools
>use some Apache libs which don't work properly with gij.
>On the other hand, kaffe is acceptable; our tools work properly with
>kaffe. We hope kaffe support alone is enough, since many other libs
>have issues with gij. What do you think?
>
>
>>>We're not going to push our technology instead of any current
>>>developments. Our goal is to complement existing test suites in
>>>places where we see weaknesses and we have strengths which provide
>>>visible benefits.
>
>>The intention behind this review is to find out whether this
>>framework is scalable in terms of handling different kinds of
>>tests. If going forward we use this framework for one class of
>>interfaces and some other framework for another, we are not
>>achieving the goal. We should at least understand where this
>>framework will fall short, so that the decision is based on data.
>
>I'm sorry I was not clear in my answer. Please let me rephrase. The
>framework itself is able to handle all kinds of tests. LSB-Core has a
>whole range of various interfaces which will be completely covered by
>CTesK, as is demonstrated in OLVER. I was trying to say that tests
>like "memParseTest" in XML do not take any benefit from the pre/post
>approach. That test is designed to read a document using the old API,
>write it back, and check that the old document equals the new one. So
>the result depends on the comparison of two data files. We can write a
>scenario which does the same things using SeC, but it would be very
>similar to what is done now in memParseTest, using the agent,
>mediators and so on, while giving no additional benefit (it only adds
>remote execution facilities, unified tracing and test quality
>measurement).
>
>>>One more thing that could be useful separately from model-based
>>>testing is the markup. We have invested much effort into
>>>markup/requirements tracing facilities. The main intention is
>>>visibility: easy to mark up, easy to generate code, easy to see the
>>>results. This is not specific to pre/post, model-driven testing.
>>>This technology and tooling could be reused in existing tests to
>>>provide a requirements coverage tree, document coverage (for
>>>instance, you can see the parts of the text representing
>>>covered/uncovered assertions in different background colors, so that
>>>it is easy to see that something was not covered by the tests), and
>>>so on. We believe this is a good way to identify the parts of the
>>>standard which are really tested and covered. We have shown all
>>>these features to Ian, but not all of them are present in the
>>>current release (many are under active development). We can show
>>>and explain them to you f2f at LinuxWorld in San Francisco
>>>(Aug-14 - Aug-17).
>
>>Related request/question: will the generated C be marked up via
>>comments indicating where each individual bit roughly came from?
>>Right now, the generated C is a bit opaque to dig through (although
>>that's partially due to the mass of symbol definitions).
>
>Please see "Comments.pdf", section "Informal example". It tries to
>explain what is what in the generated C file.