[Desktop_architects] [cairo] Automated testing of Cairo

Thu Aug 10 06:50:38 PDT 2006

On Thu, 10 Aug 2006 02:41:04 -0700, Bryce Harrington wrote:
>
> I'm happy to be able to announce that OSDL will be supporting Cairo by
> providing automated testing on various platforms.

Hey Bryce,

Thank you so much for doing this. This looks fantastic and should be
extremely valuable. I really appreciate it!

>  * Right now, once a day git snapshots are pulled and 'make check' run.
>    Currently, they're being run on three x86 systems:
>     - Gentoo P4 x86/32
>     - Redhat P4 x86/32
>     - Gentoo Xeon x86/64 (but in 32-bit mode)

Very nice. One thing that would help in the reports is to be able to
determine which machine is which. I see names like nfs08, nfs09, and
nfs11 but can't find any information about what the configuration of
any machine is. Perhaps if each machine name were a link to a page
describing its configuration?

>  * Cairo results can be found here:
>     http://crucible.osdl.org/runs/cairo_branches.html

This looks great. I really like the compact overview it provides. I do
have a few comments though:

1) For a git pull such as c3c7068 it would be nice if the link it
   provided was directly into cairo's gitweb. For example the link
   I would expect is to:

	http://gitweb.freedesktop.org/?p=cairo;a=commit;h=c3c706873ef6a0e1318b1d4b4d4b6841758ea18d

   Currently you're providing a link to a diff from the previous run
   (I think) which is also valuable since I don't think git provides
   an easy way to get at that information.

2) As a nit: The git pull is labelled 1.2.2 but it should be 1.2.3
   since that's what the version advertises itself as now. So maybe
   with this and the above comment what I would like to see is:

	1.2.3-c3c7068 (diff)

   where the "1.2.3-c3c7068" links into gitweb and the "diff" is a
   link to the diff you currently provide.

3) The results all say "OK" now regardless of what failures
   exist. We're going to need to make that say something very
   different than "OK" for failures if this is going to be useful. ;-)

4) The three different machines seem to be getting cairo compiled with
   different backends. The first two are testing image and ps while
   the third seems to also be compiling xlib, but not successfully
   testing it, (which likely means that the X libraries are available
   on that machine but that no X server is available).

   It would be nice to see all the machines testing as many of our
   "supported" backends as possible, (which would be "image, ps, pdf,
   svg, and xlib"). The win32 backend is also supported, but that
   would obviously require a separate machine for testing. I don't
   know if OSDL is interested in hosting such a machine, (but win32 is
   one backend that could benefit a lot from automated testing since
   most of the core cairo hackers don't have ready access to any win32
   systems).

   The pdf and svg backends should be getting compiled already, but
   they're likely not getting tested since poppler and librsvg are not
   available, (Behdad, care to fix the handling of CAIRO_CAN_TEST so
   that the backend still shows up in the tests but as UNTESTED rather
   than not appearing at all?). As for xlib, it should be quite
   reasonable to get a headless X server, (Xfake), running so that
   even xlib could be tested. I can help out with this some.

5) The current PS failures for nfs08 are showing a lot of false
   positives, (the diff images just show a single pixel differing in
   some trivial amount, for example). Interestingly enough, the same
   false positives are not appearing on nfs11. One difficulty here is
   that the test suite is extremely unforgiving, so it may require a
   precise version of ghostscript to reproduce the expected
   results. And we definitely haven't done a good job of documenting
   the precise version needed.

   Similar undocumented version dependencies also exist for poppler
   and librsvg, (though more recent versions are pretty much always
   better---the cairo test suite has exposed bugs in those libraries
   for which we've pushed for fixes upstream). With poppler and
   librsvg though, we're in a better situation than with ghostscript,
   since the final rasterization still goes through cairo. So it
   really is reasonable to expect the PDF and SVG backends to be able
   to pass the tests even with the current, strict, not-even-one-bit-
   can-differ behavior of our test suite. And if someone wanted to
   write a cairo-based backend for ghostscript we could be there too.

   The punchline there is that we should make the effort to get rid of
   all these false positives in the failure reports.

6) The only current failure I see that isn't an obvious false positive
   like those described above is the failure of
   ft-text-vertical-layout which can be seen here:

	http://crucible.osdl.org/runs/1466/test_output/cairo-test/nfs11/test/

   This is a failure that's hitting the image backend, which is the
   worst kind since there's no chance of a false positive being
   introduced by some external conversion tool like with PDF, PS, and
   SVG. Some false positives do hit the image backend because of a
   missing font, (Bitstream Vera is about the only thing we use I
   think), or perhaps a freetype-dependency. But usually a problem
   like that will manifest itself as failures in every text-using
   test. So I'm not sure what might be happening here. Again, it would
   help to know the configuration of this machine.
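
As a minimal sketch of what comments 1 through 4 ask of the report, the
snippet below builds the "1.2.3-c3c7068" label, the gitweb commit link,
and an overall status that only says "OK" when nothing failed. The
function names and the PASS/FAIL/UNTESTED result strings are assumptions
made for illustration; they are not part of Crucible or of the cairo
test suite.

```python
# Hypothetical helpers for one row of the results overview.
# The gitweb URL prefix is taken from the example link in comment 1.
GITWEB_COMMIT = "http://gitweb.freedesktop.org/?p=cairo;a=commit;h="


def report_label(version, commit):
    """Return a label such as '1.2.3-c3c7068' from a version and a full hash."""
    return f"{version}-{commit[:7]}"


def gitweb_url(commit):
    """Return the gitweb commit page for a full commit hash."""
    return GITWEB_COMMIT + commit


def overall_status(results):
    """Summarize a mapping of test name -> 'PASS', 'FAIL' or 'UNTESTED'.

    Any failure makes the row read FAIL; untested entries are counted
    but do not turn an otherwise clean run into a failure.
    """
    failed = sum(1 for r in results.values() if r == "FAIL")
    untested = sum(1 for r in results.values() if r == "UNTESTED")
    if failed:
        return f"FAIL ({failed} failed)"
    if untested:
        return f"OK ({untested} untested)"
    return "OK"
```

The point of the design is simply that the summary cell is derived from
the per-test results rather than hard-coded, so a failing run can never
be displayed as a plain "OK".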

>  * More tests can (and should) be added.  Point me at what you'd like to
>    have run.

I think the easiest way to add more tests is to just keep adding them
to cairo/test. We try to do this for every bug report against cairo,
so the list of available tests should just keep growing.

>  * We also have an amd64 and an itanium2, and we can run the Xeon in
>    64-bit mode, if you wish to do 64-bit testing.

Yes, it would be very helpful to have these. The more variety in
platforms the better! There are a lot of cases where 64-bit-specific
bugs in cairo have lain dormant, (the most recent being the
truetype-subsetting problem), so it would be very helpful to have
automated testing to help us catch these issues earlier.

>  * Developer login access to the SUTs is available on request.
>    I can provide full access + instruction for test developers.

Hook me up and I can start looking into the library version issues
discussed above.

>  * If you have hardware worth adding to the pool for testing Cairo
>    against, we can host it in our test environment here in Beaverton, OR
>    (near Carl and Keith).  We would just need someone identified as the
>    admin for it (esp. if it's a non-Linux box).

Particularly when we get performance tracking added to this setup (see
below) it would be helpful to have some ARM system(s) added to the
mix. Those would be Linux, so admin should be simple, though I could
obviously help.

>  * Crucible gives a lot of flexibility for changing Linux kernels, so if
>    there's any kernel-variation worth doing, we're well set up for
>    automating that.

I hope that for the most part cairo doesn't care about the kernel,
(well, for the xlib backend, things like DRM drivers could obviously
be relevant), but this is certainly good to know.

> With Carl's announcement yesterday that the Cairo team is turning
> attention to performance improvement work, this seems like a great time
> to jump in.  I've been doing a lot of NFS performance work, so am hoping
> some of our analysis tools (e.g. historical performance graphing) may be
> reusable for Cairo without too much trouble.

The idea of getting automated performance testing of cairo on a wide
variety of hardware---with historical tracking---is very compelling to
me. I was hoping someone would step up and offer this, but I wasn't
expecting to see it so soon. This will be great!

Since you've already got some existing tools for tracking, perhaps you
can give me some suggestions on what the report format should look
like. We're starting from scratch, (to some extent), so we can output
pretty much whatever would be most convenient.
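
Purely as a strawman for that discussion, one possible shape is a flat
CSV with one line per measurement. Every field name below (machine,
commit, backend, test, seconds) is a hypothetical choice made for this
sketch, not a format that cairo or the OSDL tools actually use.

```python
import csv
import io

# Hypothetical column set: one row per (machine, commit, backend, test).
FIELDS = ["machine", "commit", "backend", "test", "seconds"]


def format_records(records):
    """Render an iterable of dicts keyed by FIELDS as CSV text with a header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buf.getvalue()
```

A flat, line-oriented format like this is easy to append to after each
run and easy to feed into a graphing tool, but whatever the existing
tracking tools already consume would be the better choice.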

>                                               Also, it sounds like Carl
> is working on performance tests.  Does anyone else have performance (or
> other) tests that could be used, that you'd be able to show me how to
> run?

As you might expect, my "work" consists in large part of collecting
good stuff that others have done before. (David Reveman, Billy Biggs,
and Vladimir Vukicevic have each written a cairo benchmarking suite at
some time in the past, and many others have written or suggested
specific performance tests.) Many of these are scattered throughout the
cairo bugzilla and cairo mailing list. So I'm already in the process
of collecting these and will continue to do so.

I'm very much looking forward to the rest of this. And I'd like to
extend my appreciation to Bryce Harrington and OSDL for dedicating
resources to support cairo development this way. Thanks so much!

-Carl