[lsb-discuss] Ways for us to work more efficiently

Theodore Ts'o tytso at MIT.EDU
Thu Jan 24 12:24:27 PST 2008


Hi all,

Earlier, at the last LSB phone call, I mentioned that I had some ideas
for how we could work more efficiently, without needing as many
conference calls and face-to-face meetings.

The following is taken from:

http://www.linux-foundation.org/en/User:Tytso:How_to_work_more_efficiently

It's a description of some of the techniques which we used when I was
the technical architect for the LTC Real-Time development team, where we
took pre-alpha development patches (that were not in mainline) and made
them into something that was supportable as a GA (albeit 1.0 :-)
product using IBM's standard Help Line processes, in only 9 months.
Yes, it means more formalism, and yes it means people will need to write
more.  But (a) as we add more people to the project, we need a bit more
formalism so that we can all work together efficiently, and (b) it will
allow us to be able to more accurately predict when we will be able to
make a release --- and more importantly know whether we are in danger of
slipping release deadlines unless we cut features, or add more people,
or both.

Obviously, we can't force volunteers on the project to follow these work
processes; but it is something which I plan to ask all LF employees and
contractors to start following, once I take in feedback and make any
necessary tweaks and adjustments to the proposal.  I would ask folks to
try it out, since I think it will allow us to work that much more
effectively and efficiently together.

I've deliberately put this into a wiki so that people who feel more
comfortable discussing this wiki-style can click on the "Discussion"
link at the bottom of the page and add comments.   Just sign them with
"~~~~", wiki style, as you might do on any "talk" page on Wikipedia.

Alternatively, of course, feel free to respond to this e-mail message.

Regards,

						- Ted


= Introduction =

In the past, the Linux Foundation and its predecessor organizations (the
FSG and the OSDL) have operated with relatively informal engineering and
project management processes.  Given the number of engineers, and the
size of the engineering organizations, and the number of people who
needed to work together on a single project, this was probably
appropriate.

However, as we start adding more people to the organization, it is going
to be more and more important that we know what each other is doing, and
who needs to do what next, and if someone needs help, how we can divert
or obtain resources in time so that we get a particular project or
sub-project back on track before it is deeply in the weeds.  Moreover, I
want to do this without increasing the number of coordinating conference
calls.  As a result, the following project management scheme I am
proposing is as lightweight as I can make it, and it is wiki-based.  It
is still relatively informal --- you'll notice no mention of things like
Gantt charts --- but it still should be what we need to achieve these
goals.

Finally, I want to make a statement about our overall objectives of
instituting this additional amount of formalism.  We need to build an
organization where we can make commitments and be confident that we can
keep those commitments that we have made.  Part of this, of course,
means that our projects have the right amount of resources (ultimately
my responsibility) and that the LF has enough funding so we can obtain
those resources (ultimately our Executive Director's job).  Together,
our job is to execute on the commitments which Jim has made on our
behalf to the Linux Foundation Board and ultimately to the Linux
Foundation's Sponsors.  The better we can do this, the easier Jim's job
will be to get the funding we need to execute on future commitments.

= Use of Bugzilla =

== Meaning of priority fields ==

There are a number of P1 bugs which have persisted over 2, 3, or more
LSB releases.  Furthermore, bugs such as
[http://bugs.linuxbase.org/show_bug.cgi?id=1310 BZ #1310: LSB 3.2
complete release] are Priority 2, which seems rather strange.  It
appears that the priority field is not used at all.  We should use the
Priority fields to indicate how important a particular bug is, and in
particular, P1 bugs should indicate bugs which are block-ship, and P4
and P5 bugs should be bugs which we are willing to defer to the next
release.  Using the bug dependency tree is very helpful, but
unfortunately it appears that Bugzilla doesn't have very good reporting
mechanisms involving it.  Filtering bug reports based on "Target
Release" and "Priority" is something Bugzilla *can* do very well, so we
should take advantage of BZ's abilities in these areas.
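
As a rough illustration of the kind of report this makes possible, the
sketch below filters an in-memory list of bug records by target release
and priority.  The Bug record, its field names, and the sample bugs are
all hypothetical; a real report would come from Bugzilla's own query
page rather than from code like this.

```python
from dataclasses import dataclass


# Hypothetical bug record; the field names are illustrative and are not
# taken from the Bugzilla schema.
@dataclass
class Bug:
    bug_id: int
    summary: str
    priority: str        # "P1" (block-ship) through "P5" (deferrable)
    target_release: str  # release in which the fix is intended to land
    is_open: bool


def block_ship(bugs, release):
    """Return the open P1 bugs targeted at the given release."""
    return [b for b in bugs
            if b.is_open and b.priority == "P1"
            and b.target_release == release]


def deferrable(bugs, release):
    """Return the open P4/P5 bugs that could slip past the given release."""
    return [b for b in bugs
            if b.is_open and b.priority in ("P4", "P5")
            and b.target_release == release]


# Made-up sample data, for illustration only.
SAMPLE_BUGS = [
    Bug(1, "spec build fails", "P1", "3.2", True),
    Bug(2, "typo in man page", "P5", "3.2", True),
    Bug(3, "test harness crash", "P1", "3.2", False),
    Bug(4, "missing symbol", "P1", "4.0", True),
]

if __name__ == "__main__":
    for bug in block_ship(SAMPLE_BUGS, "3.2"):
        print(f"block-ship: #{bug.bug_id} {bug.summary}")
```

The point of the sketch is only that both questions ("what blocks this
release?" and "what can we defer?") become one-line filters once the
Priority and Target fields are filled in consistently.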

== Use of Target and Version fields ==

Bugzilla allows the introduction of a "target" or "target milestone"
field which appears to have been suppressed in the LSB bugzilla.  This
field specifies the target version when a bug is intended to be fixed.
The Version field is used in some bugzilla systems to indicate the
target milestone, although formally the definition of the Version field
is the version of the system where the reporter found the bug.  In the
LSB bugzilla, the Version field is used inconsistently; in some cases it
is updated to mean the version when the bug is intended to be fixed, but
in other cases it is left as the version where the reporter found the
problem.

In the LSB bugzilla, bug dependencies are used to track when a release
is ready to ship.  This can be more powerful than the Target
field, but it also makes it much harder to track changes via Bugzilla's
graphing and reporting tools.

== Use Bugzilla Graphing Tools ==

[[Image:Sample-bug-trend-chart.jpg|thumb|right|Sample bug trend chart]]

I would suggest using periodic charts (probably generated monthly) so we
can see how we are doing vis-a-vis open bug reports, and so we can get a
high-level view about whether or not we seem to be closing the necessary
bugs to make a release.

In the chart to the right, the green area indicates fixed bugs, while
the red area indicates bugs that are open.  It would be useful to look
at graphs both for all bugs and for bugs that are considered relevant for
the current targeted release.
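
The data behind such a chart is simple to compute.  The sketch below is
not how Bugzilla generates its charts; it is a minimal stand-alone
illustration that counts open and fixed bugs as of each month end.  The
(opened, closed) date pairs and the dates themselves are made up.

```python
from datetime import date


def trend(bugs, month_ends):
    """For each date, count the bugs open and the bugs fixed as of that date.

    `bugs` is a list of (opened, closed) date pairs; `closed` is None for
    a bug that is still open.  Returns (date, open_count, fixed_count)
    tuples, one per entry in `month_ends`.
    """
    rows = []
    for day in month_ends:
        # Only bugs that had already been filed by this date count at all.
        filed = [b for b in bugs if b[0] <= day]
        fixed = sum(1 for _, closed in filed
                    if closed is not None and closed <= day)
        rows.append((day, len(filed) - fixed, fixed))
    return rows


# Made-up sample history, for illustration only.
SAMPLE = [
    (date(2007, 10, 3), date(2007, 11, 20)),
    (date(2007, 10, 15), None),
    (date(2007, 12, 1), date(2007, 12, 10)),
]
MONTH_ENDS = [date(2007, 10, 31), date(2007, 11, 30), date(2007, 12, 31)]

if __name__ == "__main__":
    for day, open_count, fixed_count in trend(SAMPLE, MONTH_ENDS):
        print(day.isoformat(), "open:", open_count, "fixed:", fixed_count)
```

Running the same function over only the bugs targeted at the current
release gives the second of the two graphs suggested above.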

== More aggressive use of bugzilla log entries ==

One of the things which I have noticed on the LSB bug calls is that a
huge amount of time is spent recalling the status of the various LSB
bugs.  We could probably make the bug calls more efficient if people
were more aggressive about keeping the BZ bug log accurate.  At a previous
project that I was on, there was a team rule that any time anything
interesting was discovered about a bug, it was *always* logged, and if
a particular bug was designated as an engineer's FOCUS bug, it would be
updated at least once a day, or more often as necessary.  This allowed
people to be able to track the progress of a bug without having to rely
on a conference call, which is not necessarily the most efficient way to
do things. (The other purpose of a concall is as a forcing function to
remind people to get things done; but the discipline of updating FOCUS
bugs every day, and assigning various P1 bugs each week to engineers as
their FOCUS bug seems to be an even more efficient way of reminding people
about what needs to be worked on.)

= Status Reports =

The [[LF:Status Reports]] pages are not getting updated very frequently.
In order to change this, I propose that we make the following changes in
how we collect status.  Instead of using a separate status page for each
individual, we create a status page for each department (i.e., one for
the Russian Academy of Sciences, one for the US staff, etc.).  On the
status page, each engineer will be given a specific section, which is
broken into three subsections:

* Current --- what they are currently working on (including links to
  specific bugzilla entries and project pages)
* Done --- what they have completed
* Queued --- work queued up for the next week or so

Ideally, each person should update their status page every day, but
updates every other day are acceptable.  The status page *must* be up to
date at the end of each week.

== Periodic status rollups ==

Every two weeks, everyone's "done" list will be rolled up into a group
status update which will be posted on the wiki and/or sent via e-mail to
a mailing list that LF senior management may subscribe to.  After the
rollup report has been completed, everyone's "done" list will be
truncated to empty so the size of the status page remains manageable.
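
The rollup step is mechanical enough to sketch.  The code below assumes,
purely for illustration, that the status page has already been parsed
into a dictionary keyed by engineer; the dictionary layout and the
roll_up name are hypothetical, and nothing here talks to the wiki.

```python
def roll_up(status):
    """Collect every engineer's "done" items into one report, then clear them.

    `status` maps an engineer's name to a dict with "current", "done",
    and "queued" lists, mirroring the three subsections of the status
    page.  Returns the report as plain text.
    """
    lines = []
    for engineer in sorted(status):
        done = status[engineer]["done"]
        if done:
            lines.append(f"{engineer}:")
            lines.extend(f"  * {item}" for item in done)
        # Truncate in place so the status page stays a manageable size.
        done.clear()
    return "\n".join(lines)
```

Note that "current" and "queued" are left untouched; only the "done"
lists are emptied once they have been reported.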

== Should status pages be private? ==

One interesting question is whether the status reports should be
private.  They are currently located in the LF namespace, which means
they are not visible to anyone except for core members of the workgroup.
Is that really necessary?  There are complications in keeping the status
pages protected, especially as we expand the number of LF contractors
that we would have to give wiki 'sysop' privileges to (although we can
work around this by using another group to control access to the LF
namespace).  But that raises the question of whether or not the status
pages themselves should contain anything confidential.

Certainly if the status pages include work with an individual
distribution or ISV, such private consultations or requests for waivers
should not be public for the world to see.  However, such requests
should be tracked via a ticket system in any case, in which case the
status report could just say, "worked with an ISV/distribution", and
give a reference via a URL or ticket number to where all of the details
could be stored --- and the trouble ticket can be protected as
necessary.

= Project Pages =

In addition to status pages, we will also have a series of project pages.   

== General Philosophy ==

In order to work effectively, developers need to be able to focus on an
issue for an uninterrupted period of time, and for several days. Some
project management theories even suggest 30 days! Our work is too
dynamic for that kind of duration, but we can still benefit from the
concept.  To do so, we will size and prioritize the work on a weekly
basis. Those tasks are to remain as immutable as possible for the
duration of the following week and will be listed above "The Line" on
the project summary page.  Projects or bugs which are below the line
will be deferred for work in future weeks.

== Focus Bugs ==

Bugs which are the focus for the current week will have "[FOCUS]" added
to the beginning of their description.  It will be the responsibility of
the Bug Wrangler to assign high priority bugs to engineers each week,
and as bugs get completed, to designate which bugs should next receive
[FOCUS] attention.

== Projects ==

For issues which are bigger and require more tracking, project pages will be created and linked to the top-level project summary page.

Each project page will generally have the following sections:

* Weekly Executive Summary (only updated for active projects "above the line")
* Vision and Milestones
** Vision -- Why is this project important to the Linux Foundation?
   What business objectives does it help achieve or is it in support of?
   For the LSB, if this is needed to allow some important ISV
   application to certify, mention it here.  What would be the
   consequences if this project is NOT done?
** Milestones -- a rough proposed schedule for this project.  When we
   anticipate that we will reach certain milestones.  Also, if the
   project has an absolutely-must-be-completed-by date, mention it here.

* Resources
** Hardware
** Contacts (list of people and their roles)
* Tasks
** Tasks can be in a number of different states: Active, Pending, and
   Completed.  We will have a separate subsection for tasks in each
   state.
* Comments / Freeform notes
* Archived Weekly Executive Summary

Each project will have an owner, who will be responsible for keeping the
project page up to date.  In most projects, the project owner will also
be the person doing all or most of the engineering work.  In larger
projects with 3 or more people, this will obviously not be true, but
in those cases, it is highly likely that project will be broken up into
one or more subprojects.
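
To keep project pages consistent, the section layout listed above could
be stamped out from a template.  The following is only a sketch: the
project_skeleton function and the "Owner:" line are hypothetical, and
the output is plain MediaWiki text that someone would still paste into a
new page by hand.

```python
# Top-level sections and their subsections, following the layout above.
SECTIONS = [
    ("Weekly Executive Summary", []),
    ("Vision and Milestones", ["Vision", "Milestones"]),
    ("Resources", ["Hardware", "Contacts"]),
    ("Tasks", ["Active", "Pending", "Completed"]),
    ("Comments / Freeform notes", []),
    ("Archived Weekly Executive Summary", []),
]


def project_skeleton(name, owner):
    """Return MediaWiki text for an empty project page."""
    out = [f"= {name} =", "", f"Owner: {owner}", ""]
    for title, subsections in SECTIONS:
        out.append(f"== {title} ==")
        out.append("")
        for sub in subsections:
            out.append(f"=== {sub} ===")
            out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    print(project_skeleton("Example project", "example owner"))
```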

== Project Summary Page ==

The project summary page will be the top-level link to all projects.  It
will be composed of several sections.

=== Active Projects ===

These are the projects which the engineering team is actively working on
this week.  The section will have a table containing the following
fields:

* Name of the project (and link to the project page)
* Project Owner
* Release Target / Outlook --- when the project is expected to be completed and/or a short 1-2 line summary of the project status

The first active project will typically be "FOCUS bugs", which will be a
link to a bugzilla search of all open bugs which have [FOCUS] in their
descriptions.
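
Such a link can be built once and pasted into the summary page.  The
sketch below assumes that the LSB Bugzilla exposes the usual
buglist.cgi search script next to show_bug.cgi, and that it accepts the
short_desc, short_desc_type, and bug_status query parameters; those
names are assumptions and should be checked against the installed
Bugzilla version before relying on them.

```python
from urllib.parse import urlencode

# Assumed location of the search script on the LSB Bugzilla host.
BUGZILLA_SEARCH = "http://bugs.linuxbase.org/buglist.cgi"


def focus_search_url(base=BUGZILLA_SEARCH):
    """Build a URL for a search of open bugs with [FOCUS] in the summary."""
    params = [
        # Assumed parameter names; verify against the local Bugzilla.
        ("short_desc", "[FOCUS]"),
        ("short_desc_type", "substring"),
        # One entry per status that counts as "open".
        ("bug_status", "NEW"),
        ("bug_status", "ASSIGNED"),
        ("bug_status", "REOPENED"),
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(focus_search_url())
```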

=== Waiting Projects ===

These projects are pending on some task or event that is not under the
LF's engineering team's direct control.  The outlook field should
contain a note of who/what the project is blocked on.

=== Non-Focus Projects ===

Non-Focus projects are projects which we do need to complete, but they
are not actively being worked on by anyone on the engineering team at
the moment.  Keeping projects on the "Non-Focus" list can help keep the
list of work we have to do from being overwhelming, in addition to
focusing the attention of the team on the Focus projects so we can make
progress quickly and efficiently.

=== Archived Projects ===

These are projects which are likely obsoleted or postponed.  As such,
they should not be considered for Focus (i.e., Active) status.  Archived
Projects will be periodically garbage collected and moved off to a
separate page.

=== Completed Projects ===

This is where we archive projects which are completed.  The Completed
Projects list will probably be on a separate page.

= Need a getting started web page =

As we try to attract more people to the LSB, we need to have a single
"Getting Started" page which can be used by a newbie to learn about
everything they need in order to fully participate in the LSB working
group.  The top-level LSB Wiki kinda serves that function, but it's
mixed in with a lot of other things.  Also, certain things are missing;
for example, one can find the list of bzr repositories at
http://bzr.linux-foundation.org/lsb/devel, but aside from the one-line
description, there's not much documentation.  If someone wanted to download
the complete sources for distribution testing, how would they do that?
Our tests are broken up into multiple repositories, from
dtk-manager, to tet-harness, to t2c-harness, to qmtest-harness, to
misc-test, and so on.  The same is true on the spec side as well.  We
need a description of how the repositories hang together, and if someone
wants to build the tests from scratch, to know how to do that easily,
and if they want to find a certain set of tests because they want to
include them in their upstream sources, how to do that too.

= Conference Calls = 

In general, conference calls should not be used for anything where
e-mail or wiki updates can be substituted.  For example,
status-reporting conference calls tend to be of use to mostly one
person, the one collecting status. The rest of the people on the call
tend to try and work on other things until they're called upon for
status. This approach is disruptive, and greatly reduces productivity
(not to mention morale).  The task management system outlined above
should hopefully eliminate the need for "status" calls, which are the
least effective type of conference call.

Before each conference call, we should have an Agenda established; each
agenda item should ideally have pointers to any background information
pertinent to the agenda item and what the goal is of bringing up the
item on the conference call.  In general, problems should not be worked
on the conference call, unless it is expected that nearly everyone on
the call is needed to weigh in real-time, and that it can't be worked
more efficiently via e-mail and via irc.  (Hint: most of the time the
latter will be true.)  Also, for each conference call in addition to a
dial-in number, there should also be an associated IRC channel which
everyone should be on and for which the minutes taker can log the
minutes directly into the IRC channel.  (Hopefully we are keeping logs
of our IRC channels, and if not, we can hopefully get that started
soon.)  This will allow the attendees to see that the logs reflect what
they were trying to say, and in some cases, they can assist the minutes
taker by typing their comments into the IRC channel.  The IRC channel
can also provide a useful back channel for quick comments where you
don't necessarily want to interrupt the speaker.

= Continuously Buildable =

The concept here is that the LSB specification, build tools, etc.,
should always be buildable and ready for release.  This means daily
builds and regression test suites, so we can discover problems earlier.
Michael Schultheiss' work is the framework of what we need, but we need
to deploy it on enough machines so we can be automatically running it
every day on all of our architectures and on as many distributions as
possible.  This is going to be a huge test matrix, and it is almost
certain that we do not have enough development machines to support this.
So one of the things we need to do is to figure out how many resources
this will need, so we can start trying to request the necessary hardware
and rack space so we can do this kind of exhaustive testing --- and not
just with the current enterprise versions, but also for the development
"community distribution" versions of the enterprise distros (i.e.,
Fedora, openSUSE, Debian unstable, etc.).
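
A back-of-the-envelope sizing of that matrix might look like the sketch
below.  The architecture and distribution lists are illustrative rather
than a committed test plan, and the hours-per-run figures are made up.
The estimate also assumes that one machine can run several
distributions of its own architecture back to back, and that every
distribution is tested on every architecture, which will not be true in
practice.

```python
from itertools import product

# Illustrative lists only; the real matrix still has to be decided.
ARCHITECTURES = ["IA32", "IA64", "PPC32", "PPC64", "S390", "S390X", "AMD64"]
DISTRIBUTIONS = ["Fedora", "openSUSE", "Debian unstable"]


def build_matrix(architectures, distributions):
    """Return every (architecture, distribution) cell to be tested."""
    return list(product(architectures, distributions))


def machines_needed(architectures, distributions, hours_per_run,
                    hours_per_day=24):
    """Estimate the machines needed to run every cell once per day.

    Assumes a machine only runs tests for its own architecture and
    handles one run at a time.
    """
    runs_per_machine = hours_per_day // hours_per_run
    if runs_per_machine < 1:
        raise ValueError("a single run does not fit in one day")
    # Ceiling division: machines per architecture to cover all distros.
    per_arch = -(-len(distributions) // runs_per_machine)
    return per_arch * len(architectures)


if __name__ == "__main__":
    cells = build_matrix(ARCHITECTURES, DISTRIBUTIONS)
    print(len(cells), "cells in the matrix")
    print(machines_needed(ARCHITECTURES, DISTRIBUTIONS, hours_per_run=12),
          "machines if a full run takes 12 hours")
```

Even this toy version shows how quickly the hardware request grows as
distributions are added or as the test run gets longer.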


