[Chaoss-members] [Oss-health-metrics] Growth Maturity and Decline Working Group Update

dmg dmg at uvic.ca
Thu Jun 14 22:52:27 UTC 2018


I think these metrics should be defined as time series, where the
period between the observations is a parameter.
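
To make this concrete, here is a minimal Python sketch of "new contributors" as a time series whose period is a parameter. It is only an illustration, not something taken from the pull requests: the sample records and the names ACTIVITIES, bucket and new_contributors_series are made up, and measuring the period in whole months is an assumption.

```python
from collections import Counter
from datetime import date

# Hypothetical activity records: (contributor, kind, day).
ACTIVITIES = [
    ("alice", "commit", date(2018, 1, 5)),
    ("bob", "issue", date(2018, 1, 20)),
    ("alice", "commit", date(2018, 2, 2)),
    ("carol", "commit", date(2018, 3, 9)),
    ("bob", "commit", date(2018, 3, 15)),
]


def bucket(day, months_per_period):
    """Map a date to the index of the period it falls in."""
    return (day.year * 12 + day.month - 1) // months_per_period


def new_contributors_series(activities, months_per_period=1, kind=None):
    """Return one count per period: contributors whose first activity
    (optionally restricted to one kind) falls in that period."""
    first_seen = {}
    for who, what, day in sorted(activities, key=lambda a: a[2]):
        if kind is not None and what != kind:
            continue
        # Keep only the first activity of each contributor.
        first_seen.setdefault(who, day)
    counts = Counter(bucket(d, months_per_period) for d in first_seen.values())
    if not counts:
        return []
    # Emit a value for every period in range, including empty ones.
    return [counts.get(i, 0) for i in range(min(counts), max(counts) + 1)]


print(new_contributors_series(ACTIVITIES))                       # monthly
print(new_contributors_series(ACTIVITIES, kind="commit"))        # monthly, commits only
print(new_contributors_series(ACTIVITIES, months_per_period=3))  # quarterly
```

With this sample data the monthly series is [2, 0, 1], the commits-only series is [1, 0, 2], and the quarterly series is [3]. Summing any of them gives the total number of contributors, which is why the metric says little without a period.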

This is NOT the same as "over a period of time". A time series implies
a list of values, one for each specific moment in time.
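
A rough sketch of the difference, using the formula quoted from PR 12, issues_closed / (issues_opened + issues_backlog). The monthly counts and the function names are invented for illustration, and I am assuming that "backlog" means the issues still open at the start of each period; the PR may define it differently.

```python
# Hypothetical (issues_opened, issues_closed) counts for three months.
MONTHLY = [(10, 5), (0, 0), (4, 8)]


def efficiency(closed, opened, backlog):
    """issues_closed / (issues_opened + issues_backlog); None if undefined."""
    denominator = opened + backlog
    return closed / denominator if denominator else None


def efficiency_over_period(monthly, initial_backlog=0):
    """One single number for the whole span ("over a period of time")."""
    opened = sum(o for o, _ in monthly)
    closed = sum(c for _, c in monthly)
    return efficiency(closed, opened, initial_backlog)


def efficiency_series(monthly, initial_backlog=0):
    """One value per observation period (a time series)."""
    series = []
    backlog = initial_backlog
    for opened, closed in monthly:
        series.append(efficiency(closed, opened, backlog))
        # Assumption: issues left open carry over as the next backlog.
        backlog += opened - closed
    return series


print(efficiency_over_period(MONTHLY))
print(efficiency_series(MONTHLY))
```

The first call collapses three months into one value; the second keeps a value per month, so the idle second month stays visible instead of being averaged away.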

On Thu, Jun 14, 2018 at 3:50 PM, Jesus M. Gonzalez-Barahona
<jgb at bitergia.com> wrote:
> On Thu, 2018-06-14 at 23:59 +0200, Jesus M. Gonzalez-Barahona wrote:
>> On Thu, 2018-06-14 at 13:52 -0700, dmg wrote:
>> > Sean Goggins <s at goggins.com> writes:
>> >
>> > > Hi All:
>> > >
>> > > During our Growth Maturity and Decline Metrics working group
>> > > today we discussed two specific metrics:
>> > >
>> >
>> > with all respect to those who are doing the work, I feel this
>> > method of defining metrics is flawed.
>> >
>> > Take for example Pullrequest 13:
>> >
>> > + [New Overall Contributors](activity-metrics/new-contributors.md)
>> > | What is the overall number of new contributors?
>> >
>> >  +[New Contributors of
>> >  Commits](activity-metrics/new-contributors-commits.md) | What is
>> >  the number of persons contributing with an accepted commit for
>> >  the first time?
>> >  +[New Contributors of Opened
>> >  Issues](activity-metrics/new-contributors-issues-opened.md) |
>> >  What is the number of persons opening an issue for the first
>> >  time?
>> >  +[New Contributors of Closed
>> >  Issues](activity-metrics/new-contributors-issues-closed.md) |
>> >  What is the number of persons closing an issue for the first
>> >  time?
>> >  +[New Contributors of Initiated Code
>> >  Reviews](activity-metrics/new-contributors-code-reviews-opened.md)
>> >  | What is the number of persons initiating a code review for the
>> >  first time?
>> >  +[New Contributors of Reviews for
>> >  Code](activity-metrics/new-contributors-code-reviews.md) | What
>> >  is the number of persons contributing with reviews of code for
>> >  the first time?
>> >  +[New Contributors of Posted
>> >  Messages](activity-metrics/new-contributors-posts.md) | What is
>> >  the number of persons posting messages in mailing lists for the
>> >  first time?
>> >
>> > Based on this definition, I assert that the number of new
>> > contributors to a project is equal to the number of contributors
>> > of that project. Does anybody want to prove me wrong?
>>
>> Daniel, have a look at the pr. The metric is defined for a period of
>> time. Or maybe I'm missing something?
>
> /me kicks /me pretty hard for being so dumb.
>
> You are completely right, Daniel, the pr does not mention anywhere
> that this is for a period of time. I confused it with the pr on
> efficiency, which I was discussing in some detail during our meeting
> today.
>
> I'm so sorry for my confusion.
>
> Please see https://github.com/chaoss/wg-gmd/pull/12/files for how we
> are dealing with period in that other metric about efficiency.
>
> Yes, the detailed definition of the metric (to be written) should
> clearly state that it is defined over a certain period of time. If you
> feel that should be in the name of the metric, which is the only part
> which is written for now, we can discuss it. I see pros and cons to
> having very detailed names for the metrics.
>
> Again, sorry for the noise,
>
>         Jesus.
>
>>       Jesus.
>>
>> > What we need is to think more holistically and think more in terms
>> > of what we are measuring.
>> >
>> > First, "a new contributors" metric is not a _new_ metric. It is a
>> > derived metric: an activity metric that has been filtered to a
>> > particular subset of individuals.
>> >
>> > We need to clearly define what we can measure and what we can
>> > derive from what we can measure.
>> >
>> > here is a proposal:
>> >
>> > Perhaps we should first start with what we can measure. What are
>> > the observable entities? Then, based on these entities, define
>> > "lists" of activities.
>> > Each activity has many attributes: type, who is involved with it,
>> > when it was done, etc. An activity is polymorphic.
>> >
>> > Then we can define metrics in terms of filtering. For instance,
>> > "commits by first contributors" is the result of filtering
>> > activities of type commit such that we only capture the first
>> > commit from each person.
>> >
>> > Now, there is also the issue of 'work' vs 'power'. Work is
>> > absolute (think physics), while power is the average work per
>> > unit of time.
>> >
>> > The metric I defined above is absolute. If I want to compute its
>> > "time related" one I have to define a period: basically, the
>> > "average number of commits by first contributors" over "some unit
>> > of time".
>> > Or I can define it in a more fine-grained way, as a time series,
>> > where I compute the average over a fixed period. Then the result
>> > is a time series.
>> >
>> > For example, I can define the time series of new contributors as:
>> >
>> > monthly new contributors = TimeSeries( count(filter <keep only the
>> > first activity of each contributor> activities)) per month
>> >
>> > monthly new committers = TimeSeries( count(filter <keep only the
>> > first activity of each contributor> filter <commits> activities))
>> > per month
>> >
>> >
>> > Efficiency in PR 12 is flawed too.
>> >
>> > Note that in this context, efficiency (as defined in the PR) is
>> > also an absolute metric:
>> >
>> >    Formula:** 'issues_closed / (issues_opened + issues_backlog)'
>> >
>> > but that is ok, because it can be converted into a time series.
>> >
>> > We can still define it in terms of a filtering of the activities:
>> >
>> > issue resolution efficiency = count(filter <type=issue and
>> > status=closed> activities) / count(filter <type=issue and
>> > status=(not closed)> activities)
>> >
>> > but this rate is only useful when it is converted into a time
>> > series. So with my made-up-notation:
>> >
>> > monthly issue resolution efficiency = TimeSeries(count(filter
>> > <type=issue and status=closed> activities) / count(filter
>> > <type=issue and status=(not closed)> activities)) per month
>> >
>> > I personally don't like the name "efficiency". Its meaning is the
>> > ratio of output to input. This is not what this is measuring. A
>> > project that did not have any new issues and did not close an
>> > outstanding issue would have the same efficiency as in the
>> > previous period, but nothing has been done.
>> >
>> >
>> > --dmg
>> >
>> >
>> > > 1. New Contributors and
>> > > https://github.com/chaoss/wg-gmd/pull/13
>> > > 2. Issue Resolution Efficiency
>> > > https://github.com/chaoss/wg-gmd/pull/12
>> > >
>> > > These two metrics share the characteristic that their expression
>> > > is likely to be parameterized in different ways. You can follow
>> > > the examples and discussion on the associated pull requests,
>> > > noted above.
>> > >
>> > > We encourage participation from community managers during our
>> > > next call, at 11am CDT on June
>> > > 28th. https://unomaha.zoom.us/j/720431288
>> > >
>> > > Whether or not you are able to make the next call, if you are
>> > > interested, please review and comment on the two pull requests
>> > > from Jesus, noted above and here:
>> > >
>> > > https://github.com/chaoss/wg-gmd/pulls
>> > >
>> > > Thanks!
>> > >
>> > > Jesus & Sean
>> > > _______________________________________________
>> > > Oss-health-metrics mailing list
>> > > Oss-health-metrics at lists.linuxfoundation.org
>> > > https://lists.linuxfoundation.org/mailman/listinfo/oss-health-metrics
>> >
>> >
>> > --
>> > Daniel M. German                  "Often a small and simple
>> > question can chisel away at the biggest problems"
>> >                                    Levitt and Dubner
>> > http://turingmachine.org/
>> > http://silvernegative.com/
>> > dmg (at) uvic (dot) ca
>> > replace (at) with @ and (dot) with .
>> > _______________________________________________
>> > Oss-health-metrics mailing list
>> > Oss-health-metrics at lists.linuxfoundation.org
>> > https://lists.linuxfoundation.org/mailman/listinfo/oss-health-metrics
> --
> Bitergia: http://bitergia.com
> /me at Twitter: https://twitter.com/jgbarah
>
> _______________________________________________
> Chaoss-members mailing list
> Chaoss-members at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/chaoss-members



-- 
--dmg

---
Daniel M. German
http://turingmachine.org

