[Chaoss-software] [Meeting item] Collaboration of projects within the Software TC

Sean Goggins s at goggins.com
Thu Oct 19 15:24:58 UTC 2017


I have a few comments inline. 

> On Oct 19, 2017, at 2:21 AM, Daniel Izquierdo <dizquierdo at bitergia.com> wrote:
> 
> Hi Jesús,
> 
> Thanks for this set of emails. Some comments in line.
> 
> On 18/10/17 00:49, Jesus M. Gonzalez-Barahona wrote:
>> As I commented in the thread proposing our kick-off meeting as the
>> CHAOSS TC, I'm going to start email threads with the proposed topics.
>> Let's see if this works. If it doesn't work, I'll announce a time slot
>> for a synchronous meeting next week.
>> 
>> This is the first item that I proposed to discuss:
>> 
>> * Item:
>> 
>> According to our charter [1], we should "produce integrated, open
>> source software for analyzing software development". So, we should
>> discuss how to start working in this direction.
>> 
>> [1] https://chaoss.community/about/governance/
>> 
>> * Discussion:
>> 
>> We have now three projects in the CHAOSS Software TC: Prospector,
>> GrimoireLab, cregit. During the conversations that led to the launch
>> of CHAOSS, we decided that, at least for a start, the idea was to have
>> GrimoireLab as the "glue" for all the projects, so that they would
>> interoperate, at least to some extent, via GrimoireLab.

I think this conversation took place in a subgroup that was not on the list. It's also possible I missed it, but I think it will help onboard new contributors if we provide some kind of clear road map.

>> 
>> In this regard, Prospector is already integrated, since it was ported
>> to use GrimoireLab/Perceval for data retrieval when it was updated to
>> newer versions of its dependencies.

OK, so from a deployment perspective, do we have two projects or one at this time? 

>> 
>> WRT cregit, I've talked to Daniel German about using a new Perceval
>> backend to extract the information it produces, and then showing it
>> in GrimoireLab dashboards. In fact, I have written a Perceval backend that,
>> improved, could do the trick. But I need to find some time to update
>> and improve it.

It seems like working on Perceval is a priority then? Should we have different “contributing” document sections in the repository for different layers? For example, if we agree that Perceval is our “back end”, then presumably there are also “web service / REST API” contributions and “front end” contributions needed.

Across all contributions, I think we need to be clear about how specific “activity level metrics”, and integrated views/calculations across them, reflect (or in some cases perhaps deviate from) the Metrics committee definitions.

>> 
>> Then, I would like to find ways of including other projects, which
>> could cover areas not already covered. Since GrimoireLab produces
>> comprehensive databases with a lot of data from the original
>> repositories, this should be easy. Any idea in this respect is welcome.

The ghdata project is keenly interested in sharing data providers and ultimately contributing code directly to this stack. Since the project is intended as an exploration ground and not a durable product, code may from time to time migrate into this project if it shows utility. Jesus and I have started working out how that might happen.

> 
> I'd say that we should produce some kind of onboarding guidelines. This typically helps people understand where to start, from several points of view.
> 
> For instance,
> 
> * What do I need to do if I want to integrate a non-supported data source?

Write the code for Perceval to support it?

> 
>  + First, this developer needs to check whether that data source is already supported

Presumably we could create a list and reference it in the “contributing” or “README” files?

> 
>  + Then, the developer should start somewhere: a new Perceval backend? Directly creating a new ElasticSearch index?
> 
>  + How should I define a new ElasticSearch index? Are there guidelines? Recommendations?

Can we treat both like service providers wrapped in something we call “perceval”, which I acknowledge may not actually be what Perceval *is* today?
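
To make the question concrete, here is a rough sketch of what I mean by "service providers" behind one wrapper. Everything in it is hypothetical: the names DataProvider, RawSourceProvider, IndexProvider and ProviderRegistry are made up for illustration and are not Perceval or GrimoireELK APIs, and the "index" is just an in-memory list standing in for ElasticSearch. The registry also doubles as the list of supported data sources mentioned above.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List


class DataProvider(ABC):
    """Hypothetical common interface; NOT part of Perceval today."""

    source: str  # name of the data source, e.g. "git"

    @abstractmethod
    def fetch(self) -> Iterator[dict]:
        """Yield one dict per retrieved item."""


class RawSourceProvider(DataProvider):
    """Stands in for a backend that reads the original repository."""

    def __init__(self, source: str, records: List[dict]):
        self.source = source
        self._records = records

    def fetch(self) -> Iterator[dict]:
        for record in self._records:
            yield {"origin": self.source, "layer": "raw", "data": record}


class IndexProvider(DataProvider):
    """Stands in for reading an already-built index (in-memory here)."""

    def __init__(self, source: str, documents: List[dict]):
        self.source = source
        self._documents = documents

    def fetch(self) -> Iterator[dict]:
        for doc in self._documents:
            yield {"origin": self.source, "layer": "index", "data": doc}


class ProviderRegistry:
    """The wrapper: one place to ask what is supported and to fetch from it."""

    def __init__(self) -> None:
        self._providers: Dict[str, DataProvider] = {}

    def register(self, provider: DataProvider) -> None:
        self._providers[provider.source] = provider

    def is_supported(self, source: str) -> bool:
        return source in self._providers

    def supported_sources(self) -> List[str]:
        return sorted(self._providers)

    def fetch(self, source: str) -> Iterator[dict]:
        if not self.is_supported(source):
            raise KeyError(f"no provider registered for {source!r}")
        return self._providers[source].fetch()


# Example data is invented purely for the sketch.
registry = ProviderRegistry()
registry.register(RawSourceProvider("git", [{"commit": "abc123"}]))
registry.register(IndexProvider("github-issues", [{"issue": 1}]))
```

With something like this, "is my data source supported?" becomes registry.is_supported("mailing-list"), and a contributor adding a new source only has to write one class with a fetch() method, whichever layer it reads from.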

> 
>> 
>> There is also a specific case that maybe we could consider, which is
>> ghData [2]. Since it is being actively used by the Metrics TC, it would
>> be especially interesting to find ways of integrating it with
>> GrimoireLab. Sean and I talked briefly about this in LA, and maybe we
>> can try to continue the discussion.
>> 
>> [2] https://github.com/OSSHealth/ghdata

See this project context document for ghdata. We’re trying to make it clear “what the project is”, as it emerged while we were sorting out how to work together to define metrics, and in the process of forming CHAOSS. The focus is on human-centered design, and the work is, we think, highly transferable into CHAOSS: https://github.com/OSSHealth/ghdata/blob/dev/ghdataContext.md

A number of links related to GrimoireLab have also been shared previously, and are pasted here for convenience:

https://grimoirelab.gitbooks.io/training/content/perceval/git.html

https://grimoirelab.github.io/

https://grimoirelab.gitbooks.io/training/content/cases-chaoss.html

https://grimoirelab.gitbooks.io/training/content/cases-chaoss/activity.html
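
For anyone who has not opened the first link, this is roughly what retrieving commits through Perceval's git backend looks like, as I understand it from the training material: a Git class in perceval.backends.core.git whose fetch() yields one dict per commit, with the commit fields under "data". The two helper functions and the per-author count are my own illustration, not something taken from GrimoireLab.

```python
from collections import Counter
from typing import Iterable, Iterator


def fetch_commits(repo_url: str, local_clone: str) -> Iterator[dict]:
    """Yield commit items for repo_url using Perceval's git backend."""
    # Imported here so the rest of the sketch loads without Perceval installed.
    from perceval.backends.core.git import Git

    # gitpath is where Perceval keeps its local clone of the repository.
    repo = Git(uri=repo_url, gitpath=local_clone)
    return repo.fetch()


def commits_per_author(items: Iterable[dict]) -> Counter:
    """Count commits per author from Perceval-style git items.

    Assumes each item carries the author string under item["data"]["Author"].
    """
    counts: Counter = Counter()
    for item in items:
        counts[item["data"]["Author"]] += 1
    return counts
```

A call would look like commits_per_author(fetch_commits("https://github.com/OSSHealth/ghdata", "/tmp/ghdata.git")), where the local path is only a placeholder. Keeping the counting separate from the retrieval means the same function could later be fed from a different provider.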


> 
> This would be a great example of how to integrate things and may help to start that onboarding guideline.
> 
> Regards,
> Daniel.
> 
>> 
>> As I understand it, currently ghData gets data from GHTorrent and
>> GitHub. Maybe one step to take would be to explore to what extent we
>> could have a Perceval backend to query git, GitHub or other data
>> sources not currently supported. Or interfacing directly with the
>> GrimoireELK database. (For a brief explanation of the role of Perceval
>> and GrimoireELK in GrimoireLab, please have a look at [3] [4] [5].)
>> 
>> [3] https://grimoirelab.gitbooks.io/training/grimoirelab/intro.html
>> [4] https://grimoirelab.gitbooks.io/training/grimoirelab/intro/components.html
>> [5] https://grimoirelab.gitbooks.io/training/grimoirelab/intro/scenarios.html
>> 
>> Any comments on any of this?
>> 
>> Saludos,
>> 
>> 	Jesus.
>> 
> 
> -- 
> Daniel Izquierdo Cortazar, PhD
> Chief Data Officer
> ---------
> "Software Analytics for your peace of mind"
> www.bitergia.com
> @bitergia
> 
Sean P. Goggins
Associate Professor, Computer Science
Director, Data Science and Analytics Masters Program
University of Missouri
http://www.seangoggins.net

Computer Science: http://engineering.missouri.edu/cs/
Data Science & Analytics: http://dsa.missouri.edu
MU Informatics Institute: http://muii.missouri.edu
visit: http://www.sociotech.net
visit: http://osshealth.io (for ghdata OSS Metrics Software) [Sloan Foundation]
visit: http://chaoss.community (for open source health metrics) [Sloan Foundation]
visit: http://mhs.missouri.edu (for mission hydro sci!) [i3 & IES]
visit: http://ocdx.io (for the open collaboration data exchange!) [National Science Foundation]
visit: http://sociallycompute.io (for code like things and Group Informatics) [National Science Foundation]
 
"It may be that openness is a bad choice for communities, but it's a great choice for groups that want to span, not colonize. Span, not colonize. Include, not exclude. Learn from, not teach at."
-- Steve Sawyer with Tony Salvador 

"The most effective way to do it, is to do it."
-- Amelia Earhart
