[CHAOSS] Adding licenses analysis to GrimoireLab (Potential FOSSology integration)

Stefano Zacchiroli zack at upsilon.cc
Tue Sep 18 19:01:09 UTC 2018


On Tue, Sep 18, 2018 at 08:48:07PM +0200, Jesus M. Gonzalez-Barahona wrote:
> WRT the tool for analyzing licenses, I don't have a lot of experience
> with license / copyright analysis tools, but I think at least we could
> use FOSSology, ninka and/or scancode. The main requirement is that they
> either offer a Python3 API, or a relatively simple command interface
> (in that case, the language doesn't matter), in a way that Graal can
> pass them the contents of a file, and they return their analysis on it.
> Of course, the tool should also run well in Linux.

With that data model what will work best are indeed the file-scope
scanners, like ninka, scancode and nomossa from FOSSology.

On the other hand it would be interesting to be able to also run
FOSSology scans on the entire root directory of a Git repo at every
commit. What you gain there is that you exploit one thing FOSSology is
good at, i.e., propagating license information down directories where a
LICENSE file is found, even if there is no license header in individual
files. (Granted, it's speculative, but all license detectors are, one
way or another :-)) Doing this is however more expensive than the above,
as you need to re-process all files at each commit.
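
To make the propagation idea concrete, here is a rough sketch of what I
mean. It is not FOSSology's actual algorithm, only an illustration: the
function name and the three inputs (the file list of one snapshot, the
licenses found in file headers, and the licenses found in per-directory
LICENSE files) are all hypothetical.

```python
from pathlib import PurePosixPath


def propagate_licenses(files, header_licenses, dir_licenses):
    """Assign a license to every file in one snapshot of a repo tree.

    files: repo-relative paths, e.g. "src/a.c"
    header_licenses: path -> license detected in that file's own header
    dir_licenses: directory -> license from a LICENSE file found there
                  (the repo root is the empty string "")
    All three inputs are assumptions made for this sketch.
    """
    result = {}
    for path in files:
        # A header in the file itself always wins.
        if path in header_licenses:
            result[path] = header_licenses[path]
            continue
        # Otherwise inherit from the nearest enclosing directory
        # that carries a LICENSE file, walking up toward the root.
        result[path] = None
        for parent in PurePosixPath(path).parents:
            key = "" if str(parent) == "." else str(parent)
            if key in dir_licenses:
                result[path] = dir_licenses[key]
                break
    return result
```

Note that the loop visits every file of the snapshot, not only the files
touched by the commit, which is where the extra cost comes from.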

Aside from the computation cost, is that data model / processing model
something that Graal (and more generally GrimoireLab for querying /
rendering purposes) support?

Anyway, it's a great idea :-)
-- 
Stefano Zacchiroli . zack at upsilon.cc . upsilon.cc/zack . . o . . . o . o
Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »

