[Linux-kernel-mentees] checkpatch.pl investigation: NO_AUTHOR_SIGN_OFF issues

Dwaipayan Ray dwaipayanray1 at gmail.com
Fri Sep 25 07:29:31 UTC 2020


> Let us try to put some systematic structure into this handling of the
> NO_AUTHOR_SIGN_OFF.
>
> There are basically two aspects to consider:
>
> - which classes are potential mismatches can we potentially observe?
>
> - which level of severity do we assign to each class of mismatch?
>
> To the first point, you started with some initial classes above.
>
>   0) same name, same address (no mismatch)
>   1) same name, different address
>      subclasses: (how 'different' address?)
>      - email address differ by valid difference, e.g., extensions that
>        will go to the same inbox, e.g.,  the xyz+fds extension in mail
>        addresses. (that is your class 3, right?)
>      - two email addresses that are known to belong to the same individual
>        - known because of .mailmap
>        - known otherwise?
>   2) different name, same address
>      subclasses: (how 'different' name?)
>      - examples:
>        - Firstname, Lastname (but middle-name initials differ)
>        - special "regional" characters, like ü vs. ue, etc.
>        - firstname and lastname are reverted etc.
>   3) different name, different address (but still we believe it "matches")
>      combinations of subclasses of 1) and 2), which we want to consider.
>
> Then, to the second question:
>
> So, these different classes we can think of and "assign" different
> severities that checkpatch.pl can use.
>
> Checkpatch.pl already has four levels:
>
> three severity reporting levels: ERROR, WARN, CHECK, and
> the level _do not report_ (which is implemented by just ignoring some kind
> of difference).
>
> I think a lot of discussion will be around what severity to assign to
> which class above, and in some way, long-term maintainers have a larger
> say than we do here.
>
> So, let us first modify checkpatch.pl to identify the various classes
> above, maybe just starting very basic, with different name, same address
> and same name, different address and run it over the commit range and see
> which examples show up and how often.
>
> For now, we first just use checkpatch.pl to identify the different types
> and refine them into subclasses; then, we can discuss with severity should
> be assigned to which type of mismatch.
>
> The second question should not invalidate our data collection and
> identification of subclasses, though.
>
> Does that help? What do you think?
>
>
> Lukas

Yes, that should help I think. Currently checkpatch is very vague
about Author sign offs, so subclassing it will really be helpful.
But at the same point I don't think it should become
too complex either.

I have already prepared a patch for the basic two classes:
1) same name, different address
2) same address, different name

It would be great to have your feedback on this.

---
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9e65d21456f1..e23e753fcaf0 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1210,14 +1210,22 @@ sub reformat_email {
  return format_email($email_name, $email_address);
 }

-sub same_email_addresses {
+sub same_mail_check {
  my ($email1, $email2) = @_;

  my ($email1_name, $name1_comment, $email1_address, $comment1) =
parse_email($email1);
  my ($email2_name, $name2_comment, $email2_address, $comment2) =
parse_email($email2);

- return $email1_name eq $email2_name &&
-        $email1_address eq $email2_address;
+ my $same_name = $email1_name eq $email2_name;
+ my $same_address= $email1_address eq $email2_address;
+
+ return ($same_name, $same_address);
+}
+
+sub same_email_addresses {
+ my ($same_name, $same_address) = same_mail_check(@_);
+
+ return $same_name && $same_address;
 }

 sub which {
@@ -2347,6 +2355,7 @@ sub process {
  my $signoff = 0;
  my $author = '';
  my $authorsignoff = 0;
+ my $authorsignerr = '';
  my $is_patch = 0;
  my $is_binding_patch = -1;
  my $in_header_lines = $file ? 0 : 1;
@@ -2677,6 +2686,13 @@ sub process {
  if ($author ne '') {
  if (same_email_addresses($1, $author)) {
  $authorsignoff = 1;
+ } else {
+ my ($same_name, $same_address) = same_mail_check($1, $author);
+ if($same_name) {
+ $authorsignerr = "ADDRESS_MISMATCH:${1}";
+ } elsif ($same_address) {
+ $authorsignerr = "NAME_MISMATCH:${1}";
+ }
  }
  }
  }
@@ -6891,8 +6907,16 @@ sub process {
  ERROR("MISSING_SIGN_OFF",
        "Missing Signed-off-by: line(s)\n");
  } elsif (!$authorsignoff) {
- WARN("NO_AUTHOR_SIGN_OFF",
-      "Missing Signed-off-by: line by nominal patch author '$author'\n");
+ if($authorsignerr =~ /^NAME_MISMATCH:(.*)/) {
+ WARN("NO_AUTHOR_SIGN_OFF",
+      "Author name mismatch: 'From: $author' != 'Signed-off-by: $1'\n");
+ } elsif ($authorsignerr =~ /^ADDRESS_MISMATCH:(.*)/) {
+ WARN("NO_AUTHOR_SIGN_OFF",
+      "Author email address mismatch: 'From: $author' !=
'Signed-off-by: $1'\n");
+ } else {
+ WARN("NO_AUTHOR_SIGN_OFF",
+ "Missing Signed-off-by: line by nominal patch author '$author'\n");
+ }
  }
  }
---

Thanks,
Dwaipayan.


More information about the Linux-kernel-mentees mailing list