[Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?

Coiby Xu coiby.xu at gmail.com
Wed Oct 14 11:34:05 UTC 2020


On Wed, Oct 14, 2020 at 12:24:20PM +0800, Coiby Xu wrote:
>On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>
>>
>>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>>Hi,
>>>>>>
>>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>>Hi Hans and Linus,
>>>>>>>
>>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>>malfunctioning.
>>>>>>>
>>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>>when playing with the GPIO sysfs interface,
>>>>>>>
>>>>>>> - export pin130 which is used by the touchad
>>>>>>> - set the direction to be "out"
>>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>>
>>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>>manually trigger an interrupt now.)
>>>>>>>
>>>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>>>
>>>>>>That is a great find, well done.
>>>>>>
>>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>>suggest?
>>>>>>
>>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>>the glitch filter, you could try changing that code to disable the
>>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>>disabled are already there.
>>>>>>
>>>>>
>>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>>Linux users who have been troubled by this issue for months would
>>>>>also feel great to know this small tweaking could bring their
>>>>>touchpad back to life:) Thank you!
>>>>
>>>>That is good to hear, I'm glad that we have finally found a solution.
>>>>
>>>>>$ git diff
>>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>>                 break;
>>>>>
>>>>>I will learn more about the glitch filter and the implementation of
>>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>
>>>>The glitch filter likely also has settings for how long a glitch
>>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>>then 0.1s not a glitch, then having it enabled would be fine.
>>>>
>>>>I don't think we want some sort of quirk here to only disable the
>>>>glitch filter for some touchpads. One approach might be to simply
>>>>disable it completely for level type irqs.
>>>>
>>>>What we really need here is some input from AMD engineers with how
>>>>this is all supposed to work.
>>>>
>>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>>touch it all ?
>>>>
>>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>>would really help here ...
>>>
>>>So I've been digging through the history of the pinctrl-amd.c driver
>>>and once upon a time it used to set a default debounce time of
>>>2.75 ms.
>>>
>>>See the patch generated by doing:
>>>
>>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>
>>>In a linux kernel checkout.
>>>
>>>So it would be interesting to add a debugging printk to see
>>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>>GPIO.
>>>
>>>I guess that it might be all 1s (0xfffffffff) or some such which
>>>might be a way to check that we should disable the glitch-filter
>>>for this pin?
>>
>>p.s.
>>
>>Or maybe we should simply stop touching all the glitch-filter
>>related bits, in the same way as that old commit has already
>>removed the code setting the timing of the filter ?
>>
>>At least is seems that forcing the filter to be on without
>>sanitizing the de-bounce time is not a good idea.
>>
>Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
>so there must be a bug. As far as I can understand pinctrl-amd,
>"pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
>feature,
>
>static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>		unsigned debounce)
>{
>    ...
>	if (debounce) {
>        ...
>		if (debounce < 61) {
>			pin_reg |= 1;
>			pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>			pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>		...
>		} else if (debounce < 1000000) {
>			time = debounce / 62500;
>			pin_reg |= time & DB_TMR_OUT_MASK;
>			pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>			pin_reg |= BIT(DB_TMR_LARGE_OFF);
>		} else {
>			pin_reg &= ~DB_CNTRl_MASK;
>			ret = -EINVAL;
>		}
>
>	} else {
>        ...
>		pin_reg &= ~DB_CNTRl_MASK;
>	}
>    ...
>}
>
>However in amd_gpio_irq_set_type, "ping_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
>is used,
>
>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>{
>
>    ...
>	case IRQ_TYPE_LEVEL_LOW:
>		pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>		pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>		pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>		pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>		pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>		irq_set_handler_locked(d, handle_level_irq);
>		break;
>    ...
>}
>
>If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
>flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
>to mask out the debouncing filter and the bug lies in amd_gpio_set_type.
>
Sorry, I notice the touchpad is not as sensitive as before when using
"pin_reg & ~DB_CNTRl_MASK". When I use hid-recorder to read the HID
reports of the touchpad, several duplicates would be read. I interpret
it as spurious interrupts are fired because the debouncing filter is
disabled. So it seems there are two mistakes in pinctrl-amd. One mistake
is it shouldn't disable the debouncing filter here and the other mistake
is the way to disable the debouncing filer is incorrect.
>Btw, can you explain what's the difference between glitch filter and
>debouncing filter? Or can you point to some references? I've gain some
>experience about how to configure the GPIO controller by studying the
>code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
>reference manual for baytrail either). I also tweaked the configuration
>in pinctrl-amd, for example, setting the debounce timeout to 976 usec
>and 3.9 msec without disabling the glitch filter could also save the
>touchpad. But I need some knowledge to understand why this touchpad [1]
>which also uses the buggy pinctrl-amd isn't affected.
>
>[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095
>
>>Regards,
>>
>>Hans
>>
>
>--
>Best regards,
>Coiby

--
Best regards,
Coiby


More information about the Linux-kernel-mentees mailing list