[Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?

Coiby Xu coiby.xu at gmail.com
Tue Oct 6 08:31:57 UTC 2020


On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>Hi Hans and Linus,
>>
>>I've found the direct evidence proving the GPIO interrupt controller is
>>malfunctioning.
>>
>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>when playing with the GPIO sysfs interface,
>>
>>  - export pin#130, which is used by the touchpad
>>  - set the direction to be "out"
>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>    `echo 1 > value` will make it stop firing
>>
>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>manually trigger an interrupt now.)
>>
>>I wrote a C program to make the GPIO controller generate a few
>>interrupts and then stop firing, by toggling pin#130's value at a
>>specified time interval, i.e., set the value to 0 first and then,
>>after some time, set it back to 1. No interrupt fires unless the
>>time interval is > 120ms (~7Hz). This explains why we can only see
>>7 interrupts per second for the GPIO controller's parent irq.
>
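
For reference, the toggling test quoted above can be sketched roughly as
follows. This is not the exact program I ran, only a minimal
reconstruction: the helper names write_gpio_value() and pulse_low() are
made up for illustration, and the value path is assumed to be the legacy
sysfs file /sys/class/gpio/gpioN/value, where N (chip base + 130) has to
be looked up on the machine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: write 0 or 1 to a sysfs GPIO "value" file.
 * Returns 0 on success, -1 on failure. */
int write_gpio_value(const char *value_path, int value)
{
    FILE *f = fopen(value_path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "%d\n", value ? 1 : 0) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: drive the pin low, wait low_ms milliseconds,
 * then drive it high again. On my machine the low phase is what makes
 * the parent irq fire; this may not hold on other hardware. */
int pulse_low(const char *value_path, long low_ms)
{
    struct timespec ts = {
        .tv_sec  = low_ms / 1000,
        .tv_nsec = (low_ms % 1000) * 1000000L,
    };

    if (write_gpio_value(value_path, 0) != 0)
        return -1;
    nanosleep(&ts, NULL);
    return write_gpio_value(value_path, 1);
}
```

Calling pulse_low() with different low_ms values while watching the
parent irq count in /proc/interrupts is how the ~120ms threshold shows
up: shorter pulses produce no interrupt at all.
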
>That is a great find, well done.
>
>>My hypothesis is that the GPIO chip doesn't have a proper power setting,
>>so it stays in an idle state, or its clock frequency is too low by
>>default and thus not quick enough to read the interrupt input. If so,
>>pinctrl-amd must be missing some code to configure the chip, and I need
>>either a hardware reference manual for this GPIO chip (HID: AMDI0030)
>>or to reverse-engineer the Windows driver, since I couldn't find a copy
>>of the reference manual online. What would you suggest?
>
>This sounds like it might have something to do with the glitch filter.
>The code in pinctrl-amd.c to setup the trigger-type also configures
>the glitch filter, you could try changing that code to disable the
>glitch-filter. The defines for setting the glitch-filter bits to
>disabled are already there.
>

Disabling the glitch filter works like a charm! Other enthusiastic
Linux users who have been troubled by this issue for months will
also be glad to know this small tweak can bring their touchpads
back to life :) Thank you!

$ git diff
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9a760f5cd7ed..e786d779d6c8 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
                 irq_set_handler_locked(d, handle_level_irq);
                 break;
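
To spell out what the one-line change does: the preceding line already
clears the debounce control field, so with the |= commented out the
field simply stays at zero. The sketch below shows that as a standalone
helper. The bit offset and the DB_TYPE_* values are my reading of
pinctrl-amd.h and should be double-checked against the tree; the
SKETCH_ prefix and set_debounce_type() are names made up for this
example and do not exist in the driver.

```c
#include <stdint.h>

/* Assumed to mirror drivers/pinctrl/pinctrl-amd.h; verify before use. */
#define SKETCH_DB_CNTRL_OFF                  5
#define SKETCH_DB_CNTRL_MASK                 0x3U
#define SKETCH_DB_TYPE_NO_DEBOUNCE           0x0U
#define SKETCH_DB_TYPE_PRESERVE_HIGH_GLITCH  0x2U

/* Hypothetical helper: return pin_reg with the debounce control field
 * replaced by db_type, leaving every other bit untouched. */
uint32_t set_debounce_type(uint32_t pin_reg, uint32_t db_type)
{
    pin_reg &= ~(SKETCH_DB_CNTRL_MASK << SKETCH_DB_CNTRL_OFF);
    pin_reg |= (db_type & SKETCH_DB_CNTRL_MASK) << SKETCH_DB_CNTRL_OFF;
    return pin_reg;
}
```

Under these assumptions, the original code corresponds to passing
SKETCH_DB_TYPE_PRESERVE_HIGH_GLITCH and the patched code to passing
SKETCH_DB_TYPE_NO_DEBOUNCE.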

I will learn more about the glitch filter and the implementation of
pinctrl and see if I can disable the glitch filter only for this touchpad.

>Regards,
>
>Hans
>
>
>
>
>>
>>Thank you!
>>
>>On Sun, Oct 04, 2020 at 01:16:44PM +0800, Coiby Xu wrote:
>>>On Sun, Oct 04, 2020 at 07:03:40AM +0800, Coiby Xu wrote:
>>>>On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>>>>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>>>>>Hi,
>>>>>>>
>>>>>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>>>>>
>>>>>>><snip>
>>>>>>>
>>>>>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>>>>>the GPIO controllers parent irq ?
>>>>>>>>>>>
>>>>>>>>>>>Also do these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>>>>>touchpad ?
>>>>>>>>>>>
>>>>>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>>>>>
>>>>>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>>>>>
>>>>>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>>>>>
>>>>>>>>This only happens on the parent irq. The input's pin#130 of the GPIO
>>>>>>>>chip is low most of the time and pin#130.
>>>>>>>
>>>>>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>>>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>>>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>>>>>that is not triggering the child irq handler for the touchpad.
>>>>>>>
>>>>>>I'm not sure if I have some incorrect understandings about GPIO
>>>>>>interrupt controller because I don't quite follow your reasoning.
>>>>>>What I actually suspect is there's something wrong with amd-pinctrl
>>>>>>which makes the GPIO chip fail to assert its common interrupt output
>>>>>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>>>>>I learn about this low-level triggered IRQ is that the i2c-hid
>>>>>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>>>>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>>>>>
>>>>>>    <IRQ>
>>>>>>    dump_stack+0x64/0x88
>>>>>>    __irq_wake_thread.cold+0x9/0x12
>>>>>>    __handle_irq_event_percpu+0x80/0x1c0
>>>>>>    handle_irq_event+0x58/0xb0
>>>>>>    handle_level_irq+0xb7/0x1a0
>>>>>>    generic_handle_irq+0x4a/0x60
>>>>>>    amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>>>>>    __handle_irq_event_percpu+0x45/0x1c0
>>>>>>    handle_irq_event+0x58/0xb0
>>>>>>    handle_fasteoi_irq+0xa2/0x210
>>>>>>    do_IRQ+0x70/0x120
>>>>>>    common_interrupt+0xf/0xf
>>>>>>    </IRQ>
>>>>>>
>>>>>>But the problem is that somehow IRQ#7 doesn't even fire when the
>>>>>>input of pin#130 of the GPIO chip is low. Without IRQ#7 firing,
>>>>>>amd-pinctrl's irq-handler wouldn't be executed in the first place,
>>>>>>let alone trigger the child irq handler. Btw, amd-pinctrl's
>>>>>>irq-handler simply iterates over all pins. If there is a mapped irq
>>>>>>found for this hwirq (yes, it won't even check if this pin triggers
>>>>>>the interrupt), then it will call generic_handle_irq. So there's
>>>>>>nothing wrong with this part of the code.
>>>>>
>>>>>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>>>>>like the GPIO controller's parent irq.
>>>>>
>>>>I'm not sure it's correct to say whether the i2c-hid irq fires or not,
>>>>or how frequently it fires, since the i2c-hid irq is mapped to pin#130 of the
>>>>GPIO interrupt controller and the touchpad has another interrupt line
>>>>connected to pin#130 which fires to indicate new data. All we know is
>>>>pin#130 of the GPIO chip has low input most of the time when the finger
>>>>is on the touchpad so we can infer the touchpad has been trying to
>>>>notify the kernel of new data but somehow GPIO's parent irq only fires 7
>>>>times / second.
>>>>
>>>>>The only thing I can think of then is to add printk-s to check how
>>>>>long the i2c-hid interrupt handler takes to complete. It could be
>>>>>there is a subtle bug somewhere causing the i2c transfers to take
>>>>>longer when run from a (threaded) irq handler. That would be weird
>>>>>though, so I don't expect this to result in any useful findings.
>>>>>
>>>>
>>>>I also suspected that it takes too much time for the i2c-hid handler
>>>>to finish reading the i2c transfer, processing the data and delivering
>>>>it to the input system. After measuring the time interval between the
>>>>start of the GPIO irq's parent handler and when pin#130 is unmasked,
>>>>we can exclude this possibility.
>>>>
>>>>I have been wondering whether we could make pin#130 have low input to
>>>>trigger an interrupt, or assert the GPIO's common interrupt output
>>>>line manually, so that we can measure how long it takes for the kernel
>>>>to receive the signal. But once a GPIO pin is programmed to be an
>>>>interrupt line we can't write anything to it, and it seems other
>>>>interrupts can only be generated by the hardware. So this idea is not
>>>>feasible.
>>>>
>>>
>>>Btw, there are other users who have the same laptop model but with a
>>>different touchpad (ELAN). Their touchpads show up in
>>>/proc/bus/input/devices but are completely dead. hid-recorder, which
>>>reads HID reports from /dev/hidraw, gets nothing when they put their
>>>fingers on the touchpad, but the polling mode also brings their
>>>touchpads back. It seems the GPIO controller's parent irq for the ELAN
>>>touchpad doesn't even fire once. And unlike the GPIO controller, the
>>>IO-APIC is also used by other devices like the keyboard. So maybe it's
>>>safe to assert that the root cause is in the GPIO controller.
>>>
>>>>>Other than that I'm all out of ideas I'm afraid.
>>>>>
>>>>Thank you for taking time to investigate this issue anyway! Have a nice
>>>>weekend:)
>>>>>Regards,
>>>>>
>>>>>Hans
>>>>>
>>>>
>>>>--
>>>>Best regards,
>>>>Coiby
>>>
>>>--
>>>Best regards,
>>>Coiby
>>
>>--
>>Best regards,
>>Coiby
>>
>

--
Best regards,
Coiby

