Advice on oops - memory trap on non-memory access instruction (invalid CR2?)
Guilherme G. Piccoli
gpiccoli at canonical.com
Tue Oct 15 15:21:45 UTC 2019
On 14/10/2019 11:10, Thomas Gleixner wrote:
> On Mon, 14 Oct 2019, Guilherme G. Piccoli wrote:
>> Modules linked in: <...>
>> CPU: 40 PID: 78274 Comm: qemu-system-x86 Tainted: P W OE
>
> Tainted: P - Proprietary module loaded ...
>
> Try again without that module
Thanks Thomas, for the prompt response. This is some ScaleIO stuff; I
guess it's part of the customer setup, and I agree it would be better
not to have this kind of module loaded. Anyway, the analysis of the oops
shows a quite odd situation, and we'd like to have at least a strong
clue before saying the ScaleIO stuff is the culprit.
>
> Tainted: W - Warning issued before
>
> Are you sure that that warning is harmless and unrelated?
>
Sorry I didn't mention that before; the warning is:
[5946866.593060] WARNING: CPU: 42 PID: 173056 at
/build/linux-lts-xenial-80t3lB/linux-lts-xenial-4.4.0/arch/x86/events/intel/core.c:1868
intel_pmu_handle_irq+0x2d4/0x470()
[5946866.593061] perfevents: irq loop stuck!
It happened ~700 days before the oops (yeah, the uptime is quite large,
about 900 days when the oops happened heh).
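
For anyone reading along, the taint letters in the oops header can be
decoded mechanically. Below is a minimal Python sketch, assuming the
"Tainted:" header format quoted above. The `decode_taint` helper and the
`TAINT_FLAGS` table are hypothetical names for illustration, and only
the four letters seen in this oops are listed:

```python
# Meanings for the taint letters that appear in this oops only; the
# kernel defines more flags than are listed here.
TAINT_FLAGS = {
    "P": "proprietary module loaded",
    "W": "a warning was issued earlier",
    "O": "out-of-tree module loaded",
    "E": "unsigned module loaded",
}


def decode_taint(line):
    """Return (letter, meaning) pairs for a 'Tainted: ...' oops header line."""
    _, sep, rest = line.partition("Tainted:")
    if not sep:
        return []  # no taint field on this line
    decoded = []
    for ch in rest:
        if ch.isspace():
            continue  # unset flags are printed as blanks
        if not (ch.isascii() and ch.isupper()):
            break  # reached whatever follows the flags (e.g. kernel version)
        decoded.append((ch, TAINT_FLAGS.get(ch, "not listed in this sketch")))
    return decoded


header = "CPU: 40 PID: 78274 Comm: qemu-system-x86 Tainted: P W OE"
for letter, meaning in decode_taint(header):
    print(letter, "-", meaning)
```

The loop stops at the first character that is not an uppercase letter,
so a kernel version string on the same line is not misread as flags.
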
>> 4.4.0-45-generic #66~14.04.1-Ubuntu
>
> Does the same problem happen with a not so dead kernel? CR2 handling got
> quite some updates/fixes since then.
Unfortunately we have no way to test that for now, but your comment is
quite interesting; we can take a look at the CR2 fixes since v4.4. But
what do you think about getting a #PF when the instruction pointed to in
the oops Code section (and by the RIP address) is not a memory-access
instruction?
Thanks,
Guilherme
>
> Thanks,
>
> tglx
>
>