[PATCH v2 1/2] fcntl: fix potential deadlocks for &fown_struct.lock

Wed Jul 7 10:44:42 UTC 2021

On Wed, 2021-07-07 at 08:05 +0200, Greg KH wrote:
> On Wed, Jul 07, 2021 at 10:35:47AM +0800, Desmond Cheong Zhi Xi wrote:
> > Syzbot reports a potential deadlock in do_fcntl:
> > 
> > ========================================================
> > WARNING: possible irq lock inversion dependency detected
> > 5.12.0-syzkaller #0 Not tainted
> > --------------------------------------------------------
> > syz-executor132/8391 just changed the state of lock:
> > ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: f_getown_ex fs/fcntl.c:211 [inline]
> > ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: do_fcntl+0x8b4/0x1200 fs/fcntl.c:395
> > but this lock was taken by another, HARDIRQ-safe lock in the past:
> >  (&dev->event_lock){-...}-{2:2}
> > 
> > and interrupts could create inverse lock ordering between them.
> > 
> > other info that might help us debug this:
> > Chain exists of:
> >   &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
> > 
> >  Possible interrupt unsafe locking scenario:
> > 
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&f->f_owner.lock);
> >                                local_irq_disable();
> >                                lock(&dev->event_lock);
> >                                lock(&new->fa_lock);
> >   <Interrupt>
> >     lock(&dev->event_lock);
> > 
> >  *** DEADLOCK ***
> > 
> > This happens because there is a lock hierarchy of
> > &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
> > from the following call chain:
> > 
> >   input_inject_event():
> >     spin_lock_irqsave(&dev->event_lock,...);
> >     input_handle_event():
> >       input_pass_values():
> >         input_to_handler():
> >           evdev_events():
> >             evdev_pass_values():
> >               spin_lock(&client->buffer_lock);
> >               __pass_event():
> >                 kill_fasync():
> >                   kill_fasync_rcu():
> >                     read_lock(&fa->fa_lock);
> >                     send_sigio():
> >                       read_lock_irqsave(&fown->lock,...);
> > 
> > However, since &dev->event_lock is HARDIRQ-safe, interrupts have to be
> > disabled while grabbing &f->f_owner.lock, otherwise we invert the lock
> > hierarchy.
> > 
> > Hence, we replace calls to read_lock/read_unlock on &f->f_owner.lock,
> > with read_lock_irq/read_unlock_irq.
> > 
> > Here read_lock_irq/read_unlock_irq should be safe to use because the
> > functions f_getown_ex and f_getowner_uids are only called from
> > do_fcntl, and f_getown is only called from do_fcntl and
> > sock_ioctl. do_fcntl itself is only called from syscalls.
> > 
> > For sock_ioctl, the chain is
> >   compat_sock_ioctl():
> >     compat_sock_ioctl_trans():
> >       sock_ioctl()
> > 
> > And interrupts are not disabled on either path. We assert this
> > assumption with WARN_ON_ONCE(irqs_disabled()). This check is also
> > inserted into another use of write_lock_irq in f_modown.
> > 
> > Reported-and-tested-by: syzbot+e6d5398a02c516ce5e70 at syzkaller.appspotmail.com
> > Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx at gmail.com>
> > ---
> >  fs/fcntl.c | 17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/fcntl.c b/fs/fcntl.c
> > index dfc72f15be7f..262235e02c4b 100644
> > --- a/fs/fcntl.c
> > +++ b/fs/fcntl.c
> > @@ -88,6 +88,7 @@ static int setfl(int fd, struct file * filp, unsigned long arg)
> >  static void f_modown(struct file *filp, struct pid *pid, enum pid_type type,
> >                       int force)
> >  {
> > +	WARN_ON_ONCE(irqs_disabled());
> 
> If this triggers, you just rebooted the box :(
> 
> Please never do this, either properly handle the problem and return an
> error, or do not check for this.  It is not any type of "fix" at all,
> and at most, a debugging aid while you work on the root problem.
> 
> thanks,
> 
> greg k-h

Wait, what? Why would testing for irqs being disabled and throwing a
WARN_ON in that case crash the box?
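
To make the question concrete, here is a toy userspace model of how I
understand a warn-once check to behave: it logs on the first hit, hands
the condition back, and the caller keeps running. This is only a sketch.
Every name in it (model_irqs_disabled, model_warn_on_once,
f_modown_model, warnings_printed) is made up for illustration and is not
kernel code, and the model deliberately ignores any system configuration
that changes what a warning does.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's irqs_disabled(). */
static bool model_irqs_disabled;

/* Counts how many warnings were actually printed (for illustration). */
static int warnings_printed;

/*
 * Toy warn-once helper: prints at most one warning per 'warned' flag
 * and returns the condition so the caller can still branch on it.
 */
static bool model_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Toy model of the check the patch adds at the top of f_modown().
 * Returns true if the "interrupts disabled" assumption was violated;
 * in this model, execution continues either way.
 */
static bool f_modown_model(void)
{
	static bool warned; /* one flag per call site, like a warn-once check */

	return model_warn_on_once(model_irqs_disabled, &warned,
				  "irqs disabled in f_modown model");
}
```

Calling f_modown_model() with model_irqs_disabled set to false prints
nothing. Setting it to true and calling twice prints a single warning,
and both calls return true without stopping the program.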
-- 
Jeff Layton <jlayton at kernel.org>