[PATCH review 08/12] quota: Ensure qids map to the filesystem

Thu Jul 14 17:03:19 UTC 2016

Jann Horn <jann at thejh.net> writes:

> On Wed, Jul 13, 2016 at 11:34:36AM +1000, Dave Chinner wrote:
>> On Mon, Jul 11, 2016 at 01:12:49PM -0500, Eric W. Biederman wrote:
>> > The place where I am concerned about thorough review and testing is
>> > someone poisoning quota files and then the kernel trying to use them.
>> > In the preliminary work we have done in other places in the kernel and
>> > for other filesystems there almost always winds up being some way to
>> > confuse the kernel and get it to misbave if you can poison the disk
>> > based inputs.  As poison disk based inputs is not something filesystems
>> > are stronlgy concerned about.  In most cases the disk the filesystem
>> > resides on is in the box and therefore under control of the OS at all
>> > times.  Dave Chinner has even said he will never consider handling
>> > poisoned disk based inputs for XFS as the run time cost is too high.
>> 
>> I didn't say that. I said that comprehensive checks to catch all
>> possible malicious inputs is too expensive to consider a viable
>> solution for allowing user-mounts of arbitrary filesystem images
>> through the kernel.

Dave you said that speaking as the XFS maintainer.  So I take that to be
your position and to refer to XFS.

> [...]
>> 
>> To bring this back to quota files, the only way to validate that a
>> quota file has not been tampered with is to run a quotacheck on the
>> filesystem once it has been mounted. This requires visiting every
>> inode in the filesystem, so it an expensive operation. Only XFS has
>> this functionality in kernel, so for untrusted mounts we could
>> simply run it on every mount that has quotas enabled. Of course,
>> users won't care that mounting their filesystem now takes several
>> minutes (hours, even, when we have millions of inodes in the fs)
>> while these checks are run...
>>
>> Detecting malicious corruptions that specifically manipulate the
>> on-disk structure within the bounds of format validity are difficult
>> to detect and costly to protect against. We'd need to move large
>> parts of fsck into the kernel and run it to validate every piece of
>> metadata read into the kernel. Then we've got a much larger attack
>> surface in the kernel (all the validity checking code needs to be
>> robust against invalid structures, too!), a lot more complexity
>> (more bugs!) and a lot of additional runtime overhead (slow
>> filesystem = unhappy users!). It's just not a practical solution to
>> the problem.

This critiqute as I read it confuses data integrity and safety from
privilege escalation.  If a filesystem image or it's backing store are
malicious there is no need to be concerned about data integrity.

> And ideally, you'd want to also guard against an evil disk that
> suddenly changes its contents after you've run fsck on it, and you
> can't easily do that without making things complicated.

I agree an evil disk definitely should be part of any threat analysis.

I also agree that anything that is complicated is not a practical
as complicated means lots of code, and bugs are in general a function
of the amount of code.

At the same time my only concern when analyzing something for safety
against a malicious filesystem is can the malicious data cause the
kernel to misbehave.  This includes things like stack overflows,
and memory corruption.

In the specific case of a quota file, if there is a quota file that has
little to no resemblence to reality but the kernel doesn't misbehave, I
don't think that is a problem from an unprivileged mount perspective.
On the flip side if a malicious quota file allows an in kernel quota
variable to go negative and that causes the kernel to misbehave that is
a show stopper for allowing an unprivileged mount.

Personally I see the quantity of code in current filesystems as making
it hard to have a low enough probability of problems to allow
unprivileged mounts.

Changes for all of the weird cases a backing store for unprivileged
mounts brings with it have been added to the VFS because that is the
right place to implement them.  Working at a filesystem independent
level allows all of the right people involved in the review, and it
allows clean and general solutions to the weird cases that come up
with a uid or gid does not map into the kernel.

All of that said the only filesystem with backing store that I see as a
reasonable target to support right now is fuse.  As fuse was designed
from the get go to support filesystems from unprivileged users.

Will fuse someday support quotas?  I don't know.  But the there is no
extra cost to support that case in fs/quota/quota.c so I have added the
necessary code.

Eric