[PATCH RESEND v11 7/8] open: openat2(2) syscall

Aleksa Sarai cyphar at cyphar.com
Thu Aug 29 13:19:04 UTC 2019


On 2019-08-29, Rasmus Villemoes <linux at rasmusvillemoes.dk> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol at google.com> wrote:
> 
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> > 
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> > 
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> > 
> >  1. Use O_* flags to indicate extensions.
> >  2. A separate "version" field that is incremented when we change.
> >  3. Add a size_t argument to openat2(2).
> >  4. Reserve space (as in this patchset).
> > 
> > (My personal preference would be (3), followed closely by (2).)
> 
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> 	size_t copy = min(ksize, usize);
> 
> 	if (copy_from_user(kernel, user, copy))
> 		return -EFAULT;
> 
> 	if (usize > ksize) {
> 		/* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> 		char c;
> 		user += ksize;
> 		usize -= ksize;
> 		while (usize--) {
> 			if (get_user(c, user++))
> 				return -EFAULT;
> 			if (c)
> 				return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> 		}
> 	} else if (ksize > usize) {
> 		memset(kernel + usize, 0, ksize - usize);
> 	}
> 	return 0;
> }
> #define copy_abi_struct(kernel, user, usize)	\
> 	__copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
> 
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

    ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

    ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/containers/attachments/20190829/432ef66c/attachment-0001.sig>


More information about the Containers mailing list