seccomp feature development

Aleksa Sarai asarai at suse.de
Wed May 20 06:18:51 UTC 2020


On 2020-05-19, Alexei Starovoitov <alexei.starovoitov at gmail.com> wrote:
> On Wed, May 20, 2020 at 11:20:45AM +1000, Aleksa Sarai wrote:
> > No it won't become copy_from_user(), nor will there be a TOCTOU race.
> > 
> > The idea is that seccomp will proactively copy the struct (and
> > recursively any of the struct pointers inside) before the syscall runs
> > -- as this is done by seccomp it doesn't require any copy_from_user()
> > primitives in cBPF. We then run the cBPF filter on the copied struct,
> > just like how cBPF programs currently operate on seccomp_data (how this
> > would be exposed to the cBPF program as part of the seccomp ABI is the
> > topic of discussion here).
> > 
> > Then, when the actual syscall code runs, the struct will have already
> > been copied and the syscall won't copy it again.
> 
> Let's take bpf syscall as an example.
> Are you suggesting that all of syscall logic of conditionally parsing
> the arguments will be copy-pasted into seccomp-syscall infra, then
> it will do copy_from_user() all the data and replace all aligned_u64
> in "union bpf_attr" with kernel copied pointers instead of user pointers
> and make all of bpf syscall's copy_from_user() actions to be conditional ?
> If seccomp is on, use kernel pointers... if seccomp is off, do copy_from_user ?
> And the same idea will be replicated for all syscalls?

This would be done optionally per-syscall. Only syscalls which want to
opt-in to such a mechanism (such as clone3 and openat2) would be
affected. Also, bpf is possibly the least-friendly syscall to pick as an
example of these types of filters -- openat2/clone3 is much simpler to
consider.

The point is that if we both agree that seccomp needs to have a way to
do "deep argument inspection" (filtering based on the struct argument to
a syscall), then some sort of caching mechanism is simply necessary to
solve the problem. Otherwise there's a trivial TOCTOU and seccomp
filtering for such syscalls would be rendered almost useless.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/containers/attachments/20200520/10d9b365/attachment.sig>


More information about the Containers mailing list