[Ksummit-discuss] [CORE TOPIC] Kernel tinification: shrinking the kernel and avoiding size regressions

Dave Jones davej at redhat.com
Fri May 2 19:49:35 UTC 2014


On Fri, May 02, 2014 at 03:33:14PM -0400, Theodore Ts'o wrote:
 > There's been a huge focus on system calls in this discussion, and I
 > suspect this is a bit of a red herring.  Taking a look at "git log
 > arch/x86/syscalls/syscall_64.tbl" --- since all the world is no
 > longer a Vax, but rather an x86_64 :-P --- there really haven't been
 > that many new system calls lately.

I may have a vested interest in syscalls :)

The rate we're adding them has slowed down, but the rate at which we're
finding bugs exposed through them has accelerated enormously over the
last few years.

To use just one example, on certain systems I'd love to be able to just
turn off sys_perf_event_open, given what a trainwreck of vulnerabilities
it's been over the last few years [comedy: it is actually a config
option, but x86 'selects' it, so you'll have it and you'll like it].
Thankfully at least the scarier parts of it are now hidden behind the
perf_event_paranoid sysctl.
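
As an aside for readers unfamiliar with the knob: the paranoid sysctl is
exposed as /proc/sys/kernel/perf_event_paranoid. A minimal sketch of
reading it follows. The helper names are made up for illustration, and
treating level 2 and above as "no kernel profiling for unprivileged
users" is an assumption taken from the kernel's sysctl documentation,
not something stated in this thread.

```python
from pathlib import Path

# Standard procfs location of the sysctl (kernel.perf_event_paranoid).
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def parse_level(text):
    """Parse the sysctl file contents into an int, or None if malformed."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_perf_paranoid(path=PARANOID_PATH):
    """Return the current level, or None if the file is missing or unreadable."""
    try:
        return parse_level(Path(path).read_text())
    except OSError:
        return None


def unprivileged_kernel_profiling_blocked(level):
    """Assumption: levels >= 2 block kernel profiling for unprivileged users."""
    return level is not None and level >= 2
```

Calling read_perf_paranoid() with no arguments reads the live value. A
None result means the file could not be read, which is what one would
expect on a kernel built without perf events.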

 > And if you look at things like renameat(2), the actual code savings by
 > removing renameat(2) is pretty small, and IMHO, not worth the
 > complexity and uncertainty that it would represent to application
 > programmers of "does this system call exist or doesn't it".

I think we've got two categories here.

"variant" syscalls like renameat, which just offer enhancements over
an existing syscall. Stuff that things like glibc tend to care about.
This stuff is usually pretty boring, and not even worth considering for
potentially disabling imo.
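
For context on the "does this system call exist or doesn't it" worry
quoted above: userspace normally finds out that a syscall was compiled
out by getting ENOSYS back from it. A rough sketch of such a probe,
assuming Linux on x86_64 (where renameat is syscall number 264) and a
libc reachable through ctypes. The function names are hypothetical, and
any error other than ENOSYS is simply treated as "present", which is a
simplification.

```python
import ctypes
import errno

# Assumption: x86_64 syscall number for renameat; other arches differ.
RENAMEAT_NR_X86_64 = 264


def classify_probe(ret, err):
    """Interpret a (return value, errno) pair from a probe call.

    Only ENOSYS is taken to mean the syscall is absent. Other errors
    (EBADF, EFAULT, ...) show the kernel at least dispatched the call.
    """
    if ret == -1 and err == errno.ENOSYS:
        return "missing"
    return "present"


def probe_renameat():
    """Call renameat with deliberately invalid arguments and classify the result.

    The invalid fds and NULL paths make the call fail without touching
    the filesystem.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    ctypes.set_errno(0)
    ret = libc.syscall(
        ctypes.c_long(RENAMEAT_NR_X86_64),
        ctypes.c_long(-1), None,   # olddirfd, oldpath (NULL)
        ctypes.c_long(-1), None,   # newdirfd, newpath (NULL)
    )
    return classify_probe(ret, ctypes.get_errno())
```

Every application or libc that wants to cope with an optional syscall
needs some version of this check plus a fallback path, which is the
complexity cost being argued about.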

And then we have "enable boatload of code" syscalls that are typically
used by a few standalone apps/features. kexec, checkpointing, whatever
db it was that cares about remap_file_pages, mempolicy, etc. etc.

It's this "not used by every user" code that tends to scare me, because
it's written with 1-2 well-behaved bits of userspace in mind, which
usually means "has so many unchecked corner cases it's not even funny".

Ok, maybe there is also a grey area in the middle, which I guess depends
on what your userspace is going to do (things like vmsplice and
friends), but I lean towards just classing them in the 2nd category too.

 > In contrast, if you want to take a look at the bloat and complexity
 > added by the pluggable security LSM's, control groups, and name
 > spaces, the comparison isn't even close.  Furthermore, given that low
 > level programs like systemd have grown to require control groups,
 > it's not like you can even realistically strip it from potentially
 > even many embedded kernels, since there seems to be a movement to have
 > systemd infect even smaller embedded applications.

Yeah, we've reached a point of no return with things like cgroups now.

 > Anyone want to lay odds on when systemd will start using various
 > namespaces for its own purposes?  :-)

I thought it already was tbh.

	Dave


