[Ksummit-discuss] [TECH TOPIC] A Safety-critical Linux system architecture

Tiejun Chen tiejunc at vmware.com
Thu Sep 13 03:13:11 UTC 2018


> -----Original Message-----
> From: Darren Hart <dvhart at infradead.org>
> Sent: Thursday, September 13, 2018 12:29 AM
> To: Linus Walleij <linus.walleij at linaro.org>
> Cc: Tiejun Chen <tiejunc at vmware.com>; ksummit-
> discuss at lists.linuxfoundation.org
> Subject: Re: [Ksummit-discuss] [TECH TOPIC] A Safety-critical Linux system
> architecture
> 
> On Wed, Sep 12, 2018 at 12:35:07PM +0200, Linus Walleij wrote:
> > On Wed, Sep 12, 2018 at 3:18 AM Tiejun Chen <tiejunc at vmware.com> wrote:
> >
> > > software contexts need to be certified according to different
> > > specifications like ARINC 653, Automotive Safety Integrity Level,
> > > and so on. So we need to explore making Linux itself certified.
> >
> > There is a bunch of these certifications and specifications. Many of
> > them include manual review of "all code on the system", which is why
> > > several approaches to this include stripping down the kernel source
> > to only the code (after removing all Kconfig buzz and ifdefs) that
> > will compile and run on the target.
> >
> > This should of course be possible to integrate into the existing Linux
> > build system, like "make sources" that would create a reduced kernel
> > > tree that will also compile (Russian matryoshka dolls come to mind). I
> > think such projects exist in Japan but I haven't heard from them
> > recently.
> >
> > My pet peeve is that the review process appears to be something along
> > the lines that a "certified person/consultant" who has training in
> > this standard is supposed to review all the code for safety, so after
> > reviewing IMO we should work on (A) making sure that these reviews and
> > comments and the exact lines of the kernel and which version/commit ID
> > > of it they pertain to are made public and (B) that the reviewer's
> > statement be merged into the kernel git log as some kind of annotation
> > along the lines of:
> >
> > Reviewed-for-ISO-26262-by: Linus Walleij <linus.walleij at linaro.org>
> >
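
To make the trailer idea above concrete, here is a minimal sketch of how such annotations could be pulled back out of commit messages. It assumes the trailer is spelled exactly as proposed ("Reviewed-for-ISO-26262-by"), which is only a suggestion in this thread and not an established kernel tag. The function name and the sample commit message are hypothetical.

```python
import re

# Hypothetical trailer name taken from the proposal above; it is not an
# established kernel tag.
TRAILER = "Reviewed-for-ISO-26262-by"

# One trailer per line: "<TRAILER>: Name <email>"
_TRAILER_RE = re.compile(
    r"^" + re.escape(TRAILER) + r":[ \t]*(?P<name>.+?)[ \t]*<(?P<email>[^>]+)>[ \t]*$",
    re.MULTILINE,
)


def safety_reviewers(commit_message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for each safety-review trailer found."""
    return [
        (m.group("name"), m.group("email"))
        for m in _TRAILER_RE.finditer(commit_message)
    ]


if __name__ == "__main__":
    # Made-up commit message, for illustration only.
    message = (
        "example: tighten bounds check\n"
        "\n"
        "Signed-off-by: A Developer <dev@example.org>\n"
        "Reviewed-for-ISO-26262-by: Some Reviewer <reviewer@example.org>\n"
    )
    print(safety_reviewers(message))
```

Feeding this the output of "git log" for a given range would give a rough picture of which commits carry a review statement, though it says nothing about what the review actually covered.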

Thanks a lot. This is really inspiring to me in this area.

> 
> Functional Safety (FuSa) is the freedom from unacceptable risk. It typically
> involves safety measures that manage known and acceptable risk.
> 
> There is a significant difference in the traditional Functional Safety (FuSa)
> systems and the systems built with Linux. As opposed to purpose built micro
> processors and less than 100k lines of code, Linux systems on general purpose
> CPUs present a "complex" system - where "complex" is referring to a system
> which exhibits emergent properties - properties which can only be observed in
> the assembled system, and not in the individual components.
> 
> This is significant because it requires a new approach to qualifying systems. We
> cannot apply traditional hazard analysis for fault trees to systems with CPUs
> made up of 7 billion transistors (each) and a pre-existing software stack with tens
> of millions of lines of code.
> 
> Point being: starting to add safety specific reviews to a pre-existing complex
> software stack doesn't help. Linux will never be developed in a strictly
> compliant manner to any FuSa standard (especially 26262 which is not suitable
> for any complex software stack, fortunately it allows you to defer to the more
> generic IEC 61508).
> 
> The problem facing us here is not "how do we make Linux safe", it is "how do
> we show that Linux has been developed in such a way that it presents an
> acceptable level of risk which can be managed with a defined set of safety
> measures".
> 
> To the point of "Architecture". While it is tempting to try to "make it safe" or
> "design an architecture", safety critical systems are designed first from a safety
> case.
> 
> The problem we need to solve here is not a technical Linux kernel problem. We
> need to understand a set of use cases, determine safety requirements, and then
> complete the methods and procedures begun by the SIL2LinuxMP project to
> show that Linux (pretty much as is) can be used with an acceptable level of risk.
> 
> I do not feel Kernel Summit is the right venue for this discussion.
> 

I don't understand why we cannot have this discussion there.

On the one hand, we already have several approaches for bringing the Linux kernel into such a safety-critical environment, like SIL2LinuxMP, Jailhouse and so on. At OSS NA, Intel even announced that Clear Linux could be considered a good candidate. And some new {software, hardware} features have been introduced into the Linux kernel in recent years. So, for my part, I'd like to see some potential incorporation of these existing technologies. At this point it's worth discussing what Linux itself can do right now, and what the Linux kernel itself could do beyond that.

On the other hand, this holds even before we have done what you describe: "understand a set of use cases, determine safety requirements, and then complete the methods and procedures". Yes, I tend to agree that we need to make these things very clear, but that doesn't mean we shouldn't talk about Linux itself now, because we already have fundamental issues such as:
1. Real-time behavior: Linux needs to behave as an RTOS to meet safety-critical requirements.
2. Partitioning of {software, hardware} resources: we need strong barriers that provide evidence that one program cannot interfere with another in any way, including through shared memory, interrupts, etc.
3. How to "remove" or disable any unnecessary or unused code in a safety-critical environment.
4. Documentation of safety and security in Linux.
5. ...
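
As a rough illustration of item 3, the sketch below works out which objects a kbuild-style Makefile would build for a given .config, which is the first step toward knowing what code is actually in the image and what could be dropped from review. It relies only on the usual "obj-$(CONFIG_FOO) += foo.o" convention and the "CONFIG_FOO=y" / "# CONFIG_FOO is not set" format of .config. The option names, file names and helper functions are hypothetical, and it deliberately ignores line continuations, nested directories, composite objects and #ifdef blocks inside files, so it is nowhere near a complete tool.

```python
import re


def parse_config(text: str) -> dict[str, str]:
    """Map CONFIG_* names to their values from .config-style text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        options[name] = value
    return options


# Matches lines such as "obj-$(CONFIG_FOO) += foo.o bar.o" or "obj-y += core.o".
_OBJ_RE = re.compile(
    r"^obj-(?:\$\((?P<opt>CONFIG_\w+)\)|(?P<lit>[ym]))\s*[:+]?=\s*(?P<objs>.*)$"
)


def built_objects(makefile_text: str, config: dict[str, str]) -> list[str]:
    """List objects that would be built (built-in or module) for this config."""
    built = []
    for line in makefile_text.splitlines():
        m = _OBJ_RE.match(line.strip())
        if not m:
            continue
        # Either a literal "y"/"m" or the value of the referenced option.
        state = m.group("lit") or config.get(m.group("opt"), "")
        if state in ("y", "m"):
            built.extend(m.group("objs").split())
    return built


if __name__ == "__main__":
    # Hypothetical option and file names, for illustration only.
    config = parse_config(
        "CONFIG_FOO=y\n# CONFIG_BAR is not set\nCONFIG_BAZ=m\n"
    )
    makefile = (
        "obj-y += core.o\n"
        "obj-$(CONFIG_FOO) += foo.o foo_helper.o\n"
        "obj-$(CONFIG_BAR) += bar.o\n"
        "obj-$(CONFIG_BAZ) += baz.o\n"
    )
    print(built_objects(makefile, config))
```

With the sample input, bar.o is left out because CONFIG_BAR is not set. Anything that never shows up in such a list across the whole tree is a candidate for being excluded from a reduced source tree.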

Thanks
Tiejun



