[cgl_discussion] Re: [cgl_specs] Use case - Live patching

Wed Mar 30 09:54:03 PST 2005

John Cherry wrote:

>BTW, I am moving this discussion to just cgl_discussion (not cross
>posting to cgl_specs and cgl_tech_board).
>
Thanks.

>On Wed, 2005-03-30 at 08:54 -0600, Corey Minyard wrote:
>>Yes, I agree.  The requirement should only be for user applications, not 
>>distro components or kernel.  It should explicitly say so.  However, 
>>Ralf said:
>>
>>Then it should be stated very explicitly that this feature may only
>>be used for and by applications and that it is forbidden to patch
>>the underlying distro with it.
>>
>>which would restrict the feature to forbid patching the distro components.
>
>Just some clarification.  The CGL requirements do not dictate
>implementations.  The CGL requirements do not restrict the usage of a
>capability, they only define the core capability that must exist.  The
>usage models that we are developing are intended to describe how the
>capability WILL be used by customers (NEPs, carriers, or end users).
>
>So....
>
>The core requirement should be crisp in defining that live patching MUST
>apply to user applications.  This should not limit an implementation from
>also allowing live patching of distro components or the kernel.  It
>should not dictate a user-space only implementation.
>
>Let's help Takashi with the actual usage cases for live patching.  For
>instance, can we identify some specific applications or application sets
>that would benefit from live patching? We should at least be able to
>call out some application types.
>
Practically any telecom application can benefit.  There are risks 
inherent in patching.  There are risks in having to go simplex to fix a 
bug.  There are risks in it taking a very long time to apply a fix and 
not being able to easily back it out.  It's really up to the application 
architects to decide their strategy based upon the risks.

For instance, when I worked in that realm, Nortel base-station radios 
were not patchable.  Updating the software on a few hundred of those 
took a *long* time (days, though I understand they have since greatly 
reduced the time).  It would have been much better to be able to 
download a patch of a few hundred bytes to fix a problem; that would 
have taken a few minutes.

For fixing minor bugs in 1+1 applications like call control (say an HLR, 
or a core switch), patching allows fixes to be implemented without 
having to go simplex.

Maybe a likely scenario is a better way to describe it.  For instance:

An application has been deployed to the field.  A new subscriber was 
added; a bug in a function was exercised due to the uniqueness of that 
subscriber's data,  causing a function to return an incorrect value.  
The TEM generated a patch to be applied to replace the incorrect 
function and fix any incorrect data caused by the bug.

This is an ideal (low risk) use for patching.  It is fixing a bug in a 
single function with limited scope.  A one-time data fixup is included.
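
To make the scenario concrete, here is a minimal sketch in Python.  It 
is only a model of the idea, not a real patching mechanism: the call 
table, the billing_zone functions and apply_patch are all hypothetical 
names invented for illustration.  It shows the two parts of such a 
patch, the replacement function and the one-time data fixup, and keeps 
the old function around so the patch can be backed out.

```python
# Hypothetical model: callers look functions up in this table, so
# replacing an entry redirects all future calls to the new version.
functions = {}
records = {}  # subscriber id -> stored billing zone


def billing_zone_v1(subscriber_id):
    # Buggy original: ids of 1000 or more wrap around to a wrong zone.
    return (subscriber_id % 1000) // 100


def billing_zone_v2(subscriber_id):
    # Corrected function delivered in the patch.
    return subscriber_id // 100


def add_subscriber(subscriber_id):
    # Always calls whichever version is currently installed.
    records[subscriber_id] = functions["billing_zone"](subscriber_id)


def fixup_zones():
    # One-time repair of values the buggy function already stored.
    for subscriber_id in records:
        records[subscriber_id] = billing_zone_v2(subscriber_id)


def apply_patch(name, new_func, fixup):
    old_func = functions[name]
    functions[name] = new_func  # replace the incorrect function
    fixup()                     # repair data corrupted by the bug
    return old_func             # returned so the patch can be backed out


functions["billing_zone"] = billing_zone_v1
add_subscriber(250)    # handled correctly by the original code
add_subscriber(1250)   # unusual id exercises the bug, wrong zone stored
old = apply_patch("billing_zone", billing_zone_v2, fixup_zones)
add_subscriber(1300)   # handled by the patched function
```

Backing the patch out in this model is just functions["billing_zone"] = 
old, which is part of why a single-function patch is low risk.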

>Is 100ms for suspending an application
>reasonable?
>
That's a hard question.  It really depends on how the application is 
designed and the size (number of functions) of the patch.  If you have a 
system with 10,000 threads and you are patching 100 functions, it's 
going to be hard to meet 100ms and be correct, as there are some O(n^2) 
operations that you need to do in that realm with everything stopped to 
be 100% correct.  A system with 10 threads and a 10 function patch would 
not be a problem on a modern processor.  It also depends on processor 
speed and probably a host of other things.  You could set up a scenario 
(e.g. Pentium 4, 3 GHz, 100 threads, 10-function patch).  But that's 
probably the best you can do.
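
As a rough illustration of why the time grows with both numbers, here 
is a sketch in Python of one check a patcher might have to make while 
everything is stopped: that no thread is currently inside a function 
being replaced.  This is an assumption about what such a check could 
look like, not a description of any particular implementation; the 
function name, the stack representation and the addresses are all made 
up.

```python
def check_stopped_threads(thread_stacks, patched_ranges):
    """Naive safety check over stopped threads.

    thread_stacks:  one list of code addresses per thread (hypothetical
                    stand-in for the frames on that thread's stack).
    patched_ranges: (start, end) address range of each function being
                    replaced, end exclusive.

    Returns (safe, comparisons).  safe is False if any frame lies inside
    a patched function.  comparisons counts the address tests performed,
    which is threads * frames * patched functions for this naive loop.
    """
    safe = True
    comparisons = 0
    for stack in thread_stacks:
        for address in stack:
            for start, end in patched_ranges:
                comparisons += 1
                if start <= address < end:
                    safe = False
    return safe, comparisons


# Three threads with two frames each, checked against two functions.
stacks = [[0x1000, 0x2004], [0x3000, 0x1010], [0x5000, 0x6000]]
ranges = [(0x2000, 0x2100), (0x7000, 0x7100)]
result = check_stopped_threads(stacks, ranges)
```

With 10,000 threads and 100 functions the same loop does a million 
comparisons per stack frame, all with the application suspended.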

IMHO, this is probably best left unspecified.

>How would a multi-threaded application update work?  Etc.
>
I think we would want to require support for multi-threaded 
applications.  IMHO, how is an implementation detail.

Also, there probably need to be some restrictions on other things.  We 
probably cannot patch interpreted applications (Java, Python, etc.).  
There might be language support issues, so we may want to restrict it to 
C and C++ functions.  There are classes of implementations that are 
harder to patch, like garbage-collected applications and hard real-time.

>TIA,
>John
>
>>-Corey
>>
>>Brian F. G. Bidulock wrote:
>>
>>>Corey,
>>>
>>>But, as it stands, the requirement is not limited in the use case
>>>to user applications (and specific ones at that).  I think that not
>>>forbidding expanded scope goes without saying...
>>>
>>>--brian
>>>
>>>On Wed, 30 Mar 2005, Corey Minyard wrote:
>>>
>>>>No.  A standard that restricts extra things a distro may do is wrong.  
>>>>This has nothing to do with verifiable certification.
>>>>
>>>>IMHO, all a distro should be required to do is allow patching of a user 
>>>>application.  With that, it would clearly meet CGL requirements and it 
>>>>is verifiable.  A distro may restrict that to only user applications, it 
>>>>may allow patching of core components, it may only allow its own 
>>>>patches to be used for core components, etc.  But adding a restriction 
>>>>where the distro may not allow its libraries or OS to be patched is an 
>>>>extra restriction that adds nothing to certification and limits what a 
>>>>distro can do beyond the standard to compete in the marketplace.
>>>>
>>>>And your comment about being fuzzy about what is and is not allowed is 
>>>>not correct.  I'm sure SuSE ships a lot of packages that are not 
>>>>mentioned in the CGL standard.  Would you like to have to remove all 
>>>>those packages and only ship what is explicitly mentioned in the 
>>>>standard?  The standard should be clear about what is required, and 
>>>>should not be fuzzy about that.  However, it should allow things beyond 
>>>>the standard.  Allowing this does a couple of things.  It lets the 
>>>>distro vendor add value beyond the standard.  It shows where the 
>>>>standard needs to be extended.  And it allows new technology to be 
>>>>tested in the market.
>>>>
>>>>I believe a limitation like this would be like adding the limitation: 
>>>>"You can only provide support 12 hours a day."  It's an unnecessary 
>>>>limitation.  The distro should be able to provide any level of support 
>>>>it thinks customers require.
>>>>
>>>>-Corey
>>>>
>>_______________________________________________
>>cgl_discussion mailing list
>>cgl_discussion at lists.osdl.org
>>http://lists.osdl.org/mailman/listinfo/cgl_discussion