[cgl_discussion] Re: [cgl_specs] Re: [cgl_tech_board] Re: [cgl_specs] Use case - Live patching

Takashi Ikebe ikebe.takashi at lab.ntt.co.jp
Tue Mar 29 00:55:17 PST 2005


Tim,

We share the same concern about the difference between the image in
memory and the boot image.
We are also working on a boot patch to go along with the live patch.
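
As a rough illustration of the kind of check being asked for (that the
live patch set and the boot patch set are the same code), here is a
minimal sketch. It assumes the applied live patches and the boot patches
are each kept as files in a directory, as Corey describes for his
systems; the function names and the directory layout are hypothetical
and are not taken from any existing tool.

```python
import hashlib
from pathlib import Path


def digest_dir(directory):
    """Map each file name in `directory` to the SHA-256 digest of its contents."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(directory).iterdir())
        if p.is_file()
    }


def compare_patch_sets(live, boot):
    """Compare two {patch name: digest} maps and list every disagreement."""
    common = set(live) & set(boot)
    return {
        # Applied to the running system but would be lost on reboot.
        "missing_at_boot": sorted(set(live) - set(boot)),
        # Would appear after a reboot but is not running now.
        "missing_live": sorted(set(boot) - set(live)),
        # Same name, different contents.
        "content_differs": sorted(n for n in common if live[n] != boot[n]),
    }
```

A report with three empty lists would mean the two sets agree; for
example compare_patch_sets(digest_dir(live_dir), digest_dir(boot_dir)),
where both directory paths are placeholders.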

Timothy D. Witham wrote:
> On Mon, 2005-03-28 at 10:38 -0600, Corey Minyard wrote:
> 
>>Timothy D. Witham wrote:
>>
>>
>>>On Mon, 2005-03-28 at 09:05 -0600, Corey Minyard wrote:
>>> 
>>>
>>>
>>>>Why do you think it is evil?  It is standard practice in most large 
>>>>telecom systems as it improves availability.
>>>>
>>>>   
>>>>
>>>
>>>   Maybe it would be better to word it as "it can improve
>>>availability". 
>>> 
>>>
>>
>>Point taken, but that is true for practically any technique to improve 
>>availability.
>>
>>
>>>But it really is a holdover from the days of single, large, expensive
>>>CPU designs.
>>> 
>>>
>>
>>Not exactly.  It is useful for any system where you don't want to have 
>>to bring it down (or partially down) to fix a bug or you want to be able 
>>to back out the changes later.  It helps in the following ways:
>>
> 
>    Right you are. 
> 
> 
>>   1. Reduces time to fix, as applying a patch is generally a lot faster
>>      than upgrading a system.
>>   2. Avoids having to go simplex in a 1+1 system.
>>   3. Allows fixes with undesirable side effects to be easily removed. 
>>      The patch systems I have used allow patches to be removed, and
>>      fixes with undesirable side effects happened often enough to make
>>      that very useful.
>>   4. Allows debugging changes to be installed and removed in the
>>      field.  I have seen patches used to debug problems in the field.
>>   5. Avoids the "big bang" effect of installing an update for a fix and
>>      getting a whole bunch of other changes that may have undesirable
>>      side effects.
> 
> 
>     All good things, but without the tools to ensure that the live patch
> and the disk image are the same, all of the above benefits can be
> lost the first time the system has to go to storage for the image. (I'm
> using "disk" for anything that functions as the boot image.)
> 
>      This includes the same source for the build that produces the
> live patch and the updated boot image.  
> 
>      I guess I feel that live patching is like a loaded double barrel
> shotgun with no trigger guard or safety.  Yea, there are things
> that you can do quickly with it but one of them involves your
> foot.   
> 
>     I guess I would be happy with something that made sure that
> the live patch and the boot patch were the same set of code.
> 
> Tim
> 
> 
>>>   If you don't exercise absolute top-down control you get to a
>>>situation
>>>where there isn't a correlation between what is on the disk for a reboot
>>>and what is in memory being executed.   While a phone company
>>>might be able to control their switch with rather infrequent updates,
>>>in general usage this can cause real issues. 
>>>
>>>In fact, from my days supporting phone companies I remember a 
>>>couple of issues where switches were bounced because of a 
>>>major environmental issue and when they came back up they
>>>were missing features and patches.  They were in such a sorry
>>>state that they had to be reloaded in order to function correctly.   
>>> 
>>>
>>
>>Yes, this is probably the biggest problem with runtime patching.  In the 
>>systems I used, we generally took images of the system with the patches 
>>or we kept applied patches in a directory and applied them all when the 
>>system restarted.
>>
>>The other big problem is that a very good patching system tends to get 
>>"abused" and used to install major updates, major features, etc.  Things 
>>it is really not intended for.
>>
>>I don't think this is intended for general systems.  I wouldn't want to 
>>see it on workstations.  But it is useful in very controlled environments.
>>
>>-Corey
>>
>>
>>>   Tim
>>>
>>> 
>>>
>>>
>>>>-Corey
>>>>
>>>>Ralf Flaxa wrote:
>>>>
>>>>   
>>>>
>>>>
>>>>>Speaking for SUSE/Novell I can at least say that live patching is evil
>>>>>and would never be considered supported. How shall you guarantee support
>>>>>or certification with such a mechanism in place?
>>>>>
>>>>>	Ralf
>>>>>
>>>>>On Mon, Mar 28, 2005 at 11:14:50AM +0900, Takashi Ikebe wrote:
>>>>>
>>>>>
>>>>>     
>>>>>
>>>>>
>>>>>>The following is a use case for live patching.  This
>>>>>>addresses AVL10.0 Live patching in CGL Specification 3.0.
>>>>>>Please feel free to comment or make suggestions.
>>>>>>
>>>>>>Takashi.
>>>>>>-----------------------------------------------------------------------------------------------
>>>>>>Description
>>>>>>OSDL CGL specifies that carrier grade Linux shall provide the mechanism
>>>>>>for dynamically replacing the symbols of a running process without
>>>>>>restarting. Dynamic replacement of symbols allows a process to access
>>>>>>patched functions or values without restarting and can improve process
>>>>>>availability.
>>>>>>
>>>>>>Desired Outcome
>>>>>>Mainline kernel acceptance or distro acceptance
>>>>>>
>>>>>>Participants/Roles
>>>>>>System administrators set up the requirement at installation. During
>>>>>>operation, system administrators apply patches using the requirement.
>>>>>>
>>>>>>Scenarios
>>>>>>During operation, system administrators apply a patch using the
>>>>>>requirement according to the following scenario:
>>>>>>1. System administrators make a patch file from a diff file or the new
>>>>>>version's source code.
>>>>>>2. System administrators load the patch into the process with the
>>>>>>provided live patch tool.
>>>>>>3. System administrators activate the patch in the process with the
>>>>>>provided live patch tool.
>>>>>>4. System administrators confirm whether the patch was applied correctly.
>>>>>>
>>>>>>Implementation Notes
>>>>>>The requirement needs to have the following functions:
>>>>>>- A function which loads the patch file into the target process's
>>>>>>memory area.
>>>>>>- A function which overwrites the entry point of each function in the
>>>>>>target process that is to be fixed with a branch instruction to the
>>>>>>patch.
>>>>>>- A function which restores the code overwritten by the branch.
>>>>>>- A function which unloads the patch files.
>>>>>>Through the above functions, the requirement realizes on-line patching
>>>>>>of the target process.
>>>>>>The requirement needs to provide on-line patching even if the process
>>>>>>is multi-threaded or the environment is SMP, and the stop time of the
>>>>>>target process should not exceed 100 milliseconds.
>>>>>>
>>>>>>References
>>>>>>Pannus project: http://pannus.sourceforge.net/
>>>>>>Live patching implementation:
>>>>>>http://prdownloads.sourceforge.net/pannus/pannus_en.pdf
>>>>>>
>>>>>>
>>>>>>-- 
>>>>>>Takashi Ikebe
>>>>>>NTT Network Service Systems Laboratories
>>>>>>9-11, Midori-Cho 3-Chome Musashino-Shi,
>>>>>>Tokyo 180-8585 Japan
>>>>>>Tel : +81 422 59 4246, Fax : +81 422 60 4012
>>>>>>e-mail : ikebe.takashi at lab.ntt.co.jp
>>>>>>  
>>>>>>
>>>>>>       
>>>>>>
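
The branch-overwrite and restore functions in the Implementation Notes
quoted above can be illustrated on a plain byte buffer. This is only a
sketch under several assumptions that the use case itself does not
state: the branch is taken to be an x86 5-byte `jmp rel32` (opcode
0xE9), a bytearray stands in for the target process's text area, and
all function names are made up for the example.

```python
import struct

JMP_REL32 = 0xE9  # x86 near jump with a 32-bit relative displacement
JMP_LEN = 5       # one opcode byte plus four displacement bytes


def encode_jmp(entry_addr, patch_addr):
    """Return the 5 bytes of a jump from entry_addr to patch_addr."""
    # The displacement is relative to the address of the next instruction.
    rel = patch_addr - (entry_addr + JMP_LEN)
    if not -2**31 <= rel < 2**31:
        raise ValueError("patch is out of rel32 range")
    return struct.pack("<Bi", JMP_REL32, rel)


def activate(text, base, entry_addr, patch_addr):
    """Overwrite the function entry with a jump; return the saved bytes."""
    off = entry_addr - base
    saved = bytes(text[off:off + JMP_LEN])
    text[off:off + JMP_LEN] = encode_jmp(entry_addr, patch_addr)
    return saved


def deactivate(text, base, entry_addr, saved):
    """Put the original entry bytes back, undoing activate()."""
    off = entry_addr - base
    text[off:off + len(saved)] = saved
```

Keeping the saved bytes is what makes the patch removable. In a real
multi-threaded or SMP process the overwrite would also have to happen
while no thread is executing inside those five bytes, which is where
the stop time of the target process comes from; the sketch does not
attempt that part.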


-- 
Takashi Ikebe
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : ikebe.takashi at lab.ntt.co.jp


