[cgl_discussion] Re: [cgl_specs] Re: [cgl_tech_board] Re:
[cgl_specs] Use case - Live patching
Takashi Ikebe
ikebe.takashi at lab.ntt.co.jp
Tue Mar 29 00:55:17 PST 2005
Tim,
We share the same concern about differences between the image in memory
and the image on disk.
We are also working on a boot patch to go with the live patch.
Timothy D. Witham wrote:
> On Mon, 2005-03-28 at 10:38 -0600, Corey Minyard wrote:
>
>>Timothy D. Witham wrote:
>>
>>
>>>On Mon, 2005-03-28 at 09:05 -0600, Corey Minyard wrote:
>>>
>>>
>>>
>>>>Why do you think it is evil? It is standard practice in most large
>>>>telecom systems as it improves availability.
>>>>
>>>>
>>>>
>>>
>>> Maybe it would be better to word it as "it can improve
>>>availability".
>>>
>>>
>>
>>Point taken, but that is true for practically any technique to improve
>>availability.
>>
>>
>>>But it really is a holdover from the single large expensive CPU design
>>>days.
>>>
>>>
>>
>>Not exactly. It is useful for any system where you don't want to have
>>to bring it down (or partially down) to fix a bug or you want to be able
>>to back out the changes later. It helps in the following ways:
>>
>
> Right you are.
>
>
>> 1. Reduces time to fix, as applying a patch is generally a lot faster
>> than upgrading a system.
>> 2. Avoids having to go simplex in a 1+1 system.
>> 3. Allows fixes with undesirable side effects to be easily removed.
>> The patch systems I have used allow patches to be removed, and
>> fixes with undesirable side effects happened often enough to make
>> that very useful.
>> 4. Allows debugging changes to be installed and removed in the
>> field. I have seen patches used to debug problems in the field.
>> 5. Avoids the "big bang" effect of installing an update for a fix and
>> getting a whole bunch of other changes that may have undesirable
>> side effects.
>
>
> All good things, but without the tools to ensure that the live patch
> and the disk image are the same, all of the above benefits can be
> lost the first time the system has to go to storage for the image. (I'm
> using "disk" for anything that functions as the boot image.)
>
> This includes the same source for the build that produces the
> live patch and the updated boot image.
>
> I guess I feel that live patching is like a loaded double barrel
> shotgun with no trigger guard or safety. Yea, there are things
> that you can do quickly with it but one of them involves your
> foot.
>
> I guess I would be happy with something that made sure that
> the live patch and the boot patch were the same set of code.
>
> Tim
>
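A minimal sketch of the kind of check Tim describes, assuming that the
live patch and the boot patch are each built from a set of source files
and that a manifest of SHA-256 digests is recorded at build time. The
function names and the manifest layout are hypothetical and are not part
of any existing tool.

```python
import hashlib


def build_manifest(sources):
    """Map each source file name to the SHA-256 digest of its content.

    `sources` is a dict of {file name: bytes}; this layout is an assumption.
    """
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in sources.items()}


def mismatches(live_manifest, boot_manifest):
    """Return the sorted names whose digests differ or that exist on one side only."""
    names = sorted(set(live_manifest) | set(boot_manifest))
    return [n for n in names if live_manifest.get(n) != boot_manifest.get(n)]
```

An empty result from `mismatches` would mean both patches were built from
the same set of code; any name in the list is a file to investigate before
the next reboot.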
>
>>> If you don't exercise absolute top-down control you get to a
>>>situation where there isn't a correlation between what is on the disk
>>>for a reboot and what is in memory being executed. While a phone company
>>>might be able to control their switch with rather infrequent updates,
>>>in general usage this can cause real issues.
>>>
>>>In fact, from my days supporting phone companies I remember a
>>>couple of issues where switches were bounced because of a
>>>major environmental issue and when they came back up they
>>>were missing features and patches. They were in such a sorry
>>>state that they had to be reloaded in order to function correctly.
>>>
>>>
>>
>>Yes, this is probably the biggest problem with runtime patching. In the
>>systems I used, we generally took images of the system with the patches
>>or we kept applied patches in a directory and applied them all when the
>>system restarted.
>>
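The second approach Corey mentions (keep applied patches in a directory
and apply them all on restart) could look roughly like the sketch below.
It assumes patch files end in ".patch" and carry a sortable prefix such as
"001-", and that `apply_patch` is a callback supplied by the patch tool.
All of these names are hypothetical.

```python
import os


def patches_in_order(patch_dir):
    """Return the paths of all *.patch files in patch_dir, sorted by file name."""
    names = sorted(n for n in os.listdir(patch_dir) if n.endswith(".patch"))
    return [os.path.join(patch_dir, n) for n in names]


def reapply_all(patch_dir, apply_patch):
    """Apply every stored patch in order and return the paths that were applied.

    apply_patch is a hypothetical callback; an exception raised by it
    stops the loop so that later patches are not applied out of order.
    """
    applied = []
    for path in patches_in_order(patch_dir):
        apply_patch(path)
        applied.append(path)
    return applied
```

Sorting by file name keeps the restart order equal to the order in which
the patches were first applied, as long as the prefixes are assigned in
that order.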
>>The other big problem is that a very good patching system tends to get
>>"abused" and used to install major updates, major features, etc. Things
>>it is really not intended for.
>>
>>I don't think this is intended for general systems. I wouldn't want to
>>see it on workstations. But it is useful in very controlled environments.
>>
>>-Corey
>>
>>
>>> Tim
>>>
>>>
>>>
>>>
>>>>-Corey
>>>>
>>>>Ralf Flaxa wrote:
>>>>
>>>>
>>>>
>>>>
>>>>>Speaking for SUSE/Novell I can at least say that live patching is evil
>>>>>and would never be considered supported. How shall you guarantee support
>>>>>or certification with such a mechanism in place?
>>>>>
>>>>> Ralf
>>>>>
>>>>>On Mon, Mar 28, 2005 at 11:14:50AM +0900, Takashi Ikebe wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>The following is a use case for live patching. It
>>>>>>addresses AVL10.0 Live patching in CGL Specification 3.0.
>>>>>>Please feel free to send comments or suggestions.
>>>>>>
>>>>>>Takashi.
>>>>>>-----------------------------------------------------------------------------------------------
>>>>>>Description
>>>>>>OSDL CGL specifies that carrier grade Linux shall provide the mechanism
>>>>>>for dynamically replacing the symbols of a running process without
>>>>>>restarting. Dynamic replacement of symbols allows a process to access
>>>>>>patched functions or values without restarting and can improve process
>>>>>>availability.
>>>>>>
>>>>>>Desired Outcome
>>>>>>Mainline kernel acceptance or distro acceptance
>>>>>>
>>>>>>Participants/Roles
>>>>>>System administrators set up the requirement at installation. During
>>>>>>operation, system administrators apply patches using the requirement.
>>>>>>
>>>>>>Scenarios
>>>>>>During operation, system administrators apply a patch using the
>>>>>>requirement as follows:
>>>>>>1. System administrators make a patch file from a diff file or the new
>>>>>>version's source code.
>>>>>>2. System administrators load the patch into the process with the
>>>>>>provided live patch tool.
>>>>>>3. System administrators activate the patch in the process with the
>>>>>>provided live patch tool.
>>>>>>4. System administrators confirm whether the patch was applied correctly.
>>>>>>
>>>>>>Implementation Notes
>>>>>>An implementation of the requirement needs the following functions:
>>>>>>- A function that loads the patch file into the target process's
>>>>>>memory area.
>>>>>>- A function that overwrites the entry point of each target-process
>>>>>>function to be fixed with a branch instruction to the patch.
>>>>>>- A function that restores the overwritten code at the entry point.
>>>>>>- A function that unloads the patch file.
>>>>>>Through the above functions, the requirement provides on-line patching
>>>>>>of the target process.
>>>>>>The requirement needs to provide on-line patching even if the process
>>>>>>is multi-threaded or the environment is SMP, and the stop time of the
>>>>>>target process should not exceed 100 milliseconds.
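The load, activate, and restore functions listed above can be illustrated
with a small model. This is only a sketch under assumptions: it uses a
Python bytearray in place of the target process's memory, and it assumes
an x86 near jump (opcode 0xE9 followed by a 32-bit displacement relative
to the next instruction) as the branch instruction. The function names are
hypothetical. A real tool would write into the stopped target process and
would have to handle threads and SMP.

```python
import struct

JMP_REL32 = 0xE9  # x86 near-jump opcode, followed by a signed 32-bit displacement
JMP_LEN = 5       # 1 opcode byte + 4 displacement bytes


def load(memory, addr, patch_code):
    """Copy the patch code into the (simulated) process memory at addr."""
    memory[addr:addr + len(patch_code)] = patch_code


def activate(memory, entry, patch_addr):
    """Overwrite the function entry with a jump to the patch.

    Returns the original bytes so that restore() can undo the change.
    """
    saved = bytes(memory[entry:entry + JMP_LEN])
    # The displacement is counted from the instruction after the jump.
    rel = patch_addr - (entry + JMP_LEN)
    memory[entry:entry + JMP_LEN] = struct.pack("<Bi", JMP_REL32, rel)
    return saved


def restore(memory, entry, saved):
    """Put the saved entry bytes back, deactivating the patch."""
    memory[entry:entry + len(saved)] = saved
```

Keeping the saved bytes per entry point is what makes the patch removable
later; unloading is then a matter of restoring every entry point before
the patch area is released.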
>>>>>>
>>>>>>References
>>>>>>Pannus project: http://pannus.sourceforge.net/
>>>>>>Live patching implementation:
>>>>>>http://prdownloads.sourceforge.net/pannus/pannus_en.pdf
>>>>>>
>>>>>>
>>>>>>--
>>>>>>Takashi Ikebe
>>>>>>NTT Network Service Systems Laboratories
>>>>>>9-11, Midori-Cho 3-Chome Musashino-Shi,
>>>>>>Tokyo 180-8585 Japan
>>>>>>Tel : +81 422 59 4246, Fax : +81 422 60 4012
>>>>>>e-mail : ikebe.takashi at lab.ntt.co.jp
>>>>>>
>>>>>>
>>>>>>
>>>>>>
--
Takashi Ikebe
NTT Network Service Systems Laboratories
9-11, Midori-Cho 3-Chome Musashino-Shi,
Tokyo 180-8585 Japan
Tel : +81 422 59 4246, Fax : +81 422 60 4012
e-mail : ikebe.takashi at lab.ntt.co.jp