[sle-beta] Transactional Updates in Leap 15
jrd at netlab1.net
Thu Apr 5 09:09:00 MDT 2018
On 05/04/2018 14:31, Thorsten Kukuk wrote:
> On Thu, Apr 05, Joe Doupnik wrote:
>> My suggestion appears, in essence, in Windows. Some patches are done on
>> the fly. Others are stashed, the machine is rebooted, and the boot steps are
>> interrupted to make changes to sensitive things before proceeding.
> Correct, and do you know how much damage this "I have urgent things to do
> with my machine, but Windows won't let me because it wants to update
> itself" behaviour has already done?
> Just one example, since it was widely reported in the press in Germany:
> "Basketball team being relegated to lower division due to
> Windows Update".
> It may be OK for Microsoft to live with such bad press, but I don't
> want to read about Linux there.
> Just because Microsoft wasn't able to find a good solution, it does
> not mean there is no better solution ;)
>> The matter of long down time, or the possibility of it occurring, is
>> always with us. Where we can we provide fall back systems to provide
>> continued service and be insurance against possible irreversible changes.
> The number one feature request we get is still: long down times are
> unacceptable. The most often mentioned example is the Windows Updater ...
>> My suggestion side steps the issue of BTRFS or not, large disk space,
>> running with r/o file systems, complicated partitioning finesse designed in
>> at system creation time, etc. It safely deals with apps which are not built
>> to do freeze/thaw, and it allows for regression without further
>> complications. Of course it does not provide the running fallback.
> Don't know why you think you need complicated partitioning finesse.
> If you want to have reliable backups, you need to separate your data from your
> applications. But this does not require complicated partitioning finesse.
> It only requires that your applications conform to the Linux FHS.
>> The notion is worth thinking about. My own feeling is on the fly
>> patching of important systems is asking for trouble over the long term, and
>> I would rather not have that be in a life support system.
>> A <not so> humorous mental picture is a doctor operating upon
>> him/herself and discovering the need for three hands or careful visual
>> control or the need for specialized knowledge not yet accumulated, etc.
>> Another case of not knowing in advance what we don't know.
> Looks like you should really watch a presentation from me about transactional
> updates. What you describe is what everybody is doing today, but not what
> transactional update is doing.
On the down time aspect, again.
Yes, Windows Update can be a pain, no question about that. Worse is
when it won't come up sensibly after patching due to one thing or
another (for which we go into Safe mode, seek restore
points/os-only-snapshots, uninstall things, and other annoying emergency
steps). The other side of that coin is Windows tries the dormant system
(in boot code steps) approach for many patches, and for good reason.
The commercial solution to serious changes is CYA: employ a proper
replacement system while the first is being operated upon. Clustering to
the fore, particularly when time == money or customers become most
upset, or worse. Which in turn means do such changes off line, pretty
please. More important systems have more than one fall back system to
avoid having only one good system at any time. Patching live running
servers can't be relied upon that way, no matter how clever the authors,
because real server matters can be much more complicated than they know
and involve considerably more than a r/o root file system.
Thus the downtime problem is somewhat distinct from how one patches
systems or one tries to recover from related problems. Just how a site
chooses to go about this will be up to them, of course, but at least
understanding the risks and alternatives is necessary to make
intelligent choices. Meanwhile we try to explain the nuances to folks in
terms which they can assimilate.