[sle-beta] Transactional Updates in Leap 15

Thu Apr 5 06:25:43 MDT 2018

On 05/04/2018 08:40, Richard Brown wrote:
> Hi SLE Beta Testers,
>
> I know you're all busy with the SLE 15 beta and I don't want to distract you (too much) from that excellent
> work.
>
> But if you have any cycles left to spare, I have a request on behalf of the openSUSE project for some time
> testing a feature which may also be finding its way into future versions of SLE in some form.
>
> Leap 15 will be openSUSE's first stable version offering a System Role with Transactional Updates & Read-Only
> Root Filesystems. Already used in SUSE CaaSP & Kubic, this approach gives an even more robust method for
> ensuring that patches are applied correctly, completely, or not at all. If anything goes wrong, systems can be
> restored to their previous working state in seconds.
>
> A detailed introduction and quick start guide can be found here:
>   https://kubic.opensuse.org/blog/2018-04-04-transactionalupdates/
>
> Any thoughts, feedback, and bug reports will be greatly appreciated.
>
> Many Thanks,
>
---------
     The referenced doc is interesting to read and think about. Alas, 
patching nirvana is still on back-order.
     My thinking (yours will likely vary) is as follows. The 
snapshot-like temp area can be as large as the main o/s file system 
(root), because we put more into that area than just the o/s and we try 
to avoid over-partitioning. Of particular concern is the arrow of time: 
the running system keeps changing after the snapshot is taken, and that 
is also where the patches will eventually land. Thus the snapshot is out 
of date after the first such change to the running system. Think about 
databases, linked systems, memory-cached data and the like. A snapshot 
may therefore not be safe to restore in many cases.
     What makes more sense to me is to patch a quiescent system. That 
would mean accumulating the new change sets and then bringing up the 
system in a memory-based rescue mode in which the regular file systems 
are not otherwise enabled. The scheme then applies the patches one by 
one (with a log to revert), and if a failure occurs it considers undoing 
them all, with optional variations such as accepting some regardless. 
This eliminates concerns about open files, memory caches, partial 
transactions, interaction amongst machines and huge extra disk space, 
removes the restriction to BTRFS, and likely settles a few more nuances. 
It also avoids yet another installation-time-only option and thus can be 
used well after a machine has been built in an ordinary manner.
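     To make the one-by-one scheme concrete, here is a minimal sketch in 
Python of applying patches with a revert log and undoing them all on the 
first failure. Everything in it is an assumption for illustration: the 
function apply_all_or_nothing and the (name, apply, revert) patch shape 
are hypothetical and not part of zypper, transactional-update or any 
existing tool, and real patches would be package operations rather than 
Python callables.

```python
from typing import Callable, List, Tuple

# Hypothetical patch shape: (name, apply step, revert step).
Patch = Tuple[str, Callable[[], None], Callable[[], None]]


def apply_all_or_nothing(patches: List[Patch]) -> bool:
    """Apply patches in order; on the first failure, revert the ones
    already applied (newest first) and report False."""
    undo_log: List[Tuple[str, Callable[[], None]]] = []
    for name, apply_step, revert_step in patches:
        try:
            apply_step()
        except Exception:
            # Assumption: a failed apply step leaves nothing behind,
            # so only the earlier, successful patches need reverting.
            for _done_name, done_revert in reversed(undo_log):
                done_revert()
            return False
        # Record how to undo this patch only once it has succeeded.
        undo_log.append((name, revert_step))
    return True
```

     The "accept some regardless" variation would amount to stopping the 
revert loop early, or skipping entries flagged as safe to keep; the 
sketch leaves that out to stay short.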
     A bit of clever thinking suggests creating an alternative: a 
"patch-medic" virtual machine whose purpose is to collect new patches 
and apply them to a sedated real machine, thus avoiding the need for 
fancy patch mechanism(s) and whatnot built into each regular machine. 
Rescue mode is an existing step in that direction.
     Thanks,
     Joe D.