[sle-beta] Transactional Updates in Leap 15
Joe Doupnik
jrd at netlab1.net
Thu Apr 5 07:00:27 MDT 2018
On 05/04/2018 13:38, Thorsten Kukuk wrote:
> On Thu, Apr 05, Joe Doupnik wrote:
>
>> The referenced doc is interesting to read and think about. Alas,
>> patching nirvana is still on back-order.
>> My thinking (yours will likely vary) is as follows. The snapshot-like
>> temp area can be as large as the main o/s file system (root) because we put
>> more into that area than just the o/s and we try to avoid over partitioning
>> etc. Of particular concern is the arrow of time, which means changes occur
>> on the running system after snapping and where also patches will eventually
>> appear. Thus the snapshot becomes out of date after the first such change to
>> the running system. Think about databases, linked systems, memory cached
>> data and the like. Thus a snapshot may not be a safe item to restore in many
>> cases.
> Please show me a database that can write to a read-only filesystem ;)
> Since we work with a read-only root filesystem, you have to strictly separate
> data from applications, and only the applications are part of the snapshot.
> So you will never lose data.
>
>> What makes more sense to me is to patch a quiescent system. That would mean
>> accumulating the new change sets and then bringing up the system in memory-based
>> rescue mode where the regular file systems are otherwise not enabled. The
>> scheme then tries applying patches one by one (with a log to revert), and if
>> a failure occurs then consider undoing them all, with optional variations
>> about accepting some regardless and so forth. This eliminates concerns about
>> open files, memory caches, partial transactions, interaction amongst
>> machines, huge extra disk space, being restricted to BTRFS, and likely a
>> few more nuances. It also avoids yet another installation-time-only option
>> and thus can be used well after a machine has been built in an ordinary
>> manner.
> Your approach is what we offer today, but this does not solve any of the
> problems we solve with transactional-updates. Especially not the problem
> of a very long downtime of the machine and services for big updates.
>
> Thorsten
>
-----------
My suggestion appears, in essence, in Windows. Some patches are
done on the fly. Others are stashed, the machine is rebooted, and the
boot steps are interrupted to make changes to sensitive things before
proceeding.
The matter of long downtime, or the possibility of it occurring,
is always with us. Where we can, we provide fallback systems to continue
service and to insure against possible irreversible changes.
My suggestion sidesteps the issue of BTRFS or not, large disk
space, running with r/o file systems, complicated partitioning finesse
designed in at system creation time, etc. It safely deals with apps which
are not built to do freeze/thaw, and it allows for regression without
further complications. Of course it does not provide the running fallback.
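To make the "apply one by one, with a log to revert" idea concrete, here is a minimal sketch. It assumes each patch can be described as a name plus an apply step and an undo step; the `Patch` shape and the function `apply_all_or_revert` are hypothetical names for illustration only, not part of any existing SUSE or zypper tooling.

```python
from typing import Callable, List, Tuple

# Hypothetical shape: (name, apply step, undo step). Real patches would
# wrap package operations; plain callables stand in for them here.
Patch = Tuple[str, Callable[[], None], Callable[[], None]]


def apply_all_or_revert(patches: List[Patch]) -> Tuple[bool, List[str]]:
    """Apply patches in order; on the first failure, undo the ones
    already applied in reverse order and report what happened."""
    revert_log = []  # (name, undo) for every patch applied so far
    events = []      # human-readable record of each step
    for name, apply, undo in patches:
        try:
            apply()
        except Exception as exc:
            events.append(f"failed {name}: {exc}")
            # Walk the revert log backwards so later changes are undone first.
            for done_name, done_undo in reversed(revert_log):
                done_undo()
                events.append(f"reverted {done_name}")
            return False, events
        revert_log.append((name, undo))
        events.append(f"applied {name}")
    return True, events
```

This covers only the "undo them all" case; the optional variation of accepting some patches regardless of a later failure would need a per-patch policy that is not shown here.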
The notion is worth thinking about. My own feeling is that on-the-fly
patching of important systems is asking for trouble over the long term,
and I would rather not have that be in a life support system.
A <not so> humorous mental picture is a doctor operating upon
him/herself and discovering the need for three hands or careful visual
control or the need for specialized knowledge not yet accumulated, etc.
Another case of not knowing in advance what we don't know.
Thanks,
Joe D.