Design a downgrade process (#144)

Firstyear 2019-11-07 18:00:57 +10:00 committed by GitHub
parent 4f28bb5bc3
commit 5049eac1e4
2 changed files with 48 additions and 1 deletion

designs/downgrade.rst Normal file

@ -0,0 +1,47 @@
Downgrades
----------

It's inevitable that someone will find some issue that requires them to downgrade
their working copy of kanidmd - that means we have to understand that process,
and at least advertise or document how it should be done.

A major barrier to having a downgrade process is the nature of our in-place
migrations and upgrades. While we have a system that understands how to upgrade
data and make changes, an older version won't understand the newer types, so it
cannot perform the downgrade.

Consider that we add a new value type XDATA to replace what was previously a
UTF8STRING, between version 1.0 and 1.1. Version 1.1 will change all UTF8STRING
values to XDATA, which is what we want, but if we were to run version 1.0 again
it would not understand the XDATA field - it has no way to downgrade a type
that, to it, flatly doesn't exist.
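
As a rough illustration of why the old binary is stuck, consider a sketch like
the following (the type and function names are hypothetical, not the real
kanidmd schema code)::

    // Hypothetical value representations for each release.
    // Version 1.0 only knows this:
    enum ValueV10 {
        Utf8String(String),
    }

    // Version 1.1 adds the new type ...
    enum ValueV11 {
        Utf8String(String),
        XData(Vec<u8>),
    }

    // ... and knows how to lift old data forward on upgrade.
    fn upgrade(v: ValueV10) -> ValueV11 {
        match v {
            ValueV10::Utf8String(s) => ValueV11::XData(s.into_bytes()),
        }
    }

    // There is no downgrade(ValueV11) -> ValueV10 that version 1.0 could run:
    // the 1.0 binary has no XData variant at all, so it can only fail to parse.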

As a result this leaves one conclusion - we cannot support downgrades. Rather,
the correct behaviour for us to support is to advise admins to take a backup
before an upgrade, and to restore from that backup if anything goes wrong.

This will affect replication in two ways.

First, it means the RUV of a server node can move backwards. This requires us
to limit changelog trimming to events that have expired by time, rather than
trimming events as soon as they are fully resolved. That way, within the
changelog trim window, a server can be downgraded and its RUV can move
backwards, but the missing updates will be "replayed" back to it.
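
A time-based trim policy could look something like the following sketch (the
ChangeLog and Change types are placeholders, not the real changelog code)::

    use std::time::{Duration, SystemTime};

    // Placeholder types for the sketch.
    struct Change { csn: u64, received_at: SystemTime }
    struct ChangeLog { entries: Vec<Change> }

    impl ChangeLog {
        // Trim strictly by age: even fully resolved changes are kept until the
        // window has passed, so a downgraded replica whose RUV moved backwards
        // can still have the "missing" updates replayed to it.
        fn trim(&mut self, window: Duration) {
            let now = SystemTime::now();
            self.entries.retain(|c| {
                now.duration_since(c.received_at)
                    .map(|age| age < window)
                    .unwrap_or(true)
            });
        }
    }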

Second, it means we have to consider either making replication agnostic of the
versioned (typed) data, *or* having CSNs represent a dataset version from the
server, which gates or blocks replication events from newer to older instances
until *they* are upgraded.
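
A sketch of what the receiving side of such a gate could look like (the field
and type names are illustrative only - nothing here exists yet)::

    // Illustrative only: a CSN that carries the data version of its origin server.
    struct Csn {
        server_id: u64,
        timestamp: u64,
        data_version: u32,
    }

    // The highest data version this node understands.
    const LOCAL_DATA_VERSION: u32 = 1;

    // Gate: refuse changes produced under a newer data version than we support,
    // until this node is itself upgraded.
    fn accept_change(csn: &Csn) -> Result<(), String> {
        if csn.data_version > LOCAL_DATA_VERSION {
            Err(format!(
                "change from server {} requires data version {}, we support {}",
                csn.server_id, csn.data_version, LOCAL_DATA_VERSION
            ))
        } else {
            Ok(())
        }
    }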

Having the version gate does have a good benefit. Imagine we have three servers
A, B, C. We upgrade A and B, and they migrate UTF8STRING to XDATA. Server C has
not been upgraded.

This means that *all changes* from A and B post upgrade will NOT be sent to C.
C may accept changes and will continue to provide them to A and B (provided all
other update resolution steps hold). If we now revert B, the changes from A
will not flow to B, which has been downgraded, but the changes from C that were
accepted WILL continue to be accepted by B. The same applies to A. This means
that in a downgrade scenario any data written on upgraded nodes that are then
downgraded will be lost, but replication as a whole will remain valid. This is
good!

It does mean we need to consider that we also have to upgrade data as it comes
in via replication from an older server, to bring fields up to date if needed.
This may necessitate a "data version" field on each entry, which we can also
associate with any CSN so that it can be accepted or rejected as required.
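
Combining these ideas, the inbound replication path might look like this sketch
(again with hypothetical names and structures, since none of this exists yet)::

    // Hypothetical entry that records the data version it was written at.
    struct Entry {
        data_version: u32,
        attrs: Vec<(String, String)>,
    }

    const LOCAL_DATA_VERSION: u32 = 2;

    // On receive: reject anything newer than this node understands, and upgrade
    // anything older, so stored data is always at the local data version.
    fn receive(mut entry: Entry) -> Result<Entry, String> {
        if entry.data_version > LOCAL_DATA_VERSION {
            return Err("entry requires a newer data version; upgrade this node first".to_string());
        }
        while entry.data_version < LOCAL_DATA_VERSION {
            entry = upgrade_one_step(entry);
        }
        Ok(entry)
    }

    // A single migration step, e.g. 1 -> 2. The real migrations would live
    // alongside the schema and value types they change.
    fn upgrade_one_step(mut entry: Entry) -> Entry {
        // ... transform attrs for the next version here ...
        entry.data_version += 1;
        entry
    }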


@ -420,7 +420,7 @@ Let's imagine now we have two read-onlies, R1, R2.
So, at this point, A would contact both
SEE ALSO: Downgrades
Attribute Level Replication