From 5049eac1e4af912db77037e21e60f84f8595c12a Mon Sep 17 00:00:00 2001 From: Firstyear Date: Thu, 7 Nov 2019 18:00:57 +1000 Subject: [PATCH] Design a downgrade process (#144) --- designs/downgrade.rst | 47 ++++++++++++++++++++++++++ designs/repl_future_considerations.rst | 2 +- 2 files changed, 48 insertions(+), 1 deletion(-) create mode 100644 designs/downgrade.rst diff --git a/designs/downgrade.rst b/designs/downgrade.rst new file mode 100644 index 000000000..b6dbd0844 --- /dev/null +++ b/designs/downgrade.rst @@ -0,0 +1,47 @@ +Downgrades +---------- + +It's inevitable that someone will find some issue that requires them to downgrade +their working copy of kanidmd - that means we have to understand that process, +and at least advertise or document how it should be done. + +A major barrier for us to have a downgrade process is the nature of our inplace +migrations and upgrades - while we have a system that understands how to upgrade +data and make changes, when downgrading we won't be able to understand the newer +types to do a downgrade. + +Consider we add a new value type XDATA which was previous a UTF8STRING. We have +version 1.0 to 1.1. Version 1.1 will change all UTF8STRING to XDATA which is +what we want, but if we were to run version 1.0 again it would not understand +the XDATA field - not know how to downgrade since the type flatly doesn't exist. + +As a result this leaves one conclusion - we can not support downgrades. Rather +the correct behaviour for us to support is to advise admins to backup before +an upgrade, and to restore from the backup if anything goes wrong. + +This will affect replication in two ways + +First, it means the RUV of a server node can move backwards. This requires +us to limit changelog trimming of events to events that have expired by time +rather than events that are fully resolved. This way within the changelog +trim window, a server can be downgraded, and it's RUV move backwards, but the missing updates will be "replayed" backwards to it. + +Second, it means we have to consider making replication either version (typed) +data agnostic *or* have CSN's reperesent a dataset version from the server which gates or blocks replication events from newer to older instances until *they* are upgraded. + +Having the version gate does have a good benefit. Imagine we have three servers +A, B, C. We upgrade A and B, and they migrate UTF8STRING to XDATA. Server C has +not been upgraded. + +This means that *all changes* from A and B post upgrade will NOT be sent to C. C +may accept changes and will continue to provide them to A and B (provided all +other update resolution steps uphold). If we now revert B, the changes from A will +not flow to B which has been downgraded, but C's changes that were accepted WILL +continue to be acceptted by B. Similar with A. This means in a downgrade scenario +that any data writen on upgraded nodes that are downgraded will be lost, but +that all replication as a whole will still be valid. This is good! + +It does mean we need to consider that we have to upgrade data as it comes in from +replication from an older server too to bring fields up to date if needed. This +may necesitate a "data version" field on each entry, which we can also associate +to any CSN so that it can be accepted or rejected as required. diff --git a/designs/repl_future_considerations.rst b/designs/repl_future_considerations.rst index 3c3e17776..8f31bff6a 100644 --- a/designs/repl_future_considerations.rst +++ b/designs/repl_future_considerations.rst @@ -420,7 +420,7 @@ Let's imagine now we have two read-onlies, R1, R2. So, at this point, A would contact both - +SEE ALSO: Downgrades Attribute Level Replication