mirror of
https://github.com/kanidm/kanidm.git
synced 2025-02-23 12:37:00 +01:00
Docs update
parent
b4ea4fff89
commit
e1c41d549a
159
designs/memberof.rst
Normal file
|
@@ -0,0 +1,159 @@
|
|||
|
||||
MemberOf
|
||||
--------
|
||||
|
||||
MemberOf is a plugin that makes a fundamental tradeoff: precomputing
the relationships between a group and a user is more efficient than looking
up those relationships repeatedly.
|
||||
|
||||
There are a few reasons for this to exist.
|
||||
|
||||
The major one is that the question is generally framed as "what groups is this person
a member of". This is true of application access checks (is user in group Y?) and of nss calls,
i.e. 'id name'. As a result, we want to keep the data for our user and its groups close together.
Given the design of the KaniDM system, where we generally frame and present user id tokens, it
is upon the user that we want to keep the reference to its groups.
|
||||
|
||||
Now at this point one could ask "Why not just store the groups on the user in the first place?".
There is a security benefit to the relationship of "groups have members" rather than "users are
members of groups". That benefit is delegated administration. It is much easier to define access
controls over "who" can alter the content of a group, including the addition of new members, whereas
the ability to write to every user's memberOf attribute would mean that anyone with that right
could add anyone to any group.
|
||||
|
||||
For example, if Claire has write access to "library users", she can only add members to that group.
|
||||
|
||||
However, if users took memberships, for Claire to add "library users", we would need to either allow
Claire to arbitrarily write any group name to users, OR we would need to increase the complexity
of the ACI system to support validation of the content of changes.
|
||||
|
||||
|
||||
So as a result, from a user interaction viewpoint, management of groups that have members is the
simpler and more powerful solution. However, from a query and access viewpoint, the relationship
of which groups a user is a member of is the more useful structure.
|
||||
|
||||
To this end, we have the memberOf plugin. Given a set of groups and their members, update the reverse
reference on the users to contain the memberOf relationship to the group.
|
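As a minimal sketch of that reverse reference, assuming entries are identified by plain strings rather than uuids and ignoring nesting, the direct memberOf values can be derived from the groups' member sets. The function name ``direct_member_of`` and the map layout are illustrative only and are not taken from the plugin.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Build the reverse "memberOf" references from each group's "member" set.
/// Only direct membership is handled here; nesting is not resolved.
fn direct_member_of(
    groups: &BTreeMap<String, BTreeSet<String>>,
) -> BTreeMap<String, BTreeSet<String>> {
    let mut member_of: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for (group, members) in groups {
        for member in members {
            // Each member gains a reference back to the group that lists it.
            member_of
                .entry(member.clone())
                .or_default()
                .insert(group.clone());
        }
    }
    member_of
}
```

The write still happens on the group (where delegated administration is easy to express), while the read happens on the user.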
||||
|
||||
|
||||
There is one final benefit to memberOf - it gives us *fast* group nesting capability,
whereas resolving the full structure by inverse lookup takes N operations.
|
||||
|
||||
Design
|
||||
------
|
||||
|
||||
Due to the nature of this plugin, there is a single attribute - 'member' - whose content is examined
to build the relationship to others - 'memberOf'. We will examine a single group and user situation
without nesting. We assume the user already exists, as the situation where the group exists and we add
the user can't occur due to referential integrity (refint).
|
||||
|
||||
* Base Case
|
||||
|
||||
The base case is the state where memberOf:G-uuid is present in U:memberOf. When this case is met, no
action is taken. To determine this, we assert that entry pre:memberOf == entry post:memberOf in
the modification - i.e. no action was taken.
|
||||
|
||||
* Modify Case

As memberOf:G-uuid is not present in U:memberOf, we do a "modify" to add it. The modify will recurse
to the base case, which asserts that it is present and then returns.
|
||||
|
||||
|
||||
Now let's consider the nested case. G1 -> G2 -> U. We'll assume that G2 -> U already exists
|
||||
but that now we need to add G1 -> G2. This is now trivial to apply given that we use recursion
|
||||
to apply these changes.
|
||||
|
||||
An important aspect of this is that groups *also* contain memberOf attributes: This benefits us because
|
||||
we can then apply the memberOf from our group to the members of the group!
|
||||
|
||||
::

    G1              G2              U
    member: G2      member: U
                    memberOf: G1    memberOf: G1, G2
|
||||
|
||||
So at each step, if we are a group, we take our uuid, and add it to the set, and then make a present
|
||||
modification of our memberOf + our uuid. So translated:
|
||||
|
||||
::

    G1              G2              U
    member: G2      member: U
    memberOf: -     memberOf: -     memberOf: G2

    -> [ G1, ]

    G1              G2              U
    member: G2      member: U
    memberOf: -     memberOf: G1    memberOf: G2

    -> [ G2, G1 ]

    G1              G2              U
    member: G2      member: U
    memberOf: -     memberOf: G1    memberOf: G2, G1
|
||||
|
||||
It's important to note that we only recurse on Groups - nothing else. This is what breaks the
cycle on U, as memberOf is now fully applied.
|
||||
|
||||
|
||||
As a result of our base-case, we can now handle the most evil of cases: circular nested groups
|
||||
and cycle breaking.
|
||||
|
||||
::

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: --      memberOf: --      memberOf: --

    -> [ G1, ]

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: --      memberOf: G1      memberOf: --

    -> [ G2, G1 ]

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: --      memberOf: G1      memberOf: G1-2

    -> [ G3, G2, G1 ]

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: G1-3    memberOf: G1      memberOf: G1-2

    -> [ G3, G2, G1 ]

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: G1-3    memberOf: G1-3    memberOf: G1-2

    -> [ G3, G2, G1 ]

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: G1-3    memberOf: G1-3    memberOf: G1-3

    -> [ G3, G2, G1 ]

    G1                G2                G3
    member: G2        member: G3        member: G1
    memberOf: G1-3    memberOf: G1-3    memberOf: G1-3

    BASE CASE -> Application of G1-3 on G1 has no change. END.
|
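The walkthroughs above can be sketched as a small recursive routine. This is only an illustration under stated assumptions: entries are identified by short string ids instead of uuids, the ``Directory`` struct and ``apply`` method are hypothetical names, and the real plugin works through modify operations rather than in-memory sets. The base case is the "no change" check, which is also what terminates the circular example.

```rust
use std::collections::{BTreeMap, BTreeSet};

type Id = &'static str;

struct Directory {
    /// group -> its direct "member" values (only groups have entries here)
    member: BTreeMap<Id, BTreeSet<Id>>,
    /// entry -> the computed "memberOf" values
    member_of: BTreeMap<Id, BTreeSet<Id>>,
}

impl Directory {
    /// Push `group`'s memberOf plus its own id onto each of its members.
    fn apply(&mut self, group: Id) {
        let mut to_add = self.member_of.get(&group).cloned().unwrap_or_default();
        to_add.insert(group);
        let members = self.member.get(&group).cloned().unwrap_or_default();
        for m in members {
            let current = self.member_of.entry(m).or_default();
            let before = current.len();
            current.extend(to_add.iter().copied());
            // Base case: nothing changed, so stop (this also breaks cycles).
            if current.len() == before {
                continue;
            }
            // Only groups are recursed into; users have no "member" values.
            if self.member.contains_key(&m) {
                self.apply(m);
            }
        }
    }
}
```

Calling ``apply("G1")`` on the G1 -> G2 -> G3 -> G1 cycle leaves every group with memberOf G1-3 and then stops, because the final application onto G1 changes nothing.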
||||
|
||||
To supplement this, *removal* of a member from a group is the same process, except that we
use the "removed" modify keyword instead of "present". The base case remains the same: if no
changes occur, we have completed the operation.
|
||||
|
||||
|
||||
Considerations
|
||||
--------------
|
||||
|
||||
* Preventing recursion: As of course, we are
|
||||
|
||||
* Replication
|
||||
|
|
@@ -8,7 +8,7 @@ At first glance it may seem correct to no-op a change where the state is:
|
|||
name: william
|
||||
}
|
||||
|
||||
-with a "delete name; add name william".
+with a "purge name; add name william".
|
||||
|
||||
However, this doesn't express the full possibilities of the replication topology
in the system. The following events could occur:
|
||||
|
@@ -17,7 +17,6 @@ in the system. The follow events could occur:
|
|||
|
||||
DB 1 DB 2
|
||||
---- ----
|
||||
n: w
|
||||
del: name
|
||||
n: l
|
||||
del: name
|
||||
|
@@ -27,3 +26,409 @@ The events of DB 1 seem correct in isolation, to no-op the del and re-add, howev
|
|||
when the changelogs are replayed, they will then cause the events of DB 2 to
be the final state - whereas, by the timing of events, DB 1 should actually be the
final state.
|
||||
|
||||
By contrast, if you no-oped the purge of name:
|
||||
|
||||
::
|
||||
|
||||
DB 1 DB 2
|
||||
---- ----
|
||||
n: l
|
||||
n: w
|
||||
|
||||
Your final state is now n: [l, w] - note that we have an extra name field we didn't want!
|
||||
|
||||
|
||||
|
||||
CSN
|
||||
---
|
||||
|
||||
The CSN is a concept from 389 Directory Server. It is the Change Sequence Number of a modification
or event in the database. The CSN is a Lamport clock: it is the current time in UTC, but
it can never move *backwards*.
|
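A minimal sketch of such a clock follows, assuming the CSN is a plain count of seconds since the Unix epoch and that two changes must never share a CSN on one server. Both are simplifying assumptions, and ``CsnClock``, ``issue`` and ``issue_at`` are hypothetical names.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hands out CSNs that follow wall-clock time but never go backwards.
struct CsnClock {
    last: u64,
}

impl CsnClock {
    fn new() -> Self {
        CsnClock { last: 0 }
    }

    /// Issue a CSN for an explicit `now` (seconds since the Unix epoch, UTC).
    fn issue_at(&mut self, now: u64) -> u64 {
        // If the wall clock stalled or stepped back, advance by one instead.
        self.last = now.max(self.last + 1);
        self.last
    }

    /// Issue a CSN from the system clock.
    fn issue(&mut self) -> u64 {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        self.issue_at(now)
    }
}
```

If the system time is stepped backwards (for example by an NTP correction), the clock keeps counting up from the last issued value until real time catches up.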
||||
|
||||
RID
|
||||
---
|
||||
|
||||
The RID is a concept from 389 Directory Server. It is the Replica ID of a server. The RID must
be a unique value that identifies exactly one server.
|
||||
|
||||
CID
|
||||
---
|
||||
|
||||
The CID is a (rename?) of a concept from 389 Directory Server. It is the pair of CSN and RID, which
qualifies each change with the server it originated on and allows changes to be ordered across multiple servers.
|
||||
|
||||
As a result, this value is likely to be:
|
||||
|
||||
::
|
||||
|
||||
(CSN, RID)
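As a sketch of how that pair could be represented, assuming integer CSN and RID values (the field widths here are arbitrary), a struct that derives its ordering compares the CSN first and only falls back to the RID when two CSNs are equal. ``Cid`` and ``ordered`` are illustrative names.

```rust
/// A change identifier: the CSN first, then the RID of the originating server.
/// Deriving `Ord` compares fields in declaration order, so CIDs sort by CSN
/// and fall back to the RID only when two CSNs are equal.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Cid {
    csn: u64,
    rid: u16,
}

/// Return the changes in the order they should be replayed.
fn ordered(mut cids: Vec<Cid>) -> Vec<Cid> {
    cids.sort();
    cids
}
```

Two servers that generate the same CSN therefore still produce distinct, totally ordered CIDs.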
|
||||
|
||||
RUV
|
||||
---
|
||||
|
||||
The RUV is a concept from 389 Directory Server. It is the Replica Update Vector.

This is an array of RIDs, and their min-max CSN locations in the changelog for those RIDs. Min is the
oldest change in the log related to that RID, and max is the latest change in the log related
to that RID.
|
||||
|
||||
::

    Server A:
    |----------------------|
    | ID   | MIN   | MAX   |
    |----------------------|
    | 01   | 000   | 010   |
    | 02   | 002   | 005   |
    | 03   | 004   | 008   |
    |----------------------|
|
||||
|
||||
To translate, this says that for RID 01, we have CSN 000 through 010. We can use these two values to
|
||||
recreate the CID of the change itself.
|
||||
|
||||
Now, critically, it is important to be able to compare RUVs to determine what changes are required
to be sent, and in which order. Let's assume we have a second server with a RUV of:
|
||||
|
||||
::

    Server B:
    |----------------------|
    | ID   | MIN   | MAX   |
    |----------------------|
    | 01   | 005   | 008   |
    | 02   | 000   | 002   |
    | 03   | 004   | 012   |
    |----------------------|
|
||||
|
||||
So if we are to compare these, we can see that for ID 1, Server A has 000 -> 010, and B has 005 -> 008.
|
||||
You can make similar determinations for the other values.
|
||||
|
||||
Importantly, in this case we need to ensure the MAX of Server B is greater than or equal to our MIN for each RID.

Once we have asserted this, we can generate a list of CIDs to supply.
|
||||
|
||||
::
|
||||
|
||||
(003,02)
|
||||
(004,02)
|
||||
(005,02)
|
||||
(009,01)
|
||||
(010,01)
|
||||
|
||||
It's important to note, these have been ordered by their CID, primarily by CSN! After the replication completes Server B's
|
||||
RUV would now be:
|
||||
|
||||
::

    Server B:
    |----------------------|
    | ID   | MIN   | MAX   |
    |----------------------|
    | 01   | 005   | 010   |
    | 02   | 000   | 005   |
    | 03   | 004   | 012   |
    |----------------------|
|
||||
|
||||
There are some other notes here: Server B is *ahead* of us for RID 3, so we actually send nothing related to
|
||||
this: it's likely that Server B will connect to us later and will supply the changes 11, 12 to us.
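The comparison described above can be sketched as follows. This assumes a RUV is a map of RID to (MIN, MAX), that the supplier's changelog is available as a list of CIDs, and that a RID the peer has never seen counts as MAX 0. The names ``Ruv``, ``Cid`` and ``changes_to_supply`` are hypothetical.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Cid {
    csn: u64,
    rid: u16,
}

/// Per-RID (MIN, MAX) CSN range held in a server's changelog.
type Ruv = BTreeMap<u16, (u64, u64)>;

/// Select the changes from `changelog` (ours) that `theirs` has not seen yet.
/// Returns None if our changelog no longer reaches back far enough.
fn changes_to_supply(ours: &Ruv, theirs: &Ruv, changelog: &[Cid]) -> Option<Vec<Cid>> {
    for (rid, (our_min, _)) in ours {
        // A RID the peer has never seen is treated as MAX 0 (an assumption).
        let their_max = theirs.get(rid).map(|r| r.1).unwrap_or(0);
        if their_max < *our_min {
            return None;
        }
    }
    let mut out: Vec<Cid> = changelog
        .iter()
        .copied()
        .filter(|c| c.csn > theirs.get(&c.rid).map(|r| r.1).unwrap_or(0))
        .collect();
    // Order by CID: primarily by CSN, then by RID.
    out.sort();
    Some(out)
}
```

With the Server A and Server B tables above, and a changelog on A holding every CSN in its ranges, this yields (003,02), (004,02), (005,02), (009,01), (010,01) and nothing for RID 03.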
|
||||
|
||||
Consider also the case where two servers make a change at the same time. Both could generate an identical CSN
value, but because a CID is (CSN, RID), the events can still be ordered:
the server RID is used to break the tie.
|
||||
|
||||
|
||||
Repl Proto Ideas
|
||||
----------------
|
||||
|
||||
We should have push based replication. There should be two versions of the system:
|
||||
|
||||
* Entry Level Replication
|
||||
* Attribute Level Replication.
|
||||
|
||||
Both should be able to share the same RUV details.
|
||||
|
||||
Entry Based
|
||||
===========
|
||||
|
||||
This is the simpler version of the replication system. This is likely ONLY appropriate on a read-only
|
||||
consumer of data.
|
||||
|
||||
The read-only stores *no* server RID, and contains an initially empty RUV. The provider would then supply its
RUV to the consumer (so that it now has a state of where it is), but with all CSN MIN/MAX set to 0.
|
||||
|
||||
The list of CIDs is derived by RUV comparison, but instead of supplying the change log, the entries
|
||||
are sent whole, and the read-only blindly replaces them. We rely on the provider to have completed
|
||||
a correct entry update resolution process for this to make sense.
|
||||
|
||||
To achieve this, we store a list of CIDs and which entries were affected within each CID.

One can imagine a situation where two servers change the entry, but between
those changes the read-only is supplied the CID. We don't care in what order they did change,
only that a change *must* have occurred.
|
||||
|
||||
For example, let's take entry A with servers A and B, and read-only R.
|
||||
|
||||
::
|
||||
|
||||
A {
|
||||
data: ...
|
||||
uuid: x,
|
||||
}
|
||||
|
||||
CID-list:
|
||||
[
|
||||
(001, A): [x, ...]
|
||||
]
|
||||
|
||||
So the entry was created with CID (001, A). We connect to R and it has an empty RUV.
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV R:
|
||||
A 0/1 A 0/0
|
||||
|
||||
We then determine that the set of CIDs to transmit must be:
|
||||
|
||||
::
|
||||
|
||||
(001, A)
|
||||
|
||||
Referencing our CID list, we know that uuid: x was modified, so we transmit that to the server.
|
||||
|
||||
Now we add server B. The RUVs are now:
|
||||
|
||||
::

    RUV A:      RUV B:      RUV R:
    A 0/1       A 0/1       A 0/1
    B 0/0       B 0/0

    CID-list A:
    [
        (001, A): [x, ...]
    ]

    CID-list B:
    [
        (001, A): [x, ...]
    ]
|
||||
|
||||
At this point a change happens on B *and* A at almost the same time: We'll say B happened first
|
||||
in this case though:
|
||||
|
||||
::

    RUV A:      RUV B:      RUV R:
    A 0/3       A 0/1       A 0/1
    B 0/0       B 0/2

    CID-list A:
    [
        (001, A): [x, ...]
        (003, A): [x, ...]
    ]

    CID-list B:
    [
        (001, A): [x, ...]
        (002, B): [x, ...]
    ]
|
||||
|
||||
Remember, this protocol is ASYNC however. At this point something happens - server A replicates to R first, but
|
||||
without the changes from B yet. A RUV comparison yields that RUV R must be updated with the empty RUV B, but
|
||||
that the CID: (3, A) must be sent. The entry x is sent to R again.
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV B: RUV R:
|
||||
A 0/3 A 0/1 A 0/3
|
||||
B 0/0 B 0/2 B 0/0
|
||||
|
||||
CID-list A:
|
||||
[
|
||||
(001, A): [x, ...]
|
||||
(003, A): [x, ...]
|
||||
]
|
||||
|
||||
CID-list B:
|
||||
[
|
||||
(001, A): [x, ...]
|
||||
(002, B): [x, ...]
|
||||
]
|
||||
|
||||
Now, Server B connects to A and supplies its changes. Since the changes on B happened *before*
the changes on A, the CID slots in between the existing changes (and an update resolution would take
place, which is out of scope of this part of the design).
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV B: RUV R:
|
||||
A 0/3 A 0/1 A 0/3
|
||||
B 0/2 B 0/2 B 0/0
|
||||
|
||||
CID-list A:
|
||||
[
|
||||
(001, A): [x, ...]
|
||||
(002, B): [x, ...]
|
||||
(003, A): [x, ...]
|
||||
]
|
||||
|
||||
Next Server A again connects to Server R, and determines based on the RUV that the differences are: (2, B).
|
||||
|
||||
Consulting our CID-list, we see that entry X was changed in this CID. Here's what's important: the order of the change
|
||||
doesn't matter, because we take the *latest* version of UUID X, which has (1, A), (2, B) and (3, A) all
|
||||
fully resolved. We send the entry X as a whole, so all state of (2, B) and LATER changes are applied.
|
||||
|
||||
This now means that because the whole entry was sent, we can assert the entry had changes (2, B) and
|
||||
(3, A), so we can update the RUV R to:
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV B: RUV R:
|
||||
A 0/3 A 0/1 A 0/3
|
||||
B 0/2 B 0/2 B 0/2
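The selection step of this walkthrough can be sketched as below. It assumes the CID-list is a map of CID to the uuids touched by that change, and that the consumer's state is reduced to the MAX CSN it holds per RID. A ``char`` stands in for the RID, as in the diagrams. ``entries_to_send`` is a hypothetical name, and the second uuid ``y`` in the usage example is invented for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// (CSN, RID); a char stands in for the server RID, as in the walkthrough.
type Cid = (u64, char);

/// Given the supplier's CID-list (CID -> uuids touched by that change) and the
/// highest CSN the consumer holds per RID, return the uuids whose entries must
/// be sent whole. Ordering of the CIDs does not matter, only that they exist.
fn entries_to_send(
    cid_list: &BTreeMap<Cid, Vec<&'static str>>,
    consumer_max: &BTreeMap<char, u64>,
) -> BTreeSet<&'static str> {
    let mut uuids = BTreeSet::new();
    for ((csn, rid), touched) in cid_list {
        // A RID the consumer has never seen is treated as MAX 0.
        let seen = consumer_max.get(rid).copied().unwrap_or(0);
        if *csn > seen {
            uuids.extend(touched.iter().copied());
        }
    }
    uuids
}
```

Because the result is a set of uuids, an entry touched by several missing CIDs is still sent only once, in its latest fully resolved form.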
|
||||
|
||||
Now this protocol is not without flaws: read-onlies should only be supplied data by a single server,
as one could imagine the content of R flip-flopping while servers A and B are not in sync.
Consider a situation such as:
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV B: RUV R:
|
||||
A 0/3 A 0/1 A 0/3
|
||||
B 0/1 B 0/4 B 0/1
|
||||
|
||||
In this case, one can imagine B would then supply data, and when A received B's changes, it would again
supply to R. However, this can be easily avoided by adhering to the following:

* A server can only supply to a read-only if every CSN MAX in the destination's RUV is contained
  within (less than or equal to) the corresponding CSN MAX in the supplying server's RUV.
|
||||
|
||||
By following this, B would determine that as it does *not* have (3, A) (which is greater than the local
|
||||
RUV CSN MAX for A), it should not supply at this time. Once A and B resolve their changes:
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV B: RUV R:
|
||||
A 0/3 A 0/3 A 0/3
|
||||
B 0/1 B 0/4 B 0/1
|
||||
|
||||
Note that B has A's changes, but A does not yet have B's. Server B now does satisfy the RUV conditions
and COULD supply to R. Similarly, A now does not meet the conditions to supply to R until B replicates
to A. There could be a risk of starvation of R, however, in high write-load conditions. It could just
be preferable to allow the flip flop, but the risk there is a lack of overall consistency of the entire
server state. This risk is minimised by the fact that we support batching of operations, so all
changes should be complete as a whole, and that if changes happen on A in series, they must
logically be valid.
|
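The supply rule can be sketched as a simple check. This reads the rule the way the worked example applies it: B must not supply while R already holds (3, A) and B does not, i.e. no MAX in the destination may be ahead of the supplier's MAX for the same RID. The ``MaxRuv`` alias and ``can_supply`` function are hypothetical, and a missing RID is assumed to mean MAX 0.

```rust
use std::collections::BTreeMap;

/// RID -> highest CSN held for that RID.
type MaxRuv = BTreeMap<char, u64>;

/// True when the supplier already holds everything the read-only has, i.e. no
/// MAX in the destination is ahead of the supplier's MAX for the same RID.
fn can_supply(supplier: &MaxRuv, destination: &MaxRuv) -> bool {
    destination
        .iter()
        .all(|(rid, dest_max)| supplier.get(rid).copied().unwrap_or(0) >= *dest_max)
}
```

With R at A 0/3, B 0/1, server B at A 0/1 is refused, and is accepted once it has received A's changes and holds A 0/3.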
||||
|
||||
|
||||
Deletion of entries is a different problem: Due to the entry lifecycle, most entries actually
step to recycled, which would trigger the above process. Similarly, when recycle ends, we then
move to tombstone, which again triggers the above.

However, we must now discuss the tombstone purging process.

A tombstone would store the CID upon which it was ... well - tombstoned. As a result, the entry
itself is aware of its state.
|
||||
|
||||
The tombstone purge process would work by detecting the MIN RUV of all replicas. If the MIN RUV
|
||||
is greater than the tombstone CID, then it must be true that all replicas HAVE the tombstone as
|
||||
a tombstone and all changes leading to that fact (as URP would dictate that all servers would
|
||||
arrive at the same tombstone state). At this point, we can now safely remove the tombstone from our
|
||||
database, and no replication needs to occur - as all other replicas would also remove it! This applies
|
||||
to read-onlies as well.
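A sketch of the purge check follows, assuming each replica's RUV is a map of RID to (MIN, MAX) and that "MIN RUV greater than the tombstone CID" means the MIN for the tombstone's own RID has moved past the tombstone's CSN on every replica. That reading, and the name ``can_purge_tombstone``, are assumptions.

```rust
use std::collections::BTreeMap;

/// RID -> (MIN, MAX) CSN, as in the RUV tables.
type Ruv = BTreeMap<u16, (u64, u64)>;

/// A tombstone stamped with (csn, rid) may be purged once every replica's MIN
/// for that RID has moved past the tombstone's CSN.
fn can_purge_tombstone(csn: u64, rid: u16, replica_ruvs: &[Ruv]) -> bool {
    // With no RUVs to consult we cannot prove anything, so keep the tombstone.
    !replica_ruvs.is_empty()
        && replica_ruvs
            .iter()
            .all(|ruv| ruv.get(&rid).map_or(false, |r| r.0 > csn))
}
```

A replica that has no record of the RID at all is treated as not yet past the tombstone, which errs on the side of keeping it.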
|
||||
|
||||
However, this poses the question - how do we move the MIN RUV of a server? To achieve this we need
to assert that *all other servers* have at least moved past a certain state, allowing us to trim our
changelog UP TO the MIN RUV.
|
||||
|
||||
Let's consider the supplier to read-only situation first, as this is the simplest:
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV R:
|
||||
A 0/3 A 0/0
|
||||
|
||||
GRUV A:
|
||||
A:R ???
|
||||
|
||||
To achieve this, we need to view the RUV of every server we connect to, even the ROs despite their
lack of a RID (in fact this could be a reason to PROVIDE a RID to ROs).
We create a global RUV (GRUV) state which would look like
the following:
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV R:
|
||||
A 0/3 A 0/0
|
||||
|
||||
GRUV A:
|
||||
R (A: 0/0, )
|
||||
|
||||
So A has connected to R, polled the RUV and received a 0/0. We can now supply our changes to
R:
|
||||
|
||||
::
|
||||
|
||||
RUV A: --> RUV R:
|
||||
A 0/3 A 3/3
|
||||
|
||||
GRUV A:
|
||||
R (A: 0/0, )
|
||||
|
||||
As R is a read-only it has no concept of the changelog, so it sets MIN to MAX.
|
||||
|
||||
Now, we then poll the RUV again. Protocol-wise, RUV polling should be separate from supplying data!
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV R:
|
||||
A 0/3 A 3/3
|
||||
|
||||
GRUV A:
|
||||
R (A: 3/3, )
|
||||
|
||||
Now, we can see that the server R has changes MAX up to 3 - since this is the minimum of the set
of all MAX in GRUV, we can now purge the changelog of A up to MIN 3:
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV R:
|
||||
A 3/3 A 3/3
|
||||
|
||||
GRUV A:
|
||||
R (A: 3/3, )
|
||||
|
||||
And we are fully consistent!
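The trim decision can be sketched as taking the minimum of the polled MAX values. This assumes the GRUV is stored as a map of peer name to the MAX CSN last polled from that peer for each RID. The ``Gruv`` alias, the ``trim_point`` function and the second peer ``R2`` in the usage example are illustrative.

```rust
use std::collections::BTreeMap;

/// The GRUV as seen by one server: peer name -> (RID -> MAX CSN last polled).
type Gruv = BTreeMap<&'static str, BTreeMap<char, u64>>;

/// The changelog for `rid` can be trimmed up to the smallest MAX that any
/// polled peer reports. None means some peer has not been polled for this RID.
fn trim_point(gruv: &Gruv, rid: char) -> Option<u64> {
    if gruv.is_empty() {
        return None;
    }
    let mut lowest = u64::MAX;
    for maxes in gruv.values() {
        // A peer with no value for this RID blocks trimming entirely.
        lowest = lowest.min(*maxes.get(&rid)?);
    }
    Some(lowest)
}
```

In the example above, the GRUV on A holds R (A: 3/3), so the trim point for RID A is 3. A second peer that had only reached 1 would hold the trim point back to 1.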
|
||||
|
||||
Let's imagine now we have two read-onlies, R1, R2.
|
||||
|
||||
|
||||
|
||||
::
|
||||
|
||||
RUV A: RUV B: RUV R:
|
||||
A 0/3 A 0/1 A 0/3
|
||||
B 0/1 B 0/4 B 0/1
|
||||
|
||||
GRUV A:
|
||||
A:B ???
|
||||
A:R ???
|
||||
|
||||
So, at this point, A would contact both
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Attribute Level Replication
|
||||
===========================
|
||||
|
||||
TBD
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
|
@@ -827,6 +827,13 @@ impl<'a> QueryServerWriteTransaction<'a> {
|
|||
return plug_pre_res;
|
||||
}
|
||||
|
||||
// TODO: There is a potential optimisation here, where if
|
||||
// candidates == pre-candidates, then we don't need to store anything
|
||||
// because we effectively just did an assert. However, like all
|
||||
// optimisations, this could be premature - so we for now, just
|
||||
// do the CORRECT thing and recommit as we may find later we always
|
||||
// want to add CSN's or other.
|
||||
|
||||
let res: Result<Vec<Entry<EntryValid, EntryCommitted>>, SchemaError> = candidates
|
||||
.into_iter()
|
||||
.map(|e| e.validate(&self.schema))
|
||||
|
|