mirror of
https://github.com/kanidm/kanidm.git
synced 2025-02-23 20:47:01 +01:00
Design doc (#1111)
This commit is contained in:
parent
1908364075
commit
af33a4580f
153
kanidm_book/src/developers/designs/scim_migration_planning.md
Normal file
# SCIM and Migration Tooling

We need to be able to synchronise content from other directory or identity management systems.
To do this, we need the capability to have "pluggable" synchronisation drivers. This is because
not all deployments will be able to use our generic versions, or they may have customisations they
wish to perform that are unique to them.

To achieve this we need a layer of separation. This effectively becomes an "extract, transform,
load" process. In addition this process must be *stateful*, so that it can be run multiple times
or even continuously and it will bring Kanidm into synchronisation.

We refer to a "synchronisation" as meaning a complete successful extract, transform and load cycle.

There are three expected methods of using the synchronisation tools for Kanidm:

* Kanidm as a "read only" portal allowing access to its specific features and integrations. This is less of a migration, and more of a way to "feed" data into Kanidm without relying on its internal administration features.
* "Big Bang" migration. This is where all the data from another IDM is synchronised in a single execution and applications are swapped to Kanidm. This is rare in larger deployments, but may be used in smaller sites.
* Gradual migration. This is where data is synchronised to Kanidm and then both the existing IDM and Kanidm co-exist. Applications gradually migrate to Kanidm. At some point a "final" synchronisation is performed where Kanidm 'gains authority' over all identity data and the existing IDM is disabled.

In these processes there may be a need to "reset" the synchronised data. The diagram below shows the possible workflows which account for the above.

                              ┏━━━━━━━━━━━━━━━━━┓
                              ┃                 ┃
                              ┃    Detached     ┃
    ┌──────────────────────┬──┃ (Initial State) ┃◀─────────────────────────┐
    │                      │  ┃                 ┃                          │
    │                      │  ┗━━━━━━━━━━━━━━━━━┛                          │
    │                      └──────────────────────────┐                    │
    │                                                 │                    │
    ├───────────────────────┬─────────────────────┐   │                    │
    │  ┌─────────────┐      │   ┌─────────────┐   │   │   ┌─────────────┐  │
    │  │             │      │   │             │───┘   │   │             │  │
    │  │   Initial   │      │   │   Partial   │       │   │    Final    │  │
    └─▶│ Synchronise │──────┴──▶│ Synchronise │───────┴──▶│ Synchronise │──┤
       │             │          │             │           │             │  │
       └─────────────┘          └─────────────┘           └─────────────┘  │
              │                        │                                   │
              │                        │                  ┌─────────────┐  │
              │                        │                  │             │  │
              │                        │                  │    Purge    │  │
              └────────────────────────┴─────────────────▶│   Content   │──┘
                                                          │             │
                                                          └─────────────┘
Kanidm starts in a "detached" state from the external IDM source.

For Kanidm as a "read only" application source, the initial synchronisation is performed, followed by periodic
partial synchronisations. At any time a full initial synchronisation can re-occur to reset the data of the
provider. The provider can be reset and removed by a purge, which resets Kanidm to a detached state.

For a gradual migration, this process is the same as the read only application. However, when ready
to perform the final cut over, a final synchronisation is performed, which retains the data of the
external system and grants Kanidm the authority over it. This then moves Kanidm back to the detached
state, but with a full copy of the provider data.

A "big bang" migration is this same process, but the "final" synchronisation is the first and only
step required, where all data is loaded and authority over it is immediately granted to Kanidm.
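
The workflows described above can be read as a small state machine. The following is a minimal sketch of that reading in Rust, not Kanidm's implementation: the names `SyncState`, `SyncAction` and `transition` are hypothetical, and the choice to reject a partial synchronisation or purge while detached is an assumption drawn from the prose rather than something the design states outright.

```rust
/// Whether an external provider currently feeds data into Kanidm.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncState {
    /// No provider is attached. This is the initial state, and also the
    /// state reached after a final synchronisation or a purge.
    Detached,
    /// A provider is attached and remains the authority over synced entries.
    Attached,
}

/// The operations shown in the workflow diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncAction {
    Initial,
    Partial,
    Final,
    Purge,
}

/// Compute the state reached by applying `action` in `state`.
fn transition(state: SyncState, action: SyncAction) -> Result<SyncState, String> {
    use SyncAction::*;
    use SyncState::*;
    match (state, action) {
        // An initial sync attaches a provider, or resets an attached one.
        (Detached, Initial) | (Attached, Initial) => Ok(Attached),
        // Partial syncs keep an attached provider up to date.
        (Attached, Partial) => Ok(Attached),
        // A final sync retains the data and hands authority to Kanidm.
        // From Detached this is the "big bang" migration.
        (Detached, Final) | (Attached, Final) => Ok(Detached),
        // A purge removes the provider's content.
        (Attached, Purge) => Ok(Detached),
        // Assumption: with no provider attached there is nothing to
        // partially update or purge.
        (Detached, Partial) | (Detached, Purge) => {
            Err(format!("{:?} is not valid while detached", action))
        }
    }
}
```

Under this reading, a gradual migration is `Initial`, any number of `Partial` steps, then `Final`, while a "big bang" migration is a single `Final` applied to the detached state.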
## ETL process

### Extract

First a user must be able to retrieve their data from their supplying IDM source. Initially
we will target LDAP and systems with LDAP interfaces, but in the future there is no barrier
to supporting other transports.

To achieve this, we initially provide synchronisation primitives in the
[ldap3 crate](https://github.com/kanidm/ldap3).

### Transform

This process will be custom developed by the user, or may have a generic driver that we provide.
Our generic tools may provide attribute mapping abilities so that we can allow some limited
customisation.
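
As an illustration of what "attribute mapping" could mean in a generic driver, here is a minimal sketch in Rust. It is an assumption about the shape of such a tool, not a description of one that exists: the function `map_attributes`, the representation of an entry as a map of attribute names to string values, and the example mapping from `uid` to `name` are all hypothetical.

```rust
use std::collections::HashMap;

/// Rename the attributes of a source entry according to `mapping`.
/// Source attribute names are matched case-insensitively, so the keys of
/// `mapping` are expected to be lowercase. Attributes without a mapping
/// are dropped rather than passed through.
fn map_attributes(
    entry: &HashMap<String, Vec<String>>,
    mapping: &HashMap<&str, &str>,
) -> HashMap<String, Vec<String>> {
    entry
        .iter()
        .filter_map(|(name, values)| {
            let lower = name.to_lowercase();
            mapping
                .get(lower.as_str())
                .map(|target| (target.to_string(), values.clone()))
        })
        .collect()
}
```

Dropping unmapped attributes is a deliberate choice in this sketch: it keeps the transform step explicit about what reaches the load step, at the cost of requiring a mapping entry for every attribute that should be kept.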
### Load

Finally, to load the data into Kanidm, we will make a SCIM interface available. SCIM is a
"spiritual successor" to LDAP, and aligns with Kanidm's design. SCIM allows structured data
to be uploaded (unlike LDAP which is simply strings). Because of this SCIM will allow us to
expose more complex types that previously we have not been able to provide.

The largest benefit of SCIM's model is its ability to perform "batched" operations, which work
with Kanidm's transactional model to ensure that during load events, content is always valid
and correct.
## Configuring a Synchronisation Provider in Kanidm

Kanidm has a strict transactional model with full ACID compliance. Attempting to create an external
model that needs to interoperate with Kanidm's model and ensure both are compliant is fraught with
danger. As a result, Kanidm sync providers *should* be stateless, acting only as an ETL bridge.

Additionally, sync providers need permissions to access and write to content in Kanidm, so it also
necessitates Kanidm being aware of the sync relationship.

For this reason a sync provider is a derivative of a service account, which also allows storage of
the *state* of the synchronisation operation. An example of this is that LDAP syncrepl provides a
cookie defining the "state" of what has been "consumed up to" by the ETL bridge. During the
load phase the modified entries *and* the cookie are persisted. This means that if the operation fails
the cookie also rolls back, allowing a retry of the sync. If it succeeds the next sync knows that
Kanidm is in the correct state. Graphically:

    ┌────────────┐                    ┌────────────┐                   ┌────────────┐
    │            │                    │            │     Retrieve      │            │
    │            │                    │            │──────Cookie──────▶│            │
    │            │                    │            │                   │            │
    │            │                    │            │     Provide       │            │
    │            │                    │            │◀────Cookie────────│            │
    │            │    Sync Request    │            │                   │            │
    │  External  │◀───With Cookie─────│    ETL     │                   │            │
    │    IDM     │                    │   Bridge   │                   │   Kanidm   │
    │            │   Sync Response    │            │                   │            │
    │            │────New Cookie─────▶│            │                   │            │
    │            │                    │            │                   │            │
    │            │                    │            │  Upload Entries   │            │
    │            │                    │            │──Persist Cookie──▶│            │
    │            │                    │            │                   │            │
    │            │                    │            │◀─────Result───────│            │
    └────────────┘                    └────────────┘                   └────────────┘

At any point the operation *may* fail. Locking the state to the upload of entries guarantees that
when success is reported, the upload has succeeded and been persisted. A success really means it!
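
The property described here, that the entries and the cookie either both persist or both roll back, can be sketched as follows. This is an in-memory illustration in Rust under stated assumptions, not Kanidm's storage layer: `SyncStore` and `apply_sync` are hypothetical names, entries are reduced to string pairs, and "an empty value is invalid" stands in for whatever validation a real load would perform.

```rust
use std::collections::HashMap;

/// A toy stand-in for the server side of a sync relationship.
#[derive(Debug, Clone, PartialEq, Default)]
struct SyncStore {
    /// The last cookie that was committed together with its entries.
    cookie: Option<String>,
    /// Entry id -> value. Real entries are structured, not single strings.
    entries: HashMap<String, String>,
}

impl SyncStore {
    /// Apply a set of changes and the new cookie as one unit.
    /// On any error, neither the entries nor the cookie are changed.
    fn apply_sync(&mut self, new_cookie: &str, changes: &[(&str, &str)]) -> Result<(), String> {
        // Stage everything on a copy, so a failure leaves `self` untouched.
        let mut staged = self.clone();
        for (id, value) in changes {
            if value.is_empty() {
                return Err(format!("entry {} is invalid", id));
            }
            staged.entries.insert(id.to_string(), value.to_string());
        }
        staged.cookie = Some(new_cookie.to_string());
        // Commit point: entries and cookie become visible together.
        *self = staged;
        Ok(())
    }
}
```

If `apply_sync` fails, the stored cookie is still the previous one, so the ETL bridge can repeat the same sync request against the external IDM and retry.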
## SCIM

### Authentication to the endpoint

This will be based on Kanidm's existing authentication infrastructure, allowing service accounts
to use bearer tokens. These tokens will internally enforce that changes from the account MUST contain
the associated state identifier (cookie).
### Batch Operations

Per [RFC 7644 section 3.7](https://datatracker.ietf.org/doc/html/rfc7644#section-3.7).

A requirement of the sync account will be a PATCH request to update the state identifier as the
first operation of the batch request. Failure to do so will result in an error.
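
A minimal sketch of how that requirement might be checked is shown below, in Rust. The four methods are the ones RFC 7644 allows in a bulk request, but everything else is an assumption made for illustration: the `BulkOperation` struct is reduced to a method and a path, `check_batch` is a hypothetical function, and the `/SyncState` path is invented because the design does not yet say where the state identifier lives.

```rust
/// HTTP methods permitted for operations inside a SCIM bulk request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Method {
    Post,
    Put,
    Patch,
    Delete,
}

/// One operation of a bulk request, reduced to the fields checked here.
#[derive(Debug)]
struct BulkOperation {
    method: Method,
    path: String,
}

/// Hypothetical path of the sync state resource. Not defined by the design.
const SYNC_STATE_PATH: &str = "/SyncState";

/// Reject any batch whose first operation is not a PATCH of the sync state.
fn check_batch(ops: &[BulkOperation]) -> Result<(), String> {
    match ops.first() {
        Some(op) if op.method == Method::Patch && op.path == SYNC_STATE_PATH => Ok(()),
        Some(op) => Err(format!(
            "first operation must PATCH {}, found {:?} {}",
            SYNC_STATE_PATH, op.method, op.path
        )),
        None => Err("batch contains no operations".to_string()),
    }
}
```

Checking only the first operation keeps the rule cheap to enforce before any entry in the batch is processed.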
### Schema and Attributes

SCIM defines a number of "generic" schemas for Users and Groups. Kanidm will provide its own
schema definitions that extend or replace these. TBD.
## Post Migration Concerns

### Reattaching a Provider Post Final Sync

In the case that a provider is re-attached after it has been through a final synchronisation,
entries that Kanidm now has authority over will NOT be synced and will be highlighted as conflicts.
The administrator then needs to decide how to proceed with these conflicts, determining which data
source is the authority on the information.
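
The conflict rule above amounts to splitting the incoming entries into those that may be synced and those that must be reported. The sketch below shows that split in Rust under simplifying assumptions: entries are identified by plain strings, the set of entries Kanidm has authority over is given as a `HashSet`, and `partition_incoming` is a hypothetical helper rather than part of Kanidm.

```rust
use std::collections::HashSet;

/// Split incoming entry ids into (to_sync, conflicts).
/// An id is a conflict when Kanidm already has authority over that entry.
fn partition_incoming<'a>(
    incoming: &[&'a str],
    kanidm_owned: &HashSet<&str>,
) -> (Vec<&'a str>, Vec<&'a str>) {
    incoming
        .iter()
        .copied()
        .partition(|id| !kanidm_owned.contains(*id))
}
```

The conflicts list is only reported, never applied. Resolving it is left to the administrator, as described above.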