This completely reworks how we approach and handle cryptographic keys in Kanidm. This is needed as a foundation for replication coordination which will require handling and rotation of cryptographic keys in automated ways.
This change influences many other parts of the code base in it's implementation.
The primary influences are:
* Modification of how domain user signing keys are revoked or rotated.
* Merging of all existing service-account token keys are retired (retained) keys into the domain to simplify token signing and validation
* Allowing multiple configurations of local command line tools to swap between instances using disparate signing keys.
* Modification of key retrieval to be key id based (KID), removing the need to embed the JWK into tokens
A side effect of this change is that most user authentication sessions and oauth2 sessions will have to be re-established after upgrade. However we feel that session renewal after upgrade is an expected side effect of an upgrade.
In the future this lays the ground work to remove a large number of legacy key handling processes that have evolved, which will allow large parts of code to be removed.
When an admin configured oauth2 custom claims during the creation it
was not enforced that at least one value must be present. This led to
an incorrect logic flaw in str_concat! which didn't handle the 0 case.
This hardens str_concat! to prevent the thread crash by using itertools
for the join instead, and it enforces stricter validation on the valueset
to deny creation of empty claims.
This fix has a low security impact as only an administrator or high
level user can trigger this as a possible denial of service.
Fixes#2680Fixes#2681
Fixes#2601Fixes#393 - gid numbers can be part of the systemd nspawn range.
Previously we allocated gid numbers based on the fact that uid_t is a u32, so we allowed 65536 through u32::max. However, there are two major issues with this that I didn't realise. The first is that anything greater than i32::max (2147483648) can confuse the linux kernel.
The second is that systemd allocates 524288 through 1879048191 to itself for nspawn.
This leaves with with only a few usable ranges.
1000 through 60000
60578 through 61183
65520 through 65533
65536 through 524287
1879048192 through 2147483647
The last range being the largest is the natural and obvious area we should allocate from. This happens to nicely fall in the pattern of 0x7000_0000 through 0x7fff_ffff which allows us to take the last 24 bits of the uuid then applying a bit mask we can ensure that we end up in this range.
There are now two major issues.
We have now changed our validation code to enforce a tighter range, but we may have already allocated users into these ranges.
External systems like FreeIPA allocated uid/gid numbers with reckless abandon directly into these ranges.
As a result we need to make two concessions.
We *secretly* still allow manual allocation of id's from 65536 through to 1879048191 which is the nspawn container range. This happens to be the range that freeipa allocates into. We will never generate an ID in this range, but we will allow it to ease imports since the users of these ranges already have shown they 'don't care' about that range. This also affects SCIM imports for longer term migrations.
Second is id's that fall outside the valid ranges. In the extremely unlikely event this has occurred, a startup migration has been added to regenerate these id values for affected entries to prevent upgrade issues.
An accidental effect of this is freeing up the range 524288 to 1879048191 for other subuid uses.
* fixing up error handling for prctl calls
* minor clippy lintypoos
* making clippy happier
* clippizing a test
* more clippy-calming
* adding tpm-udev to ubuntu flows for testing
* rebuilt wasm
* moving from rg to grep because someone doesn't like nice things
* such clippy like wow
* clippy config to the rescue
Fixes two major issues with replication.
The first was related to server refreshes. When a server was refreshed it would retain it's server unique id. If the server had lagged and was disconnected from replication and administrator would naturally then refresh it's database. This meant that on next tombstone purge of the server, it's RUV would jump ahead causing it's refresh-supplier to now believe it was lagging (which was not the case).
In the situation where a server is refreshed, we reset the servers unique replication ID which avoids the RUV having "jumps".
The second issue was related to RUV trimming. A server which had older RUV entries (say from servers that have been trimmed) would "taint" and re-supply those server ID's back to nodes that wanted to trim them. This also meant that on a restart of the server, that if the node had correctly trimmed the server ID, it would be re-added in memory.
This improves RUV trimming by limiting what what compare and check as a supplier to only CID's that are within the valid changelog window. This itself presented challenges with "how to determine if a server should be removed from the RUV". To achieve this we now check for "overlap" of the RUVS. If overlap isn't occurring it indicates split brain or node isolation, and replication is stopped in these cases.
When a delete of an entry occurs which is reference by another entry,
if the entry has a MUST schema condition on the deleted entry then the
delete should be blocked to prevent the entries structure becoming
invalid.