* added `thread_count` configuration for the server
* added `thread_count` to orca
---------
Co-authored-by: Sebastiano Tocci <sebastiano.tocci@proton.me>
Thanks to @Seba-T's work with Orca, we were able to identify a number of performance issues in certain high load conditions.
This commit contains fixes for the following issues
* Unbounded Memory Growth - due to how ARCache works, to maintain temporal consistency it must retain copies of keys (not values) in a special data set for tracking. The Filter Resolve Cache was using unresolved filters as keys. This caused memory explosions when refint or memberof were updating a group with a large number of members because they would emit a query with hundreds of filter terms that would only be used once and never again, causing the ARCache haunted set to grow without bound. To limit this, we no longer cache large/complex queries for resolution, and in future we may implement some other methods to reduce this like sha256/hmac of the queries.
* When creating a new account, dyngroups would be engaged to add the account as a member due to the matching scope. However the change to the dyngroup was triggering an update of all the dyngroups *members* related memberof attributes. This would mean that adding an account would trigger every other account to be loaded an updated.
* When memberof would iterate over leaf entries and update them one at a time. This mean a large number of small fragmented queries in the case of a lot of leaf entries being updated. Now leaf entries are updated in a single stripe once groups are stabilised.
* Member of would always trigger it's members to always update. Instead, we should only update members where a difference is observed, or all members if the group's memberof itself has changed since this needs to propogate to all leaf entries. This significantly reduces the amount of writes and operations to examine the changed member of set.
* Referential integrity would examine all reference uuids on entries for validity rather than just the reference uuids that were altered within the transaction. This change means that only uuids that were *added* are validated during an operation.
* During async write backs (delayed actions) these were performed one at a time. Instead, when possible this should be done in a single transaction as the write transaction caches all writes in memory until the commit meaning that by batching we reduce overall latency.
* In the server there can only be one write transaction and many readers. These are guarded by tokio semaphores that act as fair queues - first in gets the lock next. Due to the design of the server readers would be blocked on the *database* semaphore, and writers would block on the write semaphore and THEN the database semaphore. This arrangement was creating a situation which unfairly advantaged readers over writers, as any write would first have to become the head of it's queue, and then compete with all readers to access a db transaction. Instead, we now have a reader semaphore with size threads minus 1, clamped at a minimum of 1. This means that provided there are two or more threads, then a writer will *always* have a database handle available, and readers will pre-queue with each other before queueing on the db ticket. If there is only one thread, then writes and reads will alternate between each other fairly.
* disable mimalloc on illumos, in part because it immediately segfaults,
but also because we prefer libumem and link it into all Rust binaries
* switch from fs2 (unmaintained crate) to fs4 which provides the same
interface and has wider platform support
While basking under the shade of the coolabah tree, I was overcome by an intense desire to improve the performance and memory usage of Kanidm.
This pr reduces a major source of repeated small clones, lowers default log level in testing, removes some trace fields that are both large and probably shouldn't be traced, and also changes some lto settings for release builds.
* kanidm cli logs on debug level - Fixes#2745
* such clippy like wow
* It's important for a wordsmith to know when to get its fixes in.
* updootin' wasms
Improve the models and what can be performed in the orca benchmarks.
---------
Co-authored-by: Sebastiano Tocci <seba.tocci@gmail.com>
Co-authored-by: Sebastiano Tocci <sebastiano.tocci@proton.me>
This completely reworks how we approach and handle cryptographic keys in Kanidm. This is needed as a foundation for replication coordination which will require handling and rotation of cryptographic keys in automated ways.
This change influences many other parts of the code base in it's implementation.
The primary influences are:
* Modification of how domain user signing keys are revoked or rotated.
* Merging of all existing service-account token keys are retired (retained) keys into the domain to simplify token signing and validation
* Allowing multiple configurations of local command line tools to swap between instances using disparate signing keys.
* Modification of key retrieval to be key id based (KID), removing the need to embed the JWK into tokens
A side effect of this change is that most user authentication sessions and oauth2 sessions will have to be re-established after upgrade. However we feel that session renewal after upgrade is an expected side effect of an upgrade.
In the future this lays the ground work to remove a large number of legacy key handling processes that have evolved, which will allow large parts of code to be removed.
Fixes#2601Fixes#393 - gid numbers can be part of the systemd nspawn range.
Previously we allocated gid numbers based on the fact that uid_t is a u32, so we allowed 65536 through u32::max. However, there are two major issues with this that I didn't realise. The first is that anything greater than i32::max (2147483648) can confuse the linux kernel.
The second is that systemd allocates 524288 through 1879048191 to itself for nspawn.
This leaves with with only a few usable ranges.
1000 through 60000
60578 through 61183
65520 through 65533
65536 through 524287
1879048192 through 2147483647
The last range being the largest is the natural and obvious area we should allocate from. This happens to nicely fall in the pattern of 0x7000_0000 through 0x7fff_ffff which allows us to take the last 24 bits of the uuid then applying a bit mask we can ensure that we end up in this range.
There are now two major issues.
We have now changed our validation code to enforce a tighter range, but we may have already allocated users into these ranges.
External systems like FreeIPA allocated uid/gid numbers with reckless abandon directly into these ranges.
As a result we need to make two concessions.
We *secretly* still allow manual allocation of id's from 65536 through to 1879048191 which is the nspawn container range. This happens to be the range that freeipa allocates into. We will never generate an ID in this range, but we will allow it to ease imports since the users of these ranges already have shown they 'don't care' about that range. This also affects SCIM imports for longer term migrations.
Second is id's that fall outside the valid ranges. In the extremely unlikely event this has occurred, a startup migration has been added to regenerate these id values for affected entries to prevent upgrade issues.
An accidental effect of this is freeing up the range 524288 to 1879048191 for other subuid uses.
* Refactor: move the object graph ui to admin web ui
* Add dynamic js loading support
Load viz.js dynamically
* Add some js docs
* chore: cleanup imports
* chore: remove unused clipboard feature
chore: remove unused mermaid.sh
* Messing with the profile.release settings and reverting the changes I tried has now made the build much smaller yay :D
* Refactor: user raw search requests
Assert service-accounts properly
* refactor: new v1 proto structure
* Add self to CONTRIBUTORS.md
---------
Co-authored-by: James Hodgkinson <james@terminaloutcomes.com>
* fixing up error handling for prctl calls
* minor clippy lintypoos
* making clippy happier
* clippizing a test
* more clippy-calming
* adding tpm-udev to ubuntu flows for testing
* rebuilt wasm
* moving from rg to grep because someone doesn't like nice things
* such clippy like wow
* clippy config to the rescue
* service-account validity expire-at doesn't accept all time nouns as defined by docs
Fixes#2153
* realised a logic bug
* making clippy happy while I'm here
* returning an empty set from the creds if the creds attribute is not found, which is then handled downstream
* adding some test coverage because there was some rando panic-inducing thing
* ldap constants
* documenting a macro
* helpful weird errors
* the war on strings continues
* less json more better
* testing things fixing bugs
* idm_domain_reset_token_key wasn't working, added a test and fixed it (we weren't testing it)
* idm_domain_set_ldap_basedn - adding tests
* adding testing for idm_account_credential_update_cancel_mfareg
* warning of deprecation
* pulling out exitcode, adding hyper dep to handle errors (was already transitively there due to reqwest)
* adding better error handling, more options for client things
Refers #1987
Notable changes:
- in server/lib/src/entry.rs - aiming to pass the enum instead of the strings
- changed signature of add_ava to take Attribute instead of &str (which is used in the entry_init macro... which was fun)
- set_ava<T> now takes Attribute
- added TryFrom<&AttrString> for Attribute
* wopsies, missing imports
* more clippy and fmt
* adding test build for kanidm with idv-tui feature
* making codespell happy
---------
Co-authored-by: James Hodgkinson <james@terminaloutcomes.com>
* Starting to chase down testing
* commenting out unused/inactive endpoints, adding more tests
* clippyism
* making clippy happy v2
* testing when things are not right
* moar checkpoint
* splitting up testkit things a bit
* moving https -> tide
* mad lad be crabbin
* spawning like a frog
* something something different spawning
* woot it works ish
* more server things
* adding version header to requests
* adding kopid_middleware
* well that was supposed to be an hour... four later
* more nonsense
* carrying on with the conversion
* first pass through the conversion is DONE!
* less pub more better
* session storage works better, fixed some paths
* axum-csp version thing
* try a typedheader
* better openssl config things
* updating lockfile
* http2
* actually sending JSON when we say we will!
* just about to do something dumb
* flargl
* more yak shaving
* So many clippy-isms, fixing up a query handler bleep bloop
* So many clippy-isms, fixing up a query handler bleep bloop
* fmt
* all tests pass including basic web logins and nav
* so much clippyism
* stripping out old comments
* fmt
* commenty things
* stripping out tide
* updates
* de-tiding things
* fmt
* adding optional header matching ,thanks @cuberoot74088
* oauth2 stuff to match #1807 but in axum
* CLIPPY IS FINALLY SATED
* moving scim from /v1/scim to /scim
* one day clippy will make sense
* cleanups
* removing sketching middleware
* cleanup, strip a broken test endpoint (routemap), more clippy
* docs fmt
* pulling axum-csp from the wrong cargo.toml
* docs fmt
* fmt fixes
logging changes:
* Offering auth mechanisms -> debug
* 404's aren't really warnings
* double tombstone message, one goes to debug
other changes:
* CSP changes to allow the bootstrap images to load
* more testing javascriptfile things, I R
* it's nice to know where things are
* putting non-rust web things in static/ instead of src/
* RequestCredentials::SameOrigin is the default, also adding a utility function to save dupe code. Wow this saved... kilobytes.
* removing commented code, fixing up codespell config
* clippyisms
* wtf, gha
* dee-gloo-ing some things
* adding some ubuntu build test things
* sigh rustwasm/wasm-pack/issues/1138
* more do_request things
* packaging things
* hilarious dev env setup script
* updated script works, all the UI works, including the experimental UI for naughty crabs
* deb package fixes
* fixed some notes
* setup experimental UI tweaks