Previously on schema definitions for attributes, the list of index
types was manually set on attributes. The issue with this approach is
that not all index types apply to all attribute syntaxes. This made it
error prone not just to Kanidm developers, but to future users who
want to define custom attributes and may incorrectly index those
attributes.
Instead, this changes the index value to be a boolean to indicate
if this attribute should or should not be indexed. Internally Kanidm
has a list of appropriate indexes to apply to these syntax types.
As part of this change, the tests were reviewed to find missing index
types for syntaxes, and other causes of unindexed searches which led
to some changes around the dyngroup plugin (which pushes the boundaries
of a lot of things in Kani due to how it works).
Thanks to @Seba-T's work with Orca, we were able to identify a number of performance issues in certain high load conditions.
This commit contains fixes for the following issues
* Unbounded Memory Growth - due to how ARCache works, to maintain temporal consistency it must retain copies of keys (not values) in a special data set for tracking. The Filter Resolve Cache was using unresolved filters as keys. This caused memory explosions when refint or memberof were updating a group with a large number of members because they would emit a query with hundreds of filter terms that would only be used once and never again, causing the ARCache haunted set to grow without bound. To limit this, we no longer cache large/complex queries for resolution, and in future we may implement some other methods to reduce this like sha256/hmac of the queries.
* When creating a new account, dyngroups would be engaged to add the account as a member due to the matching scope. However the change to the dyngroup was triggering an update of all the dyngroups *members* related memberof attributes. This would mean that adding an account would trigger every other account to be loaded an updated.
* When memberof would iterate over leaf entries and update them one at a time. This mean a large number of small fragmented queries in the case of a lot of leaf entries being updated. Now leaf entries are updated in a single stripe once groups are stabilised.
* Member of would always trigger it's members to always update. Instead, we should only update members where a difference is observed, or all members if the group's memberof itself has changed since this needs to propogate to all leaf entries. This significantly reduces the amount of writes and operations to examine the changed member of set.
* Referential integrity would examine all reference uuids on entries for validity rather than just the reference uuids that were altered within the transaction. This change means that only uuids that were *added* are validated during an operation.
* During async write backs (delayed actions) these were performed one at a time. Instead, when possible this should be done in a single transaction as the write transaction caches all writes in memory until the commit meaning that by batching we reduce overall latency.
* In the server there can only be one write transaction and many readers. These are guarded by tokio semaphores that act as fair queues - first in gets the lock next. Due to the design of the server readers would be blocked on the *database* semaphore, and writers would block on the write semaphore and THEN the database semaphore. This arrangement was creating a situation which unfairly advantaged readers over writers, as any write would first have to become the head of it's queue, and then compete with all readers to access a db transaction. Instead, we now have a reader semaphore with size threads minus 1, clamped at a minimum of 1. This means that provided there are two or more threads, then a writer will *always* have a database handle available, and readers will pre-queue with each other before queueing on the db ticket. If there is only one thread, then writes and reads will alternate between each other fairly.
Refers #1987
Notable changes:
- in server/lib/src/entry.rs - aiming to pass the enum instead of the strings
- changed signature of add_ava to take Attribute instead of &str (which is used in the entry_init macro... which was fun)
- set_ava<T> now takes Attribute
- added TryFrom<&AttrString> for Attribute