Resource Limits
---------------

As a piece of security-sensitive software, kanidm must be "available" (as defined by
integrity, availability and confidentiality). This means that as a service we must
be able to handle a large volume of potentially malicious traffic, and still be able
to serve legitimate requests without fault or failure.

To achieve this, the resources of the server must be managed and distributed to allow
potentially thousands of operations per second, while preventing exhaustion of those
resources.

Kanidm is structured as a database, where each request requires a process
to resolve that query into an answer. This could be a request for authentication,
which is a true/false response; a request for an identity so that we can
determine their groups for authorisation; or simply a request to find
someone's email address in a corporate directory context.

Each operation requires resources to complete individually, but many operations
can be processed in parallel.

Resource exhaustion occurs due to input from a client that consumes
resources. This means the attack surface is any possible input
and how the server interprets and processes it. In kanidm this could be
a search filter, for example - the query may be small, but it could be expensive
to compute.

CPU time
========

Every operation requires time on a CPU. A CPU logically has many threads which
can execute many operations in parallel, but it is also bound by its frequency and
per-core performance as to how many instructions can be performed per second.

This means that given, say, an 8-core CPU, each individual core may be able to
complete 1000 operations per second, and combined this becomes 8 x 1000.

If CPU time is exhausted, then operations are unable to be served. This means
that we must manage the amount of time an operation is allowed to take
on the system.

Some types of queries can take more time than others. For example:

* Unindexed searches which require full table scans.
* Indexed searches that have a large number of results.
* Write operations that affect many entries.

These necessitate the following limits:

* No unindexed searches allowed
* Prevent searching on terms that do not exist (done, filter schema validation)
* Limit on the number of entries that can be checked during a partial indexed search
* Limit on the number of entries that can be loaded from a fully indexed search
* Limit on the number of entries that can be written during a write operation
* Reduce the size of allowed queries to prevent complex index access patterns

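
A guard enforcing limits like these could be evaluated once the candidate set for a
search is known. The following is a minimal sketch, assuming hypothetical
`SearchLimits` and `check_candidates` names and invented numeric caps - it is not
kanidm's actual API:

```rust
// Hypothetical sketch - `SearchLimits` and `check_candidates` are
// illustrative names, not kanidm's real types.
#[derive(Debug, Clone, Copy)]
struct SearchLimits {
    /// Whether unindexed (full table scan) searches are permitted at all.
    unindexed_allowed: bool,
    /// Maximum candidate entries a partially indexed search may filter-test.
    max_filter_test: usize,
    /// Maximum entries a fully indexed search may load.
    max_results: usize,
}

#[derive(Debug, PartialEq)]
enum SearchOutcome {
    Allowed,
    DeniedUnindexed,
    DeniedTooManyCandidates,
}

fn check_candidates(limits: &SearchLimits, indexed: bool, candidates: usize) -> SearchOutcome {
    // Unindexed searches are refused outright unless explicitly permitted.
    if !indexed && !limits.unindexed_allowed {
        return SearchOutcome::DeniedUnindexed;
    }
    // Otherwise bound the number of entries the operation may touch.
    let cap = if indexed {
        limits.max_results
    } else {
        limits.max_filter_test
    };
    if candidates > cap {
        SearchOutcome::DeniedTooManyCandidates
    } else {
        SearchOutcome::Allowed
    }
}

fn main() {
    let limits = SearchLimits {
        unindexed_allowed: false,
        max_filter_test: 256,
        max_results: 1024,
    };
    assert_eq!(
        check_candidates(&limits, false, 10),
        SearchOutcome::DeniedUnindexed
    );
    assert_eq!(
        check_candidates(&limits, true, 2048),
        SearchOutcome::DeniedTooManyCandidates
    );
    assert_eq!(check_candidates(&limits, true, 100), SearchOutcome::Allowed);
}
```

The important property is that the denial happens before the expensive work (loading
or filter-testing the entries), so a rejected query consumes almost no CPU time.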

Memory (RAM) Capacity
=====================

A system has a finite amount of RAM which must be shared between all CPUs of the system.
Effective use of that RAM improves the speed of all operations, meaning that as a shared
resource it can not be monopolised by a single operation.

Each operation can be assumed to have access to RAM / N CPUs of RAM. If an operation temporarily
exceeds this, the impact is minimal, but continued use of a high amount of RAM
by a single CPU can prevent other operations from functioning. Additionally, past operations can affect
the content of RAM through cache invalidation attacks, for example.

To prevent overuse of RAM, the following must be considered:

* Cache strategies to prevent invalidation attacks (done, ARC + ahash).
* Limit the number of entries that can be dirtied in a write operation
* Limit the number of entries that can be filter-tested in partial index queries
* Limit the number of entries that can be loaded for a read operation
* Limit the size of the cache to allow room for operation result sets (done)
* Cache closer to the application rather than using db/vfs caches (done)
* Compress resources that consume bandwidth to operate on (done, index compression)

Disk IOPS
=========

Storage has a limit on how many input/output operations it can perform per second. Each operation
must be able to access the storage system for entries and indexes that may not be cached in memory.
The best way to reduce IOPS is more RAM to allow more application-level caching, but this is
not always possible. This means that operations must minimise their IOPS requirements,
and that we must not issue or allow requests that would generate large amounts of IOPS.

* Cache strategies that keep cache hit ratios high (done, ARC)
* Compress elements to reduce IOPS for transfer to RAM (done: indexes; not done: entries)
* Limit the number of large items that can be loaded in an operation, i.e. partial index scans, full table scans
* Limit the size of write operations.

Disk/Capacity
=============

If the storage capacity is exceeded this can cause the system to panic or stop, which would act
as a denial of service. It may also prevent security auditing or other data from being recorded, or in
the worst case cause data corruption. This means that during write operations, the amount of change
must be limited.

* Limit the number of entries that can be dirtied in an operation
* Limit the maximum number of multivalue attributes on an entry to prevent the entry growing over-size
* Entry maximum size limits
* Limit the amount of churn in an entry to prevent changelog growth

Stack Limits
============

As some operations are recursive, stack depth becomes a concern: when it is exceeded, the system
can crash or panic. The major recursive structure is filters and how we process queries.

To prevent stack exhaustion:

* Limit the depth of nesting in a search query (filter).
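
That depth bound could be checked as follows. This sketch uses a hypothetical
`Filter` enum rather than kanidm's real filter types:

```rust
// Illustrative sketch - this `Filter` enum models a nested search query so
// its recursion depth can be bounded; it is not kanidm's real filter type.
enum Filter {
    Eq(&'static str, &'static str),
    And(Vec<Filter>),
    Or(Vec<Filter>),
    Not(Box<Filter>),
}

// Depth of the filter tree: a leaf is 1, each nesting level adds 1.
fn depth(f: &Filter) -> usize {
    match f {
        Filter::Eq(_, _) => 1,
        Filter::Not(inner) => 1 + depth(inner),
        Filter::And(parts) | Filter::Or(parts) => {
            1 + parts.iter().map(depth).max().unwrap_or(0)
        }
    }
}

fn validate_depth(f: &Filter, max_depth: usize) -> bool {
    depth(f) <= max_depth
}

fn main() {
    // (and (eq class person) (or (not (eq name x))))
    let f = Filter::And(vec![
        Filter::Eq("class", "person"),
        Filter::Or(vec![Filter::Not(Box::new(Filter::Eq("name", "x")))]),
    ]);
    assert_eq!(depth(&f), 4);
    assert!(validate_depth(&f, 8));
    assert!(!validate_depth(&f, 3));
}
```

Note that walking an already-built tree is itself recursive, so in practice the
limit would be enforced while parsing the incoming query, rejecting an over-deep
filter before any recursive structure is materialised or evaluated.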

Network
=======

To prevent a client from exhausting memory or network bandwidth, if an operation is too
large it should be denied:

* Limit the size of incoming network requests

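
Such a denial can happen on the declared body length, before any of the body is
read. A minimal sketch, with a hypothetical `accept_request` function and an
arbitrary 256 KiB cap:

```rust
// Hypothetical sketch - reject an oversized request from its declared
// content length before buffering the body. The name and the cap are
// illustrative only.
const MAX_REQUEST_SIZE: usize = 256 * 1024;

fn accept_request(content_length: usize) -> Result<(), &'static str> {
    if content_length > MAX_REQUEST_SIZE {
        // Deny early: the body is never buffered, so neither RAM nor
        // bandwidth is spent on it.
        Err("request entity too large")
    } else {
        Ok(())
    }
}

fn main() {
    assert!(accept_request(4 * 1024).is_ok());
    assert!(accept_request(1024 * 1024).is_err());
}
```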
Limits Summary
==============

This summary is the set of limits that should be applied. Note that some limits prevent
multiple issues, and that the limits apply in different areas.

Search Limits:

* Prevent searching on terms that do not exist (done, filter schema validation)
* Deny unindexed searches
* Limit the number of entries that can be loaded for a read operation
* Limit the number of entries that can be filter-tested in partial index queries
* Limit on the number of entries that can be loaded from a search
* Limit the depth of nesting in a search query (filter).
* Reduce the size of allowed queries to prevent complex index access patterns

Caching:

* Cache strategies to prevent invalidation attacks (done, ARC + ahash).
* Limit the size of the cache to allow room for operation result sets (done)
* Cache closer to the application rather than using db/vfs caches (done)

Network:

* Limit the size of incoming network requests

Db storage:

* Compress elements to reduce IOPS for transfer to RAM (done: indexes; not done: entries)

Writes:

* Limit the number of entries that can be dirtied in an operation
* Limit the maximum number of multivalue attributes on an entry to prevent the entry growing over-size
* Entry maximum size limits
* Limit the amount of churn in an entry to prevent changelog growth

These limits should be applied per-account to allow some accounts to override them. For example,
an application which needs to bulk update accounts, or admins who need to perform audits.

The system maintains a default set of limits. Accounts can then have resource groups associated.
The "highest" value of the resource group or default is the value that is applied. These limits
could also be claim-based, meaning they become per-session rather than per-account, so
they would be stored in the user authentication token.

The session limits would be:

* allow unindexed search
* maximum number of entries in search
* maximum number of entries in partial filter test
* number of filter elements
* maximum number of modified entries
* write rate limit (writes over time)
* network request size

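
The "highest value wins" resolution of these session limits can be sketched as a
field-wise merge of the defaults with any associated resource group. The struct
and field names below are illustrative, not kanidm's real types:

```rust
// Sketch - the account's effective limits are the most permissive of the
// defaults and any resource group. Names here are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SessionLimits {
    allow_unindexed: bool,
    max_search_entries: usize,
    max_filter_test_entries: usize,
    max_filter_elements: usize,
    max_modified_entries: usize,
}

impl SessionLimits {
    // Field-wise merge: booleans are OR'd, numeric caps take the maximum.
    fn merge(self, group: SessionLimits) -> SessionLimits {
        SessionLimits {
            allow_unindexed: self.allow_unindexed || group.allow_unindexed,
            max_search_entries: self.max_search_entries.max(group.max_search_entries),
            max_filter_test_entries: self
                .max_filter_test_entries
                .max(group.max_filter_test_entries),
            max_filter_elements: self.max_filter_elements.max(group.max_filter_elements),
            max_modified_entries: self.max_modified_entries.max(group.max_modified_entries),
        }
    }
}

fn main() {
    let defaults = SessionLimits {
        allow_unindexed: false,
        max_search_entries: 128,
        max_filter_test_entries: 256,
        max_filter_elements: 32,
        max_modified_entries: 16,
    };
    // A hypothetical "auditors" resource group that permits large reads.
    let auditors = SessionLimits {
        allow_unindexed: true,
        max_search_entries: 10_000,
        max_filter_test_entries: 256,
        max_filter_elements: 32,
        max_modified_entries: 16,
    };
    let effective = defaults.merge(auditors);
    assert!(effective.allow_unindexed);
    assert_eq!(effective.max_search_entries, 10_000);
    assert_eq!(effective.max_modified_entries, 16);
}
```

Because the resolved values are per-session, the result of this merge is what
would be carried in the user authentication token.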
The system limits that can not be overridden per-account are:

* maximum entry size
* maximum number of multi value attributes