Add design documents as drafts

This commit is contained in:
parent 651abe3762
commit 84ff865304

@@ -28,6 +28,109 @@ An example is that user Alice should only be able to search for objects where th
is person, and where they are a memberOf "visible" group. Alice should only be able to
see those users' displayNames (not their legalName for example), and their public email.

Worded a bit differently. You need permission over the scope of entries, you need to be able
to read the attribute to filter on it, and you need to be able to read the attribute to receive
it in the result entry.

Threat: If we search for '(&(name=william)(secretdata=x))', we should not allow this to
proceed because you don't have the rights to read secret data, so you should not be allowed
to filter on it. How does this work with two overlapping ACPs? For example, one that allows read
of name and description to class = group, and one that allows name to user. We don't want to
say '(&(name=x)(description=foo))' and have it allowed, because we don't know the target class
of the filter. Do we "unmatch" all users because they have no access to the filter components? (Could
be done by inverting and putting in an AndNot of the non-matchable overlaps.) Or do we just
filter out description from the users returned (but that implies they DID match, which is a disclosure)?

More concrete:

    search {
        action: allow
        targetscope: Eq("class", "group")
        targetattr: name
        targetattr: description
    }

    search {
        action: allow
        targetscope: Eq("class", "user")
        targetattr: name
    }

    SearchRequest {
        ...
        filter: And: {
            Pres("name"),
            Pres("description"),
        }
    }

A potential defense is:

    acp class group: Pres(name) and Pres(desc) both in target attr, allow
    acp class user: Pres(name) allow, Pres(desc) deny. Invert and Append

So the filter now is:

    And: {
        AndNot: {
            Eq("class", "user")
        },
        And: {
            Pres("name"),
            Pres("description"),
        },
    }

This would now only allow access to the name/desc of group.

If we extend this to a third ACP, this would still work. But consider a more complex example:

    search {
        action: allow
        targetscope: Eq("class", "group")
        targetattr: name
        targetattr: description
    }

    search {
        action: allow
        targetscope: Eq("class", "user")
        targetattr: name
    }

    search {
        action: allow
        targetscope: And(Eq("class", "user"), Eq("name", "william"))
        targetattr: description
    }

Now we have a single user where we can read desc, but the filter still compiles as above:

    And: {
        AndNot: {
            Eq("class", "user")
        },
        And: {
            Pres("name"),
            Pres("description"),
        },
    }

This would now be invalid: because the compiled filter excludes class=user wholesale, william would
be excluded even though the third ACP grants description access to him. We also may not have a plain
"class=user" scope in that ACP, so we can't use subset filter matching to merge the two.

As a result, I think the only possible valid solution is to perform the initial filter, then determine
on the candidates whether we *could* have valid access to filter on all required attributes. IE
this means even with an index lookup, we are still required to perform some filter application
on the candidates.

I think this will mean that on a possible candidate, we have to apply all ACPs, then create a union of
the resulting targetattrs, and then compare that set against the set of attributes in the filter.

This will be slow on large candidate sets (potentially), but could be sped up with parallelism, caching
or other techniques. However, in the same step, we can also extract only the allowed
read target attrs, so this is a valuable exercise.
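
A sketch of that per-candidate check; the types and names below are hypothetical, invented for this
illustration, and are not the implemented kanidm API:

    use std::collections::{BTreeMap, BTreeSet};

    // Hypothetical shapes for this sketch only - not the implemented kanidm types.
    type Entry = BTreeMap<String, BTreeSet<String>>;

    struct AccessProfile {
        // Which entries this ACP applies to (a compiled targetscope check).
        applies_to: fn(&Entry) -> bool,
        // Attributes this ACP allows the requester to read.
        targetattr: BTreeSet<String>,
    }

    // For one candidate entry: union the targetattrs of every ACP whose scope matches,
    // then require that every attribute named in the filter is inside that union.
    fn filter_attrs_allowed(
        candidate: &Entry,
        acps: &[AccessProfile],
        filter_attrs: &BTreeSet<String>,
    ) -> bool {
        let allowed: BTreeSet<&String> = acps
            .iter()
            .filter(|acp| (acp.applies_to)(candidate))
            .flat_map(|acp| acp.targetattr.iter())
            .collect();
        filter_attrs.iter().all(|a| allowed.contains(a))
    }

The same pass can also yield the set of attributes that may be returned for the candidate, which is
the "extract only the allowed read target attrs" step mentioned above.
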
Delete Requirements
-------------------

@@ -91,6 +194,8 @@ be best implemented as a compilation of self -> eq(uuid, self.uuid).

Implementation Details
----------------------

CHANGE: Receiver should be a group, and should be single value/multivalue? Can *only* be a group.

Example profiles:

    search {

designs/entries.rst (new file, 38 lines)

@@ -0,0 +1,38 @@

Entries
-------

Entries are the base unit of data in this server. This is one of the three foundational concepts,
along with filters and schema, that everything else builds upon.

What is an Entry?
-----------------

An entry is a collection of attribute-values. These are sometimes called attribute-value-assertions,
or attr-value sets. The attribute is a "key", and it holds 1 to infinite associated values. An entry
can have many avas associated, which together make up the entry as a whole. An example entry (minus schema):

    Entry {
        "name": ["william"],
        "mail": ["william@email", "email@william"],
        "uuid": ["..."],
    }

There are only a few rules that hold true for entries.

* UUID

All entries *must* have a UUID attribute, and there must ONLY exist a single value. This UUID ava
MUST be unique within the database, regardless of entry state (live, recycled, tombstoned etc).

* Zero values

An attribute with zero values is removed from the entry.

* Unsorted

Values within an attribute are "not sorted" in any meaningful way for a client utility (in reality
they are sorted by an undefined internal order for fast lookup/insertion).

That's it.
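
As a rough sketch (the names and types here are illustrative, not the actual kanidm entry
implementation), an entry can be thought of as a map of attribute names to value sets, with the
zero-value rule enforced on removal:

    use std::collections::{BTreeMap, BTreeSet};

    // Illustrative only: the real server types differ.
    struct Entry {
        // attribute name -> set of values.
        avas: BTreeMap<String, BTreeSet<String>>,
    }

    impl Entry {
        fn add_value(&mut self, attr: &str, value: &str) {
            self.avas
                .entry(attr.to_string())
                .or_insert_with(BTreeSet::new)
                .insert(value.to_string());
        }

        fn remove_value(&mut self, attr: &str, value: &str) {
            if let Some(vs) = self.avas.get_mut(attr) {
                vs.remove(value);
                // Rule: an attribute with zero values is removed from the entry.
                if vs.is_empty() {
                    self.avas.remove(attr);
                }
            }
        }
    }
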
designs/filter.rst (new file, 163 lines)

@@ -0,0 +1,163 @@

Filters
-------

Filters (along with Entries and Schema) are one of the foundational concepts in the
design of KaniDM. They are used in nearly every aspect of the server to provide
checking and searching over entry sets.

A filter is a set of requirements to which the attribute-value pairs of an entry must
conform for the filter to be considered a "match". This has two useful properties:

* We can apply a filter to a single entry to quickly determine whether assertions about that
  entry hold true.
* We can apply a filter to a set of entries to reduce the set to only the matching entries.

Filter Construction
-------------------

Filters are rooted in relational algebra and set mathematics. I am not an expert on either
topic, and have learnt about their design from experience.

* Presence

The simplest filter is a "presence" test. It asserts that some attribute, regardless
of its value, exists on the entry. For example, given the entries below:

    Entry {
        name: william
    }

    Entry {
        description: test
    }

If we apply "Pres(name)", then we would only see the entry containing "name: william" as a matching
result.

* Equality

Equality checks that an attribute and value are present on an entry. For example:

    Entry {
        name: william
    }

    Entry {
        name: test
    }

If we apply Eq(name, william), only the first entry would match. If the attribute is multivalued,
we only assert that one value in the set is there. For example:

    Entry {
        name: william
    }

    Entry {
        name: test
        name: claire
    }

In this case, application of Eq(name, claire) would match the second entry, as name=claire is present
in the multivalue set.

* Sub

Substring checks that the substring exists in an attribute of the entry. This is a specialisation
of equality, where the same value and multivalue handling holds true.

    Entry {
        name: william
    }

In this example, Sub(name, liam) would match, but Sub(name, air) would not.

* Or

Or contains multiple filters and asserts that provided *any* of them are true, this condition
will hold true. For example:

    Entry {
        name: claire
    }

Here the filter Or(Eq(name, claire), Eq(name, william)) will be true, because Eq(name, claire)
is true, thus the Or condition is true. If nothing inside the Or is true, it returns false.

* And

And checks that all inner filter conditions are true in order to return true. If any are false, it will
yield false.

    Entry {
        name: claire
        class: person
    }

For this example, And(Eq(class, person), Eq(name, claire)) would be true, but And(Eq(class, group),
Eq(name, claire)) would be false.

* AndNot

AndNot is different to a logical not.

If we had Not(Eq(name, claire)), then the logical result is "all entries where name is not
claire". However, this is (today...) not very efficient. Instead, we have "AndNot", which asserts
that a condition is not true of an existing candidate set. So the operation AndNot(Eq(name, claire)) on
its own would yield an empty set. AndNot is important when you need to check that something is also not
true, but without retrieving all entries where that not holds. An example:

    Entry {
        name: william
        class: person
    }

    Entry {
        name: claire
        class: person
    }

In this case, "And(Eq(class, person), AndNot(Eq(name, claire)))" would find all persons
whose name is also not claire: IE william. However, "AndNot(Eq(name, claire))" on its own would be an
empty result, because there is no candidate set already existing, so there is nothing to return.
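
As a rough illustration of how these terms compose (the types below are invented for this sketch
and are not the server's real Filter or Entry types), the filter terms can be modelled as a
recursive enum with a match-based evaluator:

    use std::collections::{BTreeMap, BTreeSet};

    // Illustrative sketch only; the real kanidm Filter and Entry types differ.
    enum Filter {
        Pres(String),
        Eq(String, String),
        Sub(String, String),
        Or(Vec<Filter>),
        And(Vec<Filter>),
        AndNot(Box<Filter>),
    }

    type Entry = BTreeMap<String, BTreeSet<String>>;

    fn matches(f: &Filter, e: &Entry) -> bool {
        match f {
            Filter::Pres(attr) => e.contains_key(attr),
            Filter::Eq(attr, v) => e.get(attr).map_or(false, |vs| vs.contains(v)),
            Filter::Sub(attr, s) => e
                .get(attr)
                .map_or(false, |vs| vs.iter().any(|v| v.contains(s.as_str()))),
            Filter::Or(fs) => fs.iter().any(|f| matches(f, e)),
            Filter::And(fs) => fs.iter().all(|f| matches(f, e)),
            // AndNot only makes sense against a candidate already produced by the
            // surrounding And; on its own it contributes "not true of this entry".
            Filter::AndNot(inner) => !matches(inner, e),
        }
    }
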
Filter Schema Considerations
----------------------------

In order to make filters work properly, the server normalises entries on input to allow simpler
comparisons and ordering in the actual search phases. This means that for a filter to operate,
it too must be normalised and valid.

If a filter requests an operation on an attribute we do not know of in schema, the operation
is rejected. This is to prevent a denial of service attack where Eq(NonExist, value) would cause
un-indexed full table scans to be performed, consuming server resources.

In a filter request, the attribute name in use is normalised according to schema, as is the
search value. For example, Eq(nAmE, Claire) would normalise to Eq(name, claire), as both the
attribute name and the name value are UTF8_INSENSITIVE. However, displayName is case sensitive, so
a search like Eq(displayName, Claire) would become Eq(displayname, Claire). Note Claire remains cased.

This means that instead of having costly routines to normalise entries on each read and search,
we can normalise on entry modify and create; then we only need to ensure filters match and we
can do basic string comparisons as needed.
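
A minimal sketch of this normalisation step, assuming a syntax enum and a simple map lookup that
are illustrative only and not the server's real schema API:

    use std::collections::BTreeMap;

    // Sketch only: the syntax names and error handling are assumptions for illustration.
    enum Syntax {
        Utf8Insensitive,
        Utf8,
    }

    fn normalise_eq(
        schema: &BTreeMap<String, Syntax>,
        attr: &str,
        value: &str,
    ) -> Result<(String, String), &'static str> {
        // Attribute names are always folded to lowercase.
        let norm_attr = attr.to_lowercase();
        match schema.get(&norm_attr) {
            // Unknown attributes are rejected rather than risking a full table scan.
            None => Err("invalid attribute"),
            Some(Syntax::Utf8Insensitive) => Ok((norm_attr, value.to_lowercase())),
            Some(Syntax::Utf8) => Ok((norm_attr, value.to_string())),
        }
    }
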
Discussion
----------

Is it worth adding a true "not" type, and using that instead? It would be extremely costly on
indexes or filter testing, but would logically be better than AndNot as a filter term.

Not could be implemented as Not(<filter>) -> And(Pres(class), AndNot(<filter>)), which would
yield the equivalent result, but it would consume a very large index component. In this case
though, filter optimising would promote Eq > Pres, so we should be able to skip to a candidate
test, or we access the index and get the right result anyway instead of a full table scan.

Additionally, Not/AndNot could be security risks because they could be combined with And
queries that allow them to bypass the filter-attribute permission check. Is there an example
of using And(Eq, AndNot(Eq)) that could be used to provide information disclosure about
the status of an attribute, given a result/non-result where the AndNot is false/true?

designs/indexing.rst (new file, 152 lines)

@@ -0,0 +1,152 @@

Indexing
--------

Indexing is deeply tied to the concept of filtering. Indexes exist to make the application of a
search term (filter) faster.

World without indexing
----------------------

Almost all databases are built on top of a key-value storage engine of some nature. In our
case we are using (feb 2019) sqlite and hopefully SLED in the future.

So our entries, which contain sets of avas, are serialised into a byte format (feb 2019, json
but soon cbor) and stored in a table of "id: entry". For example:

    |----------------------------------------------------------------------------------------|
    | ID | data                                                                               |
    |----------------------------------------------------------------------------------------|
    | 01 | { 'Entry': { 'name': ['name'], 'class': ['person'], 'uuid': ['...'] } }            |
    | 02 | { 'Entry': { 'name': ['beth'], 'class': ['person'], 'uuid': ['...'] } }            |
    | 03 | { 'Entry': { 'name': ['alan'], 'class': ['person'], 'uuid': ['...'] } }            |
    | 04 | { 'Entry': { 'name': ['john'], 'class': ['person'], 'uuid': ['...'] } }            |
    | 05 | { 'Entry': { 'name': ['kris'], 'class': ['person'], 'uuid': ['...'] } }            |
    |----------------------------------------------------------------------------------------|

The ID column is *private* to the backend implementation and is never revealed to the higher
level components. However, the ID is very important to indexing :)

If we wanted to find Eq(name, john) here, what do we need to do? A full table scan is where we
perform:

    data = sqlite.do(SELECT * from id2entry);
    for row in data:
        entry = deserialise(row)
        entry.match_filter(...) // check Eq(name, john)

For a small database (maybe up to 20 objects), this is probably fine. But once you start to get
much larger, this is really costly. We continually load, deserialise, check and free data that
is not relevant to the search. This is why full table scans of any database (sql, ldap, anything)
are so costly. It's really, really scanning everything!

How does indexing work?
-----------------------

Indexing is a pre-computed lookup table of what you *might* search, in a specific format. Let's say
in our example we have an equality index on "name" as an attribute. Now in our backend we define
an extra table called "index_eq_name". Its contents would look like:

    |------------------------------------------|
    | index | idl                               |
    |------------------------------------------|
    | alan  | [03, ]                            |
    | beth  | [02, ]                            |
    | john  | [04, ]                            |
    | kris  | [05, ]                            |
    | name  | [01, ]                            |
    |------------------------------------------|

So when we perform our search for Eq(name, john) again, we see name is indexed. We then perform:

    SELECT * from index_eq_name where index=john;

This would give us the idl (ID list) of [04,]. This is "the IDs of every entry where name equals
john".

We can now take this back to our id2entry table and perform:

    data = sqlite.do(SELECT * from id2entry where ID = 04)

The key-value engine only gives us the entry for john, and we have a match! If id2entry had 1 million
entries, a full table scan would be 1 million loads and compares - with the index, it was 2 loads and
one compare. That's 30000x faster (potentially ;) )!
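
A toy model of these two tables and the Eq(name, john) lookup; the in-memory maps and values here
are purely illustrative, not how the backend actually stores data:

    use std::collections::BTreeMap;

    fn main() {
        // id2entry: private backend ID -> serialised entry (shortened here for brevity).
        let mut id2entry: BTreeMap<u64, &str> = BTreeMap::new();
        id2entry.insert(4, "{ 'Entry': { 'name': ['john'], ... } }");

        // index_eq_name: value -> idl (the IDs of every entry where name equals that value).
        let mut index_eq_name: BTreeMap<&str, Vec<u64>> = BTreeMap::new();
        index_eq_name.insert("john", vec![4]);

        // Eq(name, john): consult the index first, then fetch only the matching IDs.
        if let Some(idl) = index_eq_name.get("john") {
            for id in idl {
                if let Some(entry) = id2entry.get(id) {
                    println!("match id {}: {}", id, entry);
                }
            }
        }
    }
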
To improve on this, if we had a query like Or(Eq(name, john), Eq(name, kris)), we can use our
indexes to speed this up.

We would query index_eq_name again, and we would perform the search for both john and kris. Because
this is an Or, we then union the two idls, and we would have:

    [04, 05,]

Now we just have to get entries 04, 05 from id2entry, and we have our matching query. This means
filters are often applied as idl set operations.
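
A small sketch of such idl set operations on sorted, de-duplicated ID lists; this only illustrates
the idea, since the real implementation uses the idlset library mentioned below:

    // Or(...) -> union of idls; And(...) -> intersection of idls.
    fn idl_union(a: &[u64], b: &[u64]) -> Vec<u64> {
        let mut out: Vec<u64> = a.iter().chain(b.iter()).copied().collect();
        out.sort_unstable();
        out.dedup();
        out
    }

    fn idl_intersection(a: &[u64], b: &[u64]) -> Vec<u64> {
        // Assumes both inputs are sorted; walk them in lockstep.
        let (mut i, mut j) = (0, 0);
        let mut out = Vec::new();
        while i < a.len() && j < b.len() {
            if a[i] == b[j] {
                out.push(a[i]);
                i += 1;
                j += 1;
            } else if a[i] < b[j] {
                i += 1;
            } else {
                j += 1;
            }
        }
        out
    }

    fn main() {
        // Or(Eq(name, john), Eq(name, kris)) -> union of the two idls.
        assert_eq!(idl_union(&[4], &[5]), vec![4, 5]);
        // And(...) -> intersection.
        assert_eq!(idl_intersection(&[4, 5], &[5]), vec![5]);
    }
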
Compressed ID lists
-------------------

In order to make idl loading faster, and the set operations faster, there is an idl library
(developed by me, firstyear) which will be used for this. To read more see:

https://github.com/Firstyear/idlset

Filter Optimisation
-------------------

Filter optimisation begins to play an important role when we have indexes. If we indexed
something like "Pres(class)", then the idl for that search is the set of all database
entries. Similarly, if our database of 1 million entries has 250,000 class=person, then
Eq(class, person) will have an idl containing 250,000 ids. Even with idl compression, this
is still a lot of data!

There tend to be two types of searches against a directory like kanidm:

* Broad searches
* Targeted single entry searches

For broad searches, filter optimising does little - we just have to load those large idls and
use them. (Yes, loading the large idl and using it is still better than a full table scan though!)

However, for targeted searches, filter optimising really helps.

Imagine a query like:

    And(Eq(class, person), Eq(name, claire))

In this case, with our database of 250,000 persons, our idls would be:

    And( idl[250,000 ids], idl(1 id))

which means the result will always be the *single* id in the idl, or *no* value because it wasn't
present.

We add a single concept to the server called the "filter test threshold". This is the point at which
a partially-resolved candidate set is shortcut, and we then apply the filter in
the manner of a full table scan to the partial set, because that will be faster than the index
loading and testing.

With this test threshold, there exist two possibilities for this filter.

    And( idl[250,000 ids], idl(1 id))

We load the 250,000 entry idl and then perform the intersection with the idl of 1 value, and result in 1 or 0.

    And( idl(1 id), idl[250,000 ids])

We load the single idl value for name, and then as we are below the test threshold we shortcut out
and apply the filter to that one entry - yielding a match or no match.

Notice in the second, by promoting the "smaller" idl, we were able to save the work of the idl load
and intersection, as our first equality on "name" was more targeted?

Filter optimisation is about re-arranging these filters in the server, using our insight into the
data, to provide faster searches and to avoid costly index loads unless they are needed.

In this case, we would *demote* any filter term like Eq(class, ...) to the *end* of the And, because it
is highly likely to be less targeted than the other Eq types. Another example would be promotion
of Eq filters to the front of an And over a Sub term, where Sub indexes tend to be larger and have
longer IDLs.
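
A rough sketch of such a re-arrangement; the enum, cost numbers, and ordering heuristic below are
invented for illustration and are not kanidm's actual optimiser:

    // Illustrative sketch: order And terms so the cheapest (most targeted) idls are tested first.
    #[derive(Debug)]
    enum FilterComp {
        Eq(String, String),
        Sub(String, String),
        Pres(String),
    }

    fn estimated_cost(f: &FilterComp) -> u64 {
        match f {
            // Eq on "class" is usually broad, so push it towards the end.
            FilterComp::Eq(attr, _) if attr == "class" => 1_000_000,
            FilterComp::Eq(_, _) => 10,
            // Sub indexes tend to have longer idls than Eq.
            FilterComp::Sub(_, _) => 1_000,
            // Pres matches huge portions of the database.
            FilterComp::Pres(_) => 10_000_000,
        }
    }

    fn optimise_and(mut terms: Vec<FilterComp>) -> Vec<FilterComp> {
        terms.sort_by_key(estimated_cost);
        terms
    }

    fn main() {
        let f = vec![
            FilterComp::Eq("class".into(), "person".into()),
            FilterComp::Eq("name".into(), "claire".into()),
        ];
        // After optimisation, Eq(name, claire) comes first, Eq(class, person) last.
        println!("{:?}", optimise_and(f));
    }
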
designs/schema.rst (new file, 117 lines)

@@ -0,0 +1,117 @@

Schema
------

Schema is one of the three foundational concepts of the server, along with filters and entries.
Schema defines how attribute values *must* be represented, sorted, indexed and more. It also
defines what attributes could exist on an entry.

Why Schema?
-----------

The way that the server is designed, you could extract the backend parts and just have "Entries"
with no schema. That's totally valid if you want!

However, usually in the world all data maintains some form of structure, even if loose. We want to
have ways to say a database entry represents a person, and what a person requires.

Attributes
----------

In the entry document, I discuss that avas have a single attribute, and 1 to infinite values that
are utf8 case sensitive strings. With schema attribute types we can constrain these avas on an
entry.

For example, while the entry may be capable of holding 1 to infinite "name" values, the schema
defines that only one name is valid on the entry. Addition of a second name would be a violation. Of
course, schema also defines "multi-value", our usual 1 to infinite value storage concept.

Schema can also define that values of the attribute must conform to a syntax. For example, name
is a case *insensitive* string. So despite the fact that avas store case-sensitive data, all inputs
to name will be normalised to a lowercase form for faster matching. There are a number of syntax
types built into the server, and we'll add more later.

Finally, an attribute can be defined as indexed, and in which ways it can be indexed. We often will
want to search for "mail" on a person, so we can define in the schema that mail is indexed by the
backend indexing system. We don't define *how* the index is built - only that some index should exist
for when a query is made.
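
As a sketch of what such an attribute definition might carry (the field and type names are
assumptions for illustration, not the server's real schema types):

    // Illustrative sketch of an attribute definition; the real kanidm type differs in detail.
    enum Syntax {
        Utf8StringInsensitive,
        Utf8String,
    }

    enum IndexType {
        Equality,
        Substring,
        Presence,
    }

    struct SchemaAttribute {
        name: String,
        multivalue: bool,      // false: only a single value is allowed on the entry
        syntax: Syntax,        // how values are normalised and compared
        index: Vec<IndexType>, // which index types the backend should maintain
    }
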
Classes
-------

So while we have attributes that define "what is valid in the avas", classes define "which attributes
can exist on the entry itself".

A class defines requirements that are "may", "must", "systemmay", "systemmust". The system- variants
exist so that we can ship what we believe are good definitions. The may and must exist so you can
edit and extend our classes with your extra attribute fields (but it may be better just to add
your own class types :) ).

An attribute in a class marked as "may" is optional on the entry. It can be present as an ava, or
it may not be.

An attribute in a class marked as "must" is required on the entry. An ava that is valid to the
attribute syntax is required on this entry.

An attribute that is not "may" or "must" can not be present on this entry.

Let's imagine we have a class (pseudo example) of "person". We'll make it:

    Class {
        "name": "person",
        "systemmust": ["name"],
        "systemmay": ["mail"]
    }

If we had an entry such as:

    Entry {
        "class": ["person"],
        "uid": ["bob"],
        "mail": ["bob@email"]
    }

This would be invalid: we are missing the "systemmust" name attribute. It's also invalid because uid
is not present in systemmust or systemmay.

    Entry {
        "class": ["person"],
        "name": ["claire"],
        "mail": ["claire@email"]
    }

This entry is now valid. We have met the must requirement of name, and we have the optional
mail ava populated. The following is also valid:

    Entry {
        "class": ["person"],
        "name": ["claire"],
    }

Classes are 'additive' - this means given two classes on an entry, the must/may are unioned, and the
strongest rule is applied to attribute presence.

Imagine we also have:

    Class {
        "name": "person",
        "systemmust": ["name"],
        "systemmay": ["mail"]
    }

    Class {
        "name": "emailperson",
        "systemmust": ["mail"]
    }

With our entry now, this turns the "may" from person into a "must" because of the emailperson
class. On our entry Claire, that means this entry below is now invalid:

    Entry {
        "class": ["person", "emailperson"],
        "name": ["claire"],
    }

Simply adding an ava of mail back to the entry would make it valid once again.
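
As a rough sketch of the must/may validation described above (the types and names are invented for
illustration, not the server's actual schema code):

    use std::collections::{BTreeMap, BTreeSet};

    // Illustrative only: names and shapes are invented, not kanidm's real schema types.
    struct Class {
        systemmust: BTreeSet<String>,
        systemmay: BTreeSet<String>,
    }

    fn validate(entry: &BTreeMap<String, Vec<String>>, classes: &[&Class]) -> Result<(), String> {
        // Classes are additive: union the must/may sets of every class on the entry.
        let must: BTreeSet<&String> = classes.iter().flat_map(|c| c.systemmust.iter()).collect();
        let may: BTreeSet<&String> = classes.iter().flat_map(|c| c.systemmay.iter()).collect();

        // Every "must" attribute has to be present as an ava.
        for attr in &must {
            if !entry.contains_key(*attr) {
                return Err(format!("missing must attribute: {}", attr));
            }
        }
        // Any other attribute has to at least be allowed by "may" (or be a must).
        for attr in entry.keys() {
            if attr != "class" && !must.contains(attr) && !may.contains(attr) {
                return Err(format!("attribute not permitted by class: {}", attr));
            }
        }
        Ok(())
    }

With the person/emailperson classes above, the claire entry without mail would fail the must check
for mail, and adding the mail ava back would make it pass again.
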
@@ -158,56 +158,6 @@ impl Filter<FilterInvalid> {
            }
            _ => panic!(),
        }

        /*
        match self {
            Filter::Eq(attr, value) => match schema_attributes.get(attr) {
                Some(schema_a) => schema_a.validate_value(value),
                None => Err(SchemaError::InvalidAttribute),
            },
            Filter::Sub(attr, value) => match schema_attributes.get(attr) {
                Some(schema_a) => schema_a.validate_value(value),
                None => Err(SchemaError::InvalidAttribute),
            },
            Filter::Pres(attr) => {
                // This could be better as a contains_key
                // because we never use the value
                match schema_attributes.get(attr) {
                    Some(_) => Ok(()),
                    None => Err(SchemaError::InvalidAttribute),
                }
            }
            Filter::Or(filters) => {
                // This should never happen because
                // optimising should remove them as invalid parts?
                if filters.len() == 0 {
                    return Err(SchemaError::EmptyFilter);
                };
                filters.iter().fold(Ok(()), |acc, filt| {
                    if acc.is_ok() {
                        self.validate(filt)
                    } else {
                        acc
                    }
                })
            }
            Filter::And(filters) => {
                // This should never happen because
                // optimising should remove them as invalid parts?
                if filters.len() == 0 {
                    return Err(SchemaError::EmptyFilter);
                };
                filters.iter().fold(Ok(()), |acc, filt| {
                    if acc.is_ok() {
                        self.validate(filt)
                    } else {
                        acc
                    }
                })
            }
            Filter::Not(filter) => self.validate(filter),
        }
        */
    }

    pub fn from(f: &ProtoFilter) -> Self {

@@ -1040,6 +1040,13 @@ mod tests {
        })
    }

    #[test]
    fn test_modify_invalid_class() {
        // Test modifying an entry and adding an extra class, that would cause the entry
        // to no longer conform to schema.
        unimplemented!()
    }

    #[test]
    fn test_qs_delete() {
        run_test!(|_log, mut server: QueryServer, audit: &mut AuditScope| {