Trust Design and Thoughts ------------------------- Trust is a process where users and groups of a seperate kanidm instance may be granted access to resources through this system. Trust is a one way concept, but of course, could be implemented twice in each direction to achieve bidirectional trust. Why? ---- There are a number of reasons why a trust configuration may be desired. You may have a seperate business to customer instance, where business users should be able to authenticate to customer resources, but not the inverse. You may have two businesses merge or cooperate and require resource sharing. It allows seperation of high value credentials onto different infrastructure. You could also potentially use trust as a method of sync between between a different IDM project and this. Why not? -------- Trust is complicated, and adds more fragility and could be solved in different ways. Applications could on a case by case basis have multiple backends instead of trying to go through one domain with trusts. Sync could be designed more as a migration or specialised one way tool rather than needing a replication design. Scope of the Trust ------------------ There are different ways we can scope a trust out, each with pros-cons. Here are some possibilities: * Kerberos Style - you login to your "home" domain, and then that grants you access across the trust boundaries. This means your credentials are valid "everywhere" effectively, and the permissions/groups it carries. Everyone on one side of the trust is trusted by the other side (you can't filter who is/isn't trusted, but you can limit what resources they get via groups). You still have to share some info via the global catalog, meaning you can add and remove users locally to your resources. * x509 style - you trust an authority, and then anyone that authority validates, you trust. There is no global catalog, just the details you get in the presented authentication (certificate). You may implement some controls around which subject DN's to allow/deny, but this is pretty fraught with landminds. You don't know who exists until they login! * Azure AD individiual account trusting. Instead of trusting a whole domain you allow a user from a remote tennant to access your resources. You don't trust everyone in their tennant, just that one account that you can invite. You can then revoke them as needed. * Group-trust - FreeIPA does this with AD. It's still like kerberos, but you only trust a subset of the users determined by "groups" from the trusted site. * All-or-nothing - LDAP style, just bring in the subtree of the remote business (or proxy it) and then act like there is one flat namespace. * Client trust - rather than being server side, clients (applications) like SSSD have a case/switch on the authenticating username, and then have multiple backends configured to select who they auth to. OpenID somewhat works like this where you just redirect to some OpenID portal that may be in a whitelist. * Fractional Replication - similar to the GC in AD, replicate in a subset of your data, but then ask for redirects or other information. This is used with 389 and RO servers where you may only replicate a subset of accounts to branch offices or a seperate backend. Each of these has pros and cons, good bad, and different models. They each achieve different things. For example, the Kerberos style trust creates silos where the accounts credential material is stored (in the home domain), but others still trust that authentication (via cryptographic means). You can limit what is seen or sent, and even where the authentication happens. To help choose a model, or determine properties we want lets write some down. * Single Sign On - only need to authenticate once * Forwardable Credentials - once you issue a token in one domain can it forward to another and authenticate you * Credential Siloing - are credentials (pw, private keys) only stored in your home domain * PII Limits - limit the transmission of personal information * Group Management - can you add a trusted account to a local group to manage it's access? * Invite un-trusted domain - can you invite accounts to use resources from domains you don't know about? * Fully distributed - openid style, where any openid server could be a trusted provided * Client Switched - Is it up to the client to trust different domains? Or is it a server side issue? | | Kerberos | x509 | Azure AD | Group-Trust | All-or-nothing| Client Trust | Fractional | | | SSO | y | y | y | ? | n | n | n | | | Forwarding | y | y? | n | ? | n | n | n | | | Cred Silo | y | n? | y | y | n | y | y | | | PII Limit | y | y | y | ? | n | y | y | | | Group mgmt | y | n | y | y | y | n | y | | | Invite Ext | n | n | y | n | n | y | n | | | Distributed | n | y | n | n | n | y | n | | | Client Swch | n | n | n | n | n | y | n | | So with a lot of though, I'm going to go with fractional replication. * Single Sign On - I don't want this, because it causes a lot of harm. It's better to have many devices with different creds and long lived sessions that are revokeable. * Forwarding - I don't want credentials to be forwarded, or sso to be forwarded. * Cred Silo - I want this because it means you have defined boundaries of where security material is stored by who. * PII limit - I want this as you can control who-has-what PII on the system side. * Group Mgmt - I want this as it enables rbac and familar group management locally for remote and local entries. * Invite Ext - On the fence - cool idea, but not sure how it fits into kanidm with trusts. * Distributed - I don't want this because it's model is really different to what kani is trying to be * Client Switched - I don't want this because clients should only know they trust an IDM silo, and that does the rest. But there are some things I want: * Claims define credential policy, so we need to fractionally replicate the strength of the accounts cred material. This also means in any auth-redirection we need to indicate the strength or name of the credential that was authenticated through so we can correctly apply claims on the trusting domain. This is something for the design of claims to consider. * RADIUS pws are per-domain, not replicated. This would breach the cred-silo idea, and really, if domain B has radius it probably has different SSID/ca cert to domain A, so why share the pw? If we did want to really share the credentials, we can have RADIUS act as a client switch instead. * We can't proxy authentications because of webuathn domain verification, so clients that want to auth users to either side have to redirect through their origin domain to generate the session. This means the origin domain may have to be accessible in some cases. * Public-key auth types can be replicated fractionally, which allows the domain to auth a user via ssh key but without needing to access the origin domain. (some questions about sudo exist here though). Use cases --------- With the fractional case in mind, this means we have sets of use cases that exist. * Access to websites via oauth for users on either domain * Unix server access / Workstation access * RADIUS authentication to a different network infra in the trusting domain (but the Radius creds are local to the site) * Limiting presence of credentials in cloud (but making public key credentials avail) * Limiting distribution of personal information to untrusted sites * Creating administration domains or other business hierachies that may exist in some complex scenarios We need to consider how to support these use cases of course :) Possible Design --------------- As trust is a relationship where groups and accounts from domain B are trusted into domain A, this is a very similar scenario to replication. As Kanidm plans to implement a push based replication system, this may work very well for our needs. More formally - domain A trusting domain B is the establishment of a one directional fractional replication agreement, and resource proxy from A to B. Let's assume a user and group exists on domain B such as: :: spn: claire@domainb class: [account, object] ssh_public_key: aaaa... displayName: claire legalName: Super Secret Legal Name primary_credential: ... uuid: X memberOf: [ group@domainb ] spn: group@domainb class: [group, object] member: X (ref to claire) On domain A, we would replicate a partial entry that serves as: * A stub for references * A redirect for auth operations * A cache for certain attributes :: spn: claire@domainb class: [trustedaccount, object] ssh_public_key: aaaa... displayName: claire uuid: X memberOf: [ group@domainb, group@domaina ] source: Y spn: group@domainb class: [trustedgroup, object] member: X (ref to claire) name: domainb uuid: Y class: [trustanchor] url: https://idm_1.domainb url: https://idm_2.domainb cacert: ..... trust_key: .... spn: group@domaina class: [group, object] member: X Domain A with this information could: * Add claire to local groups (due to name + uuid + memberOf presence) * Generate unix information for claire (from uuid + sshkey + displayname) * Proxy authentication (limited) to domainb * Allow claire to use radius or other local resources. To authenticate claire we have to send a request to the remote domain to get the required information or to provide the required information to the remote domain. We would do a normal auth process, but on determining this is a trust account, we have to return a response to the core.rs layer. This should then trigger an async request to domain B which contains the request. When this is returned, we then complete the request to the client. This does increase the liklihood of issues or delays in processing in the domain A IO layers if many requests exist at the same time. if multiple urls exist in the trustanchor, we should choose randomly which to contact for authentications. If a URL is not available, we move to the next URL (failover) We could consider in-memory caching these values, but then we have to consider the cache expiry and management of this data. Additionally types like TOTP aren't cachable. I think we should avoid caching in these cases. Auth Scenarios -------------- We assume a 1 way trust where B trusts A. Kanidm portal: user@domain_a logs into kanidm portal on domain B Oauth: user@domain_a logs into oauth portal on domain B SSH: user@domain_a sshes to a machine on domain B pam/application pws: user@domain_a uses pam w_ pw on a machine on domain B RADIUS: user@domain_a authenticates to WIFI_B radius via domain B. Trust Through ------------- Not supported. There are some reasons for this, but I think it's adds too much complexity to an already complex system design. It especially complicates "what entries do we send forward" to a domain, because we need to send (our entries + all trusted entries) - target domain entries. I think trust through also is a surprising behaviour - just because my friend trusts another person, doesn't mean that I implicitly do. We need to establish our own trust relationship. Security Considerations ----------------------- There are certain entries on a domain by default that should NOT be replicated. * schema * admin * anonymous * default privilege groups * no personal or sensitive fields * uuids of any of the above Rather it may be easier to consider what *should* be replicated: * Groups (member, uuid, spn) * Accounts ( displayName, spn, uuid, ssh-keys) It could be questioned if: * homedirectory * loginshell * gidnumber Should be replicated as the local domain may have other policies around their handling. For now, we may exclude these, but some consideration is needed here. Excluding items from Domain B from replicating back --------------------------------------------------- In a situation where domain A trusts B, and inverse B trusts A, then A will contain trust stubs to entries in B. Due to the use of spn's we can replicate only our entries for domain to the trust reciever. :: and [ eq(class, group), eq(class, account), sub(spn, my_domain), andnot(or[ eq(class, recycled), eq(class, tombstone), ]) ] Because SPN's would be stored on each object, we could not change domain name post install. Need to do ASAP --------------- How do we get the domain at setup time for spn? We already require domain for webauthn ... should we write this into the system_info? This means we need to determine a difference between a localgroup and a group that will be synced for trust. This may require a seperate class or label? We need to make name -> SPN on groups/accounts that can be sent across a trust boundary. Local groups and accounts should have a class name change to allow them to continue to use "name" or we need to Change setup/fixtures for default accounts to have an spn with the correct domain. Must do ------- Must check and assert that incoming objects via the trust belong to the correct domain (spn) Gotchas ------- Server IDs ========== Every server on both sides of the domain have to have unique SID's to avoid UUID conflicts. This is a requirement for replication anyway, and SID regeneration is not a complex task. It's highly unlikely that we would ever see duplicates anyway as this is a 32bit field. An alternate option is to have the stub objects generate ids, but to have a trusted_uuid field that is used for replication checking, and a seperate CSN for trust replication. Webauthn ======== Webauthn requires correct presentation of a domain name that matches the TLS name of the host that is being connected to. Because of this it may not be possible to proxy Webauthn through in a trust scenario, requiring clients to need to directly authenticate to the trusted domain. Oauth ===== Oauth may support some trust resources of it's own, that may support or help the Webauthn cases. This should be investigated. An alternate solution to these two is that when domain A wants to issue oauth to a user in domain b we redirect to domain b, conduct an auth, then from a bearer authorization, domain a then allows the authentication and generates a domain a uat/oauth from the domain b bearer. More thought on this topic is needed but I think there are solutions on how to do webauthn/oauth via trust.