# Troubleshooting
Some things to try.
## Is the server started?
If you don't see "ready to rock! 🪨" in your logs, it's not started. Scroll back and look for errors!
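As a quick check you can grep the logs for the startup banner; a minimal sketch, assuming the server runs under systemd as `kanidmd` or in a docker container named `kanidmd` (both names are assumptions, adjust to your deployment):

```shell
# Under systemd (unit name "kanidmd" is an assumption):
journalctl -u kanidmd | grep -i "ready to rock"

# In docker (container name "kanidmd" is an assumption):
docker logs kanidmd 2>&1 | grep -i "ready to rock"
```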
## Can you connect?
If the server is running on `idm.example.com:8443`, a simple connectivity test can be done with curl. Run the following command:

```shell
curl https://idm.example.com:8443/status
```
You should see something similar to this:
```shell
{{#rustdoc_include troubleshooting/curl_connection_test.txt}}
```
This means:

- you've successfully connected to a host (10.0.0.14),
- TLS negotiation succeeded, and
- the server returned the status response `true`.
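If you want to repeat this check from a script or monitoring probe, a minimal sketch along these lines may help; the hostname is the example one from above, and `curl -f` turns HTTP errors into a non-zero exit code:

```shell
#!/bin/sh
# Hypothetical health-check wrapper around the /status endpoint.
# The hostname is the example value from above; adjust to your deployment.
if [ "$(curl -fsS https://idm.example.com:8443/status)" = "true" ]; then
    echo "kanidmd is up"
else
    echo "kanidmd is not responding correctly" >&2
    exit 1
fi
```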
If you see something like this:

```shell
➜ curl -v https://idm.example.com:8443
* Trying 10.0.0.1:8443...
* connect to 10.0.0.1 port 8443 failed: Connection refused
* Failed to connect to idm.example.com port 8443 after 5 ms: Connection refused
* Closing connection 0
curl: (7) Failed to connect to idm.example.com port 8443 after 5 ms: Connection refused
```
Then either your DNS is wrong (it's pointing at 10.0.0.1) or you can't connect to the server for some reason.
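To rule out DNS, check what the hostname actually resolves to; standard tools work fine here, using the example hostname from above:

```shell
# What does the local resolver return for the hostname?
getent hosts idm.example.com

# Query DNS directly:
dig +short idm.example.com
```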
If you get errors about certificates, try adding `-k` to skip certificate verification and just test connectivity:

```shell
curl -vk https://idm.example.com:8443/status
```
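If you need to dig further into the TLS side, you can inspect the certificate the server presents with openssl; a minimal sketch, using the example hostname from above:

```shell
# Print the subject, issuer and validity dates of the served certificate:
openssl s_client -connect idm.example.com:8443 -servername idm.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
```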
## Server things to check
- Does the config file have `bindaddress = "127.0.0.1:8443"`? Change it to `bindaddress = "[::]:8443"` so it listens on all interfaces (a quick way to verify the listener is sketched below).
- Is there a firewall on the server?
- If you're running in docker, did you expose the port (`-p 8443:8443`) or configure the network to host/macvlan/ipvlan?
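To verify the first and third items from the server itself, a couple of commands may help; a minimal sketch, assuming the docker container is named `kanidmd`:

```shell
# Is anything actually listening on port 8443, and on which address?
ss -tlnp | grep 8443

# If running in docker, which ports are published?
# (container name "kanidmd" is an assumption)
docker port kanidmd
```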
## Client errors
When you receive a client error it will list an "Operation ID", sometimes also called the OpId or KOpId. This UUID matches the UUIDs in the server logs, allowing you to precisely locate the log entries related to the failing operation.
Try running commands with `RUST_LOG=debug` to get more information:

```shell
RUST_LOG=debug kanidm login --name anonymous
```
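Once you have the OpId from a client error, you can search the server logs for it; a minimal sketch, assuming the server runs under systemd as `kanidmd` and using a placeholder UUID:

```shell
# Replace the placeholder UUID with the OpId reported by the client error.
journalctl -u kanidmd | grep "00000000-0000-0000-0000-000000000000"
```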
## Reverse Proxies not sending HTTP/1.1 requests
NGINX (and probably other proxies) sends HTTP/1.0 requests to the upstream server by default. This will lead to errors like the following in your proxy logs:
```
*17 upstream prematurely closed connection while reading response header from upstream, client: 172.19.0.1, server: example.com, request: "GET / HTTP/1.1", upstream: "https://172.19.0.3:8443/", host: "example.com:8443"
```
The fix for NGINX is to set `proxy_http_version` to `1.1`. This can go in the same block as the `proxy_pass` option:

```
proxy_http_version 1.1;
```
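For context, a minimal sketch of how this fits into a proxy block; the upstream address is an assumption, adjust it to your deployment:

```
location / {
    # Upstream address is an assumption; point it at your kanidmd instance.
    proxy_pass https://127.0.0.1:8443;
    proxy_http_version 1.1;
}
```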
## OpenTelemetry errors
If you see something like this:

```
OpenTelemetry trace error occurred. Exporter otlp encountered the following error(s): the grpc server returns error (The system is not in a state required for the operation's execution): , detailed error message: TRACE_TOO_LARGE: max size of trace (5000000) exceeded while adding 86725 bytes to trace a657b63f6ca0415eb70b6734f20f82cf for tenant single-tenant
```
Then you'll need to increase the maximum trace size in your OTLP receiver. In Grafana Tempo you can add the following keys to your `tempo.yaml`; in this example we're setting it to 20MiB:
```yaml
overrides:
  defaults:
    global:
      max_bytes_per_trace: 20971520 # 20MiB
```
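Note that changes to `tempo.yaml` generally only take effect after Tempo is restarted; a minimal sketch, assuming it runs in a container or systemd unit named `tempo`:

```shell
# Container / unit name "tempo" is an assumption; adjust to your deployment.
docker restart tempo
# or, under systemd:
sudo systemctl restart tempo
```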