
Loki
by Canonical Observability
Channel | Revision | Published |
---|---|---|
latest/stable | 190 | 15 Apr 2025 |
latest/candidate | 192 | 15 Apr 2025 |
latest/beta | 194 | 15 Apr 2025 |
latest/edge | 195 | 08 May 2025 |
latest/edge | 193 | 03 Apr 2025 |
1.0/stable | 104 | 12 Dec 2023 |
1.0/candidate | 104 | 22 Nov 2023 |
1.0/beta | 104 | 22 Nov 2023 |
1.0/edge | 104 | 22 Nov 2023 |
juju deploy loki-k8s
After relating loki to other charms, you may encounter situations where log lines appear to be missing.
Checklist
- The source of the log files is related to Loki.
- The Loki URL (from the grafana-agent or promtail config files) is reachable from the source container.
- Loki is not out of disk space.
- You can manually post a log line to Loki from within the Loki pod (via localhost), from the host (via the pod IP), from the traefik container (via the Kubernetes FQDN), and from another model (via the ingress URL); see the sketch below.
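For the last check, here is a minimal sketch of a manual push from within the Loki pod via localhost. It assumes the unit is loki/0 and that curl and GNU date are available in the workload container; the push API expects a nanosecond timestamp.
# Push a test log line to Loki from within its own workload container
juju ssh --container loki loki/0 'curl -si -X POST http://localhost:3100/loki/api/v1/push \
  -H "Content-Type: application/json" \
  --data-raw "{\"streams\": [{\"stream\": {\"check\": \"localhost\"}, \"values\": [[\"$(date +%s%9N)\", \"test log line\"]]}]}"'
A successful push prints an HTTP 204 (No Content) status line.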
Check status
You can curl the Loki unit IP for status.
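To find the unit IP in the first place, you can read it from juju status (a sketch; the yq path assumes the application is named loki):
# Pod IP of the loki/0 unit
juju status loki --format=yaml | yq '.applications.loki.units."loki/0".address'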
In the sample output below, the ingester isn’t ready yet.
❯ curl 10.1.166.94:3100/ready
Ingester not ready: waiting for 15s after being ready
❯ curl 10.1.166.94:3100/services
querier => Running
query-frontend-tripperware => Running
ring => Running
query-scheduler => Running
query-frontend => Running
ingester-querier => Running
compactor => Running
ruler => Running
ingester => Running
distributor => Running
server => Running
memberlist-kv => Running
analytics => Running
store => Running
cache-generation-loader => Running
query-scheduler-ring => Running
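If the ingester stays not-ready for a long time, it can help to look at the Loki workload logs themselves. A sketch, assuming the workload container and the Pebble service are both named loki:
# Tail the Loki service logs via Pebble inside the workload container
# (requires a Pebble version that has the "logs" command)
juju ssh --container loki loki/0 /charm/bin/pebble logs loki
# Charm-side errors end up in the Juju debug log
juju debug-log --include loki/0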
Confirm if Loki received anything at all
You can curl the Loki unit IP for labels and alerts.
❯ curl 10.1.166.94:3100/loki/api/v1/labels
{"status":"success"}
A response without a "data" field, like the one above, means Loki has no labels yet, i.e. it has not ingested anything. Once logs arrive, the Juju topology labels show up:
❯ curl 10.1.166.94:3100/loki/api/v1/labels
{"status":"success","data":["filename","job","juju_application","juju_charm","juju_model","juju_model_uuid","juju_unit"]}
❯ curl 10.1.166.94:3100/loki/api/v1/label/juju_unit/values
{"status":"success","data":["pg/0"]}
❯ curl 10.1.166.94:3100/loki/api/v1/rules
no rule groups found
Now that you know which labels exist, you can retrieve some logs:
❯ curl -sG 10.1.166.94:3100/loki/api/v1/query_range --data-urlencode 'query={juju_unit="pg/0"}' | jq '.data.result[0]'
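By default, query_range only covers roughly the last hour; you can widen the window and cap the number of returned lines with the start, end and limit parameters (a sketch, assuming GNU date):
# Up to 50 log lines from the last 6 hours for a given unit
curl -sG 10.1.166.94:3100/loki/api/v1/query_range \
  --data-urlencode 'query={juju_unit="pg/0"}' \
  --data-urlencode "start=$(date -d '6 hours ago' +%s)000000000" \
  --data-urlencode "end=$(date +%s)000000000" \
  --data-urlencode 'limit=50' | jq '.data.result[0].values | length'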
You can query for the average logging rate. In the sample below, it is 0.1 log lines per second (6 log lines per minute).
❯ curl -sG 10.1.166.94:3100/loki/api/v1/query --data-urlencode 'query=rate({job=~".+"}[10m])' | jq '.data.result'
[
{
"metric": {
"filename": "/var/log/postgresql/patroni.log",
"job": "juju_test-bundle-iwfn_f427ffe2_pg",
"juju_application": "pg",
"juju_charm": "postgresql-k8s",
"juju_model": "test-bundle-iwfn",
"juju_model_uuid": "f427ffe2-9d96-482c-80c4-f200a20eb1bd",
"juju_unit": "pg/0"
},
"value": [
1715247333.466,
"0.1"
]
}
]
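To see the total rate per application rather than per stream, you can aggregate with sum by:
# Total log rate per Juju application over the last 10 minutes
curl -sG 10.1.166.94:3100/loki/api/v1/query \
  --data-urlencode 'query=sum by (juju_application) (rate({job=~".+"}[10m]))' | jq '.data.result'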
Query for particular log lines
If only a subset of logs is missing, you can confirm their existence in Loki by filtering on labels and/or content. In the sample below, Loki is queried for log lines that contain “leader”.
❯ curl -sG 10.1.166.94:3100/loki/api/v1/query --data-urlencode 'query=({job=~".+"} |= "leader")' | jq '.data.result'
[
{
"stream": {
"filename": "/var/log/postgresql/patroni.log",
"job": "juju_test-bundle-iwfn_f427ffe2_pg",
"juju_application": "pg",
"juju_charm": "postgresql-k8s",
"juju_model": "test-bundle-iwfn",
"juju_model_uuid": "f427ffe2-9d96-482c-80c4-f200a20eb1bd",
"juju_unit": "pg/0"
},
"values": [
[
"1715258886211320804",
"2024-05-09 12:48:06 UTC [15]: INFO: no action. I am (pg-0), the leader with the lock "
],
[
"1715258876184953745",
"2024-05-09 12:47:56 UTC [15]: INFO: no action. I am (pg-0), the leader with the lock "
],
[
"1715258866412113833",
"2024-05-09 12:47:46 UTC [15]: INFO: no action. I am (pg-0), the leader with the lock "
]
]
}
]
List active loggers
To obtain a list of all sources that logged something recently, filter for streams with more than a couple of log lines over the last minute:
❯ curl -sG 10.1.166.94:3100/loki/api/v1/query --data-urlencode 'query=count_over_time({filename=~".+"}[1m]) > 2' | jq '.data.result'
[
{
"metric": {
"filename": "/var/log/postgresql/patroni.log",
"job": "juju_test-bundle-iwfn_f427ffe2_pg",
"juju_application": "pg",
"juju_charm": "postgresql-k8s",
"juju_model": "test-bundle-iwfn",
"juju_model_uuid": "f427ffe2-9d96-482c-80c4-f200a20eb1bd",
"juju_unit": "pg/0"
},
"value": [
1715249068.007,
"6"
]
}
]
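For a more compact overview, the same idea can be aggregated per unit and file:
# One result per (unit, file) combination that logged anything in the last 10 minutes
curl -sG 10.1.166.94:3100/loki/api/v1/query \
  --data-urlencode 'query=sum by (juju_unit, filename) (count_over_time({filename=~".+"}[10m]))' | jq '.data.result[].metric'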
Logs pushed by grafana-agent or promtail
Confirm that logs are being sent out:
# grafana-agent
juju ssh grafana-agent/0 curl localhost:12345/metrics | grep "promtail_sent_"
# promtail
juju ssh mysql-router/0 curl localhost:9080/metrics | grep -E "promtail_read_|promtail_sent_"
If the values are zero (or constant for quite some time), make sure the monitored log files exist and are not empty.
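To confirm which files are being scraped in the first place, you can inspect the scrape targets in the agent config. A sketch: the config path (/etc/grafana-agent.yaml) is an assumption that may differ per charm and revision, and the log file path is only an example.
# List the scraped log paths configured by the grafana-agent machine charm
juju ssh grafana-agent/0 grep "__path__" /etc/grafana-agent.yaml
# grafana-agent is a subordinate, so the files live on the same machine;
# verify they exist and are not empty
juju ssh grafana-agent/0 ls -l /var/log/postgresql/patroni.log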
Confirm Loki is reachable
A typical loki deployment may look like this:
app --- grafana-agent --- loki --- traefik
Important: If logs are pushed to loki via a cross-model relation, make sure loki has an ingress relation!
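If the ingress relation is missing, you can add it. A sketch, using the application names from this page (loki and trfk); the endpoint names are assumptions based on the ingress-per-unit interface and may differ per revision:
# Juju 3.x; on Juju 2.9 use "juju relate" instead
juju integrate loki:ingress trfk:ingress-per-unit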
Note: In the code samples below, the loki/0 IP address is assumed to be 10.1.27.247.
First, confirm loki itself is ready:
~> curl 10.1.27.247:3100/ready
Ingester not ready: waiting for 15s after being ready
...
~> curl 10.1.27.247:3100/ready
ready
Confirm loki is in the traefik config
~> juju ssh --container traefik trfk/0 grep "url:" -B4 /opt/traefik/juju/juju_ingress_ingress-per-unit_36_loki.yaml
services:
juju-cos-loki-0-service:
loadBalancer:
servers:
- url: http://loki-0.loki-endpoints.cos.svc.cluster.local:3100
and reachable from within traefik:
~> juju ssh --container traefik trfk/0 curl http://loki-0.loki-endpoints.cos.svc.cluster.local:3100/ready
ready
and reachable from the host via traefik:
~> juju run trfk/0 show-proxied-endpoints
Running operation 1 with 1 task
- task 2 on unit-trfk-0
Waiting for task 2...
proxied-endpoints: '{"trfk": {"url": "http://10.167.177.193"}, "loki/0": {"url": "http://10.167.177.193/cos-loki-0"}}'
~> curl http://10.167.177.193/cos-loki-0/ready
ready
and reachable from within grafana-agent:
~> juju switch lxd
microk8s:admin/cos -> lxd:admin/welcome-lxd
~> juju ssh ga/6 curl http://10.167.177.193/cos-loki-0/ready
ready
context deadline exceeded when attempting to POST to loki
Sometimes POST requests to loki fail with context deadline exceeded:
caller=client.go:419 level=warn component=logs
logs_config=log_file_scraper component=client host=10.84.208.194
msg="error sending batch, will retry" status=-1 tenant=
error="Post \"http://10.84.208.194/cos-loki-0/loki/api/v1/push\": context deadline exceeded"
First, try to manually POST a short and simple log line to loki (ref):
~> curl -H "Content-Type: application/json" \
-s -X POST "http://10.167.177.193/cos-loki-0/loki/api/v1/push" \
--data-raw "{\"streams\": [{ \"stream\": { \"foo\": \"bar2\" }, \"values\": [ [ \"$(date +%s%9N)\", \"fizzbuzz\" ] ] }]}"
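If the push succeeds (HTTP 204), you can verify the test line actually landed by querying back the label you just pushed:
# Query back the stream pushed above
curl -sG "http://10.167.177.193/cos-loki-0/loki/api/v1/query" \
  --data-urlencode 'query={foo="bar2"}' | jq '.data.result'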
If the manual POST request succeeds, then the payload that fails is perhaps too large.
Inspect the ingester timeout config and check if increasing it helps (remember to restart the pebble service after manually modifying the config):
~> curl -s http://10.167.177.193/cos-loki-0/config \
| yq '.ingester_client.remote_timeout'
5s
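A minimal sketch of the restart step, assuming the workload container and the Pebble service are both named loki:
# Restart the Loki service after editing the config inside the workload container
juju ssh --container loki loki/0 /charm/bin/pebble restart loki
# Keep in mind that the charm may overwrite manual config changes on a later hook run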
References
- https://grafana.com/docs/loki/v2.9.x/query/
- https://grafana.com/docs/loki/v2.9.x/reference/api/#query-loki
- https://megamorf.gitlab.io/cheat-sheets/loki/
- https://grafana.com/docs/loki/v2.9.x/query/log_queries/
- https://helgeklein.com/blog/logql-a-primer-on-querying-loki-from-grafana/