loki-k8s

Loki

  By Canonical Observability
Channel           Revision  Published    Runs on
latest/stable     190       15 Apr 2025  Ubuntu 20.04
latest/candidate  192       15 Apr 2025  Ubuntu 20.04
latest/beta       194       15 Apr 2025  Ubuntu 20.04
latest/edge       195       08 May 2025  Ubuntu 24.04, Ubuntu 20.04
latest/edge       193       03 Apr 2025  Ubuntu 24.04, Ubuntu 20.04
1.0/stable        104       12 Dec 2023  Ubuntu 20.04
1.0/candidate     104       22 Nov 2023  Ubuntu 20.04
1.0/beta          104       22 Nov 2023  Ubuntu 20.04
1.0/edge          104       22 Nov 2023  Ubuntu 20.04
juju deploy loki-k8s

Troubleshooting missing logs

After relating loki to other charms, you may encounter situations where log lines appear to be missing.

Checklist

  • The source of the log files is related to loki.
  • The Loki URL (from the grafana-agent or promtail config files) is reachable from the source container.
  • Loki is not out of disk space.
  • You can manually post a log line to loki from within the loki pod (via localhost), from the host (via the pod IP), from the traefik container (via the Kubernetes FQDN) and from another model (via the ingress URL); see the sketch after this list.
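
A minimal sketch of these checks, assuming a machine-charm grafana-agent unit named grafana-agent/0 whose rendered config lives at /etc/grafana-agent.yaml, a loki unit loki/0 whose workload container is named loki, and the unit IP 10.1.166.94 used in the samples below (all of these are assumptions to adjust to your deployment):

# Find the Loki push URL grafana-agent was configured with
# (config path is an assumption for the machine charm).
juju ssh grafana-agent/0 grep "url:" /etc/grafana-agent.yaml

# Check free disk space in the loki workload container.
juju ssh --container loki loki/0 df -h

# Check readiness over localhost from within the loki pod
# (requires curl inside the workload container).
juju ssh --container loki loki/0 curl -s http://localhost:3100/ready

# Manually post a log line from the host via the pod IP.
curl -s -H "Content-Type: application/json" \
  -X POST http://10.1.166.94:3100/loki/api/v1/push \
  --data-raw "{\"streams\": [{ \"stream\": { \"origin\": \"troubleshooting\" }, \"values\": [ [ \"$(date +%s%9N)\", \"test log line\" ] ] }]}"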

Check status

You can curl the loki unit IP for status. In the sample output below, the ingester isn’t ready yet.

❯ curl 10.1.166.94:3100/ready
Ingester not ready: waiting for 15s after being ready
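
The loki unit IP used in these samples can be read from juju status; a quick sketch (the jq path is an assumption based on the default JSON layout):

# The pod IP is shown in the Address column.
juju status loki

# Or extract it directly.
juju status --format=json loki | jq -r '.applications.loki.units["loki/0"].address'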

❯ curl 10.1.166.94:3100/services
querier => Running
query-frontend-tripperware => Running
ring => Running
query-scheduler => Running
query-frontend => Running
ingester-querier => Running
compactor => Running
ruler => Running
ingester => Running
distributor => Running
server => Running
memberlist-kv => Running
analytics => Running
store => Running
cache-generation-loader => Running
query-scheduler-ring => Running

Confirm if Loki received anything at all

You can curl the loki unit IP for labels and alert rules. If Loki has not ingested anything yet, the labels query returns no data; once log lines arrive, the Juju topology labels show up.

❯ curl 10.1.166.94:3100/loki/api/v1/labels
{"status":"success"}

❯ curl 10.1.166.94:3100/loki/api/v1/labels
{"status":"success","data":["filename","job","juju_application","juju_charm","juju_model","juju_model_uuid","juju_unit"]}

❯ curl 10.1.166.94:3100/loki/api/v1/label/juju_unit/values
{"status":"success","data":["pg/0"]}

❯ curl 10.1.166.94:3100/loki/api/v1/rules
no rule groups found
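
Beyond individual labels, you can list every stream (label set) Loki currently knows about via the series endpoint; a sketch reusing the same unit IP:

# List all streams that carry a juju_model label.
curl -sG 10.1.166.94:3100/loki/api/v1/series \
  --data-urlencode 'match[]={juju_model=~".+"}' | jq '.data'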

Now that you know which labels exist, you can retrieve some logs:

❯ curl -sG 10.1.166.94:3100/loki/api/v1/query_range --data-urlencode 'query={juju_unit="pg/0"}' | jq '.data.result[0]'

You can query for the average logging rate. In the sample below, it is 0.1 log lines per second (6 log lines per minute).

❯ curl -sG 10.1.166.94:3100/loki/api/v1/query --data-urlencode 'query=rate({job=~".+"}[10m])' | jq '.data.result'
[
  {
    "metric": {
      "filename": "/var/log/postgresql/patroni.log",
      "job": "juju_test-bundle-iwfn_f427ffe2_pg",
      "juju_application": "pg",
      "juju_charm": "postgresql-k8s",
      "juju_model": "test-bundle-iwfn",
      "juju_model_uuid": "f427ffe2-9d96-482c-80c4-f200a20eb1bd",
      "juju_unit": "pg/0"
    },
    "value": [
      1715247333.466,
      "0.1"
    ]
  }
]

Query for particular log lines

If only a subset of logs is missing, you can confirm their existence in Loki by filtering on labels and/or content. In the sample below, loki is queried for log lines that contain “leader”.

❯ curl -sG 10.1.166.94:3100/loki/api/v1/query --data-urlencode 'query=({job=~".+"} |= "leader")' | jq '.data.result'
[
  {
    "stream": {
      "filename": "/var/log/postgresql/patroni.log",
      "job": "juju_test-bundle-iwfn_f427ffe2_pg",
      "juju_application": "pg",
      "juju_charm": "postgresql-k8s",
      "juju_model": "test-bundle-iwfn",
      "juju_model_uuid": "f427ffe2-9d96-482c-80c4-f200a20eb1bd",
      "juju_unit": "pg/0"
    },
    "values": [
      [
        "1715258886211320804",
        "2024-05-09 12:48:06 UTC [15]: INFO: no action. I am (pg-0), the leader with the lock "
      ],
      [
        "1715258876184953745",
        "2024-05-09 12:47:56 UTC [15]: INFO: no action. I am (pg-0), the leader with the lock "
      ],
      [
        "1715258866412113833",
        "2024-05-09 12:47:46 UTC [15]: INFO: no action. I am (pg-0), the leader with the lock "
      ]
    ]
  }
]
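
If the missing lines belong to a known time window, the range query can be bounded explicitly with start and end (Unix epoch nanoseconds or RFC3339 timestamps); a sketch covering the last two hours:

# Fetch "leader" lines logged by pg/0 during the last two hours.
curl -sG 10.1.166.94:3100/loki/api/v1/query_range \
  --data-urlencode 'query={juju_unit="pg/0"} |= "leader"' \
  --data-urlencode "start=$(date -d '2 hours ago' +%s%N)" \
  --data-urlencode "end=$(date +%s%N)" \
  | jq '.data.result[0].values'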

List active loggers

To obtain a list of all sources that logged something recently, query for streams whose log line count over the last minute exceeds a threshold:

❯ curl -sG 10.1.166.94:3100/loki/api/v1/query --data-urlencode 'query=count_over_time({filename=~".+"}[1m]) > 2' | jq '.data.result'
[
  {
    "metric": {
      "filename": "/var/log/postgresql/patroni.log",
      "job": "juju_test-bundle-iwfn_f427ffe2_pg",
      "juju_application": "pg",
      "juju_charm": "postgresql-k8s",
      "juju_model": "test-bundle-iwfn",
      "juju_model_uuid": "f427ffe2-9d96-482c-80c4-f200a20eb1bd",
      "juju_unit": "pg/0"
    },
    "value": [
      1715249068.007,
      "6"
    ]
  }
]
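
To see at a glance which units (rather than which files) logged recently, the same count can be aggregated by the juju_unit label; a sketch:

# Log line count per juju unit over the last 10 minutes.
curl -sG 10.1.166.94:3100/loki/api/v1/query \
  --data-urlencode 'query=sum by (juju_unit) (count_over_time({juju_model=~".+"}[10m]))' \
  | jq '.data.result'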

Logs pushed by grafana-agent or promtail

Confirm that logs are being sent out:

# grafana-agent
juju ssh grafana-agent/0 curl localhost:12345/metrics | grep "promtail_sent_"

# promtail
juju ssh mysql-router/0 curl localhost:9080/metrics | grep -E "promtail_read_|promtail_sent_"

If the values are zero (or constant for quite some time), make sure the monitored log files exist and are not empty.
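
For the postgresql example used earlier on this page, that check could look like the following sketch (the workload container name, postgresql, is an assumption; adjust the container and path to your charm):

# Confirm the scraped file exists and has recent content.
juju ssh --container postgresql pg/0 ls -l /var/log/postgresql/
juju ssh --container postgresql pg/0 tail -n 3 /var/log/postgresql/patroni.log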

Confirm Loki is reachable

A typical loki deployment may look like this:

app --- grafana-agent --- loki --- traefik

Important: If logs are pushed to loki via a cross-model relation, make sure loki has an ingress relation!
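
A sketch of adding such a relation, assuming a traefik-k8s application named trfk (the endpoint names are assumptions and may differ in your deployment):

# Relate loki to traefik so it gets a stable ingress URL.
juju integrate loki:ingress trfk:ingress-per-unit

# The traefik-provided URL is what a remote grafana-agent/promtail must reach.
juju run trfk/0 show-proxied-endpoints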

Note: In the code samples below, the loki/0 IP address is assumed to be 10.1.27.247.

First, confirm loki itself is ready:

~> curl 10.1.27.247:3100/ready
Ingester not ready: waiting for 15s after being ready

...

~> curl 10.1.27.247:3100/ready
ready

Confirm loki is in the traefik config

~> juju ssh --container traefik trfk/0 grep "url:" -B4 /opt/traefik/juju/juju_ingress_ingress-per-unit_36_loki.yaml
  services:
    juju-cos-loki-0-service:
      loadBalancer:
        servers:
        - url: http://loki-0.loki-endpoints.cos.svc.cluster.local:3100

and reachable from within traefik:

~> juju ssh --container traefik trfk/0 curl http://loki-0.loki-endpoints.cos.svc.cluster.local:3100/ready
ready

and reachable from the host via traefik:

~> juju run trfk/0 show-proxied-endpoints
Running operation 1 with 1 task
  - task 2 on unit-trfk-0

Waiting for task 2...
proxied-endpoints: '{"trfk": {"url": "http://10.167.177.193"}, "loki/0": {"url": "http://10.167.177.193/cos-loki-0"}}'

~> curl http://10.167.177.193/cos-loki-0/ready
ready

and reachable from within grafana-agent:

~> juju switch lxd
microk8s:admin/cos -> lxd:admin/welcome-lxd

~> juju ssh ga/6 curl http://10.167.177.193/cos-loki-0/ready
ready

context deadline exceeded when attempting to POST to loki

Sometimes POST requests to loki fail with context deadline exceeded:

caller=client.go:419 level=warn component=logs
  logs_config=log_file_scraper component=client host=10.84.208.194
  msg="error sending batch, will retry" status=-1 tenant= 
  error="Post \"http://10.84.208.194/cos-loki-0/loki/api/v1/push\": context deadline exceeded"

First, try to manually POST a short and simple log line to loki:

~> curl -H "Content-Type: application/json" \
  -s -X POST "http://10.167.177.193/cos-loki-0/loki/api/v1/push" \
  --data-raw "{\"streams\": [{ \"stream\": { \"foo\": \"bar2\" }, \"values\": [ [ \"$(date +%s%9N)\", \"fizzbuzz\" ] ] }]}"

If the above POST request succeeds, then the payload that fails may be too large.
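
Before adjusting timeouts, it can also be worth looking at the size and rate limits Loki is running with; a sketch using the same config endpoint (which keys are present depends on the Loki version):

# Dump the limits section of the running config; keys such as
# max_line_size, ingestion_rate_mb, per_stream_rate_limit and
# per_stream_rate_limit_burst bound how much can be pushed at once.
curl -s http://10.167.177.193/cos-loki-0/config | yq '.limits_config'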

Inspect the ingester timeout config and check whether increasing it helps (remember to restart the Pebble service after manually modifying the config); a sketch of the restart step follows the output below.

~> curl -s http://10.167.177.193/cos-loki-0/config \
  | yq '.ingester_client.remote_timeout'
5s
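
A sketch of the restart step, assuming the workload container and the Pebble service are both named loki, and the default socket path for sidecar charms:

# Open a shell in the loki workload container...
juju ssh --container loki loki/0
# ...then, inside the container, point the Pebble CLI at the workload's socket
# (container, socket path and service name are assumptions; list services first).
export PEBBLE_SOCKET=/charm/container/pebble.socket
/charm/bin/pebble services
/charm/bin/pebble restart loki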
