Meaning

A queue has ready messages but zero consumers attached, so nothing is draining it. The messages stay in the queue until a consumer connects.

Fires when:

max by (queue) (rabbitmq_queue_messages_ready{namespace="safetywing-<env>-infra"}) > 0
and
max by (queue) (rabbitmq_queue_consumers{namespace="safetywing-<env>-infra"}) == 0

The condition must hold for 15 minutes (for: 15m) before the alert fires. Severity is ticket and tier is component. The queue label identifies the affected queue.

Impact

Work enqueued on this queue is not being processed at all. Unlike a slow backlog, there is no progress whatsoever, so the dependent feature is effectively down. The backlog will keep growing and can eventually trip the memory or disk alarms and block publishing cluster-wide.

Diagnosis

kubectl config use-context hetzner

kubectl get rabbitmqcluster -n safetywing-<env>-infra
kubectl get pods -n safetywing-<env>-infra -l app.kubernetes.io/component=rabbitmq

# Confirm the queue has messages and no consumers
kubectl exec -n safetywing-<env>-infra <rabbitmq-pod> -- \
  rabbitmq-diagnostics list_queues name messages_ready consumers state
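On a cluster with many queues, the listing can be filtered down to the queues that match the alert condition. The following is a minimal sketch, not part of the original runbook: `stuck_queues` is a hypothetical helper name, and it assumes the tab-separated column order requested above (name, messages_ready, consumers, state).

```shell
# Hypothetical helper (not from the runbook): reads tab-separated rows in the
# order name, messages_ready, consumers, state and prints the names of queues
# that have ready messages but no consumers.
# Header and banner lines are skipped because their second column is not a
# positive number.
stuck_queues() {
  awk -F '\t' 'NF >= 3 && $2 + 0 > 0 && $3 + 0 == 0 { print $1 }'
}

# Usage (placeholders as in the command above):
#   kubectl exec -n <namespace> <rabbitmq-pod> -- \
#     rabbitmq-diagnostics list_queues name messages_ready consumers state \
#     | stuck_queues
```

The filter only reads the listing and changes nothing on the broker.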

Confirm via metrics:

max by (queue) (rabbitmq_queue_messages_ready{namespace="safetywing-<env>-infra"})
max by (queue) (rabbitmq_queue_consumers{namespace="safetywing-<env>-infra"})
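If you prefer to check from a terminal, the two queries can be joined into the alert condition and sent to the Prometheus HTTP API. This is a sketch under assumptions: `alert_expr` is a hypothetical helper, `PROM_URL` stands for wherever your Prometheus server is reachable, and the environment name passed in is a placeholder.

```shell
# Hypothetical helper: prints the alert condition for one environment.
# The argument fills the <env> part of the namespace label.
alert_expr() {
  ns="safetywing-$1-infra"
  printf 'max by (queue) (rabbitmq_queue_messages_ready{namespace="%s"}) > 0 and max by (queue) (rabbitmq_queue_consumers{namespace="%s"}) == 0' "$ns" "$ns"
}

# Usage (PROM_URL and the environment name "staging" are assumptions):
#   curl -sG "$PROM_URL/api/v1/query" \
#     --data-urlencode "query=$(alert_expr staging)"
```

Any series in the response names a queue that currently has ready messages and no consumers.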

In the management UI, open Queues and select the affected queue (its consumer count shows 0), then open Connections to confirm the consuming service has no live connection.

Mitigation

  1. Identify and check the consuming service that should bind to this queue:
    kubectl get pods -n safetywing-<env>-applications | grep <consuming-service>
    kubectl describe deployment/<consuming-service> -n safetywing-<env>-applications
    kubectl logs deployment/<consuming-service> -n safetywing-<env>-applications --tail=100
    Common causes: the deployment is scaled to 0, crash-looping, failing to start its AMQP connection, or pointing at the wrong vhost/credentials.
  2. Bring the consumer back up — scale it up or restart it:
    kubectl scale deployment/<consuming-service> -n safetywing-<env>-applications --replicas=<n>
    kubectl rollout restart deployment/<consuming-service> -n safetywing-<env>-applications
  3. Verify connectivity/credentials if it starts but never attaches a consumer: check the service’s RabbitMQ host/vhost/user config and the operator-managed default-user secret in safetywing-<env>-infra, and confirm the connection appears in the management UI.
  4. Confirm the queue name matches what the consumer subscribes to. A renamed or misdeclared queue leaves the old one orphaned with no consumers. Delete the orphaned queue only after confirming it is stale, because deleting a queue discards its messages.
  5. Once a consumer reattaches, confirm rabbitmq_queue_consumers for the queue is > 0 and that rabbitmq_queue_messages_ready falls as the backlog drains.
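
The last step can be scripted instead of re-running the listing by hand. The sketch below is illustrative only: `consumers_for` and `wait_for_consumer` are hypothetical helper names, and the input is assumed to be tab-separated rows of name, messages_ready, consumers, as produced by the list_queues command in the Diagnosis section.

```shell
# Hypothetical helpers; names and arguments are illustrative only.

# Reads tab-separated "name<TAB>messages_ready<TAB>consumers" rows on stdin
# and prints the consumer count of the named queue (exit 1 if not listed).
consumers_for() {
  awk -F '\t' -v q="$1" '
    $1 == q { print $3; found = 1; exit }
    END { exit (found ? 0 : 1) }
  '
}

# wait_for_consumer QUEUE TRIES DELAY_SECONDS COMMAND...
# Runs COMMAND up to TRIES times, DELAY_SECONDS apart, and returns 0 as soon
# as QUEUE reports at least one consumer. Returns 1 if it never does.
wait_for_consumer() {
  queue=$1 tries=$2 delay=$3
  shift 3
  i=0
  while [ "$i" -lt "$tries" ]; do
    n=$("$@" | consumers_for "$queue") || n=0
    if [ "${n:-0}" -gt 0 ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (placeholders as elsewhere in this runbook):
#   wait_for_consumer <queue> 30 10 \
#     kubectl exec -n <namespace> <rabbitmq-pod> -- \
#     rabbitmq-diagnostics list_queues name messages_ready consumers
```

A zero exit status only shows that a consumer has attached. Still watch the ready count to confirm the backlog is actually draining.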

References