Meaning
A queue has ready messages but zero consumers attached, so nothing is draining it. Messages will sit indefinitely until a consumer connects.
Fires when:
max by (queue) (rabbitmq_queue_messages_ready{namespace="safetywing-<env>-infra"}) > 0
and
max by (queue) (rabbitmq_queue_consumers{namespace="safetywing-<env>-infra"}) == 0
for: 15m, severity ticket, tier component. The queue label identifies the affected queue.
Impact
Work enqueued on this queue is not being processed at all. Unlike a slow backlog, there is no progress whatsoever, so the dependent feature is effectively down. The backlog will keep growing and can eventually trip the memory or disk alarms and block publishing cluster-wide.
Diagnosis
kubectl config use-context hetzner
kubectl get rabbitmqcluster -n safetywing-<env>-infra
kubectl get pods -n safetywing-<env>-infra -l app.kubernetes.io/component=rabbitmq
# Confirm the queue has messages and no consumers
kubectl exec -n safetywing-<env>-infra <rabbitmq-pod> -- \
  rabbitmq-diagnostics list_queues name messages_ready consumers state
Confirm via metrics:
max by (queue) (rabbitmq_queue_messages_ready{namespace="safetywing-<env>-infra"})
max by (queue) (rabbitmq_queue_consumers{namespace="safetywing-<env>-infra"})
In the management UI, open Queues →
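On a broker with many queues, the list_queues output can be narrowed to just the affected ones. The following is a minimal sketch, assuming the default tab-separated table output with the columns in the order requested above (name, messages_ready, consumers, state). The helper name `stuck_queues` is hypothetical and not part of any RabbitMQ tooling.

```shell
#!/bin/sh
# stuck_queues (hypothetical helper): read list_queues output on stdin and
# print the names of queues that have ready messages but no consumers.
# Assumes tab-separated columns: name, messages_ready, consumers, state.
stuck_queues() {
  awk -F'\t' '
    # Banner and header lines have no numeric counts, so they are skipped.
    $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $2 > 0 && $3 == 0 { print $1 }
  '
}

# Usage (placeholders as in the commands above):
#   kubectl exec -n safetywing-<env>-infra <rabbitmq-pod> -- \
#     rabbitmq-diagnostics list_queues name messages_ready consumers state | stuck_queues
```

Filtering on numeric columns rather than on line position keeps the sketch independent of how many informational lines the CLI prints before the table.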
Mitigation
- Identify and check the consuming service that should bind to this queue:
  kubectl get pods -n safetywing-<env>-applications | grep <consuming-service>
  kubectl describe deployment/<consuming-service> -n safetywing-<env>-applications
  kubectl logs deployment/<consuming-service> -n safetywing-<env>-applications --tail=100
  Common causes: the deployment is scaled to 0, crash-looping, failing to start its AMQP connection, or pointing at the wrong vhost/credentials.
- Bring the consumer back up by scaling it up or restarting it:
  kubectl scale deployment/<consuming-service> -n safetywing-<env>-applications --replicas=<n>
  kubectl rollout restart deployment/<consuming-service> -n safetywing-<env>-applications
- Verify connectivity/credentials if it starts but never attaches a consumer: check the service's RabbitMQ host/vhost/user config and the operator-managed default-user secret in safetywing-<env>-infra, and confirm the connection appears in the management UI.
- Confirm the queue name matches what the consumer subscribes to: a renamed or misdeclared queue leaves the old one orphaned with no consumers; remove the orphan if it is stale.
- Once a consumer reattaches, confirm rabbitmq_queue_consumers for the queue is > 0 and the backlog drains.
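The final confirmation step can be turned into a small polling loop instead of re-running the check by hand. This is only a sketch under stated assumptions: `wait_for_consumers` and `fetch_consumers` are hypothetical helpers, the `NS`, `POD`, and `QUEUE` variables are placeholders you would set yourself, and the list_queues output is assumed to be tab-separated.

```shell
#!/bin/sh
# wait_for_consumers (hypothetical helper): poll until the queue reports at
# least one consumer, or give up after a fixed number of attempts.
#   $1 = name of a function/command that prints the current consumer count
#   $2 = number of polls (default 30), $3 = seconds between polls (default 10)
wait_for_consumers() {
  fetch_fn="$1"
  attempts="${2:-30}"
  interval="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    count="$("$fetch_fn" 2>/dev/null)" || count=""
    case "$count" in
      '' | *[!0-9]*) ;;  # empty or non-numeric output: treat as "not attached yet"
      *)
        if [ "$count" -gt 0 ]; then
          echo "consumers=$count after $i poll(s)"
          return 0
        fi
        ;;
    esac
    # Sleep between polls, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then sleep "$interval"; fi
    i=$((i + 1))
  done
  echo "no consumer attached after $attempts poll(s)" >&2
  return 1
}

# Example fetch function (assumption: NS, POD and QUEUE are set by the caller,
# e.g. NS=safetywing-<env>-infra, POD=<rabbitmq-pod>, QUEUE=<queue>).
fetch_consumers() {
  kubectl exec -n "$NS" "$POD" -- \
    rabbitmq-diagnostics list_queues name consumers |
    awk -F'\t' -v q="$QUEUE" '$1 == q { print $2 }'
}

# Usage: wait_for_consumers fetch_consumers 30 10
```

A zero exit status only shows that a consumer has attached. Whether the backlog is actually draining still needs to be checked against rabbitmq_queue_messages_ready.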