Add runbook for RabbitMQ issue.
achton committed Nov 8, 2023
1 parent dc78104 commit b244946
Showing 1 changed file with 33 additions and 0 deletions.
33 changes: 33 additions & 0 deletions docs/runbooks/rabbitmq-broker.md
@@ -0,0 +1,33 @@
# RabbitMQ broker force start

## When to use

Use this runbook when PR environments are no longer being created, the
`lagoon-core-broker-<n>` pods are missing or not running, and the container
logs contain errors like
`Error while waiting for Mnesia tables: {timeout_waiting_for_tables`.

This situation is caused by the RabbitMQ broker not starting correctly.
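
The symptoms above can be checked from a dplsh session. The following is a
minimal sketch, not part of the original runbook: the function names are
hypothetical helpers, while the `lagoon-core` namespace and the
`lagoon-core-broker-<n>` pod names are the ones used in this runbook.

```shell
# Sketch only: the function names are hypothetical helpers. The namespace
# and pod name pattern are the ones used in this runbook.

# List the broker pods and their status.
broker_pods() {
  kubectl -n lagoon-core get pods | grep 'lagoon-core-broker'
}

# Succeed if the logs of broker pod <n> (default 0) contain the Mnesia
# timeout error.
broker_has_mnesia_timeout() {
  kubectl -n lagoon-core logs "pod/lagoon-core-broker-${1:-0}" |
    grep -q 'timeout_waiting_for_tables'
}
```

For example, `broker_has_mnesia_timeout 1 && echo "broker-1 is affected"`. If
the container has already restarted, the error may only be visible with
`kubectl logs --previous`.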

## Prerequisites

* A [dplsh session](using-dplsh.md) with `DPLPLAT_ENV` exported.

## Procedure

Exec into the broker pod, stop the RabbitMQ application, and then force start
it so that it can perform its Mnesia sync correctly.

Exec into the pod:

```shell
dplsh:~/host_mount$ kubectl -n lagoon-core exec -ti pod/lagoon-core-broker-0 -- sh
```

Stop RabbitMQ:

```shell
/ $ rabbitmqctl stop_app
Stopping rabbit application on node rabbit@lagoon-core-broker-0.lagoon-core-broker-headless.lagoon-core.svc.cluster.local ...
```

Start it using [the `force_boot` command](https://www.rabbitmq.com/rabbitmqctl.8.html#force_boot):

```shell
/ $ rabbitmqctl force_boot
```
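
The same steps can also be run without an interactive shell. The sketch below
is an illustration under stated assumptions, not the procedure as originally
written: `force_boot_broker` is a hypothetical helper, and the final
`start_app` call is an addition, included because the linked man page
describes `force_boot` as taking effect the next time the node starts. Drop
that call if the broker comes back up on its own.

```shell
# Hypothetical helper: runs the runbook steps against broker pod <n>
# (default 0) without opening an interactive shell in the pod.
force_boot_broker() {
  pod="pod/lagoon-core-broker-${1:-0}"
  # Stop the RabbitMQ application but keep the Erlang node running.
  kubectl -n lagoon-core exec "$pod" -- rabbitmqctl stop_app &&
    # Mark the node to boot without waiting for its cluster peers.
    kubectl -n lagoon-core exec "$pod" -- rabbitmqctl force_boot &&
    # Assumption: start the application again explicitly.
    kubectl -n lagoon-core exec "$pod" -- rabbitmqctl start_app
}
```

Calling `force_boot_broker 0` targets `lagoon-core-broker-0`. The commands are
chained with `&&` so that a failing step stops the sequence.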

Then exit the shell and check the container logs of one of the broker pods.
The broker should start without errors.
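
One way to script this final check is sketched below. `broker_recovered` is a
hypothetical helper, and the 120 second timeout and 100 line log window are
arbitrary values chosen for illustration.

```shell
# Hypothetical helper: succeeds when broker pod <n> (default 0) becomes
# Ready and its recent log lines no longer show the Mnesia timeout.
broker_recovered() {
  pod="pod/lagoon-core-broker-${1:-0}"
  # Wait for the pod to report Ready (timeout value is an assumption).
  kubectl -n lagoon-core wait --for=condition=Ready "$pod" --timeout=120s ||
    return 1
  # Fail if the last 100 log lines still contain the timeout error.
  if kubectl -n lagoon-core logs "$pod" --tail=100 |
    grep -q 'timeout_waiting_for_tables'; then
    return 1
  fi
}
```

For example, `broker_recovered 0 && echo "broker-0 looks healthy"`. Reading the
logs by hand is still worthwhile, since other errors are not caught by this
check.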
