unable to join session keyring: unable to create session key: disk quota exceeded: unknown

I was trying to bring a stack up with docker compose up -d and kept getting this error.

username@server-name:~/directory$ docker compose --env-file .env.custom up -d
[+] Running 24/25
 ✔ Container project-name-symbolicator-cleanup-1                Running                                 0.0s
 ✔ Container project-name-symbolicator-1                        Running                                 0.0s
 ✔ Container project-name-smtp-1                                Running                                 0.0s
 ✔ Container project-name-clickhouse-1                          Healthy                                 0.0s
 ✔ Container project-name-memcached-1                           Running                                 0.0s
 ✔ Container project-name-postgres-1                            Running                                 0.0s
 ✔ Container project-name-kafka-1                               Healthy                                 0.0s
 ✔ Container project-name-redis-1                               Healthy                                 0.0s
 ✔ Container project-name-snuba-subscription-consumer-events-1  Running                                 0.0s
 ✔ Container project-name-snuba-replacer-1                      Running                                 0.0s
 ✔ Container project-name-snuba-outcomes-billing-consumer-1     Running                                 0.0s
 ✔ Container project-name-snuba-outcomes-consumer-1             Running                                 0.0s
 ✔ Container project-name-snuba-api-1                           Running                                 0.0s
 ✔ Container project-name-snuba-errors-consumer-1               Running                                 0.0s
 ✔ Container project-name-snuba-group-attributes-consumer-1     Running                                 0.0s
 ✔ Container project-name-cron-1                                Running                                 0.0s
 ✔ Container project-name-subscription-consumer-events-1        Running                                 0.0s
 ✔ Container project-name-post-process-forwarder-errors-1       Running                                 0.0s
 ✔ Container project-name-cleanup-1                             Running                                 0.0s
 ✔ Container project-name-attachments-consumer-1                Running                                 0.0s
 ✔ Container project-name-worker-1                              Running                                 0.0s
 ✔ Container project-name-web-1                                 Running                                 0.0s
 ✔ Container project-name-events-consumer-1                     Running                                 0.0s
 ✔ Container project-name-relay-1                               Running                                 0.0s
 ⠴ Container project-name-geoipupdate-1                         Starting                                0.5s
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: unable to join session keyring: unable to create session key: disk quota exceeded: unknown

It worked yesterday. It worked an hour ago. It worked 10 minutes ago when I ran the same command. ✌️Disk quota exceeded✌️? Ok computer, sure bud.

I checked for stale volumes and outdated stored images with the usual Docker tools:

docker volume ls -f dangling=true

There wasn't anything there that should've kept other containers from running, but I ran docker volume prune anyway. Let's check images.

docker images -f dangling=true

There was a bunch of stuff here, but once again, these same containers had spun up fine a few minutes earlier. Cleared it with docker image prune and tried starting the containers again. Same error message.
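As an aside, docker system df would've answered both questions in one shot; it summarizes what images, containers, local volumes, and build cache are actually consuming (add -v for a per-object breakdown):

docker system df
docker system df -v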

Ran df -h as a sanity check. Plenty of storage at /.

username@server-name:~/directory$ df -h
Filesystem                                Size  Used Avail Use% Mounted on
/dev/container/containers_server--name  275G   20G  245G   8% /
none                                      492K  4.0K  488K   1% /dev
tmpfs                                     100K     0  100K   0% /dev/lxd
tmpfs                                     1.0M     0  1.0M   0% /dev/.lxd-mounts
tmpfs                                     189G     0  189G   0% /dev/shm
tmpfs                                      76G  4.8M   76G   1% /run
tmpfs                                     5.0M     0  5.0M   0% /run/lock
tmpfs                                     4.0M     0  4.0M   0% /sys/fs/cgroup
snapfuse                                   64M   64M     0 100% /snap/core20/2318
snapfuse                                   64M   64M     0 100% /snap/core20/2264
snapfuse                                   88M   88M     0 100% /snap/lxd/28373
snapfuse                                   88M   88M     0 100% /snap/lxd/29351
snapfuse                                   39M   39M     0 100% /snap/snapd/21465
snapfuse                                   39M   39M     0 100% /snap/snapd/21759
tmpfs                                     1.0M     0  1.0M   0% /var/snap/lxd/common/ns
tmpfs                                      38G  4.0K   38G   1% /run/user/1001

LXD has been known to cause complications with Docker - nested containerization and all that fun stuff. I double-checked for arbitrary limits that might have been imposed via container profiles. Nothing stood out.
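If you want to run the same check, something like the following works; these are run on the LXD host, and the container name is a placeholder for whatever your instance is called. The first shows the effective config, including any limits.* keys inherited from profiles; the second shows what the default profile itself defines:

lxc config show <container-name> --expanded
lxc profile show default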

I started digging through the Docker logs, but nothing looked wrong. Containers were starting, running, getting stopped. Pretty normal. But then why could I only start 24 of 25 containers?
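For anyone following along, on a stock Ubuntu install the daemon is managed by systemd, so the journal plus Docker's own event stream cover most of it:

journalctl -u docker.service --since "1 hour ago"
docker events --since 1h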

I did some searching, ran into a lot of Proxmox forums and some LXD issues on GitHub, but nothing meshed with what I was experiencing. Then I ran across a blog post from a SWE describing almost exactly the same problem. Thanks, internet stranger.

The solve was the same as they posted, with some minor changes since the stack I'm running is Ubuntu -> LXD -> LXC Ubuntu container -> Docker.

On the container host I checked the maxbytes limit for the kernel keyring.

username@server-name:~$ cat /proc/sys/kernel/keys/maxbytes
20000
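For the curious, per-user keyring consumption is visible in /proc/key-users; if I'm reading the kernel docs right, the last two columns show keys used against maxkeys and bytes used against maxbytes for each UID:

cat /proc/key-users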

What's 20,000 bytes? Roughly 0.02 MB. It's 2024, we have abundant storage. Following the post's advice, I upped it to 50 MB (50 × 1024 × 1024 = 52,428,800 bytes).

echo "52428800" > /proc/sys/keys/maxbytes

Let's verify that value actually took effect.

username@server-name:~$ cat /proc/sys/kernel/keys/maxbytes
52428800
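One caveat that wasn't in the original post: echoing into /proc doesn't survive a reboot. To make it permanent, the same value can be set through sysctl (the drop-in file name below is just my pick):

sysctl -w kernel.keys.maxbytes=52428800
echo "kernel.keys.maxbytes = 52428800" >> /etc/sysctl.d/99-keyring.conf
sysctl --system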

I tried starting the containers again and they spun right up. The last part of the solve was adding this solution to the internal wiki so other people can find it if needed.