# Fixing "etcdserver: mvcc: database space exceeded" (kind cluster)

After load testing cert-manager with 100,000 certs on a kind cluster, I ended up with the error message:

```
etcdserver: mvcc: database space exceeded
```

I had 79,725 secrets in the cluster. I tried compacting and defragmenting, which brought the DB size from 2.2GiB down to 1.2GiB, but it was still unusable. After restarting etcd, its DB size went back to 2.2GiB.

## Installing etcdctl

But first, I had to install etcdctl to debug what was going on:

1. Go to https://github.com/etcd-io/etcd/releases.
2. Exec into your cluster:

   ```
   docker exec -it kind-control-plane bash
   ```

3. Copy-paste and run the installation commands from the release page.
4. Finally, move the binary, as you will get "Permission denied" if you try to run a binary from `/tmp` instead of `/bin`:

   ```bash
   mv /tmp/etcd-download-test/etcdctl /usr/local/bin/etcdctl-bin
   ```

5. Create a shim with the connection flags so you don't need to type them every single time:

   ```bash
   tee /usr/local/bin/etcdctl <<'EOF' && chmod ugo+x /usr/local/bin/etcdctl
   #!/bin/bash
   ETCDCTL_API=3 exec etcdctl-bin \
     --endpoints=https://127.0.0.1:2379 \
     --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --cert=/etc/kubernetes/pki/etcd/peer.crt \
     --key=/etc/kubernetes/pki/etcd/peer.key \
     "$@"
   EOF
   ```

## Debugging with etcdctl

```
etcdctl endpoint status --write-out=table
```

```
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://127.0.0.1:2379 | fa0ec0897a822f07 |  3.5.10 |  2.2 GB |      true |      false |        62 |    8025698 |            8025683 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
```

The problem was that the default value for `--quota-backend-bytes` is 2GiB. As soon as that quota is reached, etcd refuses writes. To increase it, I edited the etcd static pod manifest:

```
vim /etc/kubernetes/manifests/etcd.yaml
```

Then added:

```diff
+ - --quota-backend-bytes=4294967296 # 4GiB
```

Finally, I restarted the container:

```bash
crictl ps --name etcd -o json \
  | jq ".containers[].id" -r \
  | xargs -I@ crictl stop @
```

If needed, you can view the etcd logs with:

```bash
crictl ps --name etcd -o json \
  | jq ".containers[].id" -r \
  | xargs -I@ crictl logs @
```

## Tuning cert-manager for clusters with over 50,000 secrets