# Using ext4/xfs project quota with the OpenEBS hostpath LocalPV provisioner to limit PVC usage
## Introduction
### About project quota
The quota subsystem limits disk usage.
Classified by the subject being limited, quota has three parts: user quota, group quota, and project quota. As the names suggest, user quota and group quota limit a user and a user group respectively, while project quota limits a project id. When all subdirectories and files under a directory carry the same project id, the total disk usage of that directory can be limited.
The quota subsystem is actually an old feature: user quota and group quota have been supported since Linux v2.6, while project quota arrived relatively late. The project quota feature originated in XFS; ext4 gained official project quota support in Linux v4.5.
### Quota limits
Classified by the object being limited, quota consists of block quota and inode quota.
Limits come in two types, soft limit and hard limit. A hard limit can never be exceeded: if an inode/block allocation would push the current inode/block usage past the hard limit, the allocation fails.
A soft limit may be exceeded temporarily: if an allocation pushes the current inode/block usage past the soft limit but not past the hard limit, only a warning is printed and the allocation still succeeds.
However, usage cannot stay above the soft limit indefinitely. The maximum time it may stay above the soft limit is called the grace time, which starts counting the first time the soft limit is exceeded. Within the grace time, allocations succeed even though usage exceeds the soft limit; once the grace time expires, if usage has still not dropped below the soft limit, further allocations fail.
The grace time parameter is filesystem wide: all projects on the same filesystem (disk partition) share the same grace time setting.
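As a hands-on illustration of these concepts, the following is a minimal sketch of setting a project quota by hand with the standard quota tools, assuming an ext4 filesystem mounted at /openebs with the prjquota option and a hypothetical project id 42 on a directory /openebs/demo:
```bash
# create a directory and assign it project id 42; +P makes new children inherit the id
mkdir -p /openebs/demo
chattr -p 42 +P /openebs/demo

# block soft limit 8000000 KB and hard limit 10000000 KB; 0 means no inode limits
setquota -P 42 8000000 10000000 0 0 /openebs

# grace time is filesystem-wide: block grace and inode grace, in seconds (7 days here)
setquota -P -t 604800 604800 /openebs

# show per-project usage and limits
repquota -P /openebs
```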
For more details on xfs/ext4 quota, see the following documents:
- xfs quota: https://man7.org/linux/man-pages/man8/xfs_quota.8.html
- ext4 quota: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_file_systems/assembly_limiting-storage-space-usage-on-ext4-with-quotas_managing-file-systems
## Test environment
A single-node k8s cluster:
- os: ubuntu 20.04.4 LTS (Focal Fossa)
- kernel: Linux 5.4.0-135-generic #152-Ubuntu SMP Wed Nov 23 20:19:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
- k8s: v1.23.7
- helm: v3.6.3
## Installing the OpenEBS LocalPV Helm chart
```bash
root@stonetest1:~# helm repo add openebs https://openebs.github.io/charts
root@stonetest1:~# helm repo update
root@stonetest1:~# helm search repo openebs
NAME CHART VERSION APP VERSION DESCRIPTION
openebs/openebs 3.3.1 3.3.0 Containerized Attached Storage for Kubernetes
root@stonetest1:~# cat openebs-values.yaml
apiserver:
  enabled: false
varDirectoryPath:
  baseDir: "/openebs"
provisioner:
  enabled: false
localprovisioner:
  enabled: true
  image: "stoneshiyunify/openebs-localpv"
  imageTag: "v0.2"
  basePath: "/openebs/local"
  deviceClass:
    enabled: false
  hostpathClass:
    # Name of the default hostpath StorageClass
    name: openebs-hostpath
    # If true, enables creation of the openebs-hostpath StorageClass
    enabled: true
    # Available reclaim policies: Delete/Retain, defaults: Delete.
    reclaimPolicy: Delete
    # If true, sets the openebs-hostpath StorageClass as the default StorageClass
    isDefaultClass: false
    # Path on the host where local volumes of this storage class are mounted under.
    # NOTE: If not specified, this defaults to the value of localprovisioner.basePath.
    basePath: "/openebs/local"
    # Custom node affinity label(s) for example "openebs.io/node-affinity-value"
    # that will be used instead of hostnames
    # This helps in cases where the hostname changes when the node is removed and
    # added back with the disks still intact.
    # Example:
    # nodeAffinityLabels:
    #   - "openebs.io/node-affinity-key-1"
    #   - "openebs.io/node-affinity-key-2"
    nodeAffinityLabels: []
    # Prerequisite: XFS Quota requires an XFS filesystem mounted with
    # the 'pquota' or 'prjquota' mount option.
    xfsQuota:
      # If true, enables XFS project quota
      enabled: true
      # Detailed configuration options for XFS project quota.
      # If XFS Quota is enabled with the default values, the usage limit
      # is set at the storage capacity specified in the PVC.
      softLimitGrace: "80%"
      hardLimitGrace: "100%"
    # Prerequisite: EXT4 Quota requires an EXT4 filesystem mounted with
    # the 'prjquota' mount option.
    ext4Quota:
      # If true, enables EXT4 project quota
      enabled: true
      # Detailed configuration options for EXT4 project quota.
      # If EXT4 Quota is enabled with the default values, the usage limit
      # is set at the storage capacity specified in the PVC.
      softLimitGrace: "80%"
      hardLimitGrace: "100%"
snapshotOperator:
  enabled: false
ndm:
  enabled: false
ndmOperator:
  enabled: false
ndmExporter:
  enabled: false
webhook:
  enabled: false
crd:
  enableInstall: false
policies:
  monitoring:
    enabled: false
analytics:
  enabled: false
jiva:
  enabled: false
  openebsLocalpv:
    enabled: false
  localpv-provisioner:
    openebsNDM:
      enabled: false
cstor:
  enabled: false
  openebsNDM:
    enabled: false
  openebs-ndm:
    enabled: false
localpv-provisioner:
  enabled: false
  openebsNDM:
    enabled: false
zfs-localpv:
  enabled: false
lvm-localpv:
  enabled: false
nfs-provisioner:
  enabled: false
root@stonetest1:~# kubectl create ns openebs
root@stonetest1:~# helm install openebs openebs/openebs -n openebs -f openebs-values.yaml
root@stonetest1:~# kubectl -n openebs get pod
NAME READY STATUS RESTARTS AGE
openebs-localpv-provisioner-6d9bffd9db-g2hpw 1/1 Running 1 (56m ago) 134m
root@stonetest1:~# kubectl get sc openebs-hostpath
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
openebs-hostpath openebs.io/local Delete WaitForFirstConsumer false 133m
root@stonetest1:~# kubectl get sc openebs-hostpath -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/openebs/local"
      - name: XFSQuota
        enabled: "true"
        data:
          softLimitGrace: "80%"
          hardLimitGrace: "100%"
      - name: EXT4Quota
        enabled: "true"
        data:
          softLimitGrace: "80%"
          hardLimitGrace: "100%"
    meta.helm.sh/release-name: openebs
    meta.helm.sh/release-namespace: openebs
    openebs.io/cas-type: local
  creationTimestamp: "2022-12-09T03:59:00Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: openebs-hostpath
  resourceVersion: "34215091"
  uid: 94b5b547-efde-478f-92b8-f3f676064dcf
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
**Note**
- When installing the chart, I changed the localpv-provisioner image to one I (stone) built myself (stoneshiyunify/openebs-localpv:v0.2), because I believe the official image computes the quota limit values incorrectly. This has not yet been confirmed by the OpenEBS maintainers; see the GitHub issue: https://github.com/openebs/dynamic-localpv-provisioner/issues/150
- It is strongly recommended to give every node a dedicated filesystem (on its own disk/partition) exclusively for OpenEBS, because project quota must be supported and enabled at the filesystem level. In the chart, the OpenEBS basePath is set to /openebs, which is the mount point of a dedicated disk/partition used only by OpenEBS. The sections below describe how to mount a new disk at /openebs and enable quota. The chart's default /var/openebs directory is not recommended, because it usually shares a filesystem with the root directory /, and the root filesystem may not be able to enable quota.
## Preparing the test manifest
Prepare a Deployment to test the quota feature. The manifest creates a PVC and mounts it into a busybox container.
```bash
root@stonetest1:~# cat busy-deployment.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: busybox-test
spec:
  storageClassName: openebs-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10G
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox-test
  template:
    metadata:
      labels:
        app: busybox-test
    spec:
      containers:
      - name: busybox
        image: busybox:1.29
        imagePullPolicy: IfNotPresent
        command: [ "/bin/sh", "-c", "tail -f /dev/null" ]
        volumeMounts:
        - name: volume1
          mountPath: "/mnt/volume1"
      volumes:
      - name: volume1
        persistentVolumeClaim:
          claimName: busybox-test
```
## Using ext4 project quota
### Mounting a dedicated OpenEBS filesystem and enabling project quota
Log in to every node of the cluster and perform the following steps:
- Install the quota utilities and the quota kernel module
```bash
root@stonetest1:~# sudo apt update && sudo apt install quota linux-image-extra-virtual
# check if quota installed
root@stonetest1:~# quota --version
```
- Attach a disk to the cluster node. In this article the disk shows up as /dev/vdc.
- Format the disk and create the filesystem
```bash
root@stonetest1:~# mkfs.ext4 -O project,quota /dev/vdc
```
- Mount the filesystem with project quota enabled
```bash
root@stonetest1:~# mkdir /openebs
root@stonetest1:~# mount -o prjquota /dev/vdc /openebs
root@stonetest1:~# mount | grep openebs
/dev/vdc on /openebs type ext4 (rw,relatime,prjquota)
```
Tip: add this disk to /etc/fstab so the mount survives node reboots; a sketch of such an entry follows.
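A minimal /etc/fstab entry, assuming the /dev/vdc device used in this article (in practice the filesystem UUID is preferable to the device name):
```bash
# /etc/fstab entry for the dedicated OpenEBS ext4 filesystem, mounted with project quota
/dev/vdc  /openebs  ext4  defaults,prjquota  0  2
```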
- Check quota status
```bash
root@stonetest1:~# quotaon -Ppv /openebs
project quota on /openebs (/dev/vdc) is on (enforced)
```
The quota state should be "enforced", which means project quota is enabled.
### Testing
```bash
# create the PVC and start the workload
root@stonetest1:~# kubectl create ns ext4
root@stonetest1:~# kubectl -n ext4 apply -f busy-deployment.yaml
root@stonetest1:~# kubectl -n ext4 get pod
NAME READY STATUS RESTARTS AGE
busybox-test-64856dd56f-wkp92 1/1 Running 0 92m
root@stonetest1:~# kubectl -n ext4 get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
busybox-test Bound pvc-ec2182f3-0e78-48e8-a427-eb5224c8c9c7 10G RWO openebs-hostpath 92m
# the project id of /openebs is 0
root@stonetest1:~# lsattr -p -d /openebs
0 --------------e----- /openebs
# the project id of /openebs/local/pvc-ec2182f3-0e78-48e8-a427-eb5224c8c9c7 is 1
root@stonetest1:/openebs/local# lsattr -p
1 --------------e---P- ./pvc-ec2182f3-0e78-48e8-a427-eb5224c8c9c7
# check quota usage
root@stonetest1:~# repquota -P /openebs
*** Report for project quotas on device /dev/vdc
Block grace time: 7days; Inode grace time: 7days
Block limits File limits
Project used soft hard grace used soft hard grace
----------------------------------------------------------------------
#0 -- 24 0 0 3 0 0
#1 -- 4 8000000 10000000 1 0 0
# Notes:
# project id 0 is the quota status of /openebs
# project id 1 is the quota usage of pvc-ec2182f3-0e78-48e8-a427-eb5224c8c9c7
# 8000000 KB is 80% of the PVC capacity (10G), i.e. the soft limit
# 10000000 KB is 100% of the PVC capacity (10G), i.e. the hard limit
# 4 KB is the current space usage of this PVC
# exec into the busybox container and create files to test the quota
root@stonetest1:~# kubectl -n ext4 exec -it busybox-test-64856dd56f-wkp92 -- sh
/ # df -h
Filesystem Size Used Available Use% Mounted on
overlay 96.7G 80.0G 16.8G 83% /
tmpfs 64.0M 0 64.0M 0% /dev
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/vda1 96.7G 80.0G 16.8G 83% /dev/termination-log
/dev/vdc 7.6G 4.0K 7.6G 0% /mnt/volume1
/dev/vda1 96.7G 80.0G 16.8G 83% /etc/resolv.conf
/dev/vda1 96.7G 80.0G 16.8G 83% /etc/hostname
/dev/vda1 96.7G 80.0G 16.8G 83% /etc/hosts
shm 64.0M 0 64.0M 0% /dev/shm
tmpfs 14.4G 12.0K 14.4G 0% /var/run/secrets/kubernetes.io/serviceaccount
tmpfs 7.8G 0 7.8G 0% /proc/acpi
tmpfs 64.0M 0 64.0M 0% /proc/kcore
tmpfs 64.0M 0 64.0M 0% /proc/keys
tmpfs 64.0M 0 64.0M 0% /proc/timer_list
tmpfs 64.0M 0 64.0M 0% /proc/sched_debug
tmpfs 7.8G 0 7.8G 0% /proc/scsi
tmpfs 7.8G 0 7.8G 0% /sys/firmware
/ # cd /mnt/volume1
/mnt/volume1 # fallocate -l 6G aaa
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 6291460
-rw-r--r-- 1 root root 6442450944 Dec 9 05:42 aaa
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 6291464 1708536 79% /mnt/volume1
/mnt/volume1 # fallocate -l 3G bbb
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 9437192
-rw-r--r-- 1 root root 6442450944 Dec 9 05:42 aaa
-rw-r--r-- 1 root root 3221225472 Dec 9 05:43 bbb
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 8000000 0 100% /mnt/volume1
/mnt/volume1 # fallocate -l 2G ccc
fallocate: fallocate 'ccc': Disk quota exceeded
/mnt/volume1 # fallocate -l 50M ddd
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 9922568
-rw-r--r-- 1 root root 6442450944 Dec 9 05:42 aaa
-rw-r--r-- 1 root root 3221225472 Dec 9 05:43 bbb
-rw-r--r-- 1 root root 444596224 Dec 9 05:43 ccc
-rw-r--r-- 1 root root 52428800 Dec 9 05:43 ddd
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 8000000 0 100% /mnt/volume1
# check the quota report
root@stonetest1:~# repquota -P /openebs
*** Report for project quotas on device /dev/vdc
Block grace time: 7days; Inode grace time: 7days
Block limits File limits
Project used soft hard grace used soft hard grace
----------------------------------------------------------------------
#0 -- 24 0 0 3 0 0
#1 +- 9922572 8000000 10000000 6days 5 0 0
```
### Disabling/enabling/manually editing project quota
- Enable: `quotaon -P /openebs`
- Disable: `quotaoff -P /openebs`
- Edit: `edquota`
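`edquota` opens an interactive editor; as a non-interactive alternative, `setquota` can adjust a project's limits directly. A sketch, reusing project id 1 from the test above:
```bash
# raise project 1's block soft/hard limits to 12000000/15000000 KB (inode limits stay 0 = unlimited)
setquota -P 1 12000000 15000000 0 0 /openebs
repquota -P /openebs
```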
## Using xfs project quota
### Mounting a dedicated OpenEBS filesystem and enabling project quota
Log in to every node of the cluster and perform the following steps:
- xfs_quota is a long-standing tool and ships with most Linux distributions (it is part of xfsprogs).
```bash
root@stonetest:~# xfs_quota -V
xfs_quota version 5.13.0
```
- Attach a disk to the cluster node. In this article the disk shows up as /dev/vdc.
- Format the disk and create the filesystem
```bash
root@stonetest:~# mkfs -t xfs /dev/vdc
```
- Mount the filesystem with project quota enabled
```bash
root@stonetest:~# mkdir /openebs
root@stonetest:~# mount -o prjquota /dev/vdc /openebs
root@stonetest:~# mount | grep vdc
/dev/vdc on /openebs type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
```
Tip: add this disk to /etc/fstab so the mount survives node reboots (an example /etc/fstab entry is shown in the ext4 section above; the prjquota option is the same).
- Check quota status
```bash
root@stonetest:~# xfs_quota -x
xfs_quota> state -p
Project quota state on /openebs (/dev/vdc)
Accounting: ON
Enforcement: ON
Inode: #131 (2 blocks, 2 extents)
Blocks grace time: [7 days]
Blocks max warnings: 5
Inodes grace time: [7 days]
Inodes max warnings: 5
Realtime Blocks grace time: [7 days]
```
### Testing
```bash
root@stonetest:~# kubectl -n xfs apply -f busy-deployment.yaml
persistentvolumeclaim/busybox-test created
deployment.apps/busybox-test created
root@stonetest:~# kubectl -n xfs get pod
NAME READY STATUS RESTARTS AGE
busybox-test-64856dd56f-9282t 1/1 Running 0 35s
root@stonetest:~# kubectl -n xfs get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
busybox-test Bound pvc-29537c78-cbc6-4d60-a2d4-84db36483a94 10G RWO openebs-hostpath 38s
root@stonetest:~# xfs_quota -x
xfs_quota> report
Project quota on /openebs (/dev/vdc)
Blocks
Project ID Used Soft Hard Warn/Grace
---------- --------------------------------------------------
#0 0 0 0 00 [0 days]
#1 0 8000000 10000000 00 [--------]
# notes:
# project id 0 is the project quota of /openebs
# project id 1 is the project quota of the PVC above
root@stonetest:~# kubectl -n xfs exec -it busybox-test-64856dd56f-9282t -- sh
/ # df -h
Filesystem Size Used Available Use% Mounted on
overlay 96.7G 26.3G 70.4G 27% /
tmpfs 64.0M 0 64.0M 0% /dev
/dev/vdc 7.6G 0 7.6G 0% /mnt/volume1
/dev/vda1 96.7G 26.3G 70.4G 27% /dev/termination-log
/dev/vda1 96.7G 26.3G 70.4G 27% /etc/resolv.conf
/dev/vda1 96.7G 26.3G 70.4G 27% /etc/hostname
/dev/vda1 96.7G 26.3G 70.4G 27% /etc/hosts
shm 64.0M 0 64.0M 0% /dev/shm
tmpfs 14.4G 12.0K 14.4G 0% /var/run/secrets/kubernetes.io/serviceaccount
tmpfs 7.8G 0 7.8G 0% /proc/acpi
tmpfs 64.0M 0 64.0M 0% /proc/kcore
tmpfs 64.0M 0 64.0M 0% /proc/keys
tmpfs 64.0M 0 64.0M 0% /proc/timer_list
tmpfs 7.8G 0 7.8G 0% /proc/scsi
tmpfs 7.8G 0 7.8G 0% /sys/firmware
/ # cd /mnt/volume1
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 0
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 0 8000000 0% /mnt/volume1
/mnt/volume1 # fallocate -l 6G aaa
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 6291456
-rw-r--r-- 1 root root 6442450944 Dec 9 09:40 aaa
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 6291456 1708544 79% /mnt/volume1
/mnt/volume1 # fallocate -l 3G bbb
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 9437184
-rw-r--r-- 1 root root 6442450944 Dec 9 09:40 aaa
-rw-r--r-- 1 root root 3221225472 Dec 9 09:41 bbb
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 8000000 0 100% /mnt/volume1
/mnt/volume1 # fallocate -l 2G ccc
fallocate: fallocate 'ccc': No space left on device
/mnt/volume1 # fallocate -l 50M ddd
/mnt/volume1 # ls -l && df -B 1K /mnt/volume1
total 9488384
-rw-r--r-- 1 root root 6442450944 Dec 9 09:40 aaa
-rw-r--r-- 1 root root 3221225472 Dec 9 09:41 bbb
-rw-r--r-- 1 root root 0 Dec 9 09:41 ccc
-rw-r--r-- 1 root root 52428800 Dec 9 09:41 ddd
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdc 8000000 8000000 0 100% /mnt/volume1
/mnt/volume1 # du -s .
9488384 .
root@stonetest:~# xfs_quota -x
xfs_quota> report -a
Project quota on /openebs (/dev/vdc)
Blocks
Project ID Used Soft Hard Warn/Grace
---------- --------------------------------------------------
#0 0 0 0 00 [0 days]
#1 9488384 8000000 10000000 00 [6 days]
# project id 1 has used 9488384 KB, exceeding the soft limit but not yet the hard limit
```
### Disabling/enabling/manually editing project quota
Run `xfs_quota -x` and use the appropriate subcommand from the `help` output below.
```bash
root@stonetest:~# xfs_quota -x
xfs_quota> help
df [-bir] [-hN] [-f file] -- show free and used counts for blocks and inodes
disable [-gpu] [-v] -- disable quota enforcement
dump [-g|-p|-u] [-f file] -- dump quota information for backup utilities
enable [-gpu] [-v] -- enable quota enforcement
help [command] -- help for one or all commands
limit [-g|-p|-u] bsoft|bhard|isoft|ihard|rtbsoft|rtbhard=N -d|id|name -- modify quota limits
off [-gpu] [-v] -- permanently switch quota off for a path
path [N] -- set current path, or show the list of paths
print -- list known mount points and projects
project [-c|-s|-C|-d <depth>|-p <path>] project ... -- check, setup or clear project quota trees
quit -- exit the program
quot [-bir] [-g|-p|-u] [-acv] [-f file] -- summarize filesystem ownership
quota [-bir] [-g|-p|-u] [-hnNv] [-f file] [id|name]... -- show usage and limits
remove [-gpu] [-v] -- remove quota extents from a filesystem
report [-bir] [-gpu] [-ahnt] [-f file] -- report filesystem quota information
restore [-g|-p|-u] [-f file] -- restore quota limits from a backup file
state [-gpu] [-a] [-v] [-f file] -- get overall quota state information
timer [-bir] [-g|-p|-u] value [-d|id|name] -- set quota enforcement timeouts
warn [-bir] [-g|-p|-u] value -d|id|name -- get/set enforcement warning counter
Use 'help commandname' for extended help.
```
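The same subcommands can also be run non-interactively via `-c`. A sketch, assuming project id 1 and the PVC directory from the test above:
```bash
# report all project quotas in human-readable units
xfs_quota -x -c 'report -h' /openebs

# (re)initialize a directory tree as project 1, then set its block limits
xfs_quota -x -c 'project -s -p /openebs/local/pvc-29537c78-cbc6-4d60-a2d4-84db36483a94 1' /openebs
xfs_quota -x -c 'limit -p bsoft=8000000k bhard=10000000k 1' /openebs
```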
## Known issues
- With xfs project quota, after a PVC is deleted, `xfs_quota -x -c 'report -h' /openebs` still shows the project quota of that PVC; openebs does not remove the quota entry. It can be deleted manually with `xfs_quota -x -c 'limit -p bsoft=0 bhard=0 {project-id}' /openebs`. ext4 project quota does not have this problem.
- With both xfs and ext4, after the PV is mounted into the container, the mount point inside the container reports the soft limit (8 GB in this example) as its total size, while the amount that can actually be written is the hard limit (10 GB in this example). This is a system limitation and cannot be changed.
- With xfs project quota, there is no way to look up which path a project id corresponds to, because openebs does not write the project id and path into /etc/projects and /etc/projid. Once there are many PVCs this becomes hard to maintain, since it is unclear which project id belongs to which PVC directory. ext4 does not use /etc/projects and /etc/projid, so it does not have this problem. The mapping can be maintained by hand; see the sketch after this list.
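For the last issue, the project-id-to-path mapping can be maintained manually in /etc/projects and /etc/projid so that xfs_quota reports show a readable name. A sketch, assuming project id 1 and the PVC directory from the xfs test above:
```bash
# map project id 1 to the pvc directory and give it a human-readable project name
echo "1:/openebs/local/pvc-29537c78-cbc6-4d60-a2d4-84db36483a94" >> /etc/projects
echo "pvc-29537c78-cbc6-4d60-a2d4-84db36483a94:1" >> /etc/projid

# reports now resolve the project name instead of showing only the bare id
xfs_quota -x -c 'report -h' /openebs
```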
## Summary
- Both ext4 and xfs can use the project quota feature to limit the space available to a PVC.
- Given the issues above, ext4 project quota is the recommended choice.
- The official openebs localpv provisioner image has bugs (incorrect limit calculation, quota entries not cleaned up after PVC deletion, project id/path not written to /etc/projects and /etc/projid) and only supports block limits, not inode limits; it needs fixes and improvements.