# Kubernetes Storage Tutorial: PV, PVC, and StorageClass
This tutorial provides a comprehensive understanding of Kubernetes storage concepts through practical examples and hands-on exercises.
**Repository:**
https://github.com/Yang92047111/K8s-Storage
## Table of Contents
1. [Storage Fundamentals](#Storage-Fundamentals)
2. [PersistentVolume (PV) Deep Dive](#PersistentVolume-PV-Deep-Dive)
3. [PersistentVolumeClaim (PVC) Deep Dive](#PersistentVolumeClaim-PVC-Deep-Dive)
4. [StorageClass Deep Dive](#StorageClass-Deep-Dive)
5. [Storage Lifecycle](#Storage-Lifecycle)
6. [Hands-On Exercises](#Hands-On-Exercises)
7. [Best Practices](#Best-Practices)
8. [Troubleshooting Guide](#Troubleshooting-Guide)
---
## Storage Fundamentals
### Why Kubernetes Storage?
In containerized environments, storage presents unique challenges:
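- **Ephemeral nature**: A container's writable filesystem is discarded when the container is removed or restarted
- **Pod lifecycle**: When a pod is deleted, data stored in its containers or in `emptyDir` volumes is lost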
- **Scaling requirements**: Applications need persistent data across replicas
- **Cloud portability**: Storage should work across different environments
Kubernetes storage abstractions solve these problems by:
- Decoupling storage from pod lifecycle
- Providing consistent APIs across storage types
- Enabling dynamic provisioning and management
### Storage Architecture Overview
```
┌────────────────────────────────────────────────────────────────┐
│                       Kubernetes Cluster                       │
│                                                                │
│  ┌─────────────┐      ┌─────────────┐      ┌─────────────┐     │
│  │    Pod A    │      │    Pod B    │      │    Pod C    │     │
│  │             │      │             │      │             │     │
│  │  ┌───────┐  │      │  ┌───────┐  │      │  ┌───────┐  │     │
│  │  │ Volume│  │      │  │ Volume│  │      │  │ Volume│  │     │
│  │  │ Mount │  │      │  │ Mount │  │      │  │ Mount │  │     │
│  │  └───┬───┘  │      │  └───┬───┘  │      │  └───┬───┘  │     │
│  └──────┼──────┘      └──────┼──────┘      └──────┼──────┘     │
│         │                    │                    │            │
│  ┌──────▼──────┐      ┌──────▼──────┐      ┌──────▼──────┐     │
│  │     PVC     │      │     PVC     │      │     PVC     │     │
│  │   (Claim)   │      │   (Claim)   │      │   (Claim)   │     │
│  └──────┬──────┘      └──────┬──────┘      └──────┬──────┘     │
│         │                    │                    │            │
│  ┌──────▼──────┐      ┌──────▼──────┐      ┌──────▼──────┐     │
│  │     PV      │      │     PV      │      │     PV      │     │
│  │  (Physical  │      │  (Physical  │      │  (Physical  │     │
│  │  Storage)   │      │  Storage)   │      │  Storage)   │     │
│  └─────────────┘      └─────────────┘      └─────────────┘     │
└────────────────────────────────────────────────────────────────┘
```
### Key Concepts
1. **Abstraction Layers**: Kubernetes provides multiple layers of abstraction
2. **Declarative Management**: Resources are defined through YAML manifests
3. **Binding Process**: Automatic matching of claims to volumes
4. **Lifecycle Management**: Independent lifecycles for different components
---
## PersistentVolume (PV) Deep Dive
### What is a PersistentVolume?
A PersistentVolume (PV) represents a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
### Key Characteristics
#### 1. Cluster-Scoped Resource
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv  # Available cluster-wide, not namespace-specific
```
#### 2. Independent Lifecycle
- Exists beyond any individual pod
- Survives pod deletion and recreation
- Can be reused by different pods
#### 3. Storage Abstraction
- Hides implementation details of underlying storage
- Provides consistent interface across storage types
### PV Specifications
#### Storage Capacity
```yaml
spec:
  capacity:
    storage: 10Gi  # Total available storage
```
#### Access Modes
```yaml
spec:
  accessModes:
    - ReadWriteOnce  # RWO: Single node, read-write
    - ReadOnlyMany   # ROX: Multiple nodes, read-only
    - ReadWriteMany  # RWX: Multiple nodes, read-write
```
**Access Mode Details:**
| Mode | Abbreviation | Description | Use Cases |
|------|--------------|-------------|-----------|
| ReadWriteOnce | RWO | Volume can be mounted as read-write by a single node | Databases, file systems |
| ReadOnlyMany | ROX | Volume can be mounted read-only by many nodes | Static content, configurations |
| ReadWriteMany | RWX | Volume can be mounted as read-write by many nodes | Shared file systems, distributed storage |
#### Reclaim Policy
```yaml
spec:
  persistentVolumeReclaimPolicy: Retain  # Retain | Delete | Recycle
```
**Reclaim Policies:**
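- **Retain**: Manual reclamation; the PV and its data are kept after the PVC is deleted (default for manually created PVs)
- **Delete**: The PV and its backing storage are deleted when the PVC is deleted (default for dynamically provisioned PVs)
- **Recycle**: Deprecated; performs a basic scrub (`rm -rf /thevolume/*`)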
#### Storage Classes
```yaml
spec:
  storageClassName: fast-ssd  # Links to StorageClass
```
### Volume Types
#### Local Storage (hostPath)
```yaml
spec:
  hostPath:
    path: /mnt/data
    type: DirectoryOrCreate
```
**hostPath Types:**
- `DirectoryOrCreate`: Create directory if it doesn't exist
- `Directory`: Directory must exist
- `File`: File must exist
- `Socket`: Unix socket must exist
#### Network Storage (NFS)
```yaml
spec:
  nfs:
    server: nfs-server.example.com
    path: /exported/path
```
#### Cloud Storage (AWS EBS)
```yaml
spec:
  # In-tree EBS volume type: deprecated in favor of the AWS EBS CSI driver (ebs.csi.aws.com)
  awsElasticBlockStore:
    volumeID: vol-1234567890abcdef0
    fsType: ext4
```
### Complete PV Example
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
  labels:
    type: local
    environment: development
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /tmp/data
    type: DirectoryOrCreate
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1
```
---
## PersistentVolumeClaim (PVC) Deep Dive
### What is a PersistentVolumeClaim?
A PersistentVolumeClaim (PVC) is a request for storage by a user. It's similar to a pod consuming node resources - PVCs consume PV resources.
### Key Characteristics
#### 1. Namespace-Scoped
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
  namespace: default  # Lives in a specific namespace
```
#### 2. Storage Request
- Specifies desired storage characteristics
- Kubernetes finds matching PV or creates one dynamically
#### 3. Pod Consumption
- Pods reference PVCs, not PVs directly
- Provides abstraction layer for applications
### PVC Specifications
#### Resource Requests
```yaml
spec:
  resources:
    requests:
      storage: 3Gi  # Minimum storage required
```
#### Access Modes
```yaml
spec:
  accessModes:
    - ReadWriteOnce  # Must match or be subset of PV access modes
```
#### Storage Class Selection
```yaml
spec:
  storageClassName: fast-ssd   # Specific StorageClass
  # storageClassName: ""       # Empty string = no StorageClass (static PVs only)
  # storageClassName omitted   # The cluster's default StorageClass is used, if one is set
```
#### Selectors
```yaml
spec:
  selector:
    matchLabels:
      environment: production
    matchExpressions:
      - key: tier
        operator: In
        values: ["cache"]
```
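Selectors only apply to statically provisioned volumes, so a claim that uses one is usually paired with an empty `storageClassName`. The manifest below is a minimal sketch of such a claim; the name `static-claim` is hypothetical, and it assumes a pre-created PV that carries the label `environment: production` and has no StorageClass set.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-claim          # hypothetical name
spec:
  storageClassName: ""        # empty string: skip dynamic provisioning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      environment: production # assumed label on the pre-created PV
```

The claim stays `Pending` until a PV with matching labels, class, access mode, and sufficient capacity is available.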
### PVC Binding Process
#### 1. Manual Binding (Static Provisioning)
```mermaid
graph LR
A[Create PV] --> B[Create PVC]
B --> C[Kubernetes Matches]
C --> D[PVC Bound to PV]
```
#### 2. Dynamic Binding (Dynamic Provisioning)
```mermaid
graph LR
A[Create StorageClass] --> B[Create PVC]
B --> C[Provisioner Creates PV]
C --> D[PVC Bound to New PV]
```
### PVC States
| Phase | Description |
|-------|-------------|
| Pending | PVC is created but not yet bound to a PV |
| Bound | PVC is bound to a PV |
| Lost | PV associated with PVC is lost |
### Complete PVC Example
```yaml
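apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-claim
  namespace: production
  labels:
    app: database
    tier: storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd
  # A non-empty selector restricts binding to pre-created PVs with these labels.
  # Kubernetes does not dynamically provision a PV for a claim that has a selector.
  selector:
    matchLabels:
      environment: production
      type: database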
```
---
## StorageClass Deep Dive
### What is a StorageClass?
A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, backup policies, or arbitrary policies determined by cluster administrators.
### Key Characteristics
#### 1. Dynamic Provisioning
- Automatically creates PVs when PVCs are created
- Eliminates need for pre-provisioned storage
#### 2. Storage Parameters
- Defines storage-specific configuration
- Enables fine-tuned storage behavior
#### 3. Provisioner-Specific
- Each cloud provider has specific provisioners
- Supports various storage backends
### StorageClass Components
#### Provisioner
```yaml
provisioner: kubernetes.io/aws-ebs # Storage system to use
```
**Common Provisioners:**
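- `kubernetes.io/aws-ebs` - Amazon EBS (in-tree name; the CSI driver is `ebs.csi.aws.com`)
- `kubernetes.io/gce-pd` - Google Cloud Persistent Disk (in-tree name; the CSI driver is `pd.csi.storage.gke.io`)
- `kubernetes.io/azure-disk` - Azure Disk (in-tree name; the CSI driver is `disk.csi.azure.com`)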
- `kubernetes.io/no-provisioner` - Static provisioning only
#### Parameters
```yaml
parameters:
  type: gp2          # Storage type
  fsType: ext4       # File system type
  encrypted: "true"  # Encryption enabled
```
#### Volume Binding Mode
```yaml
volumeBindingMode: WaitForFirstConsumer # Immediate | WaitForFirstConsumer
```
**Binding Modes:**
- **Immediate**: Bind and provision immediately when PVC is created
- **WaitForFirstConsumer**: Delay binding until pod using PVC is scheduled
#### Reclaim Policy
```yaml
reclaimPolicy: Delete # Delete | Retain
```
#### Allow Volume Expansion
```yaml
allowVolumeExpansion: true # Enable volume resize
```
### Cloud Provider Examples
#### AWS EBS StorageClass
```yaml
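apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ebs
provisioner: kubernetes.io/aws-ebs  # in-tree name; clusters running the EBS CSI driver use ebs.csi.aws.com
parameters:
  type: gp3
  fsType: ext4
  encrypted: "true"
  iops: "3000"        # gp3 tuning parameter understood by the EBS CSI driver
  throughput: "125"   # gp3 tuning parameter understood by the EBS CSI driver
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete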
```
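On clusters that use the AWS EBS CSI driver rather than the in-tree plugin, the same class might look like the sketch below. The class name `fast-ebs-csi` is hypothetical, and the manifest assumes the driver is already installed in the cluster.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ebs-csi                 # hypothetical name
provisioner: ebs.csi.aws.com         # AWS EBS CSI driver (assumed to be installed)
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: ext4    # CSI-style file system parameter
  encrypted: "true"
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
```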
#### Google Cloud PD StorageClass
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd
  zones: us-central1-a,us-central1-b  # deprecated parameter; allowedTopologies is the replacement
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
#### Local Storage StorageClass
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```
### Default StorageClass
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
```
---
## Storage Lifecycle
### Complete Workflow
#### 1. Admin Phase (Static Provisioning)
```bash
# Create StorageClass
kubectl apply -f storageclass.yaml
# Create PersistentVolume
kubectl apply -f pv.yaml
```
#### 2. User Phase
```bash
# Create PersistentVolumeClaim
kubectl apply -f pvc.yaml
# Create Pod that uses PVC
kubectl apply -f pod.yaml
```
#### 3. Binding Process
```mermaid
sequenceDiagram
participant User
participant K8s API
participant Controller
participant Storage
User->>K8s API: Create PVC
K8s API->>Controller: PVC Event
Controller->>Storage: Find Matching PV
Storage-->>Controller: PV Found/Created
Controller->>K8s API: Bind PVC to PV
User->>K8s API: Create Pod
K8s API->>Controller: Pod Event
Controller->>Storage: Mount Volume
Storage-->>Controller: Volume Mounted
```
### Dynamic Provisioning Workflow
```mermaid
graph TD
A[Create StorageClass] --> B[Create PVC]
B --> C{StorageClass<br>with Provisioner?}
C -->|Yes| D[External Provisioner<br>Creates PV]
C -->|No| E[Wait for Manual PV]
D --> F[Bind PVC to PV]
E --> F
F --> G[Pod Can Use PVC]
```
### Volume Lifecycle States
#### PV States
- **Available**: Free resource, not bound to claim
- **Bound**: Volume is bound to a claim
- **Released**: Claim has been deleted, but not reclaimed
- **Failed**: Volume has failed its automatic reclamation
#### PVC States
- **Pending**: Waiting for binding
- **Bound**: Successfully bound to PV
- **Lost**: Associated PV is lost
---
## Hands-On Exercises
### Exercise 1: Basic Static Provisioning
#### Step 1: Create StorageClass
```yaml
# storage-class-basic.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: manual
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
```
#### Step 2: Create PersistentVolume
```yaml
# pv-basic.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/data"
```
#### Step 3: Create PersistentVolumeClaim
```yaml
# pvc-basic.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```
#### Step 4: Create Pod Using PVC
```yaml
# pod-basic.yaml
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
```
#### Commands to Run
```bash
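# Apply resources
kubectl apply -f storage-class-basic.yaml
kubectl apply -f pv-basic.yaml
kubectl apply -f pvc-basic.yaml
kubectl apply -f pod-basic.yaml
# Check status (with WaitForFirstConsumer, the PVC stays Pending until the pod is scheduled)
kubectl get storageclass
kubectl get pv
kubectl get pvc
kubectl get pods
# Test persistence: write a file into the mounted volume
kubectl wait --for=condition=Ready pod/task-pv-pod --timeout=120s
kubectl exec task-pv-pod -- sh -c 'echo "Hello from PV" > /usr/share/nginx/html/index.html'
# Delete and recreate pod
kubectl delete pod task-pv-pod
kubectl apply -f pod-basic.yaml
kubectl wait --for=condition=Ready pod/task-pv-pod --timeout=120s
# Verify data persistence (on a multi-node cluster, hostPath data is only visible on the node that wrote it)
kubectl exec task-pv-pod -- cat /usr/share/nginx/html/index.html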
```
### Exercise 2: Multiple Access Modes
#### ReadWriteMany Example
```yaml
# pv-rwx.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  storageClassName: shared
  nfs:
    server: nfs-server.example.com
    path: /shared/data
```
#### Multiple Pods Sharing Volume
```yaml
# deployment-shared.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-storage-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shared-app
  template:
    metadata:
      labels:
        app: shared-app
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: shared-storage
              mountPath: /shared
      volumes:
        - name: shared-storage
          persistentVolumeClaim:
            claimName: shared-pvc
```
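The Deployment references a claim named `shared-pvc`, which is not defined in this exercise. A minimal sketch of a matching claim follows, assuming it should bind to `shared-pv` through the `shared` class name and request the full 2Gi; the file name `pvc-shared.yaml` is hypothetical.

```yaml
# pvc-shared.yaml (hypothetical file name)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc            # name expected by deployment-shared.yaml
spec:
  storageClassName: shared    # must match the PV's storageClassName
  accessModes:
    - ReadWriteMany           # lets pods on different nodes mount read-write
  resources:
    requests:
      storage: 2Gi
```

Apply it after `pv-rwx.yaml` and before the Deployment so that the replicas can start with the volume mounted.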
### Exercise 3: Volume Expansion
#### Create Expandable StorageClass
```yaml
# storage-class-expandable.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
allowVolumeExpansion: true
```
#### Test Expansion
```bash
# Create PVC with 1Gi
kubectl apply -f pvc-1gi.yaml
# Edit PVC to request 5Gi
kubectl patch pvc my-pvc -p '{"spec":{"resources":{"requests":{"storage":"5Gi"}}}}'
# Check expansion status
kubectl get pvc my-pvc -w
```
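The commands above assume a file `pvc-1gi.yaml` that defines a claim named `my-pvc`. One possible version is sketched here, assuming it uses the `expandable` StorageClass from the previous step.

```yaml
# pvc-1gi.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc                  # name used by the patch command
spec:
  storageClassName: expandable  # class with allowVolumeExpansion: true
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi              # initial size, later patched to 5Gi
```

Claims can only grow: requesting a size smaller than the current one is rejected by the API server.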
---
## Best Practices
### 1. StorageClass Design
#### Use Descriptive Names
```yaml
metadata:
  name: fast-ssd-retain  # Clear purpose
  # NOT: name: storage1  # Ambiguous
```
#### Define Multiple Classes
```yaml
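# Development
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dev-standard
provisioner: ebs.csi.aws.com  # provisioner is a required field; the AWS EBS CSI driver is assumed here
parameters:
  type: gp2
---
# Production
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prod-high-iops
provisioner: ebs.csi.aws.com  # assumed, as above
parameters:
  type: io1
  iops: "1000"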
```
### 2. PVC Management
#### Right-Size Storage Requests
```yaml
spec:
  resources:
    requests:
      storage: 10Gi  # Request what you need, not more
```
#### Use Labels for Organization
```yaml
metadata:
  labels:
    app: database
    tier: storage
    environment: production
```
### 3. Security Considerations
#### Encrypt Sensitive Data
```yaml
parameters:
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-west-2:123456789012:key/12345678-1234-1234-1234-123456789012"
```
#### Use RBAC
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pvc-manager
rules:
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "create", "delete"]
```
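A Role has no effect until it is bound to a subject. The sketch below grants `pvc-manager` to a service account; the binding name `pvc-manager-binding` and the service account `app-deployer` are hypothetical, and both objects are assumed to live in the `default` namespace alongside the Role.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pvc-manager-binding   # hypothetical name
subjects:
  - kind: ServiceAccount
    name: app-deployer        # hypothetical service account
    namespace: default        # assumed namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pvc-manager           # the Role defined above
```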
### 4. Monitoring and Maintenance
#### Set Resource Quotas
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
spec:
  hard:
    requests.storage: "100Gi"
    persistentvolumeclaims: "10"
```
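Quotas can also be scoped to a single StorageClass, which helps keep expensive tiers from being exhausted. The following sketch limits claims against the `fast-ssd` class used earlier; the quota name `storage-quota-by-class` and the limits are illustrative assumptions.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota-by-class   # hypothetical name
spec:
  hard:
    # Total storage requested by PVCs that use the fast-ssd class
    fast-ssd.storageclass.storage.k8s.io/requests.storage: "50Gi"
    # Number of PVCs that may use the fast-ssd class
    fast-ssd.storageclass.storage.k8s.io/persistentvolumeclaims: "5"
```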
#### Monitor Usage
```bash
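# Check PV/PVC status
kubectl get pv,pvc --all-namespaces
# Node-level CPU and memory usage (kubectl top does not report per-volume usage)
kubectl top nodes
# Node capacity and allocated resources, including ephemeral storage
kubectl describe node <node-name>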
```
---
## Troubleshooting Guide
### Common Issues and Solutions
#### 1. PVC Stuck in Pending State
**Symptoms:**
```bash
$ kubectl get pvc
NAME     STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-pvc   Pending                                      standard       5m
```
**Causes and Solutions:**
**No Matching PV Available:**
```bash
# Check available PVs
kubectl get pv
# Check PVC details
kubectl describe pvc my-pvc
```
**Solution:** Create matching PV or fix StorageClass
**StorageClass Not Found:**
```bash
# Check if StorageClass exists
kubectl get storageclass
# Check PVC events
kubectl describe pvc my-pvc
```
**Insufficient Storage:**
```bash
# Check PV capacity
kubectl get pv -o custom-columns=NAME:.metadata.name,SIZE:.spec.capacity.storage,STATUS:.status.phase
```
#### 2. Pod Can't Mount Volume
**Symptoms:**
```bash
$ kubectl get pods
NAME     READY   STATUS    RESTARTS   AGE
my-pod   0/1     Pending   0          2m
```
**Check Events:**
```bash
kubectl describe pod my-pod
```
**Common Errors:**
**Volume Already Mounted (RWO):**
```
Warning FailedMount: Multi-Attach error for volume "pvc-xxx" Volume is already exclusively attached to one node and can't be attached to another
```
**Solution:** Use a volume type that supports ReadWriteMany, or schedule the pods that share the volume onto the same node
**Node Affinity Mismatch:**
```
Warning FailedScheduling: 0/3 nodes are available: 3 node(s) had volume node affinity conflict
```
**Solution:** Fix node affinity or use WaitForFirstConsumer
#### 3. Volume Mount Permission Issues
**Symptoms:**
```bash
$ kubectl logs my-pod
mkdir: cannot create directory '/data': Permission denied
```
**Solutions:**
**Security Context:**
```yaml
spec:
  securityContext:
    fsGroup: 2000
  containers:
    - name: app
      securityContext:
        runAsUser: 1000
        runAsGroup: 2000
```
**Init Container:**
```yaml
spec:
  initContainers:
    - name: volume-mount-hack
      image: busybox
      # World-writable permissions are a last resort; prefer fsGroup where the volume supports it
      command: ["sh", "-c", "chmod 777 /data"]
      volumeMounts:
        - name: data
          mountPath: /data
```
#### 4. Dynamic Provisioning Not Working
**Check StorageClass:**
```bash
kubectl describe storageclass my-storage-class
```
**Check Provisioner:**
```bash
# For AWS EBS
kubectl get csidriver
# Check provisioner logs
kubectl logs -n kube-system -l app=ebs-csi-controller -c csi-provisioner
```
#### 5. Volume Expansion Failures
**Check if StorageClass Supports Expansion:**
```bash
kubectl get storageclass -o custom-columns=NAME:.metadata.name,EXPANSION:.allowVolumeExpansion
```
**Check Expansion Status:**
```bash
kubectl describe pvc my-pvc
```
### Diagnostic Commands
#### Storage Overview
```bash
# Get all storage-related resources
kubectl get storageclass,pv,pvc --all-namespaces
# Check storage usage by nodes
kubectl describe nodes | grep -A 5 "Allocated resources"
```
#### Detailed Inspection
```bash
# PV details
kubectl get pv -o yaml my-pv
# PVC details with events
kubectl describe pvc my-pvc
# Storage class parameters
kubectl get storageclass my-storage-class -o yaml
```
#### Debug Pod Issues
```bash
# Check pod events
kubectl describe pod my-pod
# Check volume mounts
kubectl exec my-pod -- df -h
# Check permissions
kubectl exec my-pod -- ls -la /mount/path
```
### Debug Tools
#### Storage Inspector Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: storage-debug
spec:
  containers:
    - name: debug
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: debug-volume
          mountPath: /debug
  volumes:
    - name: debug-volume
      persistentVolumeClaim:
        claimName: target-pvc
```
#### Volume Test Script
```bash
#!/bin/sh
# test-storage.sh: run inside the storage-debug pod (busybox provides sh, not bash)
echo "Testing storage functionality..."
# Create test file and check write permissions
if echo "Test data $(date)" > /debug/test.txt; then
  echo "✅ Write successful"
else
  echo "❌ Write failed"
  exit 1
fi
# Read test
if [ -f /debug/test.txt ]; then
  echo "✅ Read successful: $(cat /debug/test.txt)"
else
  echo "❌ Read failed"
  exit 1
fi
# Check disk space
df -h /debug
```
---
## Conclusion
Understanding Kubernetes storage requires grasping the relationship between StorageClasses, PersistentVolumes, and PersistentVolumeClaims. Each serves a specific purpose in the storage abstraction layer:
- **StorageClass**: Defines storage characteristics and provisioning behavior
- **PersistentVolume**: Represents actual storage resources in the cluster
- **PersistentVolumeClaim**: Requests storage with specific requirements
By mastering these concepts and following best practices, you can design robust, scalable storage solutions for your Kubernetes applications.
### Next Steps
1. **Practice**: Work through all exercises in this tutorial
2. **Experiment**: Try different StorageClass configurations
3. **Scale**: Test with multiple pods and volumes
4. **Production**: Apply learnings to real-world scenarios
5. **Advanced Topics**: Explore CSI drivers, volume snapshots, and cross-zone replication
### Additional Resources
- [Kubernetes Storage Documentation](https://kubernetes.io/docs/concepts/storage/)
- [CSI Driver Documentation](https://kubernetes-csi.github.io/docs/)
- [Cloud Provider Storage Guides](https://kubernetes.io/docs/concepts/storage/storage-classes/)
- [Storage Best Practices](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#best-practices)