📚 Complete Ceph CSI Deployment & Troubleshooting Guide

This guide details the preparation and configuration necessary for successful dynamic provisioning of Ceph RBD (RWO) and CephFS (RWX) volumes in a Kubernetes cluster running on MicroK8s, backed by a Proxmox VE (PVE) Ceph cluster.


1. ⚙️ Ceph Cluster Preparation (Proxmox VE)

These steps ensure the Ceph backend has the necessary pools and structure.

  • Create Dedicated Pools: Create OSD pools for data, e.g., k8s_rbd (for RWO), plus k8s_data and k8s_metadata (for CephFS).
  • Create CephFS Metadata Servers (MDS): Deploy at least two MDS instances for redundancy.
  • Create CephFS Filesystem: Create the Ceph filesystem (e.g., named k8s), linking the metadata and data pools.
  • Create Subvolume Group (Mandatory Fix): Create the dedicated subvolume group csi inside your CephFS. This is required by the CSI driver's default configuration and fixes the "No such file or directory" error during provisioning. (See the consolidated command sketch below.)
    • CLI Command: ceph fs subvolumegroup create k8s csi
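For reference, the whole section condensed into CLI form (a sketch: the pool and filesystem names match the examples above, but the PG counts are illustrative, and MDS creation can instead be done through the PVE web UI):

```bash
# Create the RBD pool (RWO) and the CephFS data/metadata pools.
# PG counts (64/32) are only examples; size them for your cluster.
ceph osd pool create k8s_rbd 64
ceph osd pool application enable k8s_rbd rbd

ceph osd pool create k8s_data 64
ceph osd pool create k8s_metadata 32

# Deploy an MDS on each chosen PVE node (or use the PVE web UI).
pveceph mds create

# Create the CephFS filesystem "k8s" from the metadata and data pools.
ceph fs new k8s k8s_metadata k8s_data

# Mandatory fix: create the "csi" subvolume group expected by the CSI driver.
ceph fs subvolumegroup create k8s csi
```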

2. 🔑 Ceph User and Authorization (The Permission Fix)

This addresses the persistent "Permission denied" errors during provisioning.

  • Create and Configure Ceph User: Create the user (client.kubernetes) and set permissions for all services. The wildcard MGR cap (mgr "allow *") is critical for volume creation.
    • Final Correct Caps Command:
      ```bash
      sudo ceph auth caps client.kubernetes \
          mon 'allow r' \
          mgr "allow *" \
          mds 'allow rw' \
          osd 'allow class-read object_prefix rbd_children, allow pool k8s_rbd rwx, allow pool k8s_metadata rwx, allow pool k8s_data rwx'
      ```
  • Export Key to Kubernetes Secrets: Create and place two Secrets with the user key in the correct CSI provisioner namespaces:
    • RBD Secret: csi-rbd-secret (in the RBD Provisioner namespace).
    • CephFS Secret: csi-cephfs-secret (in the CephFS Provisioner namespace).

Each Secret must contain two keys: userID and userKey. The userID value should omit the 'client.' prefix shown in the ceph auth output (i.e., use kubernetes, not client.kubernetes).
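A minimal sketch of exporting the key and creating both Secrets (the namespace names ceph-csi-rbd and ceph-csi-cephfs are placeholders; use the namespaces your provisioners actually run in, and substitute the real key):

```bash
# Retrieve the user's key on any Ceph node.
sudo ceph auth get-key client.kubernetes
# -> AQD...example-key...==  (placeholder)

# RBD Secret in the RBD provisioner namespace.
microk8s kubectl -n ceph-csi-rbd create secret generic csi-rbd-secret \
  --from-literal=userID=kubernetes \
  --from-literal=userKey='AQD...example-key...=='

# CephFS Secret in the CephFS provisioner namespace.
microk8s kubectl -n ceph-csi-cephfs create secret generic csi-cephfs-secret \
  --from-literal=userID=kubernetes \
  --from-literal=userKey='AQD...example-key...=='
```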


3. 🌐 Network Configuration and Bi-Directional Routing

These steps ensure stable, bidirectional communication for volume staging and mounting.

A. PVE Host Firewall Configuration

The PVE firewall must explicitly allow inbound traffic from the entire Kubernetes Pod Network to the Ceph service ports.

| Protocol | Port(s)   | Source                                   | Purpose                   |
|----------|-----------|------------------------------------------|---------------------------|
| TCP      | 6789      | K8s Pod Network CIDR (e.g., 10.1.0.0/16) | Monitor connection        |
| TCP      | 6800-7300 | K8s Pod Network CIDR                     | OSD/MDS/MGR data transfer |

Alternatively, the PVE firewall provides a built-in 'ceph' macro; if it is available in your PVE version, use that macro instead of adding the individual rules above.
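As a rough sketch, the equivalent entries in a PVE host firewall file (e.g., /etc/pve/nodes/<node>/host.fw; the file location, exact rule syntax, and macro availability depend on your PVE version, so treat this as illustrative):

```
[RULES]

# Explicit rules for the Ceph ports:
IN ACCEPT -source 10.1.0.0/16 -p tcp -dport 6789       # Monitor connection
IN ACCEPT -source 10.1.0.0/16 -p tcp -dport 6800:7300  # OSD/MDS/MGR data transfer

# Or, using the built-in macro instead of the two rules above:
# IN Ceph(ACCEPT) -source 10.1.0.0/16
```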

B. PVE Host Static Routing (Ceph > K8s)

Add persistent static routes on all PVE Ceph hosts to allow Ceph to send responses back to the Pod Network.

  • Action: Edit /etc/network/interfaces on each PVE host:
    ```ini
    # Syntax: post-up ip route add <POD_NETWORK_CIDR> via <K8S_NODE_IP> dev <PVE_INTERFACE>
    # Example:
    post-up ip route add 10.1.0.0/16 via 172.35.100.40 dev vmbr0
    ```
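Since the post-up line only takes effect on the next ifup or reboot, you can add and verify the route immediately (a sketch; the CIDR, gateway, and interface match the example above, and 10.1.5.20 is just an arbitrary Pod IP):

```bash
# Add the route right away on the PVE host.
sudo ip route add 10.1.0.0/16 via 172.35.100.40 dev vmbr0

# Confirm the Pod network is now routed via the K8s node.
ip route get 10.1.5.20
```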

C. K8s Node IP Forwarding (Gateway Function)

Enable IP forwarding on all Kubernetes nodes so they can route incoming Ceph traffic to the correct Pods.

  • Action: Run on all K8s nodes:
    ```bash
    sudo sysctl net.ipv4.ip_forward=1
    sudo sh -c 'echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.d/99-sysctl.conf'
    ```
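A quick way to confirm the setting is active now and that the persisted value applies cleanly (a sketch):

```bash
# Should print: net.ipv4.ip_forward = 1
sysctl net.ipv4.ip_forward

# Reload all sysctl configuration files to verify the persisted setting.
sudo sysctl --system
```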

D. K8s Static Routing (K8s > Ceph) - Conditional/Advanced ⚠️

This routing is only required if the Ceph Public Network (the network Ceph Monitors/OSDs listen on) is not reachable by your Kubernetes Node's default gateway.

  • Action: This is implemented via a Netplan configuration on the Kubernetes nodes, using multiple routes with different metrics to provide load balancing and automatic failover.
  • Example Netplan Configuration (/etc/netplan/99-ceph-routes.yaml):
    ```yaml
    network:
      version: 2
      renderer: networkd
      ethernets:
        eth0:  # Replace with your primary K8s network interface
          routes:
            # Route 1: Directs traffic destined for the first Ceph Monitor IP (10.11.12.1)
            # through three different PVE hosts (172.35.100.x) as gateways.
            # The lowest metric (10) is preferred.
            - to: 10.11.12.1/32
              via: 172.35.100.10
              metric: 10
            - to: 10.11.12.1/32
              via: 172.35.100.20
              metric: 100
            - to: 10.11.12.1/32
              via: 172.35.100.30
              metric: 100
            # Route 2: Directs traffic destined for the second Ceph Monitor IP (10.11.12.2)
            # with a similar failover strategy.
            - to: 10.11.12.2/32
              via: 172.35.100.20
              metric: 10
            - to: 10.11.12.2/32
              via: 172.35.100.10
              metric: 100
            - to: 10.11.12.2/32
              via: 172.35.100.30
              metric: 100
    ```

Use route metrics (a lower value means higher priority) to prefer the most direct path while still providing alternative gateways into the Ceph network where needed.
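Once the file is in place, applying and checking the routes might look like this (a sketch; the Monitor IP matches the example above):

```bash
# Validate and apply the Netplan configuration (netplan try rolls back if connectivity is lost).
sudo netplan try
sudo netplan apply

# Confirm which gateway is currently selected for a Ceph Monitor IP.
ip route get 10.11.12.1
```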


4. 🧩 MicroK8s CSI Driver Configuration (The Path Fix) - Conditional/Advanced ⚠️

This adjustment is only required if your Kubernetes deployment runs on MicroK8s. Other Kubernetes distributions may need different changes.

This resolves the "staging path does not exist on node" error reported by the Node Plugin.

  • Update kubeletDir: When deploying the CSI driver (via Helm or YAML), the kubeletDir parameter must be set to the MicroK8s-specific path:
    ```yaml
    # Correct path for the MicroK8s Kubelet root directory
    kubeletDir: /var/snap/microk8s/common/var/lib/kubelet
    ```
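For example, when installing via the upstream ceph-csi Helm charts, the value can be passed on the command line (a sketch; the chart repository, chart names, and namespaces below are the upstream defaults and may differ in your setup):

```bash
# Add the upstream ceph-csi chart repository.
helm repo add ceph-csi https://ceph.github.io/csi-charts
helm repo update

# Install the RBD driver with the MicroK8s kubelet path.
helm install ceph-csi-rbd ceph-csi/ceph-csi-rbd \
  --namespace ceph-csi-rbd --create-namespace \
  --set kubeletDir=/var/snap/microk8s/common/var/lib/kubelet

# Install the CephFS driver with the same kubelet path.
helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs \
  --namespace ceph-csi-cephfs --create-namespace \
  --set kubeletDir=/var/snap/microk8s/common/var/lib/kubelet
```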