Storage

Earlier, we discussed the options for setting up storage. Now, let’s see how to configure it!

By default, the Helm chart creates PersistentVolumeClaims for the data and software using the standard StorageClass. You can configure the storage class and the size of these PVCs in the values.yaml file. Relying on the chart-created PVCs is not recommended, because they simply use the default StorageClass, which might not be what you want. Look for a section like this:

storage:
  class: "standard"
  size:
    software: 10Gi
    data: 10Gi

Alternatively, and highly recommended, create the PVCs yourself and use them. For that, configure these two things:

  1. Don’t create them via the Helm chart, i.e. set storage.create: false.

  2. Let CoCalc know about their names. E.g. if they’re called pvc-data and pvc-software, the relevant part in the config file would look like this:

storage:
  create: false

global:
  [...]
  storage:
    data:
      claimName: pvc-data
    software:
      claimName: pvc-software

The default names are projects-data and projects-software.
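
If you create the claims yourself, the manifests could look roughly like the following sketch. The claim names pvc-data and pvc-software match the example above. The cocalc namespace, the nfs storage class, the sizes and the access modes are assumptions that mirror the GKE cluster shown in Example 2, so adapt them to your cluster.

```yaml
# Sketch only: namespace, storageClassName, sizes and access modes are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-data
  namespace: cocalc          # assumed release namespace
spec:
  accessModes:
    - ReadWriteMany          # project data is shared by many project pods
  storageClassName: nfs      # assumed; use a class that supports this access mode
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-software
  namespace: cocalc          # assumed release namespace
spec:
  accessModes:
    - ReadOnlyMany           # mirrors Example 2; adjust if something must write to it
  storageClassName: nfs      # assumed
  resources:
    requests:
      storage: 10Gi
```

Once both claims are bound, reference them via the claimName entries shown above.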

Note

The most important detail is that these file systems are accessed by the projects, where the users run with UID 2001 and GID 2001. This is for security reasons!

This also means that your storage must give that user proper access. Even though the fsGroup value is set in the securityContext, the StorageClass backed by your NFS server might not respect it.

A symptom of wrong permissions is that projects fail with an Error. Check their logs, which might end with:

cp: cannot create regular file '/home/user/.bashrc': Permission denied
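
One way to test the storage independently of CoCalc is a throwaway pod that runs with the same UID/GID and tries to write to the volume. The following is a rough sketch, not something shipped with the chart: the pod name, the cocalc namespace and the busybox image are assumptions, and the claim name is the default projects-data. It only tests the root of the volume, so treat the result as an indicator rather than proof.

```yaml
# Hypothetical check pod; delete it again after inspecting its logs.
apiVersion: v1
kind: Pod
metadata:
  name: storage-permission-check   # hypothetical name
  namespace: cocalc                # assumed release namespace
spec:
  restartPolicy: Never
  securityContext:
    runAsUser: 2001                # same UID as the project user
    runAsGroup: 2001               # same GID as the project user
    fsGroup: 2001                  # mirrors the fsGroup mentioned above
  containers:
    - name: check
      image: busybox:1.36          # assumed; any small image with a shell works
      command:
        - sh
        - -c
        - id && touch /data/.write-test && rm /data/.write-test && echo OK
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: projects-data   # default name; use your own claim if it differs
```

If the pod's log ends with OK, a process with UID and GID 2001 can write to the volume. A "Permission denied" message points to the ownership problem described here.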

To fix the permissions, you can enable an init container for each project. It runs as root before the project itself is started and runs chown on the directory where the project’s data (i.e. its “home directory”) will be stored.

The setting is: manage.project.fixPermissionsInit: true (default false)
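
Written out as nested YAML in your values file, the dotted setting above would presumably look like this (a sketch derived only from the key path, not copied from the chart):

```yaml
manage:
  project:
    # Run a root init container that fixes ownership of the project's
    # data directory before the project starts (default: false).
    fixPermissionsInit: true
```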

For an NFS server, the export must also set no_root_squash, because otherwise the chown command issued by the root user in that init container will not work.

Per-project PVC mode (advanced)

By default, every project’s home directory is a sub-directory of the shared projects-data PersistentVolumeClaim. As an alternative, the manage service can create one PVC per project and mount it as the project’s home. See Per-project PVC mode for the operational context (when this is useful, what it changes, RBAC implications).

To enable it, uncomment the manage.perProjectPVC block in my-values.yaml:

manage:
  perProjectPVC:
    # REQUIRED. PVC name template. MUST contain the literal "PROJECT_ID",
    # which is replaced with the project's UUID at pod-creation time.
    # The literal "NAMESPACE" is also substituted with the release
    # namespace. The `name` field is the on-switch: if it is omitted the
    # chart emits no env var and manage stays in legacy shared-PVC mode.
    name: "pvc-cocalc-project-PROJECT_ID"

    # REQUIRED. Storage class used for the per-project PVC.
    sc: "longhorn"

    # Optional. Requested size, e.g. "5Gi", "20Gi". Defaults to "10Gi".
    #size: "10Gi"

    # Optional. Defaults to ["ReadWriteOnce"].
    #accessModes: ["ReadWriteOnce"]

    # Optional. Default (key omitted): uid 2001, gid 2001 — matches the
    # hardcoded project user inside the project image. Manage sets pod
    # `fsGroup=2001` (with `fsGroupChangePolicy: OnRootMismatch`) and
    # rewrites the project init container to `chown 2001:2001 /mnt` as a
    # belt-and-suspenders fallback for CSIs that don't honor fsGroup.
    # Set to `null` to opt out (no fsGroup, init container removed).
    # Overriding uid/gid is rarely correct.
    #initOwner:
    #  uid: 2001
    #  gid: 2001

    # Optional. Mount the original shared projects-data PVC read-only at
    # this path inside the project pod. Useful for migration scenarios
    # where users still need to read pre-migration data.
    #mountGlobalProjectData: "/home/old-data"

The block is serialized to JSON and exposed to manage via the COCALC_MANAGE_PROJECT_PVC environment variable. Leaving the block commented out (the default) preserves the legacy shared-PVC behavior — no env var is emitted and the manage Role grants no PVC permissions.

When the block is set, the chart additionally grants the manage ServiceAccount persistentvolumeclaims access in its namespace Role, which is required for manage to create and list the per-project PVCs. PVCs are intentionally never auto-deleted by manage; project home data must survive pod recreation, and reclamation is a manual admin operation (kubectl delete pvc ...).
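
To illustrate what such a grant means in Kubernetes terms, a namespaced Role with PVC access could look like the sketch below. This is not the chart's actual template: the Role name and namespace are hypothetical, and the verbs are limited to the two operations named above, so the real Role may contain more.

```yaml
# Illustration only; the chart renders its own Role for manage.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: manage-pvc-example     # hypothetical name
  namespace: cocalc            # assumed release namespace
rules:
  - apiGroups: [""]            # core API group, where PVCs live
    resources: ["persistentvolumeclaims"]
    verbs: ["create", "list"]  # the operations mentioned in the text
```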

Note

Validation is performed by manage on startup: bad JSON, a missing name or sc, or a name template that does not contain the literal PROJECT_ID will cause the manage pod to fail loudly at startup. size defaults to 10Gi if omitted.

Warning

The ssh-gateway must be enabled when this mode is on. With per-project PVCs neither manage-copy nor manage-share can mount every project’s home from a shared PVC; both services drive rsync over SSH against port 2222 of the project pod, reusing the ssh-gateway’s id_ed25519 keypair from the ssh-gateway-keys Secret. The chart enforces this with a helm template-time check: if manage.perProjectPVC.name is set while global.ssh_gateway.enabled is false, rendering fails with an explanatory error. Project pods need no chart changes — they already run sshd on 2222 and trust the gateway’s public key.
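
Putting both requirements together, a minimal values fragment for this mode might look like the following sketch. The key paths are the ones named in this section. The longhorn storage class is just the example value used above, and any other ssh-gateway settings your deployment needs are omitted.

```yaml
global:
  ssh_gateway:
    enabled: true              # required, otherwise rendering fails

manage:
  perProjectPVC:
    # Must contain the literal PROJECT_ID.
    name: "pvc-cocalc-project-PROJECT_ID"
    # Example storage class; replace with one available in your cluster.
    sc: "longhorn"
```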

Note

In this mode manage-copy and manage-share may briefly start a stopped project on demand to pull files out of it. A deferred stop (default 10 min, skipped if the project is in active use as judged by last_edited) cleans up afterwards. Operators who tune timeout_pending_projects_min or have unusually slow image pulls should be aware of this short-lived pod activity from the manage side.

Example 1

If you bring your own NFS server and want to set up everything manually, you could follow the notes on setting up PV/PVC on Azure. They walk you through the steps of creating a PV and PVC for software and project data in two subdirectories. Then a small setup job is deployed, which creates these subdirectories and sets their proper ownership and permissions.
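
As a rough sketch of what such a manual setup involves, a statically defined PV pointing at an NFS export, together with a PVC bound to it, could look like this. The server address, export path, PV name, size and the cocalc namespace are all hypothetical placeholders. The PVC uses the default name projects-data, and the software volume would be defined analogously with its own subdirectory.

```yaml
# Sketch only: server, path, names and sizes are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-projects-data            # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""              # empty: no dynamic provisioning
  nfs:
    server: nfs.example.internal    # hypothetical NFS server
    path: /exports/cocalc/data      # hypothetical subdirectory for project data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: projects-data
  namespace: cocalc                 # assumed release namespace
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""              # must match the PV above
  volumeName: pv-projects-data      # bind explicitly to the static PV
  resources:
    requests:
      storage: 10Gi
```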

Example 2

For reference, on a cluster on Google’s GKE, the following PVs, PVCs and StorageClasses exist:

$ kubectl get -o wide pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                      STORAGECLASS   REASON   AGE    VOLUMEMODE
pvc-58ada9eb-220d-46d9-ba5d-31ceb0c0fc45   10Gi       ROX            Retain           Bound    cocalc/projects-software                   nfs                     111d   Filesystem
pvc-ace52cd2-fb85-4fb9-96f7-19cd9575f5c2   20Gi       RWO            Retain           Bound    cocalc/data-nfs-nfs-server-provisioner-0   pd-standard             112d   Filesystem
pvc-ceae334d-8f0a-447b-b5c6-fcae6843a498   10Gi       RWX            Retain           Bound    cocalc/projects-data                       nfs                     111d   Filesystem

$ kubectl get -o wide pvc
NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE    VOLUMEMODE
data-nfs-nfs-server-provisioner-0   Bound    pvc-ace52cd2-fb85-4fb9-96f7-19cd9575f5c2   20Gi       RWO            pd-standard    112d   Filesystem
projects-data                       Bound    pvc-ceae334d-8f0a-447b-b5c6-fcae6843a498   10Gi       RWX            nfs            111d   Filesystem
projects-software                   Bound    pvc-58ada9eb-220d-46d9-ba5d-31ceb0c0fc45   10Gi       ROX            nfs            111d   Filesystem


$ kubectl get -o wide storageclass nfs
NAME   PROVISIONER                                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs    cluster.local/nfs-nfs-server-provisioner   Retain          Immediate           true                   111d

$ kubectl get -o wide storageclass pd-standard
NAME          PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
pd-standard   pd.csi.storage.gke.io   Retain          Immediate           true                   111d

Note

  • The nfs storage class is created by the nfs-ganesha-server-and-external-provisioner helm chart. It uses a pd-standard disk to store the data.

  • projects-data and projects-software are provided by that NFS service.

  • There is no PV/PVC for a database, because this cluster uses GCP’s managed PostgreSQL service.