Storage

Earlier, we discussed the possibilities to setup storage. Now, let’s see how to configure it!

By default, the HELM chart creates PersistentVolumeClaims for the data and software using the nfs StorageClass. You can configure the storage class and size of the PVCs in the values.yaml file. This is not recommended, because it simply uses the default StorageClass and that might not be what you want. Look out for a section like this:

storage:
class: "nfs"
size:
  software: 10Gi
  data: 10Gi

Alternatively, and highly reommended, you simply create the PVCs yourself and use them. For that, set these two aspects:

  1. Don’t create them via the HELM charts, i.e. storage.create: false.

  2. Let CoCalc know about their names. E.g. if they’re called pvc-data and pvc-software, the relevant part in the config file would look like this:

storage:
  create: false

global:
  [...]
  storage:
    data:
      claimName: pvc-data
    software:
      claimName: pvc-software

The default names are projects-data and projects-software.

Note

The most important detail to know is that these file-systems are accessed by the projects, where the users run with the UID 2001 and GID 2001. This is for security reasons!

But, this also means that your storage must give them proper access. Despite setting the fsGroup value in the securityContext, your NFS server’s StorageClass might not respect that.

A symptom of wrong permissions are projects failing with an Error. Check their logs, they might end with:

cp: cannot create regular file '/home/user/.bashrc': Permission denied

To fix the permissions, you can enable an Init Container for each project. This init container runs as root, before the project itself is started, and does the chown on the directory, where the project’s data (i.e. “home directory”) will be stored.

The setting is: manage.project.fixPermissionsInit: true (default false)

Regarding an NFS server, what’s also necessary is to have no_root_squash – because otherwise the chmod command issued by a root user in that Init Container will not work.

Example 1

If you bring your own NFS server and want to setup everything manually, you could follow the notes of setting up PV/PVC on Azure. This walks you through the steps of creating a PV and PVC for software and project data in two subdirectories. Then it deploys a small setup job, which creates these subdirectories and sets their proper ownership and permissions.

Example 2

As reference, on a cluster on Google’s GKE, the following PV, PVCs and StorageClasses exist:

$ kubectl get -o wide pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                      STORAGECLASS   REASON   AGE    VOLUMEMODE
pvc-58ada9eb-220d-46d9-ba5d-31ceb0c0fc45   10Gi       ROX            Retain           Bound    cocalc/projects-software                   nfs                     111d   Filesystem
pvc-ace52cd2-fb85-4fb9-96f7-19cd9575f5c2   20Gi       RWO            Retain           Bound    cocalc/data-nfs-nfs-server-provisioner-0   pd-standard             112d   Filesystem
pvc-ceae334d-8f0a-447b-b5c6-fcae6843a498   10Gi       RWX            Retain           Bound    cocalc/projects-data                       nfs                     111d   Filesystem

$ kubectl get -o wide pvc
NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE    VOLUMEMODE
data-nfs-nfs-server-provisioner-0   Bound    pvc-ace52cd2-fb85-4fb9-96f7-19cd9575f5c2   20Gi       RWO            pd-standard    112d   Filesystem
projects-data                       Bound    pvc-ceae334d-8f0a-447b-b5c6-fcae6843a498   10Gi       RWX            nfs            111d   Filesystem
projects-software                   Bound    pvc-58ada9eb-220d-46d9-ba5d-31ceb0c0fc45   10Gi       ROX            nfs            111d   Filesystem


$ kubectl get -o wide storageclass nfs
NAME   PROVISIONER                                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs    cluster.local/nfs-nfs-server-provisioner   Retain          Immediate           true                   111d

$ kubectl get -o wide storageclass pd-standard
NAME          PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
pd-standard   pd.csi.storage.gke.io   Retain          Immediate           true                   111d

Note

  • The nfs storage class is created by the nfs-ganesha-server-and-external-provisioner helm chart. It uses a pd-standard disk to store the data.

  • projects-data and projects-software are provided by that NFS service.

  • There is no PV/PVC for a database, because this cluster uses GCP’s managed PostgreSQL service.