Patching Projects

This section covers how to customize the actual Pod of a project beyond what configuring my-values.yaml or license quotas allows. This customization is specific to individual projects and does not apply across all of them.

To make this possible, you can configure Licenses containing a JSON Patch. A JSON Patch is a JSON array containing a series of JSON Patch operations. The basic idea is that when such a license is applied to a project, its operations are applied to the project pod data structure right before it is sent to the Kubernetes API. If there are several licenses, their operations are aggregated.

This is the sequence leading to the final project pod:

  1. During Helm deployment, the charts/manage/templates/_project.tpl template is rendered and the resulting JSON is stored in the project-pod ConfigMap. If you are adventurous, you could deploy your own project-pod ConfigMap with a different template!

  2. When a project is requested to start, the “manage” service reads the template from the project-pod ConfigMap. Feel free to run kubectl get configmap project-pod -o yaml to see it.

  3. It replaces placeholders:

    • {PROJECT_IMAGE} the docker image name to use for the project, that holds the “software” (what users work with)

    • {PROJECT_TAG} the docker image tag to use for the project, corresponding to the image above. Image name and tag are joined with a ":".

    • {project_id} the project’s UUID, as defined by CoCalc in the database

    • {extra_env} additional environment variables to set during project initialization

    • {project_config} the project configuration, as defined by CoCalc in the database

  4. When the licenses are processed, a quota is computed.

    • This adjusts the CPU and memory requests and limits of the project pod.

    • Exact details depend on the max_upgrades and default_quotas server settings.

    • A license quota could contain a “patch” as well.

  5. The combined list of all patches found in the licenses is applied to the project pod template after all the above happened. CoCalc uses the applyPatch function of fast-json-patch.
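The sequence above can be sketched roughly as follows. This is only an illustration, not CoCalc's actual code: the template is a made-up, heavily shortened pod, the placeholder values are invented, plain string substitution is an assumption about how placeholders are filled, and the hypothetical applyAddOps helper is a tiny stand-in for fast-json-patch's applyPatch that understands nothing but simple "add" operations.

```javascript
// Hypothetical, heavily simplified stand-in for the rendered pod template.
const template = JSON.stringify({
  metadata: { labels: { project_id: "{project_id}", run: "project" } },
  spec: { containers: [{ image: "{PROJECT_IMAGE}:{PROJECT_TAG}" }] },
});

// Step 3: substitute placeholders (assumed: plain string replacement), then parse.
function fillTemplate(tpl, vars) {
  let out = tpl;
  for (const [key, value] of Object.entries(vars)) {
    out = out.split(`{${key}}`).join(value);
  }
  return JSON.parse(out);
}

// Step 5: gather the patch operations of all licenses into one list.
// Each quota holds its patch as a serialized JSON string.
function collectPatches(quotas) {
  return quotas
    .filter((q) => typeof q.patch === "string")
    .flatMap((q) => JSON.parse(q.patch));
}

// Stand-in for applyPatch: only "add" on object members and "-" array appends.
function applyAddOps(doc, ops) {
  for (const op of ops) {
    if (op.op !== "add") throw new Error(`unsupported op: ${op.op}`);
    const keys = op.path
      .split("/")
      .slice(1)
      .map((k) => k.replace(/~1/g, "/").replace(/~0/g, "~"));
    const last = keys.pop();
    const parent = keys.reduce((node, k) => node[k], doc);
    if (Array.isArray(parent)) {
      if (last !== "-") throw new Error("only '-' (append) is supported for arrays");
      parent.push(op.value);
    } else {
      parent[last] = op.value;
    }
  }
  return doc;
}

const quotas = [
  { cpu: 1, ram: 1, patch: '[{"op":"add","path":"/metadata/labels/mylabel","value":"test123"}]' },
  { cpu: 2, ram: 4 }, // a license without a patch
];

const pod = applyAddOps(
  fillTemplate(template, {
    project_id: "00000000-0000-4000-8000-000000000000", // made-up UUID
    PROJECT_IMAGE: "example/project", // made-up image name
    PROJECT_TAG: "latest",
  }),
  collectPatches(quotas)
);
console.log(pod.metadata.labels);
```

The point of the sketch is the ordering: placeholders are filled first, and the aggregated patch operations run last, so a patch sees the final values.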

Example

As admin, create a “Site License” and in the JSON Patch field, add this:

[
  {
    "op": "add",
    "path": "/metadata/labels/mylabel",
    "value": "test123"
  }
]

This operation adds a label to the metadata of the project pod. See RFC 6902 for more information.

Then, add this license to a project, which will restart it.

As a result, the project pod is now:

$ kubectl get pod -o yaml project-[UUID]
apiVersion: v1
kind: Pod
metadata:
  annotations: {...}
  labels:
    mylabel: test123   # <<< that's new
    network: outside
    project_id: [UUID]
    project_tag: project-20230110-1143
    run: project
[...]

How to create a patch?

You could use the fast-json-patch library to create a patch:

npm init -y
npm i fast-json-patch

Get the project pod of a running project as JSON, and make a copy:

$ kubectl get pod -o json project-[UUID] > p1.json
$ cp p1.json p2.json

Now, edit p2.json to your liking, but don’t touch the fields that belong to Kubernetes! For example, to add the label from above and change the service account, the changes look like this:

$ diff p1.json p2.json
14c14,15
<             "run": "project"
---
>             "run": "project",
>             "mylabel": "test123"
191,192c192,193
<         "serviceAccountName": "default",
---
>         "serviceAccountName": "default2",

Now, create a script diff.js with the following content:

var jsonpatch = require("fast-json-patch");
var fs = require("fs");

var p1 = JSON.parse(fs.readFileSync("p1.json", "utf8"));
var p2 = JSON.parse(fs.readFileSync("p2.json", "utf8"));

var delta = jsonpatch.compare(p1, p2);
console.log(delta);

and run it:

$ node diff.js
[
  {
    op: 'replace',
    path: '/spec/serviceAccountName',
    value: 'default2'
  },
  { op: 'add', path: '/metadata/labels/mylabel', value: 'test123' }
]

That’s the patch for the license to make the above changes.
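One caveat: console.log prints the patch in Node's inspection format (single quotes, unquoted keys), which is not strict JSON. Assuming the JSON Patch field of the license expects strict JSON, a small variation serializes the result with JSON.stringify instead. The delta below is copied from the output above so that the snippet runs on its own.

```javascript
// The patch as produced by jsonpatch.compare(p1, p2) in diff.js.
const delta = [
  { op: "replace", path: "/spec/serviceAccountName", value: "default2" },
  { op: "add", path: "/metadata/labels/mylabel", value: "test123" },
];

// JSON.stringify emits strict JSON: double-quoted keys and strings.
const patchJson = JSON.stringify(delta, null, 2);
console.log(patchJson);
```

In diff.js itself, this amounts to replacing console.log(delta) with console.log(JSON.stringify(delta, null, 2)).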

Note: under the hood, the patch is stored as a serialized JSON string in the site_licenses database table:

cocalc=> select * from site_licenses where id = 'f3d76abf-38a5-4e4e-884d-2e70f1afddb9';
id          | f3d76abf-38a5-4e4e-884d-2e70f1afddb9
title       | patch1
activates   | 2023-01-20 16:37:12.595
quota       | {"cpu": 1, "ram": 1, "patch": "[{\"op\":\"add\",\"path\":\"/metadata/labels/mylabel\",\"value\":\"test123\"}]"}
run_limit   | 1
[...]
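The escaped quotes in the quota column come from the patch being a string inside the quota object, so it is serialized twice overall. A minimal sketch of that round trip, using the values from the row above (how CoCalc itself reads and writes this column is not shown here):

```javascript
const labelPatch = [
  { op: "add", path: "/metadata/labels/mylabel", value: "test123" },
];

// The patch is embedded as a string, not as a nested array.
const quota = { cpu: 1, ram: 1, patch: JSON.stringify(labelPatch) };

// Serializing the quota escapes the inner quotes, as seen in the table.
const stored = JSON.stringify(quota);
console.log(stored);

// Reading the patch back therefore takes two parse steps.
const restored = JSON.parse(JSON.parse(stored).patch);
console.log(restored);
```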

Use Cases

Access all user files

The above can be used to outfit a specific project for administrators to have access to all files of all users. To do this, the key idea is to mount the shared volume for user data – without specifying the project’s UUID as a sub-directory.

This is the patch to mount the volume named "home" at the directory /global in the project.

[
  {
    "op": "add",
    "path": "/spec/containers/0/volumeMounts/-",
    "value": {
      "mountPath": "/global",
      "name": "home"
    }
  }
]

This patch could be part of an “Admin” license, which enables read/write access to the global /ext directory as well. For example, a possible configuration for a powerful admin license could be:

{
  "cpu": 1,
  "ram": 3,
  "patch": "[{\"op\":\"add\",\"path\":\"/spec/containers/0/volumeMounts/7\",\"value\":{\"mountPath\":\"/global\",\"name\":\"home\"}}]",
  "ext_rw": true,
  "always_running": true
}

If you have further ideas, look at the top of this page to see how and where you can investigate the project’s template and get the current version directly from Kubernetes.

Additional storage

Here, we show how to mount an additional PVC called storage-extra in a project at /extra.

First, you have to define it. For example, we use the following pvc-extra.yaml file, to define some additional storage on our NFS server, which runs a provisioner for the StorageClass nfs. This does not really matter, though, and you can configure this in any way you like.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: storage-extra
  namespace: cocalc
spec:
  storageClassName: "nfs" # the storageclass, in my case an NFS provisioner
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: "10G"

What really counts is adding a volume and volumeMount definition on the first container of the project pod.

The patch below first appends the name/mountPath key/value object to the existing first container’s volumeMounts array. The .../- means “add at the end of the array”. You can change the mountPath in any way you like, although some paths are already in use. Pick neither /home/user nor /cocalc. Replacing Linux standard directories is not a good idea either. /data/... might be a good choice, though.

Similarly, we append the name/persistentVolumeClaim object to the spec.volumes array. Again, you can do this any way you like, depending on what your cluster’s storage system can do for you. The PVC is just an example; you can reference an NFS or CephFS server directly as well.

The value.name of both operations must match, and that name must also be unique among the volumes and mounts.

[
  {
    "op": "add",
    "path": "/spec/containers/0/volumeMounts/-",
    "value": {
      "name": "storage-extra",
      "mountPath": "/extra"
    }
  },
  {
    "op": "add",
    "path": "/spec/volumes/-",
    "value": {
      "name": "storage-extra",
      "persistentVolumeClaim": {
        "claimName": "storage-extra"
      }
    }
  }
]
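Since the two names have to agree, a quick check before pasting the patch into a license can catch typos. The following is a rough sketch: unmatchedMountNames is a hypothetical helper, and it only looks at "add" operations within the patch itself. A mount that refers to a volume already present in the pod template (such as "home" in the earlier example) would be reported as well, even though it is valid.

```javascript
const storagePatch = [
  {
    op: "add",
    path: "/spec/containers/0/volumeMounts/-",
    value: { name: "storage-extra", mountPath: "/extra" },
  },
  {
    op: "add",
    path: "/spec/volumes/-",
    value: {
      name: "storage-extra",
      persistentVolumeClaim: { claimName: "storage-extra" },
    },
  },
];

// Returns the names of volumeMounts that have no volume of the same name
// defined in the same patch.
function unmatchedMountNames(ops) {
  const adds = ops.filter((o) => o.op === "add");
  const mounts = adds
    .filter((o) => /\/volumeMounts\/[^/]+$/.test(o.path))
    .map((o) => o.value.name);
  const volumes = new Set(
    adds
      .filter((o) => /^\/spec\/volumes\/[^/]+$/.test(o.path))
      .map((o) => o.value.name)
  );
  return mounts.filter((name) => !volumes.has(name));
}

console.log(unmatchedMountNames(storagePatch)); // prints []
```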

As a result, once the license with that patch has been applied and the project has restarted, the project has an additional mount point at /extra.

Beyond that, do you want to give some users access to the same volume, but read-only?

  1. Create a new license, call it something like “storage extra read-only”.

  2. Change the run limit to a high value, e.g. 9999. (Note: 0 for unlimited seems to be broken).

  3. Define almost the same patch, but with a small tweak. Mount the volume with "readOnly": true:

    [{ "op": "add", ... "value": { ...,  "readOnly": true }}, {"op": "add", ... }]
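Instead of editing the JSON by hand, the read-only variant can also be derived from the read/write patch. This is only a sketch under the assumption that the storage-extra patch from above is the starting point; it sets "readOnly": true on every volumeMounts entry and leaves the volumes entry untouched.

```javascript
const readWritePatch = [
  {
    op: "add",
    path: "/spec/containers/0/volumeMounts/-",
    value: { name: "storage-extra", mountPath: "/extra" },
  },
  {
    op: "add",
    path: "/spec/volumes/-",
    value: {
      name: "storage-extra",
      persistentVolumeClaim: { claimName: "storage-extra" },
    },
  },
];

// Copy the patch and mark each volumeMount as read-only.
// The original patch is left unmodified.
const readOnlyPatch = readWritePatch.map((op) =>
  op.path.includes("/volumeMounts/")
    ? { ...op, value: { ...op.value, readOnly: true } }
    : op
);

console.log(JSON.stringify(readOnlyPatch, null, 2));
```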
    

Problems?

To debug this, you have to check the log of manage-action-... during project startup. It might fail with an error about a problematic JSON object (e.g. above, it fails for me, because I do not have a “default2” service account).

You can also set the manage.dbg_project_patching value to "1" to output the project pod JSON right before the patch is applied and also the patch set itself. Look for lines containing project_pod= and project_pod_before= and patch=.