Patching Projects
This section covers how to customize the actual Pod of a project beyond what configuring my-values.yaml or license quotas can do. This customization applies to individual projects, not to all of them!
To make this possible, you can configure Licenses containing a JSON Patch. A JSON Patch is a JSON array, stored as a string, containing a series of JSON Patch operations. The basic idea is that by applying such a license, its operations are applied to the project pod data structure right before it is sent to the Kubernetes API. If there are several licenses, their operations are aggregated.
This is the sequence leading to the final project pod:
1. During Helm deployment, the charts/manage/templates/_project.tpl template is rendered and the resulting JSON is stored in the project-pod ConfigMap. If you are audacious, you could deploy your own project-pod ConfigMap with a different template!
2. When a project is requested to start, the “manage” service reads the template from the project-pod ConfigMap. Feel free to run kubectl get configmap project-pod -o yaml to see it.
3. It replaces placeholders:
   {PROJECT_IMAGE}: the Docker image name to use for the project, which holds the “software” (what users work with)
   {PROJECT_TAG}: corresponding to the above, the Docker image tag to use for the project. Image name and tag are concatenated with a ":".
   {project_id}: the project’s UUID, as defined by CoCalc in the database
   {extra_env}: additional environment variables to set during project initialization
   {project_config}: the project configuration, as defined by CoCalc in the database
4. When the licenses are processed, a quota is computed. This adjusts the CPU and memory requests and limits of the project pod. Exact details depend on the max_upgrades and default_quotas server settings. A license quota could contain a “patch” as well.
5. The combined list of all patches found in the licenses is applied to the project pod template after all the above happened. CoCalc uses the applyPatch function of fast-json-patch.
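The aggregation step can be pictured with a small sketch. It is not CoCalc's actual code: the function name collectPatches and the sample quotas are made up for illustration, and it assumes that each license quota may carry a "patch" field holding a serialized JSON array and that operations are simply concatenated in license order.

```javascript
// Sketch only: collect the JSON Patch operations of all licenses of a project.
// Assumptions (not confirmed by CoCalc's source): each quota may have a
// "patch" field holding a serialized JSON array, and the operations are
// concatenated in the order of the licenses.
function collectPatches(quotas) {
  const ops = [];
  for (const quota of quotas) {
    if (typeof quota.patch !== "string") continue; // license without a patch
    const parsed = JSON.parse(quota.patch);
    if (!Array.isArray(parsed)) {
      throw new Error("a JSON Patch must be an array of operations");
    }
    ops.push(...parsed);
  }
  return ops;
}

// Hypothetical quotas, modeled on the examples on this page.
const sampleQuotas = [
  {
    cpu: 1,
    ram: 1,
    patch: '[{"op":"add","path":"/metadata/labels/mylabel","value":"test123"}]',
  },
  { cpu: 2, ram: 4 }, // no patch at all
  {
    ram: 3,
    patch:
      '[{"op":"add","path":"/spec/containers/0/volumeMounts/-","value":{"mountPath":"/global","name":"home"}}]',
  },
];

const combinedOps = collectPatches(sampleQuotas);
console.log(JSON.stringify(combinedOps, null, 2));
```

The resulting single array is what a JSON Patch library such as fast-json-patch would then apply to the pod object in one go.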
Example
As admin, create a “Site License” and in the JSON Patch field, add this:
[
  {
    "op": "add",
    "path": "/metadata/labels/mylabel",
    "value": "test123"
  }
]
This operation adds a label to the metadata of the project pod. See RFC 6902 for more information.
Then, add this license to a project, which will restart it.
As a result, the project pod is now:
$ kubectl get pod -o yaml project-[UUID]
apiVersion: v1
kind: Pod
metadata:
  annotations: {...}
  labels:
    mylabel: test123 # <<< that's new
    network: outside
    project_id: [UUID]
    project_tag: project-20230110-1143
    run: project
[...]
How to create a patch?
You could use the fast-json-patch library to create a patch:
npm init -y
npm i fast-json-patch
Get the project pod of a running project as JSON, and make a copy:
$ kubectl get pod -o json project-[UUID] > p1.json
$ cp p1.json p2.json
Now, edit p2.json to your liking, but don’t touch the fields that belong to Kubernetes!
For example, to add the label from above and also change the service account, the changes look like this:
$ diff p1.json p2.json
14c14,15
< "run": "project"
---
> "run": "project",
> "mylabel": "test123"
191,192c192,193
< "serviceAccountName": "default",
---
> "serviceAccountName": "default2",
Now, create a script diff.js with the following content:
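const jsonpatch = require("fast-json-patch");
const fs = require("fs");
// p1.json: the pod as fetched from Kubernetes; p2.json: the edited copy
const p1 = JSON.parse(fs.readFileSync("p1.json", "utf8"));
const p2 = JSON.parse(fs.readFileSync("p2.json", "utf8"));
// compare() returns the JSON Patch operations that turn p1 into p2
const delta = jsonpatch.compare(p1, p2);
console.log(delta);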
and run it:
$ node diff.js
[
  {
    op: 'replace',
    path: '/spec/serviceAccountName',
    value: 'default2'
  },
  { op: 'add', path: '/metadata/labels/mylabel', value: 'test123' }
]
That’s the patch for the license to make the above changes.
Note: under the hood, the patch is stored as a serialized JSON string in the site_licenses database table:
cocalc=> select * from site_licenses where id = 'f3d76abf-38a5-4e4e-884d-2e70f1afddb9';
id | f3d76abf-38a5-4e4e-884d-2e70f1afddb9
title | patch1
activates | 2023-01-20 16:37:12.595
quota | {"cpu": 1, "ram": 1, "patch": "[{\"op\":\"add\",\"path\":\"/metadata/labels/mylabel\",\"value\":\"test123\"}]"}
run_limit | 1
[...]
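Note that console.log in diff.js prints JavaScript object notation (unquoted keys, single quotes), which is not valid JSON. The sketch below shows one way to get a string that can be pasted into the JSON Patch field, and how the escaped form seen in the database row comes about. The quota object here is only modeled on that row; the variable names are made up for illustration.

```javascript
// The patch as a plain JavaScript value (same operation as in the example).
const labelPatch = [
  { op: "add", path: "/metadata/labels/mylabel", value: "test123" },
];

// JSON.stringify emits valid JSON, suitable for the JSON Patch field.
const labelPatchJson = JSON.stringify(labelPatch);
console.log(labelPatchJson);

// Assumed shape, modeled on the "quota" column shown above: the patch is
// embedded as a string, so its quotes show up escaped once the whole quota
// is serialized again.
const exampleQuota = { cpu: 1, ram: 1, patch: labelPatchJson };
const exampleQuotaJson = JSON.stringify(exampleQuota);
console.log(exampleQuotaJson);

// Reading it back takes two parsing steps: first the quota, then the patch.
const roundTripped = JSON.parse(JSON.parse(exampleQuotaJson).patch);
console.log(roundTripped.length);
```

In diff.js, replacing console.log(delta) with console.log(JSON.stringify(delta)) would produce the same kind of paste-ready output.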
Use Cases
Access all user files
The above can be used to set up a specific project in which administrators have access to all files of all users. The key idea is to mount the shared volume for user data without specifying the project’s UUID as a sub-directory.
This patch mounts the volume named "home" at the directory /global in the project:
[
  {
    "op": "add",
    "path": "/spec/containers/0/volumeMounts/-",
    "value": {
      "mountPath": "/global",
      "name": "home"
    }
  }
]
This patch could be part of an “Admin” license, which enables read/write access to the global /ext directory as well.
For example, a possible configuration for a powerful admin license could be:
{
  "cpu": 1,
  "ram": 3,
  "patch": "[{\"op\":\"add\",\"path\":\"/spec/containers/0/volumeMounts/7\",\"value\":{\"mountPath\":\"/global\",\"name\":\"home\"}}]",
  "ext_rw": true,
  "always_running": true
}
If you have further ideas, look at the top of this page to see how and where you can investigate the project’s template and get the current version directly from Kubernetes.
Additional storage
Here, we show how to mount an additional PVC called storage-extra in a project at /extra.
First, you have to define it. For example, we use the following pvc-extra.yaml file to define some additional storage on our NFS server, which runs a provisioner for the StorageClass nfs. The details do not really matter, though, and you can configure this in any way you like.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: storage-extra
  namespace: cocalc
spec:
  storageClassName: "nfs" # the storageclass, in my case an NFS provisioner
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: "10G"
What really counts is adding a volumeMount definition to the first container of the project pod and a matching volume definition to the pod itself.
The patch below first appends the name/mountPath object to the volumeMounts array of the first container. The trailing .../- means “add at the end of the array”. You can change the mountPath in any way you like, although some paths are already in use: pick neither /home/user nor /cocalc. Replacing Linux standard directories is not a good idea either. /data/... might be a good idea, though.
Similarly, we append the name/persistentVolumeClaim object to the spec.volumes array. Again, you can do this any way you like, depending on what your cluster’s storage system can do for you. The PVC is just an example; you can reference an NFS or CephFS server directly as well.
The value.name of both operations must match, and the name must also be unique among the volumes and mounts.
[
  {
    "op": "add",
    "path": "/spec/containers/0/volumeMounts/-",
    "value": {
      "name": "storage-extra",
      "mountPath": "/extra"
    }
  },
  {
    "op": "add",
    "path": "/spec/volumes/-",
    "value": {
      "name": "storage-extra",
      "persistentVolumeClaim": {
        "claimName": "storage-extra"
      }
    }
  }
]
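To preview locally what the two operations above do, here is a minimal sketch of the RFC 6902 "add" operation. It is for illustration only and no substitute for fast-json-patch: the helper applyAdd is hypothetical, it only handles non-empty paths, and the toy pod with its single existing mount is made up.

```javascript
// Minimal sketch of the RFC 6902 "add" operation (non-empty paths only).
function applyAdd(doc, path, value) {
  // Split the JSON Pointer into tokens and undo the ~1 and ~0 escapes.
  const tokens = path
    .split("/")
    .slice(1)
    .map((t) => t.replace(/~1/g, "/").replace(/~0/g, "~"));
  const last = tokens.pop();
  // Walk to the parent of the target location.
  let parent = doc;
  for (const token of tokens) {
    if (parent === null || typeof parent !== "object" || !(token in parent)) {
      throw new Error(`path not found: ${path}`);
    }
    parent = parent[token];
  }
  if (Array.isArray(parent)) {
    // "-" appends; a number inserts at that index and shifts the rest.
    const index = last === "-" ? parent.length : Number(last);
    if (!Number.isInteger(index) || index < 0 || index > parent.length) {
      throw new Error(`invalid array index in ${path}`);
    }
    parent.splice(index, 0, value);
  } else {
    parent[last] = value; // add or replace an object member
  }
}

// Toy pod with made-up content; a real project pod has many more fields.
const toyPod = {
  spec: {
    containers: [
      { name: "project", volumeMounts: [{ name: "home", mountPath: "/home/user" }] },
    ],
    volumes: [{ name: "home" }],
  },
};

const storagePatch = [
  {
    op: "add",
    path: "/spec/containers/0/volumeMounts/-",
    value: { name: "storage-extra", mountPath: "/extra" },
  },
  {
    op: "add",
    path: "/spec/volumes/-",
    value: { name: "storage-extra", persistentVolumeClaim: { claimName: "storage-extra" } },
  },
];

for (const { op, path, value } of storagePatch) {
  if (op !== "add") throw new Error(`unsupported op: ${op}`);
  applyAdd(toyPod, path, value);
}

// Every mount must refer to a volume of the same name.
const volumeNames = new Set(toyPod.spec.volumes.map((v) => v.name));
const unmatchedMounts = toyPod.spec.containers[0].volumeMounts
  .filter((m) => !volumeNames.has(m.name))
  .map((m) => m.name);
console.log(unmatchedMounts); // [] when every mount has a matching volume
```

Because "-" always appends, the patch does not depend on how many mounts the pod template already defines, whereas a fixed numeric index fails once it exceeds the current array length.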
As a result, once the license with that patch has been applied to a project and the project has restarted, there is an additional mount point at /extra.
Beyond that, do you want to give some users access to the same volume, but read-only?
1. Create a new license, call it something like “storage extra read-only”.
2. Change the run limit to a high value, e.g. 9999. (Note: 0 for unlimited seems to be broken.)
3. Define almost the same patch, but with a small tweak. Mount the volume with "readOnly": true:
   [{ "op": "add", ... "value": { ..., "readOnly": true }}, {"op": "add", ... }]
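One way to avoid editing the patch by hand is to derive the read-only variant from the read/write patch of the “Additional storage” example. This is only a sketch under that assumption: the variable names are made up, and the rule “operations whose path contains /volumeMounts/ get readOnly” is a simplification that fits this particular patch.

```javascript
// Read/write patch from the "Additional storage" example.
const rwPatch = [
  {
    op: "add",
    path: "/spec/containers/0/volumeMounts/-",
    value: { name: "storage-extra", mountPath: "/extra" },
  },
  {
    op: "add",
    path: "/spec/volumes/-",
    value: { name: "storage-extra", persistentVolumeClaim: { claimName: "storage-extra" } },
  },
];

// Copy each operation; only those targeting volumeMounts get readOnly: true.
// The original patch is left unmodified.
const roPatch = rwPatch.map((operation) =>
  operation.path.includes("/volumeMounts/")
    ? { ...operation, value: { ...operation.value, readOnly: true } }
    : operation
);

// Paste-ready JSON for the JSON Patch field of the read-only license.
console.log(JSON.stringify(roPatch, null, 2));
```

The volume definition itself stays the same in both licenses; only the mount inside the container differs.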
Problems?
To debug this, check the log of manage-action-... during project startup. It might fail with an error about a problematic JSON object (e.g. above, it fails for me, because I do not have a “default2” service account).
You can also set the manage.dbg_project_patching value to "1" to output the project pod JSON right before the patch is applied and also the patch set itself. Look for lines containing project_pod= and project_pod_before= and patch=.