Microsoft Azure/AKS

The following walks you through my journey of setting up a CoCalc cluster on Microsoft Azure AKS. This guide was written in November 2023. Feel free to deviate at any point from this guide to fit your needs.

Resource groups

I’ve created a new one called “cocalc-onprem”.

Kubernetes Cluster

To get started, we set up a Kubernetes cluster: go to “Kubernetes services” → “Create” → “Create Kubernetes cluster”.

The overall goal is to set up two node pools:

  • “services”: 2× small nodes with 2 CPU cores and ~16 GB RAM, for Kubernetes itself and the CoCalc services,

  • “projects”: 1× or more nodes for the CoCalc projects, with 4 CPU cores and more memory.

Note

All nodes must run Linux and have x86 CPUs.

Basics

  • Subscription/Resource Group: the usual, and “cocalc-onprem”

  • Cluster preset: “Dev/Test” (YMMV)

  • Cluster name: cocalc1

  • Region: East US

  • Availability zones: 1, 2, 3

  • AKS pricing tier: Free

  • Kubernetes version: 1.26.6 (as of 2023-11-06, that’s the default)

  • Automatic upgrade: I picked the recommended one, i.e. “Enabled with patches”

  • Authentication and Authorization: Local accounts with Kubernetes RBAC

Node Pools

Service node pool

That’s the first, “default” pool of nodes, where the Kubernetes system services will also run. I clicked on “agentpool” and renamed it to “services”.

  • Mode: System

  • OS SKU: Ubuntu Linux (I don’t know if NFS mounts work with Azure Linux; something to experiment with later on.)

  • Availability zones: 1, 2, 3

  • VM Size: filtered for 2 vCPU (x86/64) and 16 GB RAM; the cheapest one was A2m_v2. A better choice might be 4 vCPU and 16 GB RAM.

  • Scale method: manual

  • Node count: 2

  • Max pods per node: 110 (default)

  • Public IP per node: No

  • Kubernetes label: cocalc-role=services

Project node pool

Click on “+ Add node pool”:

  • Name: projects

  • Mode: User

  • OS SKU: Ubuntu Linux

  • Availability zones: 1, 2, 3

  • Spot instances: Yes (cheaper, but projects get interrupted randomly, and you have to make sure you have enough quota)

  • Spot type: Capacity only

  • Spot policy: Delete

  • Spot VM size: D4s_v3 (4 vCPU (x86/64) and 16 GB RAM). A better choice might be 4 vCPU and 32 GB RAM.

  • Public IP per node: No

  • Scale method: manual

  • Node count: 1

  • Node drain timeout: 5 (rather impatient)

  • Kubernetes label: cocalc-role=projects

  • Kubernetes taints (key=value:effect):

    • cocalc-projects-init=false:NoExecute

    • cocalc-projects=init:NoSchedule
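As an alternative to clicking through the portal, a node pool like this can also be created via the Azure CLI. This is only a sketch under the assumption that the cluster was created with the names above; verify the flags against your az version:

```shell
# sketch: create the "projects" spot pool via the Azure CLI instead of the portal
# (resource group, cluster name, labels and taints match the choices above)
az aks nodepool add \
  --resource-group cocalc-onprem \
  --cluster-name cocalc1 \
  --name projects \
  --mode User \
  --node-count 1 \
  --node-vm-size Standard_D4s_v3 \
  --priority Spot \
  --eviction-policy Delete \
  --labels cocalc-role=projects \
  --node-taints "cocalc-projects-init=false:NoExecute,cocalc-projects=init:NoSchedule"
```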

Networking

  • Private cluster: no (but maybe you know how to set this up)

  • Network configuration: kubenet (default)

  • Bring your own virtual network: Yes (which opened up a few defaults)

    • Virtual network: cocalc-onprem-vnet

    • Subnet: default (a /16)

  • DNS name prefix: cocalc1-dns (default)

  • Network policy: Calico (default)

Integrations

  • I disabled “Defender”

  • No container registry (images are external, but maybe you’ll mirror them internally)

  • Azure Monitor: Off (I don’t know the pricing of this, and I bet this can be enabled later on)

  • Alerting: kept the defaults

  • Azure policy: Disabled, since I don’t know what this is

Advanced

I kept the defaults. In particular, it tells me to set up a specifically named infrastructure resource group. Ok.

Tags

none

Validation/Creation

“Validation in progress”: took a minute or so. Creation as well.

Connecting

I was then able to open the cloud shell in Azure’s web interface (bash) and run:

az aks get-credentials --resource-group cocalc-onprem --name cocalc1

This added credentials to my ~/.kube/config file, after which I was able to run kubectl get nodes to see the three nodes, and kubectl get pods -A to see the system pods running. Success!

[ ~ ]$ kubectl get nodes
NAME                               STATUS   ROLES   AGE   VERSION
aks-projects-24080581-vmss000000   Ready    agent   34m   v1.26.6
aks-services-47947981-vmss000002   Ready    agent   49m   v1.26.6
aks-services-47947981-vmss000003   Ready    agent   49m   v1.26.6

[ ~ ]$ kubectl get pods -A
NAMESPACE         NAME                                       READY   STATUS      RESTARTS   AGE
aks-command       command-8a584164af9543f6ae7da87fde59b667   0/1     Completed   0          4m3s
calico-system     calico-kube-controllers-679bc4d8d7-8g2sw   1/1     Running     0          26h
calico-system     calico-node-7csgp                          1/1     Running     0          34m
calico-system     calico-node-f9jhp                          1/1     Running     0          49m
calico-system     calico-node-kbsww                          1/1     Running     0          49m
calico-system     calico-typha-77669b8d96-6pf4r              1/1     Running     0          34m
calico-system     calico-typha-77669b8d96-dqmcq              1/1     Running     0          26h
kube-system       cloud-node-manager-4dhz4                   1/1     Running     0          34m
kube-system       cloud-node-manager-bbfhv                   1/1     Running     0          49m
kube-system       cloud-node-manager-gzkfk                   1/1     Running     0          49m
[...]

Ref: Connect to an AKS cluster

Namespace “cocalc”

The Namespace throughout this documentation is cocalc. Here, we create it and switch to it.

kubectl create namespace cocalc
kubectl config set-context --current --namespace=cocalc

Tweaking AKS behavior

Warning

For reasons I don’t understand, AKS disallows changing the node taints of the project pool. This renders the Prepull feature useless, since it relies on changing taints.

A “trick” is to disable a hook via:

kubectl get ValidatingWebhookConfiguration aks-node-validating-webhook -o yaml | sed -e 's/\(objectSelector: \){}/\1{"matchLabels": {"disable":"true"}}/g' | kubectl apply -f -

Ref.: GitHub issue 2934
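To undo this later, you can patch the objectSelector back to an empty object. A sketch, assuming the webhook configuration contains a single webhook entry at index 0 (verify with kubectl get ... -o yaml first):

```shell
# sketch: restore the original (empty) objectSelector of the AKS node-validating webhook
# assumes exactly one webhook entry at index 0
kubectl patch ValidatingWebhookConfiguration aks-node-validating-webhook \
  --type=json \
  -p '[{"op": "replace", "path": "/webhooks/0/objectSelector", "value": {}}]'
```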

PostgreSQL

We use the “Azure Database for PostgreSQL Flexible Server” service to create a small database. The “flexible server” variant seems to be the newer one, while the non-flexible one is deprecated. Below are just the bare minimum parameters I chose for testing. For production, a small default configuration is probably fine.

  • Subscription and Resource group: same as above: “cocalc-onprem”

  • Name: cocalc1

  • Region: East US (same as the K8S cluster)

  • Version: 15 (default)

  • Workload: Development. (In general, the load should not exceed 1 core and should fit within a few GB of RAM.)

  • High availability: No, though YMMV

  • Authentication: PostgreSQL only (or both)

    • Username: cocalc

    • Password: [secret]

  • Networking:

    • Private access (more secure, and you do not need a public IP)

    • Virtual network: the existing one, same as the cluster above

    • Subnet: created a new one: db, 10.225.0.0/24. I wasn’t able to do this from here, but clicked on “Manage selected Virtual Network” and added that subnet from there. There, I also delegated it to Microsoft.DBforPostgreSQL/flexibleServers.

  • Security: defaults

  • Tags: empty

  • Review + Create: the price is about $16 per month

Once it has been created, in “Settings/Connect” it told me that PGHOST is cocalc1.postgres.database.azure.com and so on. Set those parameters in deployment configuration under global.database to tell CoCalc how to connect to the database.
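To verify connectivity from inside the cluster, you can run a throwaway PostgreSQL client pod. A sketch, assuming the hostname and credentials from above (the pod name pg-test is arbitrary; substitute the real password):

```shell
# sketch: test the database connection from within the cluster
# sslmode=disable matches flipping require_secure_transport to Off on the server
kubectl run -it --rm pg-test --image=postgres:15 --restart=Never \
  --env="PGPASSWORD=[secret]" -- \
  psql "host=cocalc1.postgres.database.azure.com user=cocalc dbname=cocalc sslmode=disable" \
  -c "SELECT version();"
```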

Note

You might wonder why I created a new subnet. I tried to set up a private link to the database, but I just got an error that this kind of sub-resource is not supported. In turn, this db subnet is not used elsewhere. The “magic” seems to be the subnet delegation.

Warning

The database connection does not use SSL encryption, hence you have to disable that requirement on the server. To do that, open the just-created database, go to Settings → Server Parameters, flip require_secure_transport to Off, and save.
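The same parameter change can also be made via the Azure CLI; a sketch with the resource names used above:

```shell
# sketch: disable the TLS requirement on the flexible server via the CLI
az postgres flexible-server parameter set \
  --resource-group cocalc-onprem \
  --server-name cocalc1 \
  --name require_secure_transport \
  --value off
```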


Storage/Files

The goal here is to create a managed shared file-system, which we’ll mount into the Kubernetes pods as a ReadWriteMany NFS file-system. Any storage solution that supports this should work. This guide uses the Azure Files solution to set up Azure NFS.

Note

Alternatively to the setup below, you can also run your own NFS server: Use a Linux NFS Server with AKS.

If you go down this route, continue at Kubernetes PV/PVC, where the PV/PVC setup is explained.

Basics

To get started: “Storage accounts” → “Create”

  • Subscription and Resource group: same as above: “cocalc-onprem”

  • Name: cocalc1 (globally unique)

  • Region: East US (same as the cluster and DB)

  • Performance: Premium (required for NFS)

  • Premium account type: File shares

  • Redundancy: Locally-redundant storage (LRS)

  • Advanced: I don’t know much about the options, hence I kept them as they are

  • Networking:

    • Network connectivity: disable public access / enable private access

    • Add private endpoint:

      • Name: cocalccloud1files

      • Storage sub-resource: file

      • Virtual network: picked the one of the K8S cluster

      • Subnet: default, i.e. the same as the K8S cluster above

      • Integrate with private DNS zone: yes

    • Microsoft network routing (the default)

  • Data protection and Encryption: kept as they are. However, once deployed, “Secure transfer required” must be disabled, because NFS does not support it. (You can change this later in Settings → Configuration.)

  • Tags: none

File share

After this was deployed, there is a menu entry “File share” to create a new one.

  • Name: cocalc-onprem-1

  • Provisioned capacity: 100 GB (minimum)

  • Protocol: NFS

  • Root Squash: No

Ref: Mount NFS Azure file share on Linux

I got an error about “Secure transfer required” and disabled it. Once that was done, the NFS file share panel showed me instructions on how to install the nfs-common package on Linux and how to mount this share. Looks good!

Network access

Make sure there is a private endpoint for the file share. If there isn’t one, open the cocalc-onprem-1 file share and add a private endpoint, where you can set up access:

  • Name: cocalc-onprem-files

  • Interface: same as above with -nic appended

  • Resource: file

  • Virtual network: the one of the cluster, and the subnet of the AKS cluster

  • Dynamically allocate IP address: Yes (although static might be better)

  • Private DNS: Yes

  • Tags: none

I opened that new “Private endpoint” and under Settings/DNS configuration, I saw there is an internal IP address and under FQDN there is cocalccloud1.file.core.windows.net. In the next step, that’s the name of the NFS server.
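To confirm that the private DNS zone works from inside the cluster, you can resolve the FQDN from a throwaway pod. A sketch (the pod name dns-test is arbitrary):

```shell
# sketch: the FQDN should resolve to the private endpoint's internal IP, not a public one
kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- \
  nslookup cocalccloud1.file.core.windows.net
```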

When this isn’t set up correctly, you get an error like the following when trying to mount from within a pod in the Kubernetes cluster:

mount.nfs: access denied by server while mounting cocalccloud1.file.core.windows.net:/cocalccloud1/cocalc-onprem-1

What helped me, essentially, was the private endpoint with its DNS configuration described above.

Kubernetes PV/PVC

Once we have the NFS server running, we create three PersistentVolumes and corresponding PersistentVolumeClaims. We expand on the second part of Use a Linux NFS Server with AKS.

Specific to CoCalc, we need two PVCs: one for the data of all projects and one for the globally shared software. We mount them from subdirectories of the NFS file share (this doesn’t need to be the case, but why not…), which will be /data and /software. The names used here are the default values – otherwise, specify them.

Warning

Since we set up the PVCs on our own, you have to tell CoCalc not to create them. That’s the storage.create: false setting in Storage.

The way I created this is by setting up a third PV/PVC pair called root-data. Then, I run a small setup Job, which creates these subdirectories and fixes the ownership and permissions.

Make sure you deploy this in your namespace.

Note

You have to change the NFS server name and path to match your setup.

We start by defining the PVs: download pv-nfs.yaml, edit it, and then add it via kubectl apply -f pv-nfs.yaml.

To figure out server and path, open the file share and look at the “Connect from Linux” information. There is a line for the mount command. This is composed of:

[FQDN of server]:/[storage account name]/[file share name]

Below in pv-nfs.yaml, the server is the FQDN of the server. The path is the remainder, i.e. /[storage account name]/[file share name] for the “root” PV, and the others have additional subdirectories appended, i.e. .../data and .../software.
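As a sanity check, composing server and path can be sketched in plain bash, using hypothetical example values for the storage account and file share:

```shell
# hypothetical example values – substitute your own storage account and file share names
FQDN="cocalccloud1.file.core.windows.net"
ACCOUNT="cocalccloud1"
SHARE="cocalc-onprem-1"

# [FQDN of server]:/[storage account name]/[file share name]
ROOT="/${ACCOUNT}/${SHARE}"
echo "server:   ${FQDN}"
echo "root:     ${ROOT}"
echo "data:     ${ROOT}/data"
echo "software: ${ROOT}/software"
```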

apiVersion: v1
kind: PersistentVolume
metadata:
  name: cocalccloud1-root
  labels:
    type: nfs
    aspect: root
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: "cocalccloud1.file.core.windows.net"        # your NFS server
    path: "/cocalccloud1/cocalccloud1"                  # file share without a subdirectory
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cocalccloud1-data
  labels:
    type: nfs
    aspect: data
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: "cocalccloud1.file.core.windows.net"        # your NFS server
    path: "/cocalccloud1/cocalccloud1/data"             # the file share name and subdirectory
  mountOptions:
    - noacl
    - noatime
    - nodiratime
    - acregmin=30    # bit of a tradeoff between performance and consistency
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cocalccloud1-software
  labels:
    type: nfs
    aspect: software
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: "cocalccloud1.file.core.windows.net"        # your NFS server
    path: "/cocalccloud1/cocalccloud1/software"         # the file share name and subdirectory
  mountOptions:
    - noacl
    - noatime
    - nodiratime
    - acregmin=600   # we only expect rare changes

Next up, we define the corresponding PVCs: download pvc-cocalc.yaml, edit it, and then add it via kubectl apply -f pvc-cocalc.yaml. The names match the default values – otherwise, specify them.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: root-data                     # only used by the setup Job
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      type: nfs
      aspect: root
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: projects-data                  # default name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      type: nfs
      aspect: data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: projects-software               # default name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      type: nfs
      aspect: software

Finally, we run the following Kubernetes Job to create these subdirectories and fix the permissions. Download storage-setup-job.yaml and run it via kubectl apply -f storage-setup-job.yaml.

Once it worked – check this via kubectl describe job storage-setup-job – you can delete the job via kubectl delete -f storage-setup-job.yaml.

You can also check its log via kubectl logs [job's pod name]: it should say something like setting up data and software done.

This also confirms that the NFS server is working and can be mounted from within the cluster.

apiVersion: batch/v1
kind: Job
metadata:
  name: storage-setup-job
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: storage-setup-job
          image: "ubuntu:22.04"
          command:
            - bash
            - "-c"
            # make sure the directories exist and set the permissions to 2001:2001 – the UID:GID of CoCalc users operating on these files
            - "cd /nfs && mkdir -p data software && chown -R 2001:2001 data software && chmod a+rwx data software && echo 'setting up data and software done'"
            #- "sleep infinity"
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
          volumeMounts:
            - mountPath: /nfs
              name: root-data

          ## uncomment this to run exactly as the user of CoCalc projects and associated services
          #securityContext:
          #  runAsGroup: 2001
          #  runAsUser: 2001

      dnsPolicy: ClusterFirst
      terminationGracePeriodSeconds: 30
      volumes:
        - name: root-data
          persistentVolumeClaim:
            claimName: root-data
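Instead of repeatedly running kubectl describe, you can also block until the Job finishes; a sketch:

```shell
# sketch: wait up to 2 minutes for the setup job to complete, then print its log
kubectl wait --for=condition=complete job/storage-setup-job --timeout=120s
kubectl logs job/storage-setup-job   # should end with: setting up data and software done
```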

Note

If you ever need to perform manual actions on the files, comment out the main command line and uncomment sleep infinity, and uncomment the securityContext section to run under a specific user ID.

Deploy the job; it will keep running. Get the pod name and exec into it via: kubectl exec -it [pod name] -- /bin/bash

The root mount point is in /nfs.

Afterwards, delete the job again.

Ingress

We also need a way to route internet traffic via a LoadBalancer to the corresponding service in our Kubernetes cluster.

This follows the documentation for an unmanaged NGINX Ingress controller and loads the configuration from the ingress-nginx/values.yaml file, included in this repository.

To deploy this, make sure you’re in your namespace, then install the PriorityClass as explained in the ingress/README.md and then deploy it. Regarding the Helm chart version number below: please double-check with what is listed in the Setup notes – it might be outdated.

kubectl apply -f ingress-nginx/priority-class.yaml
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
    --version 4.8.3 \
    --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz \
    -f ingress-nginx/values.yaml

Then, I checked that the two controller pods are running (kubectl get pods -n cocalc) and also confirmed there is a LoadBalancer with an external IP:

$ kubectl get svc ingress-nginx-controller

NAME                       TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)                      AGE
ingress-nginx-controller   LoadBalancer   10.0.193.93   XX.XXX.XXX.XX   80:30323/TCP,443:30228/TCP   5m35s

… and I also checked that the two ingress controller pods are running:

$ kubectl get deploy ingress-nginx-controller

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
ingress-nginx-controller   2/2     2            2           72s

Certificate Manager

Afterwards, install the certificate manager according to the notes in letsencrypt/README.md or do your own setup. More details in Azure/TLS/cert-manager.

The final step is to register that external IP address in your DNS server. That DNS name is then used in the deployment configuration, under global.dns.

(I also found notes about “tagging” that IP address and then adding a CNAME record at your DNS provider. That’s more robust in case the LoadBalancer is re-created and the IP address changes.)
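To fetch the external IP for your DNS record programmatically, a sketch:

```shell
# sketch: extract the LoadBalancer's external IP of the ingress controller
kubectl get svc ingress-nginx-controller -n cocalc \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```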

Next steps …

Ok. At this point we have a cluster, a database, and an NFS file share with two pre-configured subdirectories.

The next step is to configure and deploy CoCalc itself. Here is a starting point for your my-values.yaml configuration: download azurecalc.yaml and edit it.

global:
  dns: &DNS "your.domain.tld"   # EDIT THIS

  kubectl: "1.28" # enter it as a string, not a floating point number

  imagePullSecrets:
    - name: regcred

  database:
    host: "cocalc1.postgres.database.azure.com"
    user: "cocalc"
    database: "cocalc"

  setup_admin:
    email: "[email protected]"
    # password: "[secret]"  # pass in the real password via $ helm [...] --set global.setup_admin.password=[password], you can change it later
    name: "Your Name"

  setup_registration_token: "[secret token]" # pass in the real token via $ helm [...] --set global.setup_registration_token=[token]

  ingress:
    class: "nginx"
    cert_manager:
      issuer: "letsencrypt-prod"
    tls:
      - hosts:
          - *DNS
        secretName: cocalc-tls

  ssh_gateway:
    enabled: false # Note: on the very first helm deployment, it must be disabled. Then you can enable it.

  # All settings have to match with the keys in the site settings config, see
  # https://github.com/sagemathinc/cocalc/blob/master/src/packages/util/db-schema/site-defaults.ts
  settings:
    site_name: "AzureCalc"
    site_description: "I live in an Azure Datacenter!"
    organization_name: "[your organization]"
    organization_email: &EMAIL "[email protected]"
    organization_url: ""
    terms_of_service_url: ""
    help_email: *EMAIL
    splash_image: ""
    logo_square: "[URL to a png or jpeg]"
    logo_rectangular: "[URL to a png or jpeg]"
    # This activates sharing files (public or semi-public)
    share_server: "yes"
    index_info_html: |
      ## Welcome to Azure Calc

      This is a test instance of CoCalc running in an Azure Datacenter.

    imprint: |
      # Imprint
    policies: |
      # Policies
    pii_retention: "3 month"
    anonymous_signup: "no"
    email_enabled: "no"
    #verify_emails: "yes"
    #email_backend: "smtp"
    #email_smtp_server: "[EMAIL SERVER]"
    #email_smtp_from: "[email protected]"
    #email_smtp_login: "[EMAIL LOGIN NAME]"
    # set the SMTP password either via the var below or via the admin UI
    #email_smtp_password: "[secret]"
    email_smtp_secure: "yes" # usually yes, and with port 465
    email_smtp_port: "465"

    # CGroup quotas for a project, out of the box
    # e.g. '{"internet":true,"idle_timeout":3600,"mem":1000,"cpu":1,"cpu_oc":10,"mem_oc":5}'
    default_quotas: '{"internet":true,"idle_timeout":1800,"mem":2000,"cpu":1,"cpu_oc":20,"mem_oc":10}'

storage:
  create: false

manage:
  prepull:
    enabled: true

  timeout_pending_projects_min: 15

  resources:
    requests:
      cpu: 100m
      memory: 256Mi

  project:
    dedicatedProjectNodesTaint: "cocalc-projects"
    dedicatedProjectNodesLabel: "cocalc-role"

    # if projects are on a spot instance, AKS adds its own taint. We have to ignore it.
    # kubernetes.azure.com/scalesetpriority=spot:NoSchedule
    extraTaintTolerations:
    - key: "kubernetes.azure.com/scalesetpriority"
      value: "spot"
      effect: "NoSchedule"

static:
  replicaCount: 2

hub:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi

  multiple_replicas:
    websocket: 2
    proxy: 2
    next: 2
    api: 1
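With the values file in place, deploying looks roughly like the following. The chart reference [cocalc chart] is a placeholder – use whatever the CoCalc OnPrem setup notes specify – and the secrets are passed on the command line, as suggested in the comments above:

```shell
# sketch: install CoCalc with the values file above
# [cocalc chart] is a placeholder for the actual Helm chart reference from the setup notes
helm upgrade --install cocalc [cocalc chart] \
  --namespace cocalc \
  -f azurecalc.yaml \
  --set global.setup_admin.password="[password]" \
  --set global.setup_registration_token="[token]"
```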