Software Environment

CoCalc exists to let users run code and collaborate on it. This chapter covers the first part: running code.

Every user works within a complex stack that abstracts away the hardware and relies on programming languages, scripts, external packages, libraries, and data files.

Users have varied needs when it comes to running their code: some need very specific packages, while others prefer a robust, stable environment. You may receive requests to install proprietary software for specialized tasks, or to install or update packages that are not available by default.

The good news is that CoCalc provides the flexibility to tailor the software environment of a project to meet these diverse requirements. We’ll explore three key approaches for customization:

  1. Within a Project: users can install their software packages directly within their project environments.

  2. Global Software in /ext: install software globally, shared across projects.

  3. Custom Software Environments: build, host, and deploy customized software environments as Docker images.

By understanding these options, you can create a more accommodating and efficient workspace for your users.

Within a Project

Users are able to install their own software packages in their projects. A project is essentially a full Linux “user” environment, without elevated privileges. This means all the usual ways to install software as a user are available, e.g. for Python:

pip install --user --upgrade [mypackage]

for R:

install.packages("[mypackage]", lib="~/R")

or for GNU Autotools-based packages:

./configure --prefix=$HOME/.local
make
make install

or CMake:

mkdir build
cd build
cmake ..
cmake --build .
cmake --install . --prefix "$HOME/.local"
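With the prefix used above, executables end up in $HOME/.local/bin, which is also where pip install --user places scripts on Linux. The following is a minimal sketch for checking that this directory is on the PATH and extending it if not. The helper name path_contains is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $2 is an entry of the
# colon-separated list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

LOCAL_BIN="$HOME/.local/bin"

if path_contains "$PATH" "$LOCAL_BIN"; then
    echo "$LOCAL_BIN is already on the PATH"
else
    # Extend the PATH for this shell only; add the same line to ~/.bashrc
    # to make the change permanent.
    PATH="$LOCAL_BIN:$PATH"
    export PATH
    echo "added $LOCAL_BIN to the PATH"
fi
```

Wrapping the list in colons before matching avoids false positives on partial names, e.g. /home/user/.local/bin2.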


Global Software

See Projects Software for how to get read/write access to the global /ext mountpoint. This is quite powerful, because it lets you install software packages globally, making them available to all projects.

Note

Useful detail: if a file /ext/.bashrc exists, it is sourced by all projects via their local ~/.bashrc file. This makes it possible to extend the path, configure aliases, etc. right there. Users who want to opt out for a project only have to comment out or delete the corresponding lines at the bottom of their local ~/.bashrc file.
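As an illustration, a global /ext/.bashrc could look like the sketch below. The directory /ext/bin, the helper ext_path_prepend, and the alias are assumptions chosen for this example, not names that CoCalc defines.

```shell
# Sketch of a possible /ext/.bashrc. The directory and the alias are
# assumptions for illustration; adjust them to what you actually install.

# Prepend a directory to PATH unless it is already listed.
ext_path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

ext_path_prepend /ext/bin           # hypothetical location of shared executables
export PATH

# Hypothetical convenience alias, available in every project terminal.
alias ll='ls -alF'
```

Because the file is sourced by every new shell, the helper is written to be idempotent: sourcing it repeatedly does not grow the PATH.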

Custom Jupyter Kernels

It is possible to deploy customized Jupyter kernels globally. Each subdirectory of /ext/jupyter/kernels/ can hold one of your own kernels. Here, /ext is the globally shared read-only filesystem that is mounted in all projects (see Projects Software).

This works because $JUPYTER_PATH is configured by default and points to that jupyter directory. A kernel in that directory overrides a globally installed kernel with the same directory name (e.g. python3), because that path takes precedence.

To check if a kernel is available:

  1. Open a terminal in a project and run jupyter kernelspec list.

  2. Try to start it via jupyter console --kernel=[kernelname].
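The first check can also be scripted. The sketch below assumes the usual plain-text output of jupyter kernelspec list, where each kernel line starts with the kernel name. The helper name kernel_listed is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `jupyter kernelspec list` on stdin
# and succeed if a kernel with the given name appears in the first column.
kernel_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage inside a project terminal (requires the jupyter command):
#   jupyter kernelspec list | kernel_listed python3 && echo "python3 is available"
```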

Note

For a Python kernel, we suggest adding these parameters to the argv array in the kernel.json file:

  • "--HistoryManager.enabled=False": there is no need to record the history in a local database. In particular, if you’re on an NFS filesystem, the underlying SQLite database could cause problems in the form of “database is locked” errors, preventing the kernel from starting.

  • "--matplotlib=inline": to automatically enable matplotlib with inline plots
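A kernel directory with both parameters could be created as sketched below. This assumes a python3 interpreter with ipykernel installed. The kernel name my-python3 and its display name are made up. KERNELS_DIR defaults to a scratch directory so the sketch is safe to try; on a real deployment it would be /ext/jupyter/kernels.

```shell
#!/bin/sh
# Sketch: write a kernel.json that includes the two suggested arguments.
# KERNELS_DIR and KERNEL_NAME are illustrative; set KERNELS_DIR to
# /ext/jupyter/kernels when you have write access to /ext.
KERNELS_DIR="${KERNELS_DIR:-$(mktemp -d)}"
KERNEL_NAME="my-python3"

mkdir -p "$KERNELS_DIR/$KERNEL_NAME"

# The quoted 'EOF' keeps the shell from expanding anything inside the JSON.
cat > "$KERNELS_DIR/$KERNEL_NAME/kernel.json" <<'EOF'
{
  "argv": [
    "python3", "-m", "ipykernel_launcher",
    "-f", "{connection_file}",
    "--HistoryManager.enabled=False",
    "--matplotlib=inline"
  ],
  "display_name": "My Python 3",
  "language": "python"
}
EOF

echo "wrote $KERNELS_DIR/$KERNEL_NAME/kernel.json"
```

The directory name (here my-python3) is the kernel name that jupyter kernelspec list reports and that --kernel= expects.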


Custom Software Environment

The entire project image can be provided by you, hosted on a Docker registry of yours. This is the most flexible way to customize the software environment.

Benefits:

  • Complete control over the environment

  • Use existing environment definitions (Dockerfiles and build scripts) you already possess

  • Offer multiple environments for users to select from, potentially categorized by specific tasks

  • Incorporate proprietary packages, code, or configurations

  • Update the software environment at your own pace, allowing users to choose:
    • “stable” releases, named by release date. This lets your users stay on a specific environment and avoid disruptions caused by software updates.

    • “testing” releases, which will evolve into the new default after some iterations and updates. This allows users to test the new environment and provide feedback before it becomes “stable”.
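One way to name such releases is sketched below. The scheme stable-YYYY-MM-DD, the moving testing tag, and the registry and image names are assumptions for illustration only; CoCalc does not prescribe a naming scheme here.

```shell
#!/bin/sh
# Hypothetical naming scheme: "stable-YYYY-MM-DD" for dated releases and a
# moving "testing" tag. Registry and image name are placeholders.
IMAGE="registry.example.com/cocalc/project"

# Build the tag of a stable release from a date (defaults to today).
stable_tag() {
    echo "stable-${1:-$(date +%Y-%m-%d)}"
}

echo "stable image:  $IMAGE:$(stable_tag)"
echo "testing image: $IMAGE:testing"

# With Docker available, a freshly built image could then be published as:
#   docker tag my-build "$IMAGE:$(stable_tag)"
#   docker push "$IMAGE:$(stable_tag)"
```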

Requirements:

It might surprise you, but as a first test you might be able to just run any image you already have. This works because, during the pod setup, the code for CoCalc’s project server is bind-mounted dynamically from a sidecar container.

  • The image should define a user named user with the UID/GID 2001 – otherwise you’ll encounter some warnings or errors. Snippet for your Dockerfile:

    # user "user" must be 2001:2001. Do not change the UID, assumed in several places!
    RUN umask 022 \
      && mkdir /home/user \
      && chown 2001:2001 -R /home/user \
      && /usr/sbin/groupadd --gid=2001 --non-unique user \
      && /usr/sbin/useradd --home-dir=/home/user --gid=2001 --uid=2001 --shell=/bin/bash user
    
  • The file, mount, and ps utilities should be installed. They are used to check file types, mountpoints, and running processes.

  • To further customize what happens during startup, create a file /etc/cocalc_init.sh. It will be sourced during project initialization. Please do not redefine the PATH variable, though (only extend it): one other requirement is to keep /cocalc/bin in the $PATH, because CoCalc relies on some scripts located there.
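A minimal sketch of such an /etc/cocalc_init.sh is shown below. It only extends the PATH, so existing entries such as /cocalc/bin stay in place. The directory /opt/mytools/bin and the helper name cocalc_path_append are made up for this example.

```shell
# Sketch of a possible /etc/cocalc_init.sh. The file is sourced, so it only
# sets variables. /opt/mytools/bin is a made-up example directory.

# Append a directory to PATH unless it is already listed. Appending, rather
# than redefining PATH, keeps existing entries such as /cocalc/bin intact.
cocalc_path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

cocalc_path_append /opt/mytools/bin
export PATH
```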

To get started, take a look at the ./project directory for more details. Pick the ./essential directory if you’re unsure; it is the most complete setup example. Its setup.sh script runs during the build process and shows how to install useful tools like R, VSCode, a customized default virtual Python environment (using cocalc_init.sh), and X11.

You can also import existing images and make them useful for CoCalc. Examples:

  • TexLive LaTeX distribution: see directory /project/import/texlive/

  • Jupyter’s “DataScience Notebook” image: project/import/jupyter-datascience/, which shows how to use Jupyter’s start.sh to change the user and permissions.

Note

Once your built project image is on your own registry, configure the Software Environment of your CoCalc deployment to make it available to your users.

Note

If something goes wrong and, e.g., creating new files does not work, use the “mini terminal” in the explorer to create a terminal file: touch term.term. Then open that term.term file and inspect the environment to understand what’s going on. For example, creating new files requires that /cocalc/bin is in the $PATH and that cc-new-file works.
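The checks mentioned in this chapter can be collected in a small script to run in that terminal. This is a sketch: only /cocalc/bin, cc-new-file, file, mount, ps, and the UID/GID 2001 come from the text above, and the helper names are made up.

```shell
#!/bin/sh
# Sketch of environment checks for a custom project image.

# Succeed if directory $2 is an entry of the colon-separated list $1.
path_has() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print, one per line, the given commands that cannot be found.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

if path_has "$PATH" /cocalc/bin; then
    echo "/cocalc/bin is in the PATH"
else
    echo "/cocalc/bin is missing from the PATH"
fi

missing="$(missing_commands cc-new-file file mount ps)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose to print the names on one line.
    echo "commands not found:" $missing
else
    echo "all required commands were found"
fi

# The project is expected to run as "user" with UID and GID 2001.
echo "running as uid=$(id -u) gid=$(id -g)"
```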