.. index:: Software Environment
.. _ops-software-env:

Software Environment
=====================

CoCalc's goal is to let users run code effectively and collaborate easily.
This chapter covers the first part: *executing code*.

Every user works within a complex ecosystem that abstracts away the hardware and relies on programming languages, scripts, external packages, libraries, and data files.

Users have varied needs when it comes to executing their code; some seek highly specific packages, while others prefer a robust, stable environment.
You may receive requests to install proprietary software for specialized tasks, or to install or update packages that are not available by default.

The good news is that CoCalc provides the flexibility to tailor the software environment of a project to meet these diverse requirements.
We'll explore three key approaches for customization:

1. :ref:`Within a Project <software-in-project>`: users can install their software packages directly within their project environments.
2. :ref:`Global Software in /ext <global-software>`: install software globally, shared across all projects.
3. :ref:`Custom Software Environments <custom-software-env>`: build, host, and deploy customized software environments as Docker images.

By understanding these options, you can create a more accommodating and efficient workspace for your users.

.. _software-in-project:

Within a Project
-----------------

Users are able to install their own software packages in their projects.
A project is essentially a full Linux "user" environment, without elevated privileges.
This means all the usual ways to install software as a user are available, e.g. for :term:`Python`::

    pip install --user --upgrade [mypackage]

for :term:`R Software <R>`::

    install.packages("[mypackage]", lib="~/R")

or for :term:`GNU Autotools` based packages::

    ./configure --prefix=$HOME/.local
    make
    make install

or :term:`CMake`::

    mkdir build
    cd build
    cmake ..
    cmake --build .
    cmake --install . --prefix "$HOME/.local"
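
Software installed under ``$HOME/.local`` or ``~/R`` is only picked up if the shell and R know about these locations. The following is a minimal sketch, assuming a Bash session in which ``~/.local/bin`` is not yet part of the ``PATH``, and relying on R's standard ``R_LIBS_USER`` variable::

    # Create the target directories used in the examples above.
    # install.packages(..., lib="~/R") expects the directory to exist.
    mkdir -p "$HOME/.local/bin" "$HOME/R"

    # Make executables installed with --user or a $HOME/.local prefix visible.
    export PATH="$HOME/.local/bin:$PATH"

    # Tell R to also search ~/R when loading packages.
    export R_LIBS_USER="$HOME/R"

To make these settings permanent, the two ``export`` lines can be added to the project's ``~/.bashrc``.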

Read more:

* CoCalc Doc: `Installing Python Packages <https://doc.cocalc.com/howto/install-python-lib.html>`_

.. _global-software:

Global Software
----------------

See :ref:`projects-software` for how to get read/write access to the global ``/ext`` mountpoint.
This is powerful because it lets you install software packages globally, making them available to all projects.

.. note::

  Useful detail: if a file ``/ext/.bashrc`` exists, it is sourced by
  all projects via their local ``~/.bashrc`` file. This makes it
  possible to extend the path, configure aliases, etc. right there.
  Users who want to opt out for a project only have to
  comment out or delete the corresponding part at the bottom of their
  local ``~/.bashrc`` file.
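
As an illustration, a global ``/ext/.bashrc`` could look like the following sketch. The directory ``/ext/bin`` and the alias are made-up examples, not something CoCalc sets up for you::

    # /ext/.bashrc: sourced by every project through its ~/.bashrc

    # Hypothetical directory holding globally installed executables.
    export PATH="/ext/bin:$PATH"

    # Hypothetical convenience alias for all users.
    alias ll='ls -alF'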

.. _custom-jupyter-kernels:

Custom Jupyter Kernels
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to globally deploy customized Jupyter kernels.
Each sub-directory of ``/ext/jupyter/kernels/`` can hold one of your own kernels.
Here, ``/ext`` is the mountpoint of the globally shared filesystem, which is mounted
read-only in all projects (see :ref:`projects-software`).

This works because ``$JUPYTER_PATH`` is configured by default and points to that ``jupyter`` directory.
A kernel placed there overrides a globally installed kernel with the same directory name (e.g. ``python3``),
because that path takes precedence.

To check if a kernel is available:

#. Open a terminal in a project and run ``jupyter kernelspec list``.
#. Try to start it via ``jupyter console --kernel=[kernelname]``.

.. note::

    For a Python kernel, we suggest adding these parameters to the ``argv`` array in the ``kernel.json`` file:

    * ``"--HistoryManager.enabled=False"``: there is no need to record the history in a local database.
      In particular, if you're on an :term:`NFS` file system, the underlying SQLite database could cause
      problems in the form of "database is locked" errors, preventing the kernel from starting.
    * ``"--matplotlib=inline"``: automatically loads matplotlib with its inline backend.
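
Putting this together, a global Python kernel could be created roughly as follows. This is only a sketch: the kernel name ``my-python`` and the display name are hypothetical, it assumes write access to ``/ext``, and it assumes that the ``python3`` found on the ``PATH`` has ``ipykernel`` installed::

    # Hypothetical kernel name; the directory name is what Jupyter lists.
    KERNEL_DIR="/ext/jupyter/kernels/my-python"
    mkdir -p "$KERNEL_DIR"

    # Write the kernel specification, including the two suggested parameters.
    cat > "$KERNEL_DIR/kernel.json" <<'EOF'
    {
      "argv": [
        "python3", "-m", "ipykernel_launcher",
        "-f", "{connection_file}",
        "--HistoryManager.enabled=False",
        "--matplotlib=inline"
      ],
      "display_name": "My Python (global)",
      "language": "python"
    }
    EOF

Afterwards, ``jupyter kernelspec list`` in a project should show the new kernel.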


Ref.:

* `JUPYTER_PATH environment variable <https://docs.jupyter.org/en/latest/use/jupyter-directories.html#envvar-JUPYTER_PATH>`_
* CoCalc's documentation about `custom jupyter kernels <https://doc.cocalc.com/howto/custom-jupyter-kernel.html>`_


.. _project-image:
.. _custom-software-env:

Custom Software Environment
----------------------------

The entire project image can be provided by you, hosted on a Docker registry of yours.
This is the most flexible way to customize the software environment.

Benefits:

* Complete control over the environment
* Use existing environment definitions (Dockerfiles and build scripts) you already possess
* Offer multiple environments for users to select from, potentially categorized by specific tasks
* Incorporate proprietary packages, code, or configurations
* Update the software environment at your own pace, allowing users to choose:

  * "stable" releases, named with release date. This enables your users to adhere to a specific environment and avoid disruptions due to software updates.
  * "testing" releases, which will evolve into the new default after some iterations and updates. This allows users to test the new environment and provide feedback before it becomes "stable".

Requirements:

Perhaps surprisingly, as a first test you may be able to run an image you already have as-is.
This works because, during the pod setup, the code for CoCalc's project server
is bind-mounted dynamically from a sidecar container.

* The image should define a user named ``user`` with the :term:`UID/GID` ``2001`` – otherwise you'll encounter some warnings or errors. Snippet for your Dockerfile::

    # user "user" must be 2001:2001. Do not change the UID, assumed in several places!
    RUN umask 022 \
      && mkdir /home/user \
      && chown 2001:2001 -R /home/user \
      && /usr/sbin/groupadd --gid=2001 --non-unique user \
      && /usr/sbin/useradd --home-dir=/home/user --gid=2001 --uid=2001 --shell=/bin/bash user

* The ``file``, ``mount``, and ``ps`` utilities should be installed; they are used to check file types, mount points, and running processes.
* To further customize what happens during startup, create a file ``/etc/cocalc_init.sh``. It will be :term:`sourced <Sourcing a Bash Script>` during project initialization. Do not redefine the ``PATH`` variable there, only extend it: ``/cocalc/bin`` must stay in the ``$PATH``, because it contains scripts used by CoCalc.
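
A minimal sketch of such a ``/etc/cocalc_init.sh`` is shown below. The helper ``path_prepend`` and the directory ``/opt/mytools/bin`` are hypothetical; the point is that ``PATH`` is only extended, so ``/cocalc/bin`` stays in it::

    # /etc/cocalc_init.sh: sourced during project initialization.

    # Prepend a directory to PATH unless it is already listed.
    path_prepend() {
      case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$1:$PATH" ;;      # extend, never replace
      esac
    }

    path_prepend "/opt/mytools/bin"   # hypothetical tool directory
    export PATH

Because the file is sourced rather than executed, the exported ``PATH`` becomes part of the project's environment.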

To get started, take a look at the ``./project`` directory for more details.
If you're unsure, pick the ``./essential`` directory: it is the most complete setup example.
Its ``setup.sh`` script runs during the build process
and outlines how to install interesting tools like R, VSCode,
a customized default virtual Python environment (using ``cocalc_init.sh``) and X11.

You can also import existing images and make them useful for CoCalc. Examples:

* TeX Live LaTeX distribution: see directory ``/project/import/texlive/``
* Jupyter's "DataScience Notebook" image: ``project/import/jupyter-datascience/``, which shows how to use Jupyter's ``start.sh`` to change the user and permissions.

.. note::

    Once your built project image is on your own registry,
    configure the :ref:`Software Environment <conf-software-env>` of your CoCalc deployment
    to make it available to your users.
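
Before making an image available to users, it can help to check locally that it meets the requirements listed above. The following is only a sketch: it assumes the ``docker`` CLI is available and that the image ships a POSIX shell at ``/bin/sh``. The image name is a placeholder and ``check_project_image`` is a hypothetical helper::

    # Placeholder image name; replace it with your own.
    IMAGE="registry.example.com/cocalc/my-project:testing"

    check_project_image() {
      # Run a throwaway container and inspect the user and the required tools.
      docker run --rm --entrypoint /bin/sh "$1" -c '
        id user
        for tool in file mount ps; do
          command -v "$tool" >/dev/null || echo "missing: $tool"
        done
      '
    }

    # Usage: check_project_image "$IMAGE"

The output of ``id user`` should report UID and GID ``2001``, and no ``missing:`` lines should appear.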

.. note::

    If something goes wrong and e.g. creating new files does not work,
    use the "mini terminal" in the explorer to create a terminal file: ``touch term.term``.
    Then open that ``term.term`` file and investigate the environment to understand what's going on.
    For example, creating new files requires ``/cocalc/bin`` to be in the ``$PATH`` and ``cc-new-file`` to work.
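
In such a terminal, a few commands along the following lines might help. This is a sketch that only checks the two conditions named above; ``cocalc_env_report`` is a hypothetical helper name::

    cocalc_env_report() {
      # Report whether /cocalc/bin is part of the PATH.
      case ":$PATH:" in
        *":/cocalc/bin:"*) echo "PATH: /cocalc/bin is present" ;;
        *)                 echo "PATH: /cocalc/bin is missing" ;;
      esac

      # Report whether the cc-new-file script can be found.
      if command -v cc-new-file >/dev/null 2>&1; then
        echo "cc-new-file: found at $(command -v cc-new-file)"
      else
        echo "cc-new-file: not found"
      fi

      # Show the current user; the image is expected to use UID/GID 2001.
      id
    }

    cocalc_env_report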