.. index:: Software Environment
.. _ops-software-env:

Software Environment
=====================

CoCalc's purpose is to let users run code effectively and collaborate with ease.
This chapter covers the first part: *executing code*.

Every user works within a layered ecosystem that abstracts away the hardware and relies on programming languages, scripts, external packages, libraries, and data files.
Users' needs vary: some want highly specific packages, while others prefer a robust, stable environment.
You may receive requests to install proprietary software for specialized tasks, or to update packages that are not available by default.

CoCalc lets you tailor the software environment of a project to meet these requirements.
There are three approaches to customization:

1. :ref:`Within a Project <software-in-project>`: users install software packages directly in their own projects.
2. :ref:`Global Software in /ext <global-software>`: install software globally, shared across projects.
3. :ref:`Custom Software Environments <custom-software-env>`: build, host, and deploy customized software environments as Docker images.

Knowing these options helps you provide a workspace that fits what your users actually need.

.. _software-in-project:

Within a Project
-----------------

Users can install their own software packages in their projects.
A project is essentially a full Linux "user" environment, without elevated privileges.
This means all the usual ways to install software as a user are available, e.g. for :term:`Python`::

    pip install --user --upgrade [mypackage]

for :term:`R Software `::

    install.packages("[mypackage]", lib="~/R")

or for :term:`GNU Autotools` based packages::

    ./configure --prefix=$HOME/.local
    make
    make install

or :term:`CMake`::

    mkdir build
    cd build
    cmake ..
    cmake --build .
    cmake --install . --prefix=$HOME/.local

Read more:

* CoCalc Doc: `Installing Python Packages `_

.. _global-software:

Global Software
----------------

See :ref:`projects-software` about how to get read/write access to the global ``/ext`` mountpoint.
This is quite powerful, because it allows you to install software packages globally, making them available to all projects.

.. note::

    Useful detail: if a file ``/ext/.bashrc`` exists, it is sourced by all projects via their local ``~/.bashrc`` file.
    This means it is possible to extend the path, configure aliases, etc. right there.
    If some users want to opt out for a project, they just have to comment out or delete this from the bottom of their local ``~/.bashrc`` file.

.. _custom-jupyter-kernels:

Custom Jupyter Kernels
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to globally deploy customized Jupyter kernels.
Each sub-directory of ``/ext/jupyter/kernels/`` can hold one of your own kernels,
where ``/ext`` is the mountpoint of the globally shared read-only filesystem in all projects (see :ref:`projects-software`).
This works because ``$JUPYTER_PATH`` is configured by default and points to that jupyter directory.
Because that path takes precedence, a kernel placed there overrides a globally installed kernel with the same directory name, e.g. ``python3``.

To check if a kernel is available:

#. Open a terminal in a project and run ``jupyter kernelspec list``.
#. Try to start it via ``jupyter console --kernel=[kernelname]``.

.. note::

    For a Python kernel, we suggest adding these parameters to the ``argv`` array in the ``kernel.json`` file:

    * ``"--HistoryManager.enabled=False"``: there is no need to record the history in a local database.
      In particular, if you're on an :term:`NFS` file-system, the underlying SQLite database could cause problems in the form of "database is locked" errors, preventing the kernel from starting.
    * ``"--matplotlib=inline"``: to automatically load matplotlib

Ref.:

* `JUPYTER_PATH environment variable `_
* CoCalc's documentation about `custom jupyter kernels `_

.. _project-image:
.. _custom-software-env:

Custom Software Environment
----------------------------

The entire project image can be provided by you, hosted on a Docker registry of yours.
This is the most flexible way to customize the software environment and provides complete control over the user computing environment.

Benefits:

* **Complete control** over the environment including operating system, packages, and configurations
* **Use existing environment definitions** (Dockerfiles and build scripts) you already possess
* **Offer multiple environments** for users to select from, potentially categorized by specific tasks
* **Incorporate proprietary packages**, code, or configurations
* **Version control** your software environment at your own pace, allowing users to choose:

  * "stable" releases, named with release date.
    This enables your users to adhere to a specific environment and avoid disruptions due to software updates.
  * "testing" releases, which will evolve into the new default after some iterations and updates.
    This allows users to test the new environment and provide feedback before it becomes "stable".

* **Organizational compliance** by including required security tools, corporate certificates, and compliance software
* **Performance optimization** by pre-installing and configuring software for your specific use cases

Default "Full" Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^

CoCalc OnPrem includes a comprehensive "full" software environment (``software-YYYYMMDD-HHMM`` images) that serves as both a reference implementation and a foundation for customization.
This environment includes:

**Programming Languages & Frameworks:**

- Python 3 with 200+ scientific packages (NumPy, SciPy, Pandas, Matplotlib, PyTorch, TensorFlow, JAX)
- R with 80+ packages and RStudio Server
- Julia with essential packages (IJulia, Pluto, Plots)
- SageMath for mathematical computing (optional)
- Java (OpenJDK), Go, C/C++ (GCC, Clang)
- Octave for MATLAB compatibility

**Development Tools:**

- VS Code Server (optional)
- Jupyter Lab and Jupyter Notebook
- Git and version control tools
- LaTeX with full TeXLive distribution
- Build tools (Make, CMake, Autotools)

**Scientific & Data Analysis:**

- QGIS for geospatial analysis
- Pandoc and Quarto for document processing
- Statistical and visualization libraries
- Machine learning frameworks

**Desktop Applications (optional):**

- X11 server with Xpra for remote desktop
- GIMP, Inkscape, LibreOffice
- Scientific applications (Spyder, Scilab)
- Development tools (Texmaker, TeXstudio)

Building Custom Images
^^^^^^^^^^^^^^^^^^^^^^

.. note::

    These files are only accessible if you have access to the private repository.

The recommended approach is to start with the default "full" environment and customize it for your needs.
The build system uses a modular approach with specialized installation scripts.
**Project Structure:**

The ``./project/full/`` directory contains the reference implementation with these key components:

* ``Dockerfile`` - Build configuration with customizable arguments
* ``common.sh`` - Base system packages and prerequisites
* ``python.sh`` - Python ecosystem with virtual environment setup
* ``full.sh`` - Core development tools and libraries
* ``r.sh`` - R statistical computing environment
* ``julia.sh`` - Julia scientific computing setup
* ``sage.sh`` - SageMath installation (optional)
* ``vscode.sh`` - VS Code Server (optional)
* ``x11.sh`` - X11 desktop environment (optional)
* ``user-2001.sh`` - Project user setup
* ``py3.txt`` - Python package list for customization

**Build Arguments:**

Control optional components during build:

.. code-block:: dockerfile

    ARG INSTALL_VSCODE=true    # Enable VS Code Server
    ARG INSTALL_SAGE=10.5      # SageMath version or 'none'
    ARG INSTALL_X11=true       # Enable X11 desktop support

**Customization Examples:**

1. **Add Corporate Packages:**

   .. code-block:: bash

       # In common.sh, add corporate packages
       aptitude install -q -y corporate-security-agent corporate-monitoring

       # In python.sh, add to py3.txt
       echo "corporate-python-lib==1.0.0" >> py3.txt

2. **Configure Corporate Infrastructure:**

   .. code-block:: bash

       # In common.sh, add corporate CA certificates
       curl -o /usr/local/share/ca-certificates/corporate-ca.crt https://ca.corp.com/cert
       update-ca-certificates

       # Configure corporate proxy
       echo 'export https_proxy=https://proxy.corp.com:8080' >> /etc/environment

3. **Optimize for Specific Use Cases:**

   .. code-block:: bash

       # Build without desktop applications for server use
       docker build --build-arg INSTALL_X11=false -t custom-server .

       # Build with specific SageMath version
       docker build --build-arg INSTALL_SAGE=10.4 -t custom-math .

4. **Add Custom Software:**

   .. code-block:: bash

       # In full.sh, add domain-specific tools
       wget https://example.com/custom-tool.deb -O /tmp/tool.deb
       dpkg -i /tmp/tool.deb

       # Configure environment variables
       echo 'export CUSTOM_TOOL_HOME=/opt/custom-tool' >> /etc/cocalc_init.sh

Requirements for Custom Images
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Essential Requirements:**

* **User Configuration:** The image must define a user named ``user`` with :term:`UID/GID` ``2001``:

  .. code-block:: dockerfile

      # user "user" must be 2001:2001. Do not change the UID, assumed in several places!
      RUN umask 022 \
          && mkdir /home/user \
          && chown 2001:2001 -R /home/user \
          && /usr/sbin/groupadd --gid=2001 --non-unique user \
          && /usr/sbin/useradd --home-dir=/home/user --gid=2001 --uid=2001 --shell=/bin/bash user

* **System Utilities:** Install essential utilities for CoCalc operation:

  .. code-block:: dockerfile

      RUN apt-get update && apt-get install -y \
          file mount psutils curl wget git vim \
          python3 python3-pip build-essential

* **PATH Configuration:** Keep ``/cocalc/bin`` in the ``$PATH`` for CoCalc functionality:

  .. code-block:: bash

      # In /etc/cocalc_init.sh
      export PATH="/cocalc/bin:$PATH"

* **Initialization Script:** Create ``/etc/cocalc_init.sh`` for project startup customization:

  .. code-block:: bash

      # Example /etc/cocalc_init.sh
      export CUSTOM_VAR="value"
      source /opt/venvs/cocalc/bin/activate  # Activate Python environment
      export PS1='\w\$ '  # Set prompt

**Python Environment Setup:**

For Python-based environments, follow this pattern:

.. code-block:: bash

    # Create virtual environment
    mkdir -p /opt/venvs
    python3 -m venv /opt/venvs/cocalc

    # Install packages
    /opt/venvs/cocalc/bin/pip install jupyter ipykernel [other packages]

    # Configure kernels
    /opt/venvs/cocalc/bin/python -m ipykernel install --name python3 --display-name "Python 3"

    # Activate by default
    echo "source /opt/venvs/cocalc/bin/activate" >> /etc/cocalc_init.sh

**Jupyter Kernel Integration:**

Ensure Jupyter kernels are properly configured:

.. code-block:: bash

    # Install kernels in system location
    mkdir -p /usr/local/share/jupyter/kernels

    # For Python kernel (without --user, the install is system-wide)
    /opt/venvs/cocalc/bin/jupyter kernelspec install python3_kernel_spec

    # For R kernel (example)
    echo 'IRkernel::installspec(user = FALSE)' | R --no-save

Build Process
^^^^^^^^^^^^^

**Standard Build:**

.. code-block:: bash

    # Build with all components
    docker build -t custom-cocalc:latest ./project/full/

**Customized Build:**

.. code-block:: bash

    # Build optimized for specific use case
    docker build \
        --build-arg INSTALL_VSCODE=true \
        --build-arg INSTALL_SAGE=none \
        --build-arg INSTALL_X11=false \
        -t custom-cocalc:optimized ./project/full/

Integration with CoCalc OnPrem
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Registry Configuration:**

1. **Build and Push Image:**

   .. code-block:: bash

       docker build -t your-registry.com/cocalc/custom-env:20250115-1200 .
       docker push your-registry.com/cocalc/custom-env:20250115-1200

2. **Configure values.yaml:**

   .. code-block:: yaml

       global:
         software:
           environments:
             custom-env:
               title: "Custom Environment"
               descr: "Organization-specific environment with custom tools"
               tag: "custom-env-20250115-1200"
               group: "Custom"
               registry: "your-registry.com/cocalc"

3. **Deploy Changes:**

   .. code-block:: bash

       helm upgrade cocalc ./cocalc -f your-values.yaml

**Multiple Environment Strategy:**

.. code-block:: yaml

    global:
      software:
        environments:
          data-science:
            title: "Data Science"
            descr: "Optimized for data analysis and machine learning"
            tag: "data-science-20250115-1200"
            group: "Specialized"
          mathematics:
            title: "Mathematics"
            descr: "Mathematical computing with SageMath"
            tag: "mathematics-20250115-1200"
            group: "Specialized"
          development:
            title: "Development"
            descr: "Software development environment"
            tag: "development-20250115-1200"
            group: "Development"

Legacy Directory References
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For backward compatibility and additional examples:

* **Essential Environment:** ``./project/essential/`` - Minimal but complete setup
* **Import Examples:** ``./project/import/`` - Adapting existing images:

  * ``texlive/`` - LaTeX-focused environment
  * ``jupyter-datascience/`` - Jupyter ecosystem integration
  * ``anaconda-gpu/`` - GPU-accelerated computing
  * ``cuda-*/`` - CUDA development environments

Testing Custom Images
^^^^^^^^^^^^^^^^^^^^^^

**Local Testing:**

.. code-block:: bash

    # Test basic functionality
    docker run -it --rm custom-image:latest python3 -c "import numpy; print('OK')"
    docker run -it --rm custom-image:latest R --version
    docker run -it --rm custom-image:latest julia --version

**Integration Testing:**

.. code-block:: bash

    # Test with CoCalc project server
    docker run -it --rm \
        -v /path/to/project/data:/home/user/data \
        custom-image:latest \
        bash -c "source /etc/cocalc_init.sh && python3 -c 'import sys; print(sys.path)'"

**Kernel Testing:**

.. code-block:: bash

    # Verify Jupyter kernels
    docker run -it --rm custom-image:latest jupyter kernelspec list
    docker run -it --rm custom-image:latest jupyter console --kernel=python3

Troubleshooting
^^^^^^^^^^^^^^^

**Common Issues:**

* **Permission Problems:** Ensure UID/GID 2001 is used consistently
* **Path Issues:** Verify ``/cocalc/bin`` remains in PATH
* **Kernel Problems:** Check Jupyter kernel specifications and permissions
* **Package Conflicts:** Review package installation order and dependencies

**Debugging Commands:**

.. code-block:: bash

    # Check user configuration
    docker run -it --rm custom-image:latest id user

    # Verify environment setup
    docker run -it --rm custom-image:latest bash -c "source /etc/cocalc_init.sh && env"

    # Test file operations
    docker run -it --rm custom-image:latest bash -c "touch /tmp/test && ls -la /tmp/test"

**Debug Mode Build:**

.. code-block:: bash

    # Build with detailed output
    docker build --no-cache --progress=plain -t debug-image .

.. note::

    Once your built project image is on your own registry,
    configure the :ref:`Software Environment ` of your CoCalc deployment to make it available to your users.

.. note::

    If something goes wrong and, for example, creating new files does not work,
    use the "mini terminal" in the explorer to create a terminal file: ``touch term.term``.
    Then open that ``term.term`` file and inspect the environment to understand what is going on.
    For example, creating new files requires that ``/cocalc/bin`` is in the ``$PATH`` and that ``cc-new-file`` works.
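
The checks mentioned in the note above can be sketched as a small script to run in the ``term.term`` terminal of an affected project.
This is only a minimal sketch: the helper name ``path_contains`` is hypothetical and not part of CoCalc, and the script assumes nothing beyond a POSIX shell, the UID ``2001`` requirement, ``/cocalc/bin``, and ``cc-new-file`` as described in this chapter.

.. code-block:: bash

    # Hypothetical helper: succeeds if directory $1 is an entry
    # of the colon-separated list $2 (e.g. "$PATH").
    path_contains() {
        case ":$2:" in
            *":$1:"*) return 0 ;;
            *) return 1 ;;
        esac
    }

    # The project user is expected to have UID 2001
    if [ "$(id -u)" = "2001" ]; then
        echo "UID is 2001"
    else
        echo "unexpected UID: $(id -u)"
    fi

    # /cocalc/bin must remain in the PATH
    if path_contains /cocalc/bin "$PATH"; then
        echo "/cocalc/bin is in PATH"
    else
        echo "/cocalc/bin is missing from PATH"
    fi

    # cc-new-file must be resolvable by the shell
    if command -v cc-new-file >/dev/null 2>&1; then
        echo "cc-new-file found"
    else
        echo "cc-new-file not found"
    fi

If any of the three checks reports a problem, compare the image against the requirements listed under "Requirements for Custom Images" before rebuilding.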