This lesson is in the early stages of development (Alpha version)

Apptainer: Getting started

Overview

Teaching: 30 min
Exercises: 20 min
Questions
  • What is Apptainer and why might I want to use it?

  • What issues motivated the creation of Apptainer?

  • What are the differences between Docker, Apptainer, and Singularity?

Objectives
  • Understand what Apptainer is and when you might want to use it.

  • Learn the design goals behind Apptainer.

What is it?

Apptainer is a container platform that allows software engineers and researchers to easily share their work with others by packaging and deploying their software applications in a portable and reproducible manner.

When you download an Apptainer container image, you essentially receive a virtual computer disk that contains all of the necessary software, libraries and configuration to run one or more applications or undertake a particular task, e.g. to support a specific research project.

This saves you the time and effort of installing and configuring software on your own system or setting up a new computer from scratch, as you can simply run an Apptainer container from the image and have a virtual environment that is identical to the one used by the person who created the image.

To visualize what containers offer, consider the layers that a software package ‘A’ needs in order to run. First, you need hardware: a computer with a CPU, RAM, disk space, and similar resources. Next, you install an operating system (OS) on this hardware, such as CentOS. With the operating system in place, you install the binaries and libraries that software ‘A’ depends on. Finally, you install and use software ‘A’ itself.

This workflow is OK for personal workstations, or other single-user machines. It has worked for decades, after all. But you can run into a few different issues that make it less than ideal for some workflows:

Workstation vs Shared Resource

There are a couple of common ways to deal with these issues - virtual machines (VMs) and containers. Conceptually these two solutions are similar - isolate the environment you want to manage away from the host system, so you can run multiple independent environments. They differ a bit in implementation, however. While a VM-based solution requires each environment to run its own independent operating system, a container-based solution provides a translation layer between the environment and the host operating system. The specifics of this go a bit beyond the intended scope of today’s material, but this has two important high-level consequences that make containers the more appealing solution for MSI:

Container vs VM

Images and containers

A brief note on the terminology used in this section of the course. We refer to both images and containers. What is the distinction between these two terms?

Images are bundles of files including an operating system, software and potentially data and other application-related files. They may sometimes be referred to as a disk image or container image and they may be stored in different ways, perhaps as a single file, or as a group of files. Either way, we refer to this file, or collection of files, as an image.

A container is a virtual environment that is based on an image. That is, the files, applications, tools, etc. that are available within a running container are determined by the image that the container is started from. Multiple container instances can usually be started from the same image, so you can think of an image as a template from which running container instances are started.
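The template relationship between images and containers can be pictured with a small Python analogy. This is only a conceptual sketch, not how Apptainer is implemented: the `Image` and `Container` classes, the file paths, and the image name `analysis.sif` are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Tuple

# Hypothetical counter used to give each container instance its own id.
_instance_ids = count(1)


@dataclass(frozen=True)
class Image:
    """A read-only bundle of files (conceptual model, not Apptainer's API)."""
    name: str
    files: Tuple[str, ...]


@dataclass
class Container:
    """A running environment started from an image."""
    image: Image
    instance_id: int = field(default_factory=lambda: next(_instance_ids))

    def available_files(self) -> Tuple[str, ...]:
        # What a container can see is determined by the image it came from.
        return self.image.files


# One image acts as the template for several container instances.
image = Image("analysis.sif", ("/usr/bin/python3", "/opt/tool/run.sh"))
first = Container(image)
second = Container(image)
```

Both `first` and `second` expose the same files because they share one image, yet they are distinct instances with different `instance_id` values.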

Many solutions are available for working with containers. One of the more common ones you may have heard about is Docker, which was developed for enterprise infrastructure workloads. You might have even used Docker to run some containers on a personal workstation, as it can be a good fit for that type of workflow as well. It isn’t a good fit for running HPC workflows, however, due to some security and feature restrictions imposed by its design.

Apptainer is an alternative container platform created specifically for the HPC use case. In most cases it lets users build and run containers in just a few steps, and its design is built around key concepts that matter to the scientific community:

Apptainer vs Singularity

In these lessons you will see the name Apptainer or Apptainer/Singularity, and the command apptainer. As stated in the move and renaming announcement, “Singularity IS Apptainer”. Currently there are three products derived from the original Singularity project from 2015:

MSI provides Apptainer, the most widely adopted variant in the scientific community, so we use the apptainer command. If you are using Singularity or SingularityCE, replace the command apptainer with singularity, and the APPTAINER_ and APPTAINERENV_ variable prefixes with SINGULARITY_ and SINGULARITYENV_. Because Apptainer was previously named Singularity and its developers wanted backwards compatibility, older scripts that still use the singularity command also work in Apptainer: it provides a singularity alias and full compatibility with the previous Singularity environment.
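The substitution described above can be sketched as a small Python helper. It is a naive text rewrite under stated assumptions, not an official migration tool: the function name `to_singularity`, the image name `image.sif`, and the variable `MYVAR` are hypothetical, and any standalone occurrence of the word apptainer (for example, inside a path) would also be rewritten.

```python
import re

# Prefix pairs taken from the paragraph above.
PREFIXES = [
    ("APPTAINERENV_", "SINGULARITYENV_"),
    ("APPTAINER_", "SINGULARITY_"),
]


def to_singularity(script: str) -> str:
    """Rewrite an Apptainer shell snippet for Singularity or SingularityCE."""
    for old, new in PREFIXES:
        # \b keeps the match anchored at the start of a variable name.
        script = re.sub(rf"\b{old}", new, script)
    # Replace the command name only where it appears as a whole lowercase word.
    return re.sub(r"\bapptainer\b", "singularity", script)


# Hypothetical job script fragment used only for illustration.
apptainer_script = """\
export APPTAINER_CACHEDIR=/tmp/cache
export APPTAINERENV_MYVAR=hello
apptainer exec image.sif env
"""

singularity_script = to_singularity(apptainer_script)
print(singularity_script)
```

Going the other way is normally unnecessary, since Apptainer keeps the singularity alias for older scripts.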

Documentation

The official Apptainer documentation is available online. It covers basic and advanced usage of Apptainer beyond the scope of this training document. Take a look at its introduction, which explains the motivation behind the creation of Apptainer.

Key Points

  • Apptainer is a container platform designed by and for scientists.

  • Apptainer has a different security model from other container platforms, which is one of the key reasons it is well suited to HPC and cluster environments: the user inside the container is the same as the user outside.

  • Apptainer/Singularity has its own container image format (SIF).