.. _task-reference:

=====
Tasks
=====

Ontology for generic tasks
==========================

While tasks are unique in theory, we can have different tasks sharing
some commonalities. In the Debian context in particular, we have different
ways to build Debian packages with different helper programs (sbuild,
pbuilder, etc.) and we want those tasks to reuse the same set of
parameters so that they can be called interchangeably.

This public interface is materialized by a generic task that can be
scheduled by the users and that will run one of the available
implementations that can run on one of the available workers.

This section documents those generic tasks and their interface.

There are some ``task_data`` keys that apply to all tasks:

* ``notifications`` (optional): a dictionary containing:

  * ``on_failure`` (required): a specification of what to do if the task
    fails, formatted as an array of dictionaries as follows:

    * ``channel`` (required): the ``NotificationChannel`` to use for this
      notification
    * ``data`` (optional): a dictionary as follows (for email channels; this
      may change for other notification methods):

      * ``from`` (optional): the email address to send this notification
	from (defaults to the channel's ``from`` property)
      * ``to`` (optional): a list of email addresses to send this
	notification to (defaults to the channel's ``to`` property)
      * ``cc`` (optional): a list of email addresses to CC this notification
	to (defaults to the channel's ``cc`` property, if any)
      * ``subject`` (optional): the subject line for this notification
	(defaults to the channel's ``subject`` property, or failing that to
	``WorkRequest $work_request_id completed in $work_request_result``);
	the strings ``${work_request_id}`` and ``${work_request_result}``
	(or ``$work_request_id`` and ``$work_request_result``, provided that
	they are not followed by valid identifier characters) are replaced
	by their values

* ``workflow_instance_id`` (optional): if this task is running as part of a
  workflow instance, then this is set to the ID of that instance

Task data key names are used in ``pydantic`` models, and must therefore be
:external+python:ref:`syntactically valid Python identifiers <identifiers>`
(although they may collide with keywords, in which case ``pydantic`` aliases
should be used).

.. _package-build-task:

Task ``PackageBuild``
---------------------

A generic task to represent a package build, i.e. the act of transforming
a source package (.dsc) into binary packages (.deb).

The ``task_data`` associated to this task can contain the following keys:

* ``input`` (required): a dictionary describing the input data

  * ``source_artifact_id`` (required): source_artifact_id pointing to a source
    package, it is used to retrieve the source package to build.

* ``distribution`` (required if ``backend`` is ``schroot``): name of the target
   distribution.
* ``environment_id`` (required if ``backend`` is not ``schroot``): ID of an
  artifact of category ``debian:system-tarball``. Backend will use it.
* ``backend`` (optional, defaults to ``unshare``):
  If ``auto``, the task uses the default.
  Supported backends: ``incus-lxc``, ``incus-vm``, ``qemu``,
  ``schroot``, and ``unshare``.
* ``extra_repositories`` (optional): a list of extra repositories to enable.
  Each repository is described by a dictionary with the following
  possible keys:

  * ``sources_list``: a single-line for an APT's sources.list file
  * ``authentication_key`` (optional): the ascii-armored public key used to
    authenticate the repository

* ``host_architecture`` (required): the architecture that we want to build
  for, it defines the architecture of the resulting architecture-specific
  .deb (if any)
* ``build_architecture`` (optional, defaults to the host architecture):
  the architecture on which we want to build the package (implies
  cross-compilation if different from the host architecture). Can be
  explicitly set to the undefined value (Python's ``None`` or JavaScript's
  ``null``) if we want to allow cross-compilation with any build architecture.
* ``build_components`` (optional, defaults to ``any``): list that can contain
  the following 3 words (cf ``dpkg-buildpackage --build=any,all,source``):

  * ``any``: enables build of architecture-specific .deb
  * ``all``: enables build of architecture-independent .deb
  * ``source``: enables build of the source package (.dsc)
* ``build_profiles``: list of build profiles to enable during package build (cf
  ``dpkg-buildpackage --build-profiles``)

* ``build_options``: value of ``DEB_BUILD_OPTIONS`` during build
* ``build_path`` (optional, default unset): forces the build to happen
  through a path named according to the passed value. When this value
  is not set, there's no restriction on the name of the path.

.. _system-bootstrap-task:

Task ``SystemBootstrap``
------------------------

A generic task to represent the bootstrapping of a Debian system out
of an APT repository. The end result of such a task is to generate
an artifact of category ``debian:system-tarball``.

The ``task_data`` associated to this task can contain the following keys:

* ``bootstrap_options``: a dictionary with a few global options:

  * ``variant`` (optional): maps to the ``--variant`` command line option
    of debootstrap
  * ``extra_packages`` (optional): list of extra packages to include in
    the bootstrapped system
  * ``architecture`` (required): the native architecture of the built
    Debian system. The task will be scheduled on a system of that
    architecture.

* ``bootstrap_repositories``: a list of repositories used to bootstrap
  the Debian system. Note that not all implementations might support
  multiple repositories.

  * ``types`` (optional): a list of source types to enable among ``deb``
    (binary repository) and ``deb-src`` (source repository).
    Defaults to a list with ``deb`` only.
  * ``mirror`` (required): the base URL of a mirror containing APT
    repositories in ``$mirror/dists/$suite``
  * ``suite`` (required): name of the distribution's repository to
    use for the bootstrap
  * ``components`` (optional): list of components to use in the APT
    repository (e.g. ``main``, ``contrib``, ``non-free``, ...). Defaults
    to download the ``Release`` from the suite and using all the Components.
  * ``check_signature_with`` (optional, defaults to ``system``): indicates
    whether we want to check the repository signature with the system-wide
    keyrings (``system``), or with the external keyring documented in the
    in the ``keyring`` key (value ``external``), or whether we don't want
    to check it at all (value ``no-check``).
  * ``keyring_package`` (optional): install an extra keyring package in
    the bootstrapped system
  * ``keyring`` (optional): provide an external keyring for the bootstrap

    * ``url`` (required): URL of the external keyring to download
    * ``sha256sum`` (optional): SHA256 checksum of the keyring to validate
      the downloaded file
    * ``install`` (boolean, defaults to False): if True, the downloaded
      keyring is installed and used in the target system.

* ``customization_script`` (optional): a script that is copied in the
  target chroot, executed from inside the chroot and then removed. It lets
  you perform arbitrary customizations to the generated system. You can
  use apt to install extra packages. If you want to use something more
  elaborated than a shell script, you need to make sure to install the
  appropriate interpreter during the bootstrap phase with the
  ``extra_packages`` key.

.. _system-image-build-task:

Task ``SystemImageBuild``
-------------------------

This generic task is an extension of the :ref:`SystemBootstrap
<system-bootstrap-task>` generic task: it should generate a disk image
artifact complying with the :ref:`debian:system-image
<artifact-system-image>` definition. That disk image contains a Debian-based
system matching the description provided by the SystemBootstrap interface.

The following additional keys are supported:

* ``disk_image``

  * ``format`` (required): desired format for the disk image. Supported values are ``raw``
    and ``qcow2``.

  * ``filename`` (optional): base of the generated disk image filename.

  * ``kernel_package`` (optional): name of the kernel package to install,
    the default value is ``linux-image-generic``, which is only
    available on Bullseye and later, on some architectures.

  * ``bootloader`` (optional): name of the bootloader package to use,
    the default value is ``systemd-boot`` on architectures that support
    it.

  * ``partitions`` (required): a list of partitions, each represented by a
    dictionary with the following keys:

    * ``size`` (required): size of the partition in gigabytes
    * ``filesystem`` (required): filesystem used in the partition, can be
      ``none`` for no filesystem, ``swap`` for a swap partition, or
      ``freespace`` for free space that doesn't result in any partition
      (it will thus just offset the position of the following partitions).
    * ``mountpoint`` (optional, defaults to ``none``): mountpoint of the
      partition in the target system, can be ``none`` for a partition that
      doesn't get a mountpoint.

Specifications of tasks
=======================

This section lists all the available tasks, with the input that they
are accepting, the description of what they are doing, including
the artifacts that they are generating.

The tasks listed in this section are those that you can use to submit work
requests. 

Autopkgtest task
----------------

The ``task_data`` associated to this task can contain the following keys:

* ``input`` (required): a dictionary describing the input data:

  * ``source_artifact_id`` (required): the ``debian:source-package`` or
    ``debian:upload`` artifact ID representing the source package to be
    tested with autopkgtest
  * ``binary_artifacts_ids`` (required): a list of
    ``debian:binary-packages`` or ``debian:upload`` artifact IDs
    representing the binary packages to be tested with autopkgtest (they are
    expected to be part of the same source package as the one identified
    with ``source_artifact_id``)
  * ``context_artifacts_ids`` (optional): a list of
    ``debian:binary-packages`` or ``debian:upload`` artifact IDs
    representing a special context for the tests. This is used to trigger
    autopkgtests of reverse dependencies, where ``context_artifacts_ids`` is
    set to the artifacts of the updated package whose reverse dependencies
    are tested, and source/binary artifact IDs are one of the reverse
    dependency whose autopkgtests will be executed.

* ``host_architecture`` (required): the Debian architecture that will be
  used in the chroot or VM where tests are going to be run.  The
  packages submitted in ``input:binary_artifacts_ids`` usually have a
  matching architecture (but need not in the case of cross-architecture
  package testing, eg. testing i386 packages in an amd64 system).

* ``environment_id`` (required): Artifact ID of the
  ``debian:system-tarball`` or ``debian:system-image`` (as appropriate
  for the selected backend) that will be used to run the tests.

* ``backend`` (optional): the virtualization backend to use, defaults to
  ``auto`` where the task is free to use the most suitable backend.
  Supported: ``incus-lxc``, ``incus-vm``, ``qemu``, and ``unshare``.

* ``include_tests`` (optional): a list of the tests that will be executed.
  If not provided (or empty), defaults to all tests being executed. Translates into
  ``--test-name=TEST`` command line options.

* ``exclude_tests`` (optional): a list of tests that will skipped.
  If not provided (or empty), then no tests are skipped. Translates into
  the ``--skip-test=TEST`` command line options.

* ``debug_level`` (optional, defaults to 0): a debug level between 0 and
  3. Translates into ``-d`` up to ``-ddd`` command line options.

* ``extra_apt_sources`` (optional): a list of APT sources. Each APT source
  is described by a single line (``deb http://MIRROR CODENAME
  COMPONENT``) that is copied to a file in /etc/apt/sources.list.d.
  Translates into ``--add-apt-source`` command line options.

* ``use_packages_from_base_repository`` (optional, defaults to False): if
  True, then we pass ``--apt-default-release=$DISTRIBUTION`` with the name
  of the base distribution given in the ``distribution`` key.

* ``environment`` (optional): a dictionary listing environment variables
  to inject in the build and test environment. Translates into
  (multiple) ``--env=VAR=VALUE`` command line options.

* ``needs_internet`` (optional, defaults to "run"): Translates directly
  into the ``--needs-internet`` command line option. Allowed values
  are "run", "try" and "skip".

* ``fail_on`` (optional): indicates whether the work request must be
  marked as failed in different scenario identified by the following
  sub-keys:

  * ``failed_test`` (optional, defaults to true): at least one test has
    failed (and the test was not marked as flaky).
  * ``flaky_test`` (optional, defaults to false): at least one flaky test
    has failed.
  * ``skipped_test`` (optional, defaults to false): at least one test has
    been skipped.

* ``timeout`` (optional): a dictionary where each key/value pair maps to
  the corresponding ``--timeout-KEY=VALUE`` command line option with the
  exception of the ``global`` key that maps to ``--timeout=VALUE``.
  Supported keys are ``global``, ``factor``, ``short``, ``install``, ``test``,
  ``copy`` and ``build``.

.. note::

   At this point, we have voluntarily not added any key for the
   ``--pin-packages`` option because that option is not explicit enough:
   differences between the mirror used to schedule jobs and the mirror
   used by the jobs result in tests that are not testing the version that
   we want. At this point, we believe it's better to submit all modified
   packages explicitly via ``input:context_artifacts_ids`` so that we are
   sure of the .deb that we are submitting and testing with. That way we
   can even test reverse dependencies before the modified package is
   available in any repository.

   This assumes that we can submit arbitrary .deb on the command line and
   that they are effectively used as part of the package setup.

autopkgtest is always run with the options ``--apt-upgrade
--output-dir=ARTIFACT-DIR --summary=ARTIFACT-DIR/summary --no-built-binaries``.

An artifact of category ``debian:autopkgtest`` is generated to store all output
files, and is described :ref:`in the artifacts reference <artifact-autopkgtest>`.


Debootstrap task (NOT IMPLEMENTED YET)
--------------------------------------

The ``debootstrap`` task implements the :ref:`SystemBootstrap
<system-bootstrap-task>` interface except that it only supports a single
repository in the ``bootstrap_repositories`` key.

On top of the keys defined in that interface, it also supports the
following additional keys in ``task_data``:

* ``bootstrap_options``

  * ``script``: last parameter on debootstrap's command line

The various keys in the first entry of ``bootstrap_repositories`` are mapped to the
corresponding command line options and parameters:

* ``mirror``, ``suite`` and ``script`` map to positional command line parameters
* ``components`` maps to ``--components``
* ``check_signature`` maps to ``--check-gpg`` or ``--no-check-gpg``
* ``keyring_package`` maps to an extra package name in ``--include``
* ``keyring`` maps to ``--keyring``

The following keys from ``bootstrap_options`` are also mapped:
* ``variant`` maps to ``--variant``
* ``extra_packages`` maps to ``--include``

.. _task-mmdebstrap:

Mmdebstrap task
---------------

The ``mmdebstrap`` task fully implements the :ref:`SystemBootstrap
<system-bootstrap-task>` interface.

On top of the keys defined in that interface, it also supports the
following additional keys in ``task_data``:

* ``bootstrap_options``

  * ``use_signed_by`` (defaults to True): if set to False, then we
    do not pass the keyrings to APT via the ``Signed-By`` sources.list
    option, instead we rely on the ``--keyring`` command line parameter.

The keys from ``bootstrap_options`` are mapped to command line options:

* ``variant`` maps to ``--variant`` (and it supports more values than
  debootstrap, see its manual page)
* ``extra_packages`` maps to ``--include``

The keys from ``bootstrap_repositories`` are used to build a sources.list
file that is then fed to ``mmdebstrap`` as input.

SimpleSystemImageBuild task
---------------------------

The ``simplesystemimagebuild`` task implements the :ref:`SystemImageBuild
<system-image-build-task>` interface except that it expects a single
entry in the list of partitions: the entry for the root filesystem (thus
with a mountpoint of ``/``).

In terms of compliance with the ``SystemBootstrap`` interface, the
bootstrap phase only uses a single repository but the remaining
repositories are enabled after the bootstrap.

This task is implemented with the help of the ``debos`` tool.

Lintian task
------------

The ``task_data`` associated to this task can contain the following keys:

* ``input`` (required): a dictionary of values describing the input data,
  one of the sub-keys is required but both can be given at the same time
  too.

  * ``source_artifact_id`` (optional): the ``debian:source-package`` or
    ``debian:upload`` artifact ID representing the source package to be
    tested with lintian
  * ``binary_artifacts_ids`` (optional): a list of
    ``debian:binary-packages`` or ``debian:upload`` artifact IDs
    representing the binary packages to be tested with lintian (they are
    expected to be part of the same source package as the one identified
    with ``source_artifact_id``)

.. note::

   While it's possible to submit only a source or only a single binary
   artifact, you should aim to always submit source + arch-all + arch-any
   related artifacts to have the best test coverage as some tags can only
   be emitted when lintian has access to all of them at the same time.

* ``environment_id`` (required): Artifact ID of the ``debian:system-tarball``
  that will be used to run lintian. Must have ``lintian`` installed.

* ``backend`` (optional): the virtualization backend to use, defaults to
  ``auto`` where the task is free to use the most suitable backend.
  Supported options: ``incus-lxc``, ``incus-vm``, ``unshare``.

* ``output`` (optional): a dictionary of values controlling some aspects
  of the generated artifacts

  * ``source_analysis`` (optional, defaults to True): indicates whether
    we want to generate the ``debian:lintian`` artifact for the source
    package

  * ``binary_all_analysis`` (optional, defaults to True): same as
    ``source_analysis`` but for the ``debian:lintian`` artifact related
    to ``Architecture: all`` packages

  * ``binary_any_analysis`` (optional, defaults to True): same as
    ``source_analysis`` but for the ``debian:lintian`` artifact related
    to ``Architecture: any`` packages

* ``target_distribution`` (optional): the fully qualified name of the
  distribution that will provide the lintian software to analyze the
  packages. Defaults to ``debian:unstable``.

* ``include_tags`` (optional): a list of the lintian tags that are allowed to
  be reported. If not provided (or empty), defaults to all. Translates into the
  ``--tags`` or ``--tags-from file`` command line option.

* ``exclude_tags`` (optional): a list of the lintian tags that are not
  allowed to be reported. If not provided (or empty), then no tags are
  hidden. Translates into the ``--suppress-tags`` or
  ``--suppress-tags-from file`` command line option.

* ``fail_on_severity`` (optional, defaults to ``none``): if the analysis emits
  tags of that severity or higher, then the task will return a "failure"
  instead of a "success". Valid values are (in decreasing severity)
  "error", "warning", "info", "pedantic", "experimental", "overridden".
  "none" is a special value indicating that we should never fail.

The lintian runs will always use the options ``--display-level
">=classification"`` (``>=pedantic`` in jessie) ``--no-cfg
--display-experimental --info --show-overrides`` to collect the full set of
data that lintian can provide.

.. note::

   Current lintian can generate "masked" tags (with `M:` prefix) when you
   use ``--show-overrides``. For the purpose of debusine, we entirely
   ignore those tags on the basis that it's lintian's decision to hide
   them (and not the maintainer's decision) and as such, they don't bring
   any useful information. Lintian is full of exceptions to not emit some
   tags and the fact that some tags rely on a modular exception mechanism
   that can be diverted to generate masked tags is not useful to package
   maintainers.

   For those reasons, we suggested to lintian's maintainers to entirely
   stop emitting those tags in https://bugs.debian.org/1053892

Between 1 to 3 artifacts of category ``debian:lintian`` will be generated (one
for each source/binary package artifact submitted) and they will have a
"relates to" relationship with the corresponding artifact that has been
analyzed. The ``debian:lintian`` artifacts are described
:ref:`in the artifacts reference <artifact-lintian>`.

.. _task-sbuild:

Sbuild task
-----------

Regarding inputs, the ``sbuild`` task is compatible with the ontology
defined for :ref:`package-build-task` even though it implements only
a subset of the possible options at this time.

Currently unsupported ``PackageBuild`` task keys:

* ``extra_repositories``
* ``build_architecture`` / ``build_profiles``
* ``build_options``
* ``build_path``

Specific options:

* ``sbuild_options``: command-line options to pass to
  ``sbuild``. Note: this is a temporary feature; to avoid arbitrary
  command execution on the worker, we will prevent that in the long
  term.

Output artifacts and relationships:

a. ``debian:package-build-log``: sbuild output

   * relates-to: ``source_artifact_id``
   * relates-to: `b`

b. ``debian:binary-packages``: the binary packages (``*.deb``) built
   from the source package

   * relates-to: ``source_artifact_id``

c. ``debian:upload``: `b` plus the right administrative files
   (``.changes``, ``.buildinfo``) necessary for its binary upload

   * extends: `b`
   * relates-to: `b`

d. ``debusine:work-request-debug-logs``: debusine-specific worker logs

   * relates-to: ``source_artifact_id``


Piuparts task
-------------

A specific task to represent a binary package check using the
``piuparts`` utility.

The ``task_data`` associated to this task can contain the following keys:

* ``input`` (required): a dictionary describing the input data

  * ``binary_artifacts_ids`` (required): a list of
    ``debian:binary-packages`` or ``debian:upload`` artifact IDs
    representing the binary packages to be tested. Multiple Artifacts can be
    provided so as to support e.g. testing binary packages from split
    indep/arch builds.

* ``backend`` (optional, defaults to ``unshare``).
  If ``auto``, the task uses the default.
  Supported backends: ``incus-lxc``, ``incus-vm``, ``schroot``, and ``unshare``.
* ``environment_id`` (required): ID of an artifact of category
  ``debian:system-tarball``. Backend will use it.
* ``base_tgz_id`` (required): ID of an artifact of category
  ``debian:system-tarball``. Used in piuparts's command line in
  ``--base-tgz``. If the artifact's data has ``with_dev: True``, the task
  will remove the files ``/dev/*`` before using it.

* ``host_architecture`` (required): the architecture that we want to
  test on.

The ``piuparts`` output will be provided as a new artifact.

Blhc task
---------

A task to represent a build log check using the ``blhc`` utility.

The ``task_data`` associated to this task can contain the following keys:

* ``input`` (required): a dictionary describing the input data

  * ``artifact_id`` (required): a ``debian:package-build-log`` artifact ID
    corresponding to the build log to be checked. The file should have a
    ``.build`` suffix.

* ``extra_flags`` (optional): a list of flags to be passed to the blhc
  command, such as ``--bindnow`` or ``--pie``. If an unsupported flag is
  passed then the request will fail.

The ``blhc`` output will be provided as a new artifact of category
``debian:blhc``, described :ref:`in the artifacts reference <artifact-blhc>`.

The task returns success if ```blhc``` returns an exit code of 0 or 1, and
failure otherwise.


APTMirror task (NOT IMPLEMENTED YET)
------------------------------------

This is a :ref:`server-side task <explanation-tasks>` that updates an
existing :ref:`debian:suite collection <collection-suite>` to reflect the
state of an external APT suite. It creates source and binary package
artifacts as required.

The ``task_data`` for this task may contain the following keys:

* ``url`` (required): the base URL of the external suite
* ``components`` (optional): if set, only mirror these components
* ``architectures`` (optional): if set, only mirror binary packages for this
  list of architectures
* ``signing_key`` (optional): ASCII-armored public key used to authenticate
  this suite

The suite must have at least a ``Release`` file to make it possible to
handle multiple architectures in a reasonable way, and if ``signing_key`` is
specified then it must also have either an ``InRelease`` or a
``Release.gpg`` file.

Additions and removals of collection items are timestamped as described in
:ref:`explanation-collections`, so tasks that need a static view of a
collection (e.g. so that all tasks in the same workflow instance see the
same base suite contents) can do this by filtering on collection items that
were created before or at a given point in time and were not yet removed at
that point in time.

.. todo::

   Document exactly how transactional updates of collections work in
   general: tasks need to see a coherent state, and simple updates to
   collections can be done in a database transaction, but some longer update
   tasks where using a single database transaction is impractical may need
   more careful handling.  See discussion in `!517
   <https://salsa.debian.org/freexian-team/debusine/-/merge_requests/517>`_.
