dcnum.logic.job
===============

.. py:module:: dcnum.logic.job


Attributes
----------

.. autoapisummary::

   dcnum.logic.job.hdf5plugin


Classes
-------

.. autoapisummary::

   dcnum.logic.job.DCNumPipelineJob


Module Contents
---------------

.. py:data:: hdf5plugin

.. py:class:: DCNumPipelineJob(path_in: pathlib.Path | str, path_out: pathlib.Path | str | None = None, data_code: str = 'hdf', data_kwargs: dict | None = None, background_code: str = 'sparsemed', background_kwargs: dict | None = None, segmenter_code: str = 'thresh', segmenter_kwargs: dict | None = None, feature_code: str = 'legacy', feature_kwargs: dict | None = None, gate_code: str = 'norm', gate_kwargs: dict | None = None, basin_strategy: Literal['drain', 'tap'] = 'drain', compression: str = 'zstd-5', num_procs: int | None = None, log_level: int = logging.INFO, debug: bool = False)

   Pipeline job recipe

   :param path_in: input data path
   :type path_in: pathlib.Path | str
   :param path_out: output data path
   :type path_out: pathlib.Path | str | None
   :param data_code: identification code of input data reader to use
   :type data_code: str
   :param data_kwargs: keyword arguments for data reader
   :type data_kwargs: dict | None
   :param background_code: identification code of background data computation method
   :type background_code: str
   :param background_kwargs: keyword arguments for background data computation method
   :type background_kwargs: dict | None
   :param segmenter_code: identification code of segmenter to use
   :type segmenter_code: str
   :param segmenter_kwargs: keyword arguments for segmenter
   :type segmenter_kwargs: dict | None
   :param feature_code: identification code of feature extractor
   :type feature_code: str
   :param feature_kwargs: keyword arguments for feature extractor
   :type feature_kwargs: dict | None
   :param gate_code: identification code for gating/event filtering class
   :type gate_code: str
   :param gate_kwargs: keyword arguments for gating/event filtering class
   :type gate_kwargs: dict | None
   :param basin_strategy: strategy for handling event data. In principle, not all
                          events have to be stored in the output file if basins are
                          defined that link back to the original file.

                          - You can "drain" all basins, which means that the output
                            file will contain all features, but will also be very big.
                          - You can "tap" the basins, including the input file, which
                            means that the output file will be comparatively small.
   :type basin_strategy: Literal['drain', 'tap']
   :param compression: compression algorithm to use. Set this to "none" to disable
                       compression. Currently, only the Zstandard compression
                       algorithm is supported, ranging from the lowest compression
                       level "zstd-1" to the highest "zstd-9". The default "zstd-5"
                       is a trade-off. Use a higher level if disk I/O is the
                       bottleneck and a lower level if the CPU is the bottleneck.
                       Note that "zstd-5" is the accepted minimum compression
                       setting for long-term data storage in the DC universe
                       (enforced e.g. by DCOR-Aid).
   :type compression: str
   :param num_procs: Number of processes to use
   :type num_procs: int | None
   :param log_level: Logging level to use.
   :type log_level: int
   :param debug: Whether to set logging level to "DEBUG" and
                 use threads instead of processes
   :type debug: bool
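
   A job could be set up roughly as follows. This is a minimal sketch based only
   on the signature above: the input file name ``measurement.rtdc`` and the
   helper ``make_job`` are hypothetical, and dcnum is imported lazily so that the
   keyword arguments can be prepared and inspected without it.

   .. code-block:: python

      import pathlib

      # Keyword arguments mirroring the documented constructor signature.
      # Anything not listed here keeps its documented default.
      job_kwargs = dict(
          path_in=pathlib.Path("measurement.rtdc"),  # hypothetical input file
          path_out=None,            # documented default
          segmenter_code="thresh",  # documented default
          basin_strategy="tap",     # keep the output file comparatively small
          compression="zstd-5",     # documented default
          num_procs=2,
          debug=False,
      )


      def make_job(**kwargs):
          """Hypothetical helper: create the job (requires dcnum)."""
          from dcnum.logic.job import DCNumPipelineJob
          return DCNumPipelineJob(**kwargs)

   Calling ``make_job(**job_kwargs)`` would then return the job instance,
   provided that dcnum is installed and the input file is usable.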


   .. py:attribute:: kwargs

      initialization keyword arguments for this job


   .. py:method:: __getitem__(item)


   .. py:method:: __getstate__()


   .. py:method:: __setstate__(state)


   .. py:method:: assert_pp_codes()

      Sanity check of ``self.kwargs``


   .. py:method:: get_hdf5_dataset_kwargs() -> dict

      Validate and return output HDF5 Dataset keyword arguments
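
      The ``compression`` string format documented for the constructor
      ("none" or "zstd-1" to "zstd-9") can be checked with a small standalone
      helper. This is only a sketch of that rule, not dcnum's actual
      validation code; the function ``parse_compression`` is hypothetical.

      .. code-block:: python

         from typing import Optional


         def parse_compression(compression: str) -> Optional[int]:
             """Return the Zstandard level, or None if compression is disabled."""
             if compression == "none":
                 return None
             method, sep, level = compression.partition("-")
             if method != "zstd" or not sep or not level.isdigit():
                 raise ValueError(f"Unsupported compression: {compression!r}")
             level = int(level)
             # The documented range is "zstd-1" (least) to "zstd-9" (best).
             if not 1 <= level <= 9:
                 raise ValueError(f"Level out of range: {compression!r}")
             return level

      For instance, ``parse_compression("zstd-5")`` yields the integer level
      ``5``, while ``parse_compression("none")`` yields ``None``.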


   .. py:method:: get_ppid(ret_hash=False, ret_dict=False)


   .. py:method:: get_segmenter_class()

      Return the class of the segmenter associated with this job


   .. py:method:: validate()

      Make sure the pipeline will run given the job kwargs

      :returns: ``True`` (for testing convenience)
      :rtype: bool

      :raises dcnum.segm.SegmenterNotApplicableError: the segmenter is incompatible with the input path
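
      A caller might wrap the validation as sketched below. The helper
      ``job_is_runnable`` is hypothetical and relies only on the return value
      and the exception documented above; dcnum is imported lazily inside the
      function.

      .. code-block:: python

         def job_is_runnable(job) -> bool:
             """Hypothetical helper: True if `job.validate()` succeeds."""
             # Imported here so that defining the helper does not need dcnum.
             from dcnum.segm import SegmenterNotApplicableError
             try:
                 # validate() is documented to return True on success
                 return bool(job.validate())
             except SegmenterNotApplicableError:
                 # The segmenter is incompatible with the input path.
                 return False

      Any other exception is deliberately left to propagate, since only the
      segmenter incompatibility is documented here.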