dcnum.feat.feat_background
==========================

.. py:module:: dcnum.feat.feat_background

.. autoapi-nested-parse::

   Feature computation: background image data from image data


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/dcnum/feat/feat_background/base/index
   /autoapi/dcnum/feat/feat_background/bg_copy/index
   /autoapi/dcnum/feat/feat_background/bg_roll_median/index
   /autoapi/dcnum/feat/feat_background/bg_sparse_median/index


Classes
-------

.. autoapisummary::

   dcnum.feat.feat_background.Background
   dcnum.feat.feat_background.BackgroundCopy
   dcnum.feat.feat_background.BackgroundRollMed
   dcnum.feat.feat_background.BackgroundSparseMed


Functions
---------

.. autoapisummary::

   dcnum.feat.feat_background.get_available_background_methods


Package Contents
----------------

.. py:class:: Background(input_data, output_path, compress=True, num_cpus=None, **kwargs)

   Bases: :py:obj:`abc.ABC`


   Base class for background computation

   :param input_data: The input data can be either a path to an HDF5 file with
                      the "events/image" dataset or an array-like object that
                      behaves like an image stack (first axis enumerates events)
   :type input_data: array-like or pathlib.Path
   :param output_path: Path to the output file. If `input_data` is a path, you can
                       set `output_path` to the same path to write directly to the
                       input file. The data are written in the "events/image_bg"
                       dataset in the output file.
   :type output_path: pathlib.Path
   :param compress: Whether to compress background data. Set this to False
                    for faster processing.
   :type compress: bool
   :param num_cpus: Number of CPUs to use for median computation. Defaults to
                    `dcnum.common.cpu_count()`.
   :type num_cpus: int
   :param kwargs: Additional keyword arguments passed to the subclass.


   .. py:attribute:: logger


   .. py:attribute:: output_path


   .. py:attribute:: kwargs

      background keyword arguments


   .. py:attribute:: num_cpus
      :value: None


      number of CPUs used


   .. py:attribute:: image_proc

      fraction of images that have been processed


   .. py:attribute:: hdin
      :value: None


      HDF5Data instance for input data


   .. py:attribute:: h5in
      :value: None


      input h5py.File


   .. py:attribute:: h5out
      :value: None


      output h5py.File


   .. py:attribute:: paths_ref
      :value: []


      reference paths for logging to the output .rtdc file


   .. py:attribute:: image_shape

      shape of event images


   .. py:attribute:: image_count

      number of images in the input data


   .. py:attribute:: writer


   .. py:method:: __enter__()


   .. py:method:: __exit__(type, value, tb)


   .. py:method:: check_user_kwargs(**kwargs)
      :abstractmethod:


      Implement this to check the kwargs during init


   .. py:method:: get_ppid()

      Return a unique background pipeline identifier

      The pipeline identifier is universally applicable and must
      be backwards-compatible (future versions of dcnum will
      correctly acknowledge the ID).

      The background pipeline ID is defined as::

          KEY:KW_BACKGROUND

      Where KEY is e.g. "sparsemed" or "rollmed", and KW_BACKGROUND is a
      list of keyword arguments for `check_user_kwargs`, e.g.::

          kernel_size=100^batch_size=10000

      which may be abbreviated to::

          k=100^b=10000


   .. py:method:: get_ppid_code()
      :classmethod:


   .. py:method:: get_ppid_from_ppkw(kwargs)
      :classmethod:


      Return the PPID based on given keyword arguments for a subclass


   .. py:method:: get_ppkw_from_ppid(bg_ppid)
      :staticmethod:


      Return keyword arguments for any subclass from a PPID string


   .. py:method:: get_progress()

      Return progress of background computation, float in [0,1]


   .. py:method:: process()

      Perform the background computation

      This irreversibly removes or overwrites any "image_bg" and
      "bg_off" features defined in the output file `self.h5out`.


   .. py:method:: process_approach()
      :abstractmethod:


      The actual background computation approach
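
The identifier layout described for ``Background.get_ppid`` can be illustrated with a minimal sketch. The helpers ``build_ppid`` and ``parse_ppid`` below are hypothetical and not part of dcnum. They only join and split the ``KEY:KW_BACKGROUND`` string and do not attempt the abbreviation handling or type conversion that the real classmethods may perform.

```python
def build_ppid(key, kwargs):
    """Join a method key and keyword arguments into ``KEY:KW_BACKGROUND``.

    Hypothetical helper: keyword arguments are separated by ``^``.
    """
    kw_part = "^".join(f"{name}={value}" for name, value in kwargs.items())
    return f"{key}:{kw_part}"


def parse_ppid(ppid):
    """Split ``KEY:KW_BACKGROUND`` into the key and a dict of strings.

    Hypothetical helper: values are returned as strings, unconverted.
    """
    key, _, kw_part = ppid.partition(":")
    kwargs = {}
    for item in kw_part.split("^"):
        if item:
            name, _, value = item.partition("=")
            kwargs[name] = value
    return key, kwargs


ppid = build_ppid("rollmed", {"kernel_size": 100, "batch_size": 10000})
key, kwargs = parse_ppid(ppid)
print(ppid)
print(key, kwargs)
```

Keeping the keyword arguments in the identifier string is what allows a pipeline step to be reconstructed later from the string alone.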


.. py:function:: get_available_background_methods()

   Return dictionary of background computation methods
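
One common way to build such a dictionary is to collect the subclasses of a base class and key them by a short code. The sketch below shows that pattern with hypothetical stand-in classes. It is not taken from the dcnum source, and the ``"copy"`` key is an assumption (``"rollmed"`` and ``"sparsemed"`` are the example keys named in ``Background.get_ppid``).

```python
class BaseMethod:
    """Hypothetical stand-in for an abstract background base class."""
    code = None  # short key identifying the method


class CopyMethod(BaseMethod):
    code = "copy"  # assumed key, for illustration only


class RollMedMethod(BaseMethod):
    code = "rollmed"


class SparseMedMethod(BaseMethod):
    code = "sparsemed"


def get_available_methods():
    """Return a dictionary mapping each method code to its class."""
    return {cls.code: cls for cls in BaseMethod.__subclasses__()}


methods = get_available_methods()
print(sorted(methods))
```

With a registry like this, a command-line interface can look up the class for a user-supplied key without hard-coding the list of methods.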


.. py:class:: BackgroundCopy(*args, **kwargs)

   Bases: :py:obj:`dcnum.feat.feat_background.base.Background`


   Copy the input background data to the output file


   .. py:method:: check_user_kwargs()
      :staticmethod:


      Implement this to check the kwargs during init


   .. py:method:: process()

      Copy input data to output dataset


   .. py:method:: process_approach()

      The actual background computation approach


.. py:class:: BackgroundRollMed(input_data, output_path, kernel_size=100, batch_size=10000, compress=True, num_cpus=None)

   Bases: :py:obj:`dcnum.feat.feat_background.base.Background`


   Rolling median RT-DC background image computation

   1. There is one big shared array `shared_input` that contains
      the image data for each batch.
   2. The user specifies the batch size (default is 10000) and the
      kernel size (default is 100).
   3. There is a second shared array `shared_output` that contains
      the median values corresponding to the data in `shared_input`.
   4. Background computation is done by copying the input images
      from a file into the shared array.
   5. The input array is split into slices, and workers compute the
      rolling median for each point in `shared_input`.

   :param input_data: The input data can be either a path to an HDF5 file with
                      the "events/image" dataset or an array-like object that
                      behaves like an image stack (first axis enumerates events)
   :type input_data: array-like or pathlib.Path
   :param output_path: Path to the output file. If `input_data` is a path, you can
                       set `output_path` to the same path to write directly to the
                       input file. The data are written in the "events/image_bg"
                       dataset in the output file.
   :type output_path: pathlib.Path
   :param kernel_size: Kernel size for median computation. This is the number of
                       events that are used to compute the median for each pixel.
   :type kernel_size: int
   :param batch_size: Number of events to process at the same time. Increasing this
                      number beyond roughly two orders of magnitude above
                      ``kernel_size`` will not increase computation speed. Larger
                      values lead to a higher memory consumption.
   :type batch_size: int
   :param compress: Whether to compress background data. Set this to False
                    for faster processing.
   :type compress: bool
   :param num_cpus: Number of CPUs to use for median computation. Defaults to
                    `dcnum.common.cpu_count()`.
   :type num_cpus: int


   .. py:attribute:: kernel_size
      :value: 100


      kernel size used for median filtering


   .. py:attribute:: batch_size
      :value: 10000


      number of events processed at once


   .. py:attribute:: shared_input_raw

      mp.RawArray for temporary batch input data


   .. py:attribute:: shared_output_raw

      mp.RawArray for temporary batch output data


   .. py:attribute:: shared_input

      numpy array view on `self.shared_input_raw`, reshaped so that
      the first axis enumerates the events


   .. py:attribute:: shared_output

      numpy array view on `self.shared_output_raw`, reshaped so that
      the first axis enumerates the events


   .. py:attribute:: current_batch
      :value: 0


      current batch index (see `self.process` and `process_next_batch`)


   .. py:attribute:: worker_counter

      counter tracking the progress of the workers


   .. py:attribute:: queue

      queue for median computation jobs


   .. py:attribute:: workers

      list of workers (processes)


   .. py:method:: __enter__()


   .. py:method:: __exit__(type, value, tb)


   .. py:method:: check_user_kwargs(*, kernel_size: int = 100, batch_size: int = 10000)
      :staticmethod:


      Check user-defined properties of this class

      This method primarily exists so that the CLI knows which
      keyword arguments can be passed to this class.

      :param kernel_size: Kernel size for median computation. This is the number of
                          events that are used to compute the median for each pixel.
      :type kernel_size: int
      :param batch_size: Number of events to process at the same time. Increasing this
                         number beyond roughly two orders of magnitude above
                         ``kernel_size`` will not increase computation speed. Larger
                         values lead to a higher memory consumption.
      :type batch_size: int


   .. py:method:: get_slices_for_batch(batch_index=0)

      Returns slices for getting the input and writing to output

      The input slice is `self.kernel_size` longer.


   .. py:method:: map_iterator()

      Iterates over arguments for `compute_median_for_slice`


   .. py:method:: process_approach()

      Perform median computation on entire input data


   .. py:method:: process_next_batch()

      Process one batch of input data
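
The batch handling and the per-pixel rolling median of ``BackgroundRollMed`` can be sketched in plain Python. This is a single-process illustration under stated assumptions, not the dcnum implementation: the function signatures are hypothetical, the window is assumed to start at the current event and extend forward, and the window is shifted back near the end of the data. The real class uses shared arrays and worker processes.

```python
from statistics import median


def get_slices_for_batch(batch_index, batch_size, kernel_size, event_count):
    """Return (input, output) slices for one batch.

    The input slice is up to ``kernel_size`` events longer than the
    output slice, limited by the end of the data.
    """
    start = batch_index * batch_size
    stop_out = min(start + batch_size, event_count)
    stop_in = min(stop_out + kernel_size, event_count)
    return slice(start, stop_in), slice(start, stop_out)


def rolling_median(images, kernel_size):
    """Per-pixel median over a forward window of ``kernel_size`` events.

    ``images`` is a list of flattened images (lists of pixel values).
    """
    count = len(images)
    size = min(kernel_size, count)
    background = []
    for ii in range(count):
        # Assumption: shift the window back so it stays inside the data
        start = min(ii, count - size)
        window = images[start:start + size]
        # zip(*window) yields, for each pixel, its values across events
        background.append([median(pixel) for pixel in zip(*window)])
    return background


# Six two-pixel frames; the third frame contains a bright event
images = [[10, 10], [10, 10], [90, 10], [10, 10], [10, 10], [10, 10]]
bg = rolling_median(images, kernel_size=3)
slices_first = get_slices_for_batch(0, 4, 2, 10)
slices_last = get_slices_for_batch(2, 4, 2, 10)
print(bg)
```

Because the median ignores outliers, the single bright frame does not leak into the computed background as long as events occupy fewer than half of the frames in each window.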


.. py:class:: BackgroundSparseMed(input_data, output_path, kernel_size=200, split_time=1.0, thresh_cleansing=0, frac_cleansing=0.8, offset_correction=True, compress=True, num_cpus=None)

   Bases: :py:obj:`dcnum.feat.feat_background.base.Background`


   Sparse median background correction with cleansing

   In contrast to the rolling median background correction,
   this algorithm only computes the background image every
   ``split_time`` seconds, but with a larger window (default kernel
   size is 200 frames instead of 100 frames).

   1. At time stamps every `split_time` seconds, a background image is
      computed, resulting in a background series.
   2. Cleansing: The background series is checked for images that
      contain event data using a lengthy algorithm that is documented
      in the source code. In short, this gets rid of
      background images that contain streaks of red blood cells (RBCs).
   3. Each frame is assigned the background image from the
      background series that is closest to it in time.

   :param input_data: The input data can be either a path to an HDF5 file with
                      the "events/image" dataset or an array-like object that
                      behaves like an image stack (first axis enumerates events).
   :type input_data: array-like or pathlib.Path
   :param output_path: Path to the output file. If `input_data` is a path, you can
                       set `output_path` to the same path to write directly to the
                       input file. The data are written in the "events/image_bg"
                       dataset in the output file.
   :type output_path: pathlib.Path
   :param kernel_size: Kernel size for median computation. This is the number of
                       events that are used to compute the median for each pixel.
   :type kernel_size: int
   :param split_time: Time in seconds between background images in the background series
   :type split_time: float
   :param thresh_cleansing: A positive floating point value for scaling the thresholding
                            operation when excluding background images from the series.
                            Larger values mean more background images are excluded.
                            Set to zero to enforce a fixed fraction via `frac_cleansing`.
   :type thresh_cleansing: float
   :param frac_cleansing: Fraction between 0 and 1 indicating how many background images
                          must still be present after cleansing (in case the cleansing
                          factor is too large). Set to 1 to disable cleansing altogether.
   :type frac_cleansing: float
   :param offset_correction: The sparse median background correction produces one median
                             image for multiple input frames (this also leads to very
                             efficient data storage with internal HDF5 basins). In
                             case the input frames are subject to frame-by-frame brightness
                             variations (e.g. flickering of the illumination source), it
                             is useful to have an offset value per frame that can then be
                             used in a later step to perform a more accurate background
                             correction. This offset is computed here by taking a 20px wide
                             slice from each frame (where the channel wall is located)
                             and computing the median therein relative to the computed
                             background image. The data are written to the "bg_off" feature
                             in the output file alongside "image_bg". To obtain the
                             corrected background image, add "image_bg" and "bg_off".
                             Set this to False if you don't need the "bg_off" feature.
   :type offset_correction: bool
   :param compress: Whether to compress background data. Set this to False
                    for faster processing.
   :type compress: bool
   :param num_cpus: Number of CPUs to use for median computation. Defaults to
                    `dcnum.common.cpu_count()`.
   :type num_cpus: int

   .. versionchanged:: 0.23.5

      The background image data are stored as an internal
      mapped basin to reduce the output file size.


   .. py:attribute:: kernel_size
      :value: 200


      kernel size used for median filtering


   .. py:attribute:: split_time
      :value: 1.0


      time between background images in the background series


   .. py:attribute:: thresh_cleansing
      :value: 0


      cleansing threshold factor


   .. py:attribute:: frac_cleansing
      :value: 0.8


      keep at least this fraction of background images from the series


   .. py:attribute:: offset_correction
      :value: True


      offset/flickering correction


   .. py:attribute:: time
      :value: None


   .. py:attribute:: duration

      duration of the measurement


   .. py:attribute:: step_times


   .. py:attribute:: bg_images

      array containing all background images


   .. py:attribute:: shared_input_raw

      mp.RawArray for temporary batch input data


   .. py:attribute:: shared_output_raw

      mp.RawArray for the median background image


   .. py:attribute:: shared_input

      numpy array view on `self.shared_input_raw`, reshaped so that
      the first axis enumerates the events


   .. py:attribute:: shared_output

      numpy array view on `self.shared_output_raw`, reshaped so that
      the first axis enumerates the events


   .. py:attribute:: worker_counter

      counter tracking the progress of the workers


   .. py:attribute:: queue

      queue for median computation jobs


   .. py:attribute:: workers

      list of workers (processes)


   .. py:method:: __enter__()


   .. py:method:: __exit__(type, value, tb)


   .. py:method:: check_user_kwargs(*, kernel_size: int = 200, split_time: float = 1.0, thresh_cleansing: float = 0, frac_cleansing: float = 0.8, offset_correction: bool = True)
      :staticmethod:


      Initialize user-defined properties of this class

      This method primarily exists so that the CLI knows which
      keyword arguments can be passed to this class.

      :param kernel_size: Kernel size for median computation. This is the number of
                          events that are used to compute the median for each pixel.
      :type kernel_size: int
      :param split_time: Time in seconds between background images in the background series
      :type split_time: float
      :param thresh_cleansing: A positive floating point value for scaling the thresholding
                               operation when excluding background images from the series.
                               Larger values mean more background images are excluded.
                               Set to 0 (default) to enforce a fixed fraction `frac_cleansing`.
      :type thresh_cleansing: float
      :param frac_cleansing: Fraction between 0 and 1 indicating how many background images
                             must still be present after cleansing (in case the cleansing
                             factor is too large). Set to 1 to disable cleansing altogether.
      :type frac_cleansing: float
      :param offset_correction: The sparse median background correction produces one median
                                image for multiple input frames (this also leads to very
                                efficient data storage with internal HDF5 basins). In
                                case the input frames are subject to frame-by-frame brightness
                                variations (e.g. flickering of the illumination source), it
                                is useful to have an offset value per frame that can then be
                                used in a later step to perform a more accurate background
                                correction. This offset is computed here by taking a 20px wide
                                slice from each frame (where the channel wall is located)
                                and computing the median therein relative to the computed
                                background image. The data are written to the "bg_off" feature
                                in the output file alongside "image_bg". To obtain the
                                corrected background image, add "image_bg" and "bg_off".
                                Set this to False if you don't need the "bg_off" feature.
      :type offset_correction: bool


   .. py:method:: process_approach()

      Perform median computation on entire input data


   .. py:method:: process_second(ii: int, second: float | int)
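
Steps 1 and 3 of the sparse median approach, together with the per-frame offset, can be sketched as follows. This is a simplified illustration, not the dcnum implementation: the function names and signatures are hypothetical, the cleansing step is omitted, the window is assumed to start at each time step, and the offset is assumed to be the median of the pixel-wise difference within a strip of pixel indices.

```python
from statistics import median


def sparse_median_background(images, times, kernel_size, split_time):
    """Return (step_times, bg_images, index) for a background series.

    ``images`` is a list of flattened images, ``times`` the matching
    time stamps in seconds (ascending).
    """
    count = len(images)
    size = min(kernel_size, count)
    duration = times[-1] - times[0]
    # One background image every ``split_time`` seconds
    num_steps = int(duration // split_time) + 1
    step_times = [times[0] + n * split_time for n in range(num_steps)]
    bg_images = []
    for t_step in step_times:
        # First frame at or after the step time (fallback: last frame)
        first = next((ii for ii, t in enumerate(times) if t >= t_step),
                     count - 1)
        # Assumption: shift the window back so it stays inside the data
        first = min(first, count - size)
        window = images[first:first + size]
        bg_images.append([median(pixel) for pixel in zip(*window)])
    # Each frame gets the background image closest to it in time
    index = [
        min(range(num_steps), key=lambda n: abs(step_times[n] - t))
        for t in times
    ]
    return step_times, bg_images, index


def frame_offset(image, bg_image, strip):
    """Median difference between frame and background within ``strip``."""
    return median(image[ii] - bg_image[ii] for ii in strip)


times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
images = [[10], [12], [11], [20], [22], [21]]
step_times, bg_images, index = sparse_median_background(
    images, times, kernel_size=3, split_time=1.0)

# Offset of a brighter frame relative to its background image
offset = frame_offset([105, 103, 104, 60], [100, 100, 100, 100], range(3))
corrected = [value + offset for value in [100, 100, 100, 100]]
print(step_times, bg_images, index, offset)
```

Storing only the short background series plus an index per frame is what keeps the output small compared to one background image per event.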