dcnum.feat.feat_background.bg_roll_median
=========================================

.. py:module:: dcnum.feat.feat_background.bg_roll_median


Attributes
----------

.. autoapisummary::

   dcnum.feat.feat_background.bg_roll_median.ndi


Classes
-------

.. autoapisummary::

   dcnum.feat.feat_background.bg_roll_median.BackgroundRollMed
   dcnum.feat.feat_background.bg_roll_median.WorkerRollMed


Functions
---------

.. autoapisummary::

   dcnum.feat.feat_background.bg_roll_median.compute_median_for_slice


Module Contents
---------------

.. py:data:: ndi

.. py:class:: BackgroundRollMed(input_data, output_path, kernel_size=100, batch_size=10000, compress=True, num_cpus=None)

   Bases: :py:obj:`dcnum.feat.feat_background.base.Background`


   Rolling median RT-DC background image computation

   1. There is one big shared array `shared_input` that contains the
      image data for each batch.
   2. The user specifies the batch size (default is 10000) and the
      kernel size (default is 100).
   3. There is a second shared array `shared_output` that contains the
      median values corresponding to the data in `shared_input`.
   4. Background computation is done by copying the input images from a
      file into the shared array.
   5. The input array is split into slices, and workers compute the
      rolling median for each point in `shared_input`.

   :param input_data: The input data can be either a path to an HDF5 file with the "events/image" dataset or an array-like object that behaves like an image stack (the first axis enumerates events).
   :type input_data: array-like or pathlib.Path
   :param output_path: Path to the output file. If `input_data` is a path, you can set `output_path` to the same path to write directly to the input file. The data are written to the "events/image_bg" dataset in the output file.
   :type output_path: pathlib.Path
   :param kernel_size: Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
   :type kernel_size: int
   :param batch_size: Number of events to process at the same time. Increasing this number beyond roughly two orders of magnitude above ``kernel_size`` will not increase computation speed. Larger values lead to higher memory consumption.
   :type batch_size: int
   :param compress: Whether to compress background data. Set this to False for faster processing.
   :type compress: bool
   :param num_cpus: Number of CPUs to use for median computation. Defaults to `dcnum.common.cpu_count()`.
   :type num_cpus: int


   .. py:attribute:: kernel_size
      :value: 100

      kernel size used for median filtering


   .. py:attribute:: batch_size
      :value: 10000

      number of events processed at once


   .. py:attribute:: shared_input_raw

      mp.RawArray for temporary batch input data


   .. py:attribute:: shared_output_raw

      mp.RawArray for temporary batch output data


   .. py:attribute:: shared_input

      numpy array: reshaped view on `self.shared_input_raw`. The first axis enumerates the events.


   .. py:attribute:: shared_output

      numpy array: reshaped view on `self.shared_output_raw`. The first axis enumerates the events.


   .. py:attribute:: current_batch
      :value: 0

      current batch index (see `self.process` and `process_next_batch`)


   .. py:attribute:: worker_counter

      counter tracking the progress of the workers


   .. py:attribute:: queue

      queue for median computation jobs


   .. py:attribute:: workers

      list of workers (processes)


   .. py:method:: __enter__()


   .. py:method:: __exit__(type, value, tb)


   .. py:method:: check_user_kwargs(*, kernel_size: int = 100, batch_size: int = 10000)
      :staticmethod:

      Check user-defined properties of this class

      This method primarily exists so that the CLI knows which keyword arguments can be passed to this class.

      :param kernel_size: Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
      :type kernel_size: int
      :param batch_size: Number of events to process at the same time. Increasing this number beyond roughly two orders of magnitude above ``kernel_size`` will not increase computation speed. Larger values lead to higher memory consumption.
      :type batch_size: int


   .. py:method:: get_slices_for_batch(batch_index=0)

      Return slices for reading the input and writing to the output

      The input slice is `self.kernel_size` longer than the output slice.


   .. py:method:: map_iterator()

      Iterate over arguments for `compute_median_for_slice`


   .. py:method:: process_approach()

      Perform median computation on entire input data


   .. py:method:: process_next_batch()

      Process one batch of input data


.. py:class:: WorkerRollMed(job_queue, counter, shared_input, shared_output, batch_size, kernel_size, *args, **kwargs)

   Bases: :py:obj:`dcnum.feat.feat_background.base.mp_spawn.Process`


   Worker process for median computation


   .. py:attribute:: queue


   .. py:attribute:: counter


   .. py:attribute:: shared_input_raw


   .. py:attribute:: shared_output_raw


   .. py:attribute:: batch_size


   .. py:attribute:: kernel_size


   .. py:method:: run()

      Main loop of worker process (breaks when `self.counter` < 0)


   .. py:method:: start()

      Start child process


.. py:function:: compute_median_for_slice(shared_input, shared_output, kernel_size, output_size, job_slice)

   Compute the rolling median for a slice of a shared array

   :param shared_input: Input data for which to compute the median. For each pixel in the original image, ``batch_size + kernel_size`` events are stored in this array one after another in a row. The total size of this array is ``(batch_size + kernel_size) * number_of_pixels_in_the_image``.
   :type shared_input: multiprocessing.RawArray
   :param shared_output: Used for storing the result. Note that the last `kernel_size` elements for each pixel in this output array are junk data (because it is a rolling median).
   :type shared_output: multiprocessing.RawArray
   :param kernel_size: Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
   :type kernel_size: int
   :param output_size: The partial batch size, i.e. the number of events for which to compute the rolling median. Note that ``output_size + kernel_size`` events are taken from `shared_input`.
   :type output_size: int
   :param job_slice: Now this is the important part. We can write to `shared_input` and `shared_output` from multiple processes. This slice tells us which part of the data we are working on. Only this slice will be edited in `shared_output`. This slice defines how many pixels we are looking at.
   :type job_slice: slice
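
The per-pixel rolling median described for ``compute_median_for_slice`` can be illustrated with a small NumPy-only sketch. This is not the dcnum implementation: the function name ``rolling_median_sketch``, the use of plain 2D arrays in place of ``multiprocessing.RawArray`` buffers, and the forward-looking window alignment are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_median_sketch(inp, out, kernel_size, output_size, job_slice):
    """Hypothetical stand-in for a rolling-median job (not dcnum code).

    `inp` and `out` are 2D arrays of shape (n_pixels, n_events); each row
    holds the event series of one pixel. Only rows in `job_slice` are
    written in `out`.
    """
    # Take output_size + kernel_size events for the pixels of this job.
    series = inp[job_slice, :output_size + kernel_size]
    # Build windows of length kernel_size along the event axis;
    # resulting shape: (n_pixels_in_job, output_size + 1, kernel_size).
    windows = sliding_window_view(series, kernel_size, axis=1)
    # Median of each window; keep only the first output_size positions.
    out[job_slice, :output_size] = np.median(
        windows[:, :output_size], axis=-1)


kernel_size, output_size = 3, 4
inp = np.array([[1, 5, 3, 9, 7, 2, 8],
                [10, 10, 10, 10, 10, 10, 10]], dtype=np.uint8)
out = np.zeros_like(inp)

# Process only the first pixel; the second row of `out` stays untouched.
rolling_median_sketch(inp, out, kernel_size, output_size, slice(0, 1))
print(out)
```

Because each job writes only to its own ``job_slice`` rows, several such jobs could run on disjoint pixel ranges of the same arrays without overlapping writes, which is the property the ``job_slice`` parameter description relies on.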