dcnum.feat.feat_background.bg_roll_median

Attributes

ndi

Classes

BackgroundRollMed

Rolling median RT-DC background image computation

WorkerRollMed

Worker process for median computation

Functions

compute_median_for_slice(shared_input, shared_output, ...)

Compute the rolling median for a slice of a shared array

Module Contents

dcnum.feat.feat_background.bg_roll_median.ndi

class dcnum.feat.feat_background.bg_roll_median.BackgroundRollMed(input_data, output_path, kernel_size=100, batch_size=10000, compress=True, num_cpus=None)

Bases: dcnum.feat.feat_background.base.Background

Rolling median RT-DC background image computation

  1. There is one big shared array shared_input that contains the image data for each batch.

  2. The user specifies the batch size (default is 10000) and the kernel size (default is 100).

  3. There is a second shared array shared_output that contains the median values corresponding to the data in shared_input.

  4. Background computation starts by copying the input images of the current batch from the input file or array into shared_input.

  5. The input array is split into slices, and workers compute the rolling median for each point in shared_input.
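The rolling median itself can be illustrated without shared arrays, batching, or worker processes. The following is a minimal single-process sketch, not the dcnum implementation: the function name rolling_median_background is hypothetical, and it assumes a forward-looking window (the median for event i is taken over events i to i + kernel_size - 1), which is consistent with the input needing kernel_size more events than the output.

```python
import numpy as np


def rolling_median_background(images, kernel_size):
    """Per-pixel median over a forward window of `kernel_size` events.

    `images` has shape (n_events, height, width); the first axis
    enumerates events. The result has n_events - kernel_size + 1 entries.
    """
    # Shape: (n_events - kernel_size + 1, height, width, kernel_size)
    windows = np.lib.stride_tricks.sliding_window_view(
        images, kernel_size, axis=0)
    # Collapse the window axis to obtain one background image per event
    return np.median(windows, axis=-1)


# Hypothetical data: constant background with one bright frame
images = np.full((7, 2, 2), 10, dtype=np.uint8)
images[3] = 200
background = rolling_median_background(images, kernel_size=3)
```

Because the bright frame occupies only one position in each window, the median suppresses it and the background stays at the constant level.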

Parameters:
  • input_data (array-like or pathlib.Path) – The input data can be either a path to an HDF5 file with the “events/image” dataset or an array-like object that behaves like an image stack (first axis enumerates events).

  • output_path (pathlib.Path) – Path to the output file. If input_data is a path, you can set output_path to the same path to write directly to the input file. The data are written in the “events/image_bg” dataset in the output file.

  • kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.

  • batch_size (int) – Number of events to process at the same time. Increasing this number beyond roughly two orders of magnitude above kernel_size will not increase computation speed. Larger values lead to higher memory consumption.

  • compress (bool) – Whether to compress background data. Set this to False for faster processing.

  • num_cpus (int) – Number of CPUs to use for median computation. Defaults to dcnum.common.cpu_count().
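To get a feeling for how batch_size affects memory consumption, the size of one shared array can be estimated. This is a rough sketch under stated assumptions: the helper shared_array_bytes is hypothetical, pixels are assumed to be 8-bit, each shared array is assumed to hold batch_size + kernel_size images, and the image shape (80, 320) is only an example value.

```python
def shared_array_bytes(batch_size, kernel_size, image_shape,
                       bytes_per_pixel=1):
    """Estimate the size of one shared array in bytes.

    Assumes the array holds batch_size + kernel_size images
    with `bytes_per_pixel` bytes per pixel (1 for 8-bit images).
    """
    height, width = image_shape
    return (batch_size + kernel_size) * height * width * bytes_per_pixel


# Default parameters with a hypothetical 80 x 320 pixel image
estimate = shared_array_bytes(batch_size=10000, kernel_size=100,
                              image_shape=(80, 320))
```

If the output array has the same size as the input array, the total shared memory is about twice this estimate.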

kernel_size = 100

kernel size used for median filtering

batch_size = 10000

number of events processed at once

shared_input_raw

mp.RawArray for temporary batch input data

shared_output_raw

mp.RawArray for temporary batch output data

shared_input

numpy array: reshaped view on self.shared_input_raw. The first axis enumerates the events.

shared_output

numpy array: reshaped view on self.shared_output_raw. The first axis enumerates the events.
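The relation between a raw shared array and its reshaped numpy view can be sketched as follows. The sizes and variable names are hypothetical, and this shows only one way to create such a view; the class may construct its views differently.

```python
import multiprocessing as mp

import numpy as np

n_events, height, width = 6, 2, 3  # hypothetical sizes

# Flat shared buffer of unsigned bytes (no lock, zero-initialized)
raw = mp.RawArray("B", n_events * height * width)

# Reshaped view on the same memory; no data are copied
view = np.ctypeslib.as_array(raw).reshape(n_events, height, width)

# Writing through the view modifies the shared buffer
view[2] = 7
```

Since the view shares memory with the raw array, anything written through the view is visible to any process that holds the raw array.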

current_batch = 0

current batch index (see self.process and process_next_batch)

worker_counter

counter tracking progress of workers

queue

queue for median computation jobs

workers

list of workers (processes)

__enter__()

__exit__(type, value, tb)

static check_user_kwargs(*, kernel_size: int = 100, batch_size: int = 10000)

Check user-defined properties of this class

This method primarily exists so that the CLI knows which keyword arguments can be passed to this class.

Parameters:
  • kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.

  • batch_size (int) – Number of events to process at the same time. Increasing this number beyond roughly two orders of magnitude above kernel_size will not increase computation speed. Larger values lead to higher memory consumption.

get_slices_for_batch(batch_index=0)

Returns slices for getting the input and writing to output

The input slice is self.kernel_size events longer than the output slice.
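The relation between the two slices might look like the following sketch. The helper slices_for_batch is hypothetical and not the method's actual code; it assumes both slices start at the same event and ignores the end of the dataset, where fewer events are available.

```python
def slices_for_batch(batch_index, batch_size, kernel_size):
    """Return (input_slice, output_slice) along the event axis.

    Assumption: both slices start at the same event; the input slice
    extends kernel_size events further so that the last output event
    still has a complete median window.
    """
    start = batch_index * batch_size
    stop = start + batch_size
    input_slice = slice(start, stop + kernel_size)
    output_slice = slice(start, stop)
    return input_slice, output_slice


# Second batch with the default batch and kernel sizes
input_slice, output_slice = slices_for_batch(
    batch_index=1, batch_size=10000, kernel_size=100)
```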

map_iterator()

Iterates over arguments for compute_median_for_slice
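One way to divide the work is to assign each job a contiguous range of pixels. The generator below is a hypothetical sketch of that idea; the actual iterator yields complete argument tuples for compute_median_for_slice and may choose its slice boundaries differently.

```python
def iter_job_slices(num_pixels, num_jobs):
    """Yield contiguous pixel slices that together cover all pixels."""
    step = -(-num_pixels // num_jobs)  # ceiling division
    for start in range(0, num_pixels, step):
        yield slice(start, min(start + step, num_pixels))


# Ten pixels distributed over three jobs
job_slices = list(iter_job_slices(num_pixels=10, num_jobs=3))
```

The last slice may be shorter than the others when the number of pixels is not a multiple of the number of jobs.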

process_approach()

Perform median computation on entire input data

process_next_batch()

Process one batch of input data

class dcnum.feat.feat_background.bg_roll_median.WorkerRollMed(job_queue, counter, shared_input, shared_output, batch_size, kernel_size, *args, **kwargs)

Bases: dcnum.feat.feat_background.base.mp_spawn.Process

Worker process for median computation

queue
counter
shared_input_raw
shared_output_raw
batch_size
kernel_size

run()

Main loop of worker process (breaks when self.counter < 0)
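The general pattern of a job queue combined with a counter that doubles as a stop signal can be sketched as follows. This is an illustration under several assumptions, not the worker's actual code: it uses threads instead of processes to stay self-contained, the Counter class is a hypothetical stand-in for a shared integer, the increment-per-job semantics are assumed, and summing numbers is a placeholder for the median computation.

```python
import queue
import threading
import time


class Counter:
    """Stand-in for a shared integer such as multiprocessing.Value."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()


def worker_loop(job_queue, counter, results):
    # Keep polling for jobs until the counter is set to a negative value
    while counter.value >= 0:
        try:
            job = job_queue.get(timeout=0.05)
        except queue.Empty:
            continue
        results.append(sum(job))  # placeholder for the real computation
        with counter.lock:
            counter.value += 1  # report one finished job


job_queue = queue.Queue()
counter = Counter()
results = []
workers = [threading.Thread(target=worker_loop, daemon=True,
                            args=(job_queue, counter, results))
           for _ in range(2)]
for worker in workers:
    worker.start()

jobs = [[1, 2], [3, 4], [5, 6]]
for job in jobs:
    job_queue.put(job)

# Wait until all jobs are reported as finished (with a safety timeout)
deadline = time.monotonic() + 5.0
while counter.value < len(jobs) and time.monotonic() < deadline:
    time.sleep(0.01)

# A negative counter tells the workers to leave their main loop
with counter.lock:
    counter.value = -1
for worker in workers:
    worker.join(timeout=1.0)
```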

start()

Start child process

dcnum.feat.feat_background.bg_roll_median.compute_median_for_slice(shared_input, shared_output, kernel_size, output_size, job_slice)

Compute the rolling median for a slice of a shared array

Parameters:
  • shared_input (multiprocessing.RawArray) – Input data for which to compute the median. For each pixel in the original image, batch_size + kernel_size events are stored in this array one after another in a row. The total size of this array is (batch_size + kernel_size) * number_of_pixels_in_the_image.

  • shared_output (multiprocessing.RawArray) – Used for storing the result. Note that the last kernel_size elements for each pixel in this output array are junk data (because it is a rolling median).

  • kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.

  • output_size (int) – The partial batch size, i.e. the number of events for which to compute the rolling median. Note that output_size + kernel_size events are taken from shared_input.

  • job_slice (slice) – Selects the part of the data that this call works on. Multiple processes can write to shared_input and shared_output, so each job is restricted to its own slice: only this slice is edited in shared_output. The slice defines which pixels, and therefore how many, are processed.
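The parameters above can be tied together in a small pure-Python sketch. It is not the library's implementation: the function median_for_slice is hypothetical, the flat layout (one row of output_size + kernel_size events per pixel) follows the description of shared_input, a forward-looking window is assumed, and statistics.median_low is used only so that the result stays an integer.

```python
import multiprocessing as mp
import statistics


def median_for_slice(shared_input, shared_output, kernel_size,
                     output_size, job_slice):
    """Rolling median for the pixels selected by `job_slice`."""
    stride = output_size + kernel_size  # events stored per pixel
    num_pixels = len(shared_input) // stride
    for pixel in range(*job_slice.indices(num_pixels)):
        offset = pixel * stride
        series = shared_input[offset:offset + stride]
        for i in range(output_size):
            window = series[i:i + kernel_size]
            # Only the first output_size entries per pixel are written;
            # the trailing kernel_size entries stay untouched (junk)
            shared_output[offset + i] = statistics.median_low(window)


kernel_size, output_size = 3, 4
shared_input = mp.RawArray("B", [10, 10, 200, 10, 10, 10, 10,    # pixel 0
                                 50, 50, 50, 50, 255, 50, 50])   # pixel 1
shared_output = mp.RawArray("B", len(shared_input))

# This job handles pixel 0 only; pixel 1 is left to another job
median_for_slice(shared_input, shared_output, kernel_size,
                 output_size, job_slice=slice(0, 1))
```

Because each job writes only to the rows of its own pixels, several jobs can operate on the same shared arrays without overwriting each other's results.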