dcnum.feat.feat_background.bg_roll_median
Attributes
ndi
Classes
BackgroundRollMed | Rolling median RT-DC background image computation
WorkerRollMed | Worker process for median computation
Functions
compute_median_for_slice | Compute the rolling median for a slice of a shared array
Module Contents
- dcnum.feat.feat_background.bg_roll_median.ndi
- class dcnum.feat.feat_background.bg_roll_median.BackgroundRollMed(input_data, output_path, kernel_size=100, batch_size=10000, compress=True, num_cpus=None)[source]
Bases: dcnum.feat.feat_background.base.Background
Rolling median RT-DC background image computation
There is one big shared array shared_input that contains the image data for each batch.
The user specifies the batch size (default is 10000) and the kernel size (default is 100).
There is a second shared array shared_output that contains the median values corresponding to the data in shared_input.
Background computation is done by copying the input images from a file into the shared array.
The input array is split into slices, and the workers compute the rolling median for each point in shared_input.
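To make the term "rolling median" concrete, here is a minimal NumPy sketch for an in-memory image stack. It is not the dcnum implementation (no shared arrays, batches, or workers). It assumes a forward-looking window, i.e. output event i is the per-pixel median of input events i to i + kernel_size - 1, and the function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_median_stack(images, kernel_size):
    """Per-pixel rolling median along the event axis (axis 0).

    Assumption: forward-looking window; only complete windows are
    returned, so the result has n_events - kernel_size + 1 images.
    """
    images = np.asarray(images)
    # Shape: (n_events - kernel_size + 1, height, width, kernel_size)
    windows = sliding_window_view(images, kernel_size, axis=0)
    return np.median(windows, axis=-1)


# Five 2x2 frames whose pixel values equal the event index
stack = np.arange(5, dtype=float)[:, None, None] * np.ones((1, 2, 2))
background = rolling_median_stack(stack, kernel_size=3)
print(background.shape)
```

A stack of five frames with a kernel size of 3 yields three background images, one per complete window.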
- Parameters:
input_data (array-like or pathlib.Path) – The input data can be either a path to an HDF5 file with the “events/image” dataset or an array-like object that behaves like an image stack (first axis enumerates events).
output_path (pathlib.Path) – Path to the output file. If input_data is a path, you can set output_path to the same path to write directly to the input file. The data are written in the “events/image_bg” dataset in the output file.
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
batch_size (int) – Number of events to process at the same time. Increasing this number to much more than two orders of magnitude above kernel_size will not increase computation speed. Larger values lead to higher memory consumption.
compress (bool) – Whether to compress background data. Set this to False for faster processing.
num_cpus (int) – Number of CPUs to use for median computation. Defaults to dcnum.common.cpu_count().
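The memory cost of a larger batch_size can be estimated with simple arithmetic. The sketch below is an assumption-laden estimate, not a dcnum function: it assumes 8-bit images, an input buffer holding batch_size + kernel_size events, and an output buffer holding batch_size events. The function name and the 80x320 image shape are only illustrative.

```python
import math


def estimate_shared_bytes(image_shape, batch_size=10000, kernel_size=100,
                          itemsize=1):
    """Rough size in bytes of the two shared buffers.

    Assumptions: `itemsize` bytes per pixel (1 for uint8), input buffer
    with batch_size + kernel_size events, output buffer with batch_size
    events.
    """
    n_pixels = math.prod(image_shape)
    input_bytes = (batch_size + kernel_size) * n_pixels * itemsize
    output_bytes = batch_size * n_pixels * itemsize
    return input_bytes, output_bytes


inp, out = estimate_shared_bytes((80, 320))
print(f"input: {inp / 1e6:.1f} MB, output: {out / 1e6:.1f} MB")
# prints "input: 258.6 MB, output: 256.0 MB"
```

Under these assumptions, memory grows linearly with batch_size, which is why very large batches cost memory without a matching gain in speed.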
- kernel_size = 100
kernel size used for median filtering
- batch_size = 10000
number of events processed at once
- shared_input_raw
mp.RawArray for temporary batch input data
- shared_output_raw
mp.RawArray for temporary batch output data
- shared_input
numpy array reshaped view on self.shared_input_raw. The first axis enumerates the events.
- shared_output
numpy array reshaped view on self.shared_output_raw. The first axis enumerates the events.
- current_batch = 0
current batch index (see self.process and process_next_batch)
- worker_counter
counter tracking progress of workers
- queue
queue for median computation jobs
- workers
list of workers (processes)
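The pairing of a raw shared buffer with a reshaped numpy view (as described for shared_input_raw and shared_output_raw above) can be illustrated with the standard library and NumPy. This is a generic sketch under the assumption of uint8 image data; the helper name is hypothetical and the code is not taken from dcnum.

```python
import ctypes
import multiprocessing as mp

import numpy as np


def make_shared_stack(n_events, image_shape):
    """Allocate a RawArray and return it with a numpy view on it.

    The view shares memory with the RawArray; its first axis
    enumerates the events.
    """
    n_pixels = int(np.prod(image_shape))
    raw = mp.RawArray(ctypes.c_uint8, n_events * n_pixels)
    view = np.frombuffer(raw, dtype=np.uint8).reshape(n_events, *image_shape)
    return raw, view


raw, view = make_shared_stack(n_events=4, image_shape=(2, 3))
view[1] = 255  # writing through the view changes the shared buffer
print(raw[5], raw[6])
```

Because the view does not copy data, worker processes that receive the RawArray can build their own view and read or write the same memory.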
- static check_user_kwargs(*, kernel_size: int = 100, batch_size: int = 10000)[source]
Check user-defined properties of this class
This method primarily exists so that the CLI knows which keyword arguments can be passed to this class.
- Parameters:
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
batch_size (int) – Number of events to process at the same time. Increasing this number to much more than two orders of magnitude above kernel_size will not increase computation speed. Larger values lead to higher memory consumption.
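One way a command-line interface could discover the accepted keyword arguments from such a static method is signature introspection. The sketch below uses a hypothetical stand-in class rather than the dcnum class, and the validation inside it is an illustrative assumption, not the library's actual checks.

```python
import inspect


class ExampleBackground:
    """Hypothetical stand-in for a background computation class."""

    @staticmethod
    def check_user_kwargs(*, kernel_size: int = 100,
                          batch_size: int = 10000):
        # Illustrative validation only (assumed, not from dcnum)
        if kernel_size <= 0 or batch_size <= 0:
            raise ValueError("kernel_size and batch_size must be positive")


def user_kwargs_with_defaults(cls):
    """Map each keyword-only argument of check_user_kwargs to its default."""
    signature = inspect.signature(cls.check_user_kwargs)
    return {
        name: param.default
        for name, param in signature.parameters.items()
        if param.kind is inspect.Parameter.KEYWORD_ONLY
    }


print(user_kwargs_with_defaults(ExampleBackground))
```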
- class dcnum.feat.feat_background.bg_roll_median.WorkerRollMed(job_queue, counter, shared_input, shared_output, batch_size, kernel_size, *args, **kwargs)[source]
Bases: dcnum.feat.feat_background.base.mp_spawn.Process
Worker process for median computation
- queue
- counter
- batch_size
- kernel_size
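The interplay of a job queue and a progress counter can be sketched as a simple loop. This is a generic pattern under stated assumptions, not the WorkerRollMed code: the function name, the timeout-based exit, and the handle_job callback are hypothetical, and the demo runs in a single process for simplicity.

```python
import multiprocessing as mp
import queue


def drain_jobs(job_queue, counter, handle_job, timeout=0.1):
    """Take jobs from the queue until it stays empty for `timeout` seconds.

    After each job, the shared counter is incremented so that the parent
    can track how many jobs have finished.
    """
    while True:
        try:
            job = job_queue.get(timeout=timeout)
        except queue.Empty:
            break
        handle_job(job)
        with counter.get_lock():
            counter.value += 1


# In-process demo: each job is a slice of pixel indices
jobs = queue.Queue()
for start in range(0, 12, 4):
    jobs.put(slice(start, start + 4))
done = mp.Value("i", 0)
handled = []
drain_jobs(jobs, done, handled.append)
print(done.value, handled)
```

In a multi-process setup, the same function could serve as the body of a process's run method, with a multiprocessing queue in place of queue.Queue.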
- dcnum.feat.feat_background.bg_roll_median.compute_median_for_slice(shared_input, shared_output, kernel_size, output_size, job_slice)[source]
Compute the rolling median for a slice of a shared array
- Parameters:
shared_input (multiprocessing.RawArray) – Input data for which to compute the median. For each pixel in the original image, batch_size + kernel_size events are stored in this array one after another in a row. The total size of this array is (batch_size + kernel_size) * number_of_pixels_in_the_image.
shared_output (multiprocessing.RawArray) – Used for storing the result. Note that the last kernel_size elements for each pixel in this output array are junk data (because it is a rolling median).
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
output_size (int) – The partial batch size, i.e. the number of events for which to compute the rolling median. Note that output_size + kernel_size events are taken from shared_input.
job_slice (slice) – This is the important part. Multiple processes can write to shared_input and shared_output at the same time, and this slice tells us which part of the data we are working on. Only this slice will be edited in shared_output. The slice defines how many pixels we are looking at.
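The parameter descriptions above can be turned into a small sketch that works on flat buffers. It is an illustration under assumptions, not the dcnum function: the buffers are plain 1-D numpy arrays (which could be views on RawArrays), each pixel occupies output_size + kernel_size consecutive elements in both buffers, the window is forward-looking, and the function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_median_slice(flat_input, flat_output, kernel_size,
                         output_size, job_slice):
    """Rolling median for the pixels selected by `job_slice`.

    Assumed layout: one row of output_size + kernel_size events per pixel.
    Only the first output_size elements of each selected output row are
    written; the trailing kernel_size elements are left untouched.
    """
    per_pixel = output_size + kernel_size
    inp = np.asarray(flat_input).reshape(-1, per_pixel)
    out = np.asarray(flat_output).reshape(-1, per_pixel)
    # Shape: (n_selected_pixels, output_size + 1, kernel_size)
    windows = sliding_window_view(inp[job_slice], kernel_size, axis=1)
    out[job_slice, :output_size] = np.median(
        windows[:, :output_size], axis=-1)


n_pixels, output_size, kernel_size = 4, 3, 3
per_pixel = output_size + kernel_size
# Pixel p holds the event values 10*p, 10*p + 1, ...
flat_in = (np.arange(per_pixel)[None, :]
           + 10 * np.arange(n_pixels)[:, None]).astype(np.uint8).ravel()
flat_out = np.zeros(n_pixels * per_pixel, dtype=np.uint8)
rolling_median_slice(flat_in, flat_out, kernel_size, output_size,
                     slice(1, 3))
print(flat_out.reshape(n_pixels, per_pixel))
```

Since each call writes only to its own pixel rows, several workers can process disjoint slices of the same output buffer without locking.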