dcnum.feat.feat_background
Feature computation: background image data from image data
Submodules
Classes
Background | Base class for background computation
BackgroundCopy | Copy the input background data to the output file
BackgroundRollMed | Rolling median RT-DC background image computation
BackgroundSparseMed | Sparse median background correction with cleansing
Functions
get_available_background_methods() | Return dictionary of background computation methods
Package Contents
- class dcnum.feat.feat_background.Background(input_data, output_path, compress=True, num_cpus=None, **kwargs)[source]
Bases: abc.ABC
Base class for background computation
- Parameters:
input_data (array-like or pathlib.Path) – The input data can be either a path to an HDF5 file with the “events/image” dataset or an array-like object that behaves like an image stack (first axis enumerates events).
output_path (pathlib.Path) – Path to the output file. If input_data is a path, you can set output_path to the same path to write directly to the input file. The data are written in the “events/image_bg” dataset in the output file.
compress (bool) – Whether to compress background data. Set this to False for faster processing.
num_cpus (int) – Number of CPUs to use for median computation. Defaults to dcnum.common.cpu_count().
kwargs – Additional keyword arguments passed to the subclass.
- logger
- output_path
- kwargs
background keyword arguments
- num_cpus = None
number of CPUs used
- image_proc
fraction of images that have been processed
- hdin = None
HDF5Data instance for input data
- h5in = None
input h5py.File
- h5out = None
output h5py.File
- paths_ref = []
reference paths for logging to the output .rtdc file
- image_shape
shape of event images
- image_count
number of images in the input data
- writer
- get_ppid()[source]
Return a unique background pipeline identifier
The pipeline identifier is universally applicable and must be backwards-compatible (future versions of dcnum will correctly acknowledge the ID).
The background pipeline ID is defined as:
KEY:KW_BACKGROUND
Where KEY is e.g. “sparsemed” or “rollmed”, and KW_BACKGROUND is a list of keyword arguments for check_user_kwargs, e.g.:
kernel_size=100^batch_size=10000
which may be abbreviated to:
k=100^b=10000
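The identifier layout described above (a method key, a colon, then `^`-separated `key=value` pairs) can be sketched in a few lines. The helpers `make_ppid` and `split_ppid` below are hypothetical and not part of dcnum; the actual `get_ppid_from_ppkw` and `get_ppkw_from_ppid` methods presumably also handle abbreviated keys and type conversion, which this sketch leaves out.

```python
def make_ppid(key, kwargs):
    """Join a method key and keyword arguments into 'KEY:k1=v1^k2=v2'."""
    kw_part = "^".join(f"{name}={value}" for name, value in kwargs.items())
    return f"{key}:{kw_part}"


def split_ppid(ppid):
    """Split 'KEY:k1=v1^k2=v2' into the key and a dict of string values."""
    key, kw_part = ppid.split(":", 1)
    kwargs = dict(item.split("=", 1) for item in kw_part.split("^"))
    return key, kwargs


# Hypothetical round trip; values come back as strings in this sketch.
ppid = make_ppid("rollmed", {"kernel_size": 100, "batch_size": 10000})
key, kwargs = split_ppid(ppid)
```

Keeping the keyword arguments in a plain text form like this is what allows a pipeline step to be reconstructed from the identifier alone.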
- classmethod get_ppid_from_ppkw(kwargs)[source]
Return the PPID based on given keyword arguments for a subclass
- static get_ppkw_from_ppid(bg_ppid)[source]
Return keyword arguments for any subclass from a PPID string
- dcnum.feat.feat_background.get_available_background_methods()[source]
Return dictionary of background computation methods
- class dcnum.feat.feat_background.BackgroundCopy(*args, **kwargs)[source]
Bases: dcnum.feat.feat_background.base.Background
Copy the input background data to the output file
- class dcnum.feat.feat_background.BackgroundRollMed(input_data, output_path, kernel_size=100, batch_size=10000, compress=True, num_cpus=None)[source]
Bases: dcnum.feat.feat_background.base.Background
Rolling median RT-DC background image computation
There is one big shared array shared_input that contains the image data for each batch.
The user specifies the batch size (default is 10000) and the kernel size (default is 100).
There is a second shared array shared_output that contains the median values corresponding to the data in shared_input.
Background computation is done by copying the input images from a file into the shared array.
The input array is split up, and workers compute the rolling median for each point in shared_input.
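The idea of a per-pixel rolling median can be illustrated without shared memory or worker processes. The function `rolling_median_background` below is a hypothetical, single-process sketch. It assumes that the window covers the current frame and up to `kernel_size - 1` preceding frames; the window alignment used by dcnum may differ.

```python
from statistics import median


def rolling_median_background(frames, kernel_size):
    """Return one background image per frame (per-pixel rolling median).

    `frames` is a list of 2D images given as lists of rows.
    Assumption for this sketch: the window is the current frame plus
    up to `kernel_size - 1` preceding frames.
    """
    height, width = len(frames[0]), len(frames[0][0])
    backgrounds = []
    for idx in range(len(frames)):
        window = frames[max(0, idx - kernel_size + 1): idx + 1]
        bg = [[median(frame[y][x] for frame in window)
               for x in range(width)]
              for y in range(height)]
        backgrounds.append(bg)
    return backgrounds


# Five 1x2 frames; an "event" brightens the first pixel in frame 2 only.
frames = [[[10, 20]], [[10, 20]], [[90, 20]], [[10, 20]], [[10, 20]]]
backgrounds = rolling_median_background(frames, kernel_size=3)
```

Because the median ignores outliers, the single bright pixel in frame 2 does not show up in any of the background images.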
- Parameters:
input_data (array-like or pathlib.Path) – The input data can be either a path to an HDF5 file with the “events/image” dataset or an array-like object that behaves like an image stack (first axis enumerates events).
output_path (pathlib.Path) – Path to the output file. If input_data is a path, you can set output_path to the same path to write directly to the input file. The data are written in the “events/image_bg” dataset in the output file.
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
batch_size (int) – Number of events to process at the same time. Increasing this number to much more than two orders of magnitude above kernel_size will not increase computation speed. Larger values lead to a higher memory consumption.
compress (bool) – Whether to compress background data. Set this to False for faster processing.
num_cpus (int) – Number of CPUs to use for median computation. Defaults to dcnum.common.cpu_count().
- kernel_size = 100
kernel size used for median filtering
- batch_size = 10000
number of events processed at once
- shared_input_raw
mp.RawArray for temporary batch input data
- shared_output_raw
mp.RawArray for temporary batch output data
- shared_input
numpy array reshaped view on self.shared_input_raw. The first axis enumerates the events.
- shared_output
numpy array reshaped view on self.shared_output_raw. The first axis enumerates the events.
- current_batch = 0
current batch index (see self.process and process_next_batch)
- worker_counter
counter tracking the progress of the workers
- queue
queue for median computation jobs
- workers
list of workers (processes)
- static check_user_kwargs(*, kernel_size: int = 100, batch_size: int = 10000)[source]
Check user-defined properties of this class
This method primarily exists so that the CLI knows which keyword arguments can be passed to this class.
- Parameters:
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
batch_size (int) – Number of events to process at the same time. Increasing this number to much more than two orders of magnitude above kernel_size will not increase computation speed. Larger values lead to a higher memory consumption.
- class dcnum.feat.feat_background.BackgroundSparseMed(input_data, output_path, kernel_size=200, split_time=1.0, thresh_cleansing=0, frac_cleansing=0.8, offset_correction=True, compress=True, num_cpus=None)[source]
Bases: dcnum.feat.feat_background.base.Background
Sparse median background correction with cleansing
In contrast to the rolling median background correction, this algorithm only computes the background image every split_time seconds, but with a larger window (default kernel size is 200 frames instead of 100 frames).
At time stamps every split_time seconds, a background image is computed, resulting in a background series.
Cleansing: The background series is checked for images that contain event data using a lengthy algorithm that is documented in the source code (sorry). In short, this gets rid of background images that contain streaks of RBCs.
Each frame gets the background image closest to it based on time from the background series.
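The last step, assigning each frame the temporally closest image of the background series, can be sketched with a binary search. The function `nearest_background_index` is hypothetical. The sketch assumes that the time stamps of the background series are available as a sorted list (named `step_times` here after the attribute of the same name, whose exact contents are an assumption) and that ties go to the earlier background image.

```python
from bisect import bisect_left


def nearest_background_index(frame_time, step_times):
    """Return the index of the entry in sorted `step_times` closest to `frame_time`."""
    pos = bisect_left(step_times, frame_time)
    if pos == 0:
        return 0
    if pos == len(step_times):
        return len(step_times) - 1
    before, after = step_times[pos - 1], step_times[pos]
    # On a tie, this sketch picks the earlier background image.
    return pos - 1 if frame_time - before <= after - frame_time else pos


step_times = [0.0, 1.0, 2.0]  # background series with split_time = 1.0
frame_times = [0.1, 0.6, 1.4, 2.7]
indices = [nearest_background_index(t, step_times) for t in frame_times]
```

Since many frames map to the same index, only the short background series and a per-frame index need to be stored, which is consistent with the compact storage mentioned for this method.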
- Parameters:
input_data (array-like or pathlib.Path) – The input data can be either a path to an HDF5 file with the “events/image” dataset or an array-like object that behaves like an image stack (first axis enumerates events).
output_path (pathlib.Path) – Path to the output file. If input_data is a path, you can set output_path to the same path to write directly to the input file. The data are written in the “events/image_bg” dataset in the output file.
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
split_time (float) – Time between background images in the background series
thresh_cleansing (float) – A positive floating point value for scaling the thresholding operation when excluding background images from the series. Larger values mean more background images are excluded. Set to zero to enforce a fixed fraction via frac_cleansing.
frac_cleansing (float) – Fraction between 0 and 1 indicating how many background images must still be present after cleansing (in case the cleansing factor is too large). Set to 1 to disable cleansing altogether.
offset_correction (bool) – The sparse median background correction produces one median image for multiple input frames (BTW this also leads to very efficient data storage with internal HDF5 basins). In case the input frames are subject to frame-by-frame brightness variations (e.g. flickering of the illumination source), it is useful to have an offset value per frame that can then be used in a later step to perform a more accurate background correction. This offset is computed here by taking a 20px wide slice from each frame (where the channel wall is located) and computing the median therein relative to the computed background image. The data are written to the “bg_off” feature in the output file alongside “image_bg”. To obtain the corrected background image, add “image_bg” and “bg_off”. Set this to False if you don’t need the “bg_off” feature.
compress (bool) – Whether to compress background data. Set this to False for faster processing.
num_cpus (int) – Number of CPUs to use for median computation. Defaults to dcnum.common.cpu_count().
Changed in version 0.23.5: The background image data are stored as an internal mapped basin to reduce the output file size.
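The offset correction described for offset_correction can be illustrated with a toy example. The function `frame_offset` is hypothetical, and the sketch makes two assumptions: the offset is taken as the median of the per-pixel difference between frame and background inside the slice, and the slice is given as a range of rows (two rows here instead of a 20px wide slice). The actual computation in dcnum may differ in both respects.

```python
from statistics import median


def frame_offset(frame, background, slice_rows):
    """Median brightness difference between frame and background in `slice_rows`.

    Assumption for this sketch: the offset is the median of the
    per-pixel difference inside the slice.
    """
    diffs = [f - b
             for y in slice_rows
             for f, b in zip(frame[y], background[y])]
    return median(diffs)


background = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
# A frame that is 3 grey values brighter overall, with a dark object
# in the last row (outside of the slice used for the offset).
frame = [[103, 103, 103], [103, 103, 103], [103, 40, 103]]

bg_off = frame_offset(frame, background, slice_rows=range(0, 2))
# Corrected background image: "image_bg" plus "bg_off"
corrected_bg = [[value + bg_off for value in row] for row in background]
```

Restricting the offset computation to a region without events keeps the object in the last row from biasing the per-frame brightness estimate.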
- kernel_size = 200
kernel size used for median filtering
- split_time = 1.0
time between background images in the background series
- thresh_cleansing = 0
cleansing threshold factor
- frac_cleansing = 0.8
keep at least this fraction of background images from the series
- offset_correction = True
offset/flickering correction
- time = None
- duration
duration of the measurement
- step_times
- bg_images
array containing all background images
- shared_input_raw
mp.RawArray for temporary batch input data
- shared_output_raw
mp.RawArray for the median background image
- shared_input
numpy array reshaped view on self.shared_input_raw. The first axis enumerates the events.
- shared_output
numpy array reshaped view on self.shared_output_raw. The first axis enumerates the events.
- worker_counter
counter tracking the progress of the workers
- queue
queue for median computation jobs
- workers
list of workers (processes)
- static check_user_kwargs(*, kernel_size: int = 200, split_time: float = 1.0, thresh_cleansing: float = 0, frac_cleansing: float = 0.8, offset_correction: bool = True)[source]
Check user-defined properties of this class
This method primarily exists so that the CLI knows which keyword arguments can be passed to this class.
- Parameters:
kernel_size (int) – Kernel size for median computation. This is the number of events that are used to compute the median for each pixel.
split_time (float) – Time between background images in the background series
thresh_cleansing (float) – A positive floating point value for scaling the thresholding operation when excluding background images from the series. Larger values mean more background images are excluded. Set to 0 (default) to enforce a fixed fraction frac_cleansing.
frac_cleansing (float) – Fraction between 0 and 1 indicating how many background images must still be present after cleansing (in case the cleansing factor is too large). Set to 1 to disable cleansing altogether.
offset_correction (bool) – The sparse median background correction produces one median image for multiple input frames (BTW this also leads to very efficient data storage with internal HDF5 basins). In case the input frames are subject to frame-by-frame brightness variations (e.g. flickering of the illumination source), it is useful to have an offset value per frame that can then be used in a later step to perform a more accurate background correction. This offset is computed here by taking a 20px wide slice from each frame (where the channel wall is located) and computing the median therein relative to the computed background image. The data are written to the “bg_off” feature in the output file alongside “image_bg”. To obtain the corrected background image, add “image_bg” and “bg_off”. Set this to False if you don’t need the “bg_off” feature.