The design of dcnum
Submodule Structure
dcnum is a toolset for processing raw deformability cytometry (DC) data. Processing covers reading images, segmenting events, extracting features for each event, and writing the results to an output file.
Each of the individual submodules serves one particular aspect of the pipeline:
| Submodule | Description |
|---|---|
| feat | Feature extraction from segmented image data. |
| logic | Contains the necessary logic (the glue) to combine all the other submodules for processing a dataset. |
| meta | Handles metadata, most importantly the pipeline identifiers (PPIDs). |
| read | For reading raw HDF5 (.rtdc) files. |
| segm | Event segmentation finds objects in an image and returns a binary mask for each object. |
| write | For writing data to HDF5 (.rtdc) files. |
Pipeline sequence
A pipeline (including its PPID) is defined via the
logic.job.DCNumPipelineJob class which represents the recipe for a
pipeline. The pipeline is executed with the logic.ctrl.DCNumJobRunner.
Here is a simple example that runs the default pipeline for an .rtdc file.
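
    from dcnum.logic import DCNumPipelineJob, DCNumJobRunner

    # Define the pipeline recipe for the input file (default settings)
    job = DCNumPipelineJob(path_in="input.rtdc")
    # Execute the pipeline inside the runner's context manager
    with DCNumJobRunner(job=job) as runner:
        runner.run()
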
Take a look at the keyword arguments that the classes mentioned above
accept. Note that you can specify methods for background correction as
well as segmentation, and that you have full access to the keyword arguments
for every step in the pipeline. Also note that a reproducible PPID is derived
from these keyword arguments (logic.job.DCNumPipelineJob.get_ppid()).
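
To illustrate why an identifier derived from keyword arguments is reproducible, here is a minimal, standalone sketch. It does not use dcnum and does not reproduce dcnum's actual PPID format; the function `kwargs_to_id` and the parameters `threshold` and `clear_border` are hypothetical names chosen for the illustration.

```python
import hashlib
import json


def kwargs_to_id(step_name, **kwargs):
    """Return a reproducible identifier for one pipeline step.

    Hypothetical helper for illustration; not dcnum's PPID scheme.
    """
    # Sorting the keys makes the result independent of argument order
    payload = json.dumps(kwargs, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:8]
    return f"{step_name}:{digest}"


# Same settings in a different order yield the same identifier
id_a = kwargs_to_id("segm", threshold=-6, clear_border=True)
id_b = kwargs_to_id("segm", clear_border=True, threshold=-6)
# Changing a single setting yields a different identifier
id_c = kwargs_to_id("segm", threshold=-5, clear_border=True)
```

The point of the sketch is only that identical settings always map to the same identifier, so two output files with the same identifier were produced by the same recipe.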
The following happens when you run the DCNumJobRunner snippet above:
1. The file input.rtdc is opened using the module read.
2. The DCNumJobRunner creates two managers:
   - segm.segmenter_manager_thread.SegmenterManagerThread which spawns segmentation workers (subclasses of segm.segmenter.Segmenter) in separate subprocesses.
   - feat.event_extractor_manager_thread.EventExtractorManagerThread which spawns feature extraction workers (feat.queue_event_extractor.QueueEventExtractor) in separate subprocesses.
3. The segmentation workers read a chunk of image data and return the label image (integer-valued labels, one mask per event in a frame).
4. The label images are fed via a shared array to the feature extraction workers.
5. The feature extraction workers put the event information (one event per unique integer-labeled mask in the label image) in the event queue.
6. A write.queue_collector_thread.QueueCollectorThread puts the events in the right order and stages them for writing in chunks.
7. A write.dequeue_writer_thread.DequeWriterThread writes the chunks to the output file.
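
The hand-off between feature extraction, collector, and writer described above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not dcnum's implementation: it uses threads and plain queues instead of subprocesses and shared arrays, nested lists instead of image arrays, and the names `extract_events`, `collect_in_order`, and `write_chunks` are hypothetical.

```python
import queue
import threading


def extract_events(frame_index, label_image):
    """Return one event per unique non-zero label in a label image."""
    labels = sorted({v for row in label_image for v in row if v != 0})
    return [{"frame": frame_index, "label": lab,
             "area": sum(row.count(lab) for row in label_image)}
            for lab in labels]


def collect_in_order(event_queue, num_frames, chunk_size, chunk_queue):
    """Reorder per-frame results by frame index and stage them in chunks."""
    pending = {}    # results that arrived ahead of their turn
    next_frame = 0  # frame index that must be staged next
    chunk = []
    while next_frame < num_frames:
        frame_index, events = event_queue.get()
        pending[frame_index] = events
        # Stage every frame that is now contiguous with the staged ones
        while next_frame in pending:
            chunk.extend(pending.pop(next_frame))
            next_frame += 1
            while len(chunk) >= chunk_size:
                chunk_queue.put(chunk[:chunk_size])
                chunk = chunk[chunk_size:]
    if chunk:
        chunk_queue.put(chunk)  # final, possibly smaller chunk
    chunk_queue.put(None)       # sentinel: no more chunks


def write_chunks(chunk_queue, output):
    """Stand-in for a writer: append each chunk to a list."""
    while True:
        chunk = chunk_queue.get()
        if chunk is None:
            break
        output.append(chunk)


# Three tiny label images (0 is background)
frames = [
    [[0, 1, 1], [0, 0, 2]],
    [[0, 0, 0], [0, 0, 0]],
    [[3, 3, 0], [3, 0, 0]],
]
event_queue, chunk_queue, written = queue.Queue(), queue.Queue(), []
collector = threading.Thread(
    target=collect_in_order, args=(event_queue, len(frames), 2, chunk_queue))
writer = threading.Thread(target=write_chunks, args=(chunk_queue, written))
collector.start()
writer.start()

# Simulate workers that finish out of order
for index in (2, 0, 1):
    event_queue.put((index, extract_events(index, frames[index])))

collector.join()
writer.join()
ordered = [(ev["frame"], ev["label"]) for chunk in written for ev in chunk]
```

Even though frame 2 is processed first, `ordered` lists the events by frame index, which is why a collector sits between the workers and the writer. Frames without events (frame 1 here) still have to be reported, otherwise the collector could not tell when later frames are safe to stage.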