dcnum.write.writer

Attributes

hdf5plugin

Exceptions

`CreatingFileWithoutBasinWarning`
`IgnoringBasinTypeWarning`

Classes

HDF5Writer

Write deformability cytometry HDF5 data

Functions

`create_with_basins`(path_out, basin_paths)	Create an .rtdc file with basins
`copy_basins`(h5_src, h5_dst[, internal_basins])	Reassemble basin data in the output file
`copy_features`(h5_src, h5_dst, features[, mapping, ds_kwds])	Copy feature data from one HDF5 file to another
`copy_metadata`(h5_src, h5_dst)	Copy attributes, tables, and logs from one H5File to another
`set_default_filter_kwargs`([ds_kwds, compression])

Module Contents

dcnum.write.writer.hdf5plugin

exception dcnum.write.writer.CreatingFileWithoutBasinWarning[source]

Bases: UserWarning


exception dcnum.write.writer.IgnoringBasinTypeWarning[source]

Bases: UserWarning


class dcnum.write.writer.HDF5Writer(obj: dcnum.common.h5py.File | pathlib.Path | str, mode: str = 'a', ds_kwds: dict | None = None)[source]

Write deformability cytometry HDF5 data

Parameters:

obj (h5py.File | pathlib.Path | str) – object to instantiate the writer from; if this is already an h5py.File object, it is used directly, otherwise the argument is passed to h5py.File
mode (str) – opening mode when using h5py.File
ds_kwds (dict) – keyword arguments with which to initialize new Datasets (e.g. compression)

events

ds_kwds = None

__enter__()[source]

__exit__(exc_type, exc_val, exc_tb)[source]

close()[source]

static get_best_nd_chunks(item_shape, feat_dtype=np.float64)[source]

Return best chunks for HDF5 datasets

Chunking has performance implications. It’s recommended to keep the total size of dataset chunks between 10 KiB and 1 MiB. The upper bound of this range defines the maximum chunk size as well as half the maximum cache size for each dataset.
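The sizing rule above can be illustrated with a small, dependency-free sketch. It is not dcnum's implementation: the helper name `sketch_nd_chunks`, the 1 MiB byte budget, and the floor of 10 events per chunk are assumptions chosen for illustration, and the item size is passed in bytes instead of being derived from a NumPy dtype.

```python
import math


def sketch_nd_chunks(item_shape, itemsize=8,
                     max_chunk_bytes=1024 ** 2, min_events=10):
    """Hypothetical helper: return a chunk shape (n_events, *item_shape).

    The number of events per chunk is chosen so that one chunk stays
    at or below `max_chunk_bytes`, but never drops below `min_events`
    (both values are assumptions for this sketch).
    """
    # Bytes needed for a single event; math.prod(()) == 1 for scalars
    event_bytes = math.prod(item_shape) * itemsize
    n_events = max(min_events, max_chunk_bytes // event_bytes)
    return (n_events, *item_shape)


# 80x300 images stored with 1 byte per pixel
image_chunks = sketch_nd_chunks((80, 300), itemsize=1)
# scalar feature stored with 8 bytes per value
scalar_chunks = sketch_nd_chunks((), itemsize=8)

print(image_chunks)   # (43, 80, 300)
print(scalar_chunks)  # (131072,)
```

With these assumptions, an image chunk holds 43 events (about 1 MB), while a scalar chunk holds 131072 values (exactly 1 MiB).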

require_feature(feat: str, item_shape: tuple[int], feat_dtype: numpy.dtype, ds_kwds: dict | None = None, group_name: str = 'events')[source]

Create a new feature in the “events” group

Parameters:

feat (str) – name of the feature
item_shape (tuple[int]) – shape for one event of this feature, e.g. for a scalar event, the shape would be (1,) and for an image, the shape could be (80, 300).
feat_dtype (np.dtype) – dtype of the feature
ds_kwds (dict) – HDF5 Dataset keyword arguments (e.g. compression, fletcher32)
group_name (str) – name of the HDF5 group where the feature should be written to; defaults to the “events” group, but a different group can be specified for storing e.g. internal basin features.

Write an HDF5-based file basin

Parameters:

name (str) – basin name; names do not have to be unique.
paths (list of str or pathlib.Path or None) – location(s) of the basin; must be None when storing internal data, a list of paths otherwise
features (list of str) – list of features provided by paths
description (str) – optional string describing the basin
mapping (1D array) – integer array with indices that map the basin dataset to this dataset
internal_data (dict of ndarrays) – internal basin data to store; if this is set, then features and paths must be None.
identifier (str) – the measurement identifier of the basin as computed by the get_measurement_identifier() function.
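The constraints between paths, features, and internal_data in the parameter list above can be summarized as an argument check. The following is only a sketch of the documented rules, not the library's code: the function name `check_basin_args` and the returned labels "internal" and "file" are hypothetical.

```python
def check_basin_args(paths=None, features=None, internal_data=None):
    """Hypothetical check of the documented argument rules.

    Returns "internal" for internal basins and "file" for basins
    that are referenced via a list of paths.
    """
    if internal_data is not None:
        # Internal data excludes both `paths` and `features`
        if paths is not None or features is not None:
            raise ValueError(
                "paths and features must be None when internal_data is set")
        return "internal"
    # Without internal data, a non-empty list of paths is expected
    if not isinstance(paths, list) or not paths:
        raise ValueError("paths must be a list of paths for file basins")
    return "file"


kind_file = check_basin_args(paths=["data.rtdc"], features=["deform"])
kind_internal = check_basin_args(internal_data={"userdef1": [1.0, 2.0]})
```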

store_feature_chunk(feat, data, group_name='events')[source]

Store feature data

The word “chunk” indicates that data are always stored in chunks of multiple events, never as single events.

store_log(log: str, data: list[str], override: bool = False) → dcnum.common.h5py.Dataset[source]

Store log data

Store the log data under the key log. The data kwarg must be a list of strings. If the log entry already exists, ValueError is raised unless override is set to True.
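The override rule can be mimicked with a plain dictionary standing in for the log storage. This is a sketch of the documented contract only; the function name `store_log_sketch` and the log name "dcnum-job" are made up for illustration, and no HDF5 file is involved.

```python
def store_log_sketch(logs, log, data, override=False):
    """Hypothetical stand-in for the documented store_log behavior."""
    # An existing entry is only replaced when override is True
    if log in logs and not override:
        raise ValueError(f"Log '{log}' already exists; pass override=True")
    logs[log] = list(data)
    return logs[log]


logs = {}
store_log_sketch(logs, "dcnum-job", ["started", "finished"])

# Writing the same log again without override raises ValueError
try:
    store_log_sketch(logs, "dcnum-job", ["second attempt"])
    raised = False
except ValueError:
    raised = True

# With override=True the previous entry is replaced
store_log_sketch(logs, "dcnum-job", ["replaced"], override=True)
```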

dcnum.write.writer.create_with_basins(path_out: str | pathlib.Path, basin_paths: list[str | pathlib.Path] | list[list[str | pathlib.Path]])[source]

Create an .rtdc file with basins

Parameters:

path_out – The output .rtdc file where basins are written to
basin_paths – The paths to the basins written to path_out. This can be either a list of paths (to different basins) or a list of lists for paths (for basins containing the same information, commonly used for relative and absolute paths).
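The two accepted shapes of basin_paths can be made concrete by normalizing both into a list of lists, where each inner list holds alternative locations of the same basin. This is a sketch under that reading of the description; the helper name `normalize_basin_paths` and the file names are hypothetical and not part of dcnum.

```python
import pathlib


def normalize_basin_paths(basin_paths):
    """Hypothetical helper: return one list of path strings per basin."""
    groups = []
    for entry in basin_paths:
        # A single path describes one basin with one location
        if isinstance(entry, (str, pathlib.Path)):
            entry = [entry]
        # A list describes one basin reachable via several paths
        groups.append([str(path) for path in entry])
    return groups


groups = normalize_basin_paths([
    "first.rtdc",                                 # basin with one location
    ["relative/second.rtdc", "/abs/second.rtdc"],  # same basin, two paths
    pathlib.Path("third.rtdc"),
])
```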

dcnum.write.writer.copy_basins(h5_src: dcnum.common.h5py.File, h5_dst: dcnum.common.h5py.File, internal_basins: bool = True)[source]

Reassemble basin data in the output file

This does not just copy the datasets defined in the “basins” group, but it also loads the “basinmap?” features and stores them as new “basinmap?” features in the output file.

dcnum.write.writer.copy_features(h5_src: dcnum.common.h5py.File, h5_dst: dcnum.common.h5py.File, features: list[str], mapping: numpy.ndarray | None = None, ds_kwds: dict | None = None)[source]

Copy feature data from one HDF5 file to another

The features must not already exist in the destination file.

Parameters:

h5_src (h5py.File) – Input HDF5 file containing features in the “events” group
h5_dst (h5py.File) – Output HDF5 file, opened in write mode, that does not yet contain the features
features (list[str]) – List of features to copy from source to destination
mapping (1D array) – If given, contains indices in the input file that should be written to the output file. If set to None, all events are written.
ds_kwds – keyword arguments with which to initialize new Datasets (e.g. compression); only relevant when mapping is not None
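The effect of mapping can be pictured with plain lists: the indices pick events from the source feature, and without a mapping the data are taken over unchanged. This sketch assumes that reading of the parameter description; `select_events` and the sample values are hypothetical and do not touch any HDF5 file.

```python
def select_events(feature_data, mapping=None):
    """Hypothetical helper: apply an index mapping to one feature."""
    if mapping is None:
        # No mapping: keep every event
        return list(feature_data)
    # Mapping: keep only the events at the given source indices
    return [feature_data[idx] for idx in mapping]


deform = [0.01, 0.02, 0.03, 0.04]  # made-up feature values
all_events = select_events(deform)
subset = select_events(deform, mapping=[0, 2, 3])
```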

dcnum.write.writer.copy_metadata(h5_src: dcnum.common.h5py.File, h5_dst: dcnum.common.h5py.File)[source]

Copy attributes, tables, and logs from one H5File to another

Notes

Metadata in h5_dst are never overridden; only metadata that are not already defined are added.
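The non-overriding behavior described in the note corresponds to a "fill in missing keys" merge. A minimal sketch with dictionaries standing in for HDF5 attributes follows; the helper name `merge_missing` and the keys and values are made up for illustration.

```python
def merge_missing(src_meta, dst_meta):
    """Hypothetical helper: add source entries missing in destination."""
    for key, value in src_meta.items():
        # setdefault leaves existing destination values untouched
        dst_meta.setdefault(key, value)
    return dst_meta


src = {"experiment:date": "2024-01-01", "setup:chip region": "channel"}
dst = {"experiment:date": "2023-12-31"}
merge_missing(src, dst)
```

After the merge, `dst` keeps its own "experiment:date" and gains only the "setup:chip region" entry.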

dcnum.write.writer.set_default_filter_kwargs(ds_kwds: dict | None = None, compression: bool = True)[source]