dcnum.write.writer

Attributes

hdf5plugin

Exceptions

CreatingFileWithoutBasinWarning

Initialize self. See help(type(self)) for accurate signature.

IgnoringBasinTypeWarning

Initialize self. See help(type(self)) for accurate signature.

Classes

HDF5Writer

Write deformability cytometry HDF5 data

Functions

create_with_basins(path_out, basin_paths)

Create an .rtdc file with basins

copy_basins(h5_src, h5_dst[, internal_basins])

Reassemble basin data in the output file

copy_features(h5_src, h5_dst, features[, mapping, ds_kwds])

Copy feature data from one HDF5 file to another

copy_metadata(h5_src, h5_dst)

Copy attributes, tables, and logs from one H5File to another

set_default_filter_kwargs([ds_kwds, compression])

Module Contents

dcnum.write.writer.hdf5plugin
exception dcnum.write.writer.CreatingFileWithoutBasinWarning

Bases: UserWarning

Initialize self. See help(type(self)) for accurate signature.

exception dcnum.write.writer.IgnoringBasinTypeWarning

Bases: UserWarning

Initialize self. See help(type(self)) for accurate signature.

class dcnum.write.writer.HDF5Writer(obj: dcnum.common.h5py.File | pathlib.Path | str, mode: str = 'a', ds_kwds: dict | None = None)

Write deformability cytometry HDF5 data

Parameters:
  • obj (h5py.File | pathlib.Path | str) – object to instantiate the writer from; if this is already an h5py.File object, it is used directly, otherwise the argument is passed to h5py.File

  • mode (str) – opening mode when using h5py.File

  • ds_kwds (dict) – keyword arguments with which to initialize new Datasets (e.g. compression)

events
ds_kwds = None
__enter__()
__exit__(exc_type, exc_val, exc_tb)
close()
static get_best_nd_chunks(item_shape, feat_dtype=np.float64)

Return best chunks for HDF5 datasets

Chunking has performance implications. It’s recommended to keep the total size of dataset chunks between 10 KiB and 1 MiB. The upper limit of this range defines the maximum chunk size as well as half the maximum cache size for each dataset.
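The sizing arithmetic behind this recommendation can be illustrated with a minimal sketch. It is not dcnum's implementation: the function name sketch_nd_chunks, the 1 MiB byte budget, and the rule of keeping at least one event per chunk are assumptions made for illustration only.

```python
import math

MAX_CHUNK_BYTES = 1024 ** 2  # assumed byte budget per chunk: 1 MiB


def sketch_nd_chunks(item_shape, itemsize=8):
    """Return a chunk shape (n_events, *item_shape) within the byte budget.

    `itemsize` is the size of one array element in bytes
    (1 for uint8, 8 for float64). Hypothetical helper, not dcnum API.
    """
    # bytes occupied by a single event of this feature
    event_bytes = math.prod(item_shape) * itemsize
    # number of whole events that fit into the budget, at least one
    n_events = max(1, MAX_CHUNK_BYTES // event_bytes)
    return (n_events, *item_shape)


# one uint8 image of shape (80, 300) occupies 24000 bytes
chunks_uint8 = sketch_nd_chunks((80, 300), itemsize=1)
# the same image stored as float64 occupies 192000 bytes
chunks_float64 = sketch_nd_chunks((80, 300), itemsize=8)
```

The first axis of the returned shape is the number of events per chunk; the remaining axes repeat the shape of a single event.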

require_feature(feat: str, item_shape: tuple[int], feat_dtype: numpy.dtype, ds_kwds: dict | None = None, group_name: str = 'events')

Create a new feature in the given group (“events” by default)

Parameters:
  • feat (str) – name of the feature

  • item_shape (tuple[int]) – shape for one event of this feature, e.g. for a scalar event, the shape would be (1,) and for an image, the shape could be (80, 300).

  • feat_dtype (np.dtype) – dtype of the feature

  • ds_kwds (dict) – HDF5 Dataset keyword arguments (e.g. compression, fletcher32)

  • group_name (str) – name of the HDF5 group where the feature should be written to; defaults to the “events” group, but a different group can be specified for storing e.g. internal basin features.

store_basin(name: str, paths: list[str | pathlib.Path] | None = None, features: list[str] | None = None, description: str | None = None, mapping: numpy.ndarray | None = None, internal_data: dict | None = None, identifier: str | None = None)

Write an HDF5-based file basin

Parameters:
  • name (str) – basin name; names do not have to be unique

  • paths (list of str or pathlib.Path or None) – location(s) of the basin; must be None when storing internal data, a list of paths otherwise

  • features (list of str) – list of features provided by paths

  • description (str) – optional string describing the basin

  • mapping (1D array) – integer array with indices that map the basin dataset to this dataset

  • internal_data (dict of ndarrays) – internal basin data to store; if this is set, then features and paths must be set to None.

  • identifier (str) – the measurement identifier of the basin as computed by the get_measurement_identifier() function.
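The constraint between paths, features, and internal_data described above can be sketched as a small check. This models only the documented rule; the helper name check_basin_args and the returned labels "internal" and "file" are hypothetical, and plain lists stand in for the ndarrays mentioned above.

```python
def check_basin_args(paths=None, features=None, internal_data=None):
    """Return "internal" or "file" (hypothetical labels) or raise ValueError."""
    if internal_data is not None:
        # internal data excludes both external paths and a feature list
        if paths is not None or features is not None:
            raise ValueError(
                "paths and features must be None when internal_data is set")
        return "internal"
    # without internal data, a list of paths is required
    if not isinstance(paths, list):
        raise ValueError("paths must be a list when internal_data is None")
    return "file"


kind_file = check_basin_args(paths=["raw.rtdc"], features=["deform"])
kind_internal = check_basin_args(internal_data={"userdef1": [0.1, 0.2]})

try:
    check_basin_args(paths=["raw.rtdc"], internal_data={"userdef1": [0.1]})
except ValueError:
    conflict_rejected = True
else:
    conflict_rejected = False
```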

store_feature_chunk(feat, data, group_name='events')

Store feature data

The term “chunk” implies that data are always stored in chunks, never as single events.

store_log(log: str, data: list[str], override: bool = False) → dcnum.common.h5py.Dataset

Store log data

Store the log data under the key log. The data kwarg must be a list of strings. If the log entry already exists, ValueError is raised unless override is set to True.
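The override rule can be modelled with a plain dictionary in place of the HDF5 log group. This is a sketch of the documented behaviour only; store_log_sketch and the logs dictionary are hypothetical names and do not reflect how dcnum stores logs internally.

```python
def store_log_sketch(logs, log, data, override=False):
    """Store `data` under the key `log`, refusing to replace an entry
    unless `override` is True (dictionary stand-in for the HDF5 file)."""
    if log in logs and not override:
        raise ValueError(f"log '{log}' already exists")
    logs[log] = list(data)
    return logs[log]


logs = {}
store_log_sketch(logs, "setup", ["step 1"])

# a second write to the same key is rejected by default
try:
    store_log_sketch(logs, "setup", ["step 2"])
except ValueError:
    rejected = True
else:
    rejected = False

# with override=True the existing entry is replaced
store_log_sketch(logs, "setup", ["step 2"], override=True)
```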

dcnum.write.writer.create_with_basins(path_out: str | pathlib.Path, basin_paths: list[str | pathlib.Path] | list[list[str | pathlib.Path]])

Create an .rtdc file with basins

Parameters:
  • path_out – The output .rtdc file where basins are written to

  • basin_paths – The paths to the basins written to path_out. This can be either a list of paths (to different basins) or a list of lists of paths (for basins containing the same information, commonly used for relative and absolute paths).
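The two accepted forms of basin_paths can be illustrated by normalizing both into one list of locations per basin. The helper group_basin_paths and the file names are hypothetical; this sketches how the argument can be read, not what create_with_basins does internally.

```python
from pathlib import Path


def group_basin_paths(basin_paths):
    """Return one list of path strings per basin."""
    groups = []
    for entry in basin_paths:
        if isinstance(entry, (str, Path)):
            # a single path describes one basin on its own
            groups.append([str(entry)])
        else:
            # a nested list holds alternative locations of the same basin
            groups.append([str(p) for p in entry])
    return groups


# two different basins, one location each
flat = group_basin_paths(["a.rtdc", "b.rtdc"])
# basin "a" is reachable via a relative and an absolute path
nested = group_basin_paths([["a.rtdc", "/data/a.rtdc"], ["b.rtdc"]])
```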

dcnum.write.writer.copy_basins(h5_src: dcnum.common.h5py.File, h5_dst: dcnum.common.h5py.File, internal_basins: bool = True)

Reassemble basin data in the output file

This not only copies the datasets defined in the “basins” group, but also loads the “basinmap?” features and stores them as new “basinmap?” features in the output file.

dcnum.write.writer.copy_features(h5_src: dcnum.common.h5py.File, h5_dst: dcnum.common.h5py.File, features: list[str], mapping: numpy.ndarray | None = None, ds_kwds: dict | None = None)

Copy feature data from one HDF5 file to another

The features must not exist in the destination file.

Parameters:
  • h5_src (h5py.File) – Input HDF5 file containing features in the “events” group

  • h5_dst (h5py.File) – Output HDF5 file, opened in write mode, that does not yet contain the features

  • features (list[str]) – List of features to copy from source to destination

  • mapping (1D array) – If given, contains indices in the input file that should be written to the output file. If set to None, all events are written.

  • ds_kwds – keyword arguments with which to initialize new Datasets (e.g. compression); only relevant when mapping is not None
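The effect of the mapping argument can be sketched with plain Python lists standing in for feature datasets. The helper select_events and the sample values are hypothetical; the sketch only illustrates index-based selection as described above.

```python
def select_events(feature_data, mapping=None):
    """Return the events picked by `mapping`, or all events if it is None."""
    if mapping is None:
        # no mapping: every event is carried over unchanged
        return list(feature_data)
    # mapping holds indices into the input data
    return [feature_data[i] for i in mapping]


area = [10.5, 11.0, 9.8, 12.3]
subset = select_events(area, mapping=[0, 2, 3])
everything = select_events(area)
```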

dcnum.write.writer.copy_metadata(h5_src: dcnum.common.h5py.File, h5_dst: dcnum.common.h5py.File)

Copy attributes, tables, and logs from one H5File to another

Notes

Metadata in h5_dst are never overridden; only metadata that are not already defined are added.
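The non-overriding rule can be sketched with dictionaries standing in for HDF5 attributes. The helper merge_missing and the example keys and values are illustrative assumptions, not part of dcnum.

```python
def merge_missing(src_attrs, dst_attrs):
    """Add keys from src_attrs to dst_attrs without replacing existing ones."""
    for key, value in src_attrs.items():
        # setdefault only writes when the key is not present yet
        dst_attrs.setdefault(key, value)
    return dst_attrs


src = {"experiment:sample": "source sample", "setup:channel width": 20.0}
dst = {"experiment:sample": "destination sample"}
merged = merge_missing(src, dst)
```

The destination keeps its own value for the key it already defines and gains only the key it was missing.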

dcnum.write.writer.set_default_filter_kwargs(ds_kwds: dict | None = None, compression: bool = True)