dcnum.read.hdf5_data
====================

.. py:module:: dcnum.read.hdf5_data


Exceptions
----------

.. autoapisummary::

   dcnum.read.hdf5_data.BasinIdentifierMismatchError


Classes
-------

.. autoapisummary::

   dcnum.read.hdf5_data.HDF5Data


Functions
---------

.. autoapisummary::

   dcnum.read.hdf5_data.concatenated_hdf5_data
   dcnum.read.hdf5_data.get_measurement_identifier


Module Contents
---------------

.. py:exception:: BasinIdentifierMismatchError

   Bases: :py:obj:`BaseException`

   Initialize self. See help(type(self)) for accurate signature.


.. py:class:: HDF5Data(path: pathlib.Path | dcnum.common.h5py.File | BinaryIO, pixel_size: float | None = None, md5_5m: str | None = None, meta: dict | None = None, basins: list[dict[str, list[str] | str]] | None = None, logs: dict[str, list[str]] | None = None, tables: dict[str, numpy.ndarray] | None = None, image_cache_size: int = 2, image_chunk_size: int = 1000, index_mapping: int | slice | list | numpy.ndarray | None = None)

   :param path: path to data file
   :param pixel_size: pixel size in µm
   :param md5_5m: MD5 sum of the first 5 MiB; computed if not provided
   :param meta: metadata dictionary; extracted from HDF5 attributes
       if not provided
   :param basins: list of basin dictionaries; extracted from HDF5 attributes
       if not provided
   :param logs: dictionary of logs; extracted from HDF5 attributes
       if not provided
   :param tables: dictionary of tables; extracted from HDF5 attributes
       if not provided
   :param image_cache_size: size of the image cache to use when accessing
       image data
   :param image_chunk_size: maximum number of images in each image cache chunk
   :param index_mapping: select only a subset of input events, transparently
       reducing the size of the dataset. Possible data types are:

       - int `N`: use the first `N` events
       - slice: use the events defined by a slice
       - list: list of integers specifying the event indices to use

       Numpy indexing rules apply. E.g. to only process the first 100
       events, set this to `100` or `slice(0, 100)`.


   .. py:method:: __contains__(item)


   .. py:method:: __enter__()


   .. py:method:: __exit__(exc_type, exc_val, exc_tb)


   .. py:method:: __getitem__(feat)


   .. py:method:: __getstate__()


   .. py:method:: __setstate__(state)


   .. py:method:: __len__()


   .. py:property:: h5


   .. py:property:: image
      :type: dcnum.read.cache.HDF5ImageCache | None


   .. py:property:: image_bg
      :type: dcnum.read.cache.HDF5ImageCache | None


   .. py:property:: image_corr
      :type: dcnum.read.cache.ImageCorrCache | None


   .. py:property:: image_num_chunks

      Number of image chunks given `self.image_chunk_size`


   .. py:property:: mask


   .. py:property:: meta_nest

      Return `self.meta` as a nested dictionary

      This gets very close to the dclab `config` property of datasets.


   .. py:property:: pixel_size


   .. py:method:: extract_basin_dicts(h5, check=True)
      :staticmethod:

      Return list of basin dictionaries


   .. py:property:: features_scalar_frame

      Scalar features that apply to all events in a frame

      This is a convenience property for copying scalar features over to
      new processed datasets. Return a list of all features that describe
      a frame (e.g. temperature or time).


   .. py:method:: close()

      Close the underlying HDF5 file


   .. py:method:: get_ppid()


   .. py:method:: get_ppid_code()
      :classmethod:


   .. py:method:: get_ppid_from_ppkw(kwargs)
      :classmethod:


   .. py:method:: get_ppid_index_mapping(index_mapping)
      :staticmethod:

      Return the pipeline identifier part for index mapping


   .. py:method:: get_ppkw_from_ppid(dat_ppid)
      :staticmethod:


   .. py:method:: get_basin_data(index: int) -> tuple[dcnum.common.h5py.Group, list, int | slice | list | numpy.ndarray]

      Return HDF5Data info for a basin index in `self.basins`

      :param index: index of the basin from which to get data
      :type index: int

      :returns: * **group** (*h5py.Group*) -- HDF5 group containing HDF5 Datasets
                  with the names listed in `features`
                * **features** (*list of str*) -- list of features made available
                  by this basin
                * *index_mapping* -- a mapping (see `__init__`) that defines
                  the mapping from the basin dataset to the referring dataset


   .. py:method:: _get_basin_data_file(bn_dict)


   .. py:method:: _get_basin_data_internal(bn_dict)


   .. py:method:: get_image_cache(feat)

      Create an HDF5ImageCache object for the current dataset

      This method also tries to find image data in `self.basins`.


   .. py:method:: keys()


.. py:function:: concatenated_hdf5_data(*args, **kwargs)


.. py:function:: get_measurement_identifier(h5: dcnum.common.h5py.Group) -> str | None

   Return the measurement identifier for the given H5File object

   The measurement identifier is taken from the HDF5 attributes. If the
   "experiment:run identifier" attribute is not set, it is computed
   from the HDF5 attributes "experiment:time", "experiment:date",
   and "setup:identifier".

   If the measurement identifier cannot be found or computed,
   return None.
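
The ``index_mapping`` rules listed for ``HDF5Data`` (int, slice, list) can be illustrated with a small pure-Python sketch. The names ``resolve_index_mapping`` and ``num_events`` are hypothetical and not part of dcnum; the code only mirrors the documented selection rules and makes no claim about how the library implements them.

```python
def resolve_index_mapping(index_mapping, num_events):
    """Return the event indices selected by `index_mapping`.

    Hypothetical helper for illustration; not part of dcnum.
    """
    all_indices = range(num_events)
    if index_mapping is None:
        # No mapping given: every event is used
        return list(all_indices)
    if isinstance(index_mapping, int):
        # int N: use the first N events
        return list(all_indices[:index_mapping])
    if isinstance(index_mapping, slice):
        # slice: use the events defined by the slice
        return list(all_indices[index_mapping])
    # list of integers: explicit event indices
    return [int(idx) for idx in index_mapping]


# `100` and `slice(0, 100)` select the same events
first_100 = resolve_index_mapping(100, num_events=250)
same_100 = resolve_index_mapping(slice(0, 100), num_events=250)
picked = resolve_index_mapping([3, 5, 8], num_events=250)
```

In this sketch, ``first_100`` and ``same_100`` hold identical indices, which matches the statement that ``100`` and ``slice(0, 100)`` are interchangeable.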
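
The fallback order described for ``get_measurement_identifier`` can be sketched against a plain dictionary standing in for the HDF5 attributes. This is a minimal sketch under stated assumptions: ``measurement_identifier_sketch`` is a hypothetical name, and the way the three attributes are combined (a SHA-256 digest of the joined values) is chosen purely for illustration, since the text above does not say how the identifier is computed.

```python
import hashlib


def measurement_identifier_sketch(attrs):
    """Return an identifier for a dict of HDF5-style attributes, or None.

    Hypothetical stand-in for illustration; not the dcnum function.
    """
    run_id = attrs.get("experiment:run identifier")
    if run_id:
        # Preferred source: the identifier stored in the attributes
        return str(run_id)
    keys = ("experiment:time", "experiment:date", "setup:identifier")
    parts = [attrs.get(key) for key in keys]
    if any(part is None for part in parts):
        # The identifier can neither be found nor computed
        return None
    # ASSUMPTION: hashing the joined values is illustrative only
    joined = "_".join(str(part) for part in parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


stored = measurement_identifier_sketch(
    {"experiment:run identifier": "run-a"})
computed = measurement_identifier_sketch(
    {"experiment:time": "12:00:00",
     "experiment:date": "2024-01-01",
     "setup:identifier": "setup-1"})
missing = measurement_identifier_sketch({"experiment:time": "12:00:00"})
```

Here ``stored`` is the attribute value itself, ``computed`` is derived from the three fallback attributes, and ``missing`` is ``None`` because two of the fallback attributes are absent.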