evefile.controllers.version_mapping module

Mapping eveH5 contents to the data structures of the evefile package.

There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evefile package requires obtaining the correct mapper for the specific version. This is a typical use case for the factory pattern.

Users of this module will hence typically only obtain a VersionMapperFactory object to get the correct mapper for an individual file. Furthermore, in practice the only such “user” is the EveFile class. Therefore, users of the evefile package usually do not interact directly with any of the classes provided by this module.

Overview

Being version agnostic with respect to eveH5 and SCML schema versions is a central aspect of the evefile package. This requires facilities mapping the actual eveH5 files to the data model provided by the entities technical layer of the evefile subpackage. The File facade obtains the correct VersionMapper object via the VersionMapperFactory, providing an HDF5File resource object to the factory. It is the duty of the factory to obtain the “version” attribute from the HDF5File object (explicitly getting the attributes of the root group of the HDF5File object).
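
The dispatch described above can be sketched as follows. This is a minimal illustration, not the actual implementation: FakeHDF5File, its attributes dictionary, SketchMapperFactory, and the assumption that the major part of the version string selects the mapper are all hypothetical stand-ins.

```python
class VersionMapper:
    """Stand-in for the common mapper base class."""

    def __init__(self):
        self.source = None


class VersionMapperV5(VersionMapper):
    pass


class VersionMapperV6(VersionMapperV5):
    pass


class VersionMapperV7(VersionMapperV6):
    pass


class FakeHDF5File:
    """Hypothetical stand-in for an HDF5File resource object."""

    def __init__(self, version):
        # Attributes of the root group; only "version" is used here.
        self.attributes = {"version": version}


class SketchMapperFactory:
    """Illustrative factory, not the actual VersionMapperFactory."""

    # Lookup table from major schema version to mapper class.
    mappers = {
        "5": VersionMapperV5,
        "6": VersionMapperV6,
        "7": VersionMapperV7,
    }

    def get_mapper(self, eveh5=None):
        if eveh5 is None:
            raise ValueError("Missing eveh5 object")
        # Assumption: the part before any dot selects the mapper.
        major = str(eveh5.attributes["version"]).split(".")[0]
        try:
            mapper = self.mappers[major]()
        except KeyError:
            raise ValueError(f"No mapper for version {major}") from None
        # Set the source for convenience, as the real factory does.
        mapper.source = eveh5
        return mapper


mapper = SketchMapperFactory().get_mapper(eveh5=FakeHDF5File("7"))
```

A lookup table keeps the factory free of version-specific conditionals, so supporting a further schema version would only require one more entry.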


Fig. 10 Class hierarchy of the evefile.controllers.version_mapping module, providing the functionality to map different eveH5 file schemas to the data structure provided by the EveFile class. The factory will be used to get the correct mapper for a given eveH5 file. For each eveH5 schema version, there exists an individual VersionMapperVx class dealing with the version-specific mapping. The idea behind the Mapping class is to provide simple mappings for attributes and the like that need not be hard-coded and can be stored externally, e.g. in YAML files. This would make it easier to account for (simple) changes.

For each eveH5 schema version, there exists an individual VersionMapperVx class dealing with the version-specific mapping. The part of the mapping common to all versions of the eveH5 schema, e.g. removing the chain, takes place in the VersionMapper parent class. The idea behind the Mapping class is to provide simple mappings for attributes and the like that can be stored externally, e.g. in YAML files. This would make it easier to account for (simple) changes.
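
To illustrate the idea of an externally stored mapping table, here is a minimal sketch. The dictionary stands in for what loading a YAML file might yield, and all attribute names as well as the apply_mapping helper are hypothetical, not taken from the actual eveH5 schema or the Mapping class.

```python
from types import SimpleNamespace

# Hypothetical mapping table as it might be loaded from a YAML file:
# keys are HDF5 attribute names, values are metadata attribute names.
ATTRIBUTE_MAPPING = {
    "Location": "measurement_station",
    "Comment": "description",
}


def apply_mapping(hdf5_attributes, metadata, mapping):
    """Copy the attributes listed in the mapping onto the metadata object."""
    for hdf5_name, attribute_name in mapping.items():
        # Attributes missing from the file are silently skipped.
        if hdf5_name in hdf5_attributes:
            setattr(metadata, attribute_name, hdf5_attributes[hdf5_name])
    return metadata


metadata = apply_mapping(
    {"Location": "TEST", "Comment": "test scan", "Unmapped": 1},
    SimpleNamespace(),
    ATTRIBUTE_MAPPING,
)
```

With such a table, a renamed attribute in a new schema version would only require changing the table, not the mapper code.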

Mapping tasks for eveH5 schema

What follows is a summary of the different aspects, for the time being not divided for the different formats (up to v7):

  • Map attributes of / and /c1 to the file metadata. ✓

  • Convert monitor datasets from the device group to MonitorData objects. ✓

    • We probably need to create subclasses for the different monitor datasets, at least distinguishing between numeric and non-numeric values.

  • Map /c1/meta/PosCountTimer to TimestampData object. ✓

  • Starting with eveH5 v5: Map /LiveComment to LogMessage objects. ✓

  • Filter all datasets from the main section, with different goals:

    • Map array data to ArrayChannelData objects (HDF5 groups having an attribute DeviceType set to Channel). ✗

      • Distinguish between MCA and scope data (at least). ✗

      • Map additional datasets in main section (and snapshot). ✗

    • Map all axis datasets to AxisData objects. ✓

      • How to distinguish between axes with and without encoders? ✗

      • Read channels with RBV and replace axis values with RBV. ✗

        • Most probably, the corresponding channel has the same name (not XML-ID, though!) as the axis, but with suffix _RBV, and can thus be identified.

        • In case of axes with encoders, there may be additional datasets present, e.g., those with suffix _Enc.

        • In this case, instead of NonencodedAxisData, an AxisData object needs to be created. (Currently, only AxisData objects are created, which is a mistake as well…)

      • How to deal with pseudo-axes used as options in channel datasets? Do we need to deal with axes later? ✗

    • Distinguish between single point and area data, and map area data to AreaChannelData objects. (✗)

      • Distinguish between scientific and sample cameras. ✗

      • Which dataset is the “main” dataset for scientific cameras? ✗

        • Starting with eve v1.39, it is TIFF1:chan1. For earlier versions, this is less clear, and there might not exist a dataset containing filenames with full paths, but only numbers.

      • Map sample camera datasets. ✗

    • Map the additional data for average and interval channel data provided in the respective HDF5 groups to AverageChannelData and IntervalChannelData objects, respectively. ✓

    • Map normalized channel data (and the data provided in the respective HDF5 groups) to NormalizedChannelData. ✓

    • Add all data objects to the data attribute of the EveFile object. (Has been done during mapping already.)

  • Filter all datasets from the snapshot section, with different goals:

    • Map all HDF5 datasets that belong to one of the data objects in the data attribute of the EveFile object to their respective attributes.

    • Map all HDF5 datasets remaining (if any) to data objects corresponding to their respective data type.

    • Add all data objects to the snapshots attribute of the EveFile object. ✗

Most probably, not all of these tasks can be inferred from the contents of an eveH5 file alone. In this case, additional mapping tables, possibly even on a per-measurement-station level, are necessary.
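
As an illustration of one of the open tasks listed above, identifying the read-back channel of an axis by its name could be sketched as follows. The pair_axes_with_rbv helper and the dataset names are hypothetical, and the sketch relies on the assumption stated in the list that the channel carries the axis name plus the suffix _RBV.

```python
def pair_axes_with_rbv(axis_names, channel_names, suffix="_RBV"):
    """Return a dict mapping each axis name to its read-back channel name.

    Axes without a matching channel are omitted from the result.
    """
    channels = set(channel_names)
    return {
        name: name + suffix
        for name in axis_names
        if name + suffix in channels
    }


pairs = pair_axes_with_rbv(
    axis_names=["MotorX", "MotorY"],
    channel_names=["MotorX_RBV", "Detector"],
)
```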

Questions to address

  • How were the log messages/live comments saved before v5?

  • How to deal with options that are monitored? Check whether they change for a given channel/axis and if so, expand them (“fill”) for each PosCount of the corresponding channel/axis, and otherwise set as scalar attribute?

  • How to deal with the situation that not all actual data read from eveH5 are numeric? Of course, non-numeric data cannot be plotted. But how to distinguish sensibly?

    • The evefile.entities.data module provides some distinct classes for this, at least for now NonnumericChannelData.
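
One possible answer to the question of monitored options is sketched below: expand ("fill") the sparse monitor values to one value per position count, or return a scalar if the option never changed. This is only a sketch under assumptions: both helper functions are hypothetical, and the change events are assumed to be already expressed as sorted position counts rather than timestamps.

```python
from bisect import bisect_right


def fill_monitored_values(change_positions, change_values, positions):
    """Expand sparse monitor values to one value per position count.

    For each position, take the most recent value recorded at or before
    it. Positions preceding the first recorded change get None.
    change_positions must be sorted in ascending order.
    """
    filled = []
    for position in positions:
        index = bisect_right(change_positions, position) - 1
        filled.append(change_values[index] if index >= 0 else None)
    return filled


def expand_or_scalar(change_positions, change_values, positions):
    """Return a scalar if the option never changed, else the filled list."""
    if len(set(change_values)) == 1:
        return change_values[0]
    return fill_monitored_values(change_positions, change_values, positions)


filled = fill_monitored_values([1, 5], ["low", "high"], [2, 3, 5, 8])
```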

Module documentation

class evefile.controllers.version_mapping.VersionMapperFactory

Bases: object

Factory for obtaining the correct version mapper object.

There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evefile package requires obtaining the correct mapper for the specific version. This is a typical use case for the factory pattern.

eveh5

Python object representation of an eveH5 file

Type:

evefile.boundaries.eveh5.HDF5File

Raises:

ValueError – Raised if no eveh5 object is present

Examples

Using the factory is pretty simple. There are two ways to set the eveh5 attribute: either explicitly, or when calling the get_mapper() method of the factory:

factory = VersionMapperFactory()
factory.eveh5 = eveh5_object
mapper = factory.get_mapper()

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5_object)

In both cases, mapper will contain the correct mapper object, and eveh5_object contains the Python object representation of an eveH5 file.

get_mapper(eveh5=None)

Return the correct mapper for a given eveH5 file.

For convenience, the returned mapper has its VersionMapper.source attribute already set to the eveh5 object the mapper was obtained for.

Parameters:

eveh5 (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file

Returns:

mapper – Mapper used to map the eveH5 file contents to evefile structures.

Return type:

VersionMapper

Raises:

ValueError – Raised if no eveh5 object is present

class evefile.controllers.version_mapping.VersionMapper

Bases: object

Mapper for mapping the eveH5 file contents to evefile structures.

This is the base class for all version-dependent mappers. Given that there are different versions of the eveH5 schema, each version gets handled by a distinct mapper subclass.

To get an object of the appropriate class, use the VersionMapperFactory factory.

source

Python object representation of an eveH5 file

Type:

evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evefile structure representing an eveH5 file

Type:

evefile.boundaries.evefile.EveFile

datasets2map_in_main

Names of the datasets in the main section not yet mapped.

To avoid checking all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes from the list the names of those datasets it handled successfully.

Type:

list

datasets2map_in_snapshot

Names of the datasets in the snapshot section not yet mapped.

To avoid checking all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes from the list the names of those datasets it handled successfully.

Type:

list

datasets2map_in_monitor

Names of the datasets in the monitor section not yet mapped.

Note that the monitor section is usually termed “device”.

To avoid checking all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes from the list the names of those datasets it handled successfully.

Type:

list
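
The bookkeeping described for the three lists above can be sketched as follows. The map_matching helper and the dataset names are hypothetical; the sketch only illustrates how a mapping step removes the names it handled, so that later steps see only the remaining datasets.

```python
def map_matching(datasets2map, predicate, handler):
    """Map all datasets matching predicate and drop them from the list."""
    # Collect first, so the list is not modified while iterating over it.
    handled = [name for name in datasets2map if predicate(name)]
    for name in handled:
        handler(name)
        datasets2map.remove(name)
    return handled


datasets2map_in_main = ["MotorX", "MotorX_RBV", "Detector"]
mapped = []
map_matching(
    datasets2map_in_main,
    predicate=lambda name: name.startswith("MotorX"),
    handler=mapped.append,
)
```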

Raises:

ValueError – Raised if either source or destination is not provided

Examples

Although the VersionMapper class is not meant to be used directly, its use is prototypical for all the concrete mappers:

mapper = VersionMapper()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

map(source=None, destination=None)

Map the eveH5 file contents to evefile structures.

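Parameters:
  • source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file

  • destination (evefile.boundaries.evefile.EveFile) – High(er)-level evefile structure representing an eveH5 file

Raises:

ValueError – Raised if either source or destination is not provided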

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not when initially loading the eveH5 file. Hence, a mechanism is needed that provides the information on where to get the data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to populate the HDF5DataImporter.mapping attribute correctly.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.
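
Why integer keys are convenient can be seen from the following sketch. The plain tuple stands in for what numpy.dtype.names returns for a structured dtype, and the column names in it are hypothetical.

```python
# Stand-in for the tuple returned by numpy.dtype.names for a structured
# dtype; the field names are hypothetical.
column_names = ("TimeStamp", "Value")

importer_mapping = {
    0: "milliseconds",
    1: "data",
}

# The integer keys index the tuple directly, yielding a table that maps
# HDF5 dataset column names to data class attribute names.
column_to_attribute = {
    column_names[index]: attribute
    for index, attribute in importer_mapping.items()
}
```

Addressing columns by position rather than by name keeps the mapping independent of the column names actually used in a given file.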

Parameters:
  • dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evefile.entities.data.HDF5DataImporter

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str
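
The notion of "name" used here can be illustrated with a plain string operation. The last_path_component helper is a hypothetical sketch working on a path string, not the actual method, which takes an HDF5Dataset object.

```python
def last_path_component(hdf5_path):
    """Return the part of an HDF5 path after the last slash."""
    # rsplit with maxsplit=1 splits only at the last slash; a path
    # without any slash is returned unchanged.
    return hdf5_path.rsplit("/", 1)[-1]


name = last_path_component("/c1/meta/PosCountTimer")
```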

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
  • hdf5_item (evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.

  • dataset (evefile.entities.data.Data) – Data object the metadata should be set for

class evefile.controllers.version_mapping.VersionMapperV5

Bases: VersionMapper

Mapper for mapping eveH5 v5 file contents to evefile structures.

More description comes here…

Important

EveH5 files of version v5 and earlier do not contain a date and time for the end of the measurement. Hence, the corresponding attribute File.metadata.end is set to the start of the UNIX epoch (1970-01-01T00:00:00). Thus, with these files, it is not possible to automatically calculate the duration of the measurement.

Note, however, that using the File.position_timestamps attribute and taking the timestamp for the last recorded position count, one could infer the duration of the measurement, and hence set the time for the end of the measurement.
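
Inferring the end of the measurement as suggested above could look like the following sketch. The infer_end helper is hypothetical, and the sketch assumes that the last position timestamp is given in milliseconds relative to the start of the measurement.

```python
from datetime import datetime, timedelta

# Placeholder value used when the file provides no end date and time.
UNIX_EPOCH = datetime(1970, 1, 1)


def infer_end(start, end, last_timestamp_ms):
    """Return end, or start plus the last timestamp if end is the epoch."""
    if end != UNIX_EPOCH:
        return end
    # Assumption: timestamps are milliseconds since the measurement start.
    return start + timedelta(milliseconds=last_timestamp_ms)


start = datetime(2024, 5, 1, 12, 0, 0)
end = infer_end(start, UNIX_EPOCH, last_timestamp_ms=90_500)
duration = end - start
```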

source

Python object representation of an eveH5 file

Type:

evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evefile structure representing an eveH5 file

Type:

evefile.boundaries.evefile.File

Raises:

ValueError – Raised if either source or destination is not provided

Examples

Mapping a given eveH5 file to the evefile structures is the same for each of the mappers:

mapper = VersionMapperV5()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not when initially loading the eveH5 file. Hence, a mechanism is needed that provides the information on where to get the data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to populate the HDF5DataImporter.mapping attribute correctly.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evefile.entities.data.HDF5DataImporter

map(source=None, destination=None)

Map the eveH5 file contents to evefile structures.

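Parameters:
  • source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file

  • destination (evefile.boundaries.evefile.File) – High(er)-level evefile structure representing an eveH5 file

Raises:

ValueError – Raised if either source or destination is not provided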

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
  • hdf5_item (evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.

  • dataset (evefile.entities.data.Data) – Data object the metadata should be set for

class evefile.controllers.version_mapping.VersionMapperV6

Bases: VersionMapperV5

Mapper for mapping eveH5 v6 file contents to evefile structures.

The only difference from the previous version v5: in addition to the start time, the end time of a measurement is now available. Both are mapped as datetime.datetime objects onto the File.metadata.start and File.metadata.end attributes, respectively.

Note

Prior to eveH5 v6, no end date/time of the measurement was available; hence, no duration of the measurement can be calculated for such files.

source

Python object representation of an eveH5 file

Type:

evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evefile structure representing an eveH5 file

Type:

evefile.boundaries.evefile.File

Raises:

ValueError – Raised if either source or destination is not provided

Examples

Mapping a given eveH5 file to the evefile structures is the same for each of the mappers:

mapper = VersionMapperV6()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not when initially loading the eveH5 file. Hence, a mechanism is needed that provides the information on where to get the data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to populate the HDF5DataImporter.mapping attribute correctly.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evefile.entities.data.HDF5DataImporter

map(source=None, destination=None)

Map the eveH5 file contents to evefile structures.

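Parameters:
  • source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file

  • destination (evefile.boundaries.evefile.File) – High(er)-level evefile structure representing an eveH5 file

Raises:

ValueError – Raised if either source or destination is not provided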

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
  • hdf5_item (evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.

  • dataset (evefile.entities.data.Data) – Data object the metadata should be set for

class evefile.controllers.version_mapping.VersionMapperV7

Bases: VersionMapperV6

Mapper for mapping eveH5 v7 file contents to evefile structures.

The only difference from the previous version v6: the attribute Simulation has been added at the file root level and is mapped as a Boolean value onto the File.metadata.simulation attribute.

source

Python object representation of an eveH5 file

Type:

evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evefile structure representing an eveH5 file

Type:

evefile.boundaries.evefile.File

Raises:

ValueError – Raised if either source or destination is not provided

Examples

Mapping a given eveH5 file to the evefile structures is the same for each of the mappers:

mapper = VersionMapperV7()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not when initially loading the eveH5 file. Hence, a mechanism is needed that provides the information on where to get the data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to populate the HDF5DataImporter.mapping attribute correctly.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evefile.entities.data.HDF5DataImporter

map(source=None, destination=None)

Map the eveH5 file contents to evefile structures.

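Parameters:
  • source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file

  • destination (evefile.boundaries.evefile.File) – High(er)-level evefile structure representing an eveH5 file

Raises:

ValueError – Raised if either source or destination is not provided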

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
  • hdf5_item (evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.

  • dataset (evefile.entities.data.Data) – Data object the metadata should be set for