evefile.controllers.version_mapping module
Mapping eveH5 contents to the data structures of the evefile package.
There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evefile package requires getting the correct mapper for the specific version. This is the typical use case for the factory pattern.
Users of the module will hence typically only obtain a VersionMapperFactory object to get the correct mappers for individual files. Furthermore, the "users" of this module essentially boil down to the EveFile class. Therefore, users of the evefile package usually do not interact directly with any of the classes provided by this module.
Overview
Being version agnostic with respect to eveH5 and SCML schema versions is a central aspect of the evefile package. This requires facilities mapping the actual eveH5 files to the data model provided by the entities technical layer of the evefile subpackage. The File facade obtains the correct VersionMapper object via the VersionMapperFactory, providing an HDF5File resource object to the factory. It is the duty of the factory to obtain the "version" attribute from the HDF5File object (explicitly getting the attributes of the root group of the HDF5File object).
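To illustrate the dispatch just described, here is a minimal sketch in Python. The attribute name "EVEH5Version", the plain ``attributes`` accessor, and the helper function name are assumptions for illustration only; the actual VersionMapperFactory reads the version from the root group attributes of the HDF5File object and selects the matching VersionMapperVx class:

import importlib


def get_mapper_for(eveh5):
    """Return a version mapper matching the eveH5 schema version.

    Hypothetical stand-in for VersionMapperFactory.get_mapper(); the
    attribute name "EVEH5Version" and the ``attributes`` accessor are
    assumptions for illustration.
    """
    root_attributes = eveh5.attributes  # attributes of the HDF5 root group
    major_version = str(root_attributes["EVEH5Version"]).split(".")[0]
    module = importlib.import_module("evefile.controllers.version_mapping")
    try:
        mapper_class = getattr(module, f"VersionMapperV{major_version}")
    except AttributeError as error:
        raise AttributeError(
            f"No mapper found for eveH5 schema version {major_version}"
        ) from error
    mapper = mapper_class()
    mapper.source = eveh5  # mirrors the convenience behaviour of the factory
    return mapper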
Fig. 10 Class hierarchy of the evefile.controllers.version_mapping
module, providing the functionality to map different eveH5 file
schemas to the data structure provided by the EveFile
class. The factory
will be used to get the correct mapper for a given eveH5 file.
For each eveH5 schema version, there exists an individual VersionMapperVx class dealing with the version-specific mapping. The part of the mapping common to all versions of the eveH5 schema takes place in the VersionMapper parent class, e.g. removing the chain. The idea behind the Mapping class is to provide simple mappings for attributes and the like that need not be hard-coded and can be stored externally, e.g. in YAML files. This would make it easier to account for (simple) changes.
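As a concrete illustration of externally stored mappings, here is a minimal sketch. The YAML layout, the attribute names used in it, and the helper function are assumptions for illustration only and do not reflect the actual Mapping class interface:

# hypothetical contents of a mapping file, e.g. mapping_v5.yaml
# (HDF5 attribute name -> metadata attribute; names are made up):
#
# file_metadata:
#   Location: measurement_station
#   Version: eveh5_version
#   XMLversion: xml_schema_version

import yaml  # PyYAML, assumed to be available


def apply_file_metadata_mapping(mapping_file, root_attributes, metadata):
    """Copy HDF5 root attributes onto metadata fields as listed in YAML."""
    with open(mapping_file, encoding="utf8") as stream:
        mapping = yaml.safe_load(stream)
    for hdf5_name, metadata_attribute in mapping["file_metadata"].items():
        if hdf5_name in root_attributes:
            setattr(metadata, metadata_attribute, root_attributes[hdf5_name])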
Mapping tasks for eveH5 schema
What follows is a summary of the different aspects, for the time being not broken down by the different formats (up to v7):

- Map attributes of / and /c1 to the file metadata. ✓
- Convert monitor datasets from the device group to MonitorData objects. ✓
  - We probably need to create subclasses for the different monitor datasets, at least distinguishing between numeric and non-numeric values.
- Map /c1/meta/PosCountTimer to a TimestampData object. ✓
- Starting with eveH5 v5: Map /LiveComment to LogMessage objects. ✓
- Filter all datasets from the main section, with different goals:
  - Map array data to ArrayChannelData objects (HDF5 groups having an attribute DeviceType set to Channel). ✗
    - Distinguish between MCA and scope data (at least). ✗
    - Map additional datasets in the main section (and snapshot). ✗
  - Map all axis datasets to AxisData objects. ✓
    - How to distinguish between axes with and without encoders? ✗
    - Read channels with RBV and replace axis values with RBV (see the sketch below). ✗ Most probably, the corresponding channel has the same name (not XML-ID, though!) as the axis, but with suffix _RBV, and can thus be identified.
    - In case of axes with encoders, there may be additional datasets present, e.g., those with suffix _Enc. In this case, instead of NonencodedAxisData, an AxisData object needs to be created. (Currently, only AxisData objects are created, which is a mistake as well…)
    - How to deal with pseudo-axes used as options in channel datasets? Do we need to deal with axes later? ✗
  - Distinguish between single point and area data, and map area data to AreaChannelData objects. (✗)
    - Distinguish between scientific and sample cameras. ✗
    - Which dataset is the "main" dataset for scientific cameras? ✗ Starting with eve v1.39, it is TIFF1:chan1; before, this is less clear, and there might not exist a dataset containing filenames with full paths, but only numbers.
    - Map sample camera datasets. ✗
  - Map the additional data for average and interval channel data provided in the respective HDF5 groups to AverageChannelData and IntervalChannelData objects, respectively. ✓
  - Map normalized channel data (and the data provided in the respective HDF5 groups) to NormalizedChannelData objects. ✓
  - Add all data objects to the data attribute of the EveFile object. (Has been done during mapping already.)
- Filter all datasets from the snapshot section, with different goals:
  - Map all HDF5 datasets that belong to one of the data objects in the data attribute of the EveFile object to their respective attributes.
  - Map all remaining HDF5 datasets (if any) to data objects corresponding to their respective data type.
  - Add all data objects to the snapshots attribute of the EveFile object. ✗
Most probably, not all these tasks can be inferred from the contents of an eveH5 file alone. In this case, additional mapping tables, possibly even on a per-measurement-station level, will be necessary.
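To make the _RBV heuristic from the task list above more tangible, here is a minimal sketch that also follows the bookkeeping pattern of removing handled dataset names from the list of datasets still to be mapped (cf. the datasets2map_in_main attribute documented below). The dictionary-like arguments are assumptions for illustration; the actual private mapping methods operate on the mapper's source and destination attributes:

def map_rbv_channels(main_datasets, data_objects, datasets2map_in_main):
    """Replace axis values with the readback values of their _RBV channel.

    Hypothetical helper: ``main_datasets`` maps dataset names to objects
    with a ``data`` attribute, ``data_objects`` maps axis names to the
    data objects created so far. Handled names are removed from
    ``datasets2map_in_main``.
    """
    for name in list(datasets2map_in_main):
        if not name.endswith("_RBV"):
            continue
        axis_name = name[: -len("_RBV")]
        if axis_name in data_objects:
            data_objects[axis_name].data = main_datasets[name].data
            datasets2map_in_main.remove(name)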
Questions to address
How were the log messages/live comments saved before v5?
How to deal with options that are monitored? Check whether they change for a given channel/axis and, if so, expand them ("fill") for each PosCount of the corresponding channel/axis, and otherwise set them as a scalar attribute? (See the sketch below.)
How to deal with the situation that not all actual data read from eveH5 are numeric? Of course, non-numeric data cannot be plotted. But how to distinguish sensibly? The evefile.entities.data module provides some distinct classes for this, at least for now NonnumericChannelData.
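One conceivable answer to the "fill" question above, as a minimal sketch; the representation of monitor data as parallel, position-sorted lists of position counts and values, and the function name, are assumptions for illustration:

import numpy as np


def monitor_option_for_positions(position_counts, monitor_positions,
                                 monitor_values):
    """Return a scalar or a per-position array for a monitored option.

    If the monitored value never changes, it is returned as a scalar
    attribute; otherwise the most recent value is carried forward
    ("filled") for every position count of the corresponding
    channel/axis. Assumes ``monitor_positions`` to be sorted.
    """
    values = np.asarray(monitor_values)
    if len(np.unique(values)) == 1:
        return values[0]
    indices = np.searchsorted(monitor_positions, position_counts,
                              side="right") - 1
    indices = np.clip(indices, 0, len(values) - 1)
    return values[indices]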
Module documentation
- class evefile.controllers.version_mapping.VersionMapperFactory
Bases:
object
Factory for obtaining the correct version mapper object.
There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evefile package requires getting the correct mapper for the specific version. This is the typical use case for the factory pattern.
- eveh5
Python object representation of an eveH5 file
- Raises:
ValueError – Raised if no eveh5 object is present
Examples
Using the factory is pretty simple. There are actually two ways to set the eveh5 attribute – either explicitly or when calling the get_mapper() method of the factory:

factory = VersionMapperFactory()
factory.eveh5 = eveh5_object
mapper = factory.get_mapper()

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5_object)

In both cases, mapper will contain the correct mapper object, and eveh5_object contains the Python object representation of an eveH5 file.
- get_mapper(eveh5=None)
Return the correct mapper for a given eveH5 file.
For convenience, the returned mapper has its VersionMapper.source attribute already set to the eveh5 object it was obtained for.
- Parameters:
eveh5 (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
- Returns:
mapper – Mapper used to map the eveH5 file contents to evefile structures.
- Return type:
- Raises:
ValueError – Raised if no eveh5 object is present
AttributeError – Raised if no matching
VersionMapper
class can be found
- class evefile.controllers.version_mapping.VersionMapper
Bases:
object
Mapper for mapping the eveH5 file contents to evefile structures.
This is the base class for all version-dependent mappers. Given that there are different versions of the eveH5 schema, each version gets handled by a distinct mapper subclass.
To get an object of the appropriate class, use the VersionMapperFactory factory.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evefile structure representing an eveH5 file
- datasets2map_in_main
Names of the datasets in the main section not yet mapped.
In order to not have to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes those names from the list it handled successfully.
- Type:
- datasets2map_in_snapshot
Names of the datasets in the snapshot section not yet mapped.
In order to not have to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes those names from the list it handled successfully.
- Type:
- datasets2map_in_monitor
Names of the datasets in the monitor section not yet mapped.
Note that the monitor section is usually termed “device”.
In order to not have to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes those names from the list it handled successfully.
- Type:
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Although the VersionMapper class is not meant to be used directly, its use is prototypical for all the concrete mappers:

mapper = VersionMapper()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- map(source=None, destination=None)
Map the eveH5 file contents to evefile structures.
- Parameters:
source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
destination (evefile.boundaries.evefile.EveFile) – High(er)-level evefile structure representing an eveH5 file
- Raises:
ValueError – Raised if either source or destination are not provided
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the relevant information on where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty
HDF5Dataset
object, but have one available within your mapper.- Parameters:
dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.
mapping (dict) – Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names.
- Returns:
importer – HDF5 dataset importer
- Return type:
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes id, name, access_mode, and pv are set.
- Parameters:
hdf5_item (evedata.evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.
dataset (evedata.evefile.entities.data.Data) – Data object the metadata should be set for
- class evefile.controllers.version_mapping.VersionMapperV5
Bases:
VersionMapper
Mapper for mapping eveH5 v5 file contents to evefile structures.
More description comes here…
Important
EveH5 files of version v5 and earlier do not contain a date and time for the end of the measurement. Hence, the corresponding attribute File.metadata.end is set to the start of the UNIX epoch (1970-01-01T00:00:00). Thus, with these files, it is not possible to automatically calculate the duration of the measurement.
Note, however, that using the File.position_timestamps attribute and taking the timestamp for the last recorded position count, one could infer the duration of the measurement, and hence set the time for the end of the measurement.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evefile structure representing an eveH5 file
- Type:
evefile.boundaries.evefile.File
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Mapping a given eveH5 file to the evefile structures is the same for each of the mappers:
mapper = VersionMapperV5()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the relevant information on where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty
HDF5Dataset
object, but have one available within your mapper.- Parameters:
dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.
mapping (dict) – Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names.
- Returns:
importer – HDF5 dataset importer
- Return type:
- map(source=None, destination=None)
Map the eveH5 file contents to evefile structures.
- Parameters:
source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
destination (evefile.boundaries.evefile.EveFile) – High(er)-level evefile structure representing an eveH5 file
- Raises:
ValueError – Raised if either source or destination are not provided
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes id, name, access_mode, and pv are set.
- Parameters:
hdf5_item (evedata.evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.
dataset (evedata.evefile.entities.data.Data) – Data object the metadata should be set for
- class evefile.controllers.version_mapping.VersionMapperV6
Bases:
VersionMapperV5
Mapper for mapping eveH5 v6 file contents to evefile structures.
The only difference to the previous version v5: times for both the start and, now, the end of a measurement are available and are mapped as datetime.datetime objects onto the File.metadata.start and File.metadata.end attributes, respectively.
Note
Prior to v6, eveH5 files contained no end date/time of the measurement; hence, no duration of the measurement can be calculated for them.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evefile structure representing an eveH5 file
- Type:
evefile.boundaries.evefile.File
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Mapping a given eveH5 file to the evefile structures is the same for each of the mappers:
mapper = VersionMapperV6()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the relevant information on where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty
HDF5Dataset
object, but have one available within your mapper.- Parameters:
dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.
mapping (dict) – Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names.
- Returns:
importer – HDF5 dataset importer
- Return type:
- map(source=None, destination=None)
Map the eveH5 file contents to evefile structures.
- Parameters:
source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
destination (evefile.boundaries.evefile.EveFile) – High(er)-level evefile structure representing an eveH5 file
- Raises:
ValueError – Raised if either source or destination are not provided
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes id, name, access_mode, and pv are set.
- Parameters:
hdf5_item (evedata.evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.
dataset (evedata.evefile.entities.data.Data) – Data object the metadata should be set for
- class evefile.controllers.version_mapping.VersionMapperV7
Bases:
VersionMapperV6
Mapper for mapping eveH5 v7 file contents to evefile structures.
The only difference to the previous version v6: the attribute Simulation has been added at the file root level and is mapped as a Boolean value onto the File.metadata.simulation attribute.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evefile structure representing an eveH5 file
- Type:
evefile.boundaries.evefile.File
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Mapping a given eveH5 file to the evefile structures is the same for each of the mappers:
mapper = VersionMapperV7()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the relevant information on where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty
HDF5Dataset
object, but have one available within your mapper.- Parameters:
dataset (evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.
mapping (dict) – Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names.
- Returns:
importer – HDF5 dataset importer
- Return type:
- map(source=None, destination=None)
Map the eveH5 file contents to evefile structures.
- Parameters:
source (evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
destination (evefile.boundaries.evefile.EveFile) – High(er)-level evefile structure representing an eveH5 file
- Raises:
ValueError – Raised if either source or destination are not provided
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes id, name, access_mode, and pv are set.
- Parameters:
hdf5_item (evedata.evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.
dataset (evedata.evefile.entities.data.Data) – Data object the metadata should be set for