mml.core.data_preparation.data_archive

class DataArchive[source]

Bases: object

A simple dataclass holding information about a data archive (e.g. a zip file).

__init__(path: Path, kind: DataKind = DataKind.MIXED, md5sum: str | None = None, password: str | None = None, keep_top_level: bool = False) → None

check_hash() → None[source]

Checks whether the optional md5sum of the DataArchive matches the MD5 checksum of the actual file.
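The kind of comparison described above can be sketched with the standard library alone. This is a minimal illustration, not the mml implementation: the helper names `file_md5` and `verify_md5` are hypothetical, and how `check_hash()` reacts to a mismatch (raising, warning, or something else) is not stated here, so the sketch simply returns a boolean.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path: Path, expected: Optional[str]) -> bool:
    """Return True if no checksum was given or if it matches the file.

    Assumption: a missing checksum means there is nothing to verify.
    """
    if expected is None:
        return True
    # Hex digests are compared case-insensitively.
    return file_md5(path) == expected.lower()
```

Reading in fixed-size chunks keeps memory use constant regardless of archive size, which matters for multi-gigabyte downloads.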

keep_top_level: bool = False
kind: DataKind = 'mixed_data'
md5sum: str | None = None
password: str | None = None
path: Path

class DataKind[source]

Bases: StrEnum

Kinds of data, used to sort data into distinct top-level folders. If multiple kinds are mixed, use MIXED (the default). Usage is not enforced, but it may help to structure data storage.

MIXED = 'mixed_data'
TESTING_DATA = 'testing_data'
TESTING_LABELS = 'testing_labels'
TRAINING_DATA = 'training_data'
TRAINING_LABELS = 'training_labels'
UNLABELED_DATA = 'unlabeled_data'
__new__(value)
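To illustrate how string-valued members like these can map onto top-level folders, here is a small stand-in. It is a sketch under assumptions: the enum below only mirrors the documented member values and is built on `str` plus `Enum` rather than `StrEnum` so that it also runs on Python versions before 3.11, and `target_folder` is a hypothetical helper, not part of mml.

```python
from enum import Enum
from pathlib import Path


class DataKind(str, Enum):
    """Stand-in mirroring the documented member values (not the mml class)."""

    MIXED = "mixed_data"
    TESTING_DATA = "testing_data"
    TESTING_LABELS = "testing_labels"
    TRAINING_DATA = "training_data"
    TRAINING_LABELS = "training_labels"
    UNLABELED_DATA = "unlabeled_data"


def target_folder(root: Path, kind: DataKind = DataKind.MIXED) -> Path:
    """Return the top-level folder for `kind` below `root` (hypothetical helper)."""
    return root / kind.value


# Members can be looked up by value and compare equal to plain strings.
training = DataKind("training_data")
labels_dir = target_folder(Path("data"), DataKind.TESTING_LABELS)
```

Because the member values are plain strings, they can be used directly as folder names and round-trip cleanly through configuration files.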