mml.core.scripts.schedulers.clean_scheduler

class CleanScheduler[source]

AbstractBaseScheduler implementation for the cleaning of files. Includes the following subroutines:

- temp
- download

__init__(cfg: DictConfig)[source]
before_finishing_hook()[source]
create_routine()[source]

This scheduler implements two subroutines, one for cleaning temp files and one for cleaning downloads.

Returns:

None

log_cumulative_sizes() → None[source]
prepare_exp() → None[source]

Task struct creation is skipped, since it is not needed and task creation may not have finished.

remove_downloads(dset_name: str) → None[source]

Routine to remove the downloads of a dataset. May remove data for multiple tasks at once! Make sure that original download data is still available later on for full reproducibility.

Parameters:

dset_name (str) – name of the dataset whose downloads are removed

Returns:

None

remove_temp_files(task_name: str) → None[source]

Routine to remove temporary files that may be left behind as artefacts of task creation. They are located inside the data path, below either RAW or PREPROCESSED, inside the dataset folders, and are named temp.json or temp_X.json with integer X.

Parameters:

task_name (str) – name of the task whose temp files are removed (affects both the RAW and PREPROCESSED variants)

Returns:

None
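
The naming rule for temp files described under remove_temp_files can be illustrated with a small standalone sketch. This is not the mml implementation: the helpers is_temp_file and find_temp_files are hypothetical names introduced here, and the recursive search below a given root folder is an assumption about how such files might be located. The sketch only lists candidates and deletes nothing.

```python
import re
from pathlib import Path
from typing import List

# Matches "temp.json" and "temp_<integer>.json", the naming described above.
TEMP_FILE_PATTERN = re.compile(r"temp(_\d+)?\.json")


def is_temp_file(name: str) -> bool:
    """Return True if a file name follows the temp-file naming convention."""
    return TEMP_FILE_PATTERN.fullmatch(name) is not None


def find_temp_files(root: Path) -> List[Path]:
    """Recursively collect temp files below root.

    Hypothetical helper for illustration only, not part of mml.
    """
    return sorted(
        path
        for path in root.rglob("*.json")
        if path.is_file() and is_temp_file(path.name)
    )
```

Listing the candidates first (for example with find_temp_files(Path("some/dataset/folder"))) and reviewing them before calling Path.unlink on each is a cautious design choice for any cleaning step, since removals cannot be undone.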