mml.interactive

The mml.interactive module contains helpers for using mml in interactive sessions, such as the REPL or a Jupyter notebook.

class AllTasksInfos

Bases: object

A class to store all standard meta information on a set of tasks.

__init__(num_classes: Dict[str, int], num_samples: Dict[str, int], imbalance_ratios: Dict[str, float], datasets: Dict[str, str], keywords: Dict[str, Set[Keyword]], task_types: Dict[str, TaskType], domains: Dict[str, Keyword], dimensions: Dict[str, int], max_resolution: Dict[str, int], min_resolution: Dict[str, int], small_tasks: List[str], medium_tasks: List[str], large_tasks: List[str]) → None

check_consistency()[source]: Performs assertions that all information cover the same set of tasks. :return:

datasets: Dict[str, str]

dimensions: Dict[str, int]

domains: Dict[str, Keyword]

classmethod from_csv(path: Path) → AllTasksInfos

Load stored AllTasksInfos from a csv file.

Parameters: path (Path) – path of the csv file to load
Returns: AllTasksInfos

get_transformed(transforms: Sequence[str] = ('boxcox', 'normalize')) → AllTasksInfos

Returns a modified instance of the task information in which several attributes are transformed.

Transformed attributes are: ‘num_classes’, ‘num_samples’, ‘imbalance_ratios’, ‘dimensions’, ‘max_resolution’, ‘min_resolution’

Available single transforms are: ‘boxcox’, ‘normalize’, ‘zscore’

Parameters: transforms (Sequence[str]) – a sequence of legal transforms
Returns: a modified version of the task information in which the transforms have been applied to all attributes listed above
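To make the idea of chaining transforms over a per-task attribute concrete, here is a minimal sketch in plain Python. It is not mml's implementation: it assumes that 'normalize' means min-max scaling to [0, 1] and that 'zscore' uses the population standard deviation, and it leaves out 'boxcox' because that requires fitting a lambda parameter. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, Sequence

TaskValues = Dict[str, float]


def normalize(values: TaskValues) -> TaskValues:
    """Min-max scale to [0, 1] (assumed meaning of 'normalize')."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0  # avoid division by zero for constant inputs
    return {task: (v - lo) / span for task, v in values.items()}


def zscore(values: TaskValues) -> TaskValues:
    """Center to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values.values())
    sigma = pstdev(values.values()) or 1.0
    return {task: (v - mu) / sigma for task, v in values.items()}


# hypothetical registry mapping transform names to functions
TRANSFORMS: Dict[str, Callable[[TaskValues], TaskValues]] = {
    "normalize": normalize,
    "zscore": zscore,
}


def apply_transforms(values: TaskValues, transforms: Sequence[str]) -> TaskValues:
    """Apply the named transforms one after another, in the given order."""
    for name in transforms:
        values = TRANSFORMS[name](values)
    return values


num_samples = {"task_a": 100, "task_b": 300, "task_c": 500}
scaled = apply_transforms(num_samples, ("zscore", "normalize"))
print(scaled)
```

The order of the sequence matters in this sketch, since each transform receives the output of the previous one.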

imbalance_ratios: Dict[str, float]

keywords: Dict[str, Set[Keyword]]

large_tasks: List[str]

max_resolution: Dict[str, int]

medium_tasks: List[str]

min_resolution: Dict[str, int]

num_classes: Dict[str, int]

num_samples: Dict[str, int]

small_tasks: List[str]

store_csv(path: Path) → None

Reformat meta information and write it to a csv file.

Parameters: path (Path) – path to store the csv file
Returns: None

task_types: Dict[str, TaskType]

class DefaultRequirements

Bases: JobPrefixRequirements

The default way to call MML, e.g. from a local machine (assuming it is installed and the environment is loaded).

get_prefix() → str

class EmbeddedJobRunner

Bases: JobRunner

The embedded runner starts mml directly from within the same Python interpreter, so any previously defined variables, imports, etc. are available at runtime. This also makes it possible to receive the return value of MML.

run(job: MMLJobDescription)

class JobPrefixRequirements

Bases: object

The prefix requirements of a job. Essentially resolves how to invoke mml on the system.

get_prefix() → str

class JobRunner

Bases: object

The runner that invokes the rendered MML call.

run(job: MMLJobDescription)

class MMLJobDescription

Bases: object

Combined description of an MML call. Includes prefix requirements, config options and a multirun flag for hpo.

__init__(prefix_req: JobPrefixRequirements, mode: str, config_options: Dict[str, str | float | List[str | int | float] | int], multirun: bool = False) → None

config_options: Dict[str, str | float | List[str | int | float] | int]

mode: str

multirun: bool = False

prefix_req: JobPrefixRequirements

render() → str

Renders the job description.

Returns: a string that can be pasted into a terminal to start the described job

run(runner: JobRunner) → float | None

Runs the job with the given runner.

Parameters: runner (JobRunner) – the runner used to run the job
Returns: potentially a float that represents the return value of the specified experiment (not guaranteed)
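The interplay of prefix requirements, job description and runner can be sketched with small stand-in classes. Everything below is hypothetical and only mirrors the documented attribute names; in particular, the rendering format (`key=value` pairs, comma-joined lists, a trailing `--multirun` flag) is an assumption, not the verified output of MMLJobDescription.render.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

ConfigValue = Union[str, int, float, List[Union[str, int, float]]]


class PrefixSketch:
    """Stand-in for JobPrefixRequirements: knows how to invoke the tool."""

    def get_prefix(self) -> str:
        return "mml"


@dataclass
class JobSketch:
    """Stand-in for MMLJobDescription (rendering format is assumed)."""

    prefix_req: PrefixSketch
    mode: str
    config_options: Dict[str, ConfigValue] = field(default_factory=dict)
    multirun: bool = False

    def render(self) -> str:
        parts = [self.prefix_req.get_prefix(), self.mode]
        for key, value in self.config_options.items():
            if isinstance(value, list):
                # assumed convention: lists become comma-separated values
                value = ",".join(str(v) for v in value)
            parts.append(f"{key}={value}")
        if self.multirun:
            parts.append("--multirun")  # assumed flag spelling
        return " ".join(parts)


class CollectingRunner:
    """Stand-in for a JobRunner that only records the rendered command."""

    def __init__(self) -> None:
        self.commands: List[str] = []

    def run(self, job: JobSketch) -> None:
        self.commands.append(job.render())


job = JobSketch(PrefixSketch(), "dim", {"proj": "demo"})
runner = CollectingRunner()
runner.run(job)
print(runner.commands[0])  # mml dim proj=demo
```

Separating "how to invoke" (prefix) from "what to run" (mode and options) lets the same description be rendered for different machines by swapping only the prefix object.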

class SubprocessJobRunner

Bases: JobRunner

The subprocess runner only inherits the virtual environment but starts a new process including a new interpreter. Any variables in the current interpreter will not be available during this run. It does not receive any return values of an experiment.

run(job: MMLJobDescription)
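The isolation described for the subprocess runner is a general property of starting a new interpreter, which the following sketch demonstrates with the standard library alone. It does not show how SubprocessJobRunner is implemented; the variable name `secret` is purely illustrative.

```python
import subprocess
import sys

secret = 42  # defined in the current interpreter only

# A fresh interpreter starts with its own namespace, so `secret` is unknown there.
result = subprocess.run(
    [sys.executable, "-c", "print('secret' in globals())"],
    capture_output=True,
    text=True,
    check=True,
)
child_sees_secret = result.stdout.strip() == "True"
print(child_sees_secret)  # False
```

Only what is passed explicitly (command line arguments, environment variables, files) reaches the child process, and only its exit code and output streams come back.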

default_file_manager(reuse_config: DictConfig | ReuseConfig | None = None) → Generator[MMLFileManager, None, None]

Convenience method to get an MMLFileManager instance. To be used in a with statement:

with default_file_manager() as fm:
    fm.do_something()  # placeholder call, e.g. extract information
    ...

# continue code with extracted information (without fm)

Returns: an MMLFileManager instance, provided inside the with block
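The usage pattern above (extract what is needed inside the with block, then continue without the manager) can be sketched with a generator-based context manager from the standard library. The classes and methods below are hypothetical stand-ins; whether default_file_manager is built this way is an assumption suggested only by its Generator return annotation.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


class FileManagerSketch:
    """Stand-in for MMLFileManager with one hypothetical lookup method."""

    def __init__(self) -> None:
        self.closed = False

    def describe(self, task: str) -> Dict[str, str]:
        return {"task": task, "status": "available"}


@contextmanager
def file_manager_sketch() -> Iterator[FileManagerSketch]:
    fm = FileManagerSketch()
    try:
        yield fm  # the body of the with block runs here
    finally:
        fm.closed = True  # cleanup runs when the with block exits


with file_manager_sketch() as fm:
    info = fm.describe("task_a")  # extract information while fm is active

# continue with the extracted information; fm is no longer needed
print(info)
```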

get_task_infos(task_list: List[str], dims: str | None = None) → AllTasksInfos

Most convenient way to receive an AllTasksInfos instance. Provide a list of task aliases and, optionally, the name of a project that computed dimensions before.

Parameters:

task_list (List[str]) – list of task names, tasks must be available on the machine (run create before if not)
dims (Optional[str]) – (optional) project name that computed dimensions with mml dim proj=THIS_ARG

Returns:

relevant meta information on all tasks combined in one object

Return type:

AllTasksInfos
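A typical interactive session might combine init and get_task_infos as sketched below. The snippet relies only on the signatures and attributes documented on this page; the wrapper function `summarize_tasks` and the task names in the commented call are placeholders, and actually running it requires an installed and configured mml with the tasks already created.

```python
from typing import List, Optional


def summarize_tasks(task_list: List[str], dims: Optional[str] = None) -> None:
    """Print class and sample counts per task (hypothetical helper)."""
    # imported lazily so that this snippet can be defined without mml installed
    from mml.interactive import get_task_infos, init

    init()  # load environment variables and plugins first
    infos = get_task_infos(task_list=task_list, dims=dims)
    for task, n_classes in infos.num_classes.items():
        print(task, n_classes, infos.num_samples[task])


# summarize_tasks(["task_a", "task_b"])  # placeholder task names
```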

get_task_structs(tasks: str | Sequence[str], preprocessing: str = 'default') → List[TaskStruct]

Create task structs on the fly.

Parameters:

tasks (str | Sequence[str]) – task name or sequence of task names
preprocessing (str) – the preprocessing id of the task (default: ‘default’)

Returns:

the corresponding task structs

Return type:

List[TaskStruct]

init(env_path: Path | None = None)

The init function loads environment variables and mml plugins. It is recommended as the first function call after imports within a Jupyter notebook or any other interactive session to plan, process or analyze any mml experiments.

Parameters: env_path (Optional[Path]) – as Jupyter sometimes struggles to load MML_ENV_PATH, it may be provided here

load_project_models(project: str) → Dict[str, List[ModelStorage]]

Loading utility to get all models of a given project.

Parameters: project (str) – name of the project, as passed via ‘proj=…’
Returns: dict with task names as keys and lists of all corresponding ModelStorages as values

merge_project_models(project_models_list: Iterable[Dict[str, List[ModelStorage]]]) → Dict[str, List[ModelStorage]]

Merges models loaded from multiple projects.

Parameters: project_models_list – iterable of dicts, as returned by multiple calls to load_project_models
Returns: merged dict, as if all models were trained in one single project
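The merge semantics described here amount to concatenating the per-task lists across projects. A minimal sketch, using plain strings in place of ModelStorage objects and a hypothetical function name, could look like this; the ordering of models in the real function is not documented and is assumed here to follow input order.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def merge_sketch(project_models_list: Iterable[Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Concatenate per-task model lists across several projects."""
    merged: Dict[str, List[str]] = defaultdict(list)
    for project_models in project_models_list:
        for task, models in project_models.items():
            merged[task].extend(models)  # input dicts stay unmodified
    return dict(merged)


proj_a = {"task_a": ["model_1"], "task_b": ["model_2"]}
proj_b = {"task_a": ["model_3"]}
merged = merge_sketch([proj_a, proj_b])
print(merged)  # {'task_a': ['model_1', 'model_3'], 'task_b': ['model_2']}
```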

write_out_commands(cmd_list: List[MMLJobDescription], name: str = 'output', seperator: str | None = 'sleep 2\n', max_cmds: int | None = None) → None

Writes a list of MMLJobDescription instances into a file that may be called by a shell afterward. This is particularly useful if the commands should be transferred to a different host via ssh, e.g. with:

ssh user@host 'bash -s' < /path/to/output.txt

Parameters:

cmd_list (List[MMLJobDescription]) – list of commands
name (str) – a file name to relate cmds to a common project or experiment, defaults to ‘output’
seperator (Optional[str]) – (optional) a line separator, useful if e.g. sleep X should delay cmd submission to a cluster
max_cmds (Optional[int]) – (optional) max number of cmds per file, will split into consecutive files if more cmds are present

Returns:

None
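The splitting behaviour of max_cmds and the role of the separator can be illustrated with a small stand-alone sketch that writes already rendered command strings. It is not the mml implementation: the function name, the `_0`, `_1` file suffixes and the explicit output directory are assumptions made for this example.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def write_chunks(
    cmds: List[str],
    out_dir: Path,
    name: str = "output",
    separator: Optional[str] = "sleep 2\n",
    max_cmds: Optional[int] = None,
) -> List[Path]:
    """Write commands to one or more files, at most max_cmds per file."""
    size = max_cmds or len(cmds) or 1
    chunks = [cmds[i:i + size] for i in range(0, len(cmds), size)]
    paths: List[Path] = []
    for idx, chunk in enumerate(chunks):
        # assumed naming scheme: numbered suffix only when splitting is needed
        suffix = f"_{idx}" if len(chunks) > 1 else ""
        path = out_dir / f"{name}{suffix}.txt"
        lines: List[str] = []
        for cmd in chunk:
            lines.append(cmd + "\n")
            if separator:
                lines.append(separator)  # e.g. delay between submissions
        path.write_text("".join(lines))
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    written = write_chunks(["mml dim proj=demo"] * 3, Path(tmp), max_cmds=2)
    file_names = [p.name for p in written]
    first_file = written[0].read_text()

print(file_names)  # ['output_0.txt', 'output_1.txt']
```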