Getting started
===============

This section is intended to ease the learning curve of ``mml`` and let you master it as smoothly as possible. It is split into the following parts:

* a quick overview of the **core** concepts of ``mml``
* **basic** usage of the ``mml`` CLI, for full descriptions of all command line options see :doc:`cli/overview`
* variants of hooking into ``mml`` to define your own scheduler, datasets, path assignments and more
* interactive experiment pre- and post-processing

After reading through you may want to continue with :doc:`modes` for a more detailed overview of the ``mml`` modes and :doc:`guides` for more specific use cases. Finally, if you want to dive deeper into the ``mml`` internals, read through the :doc:`api/overview` section.

Concept
-------

``mml`` is a full toolkit for deep learning on RGB images. It can be accessed through the command line interface (CLI) or interactively via jupyter notebooks. An ``experiment`` (or ``run``) comprises a single call to the ``mml`` CLI. Each experiment is assigned to a ``project``, which determines where the produced artefacts (e.g. trained models) and experiment logs reside (see `proj`_). Experiments within a project may nevertheless reuse artefacts from a different project (see `reuse`_). The imaging data (except for plotting and visual logs) is kept separately from conducted experiments and stored at the location given by the ``MML_DATA_PATH`` variable. The installation of data is performed via an explicit ``mml create`` call. ``mml pp`` allows for preprocessing data and storing the results as well. If this step is omitted or the ``preprocessing`` configuration changes (i.e. no matching preprocessed data is stored), the data will be preprocessed on the fly automatically. ``create`` and ``pp`` are two exemplary modes of ``mml`` (see `mode`_), basically top level instructions.
More fine-grained configuration is possible for pretty much all internal details, see :doc:`cli/overview`. Next to this configurability, the strengths of ``mml`` lie in its extendability, e.g. through adding more modes (see :doc:`extensions`), and its ease of data integration (see :doc:`guides`). ``mml`` also offers a plugin system and ships with a selection of useful plugins to enrich the experience (see :doc:`plugins`).

Basics
------

``mml`` has a very flexible configuration mechanism from the command line using the `hydra `_ framework. To understand the usage better we go through the configuration piece by piece.

mode
~~~~

``mml`` has a bunch of **modes** to choose from, which determine its behaviour. The results of these modes may interact with each other, e.g. first training a model on one task with ``train`` and afterwards reusing this model on a different task with transfer learning via ``tl`` (the details of the ``reuse`` functionality are explained below). Each time you call ``mml`` you should therefore specify this mode. If no mode is given, ``mml`` only shows basic information. An overview of modes can be found at :doc:`modes`. Each mode usually has further configuration options, for example a mode may split into several **subroutines** that can be composed individually. A typical call may be ``mml train mode.subroutines=[train,predict] mode.cv=false mode.nested=false``, which trains a model on fold 0 (no cross validation) of the training data and uses the same model to predict the test samples of a task.

proj
~~~~

Each ``mml`` run is assigned to a **project**, which is represented by a top level directory at the ``MML_RESULTS_PATH``. You can use any string to define such a project (``proj=fancy_new_feature``) and also reuse existing projects. By default the ``proj=default`` project is used; any new project name will automatically create the folder. Multiple runs can be assigned to the same project sequentially or even in parallel.
Inside the project folder (precisely inside its ``runs`` subdir) each run will individually create a run folder that contains e.g. the ``exp.log`` file as well as other information. Note that the results of a run are not stored at experiment level, but at project level to enable shared usage across runs (and even across projects).

.. _continue-option:

continue
~~~~~~~~

In case a run failed (e.g. CUDA OOM) or you had to stop the run for some reason, you may **continue** this run later on by specifying the continue flag. This is advantageous in case you are developing a new feature or have a long series of computations already done within the run that you do not want to repeat. You can either specify the exact ``runs`` subdir to continue (``continue=2023-03-23/11-51-42``) or, as a shortcut, start from the latest run (``continue=latest``). If you use ``continue`` any other argument besides ``proj`` is ignored.

tasks/task_list/pivot
~~~~~~~~~~~~~~~~~~~~~

To determine which tasks are processed within a mode, either use ``task_list=[task_a,task_b,task_c]`` or define a ``my_tasks.yaml`` config file in ``configs/tasks`` and simplify to ``tasks=my_tasks``. Many modes behave differently if a designated **pivot task** is given via ``pivot.name=task_a``. Note that providing ``pivot.name`` automatically adds the task to the ``task_list`` (or ``tasks`` config file) if not already present.

hydra.verbose
~~~~~~~~~~~~~

For debugging purposes you may activate verbose logging (note that ``mml`` logs both to ``stdout`` and to the ``exp.log`` file of the run) by setting the general ``hydra.verbose=true`` or by specifying the loggers/modules you want to debug, e.g. ``hydra.verbose=[mml.core,hydra]`` (see `hydra docs `_).

reuse
~~~~~

If you have produced some results and want to reuse them in another experiment (e.g. extracted features for a task in another task similarity experiment) you can use the ``reuse`` config option as shown in the examples below:

.. code-block:: bash

    mml XXX proj=test reuse=none  # won't load any reusables (default)
    mml XXX proj=test reuse.models=other_proj  # loads models from project 'other_proj'
    mml XXX proj=test reuse.predictions=[other_proj,foo_proj,baz_proj]  # loads predictions from multiple projects
    mml XXX proj=test reuse.parameters=other_proj#3  # loads parameters with number 3 from project 'other_proj'

By default the most recent results within any project are reused! Appending ``#`` and some integer refers to a specific file number (e.g. the parameter file ``model_0003.pth`` in the example above). If multiple projects are specified the last found entry is kept (e.g. if in the example above ``other_proj`` and ``foo_proj`` hold predictions for a task, but not ``baz_proj``, then the last predictions from ``foo_proj`` are reused). A fundamental exception to this mechanism are models, since here ALL models are loaded, within a project and across a given list of projects (specifying ``#`` is not allowed).

trainer
~~~~~~~

Under the hood ``mml`` uses `lightning `_ to run deep learning routines. This allows a very flexible parametrization of training behaviour through the interface of the `lightning trainer class `_. You can pass through any arguments to the trainer via ``trainer.kwarg=value`` from the CLI. (Some values are set by default by ``mml``, others are not, so you may sometimes need to add a ``+`` in front for those not used previously.)

.. code-block:: bash

    mml train proj=test trainer.accelerator=tpu  # use given TPUs for computations (default=auto)
    mml train proj=test trainer.max_epochs=40  # will stop training after 40 epochs
    mml train proj=test +trainer.profiler=advanced  # use lightning advanced profiler during training

others
~~~~~~

`Lightning `_ offers more features like callbacks and model tuning, which are mapped to the ``callbacks`` and ``tune`` CLI options within ``mml``. Furthermore there are plenty of other possibilities to set behaviour from the CLI.
To give you some examples: sampling, seed, gpus, arch, ...

.. code-block:: bash

    mml train proj=test callbacks=[mixup,swa] cbs.swa.swa_lrs=0.005  # use MixUp and SWA callbacks, set swa lr
    mml train proj=test augmentations=randaugment tune.lr=true  # use RandAugment and auto LR finder
    mml train proj=test sampling.sample_num=1000 sampling.batch_size=100  # set batch size and number of samples per epoch
    mml train proj=test seed=42 arch.name=resnet50  # use random seed 42 and a resnet50 model

Type ``mml --help`` to see all available provided config files (or look into the ``mml/configs`` folder for more details). At all times you may add ``--cfg=job`` to your command to print the fully compiled config (this may be interesting to detect new options and become aware of defaults).

--multirun
~~~~~~~~~~

Attaching ``--multirun`` to your command will start the job in hpo mode. Note that multirun does not offer the ``continue`` functionality! Read more about this in :doc:`hpo`.

Hook into MML
-------------

Depending on your use case it might be necessary to hook into the ``mml`` runtime to provide your own scheduler, datasets, path assignments and more. To make ``mml`` use a local config folder within your project read the corresponding section in :doc:`install`. There you can already create new config files or modify default configurations. But to define e.g. a new mode with a new scheduler you have to make this scheduler available inside ``mml``. There are multiple options:

* call the ``mml`` CLI from inside your code
* make your package importable and use ``hydra.instantiate`` to refer to your class/function through the configs
* provide the ``mml`` entry point from inside your package, to load it as a plugin during ``mml`` initialization
* clone the ``mml`` source code and make your adaptations directly within ``mml``

The options are ordered by increasing complexity, which means more possibilities on the one hand but also requires a deeper understanding of ``mml`` on the other.

call mml CLI
~~~~~~~~~~~~

An example for the first option is given in the quickstart guide of :doc:`index`. It involves importing the objects of ``mml`` you want to modify, e.g. to register a data creator, and finally calling the ``mml.cli.main`` function to pass any CLI parameters forward. Note that as a downside ``hydra`` cannot instantiate your defined objects unless your package is installable, and you also have no runtime access to e.g. the path assignments of the file manager.

hydra.instantiate
~~~~~~~~~~~~~~~~~

The next option is to package your code. This basically requires a ``setup.cfg`` and/or ``pyproject.toml`` file in your project. Please refer to the `packaging documentation `_ for the details of this process. Assume your package is named ``foo`` and you have a module ``foo.bar`` defining a class ``BuzzScheduler`` (inherited from :class:`~mml.core.scripts.base_scheduler.AbstractBaseScheduler`). Then you could create a new config file ``buzz.yaml`` inside ``configs/mode`` as follows:

.. code-block:: yaml

    # @package _global_
    defaults:
      - override /augmentations: no_norm
      - override /sampling: extraction_default

    mode:
      id: BUZZ
      scheduler:
        _target_: foo.bar.BuzzScheduler
      subroutines:
        - a
        - b
      var_one: 1337
      var_two: 42

    sampling:
      sample_num: 1000

This will behave as follows: when hydra compiles the config for a CLI command starting like ``mml buzz``, the ``buzz.yaml`` file is included and overrides the default ``augmentations`` and ``sampling`` configs. Furthermore it overwrites the ``sampling.sample_num`` value, and when ``MML`` starts it will use ``hydra.instantiate`` to load the ``foo.bar.BuzzScheduler`` scheduler. The scheduler may implement one or multiple subroutines determining its behaviour and also take the ``cfg.mode.var_one`` and ``cfg.mode.var_two`` values into consideration. See :doc:`extensions` for more details on writing your own scheduler.
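
To illustrate why the package has to be importable, the following is a minimal sketch of what resolving a ``_target_`` entry roughly amounts to. It is a simplified stand-in written with the standard library only, not the actual implementation used by ``hydra`` or ``mml``. The helper names ``resolve_target`` and ``instantiate_sketch`` are hypothetical, and ``fractions.Fraction`` merely stands in for ``foo.bar.BuzzScheduler``:

.. code-block:: python

    import importlib
    from typing import Any, Mapping


    def resolve_target(dotted_path: str) -> Any:
        """Import the module part of ``dotted_path`` and return the named attribute."""
        module_name, _, attr_name = dotted_path.rpartition(".")
        # fails with ModuleNotFoundError if the package is not importable
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)


    def instantiate_sketch(node: Mapping[str, Any]) -> Any:
        """Call the ``_target_`` of a config node, passing all other keys as keyword arguments."""
        kwargs = {key: value for key, value in node.items() if key != "_target_"}
        return resolve_target(node["_target_"])(**kwargs)


    # stand-in for a config node like ``scheduler: {_target_: foo.bar.BuzzScheduler}``
    node = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
    obj = instantiate_sketch(node)
    print(obj)  # prints 3/4

If ``foo`` were not installed in the active environment, the import step in such a lookup would fail, which is why this option requires packaging your code first.
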

entry point
~~~~~~~~~~~

If you want to modify or extend ``MML``'s behaviour outside the scope of a single class (like the scheduler above) and provide e.g. additional options to some of the core functions, like a new method of :class:`~mml.core.scripts.base_scheduler.AbstractBaseScheduler`, or automatically register a :class:`~mml.core.data_preparation.task_creator.TaskCreator` to be available in mode ``create``, you can make your package a plugin of ``MML`` by adding the following section to your ``setup.cfg``:

.. code-block:: none

    [options.entry_points]
    mml.plugins =
        your_plugin_name = foo.bar

Each time ``MML`` starts all available plugins are loaded automatically, which means importing ``foo.bar`` in the above case. The ``__init__.py`` file of this module may then modify ``MML`` internals. You can find examples for this at :doc:`plugins`.

edit source
~~~~~~~~~~~

Finally, consider cloning the ``MML`` repository and modifying its behaviour directly at the source. Have a look at :doc:`api/overview` as a good starting point to navigate through the internals of ``MML``.

Pre- and Post-processing
------------------------

experiment preparation
~~~~~~~~~~~~~~~~~~~~~~

Especially for planning ``MML`` experiments there is the :mod:`mml.interactive` module, offering the :class:`~mml.interactive.planning.MMLJobDescription` class. The following snippet shows an example usage:

.. code-block:: python

    from mml.interactive import DefaultRequirements, EmbeddedJobRunner, MMLJobDescription, write_out_commands

    reqs = DefaultRequirements()
    project = 'my_project'
    all_tasks = ['task_a', 'task_b', 'task_c']
    cmds = list()
    # step one: task creation
    cmds.append(MMLJobDescription(prefix_req=reqs, mode='create',
                                  config_options={'task_list': all_tasks, 'proj': project}))
    # step two: make sure all tasks are preprocessed
    cmds.append(MMLJobDescription(prefix_req=reqs, mode='pp',
                                  config_options={'task_list': all_tasks, 'proj': project}))
    # (optional) step three: modify tasks (in this case create subsets)
    cmds.append(MMLJobDescription(prefix_req=reqs, mode='info',
                                  config_options={'task_list': all_tasks, 'tagging.all': '+subset?0_1',
                                                  'proj': project}))
    # now either put all commands into a bash file
    write_out_commands(cmd_list=cmds, name='my_commands_file.txt')
    # or run them directly
    runner = EmbeddedJobRunner()
    for job in cmds:
        runner.run(job=job)

experiment evaluation
~~~~~~~~~~~~~~~~~~~~~

The MML framework offers extensive log information, both to the console and to the ``exp.log`` file of each run. In addition any NN training can be monitored by an experiment logger. By default ``logging.exp_logger=tensorboard`` is active. To view this information you need to install `tensorboard `_. This can for example be done via

.. code-block:: bash

    pip install tensorboard

To start tensorboard call

.. code-block:: bash

    tensorboard --logdir path/to/results

and navigate to ``localhost:6006`` within your preferred browser. Loss curves and metrics are shown in the ``SCALARS`` tab. Multirun experiments may be best inspected via the ``HPARAMS`` tab, comparing specific combinations of hyperparameters with different views. Setting ``logging.samples=n`` also logs n sample images with model predictions per epoch in the ``IMAGES`` tab. There you can also find a confusion matrix for each epoch if ``logging.cm`` is set to ``true``.

For large scale evaluation, loading model storages and the corresponding pipelines is as easy as follows with the ``mml.interactive`` module:

.. code-block:: python

    import mml.interactive

    # some interactive sessions do not inherit the MML_ENV_PATH env variable, you may provide this directly
    mml.interactive.init(env_path=...)
    # this will return a dictionary with task names as keys,
    # each value is a list of all instantiated model storages of that task
    all_models = mml.interactive.load_project_models(project='my_project')
    model_storage = all_models['my_sample_task'][0]
    model_storage.metrics  # holds all train/val metrics across the training
    model_storage.pipeline  # holds the path to the yaml file specifying all relevant training configurations
    model_storage.parameters  # holds path to model weights after training
    model_storage.predictions  # holds paths to all predictions made with this model
    ...