Training mode
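Train mode handles training, testing and prediction of tasks; the mode.subroutines option selects which of these steps run. By default, nesting and cross-validation are active. The run below sets trainer.max_epochs=2 and tune.lr=false to keep the demonstration short.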
!mml train tasks=fake trainer.max_epochs=2 tune.lr=false proj=DEMO
[2024-01-17 15:29:36,209][mml][INFO] - Started MML 0.12.0 on Python 3.8.13 with mode TRAIN.
[2024-01-17 15:29:36,209][mml][INFO] - Plugins loaded: ['mml-tasks', 'mml-tags', 'mml-dimensionality', 'mml-inference', 'mml-sql', 'mml-similarity']
[2024-01-17 15:29:36,355][mml.core.scripts.schedulers.base_scheduler][INFO] - Pivot task is mml_fake_task.
[2024-01-17 15:29:36,361][py.warnings][WARNING] - /home/scholzpa/Documents/development/gitlab/mml/src/mml/core/scripts/schedulers/train_scheduler.py:99: UserWarning: Cross-Validation will store 5 model parameters. To reduce memory consumption you may consider either setting mode.store_parameters=false (which will omit storing the model parameters) or reuse.clean_up.parameters=true (which deletes the model parameters at the end of the experiment.
[2024-01-17 15:29:36,361][mml][INFO] - MML init time was 0.0h 0.0m 0.15s.
[2024-01-17 15:29:36,362][mml.core.scripts.schedulers.base_scheduler][INFO] - Preparing experiment ...
[2024-01-17 15:29:36,366][mml.core.scripts.schedulers.base_scheduler][INFO] - Starting experiment!
[2024-01-17 15:29:36,367][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting training for task mml_fake_task+nested?0 and fold 0.
[2024-01-17 15:29:36,802][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 15:29:36,984][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 15:29:37,011][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 15:29:37,283][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 15:29:37,292][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 15:29:37,292][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 15:29:37,292][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 15:29:37,292][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------
0 | model | TimmGenericModel | 21.3 M
1 | criteria | ModuleDict | 0
2 | train_metrics | ModuleDict | 0
3 | val_metrics | ModuleDict | 0
4 | test_metrics | ModuleDict | 0
5 | train_cms | ModuleDict | 0
6 | val_cms | ModuleDict | 0
7 | test_cms | ModuleDict | 0
---------------------------------------------------
21.3 M Trainable params
0 Non-trainable params
21.3 M Total params
85.159 Total estimated model params size (MB)
[2024-01-17 15:29:37,690][py.warnings][WARNING] - /home/scholzpa/miniconda3/envs/mml/lib/python3.8/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (3) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
Epoch 0: 100%|██████| 3/3 [00:03<00:00, 0.84it/s, v_num=9-36, train/loss=2.360]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 2.90it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 2.31it/s, v_num=9-36, train/loss=2.310]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 14.81it/s]
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.39it/s, v_num=9-36, train/loss=2.310][2024-01-17 15:29:45,203][lightning_fabric.utilities.rank_zero][INFO] - `Trainer.fit` stopped: `max_epochs=2` reached.
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.38it/s, v_num=9-36, train/loss=2.310]
[2024-01-17 15:29:45,689][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished training for task mml_fake_task+nested?0 and fold 0.
[2024-01-17 15:29:45,690][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting training for task mml_fake_task+nested?1 and fold 0.
[2024-01-17 15:29:45,902][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 15:29:46,029][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 15:29:46,057][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 15:29:46,164][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 15:29:46,172][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 15:29:46,172][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 15:29:46,172][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 15:29:46,172][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------
0 | model | TimmGenericModel | 21.3 M
1 | criteria | ModuleDict | 0
2 | train_metrics | ModuleDict | 0
3 | val_metrics | ModuleDict | 0
4 | test_metrics | ModuleDict | 0
5 | train_cms | ModuleDict | 0
6 | val_cms | ModuleDict | 0
7 | test_cms | ModuleDict | 0
---------------------------------------------------
21.3 M Trainable params
0 Non-trainable params
21.3 M Total params
85.159 Total estimated model params size (MB)
Epoch 0: 100%|██████| 3/3 [00:01<00:00, 1.96it/s, v_num=9-45, train/loss=2.350]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 14.95it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 2.48it/s, v_num=9-45, train/loss=2.280]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 15.38it/s]
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.49it/s, v_num=9-45, train/loss=2.280][2024-01-17 15:29:51,606][lightning_fabric.utilities.rank_zero][INFO] - `Trainer.fit` stopped: `max_epochs=2` reached.
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.47it/s, v_num=9-45, train/loss=2.280]
[2024-01-17 15:29:52,102][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished training for task mml_fake_task+nested?1 and fold 0.
[2024-01-17 15:29:52,103][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting training for task mml_fake_task+nested?2 and fold 0.
[2024-01-17 15:29:52,456][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 15:29:52,577][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 15:29:52,603][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 15:29:52,703][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 15:29:52,712][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 15:29:52,712][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 15:29:52,712][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 15:29:52,712][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------
0 | model | TimmGenericModel | 21.3 M
1 | criteria | ModuleDict | 0
2 | train_metrics | ModuleDict | 0
3 | val_metrics | ModuleDict | 0
4 | test_metrics | ModuleDict | 0
5 | train_cms | ModuleDict | 0
6 | val_cms | ModuleDict | 0
7 | test_cms | ModuleDict | 0
---------------------------------------------------
21.3 M Trainable params
0 Non-trainable params
21.3 M Total params
85.159 Total estimated model params size (MB)
Epoch 0: 67%|████ | 2/3 [00:01<00:00, 1.71it/s, v_num=9-52, train/loss=2.360][2024-01-17 15:29:54,479][py.warnings][WARNING] - /home/scholzpa/miniconda3/envs/mml/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: Average precision score for one or more classes was `nan`. Ignoring these classes in macro-average
[2024-01-17 15:29:54,482][py.warnings][WARNING] - /home/scholzpa/miniconda3/envs/mml/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:43: UserWarning: No positive samples in targets, true positive value should be meaningless. Returning zero tensor in true positive score
Epoch 0: 100%|██████| 3/3 [00:01<00:00, 1.92it/s, v_num=9-52, train/loss=2.340]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 14.62it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 2.54it/s, v_num=9-52, train/loss=2.280]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 16.41it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 1.51it/s, v_num=9-52, train/loss=2.280][2024-01-17 15:29:58,202][lightning_fabric.utilities.rank_zero][INFO] - `Trainer.fit` stopped: `max_epochs=2` reached.
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.49it/s, v_num=9-52, train/loss=2.280]
[2024-01-17 15:29:58,688][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished training for task mml_fake_task+nested?2 and fold 0.
[2024-01-17 15:29:58,690][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting training for task mml_fake_task+nested?3 and fold 0.
[2024-01-17 15:29:58,903][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 15:29:59,021][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 15:29:59,047][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 15:29:59,154][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 15:29:59,163][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 15:29:59,163][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 15:29:59,163][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 15:29:59,163][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------
0 | model | TimmGenericModel | 21.3 M
1 | criteria | ModuleDict | 0
2 | train_metrics | ModuleDict | 0
3 | val_metrics | ModuleDict | 0
4 | test_metrics | ModuleDict | 0
5 | train_cms | ModuleDict | 0
6 | val_cms | ModuleDict | 0
7 | test_cms | ModuleDict | 0
---------------------------------------------------
21.3 M Trainable params
0 Non-trainable params
21.3 M Total params
85.159 Total estimated model params size (MB)
Epoch 0: 100%|██████| 3/3 [00:01<00:00, 1.95it/s, v_num=9-58, train/loss=2.370]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 2.92it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 2.57it/s, v_num=9-58, train/loss=2.300]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 15.46it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 1.51it/s, v_num=9-58, train/loss=2.300][2024-01-17 15:30:04,883][lightning_fabric.utilities.rank_zero][INFO] - `Trainer.fit` stopped: `max_epochs=2` reached.
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.49it/s, v_num=9-58, train/loss=2.300]
[2024-01-17 15:30:05,431][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished training for task mml_fake_task+nested?3 and fold 0.
[2024-01-17 15:30:05,432][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting training for task mml_fake_task+nested?4 and fold 0.
[2024-01-17 15:30:05,643][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 15:30:05,763][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 15:30:05,790][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 15:30:05,898][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 15:30:05,909][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 15:30:05,909][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 15:30:05,909][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 15:30:05,909][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------
0 | model | TimmGenericModel | 21.3 M
1 | criteria | ModuleDict | 0
2 | train_metrics | ModuleDict | 0
3 | val_metrics | ModuleDict | 0
4 | test_metrics | ModuleDict | 0
5 | train_cms | ModuleDict | 0
6 | val_cms | ModuleDict | 0
7 | test_cms | ModuleDict | 0
---------------------------------------------------
21.3 M Trainable params
0 Non-trainable params
21.3 M Total params
85.159 Total estimated model params size (MB)
Epoch 0: 100%|██████| 3/3 [00:01<00:00, 1.91it/s, v_num=0-05, train/loss=2.380]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 14.40it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 2.47it/s, v_num=0-05, train/loss=2.330]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 14.39it/s]
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.48it/s, v_num=0-05, train/loss=2.330][2024-01-17 15:30:11,459][lightning_fabric.utilities.rank_zero][INFO] - `Trainer.fit` stopped: `max_epochs=2` reached.
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.45it/s, v_num=0-05, train/loss=2.330]
[2024-01-17 15:30:11,970][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished training for task mml_fake_task+nested?4 and fold 0.
[2024-01-17 15:30:11,971][mml.core.data_loading.file_manager][INFO] - A total of 15 paths have been created during this run.
[2024-01-17 15:30:12,081][mml.core.scripts.schedulers.base_scheduler][INFO] - Successfully finished all experiments!
[2024-01-17 15:30:12,081][mml][INFO] - MML run time was 0.0h 0.0m 35.72s.
[2024-01-17 15:30:12,081][mml][INFO] - Return value is 2.3153738498687746.
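The log lines above follow a regular `[timestamp][logger][LEVEL] - message` layout, so a captured run can be inspected programmatically. The following is a minimal sketch and not part of MML: the names `LogRecord`, `parse_line` and `return_value` are hypothetical, and the pattern is inferred only from the output shown here.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Layout inferred from the output above: [time][logger][LEVEL] - message
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\[(?P<logger>[^\]]+)\]\[(?P<level>[A-Z]+)\] - (?P<message>.*)$"
)


class LogRecord(NamedTuple):
    time: str
    logger: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogRecord]:
    """Return a LogRecord for a log line, or None for progress bars and tables."""
    match = LOG_LINE.match(line.strip())
    return LogRecord(**match.groupdict()) if match else None


def return_value(lines: Iterable[str]) -> Optional[float]:
    """Extract the last 'Return value is ...' reported by the mml logger."""
    prefix = "Return value is "
    value = None
    for line in lines:
        record = parse_line(line)
        if record and record.logger == "mml" and record.message.startswith(prefix):
            # Strip the trailing full stop before converting, e.g. "2.31." or "nan."
            value = float(record.message[len(prefix):].rstrip("."))
    return value
```

Lines where a log message is appended to a progress bar (as in the `Trainer.fit` stopped messages) do not start with `[` and are skipped by this sketch.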
You can reuse the trained models for prediction by running only the predict subroutine and pointing reuse.models at the project that holds them.
!mml train tasks=fake mode.subroutines=[predict] proj=DEMO reuse.models=DEMO
[2024-01-17 19:43:12,136][mml][INFO] - Started MML 0.12.0 on Python 3.8.13 with mode TRAIN.
[2024-01-17 19:43:12,136][mml][INFO] - Plugins loaded: ['mml-tasks', 'mml-tags', 'mml-dimensionality', 'mml-inference', 'mml-sql', 'mml-similarity']
[2024-01-17 19:43:12,283][mml.core.scripts.schedulers.base_scheduler][INFO] - Pivot task is mml_fake_task.
[2024-01-17 19:43:12,294][py.warnings][WARNING] - /home/scholzpa/Documents/development/gitlab/mml/src/mml/core/scripts/schedulers/train_scheduler.py:99: UserWarning: Cross-Validation will store 5 model parameters. To reduce memory consumption you may consider either setting mode.store_parameters=false (which will omit storing the model parameters) or reuse.clean_up.parameters=true (which deletes the model parameters at the end of the experiment.
[2024-01-17 19:43:12,295][mml][INFO] - MML init time was 0.0h 0.0m 0.16s.
[2024-01-17 19:43:12,296][mml.core.scripts.schedulers.base_scheduler][INFO] - Preparing experiment ...
[2024-01-17 19:43:12,302][mml.core.scripts.schedulers.base_scheduler][INFO] - Starting experiment!
[2024-01-17 19:43:12,303][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting predicting for task mml_fake_task+nested?0 and fold 0.
[2024-01-17 19:43:12,303][mml.core.scripts.schedulers.train_scheduler][INFO] - Found 6 matching model storages, used the latest from 2024-01-17 19:22:20.
[2024-01-17 19:43:12,755][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:43:12,961][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:43:12,990][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:43:13,252][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:43:13,260][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:43:13,260][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:43:13,261][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:43:13,261][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-01-17 19:43:13,338][lightning_fabric.utilities.rank_zero][INFO] - Restoring states from the checkpoint path at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?0/model_0005.pth
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-01-17 19:43:13,638][lightning_fabric.utilities.rank_zero][INFO] - Loaded model weights from the checkpoint at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?0/model_0005.pth
Predicting DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 1.74it/s]
[2024-01-17 19:43:15,146][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished predicting for task mml_fake_task+nested?0 and fold 0.
[2024-01-17 19:43:15,148][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting predicting for task mml_fake_task+nested?1 and fold 0.
[2024-01-17 19:43:15,148][mml.core.scripts.schedulers.train_scheduler][INFO] - Found 3 matching model storages, used the latest from 2024-01-17 15:29:52.
[2024-01-17 19:43:15,373][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:43:15,489][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:43:15,516][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:43:15,618][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:43:15,626][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:43:15,626][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:43:15,626][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:43:15,626][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-01-17 19:43:15,640][lightning_fabric.utilities.rank_zero][INFO] - Restoring states from the checkpoint path at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?1/model_0002.pth
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-01-17 19:43:15,923][lightning_fabric.utilities.rank_zero][INFO] - Loaded model weights from the checkpoint at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?1/model_0002.pth
Predicting DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 2.59it/s]
[2024-01-17 19:43:17,198][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished predicting for task mml_fake_task+nested?1 and fold 0.
[2024-01-17 19:43:17,200][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting predicting for task mml_fake_task+nested?2 and fold 0.
[2024-01-17 19:43:17,200][mml.core.scripts.schedulers.train_scheduler][INFO] - Found 3 matching model storages, used the latest from 2024-01-17 15:29:58.
[2024-01-17 19:43:17,425][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:43:17,543][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:43:17,569][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:43:17,816][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:43:17,825][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:43:17,825][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:43:17,825][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:43:17,825][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-01-17 19:43:17,838][lightning_fabric.utilities.rank_zero][INFO] - Restoring states from the checkpoint path at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?2/model_0002.pth
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-01-17 19:43:18,108][lightning_fabric.utilities.rank_zero][INFO] - Loaded model weights from the checkpoint at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?2/model_0002.pth
Predicting DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 2.53it/s]
[2024-01-17 19:43:19,431][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished predicting for task mml_fake_task+nested?2 and fold 0.
[2024-01-17 19:43:19,432][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting predicting for task mml_fake_task+nested?3 and fold 0.
[2024-01-17 19:43:19,432][mml.core.scripts.schedulers.train_scheduler][INFO] - Found 3 matching model storages, used the latest from 2024-01-17 15:30:05.
[2024-01-17 19:43:19,654][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:43:19,772][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:43:19,796][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:43:19,903][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:43:19,912][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:43:19,912][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:43:19,913][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:43:19,913][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-01-17 19:43:19,929][lightning_fabric.utilities.rank_zero][INFO] - Restoring states from the checkpoint path at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?3/model_0002.pth
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-01-17 19:43:20,196][lightning_fabric.utilities.rank_zero][INFO] - Loaded model weights from the checkpoint at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?3/model_0002.pth
Predicting DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 2.60it/s]
[2024-01-17 19:43:21,442][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished predicting for task mml_fake_task+nested?3 and fold 0.
[2024-01-17 19:43:21,443][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting predicting for task mml_fake_task+nested?4 and fold 0.
[2024-01-17 19:43:21,444][mml.core.scripts.schedulers.train_scheduler][INFO] - Found 3 matching model storages, used the latest from 2024-01-17 15:30:11.
[2024-01-17 19:43:21,657][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:43:21,774][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:43:21,802][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:43:21,905][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:43:21,913][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:43:21,913][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:43:21,913][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:43:21,914][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-01-17 19:43:21,933][lightning_fabric.utilities.rank_zero][INFO] - Restoring states from the checkpoint path at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?4/model_0002.pth
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-01-17 19:43:22,201][lightning_fabric.utilities.rank_zero][INFO] - Loaded model weights from the checkpoint at /home/scholzpa/Documents/exp/mml_results/DEMO/PARAMETERS/mml_fake_task+nested?4/model_0002.pth
Predicting DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 2.54it/s]
[2024-01-17 19:43:23,467][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished predicting for task mml_fake_task+nested?4 and fold 0.
[2024-01-17 19:43:23,468][py.warnings][WARNING] - /home/scholzpa/miniconda3/envs/mml/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3432: RuntimeWarning: Mean of empty slice.
[2024-01-17 19:43:23,469][py.warnings][WARNING] - /home/scholzpa/miniconda3/envs/mml/lib/python3.8/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
[2024-01-17 19:43:23,469][mml.core.data_loading.file_manager][INFO] - A total of 10 paths have been created during this run.
[2024-01-17 19:43:23,469][mml.core.scripts.schedulers.base_scheduler][INFO] - Successfully finished all experiments!
[2024-01-17 19:43:23,469][mml][INFO] - MML run time was 0.0h 0.0m 11.17s.
[2024-01-17 19:43:23,470][mml][INFO] - Return value is nan.
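When models are reused, it is worth checking which stored model was actually picked. In the output above, the first nested split found 6 matching storages and used one from 19:22:20, which does not belong to the training run shown earlier (15:29 to 15:30), while the other splits used the models from that run. The sketch below collects this information from captured log lines. It is an illustration only: `reused_models` is a hypothetical helper, and the patterns are inferred from the messages shown here.

```python
import re
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Message patterns inferred from the log output above.
START = re.compile(r"Starting (?:predicting|testing) for task (?P<task>\S+)")
FOUND = re.compile(
    r"Found (?P<count>\d+) matching model storages, "
    r"used the latest from (?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def reused_models(lines: Iterable[str]) -> Dict[str, Tuple[int, datetime]]:
    """Map each task to (number of matching storages, timestamp of the one used)."""
    result: Dict[str, Tuple[int, datetime]] = {}
    task = None
    for line in lines:
        start = START.search(line)
        if start:
            # Remember the task so the following "Found ..." line can be attributed.
            task = start.group("task")
            continue
        found = FOUND.search(line)
        if found and task is not None:
            stamp = datetime.strptime(found.group("stamp"), "%Y-%m-%d %H:%M:%S")
            result[task] = (int(found.group("count")), stamp)
    return result
```

Comparing the returned timestamps against the start time of the intended training run is a quick way to spot a split that resolved to a different model than expected.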
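Training and testing can also be combined in a single run. Testing does not support cross-validation, though, so the command below sets mode.cv=false. Because mode.nested=true remains active, the test subroutine runs on the hold-out fold rather than on the task's official test split, as the warning in the output notes.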
!mml train tasks=fake mode.subroutines=[train,test] proj=DEMO2 trainer.max_epochs=2 tune.lr=false mode.cv=false
[2024-01-17 19:45:23,134][mml][INFO] - Started MML 0.12.0 on Python 3.8.13 with mode TRAIN.
[2024-01-17 19:45:23,134][mml][INFO] - Plugins loaded: ['mml-tasks', 'mml-tags', 'mml-dimensionality', 'mml-inference', 'mml-sql', 'mml-similarity']
[2024-01-17 19:45:23,285][mml.core.scripts.schedulers.base_scheduler][INFO] - Pivot task is mml_fake_task.
[2024-01-17 19:45:23,291][py.warnings][WARNING] - /home/scholzpa/Documents/development/gitlab/mml/src/mml/core/scripts/schedulers/train_scheduler.py:107: UserWarning: Chose mode.nested=true so the testing subroutine will be performed NOT on the (potential) official task test split, but on the hold-out fold.
[2024-01-17 19:45:23,292][mml][INFO] - MML init time was 0.0h 0.0m 0.16s.
[2024-01-17 19:45:23,293][mml.core.scripts.schedulers.base_scheduler][INFO] - Preparing experiment ...
[2024-01-17 19:45:23,294][mml.core.scripts.schedulers.base_scheduler][INFO] - Starting experiment!
[2024-01-17 19:45:23,295][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting training for task mml_fake_task+nested?0 and fold 0.
[2024-01-17 19:45:23,741][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:45:23,955][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:45:23,979][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:45:24,232][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:45:24,240][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:45:24,240][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:45:24,240][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:45:24,240][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------
0 | model | TimmGenericModel | 21.3 M
1 | criteria | ModuleDict | 0
2 | train_metrics | ModuleDict | 0
3 | val_metrics | ModuleDict | 0
4 | test_metrics | ModuleDict | 0
5 | train_cms | ModuleDict | 0
6 | val_cms | ModuleDict | 0
7 | test_cms | ModuleDict | 0
---------------------------------------------------
21.3 M Trainable params
0 Non-trainable params
21.3 M Total params
85.159 Total estimated model params size (MB)
[2024-01-17 19:45:24,662][py.warnings][WARNING] - /home/scholzpa/miniconda3/envs/mml/lib/python3.8/site-packages/lightning/pytorch/loops/fit_loop.py:293: The number of training batches (3) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
Epoch 0: 100%|██████| 3/3 [00:03<00:00, 0.81it/s, v_num=5-23, train/loss=2.360]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 3.01it/s]
Epoch 1: 100%|██████| 3/3 [00:01<00:00, 2.40it/s, v_num=5-23, train/loss=2.310]
Validation: | | 0/? [00:00<?, ?it/s]
Validation: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]
Validation DataLoader 0: 100%|████████████████████| 1/1 [00:00<00:00, 14.96it/s]
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.44it/s, v_num=5-23, train/loss=2.310][2024-01-17 19:45:32,217][lightning_fabric.utilities.rank_zero][INFO] - `Trainer.fit` stopped: `max_epochs=2` reached.
Epoch 1: 100%|██████| 3/3 [00:02<00:00, 1.42it/s, v_num=5-23, train/loss=2.310]
[2024-01-17 19:45:32,699][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished training for task mml_fake_task+nested?0 and fold 0.
[2024-01-17 19:45:32,701][mml.core.scripts.schedulers.train_scheduler][INFO] - Starting testing for task mml_fake_task+nested?0
[2024-01-17 19:45:32,701][mml.core.scripts.schedulers.train_scheduler][INFO] - Found 1 matching model storages, used the latest from 2024-01-17 19:45:32.
[2024-01-17 19:45:32,921][timm.models._builder][INFO] - Loading pretrained weights from Hugging Face hub (timm/resnet34.a1_in1k)
[2024-01-17 19:45:33,040][timm.models._hub][INFO] - [timm/resnet34.a1_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
[2024-01-17 19:45:33,066][mml.core.models.lightning_single_frame][INFO] - Since sampling is unbalanced will try to auto activate loss weights for classes.
[2024-01-17 19:45:33,171][lightning_fabric.utilities.rank_zero][INFO] - Using 16bit Automatic Mixed Precision (AMP)
[2024-01-17 19:45:33,180][lightning_fabric.utilities.rank_zero][INFO] - GPU available: True (cuda), used: True
[2024-01-17 19:45:33,180][lightning_fabric.utilities.rank_zero][INFO] - TPU available: False, using: 0 TPU cores
[2024-01-17 19:45:33,180][lightning_fabric.utilities.rank_zero][INFO] - IPU available: False, using: 0 IPUs
[2024-01-17 19:45:33,180][lightning_fabric.utilities.rank_zero][INFO] - HPU available: False, using: 0 HPUs
[2024-01-17 19:45:33,190][lightning_fabric.utilities.rank_zero][INFO] - Restoring states from the checkpoint path at /home/scholzpa/Documents/exp/mml_results/DEMO2/PARAMETERS/mml_fake_task+nested?0/model_0001.pth
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[2024-01-17 19:45:33,430][lightning_fabric.utilities.rank_zero][INFO] - Loaded model weights from the checkpoint at /home/scholzpa/Documents/exp/mml_results/DEMO2/PARAMETERS/mml_fake_task+nested?0/model_0001.pth
Testing DataLoader 0: 100%|███████████████████████| 1/1 [00:00<00:00, 1.51it/s]
────────────────────────────────────────────────────────────────────────────────
Test metric DataLoader 0
────────────────────────────────────────────────────────────────────────────────
test/loss 2.314453125
test/mml_fake_task+nested?0/MulticlassAUROC_mean 0.5028374791145325
test/mml_fake_task+nested?0/MulticlassAUROC_std 0.015444174408912659
test/mml_fake_task+nested?0/MulticlassAccuracy_mean 0.0963830053806305
test/mml_fake_task+nested?0/MulticlassAccuracy_std 0.004115646705031395
test/mml_fake_task+nested?0/MulticlassAveragePrecision_mean 0.13723304867744446
test/mml_fake_task+nested?0/MulticlassAveragePrecision_std 0.014806008897721767
test/mml_fake_task+nested?0/MulticlassCalibrationError_mean 0.04819680005311966
test/mml_fake_task+nested?0/MulticlassCalibrationError_std 0.020869700238108635
test/mml_fake_task+nested?0/MulticlassF1Score_mean 0.016199154779314995
test/mml_fake_task+nested?0/MulticlassF1Score_std 0.003188840113580227
test/mml_fake_task+nested?0/MulticlassMatthewsCorrCoef_mean -0.04116348177194595
test/mml_fake_task+nested?0/MulticlassMatthewsCorrCoef_std 0.039030902087688446
test/mml_fake_task+nested?0/MulticlassPrecision_mean 0.009331938810646534
test/mml_fake_task+nested?0/MulticlassPrecision_std 0.0018178964965045452
test/mml_fake_task+nested?0/MulticlassRecall_mean 0.09479976445436478
test/mml_fake_task+nested?0/MulticlassRecall_std 0.004909082315862179
test/mml_fake_task+nested?0/loss 2.3151323795318604
────────────────────────────────────────────────────────────────────────────────
[2024-01-17 19:45:34,969][mml.core.scripts.schedulers.train_scheduler][INFO] - Results: [{'test/mml_fake_task+nested?0/loss': 2.3151323795318604, 'test/loss': 2.314453125}]
[2024-01-17 19:45:34,969][mml.core.scripts.schedulers.train_scheduler][INFO] - Finished testing for task mml_fake_task+nested?0
[2024-01-17 19:45:34,970][mml.core.data_loading.file_manager][INFO] - A total of 3 paths have been created during this run.
[2024-01-17 19:45:34,996][mml.core.scripts.schedulers.base_scheduler][INFO] - Successfully finished all experiments!
[2024-01-17 19:45:34,996][mml][INFO] - MML run time was 0.0h 0.0m 11.70s.
[2024-01-17 19:45:34,996][mml][INFO] - Return value is 2.3151772022247314.
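The test metrics in the output above look like what an untrained classifier would produce, which is expected after only two epochs on the fake task. An AUROC of about 0.50 is chance level regardless of the number of classes. The loss can be checked in a similar way: a model that predicts a uniform distribution over K classes has a cross-entropy of ln(K). The following minimal sketch compares the reported values against that baseline. The class count of 10 is an assumption inferred from the reported accuracy and loss, not something stated in the log, and since class weights were auto-activated the reported loss is only roughly comparable to the unweighted baseline.

```python
import math

# ASSUMPTION: the fake task has 10 classes. This is inferred from the
# reported numbers, not stated anywhere in the log output.
num_classes = 10

# Cross-entropy and accuracy of a classifier that predicts uniformly at random.
chance_loss = math.log(num_classes)
chance_accuracy = 1.0 / num_classes

# Values copied from the test output above.
test_loss = 2.3151323795318604
test_accuracy = 0.0963830053806305

print(f"chance loss     = {chance_loss:.4f}, reported = {test_loss:.4f}")
print(f"chance accuracy = {chance_accuracy:.4f}, reported = {test_accuracy:.4f}")
```

If the reported values sit close to these baselines, as they do here, the run confirms that the pipeline works end to end but says nothing about model quality.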
Inspect any logged results with TensorBoard:
tensorboard --logdir path/to/MML_RESULTS/DEMO2
Then open http://localhost:6006 in a browser.
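Note the warning in the output above: the run has only 3 training batches per epoch while the Lightning `Trainer` logs every 50 steps by default (`log_every_n_steps=50`), so step-level training metrics may be missing from the logged results. The sketch below shows one way to pick an interval that fires at least once per epoch. The helper `pick_log_interval` is hypothetical and not part of MML or Lightning. How MML forwards the value to the trainer is also an assumption: an override such as `trainer.log_every_n_steps=3` would follow the pattern of `trainer.max_epochs=2`, but this is not confirmed by the output shown here.

```python
def pick_log_interval(num_training_batches: int, default: int = 50) -> int:
    """Return a logging interval no larger than the number of batches per epoch.

    Hypothetical helper for illustration only. `default` mirrors Lightning's
    default of Trainer(log_every_n_steps=50).
    """
    if num_training_batches < 1:
        raise ValueError("need at least one training batch")
    return min(default, num_training_batches)


# 3 training batches per epoch, as in the run above.
print(pick_log_interval(3))
# A larger dataset keeps the default interval.
print(pick_log_interval(200))
```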