page_type | languages | products | description | experimental |
---|---|---|---|---|
sample | | | Learn how to train and log metrics with [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) and Azure ML. | issues with multinode pytorch lightning |
PyTorch Lightning is a lightweight open-source library that provides a high-level interface for PyTorch.
The model training code for this tutorial can be found in src.
This tutorial goes over the steps to run PyTorch Lightning on Azure ML, and it includes the following parts:
- train-single-node: Train PyTorch Lightning on a single Azure ML node, using either one GPU or multiple GPUs.
- log-with-tensorboard: Use Lightning's built-in TensorBoardLogger to log metrics and leverage Azure ML's TensorBoard integration.
- log-with-mlflow: Use Lightning's MLFlowLogger to log metrics and leverage Azure ML's MLflow integration.
- train-multi-node-ddp: Train multi-node, multi-GPU PyTorch Lightning with DistributedDataParallel (DDP) on Azure ML.
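
The parts above differ mainly in how the Lightning `Trainer` is configured. The following is a minimal sketch of that difference, not the code from src. It assumes a recent Lightning release in which `Trainer` accepts the `accelerator`, `devices`, `num_nodes`, and `strategy` arguments; older releases used different argument names. The helpers `trainer_kwargs` and `build_trainer` and the `log_dir` default are hypothetical names introduced for illustration.

```python
from typing import Any, Dict


def trainer_kwargs(num_nodes: int = 1, gpus_per_node: int = 1) -> Dict[str, Any]:
    """Build keyword arguments for a Lightning Trainer (hypothetical helper)."""
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must be >= 1")
    kwargs: Dict[str, Any] = {
        "accelerator": "gpu",
        "devices": gpus_per_node,  # GPUs used on each node
        "num_nodes": num_nodes,    # number of machines in the job
    }
    # DDP is only requested when more than one GPU takes part in training.
    if num_nodes * gpus_per_node > 1:
        kwargs["strategy"] = "ddp"
    return kwargs


def build_trainer(num_nodes: int = 1, gpus_per_node: int = 1, log_dir: str = "logs"):
    """Create a Trainer that logs with TensorBoardLogger (assumed setup)."""
    # Imported lazily so trainer_kwargs can be used without Lightning installed.
    import pytorch_lightning as pl
    from pytorch_lightning.loggers import TensorBoardLogger

    logger = TensorBoardLogger(save_dir=log_dir)
    return pl.Trainer(logger=logger, **trainer_kwargs(num_nodes, gpus_per_node))


if __name__ == "__main__":
    print(trainer_kwargs(1, 1))  # single node, single GPU
    print(trainer_kwargs(1, 4))  # single node, multi-GPU
    print(trainer_kwargs(2, 4))  # multi-node, multi-GPU with DDP
```

To log through MLflow instead, the `TensorBoardLogger` in `build_trainer` could be swapped for Lightning's `MLFlowLogger`; the tracking URI and experiment name to pass depend on the Azure ML workspace and are not shown here.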