Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch



Continuous Integration

CI testing

| System / PyTorch ver. | 1.6 (min. req.) | 1.8 (latest) |
| --- | --- | --- |
| Linux py3.{6,8} | CI full testing | CI full testing |
| OSX py3.{6,8} | CI full testing | CI full testing |
| Windows py3.7* | CI base testing | CI base testing |

  • * Windows CI tests just the package itself; the full test suite (the tests folder) is skipped.

Install


Simple installation from PyPI

pip install lightning-bolts

Install bleeding-edge (no guarantees)

pip install git+https://github.com/PytorchLightning/pytorch-lightning-bolts.git@master --upgrade

If you want the full experience, install all optional dependencies at once:

pip install lightning-bolts["extra"]

What is Bolts

Bolts is a Deep learning research and production toolbox of:

  • SOTA pretrained models.
  • Model components.
  • Callbacks.
  • Losses.
  • Datasets.

Main Goals of Bolts

The main goal of Bolts is to enable rapid model idea iteration.

Example 1: Finetuning on data

from torch.utils.data import DataLoader

from pl_bolts.models.self_supervised import SimCLR
from pl_bolts.models.self_supervised.simclr.transforms import SimCLRTrainDataTransform, SimCLREvalDataTransform
import pytorch_lightning as pl

# data (MyDataset is a placeholder for your own dataset class)
train_data = DataLoader(MyDataset(transforms=SimCLRTrainDataTransform(input_height=32)))
val_data = DataLoader(MyDataset(transforms=SimCLREvalDataTransform(input_height=32)))

# model
weight_path = 'https://pl-bolts-weights.s3.us-east-2.amazonaws.com/simclr/bolts_simclr_imagenet/simclr_imagenet.ckpt'
simclr = SimCLR.load_from_checkpoint(weight_path, strict=False)

simclr.freeze()

# finetune
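
The finetune step is left open above; one option is linear evaluation on the frozen features. A minimal sketch, assuming SimCLR's forward returns the 2048-dim ResNet-50 representation, a hypothetical 10-class task, and a loader yielding (image, label) pairs:

import torch
from torch import nn

# hypothetical linear head on top of the frozen encoder
classifier = nn.Linear(2048, 10)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)

for x, y in train_data:
    with torch.no_grad():
        feats = simclr(x)  # frozen SimCLR representations
    loss = nn.functional.cross_entropy(classifier(feats), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()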

Example 2: Subclass and ideate

from pl_bolts.models import ImageGPT
from pl_bolts.models.self_supervised import SimCLR

class VideoGPT(ImageGPT):

    def training_step(self, batch, batch_idx):
        x, y = batch
        x = _shape_input(x)  # your own helper to reshape video frames for the GPT

        logits = self.gpt(x)
        simclr_features = self.simclr(x)  # assumes a pretrained SimCLR encoder attached in __init__

        # -----------------
        # do something new with GPT logits + simclr_features
        # -----------------

        loss = self.criterion(logits.view(-1, logits.size(-1)), x.view(-1).long())

        logs = {"loss": loss}
        return {"loss": loss, "log": logs}

Who is Bolts for?

  • Corporate production teams
  • Professional researchers
  • Ph.D. students
  • Linear + Logistic regression heroes

I don't need deep learning

Great! We have LinearRegression and LogisticRegression implementations, with numpy and sklearn bridges for datasets. Unlike the sklearn versions, our implementations run on multiple GPUs and TPUs and scale to far larger datasets...

Check out our Linear Regression on TPU demo

from pl_bolts.models.regression import LinearRegression
from pl_bolts.datamodules import SklearnDataModule
from sklearn.datasets import load_diabetes
import pytorch_lightning as pl

# sklearn dataset
X, y = load_diabetes(return_X_y=True)
loaders = SklearnDataModule(X, y)

model = LinearRegression(input_dim=10)  # the diabetes dataset has 10 features

# try with gpus=4!
# trainer = pl.Trainer(gpus=4)
trainer = pl.Trainer()
trainer.fit(model, train_dataloader=loaders.train_dataloader(), val_dataloaders=loaders.val_dataloader())
trainer.test(test_dataloaders=loaders.test_dataloader())
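
LogisticRegression works the same way; a minimal sketch, assuming the iris dataset (4 features, 3 classes):

from pl_bolts.models.regression import LogisticRegression
from pl_bolts.datamodules import SklearnDataModule
from sklearn.datasets import load_iris
import pytorch_lightning as pl

X, y = load_iris(return_X_y=True)
loaders = SklearnDataModule(X, y)

model = LogisticRegression(input_dim=4, num_classes=3)
trainer = pl.Trainer()
trainer.fit(model, train_dataloader=loaders.train_dataloader(), val_dataloaders=loaders.val_dataloader())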

Is this another model zoo?

No!

Bolts is unique because models are implemented using PyTorch Lightning and structured so that they can be easily subclassed and iterated on.

For example, you can override the elbo loss of a VAE, or the generator_step of a GAN, to quickly try out a new idea. The best part is that all the models are benchmarked, so you won't waste time trying to "reproduce" them or hunting for bugs in your implementation.
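
As a hedged sketch of the GAN case, assuming the generator_step(self, x) hook of Bolts' basic GAN (the least-squares objective here is a swapped-in idea, not the library default):

import torch
import torch.nn.functional as F

from pl_bolts.models.gans import GAN

class LeastSquaresGAN(GAN):
    """Tries an LSGAN-style generator objective instead of the default one."""

    def generator_step(self, x):
        # sample noise and generate fake images, as the parent class does
        z = torch.randn(x.shape[0], self.hparams.latent_dim, device=x.device)
        d_out = self.discriminator(self(z))
        # new idea: push discriminator outputs toward 1 with an L2 loss
        g_loss = F.mse_loss(d_out, torch.ones_like(d_out))
        self.log("g_loss", g_loss)
        return g_loss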

Team

Bolts is supported by the PyTorch Lightning team and the PyTorch Lightning community!


Licence

Please observe the Apache 2.0 license listed in this repository. In addition, the Lightning framework is patent pending.

Citation

To cite Bolts, use:

@article{falcon2020framework,
  title={A Framework For Contrastive Self-Supervised Learning And Designing A New Approach},
  author={Falcon, William and Cho, Kyunghyun},
  journal={arXiv preprint arXiv:2009.00104},
  year={2020}
}

To cite other contributed models or modules, please cite the authors directly (if they don't have BibTeX, ping the authors in a GitHub issue).

Comments
  • Add RetinaNet Object detection with Backbones

    Add RetinaNet Object detection with Backbones

    What does this PR do?

    Fixes #391

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together?
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests? [not needed for typos/docs]
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [x] Is this pull request ready for review?

    Did you have fun?

    I think yes :stuck_out_tongue:

  • Add YOLO object detection model

    Add YOLO object detection model

    What does this PR do?

    This PR adds the YOLO object detection model. The implementation is based on the YOLOv3 and YOLOv4 Darknet implementations, although it doesn't include all the features of YOLOv4. Detection seems to work with weights that have been trained using the Darknet implementation, so the network architecture should be more or less identical. The network architecture is read from a configuration file in the same format as in the Darknet implementation. It supports loading weights from a Darknet model file too, if you don't want to start training from a randomly initialized model.
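
    For context, a usage sketch along the lines of the Bolts documentation for this model (file names are placeholders):

    from pl_bolts.models.detection import YOLO, YOLOConfiguration

    # parse a Darknet configuration file into a PyTorch network
    config = YOLOConfiguration("yolov4-tiny.cfg")
    model = YOLO(network=config.get_network())

    # optionally initialize from weights trained with the Darknet implementation
    with open("yolov4-tiny.weights", "rb") as weight_file:
        model.load_darknet_weights(weight_file)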

    Fixes #22

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together?
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests? [not needed for typos/docs]
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • Add SRGAN and datamodules for super resolution

    Add SRGAN and datamodules for super resolution

    What does this PR do?

    Adds a SRGAN implementation to bolts as proposed in #412.

    Closes #412

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests?
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?
    • [x] Add train logs and example images

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • Adding types to some of datamodules

    Adding types to some of datamodules

    What does this PR do?

    Adding types to pl_bolts.datamodules.

    related to #434

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
    • [ ] Did you make sure to update the documentation with your changes?
    • [ ] Did you write any new necessary tests?
    • [ ] Did you verify new and existing tests pass locally with your changes?
    • [ ] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [ ] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • Add DCGAN module

    Add DCGAN module

    What does this PR do?

    As proposed in #401, this PR adds a DCGAN implementation closely following the one in PyTorch's examples (https://github.com/pytorch/examples/blob/master/dcgan/main.py).

    Fixes #401

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests?
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • Add EMNISTDataModule

    Add EMNISTDataModule

    What does this PR do?

    Closes #672, #676 and #685.

    A summary of changes and modifications :star: :fire:

    • File Added:

      • [x] pl_bolts/datasets/emnist_dataset.py :green_circle:
      • [x] Contents:
        • [x] EMNIST_METADATA
        • [x] EMNIST dataset
        • [x] BinaryEMNIST dataset Need New PR or add to #672 :warning:
    • File Added:

      • [x] pl_bolts/datamodules/emnist_dataset.py :green_circle:
      • [x] Contents:
        • [x] EMNISTDataModule
        • [x] BinaryEMNISTDataModule Need New PR or add to #672 :warning:
    • Files Modified

      • Package: pl_bolts

        • [x] pl_bolts/datasets/__init__.py :green_circle:
        • [x] pl_bolts/datamodules/__init__.py :green_circle:
      • Tests:

        • For datamodules:
          • [x] tests/datamodules/test_imports.py :green_circle:
          • [x] tests/datamodules/test_datamodules.py WIP :orange_circle:

    Adding BinaryEMNIST and BinaryEMNISTDataModule was logical, looking at how MNIST and BinaryMNIST (dataset and datamodules) were implemented.

    About the dataset

    image source: https://arxiv.org/pdf/1702.05373.pdf [Table-I]

    image source: https://arxiv.org/pdf/1702.05373.pdf [Table-II]

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements) #672
    • [x] Did you read the contributor guideline, Pull Request section? Y :green_circle:
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together? Y :green_circle:
    • [x] Did you make sure to update the documentation with your changes? Y :green_circle:
    • [x] Did you write any new necessary tests? [not needed for typos/docs] Y :green_circle:
    • [x] Did you verify new and existing tests pass locally with your changes? Y :green_circle:
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG? Y :green_circle:

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode) READY :green_circle:

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

  • Implemented GIoU

    Implemented GIoU

    What does this PR do?

    Implements Generalized Intersection over Union as mentioned in #251

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests?
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • ci: Fix dataset downloading errors

    ci: Fix dataset downloading errors

    What does this PR do?

    As pointed out in https://github.com/PyTorchLightning/pytorch-lightning-bolts/pull/377#issuecomment-730193148 by @Borda, the tests try to download datasets, which sometimes fail with the following error:

    UNEXPECTED EXCEPTION: RuntimeError('Failed download from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz')

    Description of the changes

    1. ~~It seems that those failing tests are often doctests, so this PR simply removes the doctest from ci_test-full.yml as we still have doctest in ci_test-base.yml.~~ ~~https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/b8ac85154465956b06fd1005b21b071af5493f11/.github/workflows/ci_test-full.yml#L86~~ ~~https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/b8ac85154465956b06fd1005b21b071af5493f11/.github/workflows/ci_test-base.yml#L69~~
    2. ~~This PR also includes minor changes in some tests using LitMNIST to utilize dataset caching, since they currently download and store MNIST datasets in ./ instead of in ./datasets/ (datadir fixture).~~ See #414.

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
    • [x] Did you make sure to update the documentation with your changes?
    • [ ] Did you write any new necessary tests?
    • [ ] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [ ] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • Adds Backbones to FRCNN Take 2

    Adds Backbones to FRCNN Take 2

    What does this PR do?

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together?
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests? [not needed for typos/docs]
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

    Redo of #382. Closes #340.

  • Add logger for Azure Machine Learning

    Add logger for Azure Machine Learning

    Before submitting

    • [x] Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure to update the docs?
    • [x] Did you write any new necessary tests?

    What does this PR do?

    This pull request adds an Azure Machine Learning logger to PyTorch Lightning Bolts. Since AML is integrated with MLflow for experiment tracking, this logger subclasses the MLflow logger.

    It will enable users to track their metrics in the Azure ML UI by writing code like:

    from azureml.core import Run
    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import AzureMlLogger
    
    # This is optional
    run = Run.get_context()
    azureml_logger = AzureMlLogger(run)
    
    trainer = Trainer(logger=azureml_logger)
    

    This pull request closes #180 .

    PR review

    Everyone is welcome to comment and review!

    Did you have fun?

    Yes!

  • Accuracy metric does not work on Model predictions.

    Accuracy metric does not work on Model predictions.

    🐛 Bug

    Error when using pytorch_lightning.metrics.Accuracy on logits, even though that is exactly the use-case stated in the docs.

    To Reproduce

    Steps to reproduce the behavior:

    Using the accuracy metric in either the training or validation step with the expected logit inputs (shape (64 x 10), where 10 = # of classes) and target vector (shape (64)) throws ValueError: The `preds` should be probabilities, but values were detected outside of [0,1] range, similar to #551.

    Code sample

    import torch
    from torch import nn
    from torch.nn import functional as F
    import pytorch_lightning as pl

    class ToyModel(pl.LightningModule):
      def __init__(self, batch_size, max_epochs=10, learning_rate=0.01, num_classes=10, noise=0.6, alpha=0.1, beta=1):
        super(ToyModel, self).__init__()
    
        # Model
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv1_bn = nn.BatchNorm2d(64)
        
        self.conv2 = nn.Conv2d(64, 128, 3, padding=1)
        self.conv2_bn = nn.BatchNorm2d(128) # pooling changes in_channel for next layer?
    
        self.conv3 = nn.Conv2d(128, 196, 3, padding=1)
        self.conv3_bn = nn.BatchNorm2d(196)
    
        self.fc1 = nn.Linear(in_features=3136, out_features=256)
        self.fc1_bn = nn.BatchNorm1d(256)
    
        self.fc2 = nn.Linear(256, num_classes)
    
        # Parameters
        self.alpha = alpha
        self.beta = beta
        self.noise = noise
    
        self.max_epochs = max_epochs
        self.learning_rate= learning_rate
        self.batch_size = batch_size
    
        # Accuracy Metric
        self.train_acc = pl.metrics.Accuracy()
    
      def loss(self, softmax_pred, target):
        # note: despite the name, this receives raw logits (CrossEntropyLoss expects logits)
        return self.alpha * nn.CrossEntropyLoss()(softmax_pred, target) + self.beta * nn.CrossEntropyLoss()(softmax_pred, target)
    
    
      def forward(self, x):
        x = self.conv1(x)
        x = self.conv1_bn(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        
        x = self.conv2(x)
        x = self.conv2_bn(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2, stride=2)
    
        x = self.conv3(x)
        x = self.conv3_bn(x)
        x = F.relu(x)
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        
        x = torch.flatten(x, start_dim=1)
        x = self.fc1(x)
        x = self.fc1_bn(x)
        x = F.relu(x)
    
        return self.fc2(x) 
        
      def training_step(self, batch, batch_idx):
        inputs, targets = batch
        predictions = self(inputs)
        loss = self.loss(predictions, targets)
        self.log("train_acc_step", self.train_acc(predictions, targets))
        return loss
    
      def training_epoch_end(self, outs):
        # log epoch metric
        self.log('train_acc_epoch', self.train_acc.compute())
    

    Passing random training data of shape (64, 3, 32, 32), e.g. torch.rand(64, 3, 32, 32), should trigger the error

    Expected behavior

    self.train_acc(predictions, targets) should compute the accuracy
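
    A workaround consistent with the error message (my suggestion, not from the original report) is to convert the logits to probabilities before updating the metric:

    # pass probabilities rather than raw logits to the metric
    self.log("train_acc_step", self.train_acc(predictions.softmax(dim=-1), targets))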

    Environment

    I used Google Colab

    • PyTorch Version (e.g., 1.0): 1.8.0
    • OS (e.g., Linux): Linux
    • How you installed PyTorch (conda, pip, source): pip
    • Python version: 3.8
    • CUDA/cuDNN version: 11.0

    Additional context

    Workaround: I followed the advice in #551 and downgraded PyTorch Lightning to 1.1.8.

    Since this Lightning version triggers #6210 in Colab, I also had to reinstall a different version of torch:

    !pip install torchtext==0.8.0 torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html -q
    !pip install pytorch-lightning==1.1.8 -q
    
  • Call for core contributors 🧙

    Call for core contributors 🧙

    🚀 Feature

    First, we are very happy about all the contributions the community has made so far! Unfortunately, we are getting a bit short-handed :( Second, we would like to re-ignite this project/repository and get it back on track with the latest research and the PL API! Third, as part of the challenge, we are going to rethink the structure and integration process to be as up-to-date and smooth as possible (see also the complementary issue #741).

    Motivation

    We want to form a new contributors' team willing to take on this challenge of re-igniting the project in the best Lightning spirit!

    Pitch

    Become a key contributor, collaborate with the best, learn and practice what you love and help us make the Lightning community an even better place!

    Alternatives

    Ping @Borda on slack to chat more...

    Additional context

    Note that being part of Bolts' core is not the same as being a core contributor to the main PL, but it will set you on a promising track to become PL core later on...

  • SwAV example not working

    SwAV example not working

    Hello, I was trying to get the lightning_bolts SwAV example running. Unfortunately, the example code does not work.

    
    import pytorch_lightning as pl
    from pl_bolts.models.self_supervised import SwAV
    from pl_bolts.datamodules import STL10DataModule
    from pl_bolts.models.self_supervised.swav.transforms import (
        SwAVTrainDataTransform, SwAVEvalDataTransform
    )
    from pl_bolts.transforms.dataset_normalizations import stl10_normalization
    
    # data
    batch_size = 128
    dm = STL10DataModule(data_dir='.', batch_size=batch_size)
    dm.train_dataloader = dm.train_dataloader_mixed
    dm.val_dataloader = dm.val_dataloader_mixed
    
    dm.train_transforms = SwAVTrainDataTransform(
        normalize=stl10_normalization()
    )
    
    dm.val_transforms = SwAVEvalDataTransform(
        normalize=stl10_normalization()
    )
    
    # model
    model = SwAV(
        gpus=1,
        num_samples=dm.num_unlabeled_samples,
        dataset='stl10',
        batch_size=batch_size
    )
    
    # fit
    trainer = pl.Trainer(precision=16)
    trainer.fit(model)
    

    The fit call is missing the datamodule. It should be: trainer.fit(model, datamodule=dm)

    But even then I get the following error message: RuntimeError: The size of tensor a (128) must match the size of tensor b (0) at non-singleton dimension 0

    How can I fix that? Thanks!

  • Add features to the YOLO model from the latest YOLO variants

    Add features to the YOLO model from the latest YOLO variants

    What does this PR do?

    The YOLO model has been largely refactored, so that it's easy to incorporate new features from different YOLO variants, such as different algorithms for assigning targets to anchors. It also supports defining the network structure as a PyTorch module, in addition to loading a Darknet configuration file. The implementation now supports the most important features of YOLOv3, YOLOv4, YOLOv5, Scaled-YOLOv4, and YOLOX.

    • Supports several new algorithms for matching targets to anchors.
    • Added support for DIoU and CIoU losses.
    • By default, the confidence target of a detection that has been assigned a ground-truth box is 1.0. This pull request adds support for using the overlap between the target and the predicted box instead.
    • The code is refactored from one big class into model, anchor matching, and loss function classes.
    • Target class labels may be specified as a matrix of class probabilities, allowing multiple classes per object.
    • Automatic padding in convolutional and max pooling layers works now in every case.
    • Weight decay is applied only to convolutional layer weights.
    • The command line interface now uses LightningCLI.
    • Network architectures can be written as PyTorch modules. YOLOv4, YOLOv5, and YOLOX architectures are included.
    • Calculates the mAP metric using TorchMetrics.
    • Added complete type hints for static type checking.

    Fixes #816

    Before submitting

    • [ ] Was this discussed/approved via a Github issue? (no need for typos and docs improvements) - issue for discussion: #816
    • [x] Did you read the contributor guideline, Pull Request section?
    • [x] Did you make sure your PR does only one thing, instead of bundling different changes together?
    • [x] Did you make sure to update the documentation with your changes?
    • [x] Did you write any new necessary tests? [not needed for typos/docs]
    • [x] Did you verify new and existing tests pass locally with your changes?
    • [x] If you made a notable change (that affects users), did you update the CHANGELOG?

    PR review

    • [x] Is this pull request ready for review? (if not, please submit in draft mode)

    Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

    Did you have fun?

    Make sure you had fun coding 🙃

  • Upgrade the YOLO model with features from new variants

    Upgrade the YOLO model with features from new variants

    🚀 Feature

    Several new variants of the YOLO model have been proposed in the past couple of years. The current implementation in PyTorch Lightning Bolts is based on the original Darknet implementation. I have updated it with features from the new PyTorch based variants and created a pull request.

    Motivation

    The code has been refactored to allow easy variation of the network architecture, target matching algorithm, and loss function. Some important changes that have been proposed to these algorithms are implemented in this pull request in a modular way. This is the only implementation that combines features from the latest variants, including YOLOv5 and YOLOX, while still allowing Darknet-based models to be loaded.

    You can find my pull request here: https://github.com/PyTorchLightning/lightning-bolts/pull/817

  • BYOL training fails after 10 epochs because of the scheduler

    BYOL training fails after 10 epochs because of the scheduler

    🐛 Bug

    When running byol_module.py with default options, the training fails after 10 epochs because of the learning rate scheduler.

    To Reproduce

    Steps to reproduce the behavior:

    run the following command:

    python byol_module.py
    

    Traceback

    Traceback (most recent call last):                                              
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pl_bolts/models/self_supervised/byol/byol_module.py", line 233, in <module>
        cli_main()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pl_bolts/models/self_supervised/byol/byol_module.py", line 229, in cli_main
        trainer.fit(model, datamodule=dm)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
        self._call_and_handle_interrupt(
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
        return trainer_fn(*args, **kwargs)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
        self._run(model, ckpt_path=ckpt_path)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
        self._dispatch()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
        self.training_type_plugin.start_training(self)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
        self._results = trainer.run_stage()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
        return self._run_train()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
        self.fit_loop.run()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
        self.advance(*args, **kwargs)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
        self.epoch_loop.run(data_fetcher)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
        self.advance(*args, **kwargs)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 201, in advance
        self.update_lr_schedulers("epoch", update_plateau_schedulers=False)
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 441, in update_lr_schedulers
        self._update_learning_rates(
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 505, in _update_learning_rates
        lr_scheduler["scheduler"].step()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 154, in step
        values = self.get_lr()
      File "/home/alain/miniconda3/envs/ts/lib/python3.10/site-packages/pl_bolts/optimizers/lr_scheduler.py", line 86, in get_lr
        if (self.last_epoch - 1 - self.max_epochs) % (2 * (self.max_epochs - self.warmup_epochs)) == 0:
    TypeError: unsupported operand type(s) for -: 'int' and 'NoneType'
    

    Expected behavior

    The training should continue normally
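
    A plausible fix (my assumption, not from the report): construct the scheduler with a concrete max_epochs so the arithmetic in get_lr() never sees None, e.g.:

    import torch
    from pl_bolts.optimizers.lr_scheduler import LinearWarmupCosineAnnealingLR

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # `model` is a placeholder
    scheduler = LinearWarmupCosineAnnealingLR(optimizer, warmup_epochs=10, max_epochs=100)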

    Environment

    • PyTorch Version (e.g., 1.0): 1.11.0
    • OS (e.g., Linux): Ubuntu 21.10
    • How you installed PyTorch (conda, pip, source): conda
    • Build command you used (if compiling from source):
    • Python version: 3.10
    • CUDA/cuDNN version: 11.4
    • GPU models and configuration: 1 single NVIDIA GeForce 1080 Ti
    • Any other relevant information:
  • TrainingDataMonitor does not support logging with DummyLogger

    TrainingDataMonitor does not support logging with DummyLogger

    ❓ Questions and Help

    What is your question?

    I don't know why I get the error TrainingDataMonitor does not support logging with DummyLogger when trying to use TrainingDataMonitor.

    Code

    import logging
    import sys

    # get logger before importing lightning
    logging.basicConfig(
        format='%(asctime)s [%(levelname)s] %(message)s',
        level=logging.INFO,
        datefmt='%H:%M:%S',
        stream=sys.stdout,
    )
    log = logging.getLogger('SageMaker')
    log.setLevel(logging.INFO)

    import pytorch_lightning as pl
    from pytorch_lightning.loggers import TensorBoardLogger
    from pytorch_lightning.strategies import DDPStrategy
    from pl_bolts.callbacks import TrainingDataMonitor
    
    tb_logger = TensorBoardLogger(
                    save_dir='/logs/tensorboard',
                    name='',
                    version='',
                    log_graph=True,
    )
    tb_logger.logger = args.logger
    
    data_monitor_cb = TrainingDataMonitor(log_every_n_steps=25)
    callbacks = [data_monitor_cb]
    
    trainer = pl.Trainer.from_argparse_args(args,
                  logger=[tb_logger],                                            
                  callbacks=callbacks,
                  strategy=DDPStrategy(find_unused_parameters=False),
    )
    

    What have you tried?

    I tried passing a single logger instead of a list to the trainer.

    What's your environment?

    • OS: Ubuntu (SageMaker)
    • Packaging: pip
    • Version: latest PyPI