A PyTorch Implementation of Single Shot MultiBox Detector

SSD: Single Shot MultiBox Object Detector, in PyTorch

A PyTorch implementation of Single Shot MultiBox Detector from the 2016 paper by Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. The official and original Caffe code can be found here.


Installation

  • Install PyTorch by selecting your environment on the website and running the appropriate command.
  • Clone this repository.
    • Note: We currently only support Python 3+.
  • Then download the dataset by following the instructions below.
  • We now support Visdom for real-time loss visualization during training!
    • To use Visdom in the browser:
    # First install Python server and client
    pip install visdom
    # Start the server (probably in a screen or tmux)
    python -m visdom.server
    • Then (during training) navigate to http://localhost:8097/ (see the Train section below for training details).
  • Note: For training, we currently support VOC and COCO, and aim to add ImageNet support soon.

Datasets

To make things easy, we provide bash scripts to handle the dataset downloads and setup for you. We also provide simple dataset loaders that inherit torch.utils.data.Dataset, making them fully compatible with the torchvision.datasets API.
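As a rough sketch of what such a loader looks like, the example below shows the map-style protocol (`__len__` and `__getitem__`) together with a collate function that keeps per-image targets in a list. The class name `ToyDetectionDataset`, the sample layout, and the string placeholders for images are assumptions for illustration; a real loader would subclass torch.utils.data.Dataset and return image tensors.

```python
class ToyDetectionDataset:
    """Map-style dataset sketch; real code would subclass torch.utils.data.Dataset."""

    def __init__(self, samples):
        # samples: list of (image, boxes) pairs; boxes has one row per object.
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]


def detection_collate(batch):
    """Group images and targets separately.

    Targets stay in a list because each image can hold a different number of
    objects, so they cannot be stacked into one rectangular array.
    """
    images = [image for image, _ in batch]
    targets = [boxes for _, boxes in batch]
    return images, targets


# Hypothetical samples: each row is [xmin, ymin, xmax, ymax, label].
dataset = ToyDetectionDataset([
    ("img0", [[0.1, 0.1, 0.5, 0.5, 0]]),
    ("img1", [[0.2, 0.2, 0.6, 0.6, 1], [0.3, 0.1, 0.9, 0.4, 2]]),
])
images, targets = detection_collate([dataset[i] for i in range(len(dataset))])
print(images, [len(t) for t in targets])
```

Keeping targets as a list rather than one stacked array is the usual choice for detection data, since the number of boxes varies per image.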

COCO

Microsoft COCO: Common Objects in Context

Download COCO 2014
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/COCO2014.sh

VOC Dataset

PASCAL VOC: Visual Object Classes

Download VOC2007 trainval & test
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
Download VOC2012 trainval
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

Training SSD

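  • First download the pre-trained base network weights (vgg16_reducedfc.pth) into a weights directory:
mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth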
  • To train SSD, run the training script, either passing the parameters listed in train.py as command-line flags or editing their defaults in the file.
python train.py
  • Note:
    • For training, an NVIDIA GPU is strongly recommended for speed.
    • For instructions on Visdom usage/installation, see the Installation section.
    • You can pick up training from a checkpoint by specifying its path as one of the training parameters (again, see train.py for options).
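To illustrate what "specifying parameters as flags" amounts to, here is a minimal argparse sketch. The flag names and default values are copied from the Namespace(...) line of a training log quoted in the comments further down this page. The argument types, the helper name `build_parser`, and the checkpoint path are assumptions, and train.py itself remains the authoritative list of options.

```python
import argparse


def build_parser():
    """Hypothetical parser; flag names copied from a logged Namespace, types assumed."""
    parser = argparse.ArgumentParser(description="SSD training flags (sketch)")
    parser.add_argument("--dataset", default="VOC", help="dataset name")
    parser.add_argument("--basenet", default="vgg16_reducedfc.pth", help="base network weights")
    parser.add_argument("--batch_size", default=32, type=int)
    parser.add_argument("--resume", default=None, type=str, help="checkpoint path to resume from")
    parser.add_argument("--start_iter", default=0, type=int)
    parser.add_argument("--num_workers", default=4, type=int)
    parser.add_argument("--lr", default=1e-3, type=float)
    parser.add_argument("--momentum", default=0.9, type=float)
    parser.add_argument("--weight_decay", default=5e-4, type=float)
    parser.add_argument("--gamma", default=0.1, type=float)
    parser.add_argument("--save_folder", default="weights/")
    return parser


# Roughly what a command line such as
#   python train.py --batch_size 16 --resume weights/checkpoint.pth
# would be parsed into (the checkpoint path is a made-up example).
args = build_parser().parse_args(["--batch_size", "16", "--resume", "weights/checkpoint.pth"])
print(args.batch_size, args.lr, args.resume)
```

Flags that are not passed keep their defaults, which is why editing the defaults in the file and passing flags are interchangeable.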

Evaluation

To evaluate a trained network:

python eval.py

You can set the parameters listed in eval.py either by passing them as command-line flags or by editing their defaults in the file.

Performance

VOC2007 Test

mAP

Original    Converted weiliu89 weights    From scratch w/o data aug    From scratch w/ data aug
77.2 %      77.26 %                       58.12 %                      77.43 %

FPS

GTX 1060: ~45.45 FPS

Demos

Use a pre-trained SSD network for detection

Download a pre-trained network

Try the demo notebook

  • Make sure you have Jupyter Notebook installed.
  • Two alternatives for installing Jupyter Notebook:
    1. If you installed PyTorch with conda (recommended), you should already have it. Navigate to the cloned ssd.pytorch repo and run: jupyter notebook

    2. If using pip:

# make sure pip is upgraded
pip3 install --upgrade pip
# install jupyter notebook
pip install jupyter
# Run this inside ssd.pytorch
jupyter notebook

Try the webcam demo

  • Works on CPU (you may have to tweak cv2.waitKey for optimal FPS) or on an NVIDIA GPU
  • This demo currently requires OpenCV 2+ with Python bindings and an onboard webcam
    • You can change the default webcam in demo/live.py
  • Install the imutils package to leverage multi-threading on CPU:
    • pip install imutils
  • Running python -m demo.live opens the webcam and begins detecting!

TODO

We have accumulated the following to-do list, which we hope to complete in the near future

  • Still to come:
    • Support for the MS COCO dataset
    • Support for SSD512 training and testing
    • Support for training on custom datasets

Authors

Note: Unfortunately, this is just a hobby of ours and not a full-time job, so we'll do our best to keep things up to date, but no guarantees. That being said, thanks to everyone for your continued help and feedback as it is really appreciated. We will try to address everything as soon as possible.

Owner
Max deGroot
Amazon Alexa | ML Research at Vanderbilt University
Comments
  • ValueError: not enough values to unpack (expected 2, got 0)

    I am getting an error when I try to use the pretrained model: python demo/live.py --weights ./weights/ssd300_mAP_77.43_v2.pth

    /home/cya/git_clones/ssd.pytorch/ssd.py:34: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
      self.priors = Variable(self.priorbox.forward(), volatile=True)
    /home/cya/git_clones/ssd.pytorch/layers/modules/l2norm.py:17: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
      init.constant(self.weight,self.gamma)
    [INFO] starting threaded video stream...
    Traceback (most recent call last):
      File "demo/live.py", line 82, in <module>
        cv2_demo(net.eval(), transform)
      File "demo/live.py", line 55, in cv2_demo
        frame = predict(frame)
      File "demo/live.py", line 25, in predict
        y = net(x)  # forward pass
      File "/home/cya/anaconda3/envs/tensorflow/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/cya/git_clones/ssd.pytorch/ssd.py", line 103, in forward
        self.priors.type(type(x.data))                  # default boxes
      File "/home/cya/git_clones/ssd.pytorch/layers/functions/detection.py", line 54, in forward
        ids, count = nms(boxes, scores, self.nms_thresh, self.top_k)
    ValueError: not enough values to unpack (expected 2, got 0)
    FATAL: exception not rethrown
    [1]    11209 abort (core dumped)  python demo/live.py --weights ./weights/ssd300_mAP_77.43_v2.pth
    
  • runtime error

    Hi,

    Have you successfully run train.py? I encountered a runtime error saying "div_ only supports scalar multiplication" from the line "x/=norm.expand_as(x)" in modules/l2norm.py. I then changed this line to "x = x.div(norm.expand_as(x))" but got another CUDA runtime error, "device-side assert triggered", from the line "return torch.cat([g_cxcy, g_wh], 1)" in box_utils.py.

    BTW, I am using Python 2.7 instead of Python 3.

  • StopIteration ERROR during training

    My environment is:

    8GB RAM, Ubuntu 16.04 LTS, PyTorch 0.4 with CUDA 9.0, cuDNN v7, Python 3.5, GeForce GTX 1080 8GB.

    I have a GeForce GTX 1080 8GB, so I tried to train the network with a batch size of 16. I ran the training with python3 train.py --batch_size=16, and after 1030 iterations I got:

    .....
    iter 1020 || Loss: 9.2115 || timer: 0.1873 sec.
    iter 1030 || Loss: 8.1139 || Traceback (most recent call last):
      File "train.py", line 255, in <module>
        train()
      File "train.py", line 165, in train
        images, targets = next(batch_iterator)
      File "/home/han/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 326, in __next__
        raise StopIteration
    StopIteration

    So I tried training with other batch sizes, like 8 and 20, and every time it raises the same exception. I then calculated batch_size * iteration step,

    and every time the result is around 16,480, regardless of batch size and iteration count.

    The part of the PyTorch dataloader where the problem occurs is

    if self.batches_outstanding == 0:
        self._shutdown_workers()
        raise StopIteration

    in /home/han/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py.

    Could my 8GB of RAM cause this problem? (I checked with Ubuntu System Monitor that there was enough RAM.)

    Can anybody solve this problem? Please help me, guys :)
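The product batch_size * iterations staying near 16,480 is consistent with the iterator simply being exhausted after one pass over the data, rather than with a memory problem. One common pattern, sketched below, is to catch StopIteration and rebuild the iterator. The helper name `next_batch` is hypothetical, and a plain list stands in for the DataLoader.

```python
def next_batch(batch_iterator, data_loader):
    """Return (batch, iterator), restarting the iterator after a full pass.

    `data_loader` is any re-iterable object; a torch DataLoader is assumed to
    behave the same way, since calling iter() on it starts a new epoch.
    """
    try:
        batch = next(batch_iterator)
    except StopIteration:
        # One full pass over the data is done: start the next epoch.
        batch_iterator = iter(data_loader)
        batch = next(batch_iterator)
    return batch, batch_iterator


# Stand-in for a DataLoader that yields three batches per epoch.
data_loader = ["batch0", "batch1", "batch2"]
batch_iterator = iter(data_loader)

seen = []
for iteration in range(7):  # more iterations than batches in one epoch
    batch, batch_iterator = next_batch(batch_iterator, data_loader)
    seen.append(batch)
print(seen)
```

In a training loop this would replace a bare next(batch_iterator) call, so the iteration count can exceed the number of batches in one epoch.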

  • NaN values at Multibox encoding

    I've tried to train a detector on my own dataset; however, at training time the localization loss is NaN due to negative values in g_wh (layers/box_utils.py#L137). I don't know if this error is related to the format of the bounding boxes or to the output of the SSD model.

    I would like to know whether I am doing something wrong while loading the dataset or whether the error is a bug in the base implementation.
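One way to narrow this down is to check the ground-truth boxes before they reach the encoder. The sketch below assumes the usual SSD offset encoding, in which the width and height terms are the log of a box-to-prior size ratio, so a box with xmax <= xmin or ymax <= ymin yields the log of a non-positive number. The function names and the variances (0.1, 0.2) are assumptions, not taken from the repository.

```python
import math


def encode_box(box, prior, variances=(0.1, 0.2)):
    """Encode one corner-form box against one center-form prior.

    box:   (xmin, ymin, xmax, ymax), normalized to [0, 1]
    prior: (cx, cy, w, h), normalized to [0, 1]
    The variances (0.1, 0.2) are values commonly used with SSD and are an
    assumption here.
    """
    xmin, ymin, xmax, ymax = box
    pcx, pcy, pw, ph = prior
    w, h = xmax - xmin, ymax - ymin
    if w <= 0 or h <= 0:
        # log() of a non-positive ratio is what produces NaN in tensor code.
        raise ValueError(f"degenerate box {box}: width={w}, height={h}")
    g_cx = ((xmin + xmax) / 2 - pcx) / (variances[0] * pw)
    g_cy = ((ymin + ymax) / 2 - pcy) / (variances[0] * ph)
    g_w = math.log(w / pw) / variances[1]
    g_h = math.log(h / ph) / variances[1]
    return g_cx, g_cy, g_w, g_h


def find_bad_boxes(boxes):
    """Return indices of boxes whose corners are swapped or collapsed."""
    return [i for i, (x0, y0, x1, y1) in enumerate(boxes) if x1 <= x0 or y1 <= y0]


boxes = [(0.1, 0.1, 0.5, 0.6), (0.7, 0.2, 0.3, 0.9)]  # second box has xmax < xmin
print(find_bad_boxes(boxes))
print(encode_box(boxes[0], prior=(0.3, 0.35, 0.4, 0.5)))
```

Running a check like `find_bad_boxes` over the whole annotation set would show whether the NaN comes from the data format (for example, width and height stored where xmax and ymax are expected) rather than from the model.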

  • RunTime Error in Training with default values

    python train.py
    Loading base network...
    Initializing weights...
    Loading Dataset...
    Training SSD on VOC0712
    Traceback (most recent call last):
      File "train.py", line 232, in <module>
        train()
      File "train.py", line 181, in train
        out = net(images)
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
        result = self.forward(*input, **kwargs)
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 60, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 70, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
        raise output
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 42, in _worker
        output = module(*input, **kwargs)
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
        result = self.forward(*input, **kwargs)
      File "/data/gpu/utkrsh/code/ssd.pytorch/ssd.py", line 76, in forward
        s = self.L2Norm(x)
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
        result = self.forward(*input, **kwargs)
      File "/data/gpu/utkrsh/code/ssd.pytorch/layers/modules/l2norm.py", line 21, in forward
        x/=norm.expand_as(x)
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/variable.py", line 725, in expand_as
        return Expand.apply(self, (tensor.size(),))
      File "/users/gpu/utkrsh/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 111, in forward
        result = i.expand(*new_size)
    RuntimeError: The expanded size of the tensor (512) must match the existing size (8) at non-singleton dimension 1. at /opt/conda/conda-bld/pytorch_1502009910772/work/torch/lib/THC/generic/THCTensor.c:323

    I am getting the above stack trace after running train.py with default values. The dataset and weights were downloaded to the default location. I am using Python 3.6 and PyTorch 0.2.0. I understand the meaning of the error; I am just not able to find its source. Can anyone point me in the right direction?

  • Error when I training my dataset

    I have this error:

      File "train.py", line 186, in train
        loss_l, loss_c = criterion(out, targets)
      File "/home/thangtran/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/thangtran/ThangTran/DungPham_Checked/ssd_voc/ssd.pytorch-master/layers/modules/multibox_loss.py", line 97, in forward
        loss_c[pos] = 0 # filter out pos boxes for now
    RuntimeError: copy_if failed to synchronize: device-side assert triggered
    (base) [email protected]:~/ThangTran/DungPham_Checked/ssd_voc/ssd.pytorch-mas

    Can you help me fix it?

  • use .pth trained by myself to run eval.py is too slow

    When I use ssd300_mAP_77.43_v2.pth for evaluation, 100 pictures take only about 3.5 s:

    im_detect: 100/4952 3.978s
    im_detect: 200/4952 7.211s

    But after I ran train.py and saved the weights, evaluation with the weights I trained myself is very slow; 100 pictures take about 35 s:

    im_detect: 100/4952 34.390s
    im_detect: 200/4952 67.907s
    im_detect: 300/4952 101.621s
    im_detect: 400/4952 135.185s
    im_detect: 500/4952 168.701s
    im_detect: 600/4952 201.940s

  • How can I find the Average Precision and Average Recall from the eval.py file?

    I want to find the Average Recall and Precision values. I have tried printing these values (Code link):

    rec = tp / float(npos)
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)

    But it prints a long list of values that I am not able to interpret.
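The long lists arise because rec and prec, as computed in the quoted snippet, hold one cumulative value per ranked detection. A minimal sketch of collapsing them into single numbers follows, using the 11-point interpolated AP of the VOC2007 protocol. The function name `summarize` and the toy curve are assumptions for illustration, and plain Python lists stand in for the NumPy arrays.

```python
def summarize(rec, prec):
    """Collapse cumulative recall/precision curves into single numbers.

    rec, prec: sequences of equal length, one entry per detection, ordered by
    decreasing confidence (the shape that the quoted snippet produces).
    """
    if not rec:
        return {"recall": 0.0, "precision": 0.0, "ap": 0.0}
    # 11-point interpolated AP, as used by the VOC2007 protocol.
    ap = 0.0
    for i in range(11):
        t = i / 10
        candidates = [p for r, p in zip(rec, prec) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11
    # The last entries correspond to keeping every detection.
    return {"recall": rec[-1], "precision": prec[-1], "ap": ap}


# Toy curve for 4 ranked detections (TP, TP, FP, TP) and 4 ground-truth boxes.
rec = [0.25, 0.5, 0.5, 0.75]
prec = [1.0, 1.0, 2 / 3, 0.75]
print(summarize(rec, prec))
```

The final elements of each list give the recall and precision when every detection is kept, while AP summarizes the whole curve for one class; averaging AP over classes gives mAP.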

  • Error with train custom dataset

    I want to train on my own dataset. I have solved some problems, but the error below remains.

    The ground truth file in my dataset consists of (x, y, w, h, class_label) in a *.txt file.

    loss_c = loss_c.view(num, -1)
    loss_c[pos] = 0 # filter out pos boxes for now

    Is there a problem here?

    This is the source in multibox_loss.py.

    If I pass the option --cuda=False to check, the following error occurs:

    File "/home/jhryu/Downloads/ssd.pytorch/layers/modules/multibox_loss.py", line 103, in forward
      loss_c = log_sum_exp(batch_conf) - batch_conf.gather(1, conf_t.view(-1, 1))

    RuntimeError: Invalid index in gather at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/TH/generic/THTensorMath.c:600

    Thank you for your reply.
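An "Invalid index in gather" error often means that a class label falls outside the range the network was built for, so validating labels while converting annotations can help. The sketch below converts one (x, y, w, h, class_label) row into normalized corner form. It assumes, without confirmation from the post, that x, y is the top-left corner in pixels and that labels are zero-based over the foreground classes. The function names are hypothetical.

```python
def clamp01(value):
    """Clamp a normalized coordinate to the range [0, 1]."""
    return min(max(value, 0.0), 1.0)


def to_ssd_target(x, y, w, h, label, img_w, img_h, num_classes):
    """Convert one (x, y, w, h, label) row to normalized corner form.

    Assumptions (not confirmed by the post): x, y is the top-left corner in
    pixels, and `label` is a zero-based index over the foreground classes, so
    valid values are 0 .. num_classes - 2 (one slot is kept for background).
    """
    if w <= 0 or h <= 0:
        raise ValueError(f"box has non-positive size: w={w}, h={h}")
    if not 0 <= label < num_classes - 1:
        raise ValueError(f"label {label} outside [0, {num_classes - 2}]")
    xmin, ymin = x / img_w, y / img_h
    xmax, ymax = (x + w) / img_w, (y + h) / img_h
    return [clamp01(xmin), clamp01(ymin), clamp01(xmax), clamp01(ymax), label]


# A 100x50 box at (30, 40) in a 300x200 image, class 2 of 20 (+1 background).
row = to_ssd_target(30, 40, 100, 50, 2, img_w=300, img_h=200, num_classes=21)
print(row)
```

If x, y in the annotation files is actually the box center, the corner computation would need to subtract half the width and height instead.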

  • AP:0 and result : 0

    I have trained on my own dataset for 120000 iterations, and I lowered lr to 1e-6 and batch_size to 8. But the loss is still huge, around 120. When I run eval.py, the result is AP: 0, result: 0, so I cannot detect any class. Why? Can anybody help me? Please tell me how to solve it, thanks.

  • COCO performance

    Has anyone trained and tested this SSD on the COCO dataset? I got mAP = 0.213, which is about 0.02 less than the original paper (0.232). So I wonder what performance you are all getting. Thanks!

  • eval my voc dataset ' __init__() missing 5 required positional arguments'

    D:\LeStoreDownload\anaconda3\envs\mytest\python.exe "D:/LeStoreDownload/PyCharm Community Edition 2022.2.2/plugins/python-ce/helpers/pydev/pydevd.py" --multiprocess --qt-support=auto --client 127.0.0.1 --port 52771 --file D:\Carolyn\INS_AI\ssd.pytorch\eval.py
    Connected to pydev debugger (build 222.4167.33)
    Traceback (most recent call last):
      File "D:/LeStoreDownload/PyCharm Community Edition 2022.2.2/plugins/python-ce/helpers/pydev/pydevd.py", line 1496, in exec
        pydev_imports.execfile(file, globals, locals) # execute the script
      File "D:\LeStoreDownload\PyCharm Community Edition 2022.2.2\plugins\python-ce\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "D:\Carolyn\INS_AI\ssd.pytorch\eval.py", line 430, in <module>
        net = build_ssd('test', 300, num_classes) # initialize SSD
      File "D:\Carolyn\INS_AI\ssd.pytorch\ssd.py", line 218, in build_ssd
        return SSD(phase, size, base, extras_, head_, num_classes)
      File "D:\Carolyn\INS_AI\ssd.pytorch\ssd.py", line 51, in __init__
        self.detect = Detect()
    TypeError: __init__() missing 5 required positional arguments: 'num_classes', 'bkg_label', 'top_k', 'conf_thresh', and 'nms_thresh'

  • Result produced too many boxes

    Hello, does anyone know why SSD produces too many boxes on my video during inference? I trained for 20000 iterations with a batch size of 32 and a learning rate of 1e-7. Second issue: I used a GPU, but I think it is not working, because GPU utilization is zero or sometimes 1, even though in Python the GPU is reported as found, along with its device name. I don't know why training is so slow.
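At inference time, the number of boxes is normally controlled by a confidence threshold followed by non-maximum suppression (NMS). The sketch below shows that filtering step in plain Python; both thresholds are illustrative values rather than settings taken from the repository, and the function names are hypothetical. Note also that a learning rate of 1e-7 is far below the lr=0.001 that appears in a logged argument list elsewhere on this page, so under-training may also play a role.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, conf_thresh=0.5, nms_thresh=0.45):
    """Keep confident boxes, then greedily suppress overlapping ones.

    Returns indices into `boxes`, highest score first. Both thresholds are
    illustrative values, not settings taken from the repository.
    """
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_detections(boxes, scores)
print(kept)
```

Raising the confidence threshold removes low-scoring boxes, while lowering the NMS threshold merges more near-duplicates of the same object.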


  • what does target actually look like in criterion(out, target)?

    Hi, I've been working on a semi-supervised implementation and wanted to write pseudo-label code for this codebase. I thought the naive approach of calling criterion(out, model(out)) would suffice, but it seems that the targets are in some ground-truth format rather than in the format of the model output. The same issue occurs if the model has been built using the 'test' parameter in build_ssd; this time there is a shape mismatch. Can anyone shed some light on what the targets look like (from the dataloader)?
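As a hedged sketch of one way to bridge the two formats, the function below turns thresholded detections for one image into ground-truth-style rows. It assumes, without confirmation from this page, that each target row is [xmin, ymin, xmax, ymax, label] in normalized coordinates with zero-based foreground labels, and that test-phase detections are grouped per class as (score, xmin, ymin, xmax, ymax) with class 0 as background. The function name and confidence threshold are hypothetical, and nested lists stand in for tensors.

```python
def detections_to_targets(detections, conf_thresh=0.6):
    """Turn per-class detections for one image into pseudo-label rows.

    detections: detections[c] is a list of (score, xmin, ymin, xmax, ymax)
        for class index c, with index 0 reserved for background (assumed).
    Returns rows of [xmin, ymin, xmax, ymax, label], where label = c - 1 so
    that foreground classes start at 0 (also an assumption).
    """
    targets = []
    for c, dets in enumerate(detections):
        if c == 0:  # skip background
            continue
        for score, xmin, ymin, xmax, ymax in dets:
            # Drop low-confidence and degenerate boxes.
            if score >= conf_thresh and xmax > xmin and ymax > ymin:
                targets.append([xmin, ymin, xmax, ymax, c - 1])
    return targets


detections = [
    [],                            # background
    [(0.92, 0.1, 0.1, 0.4, 0.5)],  # class 1: confident
    [(0.30, 0.5, 0.5, 0.9, 0.9)],  # class 2: below threshold
]
pseudo = detections_to_targets(detections)
print(pseudo)
```

Checking one batch from the real dataloader (for example, printing the type and shape of each element of targets) would confirm or correct these layout assumptions before relying on them.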

  • RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

    D:\ProgramData\Anaconda3\python.exe D:/code/python/ssd.pytorch/train.py
    Loading base network...
    Initializing weights...
    Loading the dataset...
    Training SSD on: VOC0712
    Using the specified args:
    Namespace(basenet='vgg16_reducedfc.pth', batch_size=32, cuda=True, dataset='VOC', dataset_root='D:\code\python\ssd.pytorch\data/VOCdevkit/', gamma=0.1, lr=0.001, momentum=0.9, num_workers=4, resume=None, save_folder='weights/', start_iter=0, visdom=False, weight_decay=0.0005)
    Traceback (most recent call last):
      File "D:/code/python/ssd.pytorch/train.py", line 261, in <module>
        train()
      File "D:/code/python/ssd.pytorch/train.py", line 150, in train
        batch_iterator = iter(data_loader)
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
        return self._get_iterator()
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
        return _MultiProcessingDataLoaderIter(self)
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 944, in __init__
        self._reset(loader, first_iter=True)
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 975, in _reset
        self._try_put_index()
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1209, in _try_put_index
        index = self._next_index()
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 512, in _next_index
        return next(self._sampler_iter) # may raise StopIteration
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\sampler.py", line 229, in __iter__
        for idx in self.sampler:
      File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\sampler.py", line 126, in __iter__
        yield from torch.randperm(n, generator=generator).tolist()
    RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
