A PyTorch toolkit for 2D Human Pose Estimation.

PyTorch-Pose

PyTorch-Pose is a PyTorch implementation of a general pipeline for 2D single-person human pose estimation. It aims to provide interfaces for training, inference, and evaluation, together with data loaders offering various data augmentation options for the most popular human pose datasets (e.g., MPII Human Pose, LSP, and FLIC).

Some code for data preparation and augmentation is adapted from the Stacked Hourglass Network. Thanks to the original author.

Update: this repository is compatible with PyTorch 0.4.1/1.0 now!

Features

  • Multi-thread data loading
  • Multi-GPU training
  • Logger
  • Training/testing results visualization

Installation

  1. PyTorch (>= 0.4.1): Please follow the official PyTorch installation instructions. Note that the code was developed with Python 2 and has not yet been tested with Python 3.

  2. Clone the repository with submodule

    git clone --recursive https://github.com/bearpaw/pytorch-pose.git
    
  3. Create a symbolic link to the images directory of the MPII dataset:

    ln -s PATH_TO_MPII_IMAGES_DIR data/mpii/images
    

    For training/testing on COCO, please refer to the COCO README.

  4. Download annotation file:

Usage

Please refer to TRAINING.md for detailed training recipes!

Testing

You may download our pretrained models (e.g., 2-stack hourglass model) for a quick start.

Run the following command in a terminal to evaluate the model on the MPII validation split (the train/val split is from Tompson et al., CVPR 2015).

CUDA_VISIBLE_DEVICES=0 python example/main.py --dataset mpii -a hg --stacks 2 --blocks 1 --checkpoint checkpoint/mpii/hg_s2_b1 --resume checkpoint/mpii/hg_s2_b1/model_best.pth.tar -e -d
  • -a specifies the network architecture
  • --resume loads the weights from the specified checkpoint
  • -e stands for evaluation only
  • -d visualizes the network output. It can also be used during training

The result will be saved as a .mat file (preds_valid.mat), which is a 2958x16x2 matrix, in the folder specified by --checkpoint.
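As a quick sanity check on the saved predictions, the sketch below loads the file and verifies its layout. It assumes SciPy is installed and that the array is stored under the key "preds"; the key name and both helper names are assumptions, so inspect the keys of your own file if they differ.

```python
def describe_preds(shape):
    """Validate an (N, 16, 2) prediction shape and return a summary dict."""
    if len(shape) != 3 or shape[1] != 16 or shape[2] != 2:
        raise ValueError("expected (N, 16, 2), got %r" % (tuple(shape),))
    return {"num_images": shape[0], "num_joints": shape[1], "coords": shape[2]}


def load_preds(path, key="preds"):
    """Load predictions from a .mat file (the key name is an assumption)."""
    # Imported lazily so that describe_preds has no third-party dependency.
    from scipy.io import loadmat

    mat = loadmat(path)
    if key not in mat:
        raise KeyError("key %r not found; available keys: %s" % (key, sorted(mat)))
    preds = mat[key]
    describe_preds(preds.shape)  # raises if the layout is unexpected
    return preds
```

For the MPII validation split described above, describe_preds((2958, 16, 2)) reports 2958 images, 16 joints, and 2 coordinates per joint.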

Evaluate the PCKh@0.5 score

Evaluate with MATLAB

You may use the MATLAB script evaluation/eval_PCKh.m to evaluate your predictions. The evaluation code is ported from Tompson et al., CVPR 2015.

The results (PCKh@0.5 score) of models trained using this code are reported in the following table.

Model             Head   Shoulder  Elbow  Wrist  Hip    Knee   Ankle  Mean
hg_s2_b1 (last)   95.80  94.57     88.12  83.31  86.24  80.88  77.44  86.76
hg_s2_b1 (best)   95.87  94.68     88.27  83.64  86.29  81.20  77.70  86.95
hg_s8_b1 (last)   96.79  95.19     90.08  85.32  87.48  84.26  80.73  88.64
hg_s8_b1 (best)   96.79  95.28     90.27  85.56  87.57  84.30  81.06  88.78

The training and validation curves are shown below.

[Figure: training/validation curves]

Evaluate with Python

You may also evaluate the predictions by running python evaluation/eval_PCKh.py. It produces exactly the same result as the MATLAB script. Thanks to @sssruhan1 for the contribution.
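For orientation, PCKh@0.5 counts a predicted joint as correct when its distance to the ground truth is at most half the head size. The sketch below computes this in plain Python for a handful of joints. It illustrates the metric only and is not a port of eval_PCKh.py; the function name, argument layout, and visibility handling are assumptions.

```python
import math


def pckh(preds, gts, head_sizes, visible, thr=0.5):
    """Percentage of visible joints whose error is <= thr * head size.

    preds, gts: list (per image) of lists of (x, y) joint coordinates.
    head_sizes: one head size per image, in pixels.
    visible:    list (per image) of booleans, one per joint.
    """
    correct = 0
    total = 0
    for p_img, g_img, head, vis_img in zip(preds, gts, head_sizes, visible):
        for (px, py), (gx, gy), vis in zip(p_img, g_img, vis_img):
            if not vis:
                continue  # joints without annotation are skipped
            total += 1
            if math.hypot(px - gx, py - gy) / head <= thr:
                correct += 1
    return 100.0 * correct / total if total else 0.0


# One image, two joints, head size 20 px: errors of 2 px and 30 px.
score = pckh(
    preds=[[(10, 10), (50, 50)]],
    gts=[[(12, 10), (80, 50)]],
    head_sizes=[20],
    visible=[[True, True]],
)
print(score)
```

In this toy case the first joint is within the threshold (2 / 20 = 0.1) and the second is not (30 / 20 = 1.5), so half of the joints count as correct.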

Training

Run the following command in a terminal to train an 8-stack hourglass network on the MPII human pose dataset.

CUDA_VISIBLE_DEVICES=0 python example/main.py --dataset mpii -a hg --stacks 8 --blocks 1 --checkpoint checkpoint/mpii/hg8 -j 4

Here,

  • CUDA_VISIBLE_DEVICES=0 selects the GPU devices you want to use. For example, use CUDA_VISIBLE_DEVICES=0,1 if you want to use the two GPUs with IDs 0 and 1.
  • -j specifies how many workers to use for data loading.
  • --checkpoint specifies where to save the models, the log, and the predictions.
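Note that CUDA_VISIBLE_DEVICES also renumbers the devices: inside the process, the listed GPUs appear as devices 0, 1, and so on, in the order given. The sketch below shows that mapping in plain Python; visible_gpu_map is a hypothetical helper written for illustration and is not part of this repository.

```python
import os


def visible_gpu_map(env=None):
    """Map logical device index -> physical GPU ID from CUDA_VISIBLE_DEVICES."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # variable unset: all GPUs are visible, no remapping
    ids = [tok.strip() for tok in raw.split(",") if tok.strip()]
    return dict(enumerate(ids))


# With CUDA_VISIBLE_DEVICES=2,3 the process sees physical GPUs 2 and 3
# as logical devices 0 and 1.
mapping = visible_gpu_map({"CUDA_VISIBLE_DEVICES": "2,3"})
print(mapping)
```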

Miscs

Supported dataset

Supported models

Contribute

Please create a pull request if you want to contribute.

Owner
Wei Yang
NVIDIA Robotics Research Lab
Comments
  • Bug in crop method

    Hi, I was visualizing the heatmap inputs for the model and I'm not sure what I'm doing wrong but the crop method doesn't seem to work. This is the original crop method from this repo:

    def crop(img, center, scale, res, rot=0):
        img = im_to_numpy(img)
    
        # Preprocessing for efficient cropping
        ht, wd = img.shape[0], img.shape[1]
        sf = scale * 200.0 / res[0]
        if sf < 2:
            sf = 1
        else:
            new_size = int(np.math.floor(max(ht, wd) / sf))
            new_ht = int(np.math.floor(ht / sf))
            new_wd = int(np.math.floor(wd / sf))
            if new_size < 2:
                return torch.zeros(res[0], res[1], img.shape[2]) \
                            if len(img.shape) > 2 else torch.zeros(res[0], res[1])
            else:
                img = cv2.resize(img, (new_ht,new_wd))
                #img = scipy.misc.imresize(img, [new_ht, new_wd])
                center = center * 1.0 / sf
                scale = scale / sf
    
        # Upper left point
        ul = np.array(transform([0, 0], center, scale, res, invert=1))
        # Bottom right point
        br = np.array(transform(res, center, scale, res, invert=1))
    
        # Padding so that when rotated proper amount of context is included
        pad = int(np.linalg.norm(br - ul) / 2 - float(br[1] - ul[1]) / 2)
        if not rot == 0:
            ul -= pad
            br += pad
    
        new_shape = [br[1] - ul[1], br[0] - ul[0]]
        if len(img.shape) > 2:
            new_shape += [img.shape[2]]
        new_img = np.zeros(new_shape)
    
        # Range to fill new array
        new_x = max(0, -ul[0]), min(br[0], img.shape[1]) - ul[0]
        new_y = max(0, -ul[1]), min(br[1], img.shape[0]) - ul[1]
        # Range to sample from original image
        old_x = max(0, ul[0]), min(img.shape[1], br[0])
        old_y = max(0, ul[1]), min(img.shape[0], br[1])
        new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]]
    
        if not rot == 0:
            # Remove padding
            new_img = scipy.misc.imrotate(new_img, rot)
            new_img = new_img[pad:-pad, pad:-pad]
    
        #new_img = im_to_torch(scipy.misc.imresize(new_img, res))
        new_img = im_to_torch(cv2.resize(new_img, tuple(res)))
        return new_img
    
    

    I've replaced scipy.misc with cv2. On the other hand, this is the crop method from https://github.com/princeton-vl/pytorch_stacked_hourglass/blob/master/utils/img.py

    def crop_newell(img, center, scale, res, rot=0):
        img = im_to_numpy(img)
        # Upper left point
        ul = np.array(transform([0, 0], center, scale, res, invert=1))
        # Bottom right point
        br = np.array(transform(res, center, scale, res, invert=1))
    
        new_shape = [br[1] - ul[1], br[0] - ul[0]]
        if len(img.shape) > 2:
            print(img.shape)
            new_shape += [img.shape[2]]
        new_img = np.zeros(new_shape)
    
        # Range to fill new array
        new_x = max(0, -ul[0]), min(br[0], len(img[0])) - ul[0]
        new_y = max(0, -ul[1]), min(br[1], len(img)) - ul[1]
        # Range to sample from original image
        old_x = max(0, ul[0]), min(len(img[0]), br[0])
        old_y = max(0, ul[1]), min(len(img), br[1])
        new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]]
    
        new_img = im_to_torch(cv2.resize(new_img, tuple(res)))
        return new_img
    

    These are the results I get (left: crop_newell, right: crop): [images 1, 2, 3]

    As you can see, the crop method sometimes works well, and sometimes doesn't. It's usually the latter. What could be the issue? Am I doing something wrong? @bearpaw
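One difference between the two functions that may be worth checking: cv2.resize takes its target size as (width, height), whereas the commented-out scipy.misc.imresize call took [height, width]. Passing (new_ht, new_wd) to cv2.resize would therefore swap the axes of non-square images whenever the downscaling branch runs (sf >= 2), which would be consistent with the crop failing for only some samples. The sketch below reproduces just the size computation in plain Python; downscale_dsize is a hypothetical helper, not part of the repository.

```python
import math


def downscale_dsize(ht, wd, scale, res, ref=200.0):
    """Return (sf, dsize) for the pre-crop downscale, or None if it is skipped.

    dsize is ordered (width, height), which is the order cv2.resize expects.
    """
    sf = scale * ref / res[0]
    if sf < 2:
        return None  # image is used at its original size
    new_ht = int(math.floor(ht / sf))
    new_wd = int(math.floor(wd / sf))
    if max(new_ht, new_wd) < 2:
        return None  # too small to crop; the quoted code returns zeros here
    return sf, (new_wd, new_ht)


# A 720x1280 (height x width) image with scale 5.0 and a 256x256 target.
result = downscale_dsize(720, 1280, 5.0, (256, 256))
print(result)
```

With these inputs the helper yields a dsize of (327, 184), that is, width first; the call would then be cv2.resize(img, dsize). The center and scale still need to be divided by sf afterwards, as in the quoted code.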

  • About function "transform" in transforms.py

    Hi,

    Thanks for your code. I have one question about some code in the "transform" function.

    new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.]).T
    new_pt = np.dot(t, new_pt)
    return new_pt[:2].astype(int) + 1

    According to the above code, you first subtract 1 from the coordinates and then add 1 after the transformation. I don't see the reason for doing this. There are two places that call this "transform" function. The first is in datasets/mpii.py:

    tpts[i, 0:2] = to_torch(transform(tpts[i, 0:2]+1, c, s, [self.out_res, self.out_res], rot=r))
    target[i] = draw_labelmap(target[i], tpts[i]-1, self.sigma, type=self.label_type)

    Here you add 1 before and subtract 1 after calling the "transform" function, which just offsets what you do inside it. For this case, we could remove the plus 1 and minus 1 for clarity.

    Second, function "final_preds" calls function "transform_preds" which then calls "transform" as follows:

    coords[p, 0:2] = to_torch(transform(coords[p, 0:2], center, scale, res, 1, 0))

    In this case, I also read the original Torch code: https://github.com/anewell/pose-hg-demo/blob/master/util.lua It seems they don't add 1 and subtract 1 afterwards. I think adding 1 is not equivalent to subtracting 1 after the transformation. Could you please explain your reasoning?
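To make the offset question concrete, the sketch below mimics the three quoted lines with a plain 3x3 affine matrix and compares a 0-based call wrapped in +1/-1 against applying the matrix directly. It only illustrates the arithmetic; apply_affine, transform_1based, transform_0based, and the example matrix are hypothetical stand-ins, not the repository's functions.

```python
def apply_affine(t, x, y):
    """Apply a 3x3 affine matrix (nested lists) to the point (x, y)."""
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])


def transform_1based(pt, t):
    """Mimic the quoted code: shift to 0-based, transform, truncate, shift back."""
    x, y = apply_affine(t, pt[0] - 1, pt[1] - 1)
    return (int(x) + 1, int(y) + 1)  # int() truncates toward zero, like astype(int)


def transform_0based(pt, t):
    """Wrap the 1-based function with +1 / -1, as the quoted dataset code does."""
    x, y = transform_1based((pt[0] + 1, pt[1] + 1), t)
    return (x - 1, y - 1)


# Example matrix: scale by 0.25, then translate by (3, 5).
T = [[0.25, 0.0, 3.0],
     [0.0, 0.25, 5.0],
     [0.0, 0.0, 1.0]]

direct = tuple(int(v) for v in apply_affine(T, 40, 82))
wrapped = transform_0based((40, 82), T)
unwrapped = transform_1based((40, 82), T)
print(direct, wrapped, unwrapped)
```

Under these assumptions the outer +1/-1 and the inner -1/+1 cancel exactly for 0-based input, so direct and wrapped agree. A call that passes coordinates without the wrapper treats them as 1-based, so if those coordinates are actually 0-based the result can differ, as unwrapped shows here.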

    Thanks,

  • Does this code reproduce the results of 8 stacked hg in the original paper?

    Hi,

    Thanks for sharing your code! Does this code reproduce the results of the 8-stack HG in the original paper? If not, what are your results for the 8-stack HG, and what might explain the gap?

    Best,

  • Can anyone reproduce the same training accuracy as claimed with PyTorch 0.4?

    I trained with the original code and dataset on two different machines, one with a 1060 GPU and another with two 1080 Ti GPUs, but I have never reached an accuracy above 70%, and it grows very slowly (some people got 20% after 2 epochs, but mine is still well below 10%). Someone mentioned in another issue that they couldn't get good performance on PyTorch 0.4.0 either, so I wonder whether anyone has. I really don't want to downgrade my PyTorch version, since I have been modifying the code to implement parts of a paper that won't work on older PyTorch versions.

  • (1) raise ValueError(reduction + " is not a valid value for reduction") (2) dists[c, n] = torch.dist(preds[n,c,:], target[n,c,:])/normalize[n] RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'other'

    There are two bugs in the latest repo, updated on Jan 8th, 2019. The first one is as described in the title. I think it is caused by the PyTorch version (0.4.1 vs. 1.0). I solved this problem by changing the parameter of torch.nn.MSELoss() to 'size_average=True'.

    The second bug is: 'dists[c, n] = torch.dist(preds[n,c,:], target[n,c,:])/normalize[n] RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'other''. It is caused by the type of the variable. I solved it by changing the variable 'target' from a CUDA tensor to a CPU tensor.
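The first error is consistent with the reduction argument of torch.nn.MSELoss accepting 'elementwise_mean' in PyTorch 0.4.1 and 'mean' from 1.0 onwards. The sketch below picks the name from the version string. mse_reduction_name is a hypothetical helper, the version parsing is deliberately simple, and it assumes PyTorch 0.4.1 or newer (as the README requires).

```python
import re


def mse_reduction_name(torch_version):
    """Return the mean-reduction value for a torch version string (>= 0.4.1)."""
    # Drop any local suffix such as "+cu117", then keep major and minor only.
    nums = [int(n) for n in re.findall(r"\d+", torch_version.split("+")[0])[:2]]
    major, minor = (nums + [0, 0])[:2]
    # 0.4.1 spells it 'elementwise_mean'; 1.0 and later use 'mean'.
    return "mean" if (major, minor) >= (1, 0) else "elementwise_mean"


for version in ("0.4.1", "1.0.0", "1.13.1+cu117"):
    print(version, mse_reduction_name(version))
```

It would be used as torch.nn.MSELoss(reduction=mse_reduction_name(torch.__version__)). For the second error, both arguments of torch.dist need to live on the same device, for example by calling target.cpu() as described above or by moving preds with preds.to(target.device).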

  • Validation acc varies with test batch size.

    @bearpaw Thank you for your wonderful work. I have trained my network, but when I validated my best model with different test batch sizes, I got different validation accuracies, although I think the accuracy should be independent of the test batch size.

    Test batch size    Val Acc
    1                  0.8743
    6                  0.8660
    16                 0.8685

    Also, what test batch size did you use for the published results?
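One way such a dependence can arise, offered as a possibility rather than a diagnosis of this code: if accuracy is computed per batch over only the joints annotated in that batch, and the per-batch values are then averaged, the result is a mean of ratios, which changes with how the samples are grouped. The sketch below uses made-up (correct, visible) joint counts per image; batched_accuracy is a hypothetical helper.

```python
def batched_accuracy(samples, batch_size):
    """Average of per-batch accuracies, weighted by the number of images per batch.

    samples: list of (num_correct, num_visible) pairs, one per image.
    """
    weighted_sum = 0.0
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        correct = sum(c for c, _ in batch)
        visible = sum(v for _, v in batch)
        batch_acc = float(correct) / visible if visible else 0.0
        weighted_sum += batch_acc * len(batch)
    return weighted_sum / len(samples)


# Made-up counts: (correct joints, visible joints) for four images.
samples = [(16, 16), (1, 4), (8, 16), (4, 8)]
for bs in (1, 2, 4):
    print(bs, round(batched_accuracy(samples, bs), 4))
```

The same four images give a different value for each batch size. Accumulating the correct and visible counts over the whole split and dividing once at the end would remove the dependence.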

  • got stuck in training.

    Epoch: 54 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000265s | Batch: 0.286s | Total: 0:20:19 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.7968
    Processing |################################| (493/493) Data: 0.000184s | Batch: 0.127s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 0.0035 | Acc:  0.8025
    
    Epoch: 55 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000226s | Batch: 0.283s | Total: 0:20:22 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.7994
    Processing |################################| (493/493) Data: 0.000168s | Batch: 0.128s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 0.0036 | Acc:  0.7947
    
    Epoch: 56 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000174s | Batch: 0.249s | Total: 0:20:24 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.7997
    Processing |################################| (493/493) Data: 0.000158s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0038 | Acc:  0.8001
    
    Epoch: 57 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000286s | Batch: 0.309s | Total: 0:20:31 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.8026
    Processing |################################| (493/493) Data: 0.000217s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0035 | Acc:  0.7993
    
    Epoch: 58 | LR: 0.00005000
    Processing |################################| (3708/3708) Data: 0.000279s | Batch: 0.296s | Total: 0:20:16 | ETA: 0:00:01 | Loss: 0.0033 | Acc:  0.8038
    Processing |################################| (493/493) Data: 0.000223s | Batch: 0.122s | Total: 0:01:00 | ETA: 0:00:01 | Loss: 0.0036 | Acc:  0.7977
    
    Epoch: 59 | LR: 0.00005000
    Processing |######                          | (789/3708) Data: 0.000346s | Batch: 0.328s | Total: 0:04:18 | ETA: 0:15:57 | Loss: 0.0033 | Acc:  0.8042
    

    It just stays here and doesn't move any more.

  • Data loader is slow

    The data loader seems to be extremely slow for some batches. Every few batches (e.g., every 10 or 20), it takes a few seconds (up to 15 s) to load the data. I have tried increasing the number of data loader workers (via the option -j 12) and increasing the train batch size, but the issue persists. Is this expected? Is it because of the data transforms? The issue becomes severe when I run the code on more than one GPU. Most of the time the GPUs remain idle, which increases the overall time per epoch (1 hr 20 min for me).

    My machine configuration is: 4x 1080 Ti, Intel Xeon E5-2640, and I am loading the data from an SSD.

  • evaluation.py

    Hi, I ran the training code and got the following error:

    Traceback (most recent call last):
      File "example/mpii.py", line 318, in <module>
        main(parser.parse_args())
      File "example/mpii.py", line 104, in main
        valid_loss, valid_acc, predictions = validate(val_loader, model, criterion, args.debug, args.flip)
      File "example/mpii.py", line 233, in validate
        acc = accuracy(score_map.cuda(), target, idx)
      File "~/pytorch-pose/pose/utils/evaluation.py", line 61, in accuracy
        acc[i+1] = dist_acc(dists[:, idxs[i]-1, :])
    IndexError: index 10 is out of range for dimension 1 (of size 6)
    

    Is this a bug?

  • ImportError: No module named pose

    Thanks for your code! Could you give me some advice?

    ~$ cd /home/sun/pytorch-pose
    ~/pytorch-pose$ ln -s PATH_TO_MPII_IMAGES_DIR data/mpii/images
    ~/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg4 --checkpoint checkpoint/mpii/hg4 --resume checkpoint/mpii/hg4/model_best.pth.tar -e -d
    Traceback (most recent call last):
      File "example/mpii.py", line 14, in <module>
        from pose import Bar
    ImportError: No module named pose
    ~/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg4 --checkpoint checkpoint/mpii/hg4 --resume checkpoint/mpii/hg4/model_best.pth.tar -e -d
    Traceback (most recent call last):
      File "example/mpii.py", line 14, in <module>
        from pose import Bar
    ImportError: No module named pose
    ~/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg1 --checkpoint checkpoint/mpii/hg1 -j 4
    Traceback (most recent call last):
      File "example/mpii.py", line 14, in <module>
        from pose import Bar
    ImportError: No module named pose

    I really need help! Thank you!
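This error usually means that the repository root (the directory containing the pose package) is not on Python's import path when the script runs. Below is a minimal sketch of one workaround, assuming the script lives in an example/ directory directly under the repository root; add_repo_root is a hypothetical helper, not something the repository provides.

```python
import os
import sys


def add_repo_root(script_path):
    """Put the parent of the script's directory on sys.path and return it."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    root = os.path.abspath(os.path.join(script_dir, os.pardir))
    if root not in sys.path:
        sys.path.insert(0, root)  # searched before site-packages
    return root


# Example with a made-up location; in a real script pass __file__ instead.
root = add_repo_root("/tmp/pytorch-pose/example/mpii.py")
print(root)
```

Calling add_repo_root(__file__) at the top of the script, before the line that imports from pose, is one option. Another is to leave the script untouched and run it from the repository root with PYTHONPATH=. set in front of the command.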

  • train acc is low

    Hello, this is really great work! But I have a question about the training accuracy: why is my train acc always lower than the val acc? I hope you can help me, @bearpaw.

  • Question about calculating PCK in Python

    parser = argparse.ArgumentParser(description='MPII PCKh Evaluation')
    parser.add_argument('-r', '--result', default='checkpoint/mpii/hg_s2_b1/preds.mat',
                        type=str, metavar='PATH',
                        help='path to result (default: checkpoint/mpii/hg_s2_b1/preds.mat)')
    args = parser.parse_args()

    I can't find any preds.mat file. I hope to get some answers, thanks.

  • I cannot train!

    ==> creating model 'hg', stacks=1, blocks=1
    => no checkpoint found at './checkpoint/mpii/hg-s1-b1/checkpoint.pth.tar'
        Total params: 3.59M
    Traceback (most recent call last):
      File "./example/main.py", line 431, in <module>
        main(parser.parse_args())
      File "./example/main.py", line 116, in main
        train_dataset = datasets.__dict__[args.dataset](is_train=True, **vars(args))
      File "/home/ubuntu/zq/Projects/hourglass0122/pytorch-pose/example/../pose/datasets/mpii.py", line 138, in mpii
        return Mpii(**kwargs)
      File "/home/ubuntu/zq/Projects/hourglass0122/pytorch-pose/example/../pose/datasets/mpii.py", line 30, in __init__
        with open(self.jsonfile) as anno_file:
    IOError: [Errno 2] No such file or directory: ''

  • Minor bug fix in _make_fc function

    The function _make_fc takes the parameters inplanes and outplanes. The conv layer inside _make_fc is followed by bn. Because the input channels of bn must match the output channels of the conv layer, the parameter of bn should be changed from inplanes to outplanes. It has not caused any error because _make_fc is always called with the same value for inplanes and outplanes. Even so, I think it should be fixed.
