Official MegEngine implementation of CREStereo (CVPR 2022 Oral).

[CVPR 2022] Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation

This repository contains the MegEngine implementation of our paper:

Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation
Jiankun Li, Peisen Wang, Pengfei Xiong, Tao Cai, Ziwei Yan, Lei Yang, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu
CVPR 2022

arXiv | BibTeX

Datasets

The Proposed Dataset

Download

There are two ways to download the dataset (~400 GB) proposed in our paper:

  • Download using the shell script dataset_download.sh:

sh dataset_download.sh

The dataset will be downloaded and extracted to ./stereo_trainset/crestereo.

  • Download from BaiduCloud here (extraction code: aa3g) and extract the tar files manually.

Disparity Format

The disparity is saved as a uint16 .png, which can be loaded with OpenCV's imread function and divided by 32 to recover disparity in pixels:

import cv2
import numpy as np

def get_disp(disp_path):
    disp = cv2.imread(disp_path, cv2.IMREAD_UNCHANGED)
    return disp.astype(np.float32) / 32
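
If you need to write disparity maps in the same format (for example, to prepare additional training data), the inverse looks roughly like the sketch below. Note that save_disp is not a helper shipped with this repository, and disparities above roughly 2047 px would overflow uint16 after the x32 scaling.

def save_disp(disp_path, disp):
    # disp: float32 disparity in pixels; reuses the cv2/np imports above
    disp_u16 = np.round(disp * 32).astype(np.uint16)
    cv2.imwrite(disp_path, disp_u16)  # OpenCV writes single-channel uint16 PNGs natively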

Other Public Datasets

Other public datasets we use include:

Dependencies

CUDA Version: 10.1, Python Version: 3.6.9

  • MegEngine v1.8.2
  • opencv-python v3.4.0
  • numpy v1.18.1
  • Pillow v8.4.0
  • tensorboardX v2.1
python3 -m pip install -r requirements.txt
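
Several of the reports in the comments below turned out to be version-related (for example, MegEngine 1.9.0 breaking test.py), so it is worth verifying the installed versions against the list above first. A small sanity-check sketch:

import megengine
import cv2
import numpy
import PIL
import tensorboardX

# compare against the versions listed above
print("MegEngine    :", megengine.__version__)     # expected 1.8.2
print("opencv-python:", cv2.__version__)           # expected 3.4.0
print("numpy        :", numpy.__version__)         # expected 1.18.1
print("Pillow       :", PIL.__version__)           # expected 8.4.0
print("tensorboardX :", tensorboardX.__version__)  # expected 2.1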

We also provide a Docker image to run the code quickly:

docker run --gpus all -it -v /tmp:/tmp ylmegvii/crestereo
shotwell /tmp/disparity.png

Inference

Download the pretrained MegEngine model from here and run:

python3 test.py --model_path path_to_mge_model --left img/test/left.png --right img/test/right.png --size 1024x1536 --output disparity.png
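
Beyond the command-line script, the model can also be driven from Python. The sketch below mirrors the two-pass cascaded inference visible in the test.py traceback further down this page (a coarse pass at half resolution whose flow initializes the full-resolution pass). The import path, checkpoint layout, pre-processing, and the sign of the returned flow are assumptions, so treat this as a starting point rather than the repository's exact code.

import numpy as np
import megengine as mge
import megengine.functional.vision as FV

from nets import Model  # assumed import path for nets/crestereo.py

def load_model(model_path):
    # constructor arguments follow the call quoted in the comments below (test.py:15)
    model = Model(max_disp=256, mixed_precision=False, test_mode=True)
    ckpt = mge.load(model_path)
    model.load_state_dict(ckpt.get("state_dict", ckpt), strict=True)  # checkpoint layout is an assumption
    model.eval()
    return model

def infer_disparity(model, left_bgr, right_bgr, n_iter=20):
    # inputs: two HxWx3 uint8 images already resized to the inference size (e.g. 1024x1536, as in --size)
    left = mge.tensor(left_bgr.transpose(2, 0, 1)[None].astype("float32"))
    right = mge.tensor(right_bgr.transpose(2, 0, 1)[None].astype("float32"))

    # coarse pass at half resolution, then a full-resolution pass initialized with its flow
    left_dw2 = FV.interpolate(left, scale_factor=0.5, mode="bilinear", align_corners=True)
    right_dw2 = FV.interpolate(right, scale_factor=0.5, mode="bilinear", align_corners=True)
    flow_dw2 = model(left_dw2, right_dw2, iters=n_iter, flow_init=None)
    flow = model(left, right, iters=n_iter, flow_init=flow_dw2)

    # horizontal component of the flow as disparity; sign convention is an assumption
    return np.abs(flow.numpy()[0, 0])

Resizing, padding, and post-processing in the real test.py may differ; when in doubt, follow the command-line invocation above.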

Training

Modify the configurations in cfgs/train.yaml and run the following command:

python3 train.py

You can launch TensorBoard to monitor the training process:

tensorboard --logdir ./train_log

and navigate to the page at http://localhost:6006 in your browser.
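
TensorBoard reads whatever event files it finds under --logdir. If you want to log additional scalars while modifying train.py, tensorboardX (already in the dependency list) writes compatible event files; below is a minimal, illustrative sketch in which the tag name and values are hypothetical.

from tensorboardX import SummaryWriter

writer = SummaryWriter(logdir="./train_log")  # same directory passed to --logdir above
for step, loss in enumerate([26.19, 6.847, 6.83]):  # placeholder values, not real training output
    writer.add_scalar("train/loss", loss, global_step=step)
writer.close()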

Acknowledgements

Part of the code is adapted from previous works:

We thank all the authors for their awesome repos.

Citation

If you find the code or datasets helpful in your research, please cite:

@misc{Li2022PracticalSM,
      title={Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation},
      author={Jiankun Li and Peisen Wang and Pengfei Xiong and Tao Cai and Ziwei Yan and Lei Yang and Jiangyu Liu and Haoqiang Fan and Shuaicheng Liu},
      year={2022},
      eprint={2203.11483},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Comments
  • Is CUDA 11.6 supported?

    This is a really promising project, congratulations and thanks for releasing it!

    I'm trying to run the test script with your Eth3d model and this command: python3 test.py --model_path path_to_mge_model --left img/test/left.png --right img/test/right.png --size 1024x1536 --output disparity.png

    But the code hangs up and doesn't return from this line in extractor.py:82: self.conv2 = M.Conv2d(128, output_dim, kernel_size=1)

    which is called from load_model in test.py:15: model = Model(max_disp=256, mixed_precision=False, test_mode=True)

    My GPU is NVIDIA RTX A6000 and the CUDA version on the system is v11.6

  • Results on Holopix50k dataset

    Hello! Thank you for sharing the code and the model. I tested the pre-trained model on the Holopix50k test dataset, but didn't get results similar to those shown in the paper. If I run the crestereo_eth3d.mge model on this dataset, does it require different parameter settings or pre-processing? How can I get similar results on the Holopix50k dataset? Any advice would be very helpful. Thank you in advance!

  • Did you obtain results on Holopix50k with published model?

    I've tried to run the published model on a few images from Holopix50k and got awful results. Can you please tell me how to obtain results similar to the paper? A different model / different preprocessing?

  • MegEngine 1.9.0 causes test.py error

    I have been playing around a bit with the code (thank you so much, by the way. Having heaps of fun with it) and found out that MegEngine 1.9.0 causes test.py to die with the following output:

    Images resized: 1024x1536
    Model Forwarding...
    Traceback (most recent call last):
      File "test.py", line 94, in <module>
        pred = inference(left_img, right_img, model_func, n_iter=20)
      File "test.py", line 45, in inference
        pred_flow_dw2 = model(imgL_dw2, imgR_dw2, iters=n_iter, flow_init=None)
      File "/usr/local/lib/python3.6/dist-packages/megengine/module/module.py", line 149, in __call__
        outputs = self.forward(*inputs, **kwargs)
      File "/home/dgxmartin/workspace/CREStereo/nets/crestereo.py", line 210, in forward
        align_corners=True,
      File "/usr/local/lib/python3.6/dist-packages/megengine/functional/vision.py", line 663, in interpolate
        [wscale, Tensor([0, 0], dtype="float32", device=inp.device)], axis=0
      File "/usr/local/lib/python3.6/dist-packages/megengine/functional/tensor.py", line 405, in concat
        (result,) = apply(builtin.Concat(axis=axis, comp_node=device.to_c()), *inps)
    TypeError: py_apply expects tensor as inputs
    

    For the time being, the MegEngine version should be pinned to exactly 1.8.2.

  • What datasets are used for pretraining?

    The pretrained model works amazingly well on real-life photos! What datasets are used for pretraining? Can you please provide the training details of the pretrained model? Thanks!

  • Update requirements.txt to MegEngine v1.9.1

    function.Pad may produce weird NaNs in MegEngine v1.8.2; MegEngine v1.9.0 resolves this but brings more problems, as pointed out in https://github.com/megvii-research/CREStereo/pull/14.

    The most recent release, v1.9.1, resolves all of these problems; this updates the MegEngine version constraint to v1.9.1 or later.

  • nan

    2022/06/01 14:17:17 Model params saved: train_logs/models/epoch-1.mge
    2022/06/01 14:17:25 0.66 b/s,passed:00:13:16,eta:21:41:36,data_time:0.16,lr:0.0004,[2/100:5/500] ==> loss:26.19
    2022/06/01 14:17:32 0.65 b/s,passed:00:13:24,eta:21:40:40,data_time:0.17,lr:0.0004,[2/100:10/500] ==> loss:6.847
    2022/06/01 14:17:40 0.68 b/s,passed:00:13:31,eta:21:39:57,data_time:0.14,lr:0.0004,[2/100:15/500] ==> loss:6.83
    2022/06/01 14:17:47 0.67 b/s,passed:00:13:39,eta:21:39:12,data_time:0.16,lr:0.0004,[2/100:20/500] ==> loss:16.89
    2022/06/01 14:17:55 0.66 b/s,passed:00:13:46,eta:21:38:28,data_time:0.17,lr:0.0004,[2/100:25/500] ==> loss:43.18
    2022/06/01 14:18:02 0.66 b/s,passed:00:13:54,eta:21:37:36,data_time:0.17,lr:0.0004,[2/100:30/500] ==> loss:20.37
    2022/06/01 14:18:10 0.65 b/s,passed:00:14:01,eta:21:36:52,data_time:0.18,lr:0.0004,[2/100:35/500] ==> loss:15.24
    2022/06/01 14:18:17 0.65 b/s,passed:00:14:09,eta:21:36:18,data_time:0.19,lr:0.0004,[2/100:40/500] ==> loss:9.399
    2022/06/01 14:18:25 0.67 b/s,passed:00:14:16,eta:21:35:41,data_time:0.16,lr:0.0004,[2/100:45/500] ==> loss:40.27
    2022/06/01 14:18:32 0.68 b/s,passed:00:14:24,eta:21:34:58,data_time:0.14,lr:0.0004,[2/100:50/500] ==> loss:15.02
    2022/06/01 14:18:40 0.69 b/s,passed:00:14:31,eta:21:34:14,data_time:0.14,lr:0.0004,[2/100:55/500] ==> loss:32.48
    2022/06/01 14:18:47 0.65 b/s,passed:00:14:39,eta:21:33:42,data_time:0.18,lr:0.0004,[2/100:60/500] ==> loss:9.96
    2022/06/01 14:18:55 0.65 b/s,passed:00:14:46,eta:21:33:16,data_time:0.18,lr:0.0004,[2/100:65/500] ==> loss:14.69
    2022/06/01 14:19:02 0.68 b/s,passed:00:14:54,eta:21:32:35,data_time:0.13,lr:0.0004,[2/100:70/500] ==> loss:nan
    2022/06/01 14:19:10 0.65 b/s,passed:00:15:01,eta:21:31:55,data_time:0.19,lr:0.0004,[2/100:75/500] ==> loss:nan
    2022/06/01 14:19:17 0.68 b/s,passed:00:15:09,eta:21:31:14,data_time:0.15,lr:0.0004,[2/100:80/500] ==> loss:nan
    2022/06/01 14:19:25 0.67 b/s,passed:00:15:16,eta:21:30:34,data_time:0.15,lr:0.0004,[2/100:85/500] ==> loss:nan
    2022/06/01 14:19:32 0.67 b/s,passed:00:15:24,eta:21:30:08,data_time:0.17,lr:0.0004,[2/100:90/500] ==> loss:nan
    2022/06/01 14:19:40 0.69 b/s,passed:00:15:31,eta:21:29:28,data_time:0.14,lr:0.0004,[2/100:95/500] ==> loss:nan
    2022/06/01 14:19:47 0.65 b/s,passed:00:15:39,eta:21:28:54,data_time:0.17,lr:0.0004,[2/100:100/500] ==> loss:nan
    2022/06/01 14:19:55 0.68 b/s,passed:00:15:46,eta:21:28:11,data_time:0.14,lr:0.0004,[2/100:105/500] ==> loss:nan
    2022/06/01 14:20:02 0.65 b/s,passed:00:15:54,eta:21:27:38,data_time:0.17,lr:0.0004,[2/100:110/500] ==> loss:nan
    2022/06/01 14:20:10 0.64 b/s,passed:00:16:01,eta:21:27:04,data_time:0.2,lr:0.0004,[2/100:115/500] ==> loss:nan
    2022/06/01 14:20:17 0.67 b/s,passed:00:16:09,eta:21:26:28,data_time:0.16,lr:0.0004,[2/100:120/500] ==> loss:nan
    2022/06/01 14:20:25 0.66 b/s,passed:00:16:16,eta:21:26:04,data_time:0.17,lr:0.0004,[2/100:125/500] ==> loss:nan
    2022/06/01 14:20:32 0.68 b/s,passed:00:16:24,eta:21:25:20,data_time:0.15,lr:0.0004,[2/100:130/500] ==> loss:nan

    Hello! These are my training logs. Why does the loss become nan?

  • Dataset for reproducing the results

    Thank you for the great job! Is it possible, or is there a future plan, to release the datasets used for training the model so that someone else can reproduce the results reported in the paper?

  • Finetune: in the second batch, loss is nan.

    Hi, it's really nice work! But when I fine-tune using your pre-trained model, the loss becomes nan in the second batch. I checked the data input to the model: the left and right images are the original data without any preprocessing, and the disparity is the absolute value. I don't know where the problem is. Can you offer some advice? Thanks. The log follows:

    left.max(), left.min(): Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    right.max(), right.min(): Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    gt_disp.max(), gt_disp.min(): Tensor(65.625, device=xpux:0) Tensor(0.0, device=xpux:0)
    valid_mask.max(), valid_mask.min(): Tensor(1.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    The i-th iteration prediction loss :
    0 Tensor(68.409615, device=xpux:0) Tensor(-0.72061765, device=xpux:0)
    1 Tensor(69.27495, device=xpux:0) Tensor(-7.1237144, device=xpux:0)
    2 Tensor(68.630264, device=xpux:0) Tensor(-2.3412788, device=xpux:0)
    3 Tensor(67.001595, device=xpux:0) Tensor(-0.64989996, device=xpux:0)
    4 Tensor(67.27512, device=xpux:0) Tensor(-0.53194094, device=xpux:0)
    5 Tensor(66.031105, device=xpux:0) Tensor(-1.1353028, device=xpux:0)
    6 Tensor(66.7748, device=xpux:0) Tensor(-2.5566366, device=xpux:0)
    7 Tensor(66.69823, device=xpux:0) Tensor(-0.30609164, device=xpux:0)
    8 Tensor(66.8682, device=xpux:0) Tensor(-0.37459654, device=xpux:0)
    9 Tensor(66.893974, device=xpux:0) Tensor(-0.80092835, device=xpux:0)
    10 Tensor(66.295364, device=xpux:0) Tensor(-1.110324, device=xpux:0)
    11 Tensor(67.22122, device=xpux:0) Tensor(-3.059827, device=xpux:0)
    12 Tensor(66.74182, device=xpux:0) Tensor(-0.807206, device=xpux:0)
    13 Tensor(66.88104, device=xpux:0) Tensor(-0.45083997, device=xpux:0)
    14 Tensor(67.27106, device=xpux:0) Tensor(-0.62685704, device=xpux:0)
    15 Tensor(67.43465, device=xpux:0) Tensor(-0.7094991, device=xpux:0)
    16 Tensor(67.55379, device=xpux:0) Tensor(-0.38040105, device=xpux:0)
    17 Tensor(67.453476, device=xpux:0) Tensor(-1.5267422, device=xpux:0)
    18 Tensor(67.46704, device=xpux:0) Tensor(-0.3359019, device=xpux:0)
    19 Tensor(67.47497, device=xpux:0) Tensor(-0.32194442, device=xpux:0)
    Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    Tensor(69.34766, device=xpux:0) Tensor(0.0, device=xpux:0)
    Tensor(1.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    0 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    1 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    2 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    3 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    4 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    5 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    6 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    7 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    8 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    9 Tensor(nan, device=xpux:0) Tensor(nan, device=xpu
