PyTorch implementation for MINE: Continuous-Depth MPI with Neural Radiance Fields

MINE: Continuous-Depth MPI with Neural Radiance Fields

Project Page | Video

PyTorch implementation for our ICCV 2021 paper.

MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis
Jiaxin Li*¹, Zijian Feng*¹, Qi She¹, Henghui Ding¹, Changhu Wang¹, Gim Hee Lee²
¹ByteDance, ²National University of Singapore
* denotes equal contribution

Our MINE takes a single image as input and densely reconstructs the frustum of the camera, through which we can easily render novel views of the given scene:

[GIF: fern]

The overall architecture of our method:

Run training on the LLFF dataset:

First, set up your conda environment:

conda env create -f environment.yml 
conda activate MINE

Download the pre-downsampled version of the LLFF dataset from Google Drive, unzip it and put it in the root of the project, then start training by running the following command:

sh start_training.sh MASTER_ADDR="localhost" MASTER_PORT=1234 N_NODES=1 GPUS_PER_NODE=2 NODE_RANK=0 WORKSPACE=/run/user/3861/vs_tmp DATASET=llff VERSION=debug EXTRA_CONFIG='{"training.gpus": "0,1"}'

You may find the tensorboard logs and checkpoints in the sub-working directory (WORKSPACE + VERSION).

Apart from the LLFF dataset, we experimented on the RealEstate10K, KITTI Raw and Flowers Light Fields datasets. The data pre-processing code and training flow for these datasets will be released later.

Running our pretrained models:

We release the pretrained models trained on the RealEstate10K, KITTI and the Flowers datasets:

Dataset       | N  | Input Resolution | Download Link
--------------+----+------------------+--------------
RealEstate10K | 32 | 384x256          | Google Drive
RealEstate10K | 64 | 384x256          | Google Drive
KITTI         | 32 | 768x256          | Google Drive
KITTI         | 64 | 768x256          | Google Drive
Flowers       | 32 | 512x384          | Google Drive
Flowers       | 64 | 512x384          | Google Drive

To run the models, download the checkpoint and the hyper-parameter yaml file and place them in the same directory, then run the following script:

python3 visualizations/image_to_video.py --checkpoint_path MINE_realestate10k_384x256_monodepth2_N64/checkpoint.pth --gpus 0 --data_path visualizations/home.jpg --output_dir .
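
Before launching the script, it can help to confirm that the checkpoint and its hyper-parameter yaml file really sit in the same directory. The snippet below is a minimal sketch of such a check and is not part of the repository: the helper name find_config_next_to and the file name params.yaml are hypothetical, and it only assumes that the config file ends in .yaml or .yml.

```python
import tempfile
from pathlib import Path


def find_config_next_to(checkpoint_path):
    """Return the first .yaml/.yml file found in the checkpoint's directory.

    Hypothetical helper for illustration; not from the MINE repository.
    """
    ckpt = Path(checkpoint_path)
    if not ckpt.is_file():
        raise FileNotFoundError(f"checkpoint not found: {ckpt}")
    configs = sorted(ckpt.parent.glob("*.yaml")) + sorted(ckpt.parent.glob("*.yml"))
    if not configs:
        raise FileNotFoundError(f"no yaml config next to {ckpt}")
    return configs[0]


# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "checkpoint.pth"
    ckpt.write_bytes(b"")  # empty stand-in for real weights
    (Path(tmp) / "params.yaml").write_text("# placeholder\n")
    found = find_config_next_to(ckpt)
    print(found.name)
```

In practice you would pass the same path that you give to --checkpoint_path and fix the directory layout if the helper raises.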

Citation

If you find our work helpful to your research, please cite our paper:

@inproceedings{mine2021,
  title={MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis},
  author={Jiaxin Li and Zijian Feng and Qi She and Henghui Ding and Changhu Wang and Gim Hee Lee},
  year={2021},
  booktitle={ICCV},
}
Owner
Zijian Feng
machine learning | computer vision | random traveller | music enthusiast
Comments
  • KITTI split and LPIPS computation

    Hi,

    Thank you for the fantastic work! I have two small questions regarding model evaluation.

    1. KITTI raw data split: Section 4.1 mentions that 20 city sequences from KITTI Raw are used for training and 4 sequences for testing. However, KITTI Raw contains 28 city sequences in total. Do you use the remaining 4 sequences anywhere in the pipeline? Are the 20 training sequences and 4 test sequences exactly the same as those used in Tulsiani 2018, as implemented here?

    2. LPIPS computation: You computed LPIPS here. According to the dataloader implemented here, your inputs to LPIPS are in the range [0, 1], while LPIPS expects inputs in the range [-1, 1], as mentioned in their doc. Am I missing anything, or should the inputs indeed be normalized to obtain the correct LPIPS score?
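
For reference, the rescaling in question is the affine map x -> 2x - 1. The sketch below shows it on plain Python floats; the helper name to_lpips_range is made up for illustration. With image tensors the same expression would apply element-wise. As far as I know, the lpips package's forward call also accepts a normalize flag for inputs in [0, 1], but treat that as an assumption to verify against its documentation.

```python
def to_lpips_range(values):
    """Map values from [0, 1] to [-1, 1] using x -> 2x - 1.

    Illustrative helper; not taken from the MINE or LPIPS code.
    """
    out = []
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"expected a value in [0, 1], got {v}")
        out.append(2.0 * v - 1.0)
    return out


pixels = [0.0, 0.25, 0.5, 1.0]
rescaled = to_lpips_range(pixels)
print(rescaled)
```

Whether the reported scores change noticeably with and without this rescaling is something only a rerun of the evaluation can tell.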

    Thank you in advance for the time.

  • minimum hardware requirements

    Thank you for your nice work!

    If I want to run your code, do I need the 48 V100 GPUs mentioned in the paper?

    What are the minimum requirements to run this code?

    Thanks in advance.

  • Question about Training Data Requirements

    Hi, Thanks for your interesting work.

    I have a question regarding training data, but I cannot find the answer in the paper. Do you need ground truth depth maps during training or not? Say I give you an image-only dataset like CIFAR-10: can you run your method on this data, or must it contain "additional" information? If so, what is this "additional" information?

    I know that during inference you only need the image, but I want to know what information is required during training.

    Sincerely, Hadi.

  • Training on multiple images per scene

    Hi,

    I noticed in your code that there is an option to train MINE with multiple images as input. In that case, there is no scale ambiguity, right? Can you give an example of a data-loader for that case?

  • Why image normalization twice

    Hi, image normalization is already applied by "img_transforms" when images are loaded in "nerf_dataset.py". Why is the input image normalized again in the forward step of "ResnetEncoder"?

  • Qualitative comparison on KITTI

    Hi, your paper includes a qualitative comparison with single-view MPI on the KITTI dataset, but I cannot find their pretrained KITTI model in their repository. Did you train their model yourselves to get the qualitative results? Could you provide me with a copy of these qualitative results (just for academic purposes)? Thank you.

  • KITTI training code

    Hi, I was wondering if you plan to release KITTI training code at any time? Apart from this, are the released model checkpoints all pretrained on ImageNet? Thanks!

  • Questions about eq. (3), eq. (8) and eq. (12) in the paper

    I have some questions about the equations in the paper. I think those equations should be corrected. If I misunderstood something, please let me know.

    (3) In the paper: [image]. Expected: [image].

    (8) The parenthesis position is somewhat odd. In the paper: [image]. Expected: [image].

    (12) The scale factors defined in MPI and MINE are in a reverse relationship, but the equations do not reflect the difference. In the paper: [image]. Expected: [image].

  • Correspondence of formula and code (torch.cumprod)

    Dear authors,

    Thanks for your impressive work. I found the operation "torch.cumprod" in the code:

    def plane_volume_rendering(rgb_BS3HW, sigma_BS1HW, xyz_BS3HW, is_bg_depth_inf):
        ...
        transparency_acc = torch.cumprod(transparency + 1e-6, dim=1)  # BxSx1xHxW

    However, I cannot find an equation containing a cumprod operation in the paper "MINE:...". Which formula does it correspond to? Thanks a lot.
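
A cumulative product of per-plane transparencies is the usual discrete form of accumulated transmittance in NeRF-style volume rendering: since exp(-(a + b)) = exp(-a) * exp(-b), the term T_i = exp(-sum_{j<i} sigma_j * delta_j) equals the product of exp(-sigma_j * delta_j) over j < i. The sketch below illustrates that identity in plain Python. It is not the repository's implementation: the function name is made up, and the small 1e-6 guard from the quoted line is left out for clarity.

```python
import math
from itertools import accumulate
from operator import mul


def compositing_weights(sigmas, dists):
    """Return (transmittance, weights) for planes ordered front to back.

    Illustrative sketch of standard volume-rendering compositing.
    """
    # Per-plane transparency exp(-sigma_i * delta_i) and opacity alpha_i.
    transparency = [math.exp(-s * d) for s, d in zip(sigmas, dists)]
    alpha = [1.0 - t for t in transparency]
    # Inclusive cumulative product: prod_{j<=i} transparency_j.
    inclusive = list(accumulate(transparency, mul))
    # Shift by one so plane i is attenuated only by the planes in front of it.
    transmittance = [1.0] + inclusive[:-1]
    weights = [T * a for T, a in zip(transmittance, alpha)]
    return transmittance, weights


sigmas = [0.5, 1.0, 2.0]   # made-up densities
dists = [0.1, 0.1, 0.1]    # made-up distances between planes
transmittance, weights = compositing_weights(sigmas, dists)
print(transmittance, weights)
```

The product form and the exponential-of-sum form give the same transmittance, so an equation written with a sum inside an exponential and code written with cumprod can describe the same quantity.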

  • out of memory

    I am training on the LLFF dataset with two 2080 Ti GPUs, but it reports "out of memory". I changed the batch size from 2 to 1 in the config file, but it still does not work. What should I do?

  • Implementation detail about plane homography warping between src camera and tgt camera

    In operations/homography_sampler.py, lines 107-108 compute the plane homography warping matrix between the src camera and the tgt camera, following the equation: [image]. However, K_inv should be K_tgt_inv rather than K_src_inv, and K should be K_src. The issue does not arise when K_tgt = K_src, but it causes errors when the intrinsics differ. Suggested fix:

    H_tgt_src = torch.matmul(K_src, torch.matmul(R_tnd, K_tgt_inv))
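
The role of the two intrinsics matrices can be checked numerically. Below is a minimal sketch in plain Python under stated assumptions: the plane satisfies n^T X = d in the target camera frame, (R, t) maps target-camera coordinates to source-camera coordinates (X_src = R X_tgt + t), and the homography taking target pixels to source pixels is then K_src (R + t n^T / d) K_tgt^{-1}. The sign in front of t n^T / d and the direction of (R, t) depend on convention and may differ from the repository's R_tnd. All helper names and numbers are illustrative.

```python
def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """3x3 matrix times 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def intrinsics(f, cx, cy):
    """Pinhole intrinsics with one focal length and no skew."""
    return [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]


def intrinsics_inv(f, cx, cy):
    """Closed-form inverse of the matrix returned by intrinsics()."""
    return [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]


def dehomogenize(p):
    """Divide a homogeneous 3-vector by its last component."""
    return [p[0] / p[2], p[1] / p[2]]


def plane_homography(K_src, R, t, n, d, K_inv):
    """K_src (R + t n^T / d) K_inv for the plane n^T X = d."""
    M = [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]
    return matmul(K_src, matmul(M, K_inv))


# Toy setup with different intrinsics for the two cameras.
K_src, K_src_inv = intrinsics(500.0, 320.0, 240.0), intrinsics_inv(500.0, 320.0, 240.0)
K_tgt, K_tgt_inv = intrinsics(400.0, 200.0, 150.0), intrinsics_inv(400.0, 200.0, 150.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.1, 0.0, 0.0]
n, d = [0.0, 0.0, 1.0], 2.0

X_tgt = [0.5, -0.2, 2.0]  # a 3D point on the plane z = 2 (target frame)
X_src = [a + b for a, b in zip(matvec(R, X_tgt), t)]
pix_tgt = dehomogenize(matvec(K_tgt, X_tgt)) + [1.0]
pix_src_direct = dehomogenize(matvec(K_src, X_src))  # ground-truth projection

H_good = plane_homography(K_src, R, t, n, d, K_tgt_inv)
H_bad = plane_homography(K_src, R, t, n, d, K_src_inv)  # inverse of the wrong camera
pix_src_good = dehomogenize(matvec(H_good, pix_tgt))
pix_src_bad = dehomogenize(matvec(H_bad, pix_tgt))
print(pix_src_direct, pix_src_good, pix_src_bad)
```

In this toy setup, only the variant that inverts the target intrinsics reproduces the direct projection; the two variants coincide when both cameras share the same intrinsics.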

  • Preprocessing and Training Flow for Other Datasets

    Hello authors, thank you for your great work.

    You noted in the README:

    Apart from the LLFF dataset, we experimented on the RealEstate10K, KITTI Raw and the Flowers Light Fields datasets - the data pre-processing codes and training flow for these datasets will be released later.

    I believe the last update on this was in October 2021, so I am following up. Will you be able to release the dataloaders/code soon?

    All the best,

  • How to prepare my dataset?

    Hi, thanks for your great work! If I want to train on my own data, how should I process it? I see that the LLFF data includes cameras.bin, images.bin, points3D.bin, and so on. How can I generate these files? Could you share the code for that? Thanks.

Related tags
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)

Depth-supervised NeRF: Fewer Views and Faster Training for Free Project | Paper | YouTube Pytorch implementation of our method for learning neural rad

Sep 25, 2022
This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

Sep 23, 2022
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

Sep 23, 2022
A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.

NeRF-pytorch NeRF (Neural Radiance Fields) is a method that achieves state-of-the-art results for synthesizing novel views of complex scenes. Here are

Sep 22, 2022
A PyTorch re-implementation of Neural Radiance Fields

nerf-pytorch A PyTorch re-implementation Project | Video | Paper NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis Ben Mildenhall

Sep 17, 2022
Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose Paper | Website | Data A-NeRF: Articulated Neural Radiance F

Sep 15, 2022
Neural Radiance Fields Using PyTorch

This project is a PyTorch implementation of Neural Radiance Fields (NeRF) for reproduction of results whilst running at a faster speed.

Feb 11, 2022
SatelliteNeRF - PyTorch-based Neural Radiance Fields adapted to satellite domain

SatelliteNeRF PyTorch-based Neural Radiance Fields adapted to satellite domain.

Sep 5, 2022
Unofficial & improved implementation of NeRF--: Neural Radiance Fields Without Known Camera Parameters

[Unofficial code-base] NeRF--: Neural Radiance Fields Without Known Camera Parameters [ Project | Paper | Official code base ] ⬅️ Thanks the original

Sep 20, 2022
This is a JAX implementation of Neural Radiance Fields for learning purposes.

learn-nerf This is a JAX implementation of Neural Radiance Fields for learning purposes. I've been curious about NeRF and its follow-up work for a whi

Aug 22, 2022
A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

NeRF Minimal Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Result of Tiny-NeRF RGB Depth

Jul 24, 2022
Implementation of "Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis"

Generalizable Neural Performer: Learning Robust Radiance Fields for Human Novel View Synthesis Abstract: This work targets at using a general deep lea

Sep 23, 2022
This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.

Deformable Neural Radiance Fields This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies. Project Page Paper Video This codebase conta

Sep 22, 2022
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

Sep 22, 2022
(Arxiv 2021) NeRF--: Neural Radiance Fields Without Known Camera Parameters

NeRF--: Neural Radiance Fields Without Known Camera Parameters Project Page | Arxiv | Colab Notebook | Data Zirui Wang¹, Shangzhe Wu², Weidi Xie², Min

Sep 21, 2022
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields.

This repository contains the code release for Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. This implementation is written in JAX, and is a fork of Google's JaxNeRF implementation. Contact Jon Barron if you encounter any issues.

Sep 26, 2022
Code for KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs

KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs Check out the paper on arXiv: https://arxiv.org/abs/2103.13744 This repo cont

Sep 16, 2022
BARF: Bundle-Adjusting Neural Radiance Fields 🤮 (ICCV 2021 oral)

BARF 🤮: Bundle-Adjusting Neural Radiance Fields Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, and Simon Lucey IEEE International Conference on Comp

Sep 17, 2022
[ICCV21] Self-Calibrating Neural Radiance Fields

Self-Calibrating Neural Radiance Fields, ICCV, 2021 Project Page | Paper | Video Author Information Yoonwoo Jeong [Google Scholar] Seokjun Ahn [Google

Sep 22, 2022