Code release for NeRF (Neural Radiance Fields)

NeRF: Neural Radiance Fields

Project Page | Video | Paper | Data

Open Tiny-NeRF in Colab
Tensorflow implementation of optimizing a neural representation for a single scene and rendering new views.

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall*1, Pratul P. Srinivasan*1, Matthew Tancik*1, Jonathan T. Barron2, Ravi Ramamoorthi3, Ren Ng1
1UC Berkeley, 2Google Research, 3UC San Diego
*denotes equal contribution
in ECCV 2020 (Oral Presentation, Best Paper Honorable Mention)

TL;DR quickstart

To set up a conda environment, download example training data, begin the training process, and launch TensorBoard:

conda env create -f environment.yml
conda activate nerf
bash download_example_data.sh
python run_nerf.py --config config_fern.txt
tensorboard --logdir=logs/summaries --port=6006

If everything works without errors, you can now go to localhost:6006 in your browser and watch the "Fern" scene train.

Setup

Python 3 dependencies:

  • Tensorflow 1.15
  • matplotlib
  • numpy
  • imageio
  • configargparse

The LLFF data loader requires ImageMagick.

We provide a conda environment setup file including all of the above dependencies. Create the conda environment nerf by running:

conda env create -f environment.yml

You will also need the LLFF code (and COLMAP) set up to compute poses if you want to run on your own real data.

What is a NeRF?

A neural radiance field is a simple fully connected network (weights are ~5MB) trained to reproduce input views of a single scene using a rendering loss. The network directly maps from spatial location and viewing direction (5D input) to color and opacity (4D output), acting as the "volume" so we can use volume rendering to differentiably render new views.
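
For intuition, here is a minimal sketch in TensorFlow/Keras (not the exact architecture or variable names used in run_nerf.py) of an MLP that maps a position and viewing direction to color and density; positional encoding of the inputs is omitted for brevity:

import tensorflow as tf

def tiny_nerf_mlp(depth=8, width=256):
    # Inputs: 3D spatial location and a 3D unit viewing direction (the "5D input",
    # since a direction has two degrees of freedom).
    xyz = tf.keras.Input(shape=(3,))
    viewdir = tf.keras.Input(shape=(3,))
    h = xyz
    for _ in range(depth):
        h = tf.keras.layers.Dense(width, activation='relu')(h)
    # Density (opacity) depends only on position; color also depends on the view direction.
    sigma = tf.keras.layers.Dense(1, activation='relu')(h)
    feat = tf.keras.layers.Dense(width)(h)
    h = tf.keras.layers.Concatenate()([feat, viewdir])
    h = tf.keras.layers.Dense(width // 2, activation='relu')(h)
    rgb = tf.keras.layers.Dense(3, activation='sigmoid')(h)
    return tf.keras.Model(inputs=[xyz, viewdir], outputs=[rgb, sigma])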

Optimizing a NeRF takes between a few hours and a day or two (depending on resolution) and only requires a single GPU. Rendering an image from an optimized NeRF takes somewhere between less than a second and ~30 seconds, again depending on resolution.

Running code

Here we show how to run our code on two example scenes. You can download the rest of the synthetic and real data used in the paper here.

Optimizing a NeRF

Run

bash download_example_data.sh

to get our synthetic Lego dataset and the LLFF Fern dataset.

To optimize a low-res Fern NeRF:

python run_nerf.py --config config_fern.txt

After 200k iterations (about 15 hours), you should get a video like this at logs/fern_test/fern_test_spiral_200000_rgb.mp4:

ferngif

To optimize a low-res Lego NeRF:

python run_nerf.py --config config_lego.txt

After 200k iterations, you should get a video like this:

legogif

Rendering a NeRF

Run

bash download_example_weights.sh

to get a pretrained high-res NeRF for the Fern dataset. Now you can use render_demo.ipynb to render new views.

Replicating the paper results

The example config files run at lower resolutions than the quantitative/qualitative results in the paper and video. To replicate the results from the paper, start with the config files in paper_configs/. Our synthetic Blender data and LLFF scenes are hosted here and the DeepVoxels data is hosted by Vincent Sitzmann here.

Extracting geometry from a NeRF

Check out extract_mesh.ipynb for an example of running marching cubes to extract a triangle mesh from a trained NeRF network. You'll need to install the PyMCubes package for marching cubes, plus the trimesh and pyrender packages if you want to render the mesh inside the notebook:

pip install trimesh pyrender PyMCubes
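
As a rough sketch of the idea (extract_mesh.ipynb is the reference; query_density below is a hypothetical stand-in for evaluating the trained network's density on a batch of points):

import numpy as np
import mcubes
import trimesh

def extract_mesh(query_density, grid_size=256, bound=1.2, threshold=50.0):
    # Sample the network's density (sigma) on a regular 3D grid.
    t = np.linspace(-bound, bound, grid_size)
    xyz = np.stack(np.meshgrid(t, t, t, indexing='ij'), -1).reshape(-1, 3)
    sigma = query_density(xyz).reshape(grid_size, grid_size, grid_size)
    # Run marching cubes at a chosen density iso-level.
    vertices, triangles = mcubes.marching_cubes(sigma, threshold)
    # Rescale vertex coordinates from grid indices back to world units.
    vertices = vertices / (grid_size - 1) * 2 * bound - bound
    return trimesh.Trimesh(vertices, triangles)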

Generating poses for your own scenes

Don't have poses?

We recommend using the imgs2poses.py script from the LLFF code. Then you can pass the base scene directory into our code using --datadir <myscene> along with --dataset_type llff. You can take a look at the config_fern.txt config file for example settings to use for a forward facing scene. For a spherically captured 360 scene, we recommend adding the --no_ndc --spherify --lindisp flags.
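
For example, a hypothetical config for a spherically captured 360 scene might look like the sketch below (the key names mirror the command-line flags, since the config files are parsed by configargparse; adjust names and paths for your own capture):

expname = my_360_scene
datadir = ./data/my_360_scene
dataset_type = llff

no_ndc = True
spherify = True
lindisp = True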

Already have poses!

In run_nerf.py and all other code, we use the same pose coordinate system as OpenGL: the local camera coordinate system of an image is defined such that the X axis points to the right, the Y axis points upward, and the Z axis points backward, as seen from the image.

Poses are stored as 3x4 numpy arrays that represent camera-to-world transformation matrices. The other data you will need is simple pinhole camera intrinsics (hwf = [height, width, focal length]) and near/far scene bounds. Take a look at our data loading code to see more.
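
For illustration, a minimal sketch (with made-up values) of how such a pose and the accompanying intrinsics and bounds could be assembled:

import numpy as np

# Hypothetical example values; in practice these come from your pose source
# (e.g. COLMAP via LLFF's imgs2poses.py).
R = np.eye(3)                     # camera-to-world rotation (OpenGL convention:
                                  # +X right, +Y up, camera looks along -Z)
t = np.array([0.0, 0.0, 4.0])     # camera center in world coordinates
c2w = np.concatenate([R, t[:, None]], axis=1)   # 3x4 camera-to-world matrix

hwf = [800, 800, 1111.0]          # [image height, image width, focal length in pixels]
near, far = 2.0, 6.0              # near/far scene bounds along the viewing direction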

Citation

@inproceedings{mildenhall2020nerf,
  title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
  author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
  year={2020},
  booktitle={ECCV},
}
Comments
  • What can we do with our own trained model?

    What can we do with our own trained model?

    I've successfully trained my own NeRF model using 80~100 real images. It takes 8 hours for 50,000 iterations (probably my GPU is not that good), so I think it will take about four times as long for 200k iterations, which is too long.

    And I have no idea how to use the trained model, because it's not like most general deep learning models. In my opinion, when we train a model for a specific object, this model can only be tested on images similar to the training set. So what can we do with our own trained model? All I know is that we can use it to extract a mesh.

    My goal is to build a 3D model from some 2D images, and I don't know whether this repo can achieve that or not. The biggest problem is how to use the depth map to add color to the extracted mesh. Can you give me some guidance? I'd really appreciate it.

  • How to translate depth in NDC to real depth?

    How to translate depth in NDC to real depth?

    In NDC space, the predicted depth has range 0~1. How do we translate that into real depth? We know that predicted depth = 0 means the point is at the near plane, which is at real distance 1.0, but what about predicted depth = 1? In the formula it corresponds to infinity, but in practice, do we translate that into the real farthest depth provided by COLMAP? And what about the values between 0 and 1?

    I ask because I want to reconstruct the LLFF data in 3D. Using the predicted depth in 0~1 gives something visually plausible, but I think it is mathematically wrong; we need to convert it to real depth.
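
    My current guess, sketched from the NDC derivation in Appendix C (assuming the near plane is at n = 1 and that NDC depth t = 1 maps to infinity); please correct me if this is wrong:

    # With z' = 1 + 2n/z (Appendix C), the near plane sits at z' = -1 and t = 1
    # maps to z' = 1 (infinity), so a per-ray NDC depth t in [0, 1) corresponds
    # to z' = 2t - 1 and a metric depth along the camera axis of near / (1 - t).
    def ndc_depth_to_metric(t_ndc, near=1.0):
        return near / (1.0 - t_ndc)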

  • tips for training real 360 inward-facing scene

    tips for training real 360 inward-facing scene

    I tried to train on my own 360 inward-facing scenes; however, there is a huge amount of noise in the output, and it doesn't disappear no matter how many steps I train. I followed the suggestion in the README and added --no_ndc --spherify --lindisp; the other settings are the same as for fern. I suspect it is because the input images contain a lot of arbitrary background, which deteriorates the training. These arbitrary backgrounds are inevitable unless I have an infinite ground plane or an infinitely large table... What's your opinion on this problem? Is it really due to the background, or is there some other reason?

  • original blender files

    original blender files

    I followed the instructions to obtain the lego scene from blendswap and parsed the JSON files to get the poses and FOV. The poses look correct, but two pieces of information seem to be missing. One is that the scene doesn't appear to be at the correct scale: without scaling by a factor of 25 or so, the cameras end up inside the object. The second is that the lego bulldozer scene has a controllable bucket, and the renderings in NeRF clearly use a setting other than the default in the blendswap scene.

    Do the authors still have the original blender files they used to render these scenes? The licenses seem fairly permissive. It would be great to use these as a starting point for new experiments, but be able to modify them.

  • LLFF data preprocessing

    LLFF data preprocessing

    From what I can decipher, poses_bounds.npy contains a 3x5 pose matrix and 2 depth bounds for each image. Each pose has [R T] as the left 3x4 block and [H W F] as the rightmost 3x1 column.

    However, I get confused by the functions poses_avg and recenter_poses. What do these functions do, and why are they needed?

    I checked the original LLFF code, but it doesn't have this averaging or recentering.
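
    To check that I'm parsing the file correctly, this is roughly how I read it (assuming each row is the flattened 3x5 pose followed by the two bounds; please correct me if that layout is wrong):

    import numpy as np

    data = np.load('poses_bounds.npy')        # shape (N_images, 17)
    poses = data[:, :15].reshape(-1, 3, 5)    # per-image [R | t | hwf]
    bounds = data[:, 15:]                     # per-image near/far depth bounds
    c2w = poses[:, :, :4]                     # 3x4 camera-to-world matrices
    hwf = poses[:, :, 4]                      # [height, width, focal] per image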

  • about render_path_spiral and viewmatrix

    about render_path_spiral and viewmatrix

    Hello, I'm a novice in this field. I am confused about the two functions below; they seem to perform a transformation between coordinate frames. Could you give more details?

    def render_path_spiral(c2w, up, rads, focal, zdelta, zrate, rots, N):
        # Generate N camera poses along a spiral around the (average) pose c2w.
        render_poses = []
        rads = np.array(list(rads) + [1.])
        hwf = c2w[:,4:5]
        for theta in np.linspace(0., 2. * np.pi * rots, N+1)[:-1]:
            # Camera center on the spiral, transformed into world coordinates.
            c = np.dot(c2w[:3,:4], np.array([np.cos(theta), -np.sin(theta), -np.sin(theta*zrate), 1.]) * rads)
            # Camera z axis: points from a focus point (at depth `focal` along the
            # central axis) toward the camera, so the camera looks at that point.
            z = normalize(c - np.dot(c2w[:3,:4], np.array([0,0,-focal, 1.])))
            render_poses.append(np.concatenate([viewmatrix(z, up, c), hwf], 1))
        return render_poses

    def viewmatrix(z, up, pos):
        # Build a 3x4 camera-to-world matrix from a viewing direction z,
        # an approximate up vector, and a camera position.
        vec2 = normalize(z)
        vec1_avg = up
        vec0 = normalize(np.cross(vec1_avg, vec2))  # cross product
        vec1 = normalize(np.cross(vec2, vec0))
        m = np.stack([vec0, vec1, vec2, pos], 1)
        return m
    

    I hope you can help me. Thanks!

  • Opencv extrinsic instead of colmap

    Opencv extrinsic instead of colmap

    Hi, thanks for your great work on NeRF! Using COLMAP to estimate camera extrinsics takes a lot of time (the installation, running the reconstruction code...). I tried putting a chessboard in my own LLFF-style capture and then calibrating with OpenCV. However, the camera parameters could not be used correctly in NeRF. I think OpenCV and COLMAP may have different coordinate conventions, which is why the OpenCV extrinsics cannot be used in the code. If you know how to transform the extrinsics, please tell me. Thank you very much!
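
    My understanding of the convention difference, which may be wrong: OpenCV uses x right, y down, z forward, this repo uses the OpenGL convention (x right, y up, z backward), and cv2.solvePnP returns a world-to-camera transform rather than camera-to-world. This is a sketch of the conversion I have been trying (rvec/tvec are placeholder values):

    import numpy as np
    import cv2

    # Placeholder calibration output; in practice rvec, tvec come from
    # cv2.solvePnP / cv2.calibrateCamera and are world-to-camera (OpenCV convention).
    rvec = np.zeros((3, 1))
    tvec = np.array([0.0, 0.0, 4.0])

    R_w2c, _ = cv2.Rodrigues(rvec)
    c2w_cv = np.eye(4)
    c2w_cv[:3, :3] = R_w2c.T                  # invert the rotation
    c2w_cv[:3, 3] = -R_w2c.T @ tvec           # camera center in world coordinates

    # Flip the camera's y and z axes to go from OpenCV (x right, y down, z forward)
    # to the OpenGL convention used here (x right, y up, z backward).
    c2w_gl = c2w_cv @ np.diag([1.0, -1.0, -1.0, 1.0])
    pose_3x4 = c2w_gl[:3, :4]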

  • Question about ray directions calculation in code.

    Question about ray directions calculation in code.

    dirs = np.stack([(i-W*.5)/focal, -(j-H*.5)/focal, -np.ones_like(i)], -1)

    'i' and 'j' are the pixel coordinates; why is this the ray direction? Are there any assumptions?
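
    My current understanding, written out with placeholder intrinsics (please correct me if it's off):

    import numpy as np

    # Assumption: an ideal pinhole camera with the principal point at the image
    # center (W/2, H/2) and the focal length given in pixels. A pixel (i, j)
    # back-projects to the point ((i - W/2)/f, -(j - H/2)/f, -1) on the plane
    # z = -1 in camera coordinates: the division by f comes from similar
    # triangles, the minus sign on y converts image rows (which go down) to the
    # camera's +y (which goes up), and z = -1 because the camera looks along -z
    # (OpenGL convention). The ray through the pixel is the vector from the
    # camera origin to that point, i.e. exactly this `dirs` entry.
    H, W, focal = 400, 400, 555.0   # placeholder intrinsics
    i, j = np.meshgrid(np.arange(W, dtype=np.float32),
                       np.arange(H, dtype=np.float32), indexing='xy')
    dirs = np.stack([(i - W * .5) / focal, -(j - H * .5) / focal, -np.ones_like(i)], -1)
    # World-space ray directions would then be dirs rotated by c2w[:3, :3].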

    Thanks a lot!

  • Depth GT

    Depth GT

    Hi, I just want to confirm whether the depth images in the test set of nerf_synthetic are ground truth or not.

    If they are ground truth, could you please also release the depth images for the training and validation datasets?

    Many Thanks!

  • How to read the depth map?

    How to read the depth map?

    I read the depth maps in the synthetic test set as d[u][v] = (255 - depth[u][v]) / 255, but the result does not seem consistent across views. How can I get accurate depth maps?

  • Question about NERF for indoor scenes? Invalid Disparity value (NAN or Inf) during training and failed training

    Question about NERF for indoor scenes? Invalid Disparity value (NAN or Inf) during training and failed training

    Hi,

    I am trying to use NeRF to learn an implicit representation of an indoor room. I generated a sequence of around 100 images with the camera facing toward one end of the room (for example, the cameras are located at the left side of the room and look toward the right side), without too much viewpoint variation.

    I think of indoor scenes as being like LLFF data (the observed wall region) but with a much larger distance to the camera, since the objects in the middle of the room, such as tables and chairs, still matter; in LLFF data the scene is fairly close to the camera and there is not much space between the camera and the scene in which to estimate occupancy.

    I set the depth range to [0.1 m, 10 m] without the NDC option and with linear sampling in depth rather than in disparity.

    During training, I sometimes hit numerical problems with singular disparity values (NaN or Inf), which makes training fail.

    Is this because NeRF is not well suited to this setup, or are there any tips for getting NeRF to work on indoor scenes?

  • Why uniform sampling in disparity affects performance w.r.t. positional encoding

    Why uniform sampling in disparity affects performance w.r.t. positional encoding

    For the LLFF forward-facing scenes, NDC is used. Issue #18 says that with disparity sampling, the higher frequencies of the positional encoding confuse NeRF. In that case, why not feed inverse z to the positional encoder instead of z?

  • config of vasedeck

    config of vasedeck

    Hi, thanks for your great work. Is there a config .txt file for the vasedeck data?

    I used the horns config to train vasedeck and the result is not very good.

  • A question about the trick of ndc_rays

    A question about the trick of ndc_rays

    Hi all experts, I am confused about why near is set to tf.cast(1., tf.float32) here.
    The NeRF paper states that the origin is shifted to the near plane; does this mean it always assumes the near plane is at z = -1? (Appendix C in the NeRF paper: NDC ray space derivation)

    # In func render()
    if ndc:
        # for forward facing scenes
        rays_o, rays_d = ndc_rays(
            H, W, focal, tf.cast(1., tf.float32), rays_o, rays_d)
    

    Thanks in advance for your kind answers and discussion.

  • Question about include_input

    Question about include_input

    I have a few minor questions related to the include_input arg and Figure 7.

    Q1. I'm curious what you think the effect of including or excluding the raw input is. There is an analysis of positional encoding, but the effect of including the raw input is not mentioned in the paper. From my understanding, the paper doesn't discuss the effect of "include_input", and the figure doesn't show the input part either. Can you clearly explain the effect of concatenating the raw input?

    Q2. Have you compared the input-level options (add vs. concatenate) from the Transformer architecture [47]? It looks like your code uses concatenation instead of addition. Have you compared the results of the two operations and chosen the better one?

    Q3. Is the location of the density estimation branch shown in the figure correct? From my understanding, the density estimation branch (Figure 7) should be moved to before the viewing direction is concatenated (9th layer -> 8th layer).


    https://github.com/bmild/nerf/blob/20a91e764a28816ee2234fcadb73bd59a613a44c/run_nerf_helpers.py#L34

  • Set up Discussions

    Set up Discussions

    I'm interested in this project and have some questions, but I always feel bad about opening up issues for this sort of discussion. Would the team be open to setting up Discussions?
