We introduce Neural Sparse Voxel Fields (NSVF), a new neural scene representation for fast and high-quality free-viewpoint rendering. NSVF defines a set of voxel-bounded implicit fields organized in a sparse voxel octree to model local properties in each cell. We progressively learn the underlying voxel structures with a diffentiable ray-marching operation from only a set of posed RGB images. With the sparse voxel octree structure, rendering novel views can be accelerated by skipping the voxels containing no relevant scene content. Our method is over 10 times faster than the state-of-the-art (namely, NeRF (Mildenhall et al., 2020)) at inference time while achieving higher quality results. Furthermore, by utilizing an explicit sparse voxel representation, our method can easily be applied to scene editing and scene composition. We also demonstrate several challenging tasks, including multi-scene learning, free-viewpoint rendering of a moving human, and large-scale scene rendering.

Full Video

Synthetic Results

Results of the BlendedMVS Dataset

Results of the Tanks&Temples Dataset

Results of Zoom In and Zoom Out

Results of a Dynamic Scene

Results of ScanNet Scenes

Results of Scene Editing and Composition


  title={Neural Sparse Voxel Fields},
  author={Liu, Lingjie and Gu, Jiatao and Lin, Kyaw Zaw and Chua, Tat-Seng and Theobalt, Christian},

This page is Zotero translator friendly. Page last updated Imprint. Data Protection.