CaesarNeRF: Calibrated Semantic Representation for Few-shot Generalizable Neural Rendering

Haidong Zhu¹,*

Tianyu Ding²,*,†

Tianyi Chen²

Ilya Zharkov²

Ram Nevatia¹

Luming Liang²,†

¹University of Southern California

²Microsoft

Novel view synthesis for novel scenes using ONE reference view on Shiny, LLFF, and MVImgNet (top to bottom). Each pair of images corresponds to the results from GNT (left) and CaesarNeRF (right).

Abstract

Generalizability and few-shot learning are key challenges in Neural Radiance Fields (NeRF), often due to the lack of a holistic understanding in pixel-level rendering. We introduce CaesarNeRF, an end-to-end approach that leverages scene-level CAlibratEd SemAntic Representation along with pixel-level representations to advance few-shot, generalizable neural rendering, facilitating a holistic understanding without compromising high-quality details. CaesarNeRF explicitly models pose differences of reference views to combine scene-level semantic representations, providing a calibrated holistic understanding. This calibration process aligns various viewpoints with precise location and is further enhanced by sequential refinement to capture varying details. Extensive experiments on public datasets, including LLFF, Shiny, mip-NeRF 360, and MVImgNet, show that CaesarNeRF delivers state-of-the-art performance across varying numbers of reference views, proving effective even with a single reference image.
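To make the calibration idea above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the released implementation: the abstract does not specify the exact operator, so the sketch assumes each reference view yields one scene-level vector whose length is divisible by 3, that this vector can be treated as a stack of 3-D vectors, and that a relative rotation from each reference camera to the target camera is available. The function name `calibrate_and_fuse` and its arguments are hypothetical.

```python
import numpy as np


def calibrate_and_fuse(features, rotations):
    """Rotate each per-view semantic vector into a shared frame, then average.

    features:  (N, C) array, one scene-level vector per reference view
               (assumption: C is divisible by 3).
    rotations: (N, 3, 3) array, hypothetical rotation taking each reference
               view's camera frame to the target view's frame.
    """
    features = np.asarray(features, dtype=float)
    rotations = np.asarray(rotations, dtype=float)
    n, c = features.shape
    if c % 3 != 0:
        raise ValueError("feature length must be divisible by 3")
    # Treat each feature vector as C/3 stacked 3-D column vectors.
    as_points = features.reshape(n, 3, c // 3)
    # Batched matrix product: each view's rotation acts on its own vectors.
    rotated = rotations @ as_points  # shape (N, 3, C/3)
    calibrated = rotated.reshape(n, c)
    # Fuse the aligned per-view vectors into one scene-level representation.
    return calibrated.mean(axis=0)
```

In this sketch, two views that encode the same underlying vector in different camera frames agree after calibration, so averaging them no longer blurs pose-dependent information. With identity rotations the function reduces to a plain average of the per-view vectors.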

TL;DR:

We incorporate holistic scene understanding alongside pixel-level rendering for neural radiance fields.

Single-scene rendering

Generalizable rendering with few reference views

If you find our work helpful, please feel free to use the following BibTeX entry:


@article{zhu2023caesarnerf,
  author  = {Zhu, Haidong and Ding, Tianyu and Chen, Tianyi and Zharkov, Ilya and Nevatia, Ram and Liang, Luming},
  title   = {CaesarNeRF: Calibrated Semantic Representation for Few-shot Generalizable Neural Rendering},
  journal = {arXiv preprint arXiv:2311.15510},
  year    = {2023},
}

Acknowledgement

The template for this webpage is borrowed from FreeNeRF and RefNeRF. We sincerely thank the authors for their great work.

IBRNet	GPNR	NeuRay	Ours	Ground-truth

