Funding: Supported by the National Natural Science Foundation of China (Grant No. 51935003).
Abstract: In recent years, addressing ill-posed problems by using learning techniques to leverage prior knowledge contained in databases has gained much attention. In this paper, we focus on complete three-dimensional (3D) point cloud reconstruction from a single red-green-blue (RGB) image, a task that cannot be approached using classical reconstruction techniques. For this purpose, we use an encoder-decoder framework to encode the RGB information in latent space and to predict the 3D structure of the considered object from different viewpoints. The individual predictions are combined to yield a common representation, which is used in a module that combines camera pose estimation and rendering. This makes the pipeline differentiable with respect to the imaging process and the camera pose, and allows the two-dimensional prediction error of novel viewpoints to be optimized. Thus, our method allows end-to-end training and does not require supervision based on additional ground-truth (GT) mask annotations or ground-truth camera pose annotations. Our evaluation on synthetic and real-world data demonstrates the robustness of our approach to appearance changes and self-occlusions, and it outperforms current state-of-the-art methods in terms of accuracy, density, and model completeness.
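To make the pipeline in this abstract concrete, the following is a minimal NumPy sketch of its geometric core only: merging per-view point predictions into a common frame, projecting the result into a novel view, and measuring a 2D error. It is not the authors' implementation. The function names, the pose convention (x_common = R x_view + t), the pinhole camera model, and the use of a 2D Chamfer distance as the prediction error are all assumptions made for illustration, and the sketch does not include the encoder-decoder network or gradient computation.

```python
import numpy as np

def to_common_frame(points_per_view, poses):
    """Map each per-view point set into a shared frame and stack them.

    poses is a list of (R, t) pairs with x_common = R @ x_view + t
    (an assumed convention, not taken from the paper).
    """
    merged = [pts @ R.T + t for pts, (R, t) in zip(points_per_view, poses)]
    return np.concatenate(merged, axis=0)

def project(points, R, t, focal=1.0):
    """Pinhole projection of 3D points into a camera with x_cam = R @ x + t."""
    cam = points @ R.T + t
    return focal * cam[:, :2] / cam[:, 2:3]

def chamfer_2d(a, b):
    """Symmetric mean nearest-neighbor distance between two 2D point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Hypothetical data: two per-view predictions of an object placed in front of the camera.
rng = np.random.default_rng(0)
identity_pose = (np.eye(3), np.zeros(3))
view_a = rng.random((50, 3)) + np.array([0.0, 0.0, 3.0])
view_b = rng.random((40, 3)) + np.array([0.0, 0.0, 3.0])

merged = to_common_frame([view_a, view_b], [identity_pose, identity_pose])
predicted_2d = project(merged, np.eye(3), np.zeros(3))
target_2d = project(view_a, np.eye(3), np.zeros(3))
error_2d = chamfer_2d(predicted_2d, target_2d)
```

In an end-to-end setting, the same operations would be written in an automatic-differentiation framework so that the 2D error can be backpropagated to both the predicted points and the estimated pose.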
Funding: Supported by the National Natural Science Foundation of China (Nos. 62122071 and 62272433), the Fundamental Research Funds for the Central Universities (No. WK3470000021), and the Alibaba Innovation Research Program (AIR).
Abstract: Recently, neural implicit function-based representation has attracted increasing attention and has been widely used to represent surfaces using differentiable neural networks. However, surface reconstruction from point clouds or multi-view images using existing neural geometry representations still suffers from slow computation and poor accuracy. To alleviate these issues, we propose a multi-scale hash encoding-based neural geometry representation which effectively and efficiently represents the surface as a signed distance field. Our novel neural network structure carefully combines low-frequency Fourier position encoding with multi-scale hash encoding. The initialization of the geometry network and the geometry features of the rendering module are redesigned accordingly. Our experiments demonstrate that the proposed representation is at least 10 times faster for reconstructing point clouds with millions of points. It also significantly improves the speed and accuracy of multi-view reconstruction. Our code and models are available at https://github.com/Dengzhi-USTC/Neural-Geometry-Reconstruction.
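The combination of low-frequency Fourier position encoding with multi-scale hash encoding can be illustrated with a small NumPy sketch. This is not the code from the linked repository. The spatial-hash primes, the trilinear interpolation, the table sizes, the number of levels and frequencies, and the tiny ReLU network with random weights are all assumptions chosen to keep the example short; the abstract does not specify them.

```python
import numpy as np

# Spatial-hash primes commonly used in multi-resolution hash grids (assumed here).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def fourier_encode(x, num_freqs=2):
    """Low-frequency Fourier features sin/cos(2^k * pi * x) for k < num_freqs."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi            # (F,)
    angles = x[:, None, :] * freqs[None, :, None]            # (N, F, 3)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=1)
    return feats.reshape(x.shape[0], -1)                     # (N, 6 * F)

def hash_encode(x, tables, base_res=4, growth=2.0):
    """Multi-scale hash encoding with trilinear interpolation; x lies in [0, 1]^3."""
    outputs = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)                # grid resolution at this scale
        scaled = x * res
        cell = np.floor(scaled).astype(np.int64)
        frac = scaled - cell
        feat = np.zeros((x.shape[0], table.shape[1]))
        for corner in range(8):                              # 8 corners of the enclosing cell
            offset = np.array([(corner >> d) & 1 for d in range(3)])
            vertex = (cell + offset).astype(np.uint64)
            h = vertex * PRIMES                              # wraps modulo 2^64
            idx = (h[:, 0] ^ h[:, 1] ^ h[:, 2]) % np.uint64(table.shape[0])
            weight = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
            feat += weight[:, None] * table[idx.astype(np.int64)]
        outputs.append(feat)
    return np.concatenate(outputs, axis=1)                   # (N, levels * feature_dim)

def sdf_network(x, tables, weights):
    """Concatenate both encodings and map them to one signed distance per point."""
    feats = np.concatenate([fourier_encode(x), hash_encode(x, tables)], axis=1)
    w1, b1, w2, b2 = weights
    hidden = np.maximum(feats @ w1 + b1, 0.0)                # ReLU is an assumption
    return (hidden @ w2 + b2)[:, 0]

# Hypothetical setup: 4 levels, 2 features per level, 1024 entries per table.
rng = np.random.default_rng(0)
tables = [rng.normal(0.0, 1e-2, size=(2 ** 10, 2)) for _ in range(4)]
in_dim = 6 * 2 + 4 * 2
weights = (rng.normal(size=(in_dim, 16)), np.zeros(16),
           rng.normal(size=(16, 1)), np.zeros(1))
points = rng.random((5, 3))
sdf_values = sdf_network(points, tables, weights)
```

The intuition behind such a design is that the Fourier branch supplies a smooth, low-frequency signal, while the hash tables add fine detail at progressively higher grid resolutions. In a trained model the table entries and network weights would be learned rather than sampled at random.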